Beng Chin OOI

Distinguished Professor
Department of Computer Science
School of Computing
National University of Singapore
Computing 1, Computing Drive, Singapore 117417

ooibc AT
Tel: +65-6516 6465
Office: COM1, #03-22



Short Bio:

Beng Chin is a Distinguished Professor of Computer Science and an NGS faculty member at the National University of Singapore (NUS). He is an adjunct Chair Professor at Zhejiang University, a visiting Distinguished Professor at Tsinghua University, and the director of the NUS AI Innovation and Commercialization Centre in Suzhou, China. He obtained his BSc (1st Class Honors) and PhD from Monash University, Australia, in 1985 and 1989 respectively. Beng Chin is a fellow of the ACM (2011), the IEEE (2009), and the Singapore National Academy of Science (SNAS) (2016).

Beng Chin's research interests include database systems, distributed and blockchain systems, machine learning and large-scale analytics, with a focus on system architectures, performance issues, security, accuracy and correctness. He works closely with industry (e.g. National University Hospital, Jurong Health, Tan Tock Seng Hospital, Singapore General Hospital and KK Hospital on healthcare analytics and prediabetes prevention, and banks and investment firms on financial analytics), and exploits IT and technology such as 5G for disruption and innovation in various application domains, such as healthcare, finance and smart city. He initiated Apache SINGA, the first Apache Top Level Project on distributed deep learning. He has an h-index of 83 and 24,000+ citations.

Beng Chin serves as a non-executive and independent director of ComfortDelGro (listed on SGX). He is a co-founder of yzBigData (2012) for Big Data management and analytics, Shentilium Technologies (2016) for AI- and data-driven financial data analytics, Hangzhou MZH Technologies for healthcare, and MediLot Technologies (2018) for blockchain-based healthcare data management and analytics. He is a member of the Hangzhou Government AI Development Committee (AI TOP 30) and the Suzhou Industry Park AI Development Committee.

Beng Chin was the recipient of the ACM SIGMOD 2009 Contributions Award, a co-winner of the 2012 IEEE Computer Society Kanai Award, and the recipient of the 2013 NUS Outstanding Researcher Award, the 2014 IEEE TCDE CSEE Impact Award, the 2016 China Computer Federation (CCF) Overseas Outstanding Contributions Award, and the 2020 ACM SIGMOD EF Codd Innovations Award. He was also a recipient of the VLDB'14 and VLDB'19 Best Paper awards and a 2020 ACM SIGMOD Research Highlight Award.

Beng Chin has served as a PC member for international conferences such as ACM SIGMOD, VLDB, IEEE ICDE, WWW, and SIGKDD. He has served as Vice PC Chair for ICDE'00, '04 and '06, PC co-Chair for SSD'93 and DASFAA'05, PC Chair for ACM SIGMOD'07, Core DB PC Chair for VLDB'08, and PC co-Chair for IEEE ICDE'12, IEEE Big Data'15, BOSS'18, IEEE ICDE'8, VLDB'19 Industry Track, and ACM SoCC'20.

He was an associate editor of the VLDB Journal, Springer's Distributed and Parallel Databases, and IEEE Transactions on Knowledge and Data Engineering (TKDE), Editor-in-Chief of TKDE (2009-2012), founding co-Editor-in-Chief of Elsevier's Journal of Big Data Research (2013-2015), an associate editor of IEEE Transactions on Cloud Computing (TCC), and the founding Editor-in-Chief of ACM/IMS Transactions on Data Science (2018-2020). He is serving as an editor of Communications of the ACM (CACM).

He has served as a co-chair of the ACM SIGMOD Jim Gray Best Thesis Award committee (2008-2011), as a trustee of the VLDB Endowment (2006-2017), including terms as its secretary (2010-2013) and president (2014-2017), and as an Advisory Board Member of ACM SIGMOD (2012-2017). He is serving as an Overseas Council Member of the China Computer Federation.

Beng Chin participated in the last three once-every-five-years database self-assessment meetings: Claremont (Berkeley, 2008), Beckman (Irvine, 2013), and Seattle (2018).

His views on administration and setting up a strong database group can be found in a 2011 SIGMOD Record interview.

Research and Systems:

With the ubiquity of Big Data and fusion of applications and technologies, the projects are related in many aspects. Beng Chin approaches research problems and system design with the philosophy that all algorithms and structures should be simple, elegant and yet efficient so that they are implementable, maintainable and scalable in actual applications, and all systems must be efficient, scalable, extensible and easy to use. Beng Chin's ongoing/recent large system projects include:

  1. 5GEdgelet: He works on declarative networking (network slicing and verification), federated learning and model compression, and the management of micro databases and applications in the context of edge computing and 5G mobile networks.
  2. FabricSharp (2015-): FabricSharp is an open source blockchain system based on Hyperledger Fabric. It improves on the original version in its consensus model, execution engine, storage engine and smart contract checking. He has also designed and open sourced a comprehensive blockchain benchmarking framework called BLOCKBENCH (in 2016), and conducted a comprehensive survey on the performance issues of blockchain systems. FabricSharp is the backend of MediLOT, a patient-centric healthcare blockchain system that supports decentralized, personalized medicine and healthcare data analytics. He is the lead co-PI of the Singapore Blockchain Innovation Programme (SBIP).
  3. ForkBase and ForkCloud (2015-): ForkBase is an efficient tamper-proof data storage system designed to provide efficient support and fast development of forking-enabled applications, such as "GIT-for-Data", tamper-evident Blockchain, collaborative analytics and OLTP with versioning. ForkBase is deployed as the storage engine of FabricSharp. ForkCloud is a GIT-for-Data system that encapsulates data cleansing, crowdsourcing, ML modelling and validating, and versioning to facilitate AI development on sensitive data.
  4. Apache SINGA (2014-): SINGA is a distributed Deep Learning library (indirectly funded by ASTAR, MOE and NRF CRP grants). Apache SINGA is an Apache Top Level Project, an open source distributed platform for deep learning and machine learning models that has been designed based on four principles: usability, scalability, extensibility and elasticity. Apache SINGA v3.0.0 has AutoML features, a Healthcare model zoo containing deep learning models that have been used for healthcare research, and a facility for porting other models onto SINGA. He highlighted the challenges and opportunities of exploiting AI/ML to improve database system usability and performance in SIGMOD Record 2016. SINGA-lite, SINGA-easy and SINGA-db are upcoming releases.
  5. GEMINI (2011-): GEMINI is a healthcare AI stack. He works closely with a number of hospitals to understand their needs and builds an end-to-end data processing and analytics stack. The GEMINI end-to-end stack supports data cleansing (DICE), crowdsourcing (CDAS), ML-based predictive analytics (SINGA), cohort analysis (CohAna), and data versioning and management (ForkBase). He works with five hospitals on prediabetes prevention (e.g. JurongHealth), and with NUH and SGH on various disease-specific predictive analytics (e.g. DPM, AKI, readmission modelling). He is the lead co-PI of the AI.SG explainable AI for healthcare grand challenge project.
  6. CIIDAA (2012-2018): a Comprehensive IT Infrastructure for Data-intensive Applications and Analysis is a CRP project funded by NRF (NRF-CRP8-2011-08) from 2013-2017. The main objective is to use cloud computing to address the Big Data problem. For specific applications, this approach has been shown to be effective, and systems such as Hadoop have become very popular. However, they have limitations (see the ACM Computing Surveys paper on MapReduce-based systems and the IEEE TKDE survey on in-memory systems), and are suitable only for a class of applications that have a structure amenable to fine-grain asynchronous parallelization. Furthermore, there remain many challenges in actually using cloud computing systems in practice, including issues of resource contention across multiple jobs being run concurrently. The aim of this project is to develop a platform for supporting real-time data integration and predictive real-time analytics in the areas of web consumers (collaborating with Starhub) and healthcare (collaborating with NUH, National University Health System).
  7. epiC (2009-2013): an Elastic, Power-aware, data-Intensive Cloud platform, funded by an MOE grant (2010-2012). The objectives are to design and implement an efficient multi-tenancy cloud system for supporting high-throughput, low-latency transactions and high-performance, reliable query processing, with online and interactive analytics capability. memepiC (2014-) is an extension of the epiC project focusing on exploiting hardware features, multi-cores and large memory. Related earlier project: UTab.
  8. LogBase (2012-2016): a distributed log-structured data management system, funded by ASTAR (2013-2016). LogBase adopts log-only storage to handle high append and write loads, such as Urban/Sensor information processing. Indexing, transaction management and query processing are the key issues that have been investigated, and the source code has been released. LogBase is related to ongoing research on database support for Energy and Environmental Sustainability Solutions for Megacities.
  9. CDAS(2011-2015): a Crowdsourcing Data Analytics System that has been designed to improve the quality of query results and effectively reduce the processing cost at the same time. It is being built as a crowdsourcing system that provides primitive operators to facilitate composition of crowdsourcing tasks. Other key issues such as privacy and applicability, and various applications are being investigated.
