Beng Chin OOI

Chair Professor
Department of Computer Science
School of Computing
National University of Singapore
Computing 1, Computing Drive, Singapore 117417

Email: ooibc AT
Tel: +65-6516 6465
Office: COM1, #03-22



Short Bio:

Beng Chin is a Distinguished Professor of Computer Science, an NGS faculty member, and Director of the Smart Systems Institute (SSI@NUS) at the National University of Singapore (NUS); an adjunct Chang Jiang Professor at Zhejiang University, China; and the director of the NUS AI Innovation and Commercialization Centre in Suzhou, China. He obtained his BSc (1st Class Honors) and PhD from Monash University, Australia, in 1985 and 1989, respectively.

Beng Chin's research interests include database systems, distributed and blockchain systems, machine learning, and large-scale analytics, with an emphasis on system architectures, performance, security, accuracy, and correctness. He works closely with industry (e.g., NUHS, Jurong Health, Tan Tock Seng Hospital, Singapore General Hospital, and KK Hospital on healthcare analytics and pre-diabetes prevention), and exploits IT for efficiency in various application domains, including healthcare, finance, and smart cities. He is a co-founder of yzBigData (2012), for big data management and analytics; Shentilium Technologies (2016), for AI- and data-driven financial data analytics; and MediLot Technologies (2018), for blockchain-based healthcare data management and analytics. He is also an advisory council member of Cynopsis Solutions, a RegTech company, and an advisor to a blockchain-based KYC ICO. Beng Chin serves as a non-executive and independent director of ComfortDelGro and as a member of the Hangzhou Government AI Development Committee (AI TOP 30).

Beng Chin is a Fellow of the ACM, the IEEE, and the Singapore National Academy of Science (SNAS).

He received the 2009 ACM SIGMOD Contributions Award, was a co-winner of the 2011 Singapore President's Science Award, and received the 2012 IEEE Computer Society Kanai Award, the 2013 NUS Outstanding Researcher Award, the 2014 IEEE TCDE CSEE Impact Award, and the 2016 China Computer Federation (CCF) Overseas Outstanding Contributions Award. He was also a recipient of the VLDB 2014 Best Paper Award.

Beng Chin has served as a PC member for international conferences such as ACM SIGMOD, VLDB, IEEE ICDE, WWW, and ACM SIGKDD; as Vice PC Chair for ICDE'00, '04, and '06; PC co-Chair for SSD'93 and DASFAA'05; PC Chair for ACM SIGMOD'07; Core DB PC Chair for VLDB'08; and PC co-Chair for IEEE ICDE'12 and IEEE Big Data'15. He is serving as a PC Chair for IEEE ICDE'18. He was an associate editor of the VLDB Journal, Springer's Distributed and Parallel Databases, and IEEE Transactions on Knowledge and Data Engineering (TKDE); Editor-in-Chief of TKDE (2009-2012); founding co-Editor-in-Chief of Elsevier's Big Data Research journal (2013-2015); and a co-chair of the ACM SIGMOD Jim Gray Best Thesis Award committee. He served as a trustee of the VLDB Endowment (2006-2017), as its secretary (2010-2013) and president (2014-2017), and as an Advisory Board member of ACM SIGMOD (2012-2017). He is serving as an associate editor of IEEE Transactions on Cloud Computing (TCC) and Communications of the ACM (CACM), and as the founding Editor-in-Chief of ACM Transactions on Data Science (2018-).

Beng Chin has participated in the once-every-five-years database research self-assessment meetings: Claremont (Berkeley, 2008), Beckman (Irvine, 2013), and Seattle (2018).

Research and Systems:

Beng Chin's ongoing and recent large system projects include:

  1. Hyperledger++ (2015-): He works on benchmarking and performance issues of blockchain systems, in particular on the consensus model, execution engine, and storage engine. His group designed a comprehensive blockchain benchmarking framework, open-sourced as BLOCKBENCH. ForkBase (the second version of UStore) is a distributed data storage system with rich semantics and a set of properties for supporting next-generation applications. It is a native storage engine designed to provide efficient support and fast development of forking-enabled applications, such as GIT-like versioning, tamper-evident blockchains, collaborative analytics, and OLTP with versioning, with a great degree of flexibility, high-level semantics, and performance. It synthesizes ideas from distributed systems, databases, and security to support efficient forking and execution. Hyperledger++ is used to build MediLOT, a patient-centric healthcare blockchain system that supports decentralized, personalized medicine and healthcare data analytics.
  2. SINGA (2014-): a distributed deep learning platform (indirectly funded by an A*STAR grant and an NRF CRP). Apache SINGA is an open-source distributed training platform for deep learning and machine learning models, developed under the Apache Incubator and designed around three principles: usability, scalability, and extensibility. For usability, SINGA's programming model is easy to follow: users construct their models from Layers and Tensors, and SINGA's runtime takes care of (and is optimized for) distributed execution and communication between nodes. Scalability is achieved by partitioning both the training data and the model, and distributing the training over multiple nodes. The code of SINGA is modular and extensible to support different types of deep learning models, optimization algorithms, and training frameworks, on both CPU and GPU clusters. Apache SINGA (incubating) v1.0 has been released. It includes a healthcare model zoo containing deep learning models that have been used for healthcare research, as well as facilities for porting Caffe models onto SINGA. The team is now working towards an AI-as-a-Service platform to enable exploration, feature selection, and model tuning and validation. Co-Space, an earlier system designed for supporting cross-domain retrieval, led to the development of SINGA.
  3. CIIDAA (2012-2018): Comprehensive IT Infrastructure for Data-intensive Applications and Analysis, a CRP project funded by NRF (NRF-CRP8-2011-08) from 2013 to 2017. The main objective is to use cloud computing to address the big data problem. For specific applications, this approach has been shown to be effective, and systems such as Hadoop have become very popular. However, they have limitations (see the ACM Computing Surveys paper on MapReduce-based systems and the IEEE TKDE survey on in-memory systems), and are suitable only for a class of applications whose structure is amenable to fine-grained asynchronous parallelization. Furthermore, there remain many challenges in using cloud computing systems in practice, including resource contention across multiple jobs run concurrently. The aim of this project is to develop a platform supporting real-time data integration and predictive real-time analytics for web consumers (collaborating with StarHub) and healthcare (collaborating with NUH, the National University Health System).
  4. epiC (2009-2013): an Elastic, Power-aware, data-Intensive Cloud platform, funded by an MOE grant (2010-2012). The objectives are to design and implement an efficient multi-tenancy cloud system supporting high-throughput, low-latency transactions and high-performance, reliable query processing, with online and interactive analytics capability. memepiC (2014-) is an extension of the epiC project focusing on exploiting hardware features, multi-cores, and large memory. Related earlier project: UTab.
  5. LogBase (2012-2016): a distributed log-structured data management system, funded by A*STAR (2013-2016). LogBase adopts log-only storage to handle high append and write loads, such as urban/sensor information processing. Indexing, transaction management, and query processing are the key issues that have been investigated, and the source code has been released. LogBase is related to ongoing research on database support for Energy and Environmental Sustainability Solutions for Megacities.
  6. CDAS (2011-2015): a Crowdsourcing Data Analytics System designed to improve the quality of query results while reducing processing cost. It is built as a crowdsourcing system that provides primitive operators to facilitate the composition of crowdsourcing tasks. Other key issues, such as privacy and applicability, and various applications are being investigated.
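The forking-enabled, append-only storage idea behind ForkBase (and the log-only design of LogBase) can be illustrated with a minimal sketch. This is not the ForkBase or LogBase API: the class and method names below are invented for illustration. It shows only the core concepts the projects describe, namely writes appended to an immutable log, versions chained by hashes (which makes tampering evident), and GIT-like forking by sharing history between branches.

```python
import hashlib

class ForkingStore:
    """Illustrative sketch of a forking-enabled, append-only store.

    Hypothetical names and structure -- not the actual ForkBase/LogBase APIs.
    """

    def __init__(self):
        self.log = []                     # append-only log of entries
        self.branches = {"master": None}  # branch name -> hash of latest entry

    def _hash(self, key, value, parent):
        # Each entry's hash covers its parent's hash, forming a
        # tamper-evident chain (as in a blockchain).
        h = hashlib.sha256()
        h.update(repr((key, value, parent)).encode())
        return h.hexdigest()

    def put(self, branch, key, value):
        parent = self.branches[branch]
        entry_hash = self._hash(key, value, parent)
        # Writes never overwrite: old versions stay in the log.
        self.log.append((entry_hash, key, value, parent))
        self.branches[branch] = entry_hash
        return entry_hash

    def fork(self, src, dst):
        # GIT-like fork: the new branch shares all history with the
        # source; subsequent writes diverge without copying data.
        self.branches[dst] = self.branches[src]

    def get(self, branch, key):
        # Walk the hash chain from the branch head; latest write wins.
        index = {e[0]: e for e in self.log}
        h = self.branches[branch]
        while h is not None:
            _, k, v, parent = index[h]
            if k == key:
                return v
            h = parent
        return None
```

Forking a branch is an O(1) pointer copy, which is what makes versioned applications (collaborative analytics, OLTP with versioning) cheap to support on such a store.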
With the ubiquity of big data and the fusion of applications and technologies, the projects are related in many aspects. Beng Chin approaches research problems and system design with the philosophy that all algorithms and structures should be simple, elegant, and yet efficient, so that they can be easily grafted into existing systems and are implementable, maintainable, and scalable in actual applications; and that all systems must be efficient, scalable, extensible, and easy to use. A good example is his approach to the design of new indexes: they are mainly B+-tree based -- simple and elegant in design, and efficient, robust, and scalable in performance (e.g., TP-index [ICDE 1994], ST B-tree [DKE 1995], iMinMax [PODS 2000], iDistance [VLDB 2001, TODS 2005], B^x-tree [VLDB 2004], GiMP [TODS 2005], ST^2B-tree [SIGMOD 2008, TODS 2010], B^{ed}-tree [SIGMOD 2010], string indexing [TKDE 2014]). Recently, due to changes in hardware architecture and capability, he has been working on an index that is more scalable and efficient for the new environment (PI [CoRR 2015]). Again, the index has to be simple, elegant, and fast!
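The "B+-tree based" design idea can be seen in iDistance: each high-dimensional point is mapped to a one-dimensional key, key = i * c + dist(p, ref_i), where ref_i is the point's nearest reference point and c separates partitions in key space, so that an ordinary B+-tree can index the keys and a range query becomes a few 1-D interval scans (by the triangle inequality, any point within radius r of query q must have a key in [i*c + d(q, ref_i) - r, i*c + d(q, ref_i) + r] for its partition i). The sketch below is illustrative only: a sorted list stands in for the B+-tree, and the function and parameter names are not from the paper.

```python
import bisect
import math

def idistance_build(points, refs, c=1000.0):
    """Map each point to key = i*c + dist(p, ref_i) for its nearest
    reference point ref_i; keep keys sorted (a B+-tree stand-in)."""
    entries = []
    for p in points:
        dists = [math.dist(p, r) for r in refs]
        i = dists.index(min(dists))
        entries.append((i * c + dists[i], p))
    entries.sort(key=lambda e: e[0])
    return entries

def idistance_range(entries, refs, q, radius, c=1000.0):
    """Range query: for each partition i, scan the 1-D key interval
    [i*c + max(0, d(q,ref_i) - radius), i*c + d(q,ref_i) + radius],
    then filter out false positives by exact distance."""
    keys = [e[0] for e in entries]
    out = []
    for i, r in enumerate(refs):
        d = math.dist(q, r)
        lo = i * c + max(0.0, d - radius)
        hi = i * c + d + radius
        for j in range(bisect.bisect_left(keys, lo),
                       bisect.bisect_right(keys, hi)):
            p = entries[j][1]
            if math.dist(p, q) <= radius:
                out.append(p)
    return out
```

The mapping keeps the structure simple (any off-the-shelf B+-tree works unchanged) while the triangle-inequality bound keeps the scans tight, which reflects the simple-elegant-efficient philosophy described above.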

What goes around, comes around ....