General Info

I am currently a Ph.D. candidate in the School of Computing, National University of Singapore. I joined the Database Research Group in August 2011 and am supervised by Professor Beng Chin Ooi.

I obtained my BSc from the Department of Computer Science and Technology, Harbin Institute of Technology, China, in 2011.


As a member of the NUS Database Research Group, I am interested in solving database problems at large data scale. My work mainly focuses on log-structured systems, especially indexing, query processing and data management.


UStore: A Universal Data Storage System

UStore is a distributed data storage system with rich semantics and a set of properties that unify and add value to many classes of today's applications. By keeping the core properties within the storage, UStore is designed to support various applications with reduced development effort while offering flexibility, high-level semantics and performance.

SINGA: A Distributed Deep Learning Platform

SINGA is an open-source Apache Incubator project: a distributed training platform for deep learning models, designed around three principles, namely usability, scalability and extensibility. A variety of popular deep learning models are supported. The SINGA architecture is sufficiently flexible to run synchronous, asynchronous and hybrid training frameworks. I have been one of the main developers since the project started in 2014.

LogBase: Scalable Log-structured Database

The LogBase project aims to develop a scalable log-structured database that supports very high write throughput in addition to other functionalities, including dynamic scalability, multi-version data access, transactional semantics for bundled read and write operations, and fast recovery from machine failures. I joined the LogBase project in August 2011.

epiC: Elastic Power-aware data Intensive Cloud

The epiC project aims to build an elastic, power-aware, data-intensive cloud computing platform for large-scale services, supporting high-throughput, low-latency transactions and high-performance, reliable query processing. It aims to bridge the performance gap between data-intensive analytical jobs and online transactions. I joined the epiC project in November 2011.


  • Sheng Wang, David Maier, Beng Chin Ooi. Fast and Adaptive Indexing of Multi-Dimensional Observational Data. Int'l Conference on Very Large Data Bases (VLDB), 2016.
  • Beng Chin Ooi, Kian-Lee Tan, Sheng Wang, Wei Wang, Qingchao Cai, Gang Chen, Jinyang Gao, Zhaojing Luo, Anthony K.H. Tung, Yuan Wang, Zhongle Xie, Meihui Zhang, Kaiping Zheng. SINGA: A Distributed Deep Learning Platform. ACM Multimedia, 2015.
  • Wei Wang, Gang Chen, Tien Tuan Anh Dinh, Jinyang Gao, Beng Chin Ooi, Kian-Lee Tan, Sheng Wang. SINGA: Putting Deep Learning in the Hands of Multimedia Users. ACM Multimedia, 2015.
  • Jinyang Gao, H.V. Jagadish, Beng Chin Ooi, Sheng Wang. Selective Hashing: Closing the Gap between Radius Search and k-NN Search. ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2015.
  • Sheng Wang, David Maier, Beng Chin Ooi. Lightweight Indexing of Observational Data in Log Structured Storage. Int'l Conference on Very Large Data Bases (VLDB), 2014.
  • Sai Wu, Xiaoli Wang, Sheng Wang, Zhenjie Zhang, Anthony K.H. Tung. K-Anonymity for Crowdsourcing Database. IEEE Transactions on Knowledge and Data Engineering, 15 May 2013 (preprint).
  • Hoang Tam Vo, Sheng Wang, Divyakant Agrawal, Gang Chen, Beng Chin Ooi. LogBase: A Scalable Log-Structured Database System in the Cloud. Int'l Conference on Very Large Data Bases (VLDB), 2012.