Current Incubating Projects



Research inventions have the power to transform the world through knowledge sharing and translation. Our objective is to partner with universities, companies and SMEs to bring inventions to the world.



TezSign is a digital signing platform that aims to disrupt the signing experience and build the next-generation signing app on Web3.0. It provides an easy-to-use interface for users to sign and manage documents. Our underlying technologies are based on blockchain and decentralized identity, which offer flexible authentication, strong privacy protection, and smart document verification, bringing more trust to the signing process in the digital world.



Verazt is a collection of tools that aim to find bugs and vulnerabilities in smart contracts on the Ethereum, Solana, and Hyperledger Fabric blockchains. Our underlying technologies combine static code analysis with dynamic testing through fuzzing. We also support verifying the business logic of smart contracts using formal specifications based on mathematical logic and automated deductive verification.

Current Projects

Structured Data


We are building a suite of tools for structured data analytics, from structured data regularization to structured data modelling and neural architecture search (NAS, the automatic generation of neural networks). The aim is to support easy-to-use, fast and accurate modelling of structured data using deep learning networks. Together with our earlier work on semi-structured healthcare data, we hope to help users and companies tame their data for effective extraction of insights and knowledge.



Falcon is a privacy-preserving, efficient, and incentive-aware federated learning platform. It focuses on cross-silo data collaboration. We treat privacy protection as the top priority and design a set of privacy-preserving machine learning algorithms based on advanced techniques such as partially homomorphic encryption and secure multi-party computation.
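To give a feel for what partially homomorphic encryption offers, the following is a minimal sketch of the textbook Paillier scheme, whose ciphertexts can be added without decryption. It is an illustration only and not Falcon's implementation: the function names are hypothetical, and the tiny primes (61 and 53) are an assumption chosen for readability and provide no security.

```python
import math
import secrets

def keygen(p=61, q=53):
    """Toy Paillier key generation with generator g = n + 1 (insecure key size)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; requires Python 3.8+
    return n, (lam, mu)

def encrypt(n, m):
    """Encrypt an integer 0 <= m < n as c = (n + 1)^m * r^n mod n^2."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(n, private_key, c):
    """Recover m = L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) // n."""
    lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts modulo n."""
    return (c1 * c2) % (n * n)

n, private_key = keygen()
c_sum = add_encrypted(n, encrypt(n, 1200), encrypt(n, 345))
total = decrypt(n, private_key, c_sum)  # the sum, computed on ciphertexts
```

In a cross-silo setting, a property like this lets a coordinator aggregate encrypted values from several parties without seeing any individual contribution.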



MLCask is a Git-like end-to-end ML life-cycle management system. In real-world machine learning (ML) applications, maintaining an ML pipeline in a collaborative environment is both important and challenging. The costs of frequent retraining and of asynchronous component updates by different users need to be taken into consideration. MLCask supports both linear and non-linear version control semantics for efficient management of ML pipelines.



Apache SINGA is a distributed deep learning platform (indirectly funded by A*STAR, MOE and NRF CRP grants). It is an Apache Top-Level Project and an open-source distributed training platform for deep learning and machine learning models, designed around four principles: usability, scalability, extensibility and elasticity. Apache SINGA v2.0.0 has AutoML features, a healthcare model zoo containing deep learning models that have been used for healthcare research, and a facility for porting other models onto SINGA. SINGA-lite, SINGA-easy and SINGA-db are upcoming releases.

Cohort Study


A cohort study is a type of panel study that investigates individuals who share a common characteristic, and it is powerful in identifying key factors when analyzing causal relationships. We are conducting a series of AI- and database-driven cohort studies to efficiently and effectively discover diverse cohorts, analyze interpretable cohort patterns, and leverage significant cohort results for time series data. Cohort studies can assist in diverse applications, including healthcare, fraud detection and financial analysis. More AI-related work on cohort studies is forthcoming.
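The basic idea of grouping individuals by a shared characteristic and following them over time can be sketched in a few lines. The example below is a simplified illustration and not the group's cohort analysis engine: the activity log, the choice of "week of first activity" as the shared characteristic, and the function name are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical activity log: (user_id, week_number) pairs.
events = [
    ("u1", 0), ("u1", 1), ("u1", 2),
    ("u2", 0), ("u2", 2),
    ("u3", 1), ("u3", 2),
]

def cohort_retention(events):
    """Assign each user to the cohort of their first active week, then count
    distinct active users per (cohort, age), where age is the number of
    weeks since the user's first activity."""
    first_week = {}
    for user, week in events:
        first_week[user] = min(week, first_week.get(user, week))

    active = defaultdict(set)  # (cohort, age) -> set of active users
    for user, week in events:
        cohort = first_week[user]
        active[(cohort, week - cohort)].add(user)

    return {key: len(users) for key, users in sorted(active.items())}

table = cohort_retention(events)
for (cohort, age), count in table.items():
    print(f"cohort week {cohort}, age {age}: {count} active user(s)")
```

Comparing rows of such a table shows how behaviour changes with age within a cohort, and comparing columns shows how cohorts that started at different times differ at the same age.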



GEMINI is a healthcare AI stack. Working closely with a number of hospitals, we learn their needs and build an end-to-end data processing and analytics stack. The GEMINI stack supports data cleansing (DICE), crowdsourcing (CDAS), ML-based predictive analytics (SINGA), cohort analysis (CohAna), and data versioning and management (ForkBase). We collaborate with five hospitals on prediabetes prevention (e.g., JurongHealth), and with NUH and SGH on various disease-specific predictive analytics (e.g., DPM, AKI, readmission modelling).



ForkBase is a tamper-proof data storage system designed to provide efficient support for, and fast development of, forking-enabled applications such as "GIT-for-Data", tamper-evident blockchains, collaborative analytics and OLTP with versioning. ForkBase is deployed as the storage engine of Hyperledger++. ForkCloud is a GIT-for-Data system that encapsulates data cleansing, crowdsourcing, ML design and testing, and versioning to facilitate AI development on sensitive data.
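The combination of forking and tamper evidence can be illustrated with a small content-addressed version store, in which every version is identified by the hash of its value together with its parent's identifier. This is a minimal sketch under our own assumptions and does not reflect ForkBase's actual data structures or API; the class and method names are hypothetical.

```python
import hashlib
import json

def version_id(record):
    """Identify a version by the SHA-256 hash of its canonical JSON encoding."""
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()

class VersionStore:
    """Toy versioned store: each version links to its parent by hash."""

    def __init__(self):
        self.objects = {}   # version id -> {"parent": id or None, "value": ...}
        self.branches = {}  # branch name -> id of the branch head

    def put(self, branch, value):
        """Append a new version of `value` to `branch` and return its id."""
        record = {"parent": self.branches.get(branch), "value": value}
        vid = version_id(record)
        self.objects[vid] = record
        self.branches[branch] = vid
        return vid

    def fork(self, source, new_branch):
        """Create a branch that shares all history with `source` (no copying)."""
        self.branches[new_branch] = self.branches[source]

    def get(self, branch):
        """Return the value at the head of `branch`."""
        return self.objects[self.branches[branch]]["value"]

    def verify(self, branch):
        """Walk the history of `branch` and recheck every hash link."""
        vid = self.branches.get(branch)
        while vid is not None:
            record = self.objects[vid]
            if version_id(record) != vid:
                return False  # stored content no longer matches its id
            vid = record["parent"]
        return True

store = VersionStore()
store.put("main", {"x": 1})
store.fork("main", "dev")
store.put("dev", {"x": 2})
```

Because each identifier depends on the parent's identifier, changing any historical value invalidates every later version on every branch that descends from it, which `verify` detects.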



We work on benchmarking and performance issues of blockchain systems, in particular the consensus model, execution engine and storage engine. The group designed and open-sourced a comprehensive blockchain benchmarking framework called BLOCKBENCH. FabricSharp is the blockchain backend of MediLOT, a healthcare blockchain system that is patient-centric and supports decentralized, personalized medicine and healthcare data analytics.



Food(lg) is an efficient and easy-to-use food journaling, nutrition tracking, and analysis platform. Powered by a deep-learning framework and based on standard nutritional guidelines, Food(lg) is designed to record daily nutrient estimates with journal entries to help users achieve a well-balanced diet. Food(lg) maintains a daily record of users' diet and activities. In addition, Food(lg) provides a quick and easy way to review dietary intake and eating plans, and offers smart suggestions based on the records. Food(lg) also supports social networking features for browsing, discovering and sharing food.



As most of today's data lives in the cloud, security is crucial because cloud databases are run by potentially malicious third parties. We are building verifiable database systems to satisfy the demands of emerging applications, such as blockchains, collaborative analytics, and crowdsourcing, where the integrity of the data and of the operations on the data must be guaranteed. In addition to the security guarantee, the systems we build target high performance, scalability and usability.
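One common building block for checking data integrity against an untrusted server is a Merkle tree: the client keeps only a root hash, and the server returns each record with a short proof that the client can verify against that root. The sketch below illustrates this general technique and is not a description of our systems' internals; the function names are hypothetical, the input must be non-empty, and padding odd levels by duplicating the last node is a simplification.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(leaves):
    """Return all tree levels, from leaf hashes up to the single root hash."""
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2 == 1:
            cur.append(cur[-1])  # pad in place so every node has a sibling
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def prove(levels, index):
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))  # (hash, sibling is left?)
        index //= 2
    return proof

def verify(root, leaf, proof):
    """Recompute the root from the leaf and its proof, then compare."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root

rows = [b"row-a", b"row-b", b"row-c"]
levels = build_levels(rows)
root = levels[-1][0]          # the only value the client needs to trust
proof = prove(levels, 2)      # produced by the untrusted server
ok = verify(root, b"row-c", proof)
```

The proof has one hash per tree level, so its size grows logarithmically with the number of records, which is what makes this approach practical for large tables.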



In the wake of rapid advancements in artificial intelligence (AI), we stand on the brink of a transformative leap in data systems. We are building NeurDB, our next-generation data system designed to fully embrace AI design in each major system component and provide in-database AI-powered analytics.



Amid groundbreaking AI advancements, NewsLLM emerges as a pivotal innovation in financial analytics. This advanced stock prediction system leverages AI to transform news and financial data into precise trading decisions and stock forecasts, revolutionizing market analysis and empowering investors.

Past Projects



AI- and Data-driven Financial Management and Analytics is a collection of FinTech projects started in 2016 by leading research groups from the National University of Singapore. The objective is to design and implement algorithms and systems that enable better finance applications in digital banking. The featured projects include secure, immutable, and verifiable storage; federated learning and analytics with privacy protection; investment and risk management; and fraud detection.



CDAS is a Crowdsourcing Data Analytics System. The objective is to design and implement an effective system that exploits crowd intelligence to improve the performance of different data analytics jobs.