We are building a suite of tools for structured data analytics, ranging from structured data regularization to structured data modeling and neural architecture search (NAS, the automatic generation of neural networks). The aim is to support easy-to-use, fast and accurate modeling of structured data using deep learning networks. Together with our earlier work on semi-structured healthcare data, we hope to help users and companies tame their data for effective extraction of insights and knowledge.
Fifth-generation (5G) mobile communication technologies are on their way to being adopted as the next standard for mobile networking. We examine the impact of 5G on both traditional and emerging technologies, and explore research challenges and opportunities in computing and data management.
AI- and Data-driven Financial Management and Analytics is a collective of FinTech projects started in 2016 by leading research groups at the National University of Singapore. The objective is to design and implement algorithms and systems that support better financial applications in digital banking. The featured projects include secure, immutable, and verifiable storage; federated learning and analytics with privacy protection; investment and risk management; and fraud detection.
Discover is a static analysis tool that can automatically find security bugs and vulnerabilities in programs written in either general-purpose programming languages, such as C, C++, Java and Go, or domain-specific languages, such as Solidity (for writing smart contracts). Discover is developed in the OCaml programming language.
MLCask is a Git-like end-to-end ML life-cycle management system. In real-world machine learning (ML) applications, maintaining an ML pipeline in a collaborative environment is both important and challenging. The costs of frequent retraining and of asynchronous component updates by different users need to be taken into consideration. MLCask supports both linear and non-linear version control semantics for efficient management of ML pipelines.
Apache SINGA is a distributed deep learning platform (indirectly funded by ASTAR, MOE and NRF CRP grants). It is an Apache Top-Level Project and an open-source distributed training platform for deep learning and machine learning models, designed around four principles: usability, scalability, extensibility and elasticity. Apache SINGA v2.0.0 has AutoML features, a Healthcare model zoo containing deep learning models that have been used for healthcare research, and facilities for porting other models onto SINGA. SINGA-lite, SINGA-easy and SINGA-db are upcoming releases.
Cool is an online cohort analytical processing system that supports various types of data analytics, including cube queries, iceberg queries and cohort queries. The objective of Cool is to provide high-performance (near real-time) analytical responses for the emerging data warehouse domain.
GEMINI is a healthcare AI stack. The team works closely with a number of hospitals to understand their needs, and builds an end-to-end data processing and analytics stack. The GEMINI end-to-end stack supports data cleansing (DICE), crowdsourcing (CDAS), ML-based predictive analytics (SINGA), cohort analysis (CohAna), and data versioning and management (ForkBase). Collaborations include work with five hospitals on prediabetes prevention (e.g. JurongHealth), and with NUH and SGH on various disease-specific predictive analytics (e.g. DPM, AKI, readmission modelling).
ForkBase is a tamper-proof data storage system designed to provide efficient support for, and fast development of, forking-enabled applications such as "GIT-for-Data", tamper-evident blockchain, collaborative analytics and OLTP with versioning. ForkBase is deployed as the storage engine of Hyperledger++. ForkCloud is a GIT-for-Data system that encapsulates data cleansing, crowdsourcing, ML design and testing, and versioning to facilitate AI development on sensitive data.
The group works on benchmarking and performance issues of blockchain systems, in particular the consensus model, execution engine and storage engine. The group designed and open-sourced a comprehensive blockchain benchmarking framework called BLOCKBENCH. FabricSharp is the blockchain backend of MediLOT, a patient-centric healthcare blockchain system that supports decentralized, personalized medicine and healthcare data analytics.
CDAS is a Crowdsourcing Data Analytics System. The objective is to design and implement an effective system that exploits crowd intelligence to improve the performance of different data analytics jobs.
CIIDAA is a large-scale Comprehensive IT Infrastructure for Data-intensive Applications and Analysis.