Research



Overview


My research interests lie at the intersection of information theory, machine learning, and high-dimensional statistics, with ongoing areas of interest including the following:
  • Information-theoretic understanding of statistical inference and learning problems
  • Adaptive decision-making under uncertainty (e.g., Bayesian optimization, bandits)
  • Scalable algorithms for large-scale inference and learning (e.g., group testing, graph learning)
  • Robustness considerations in machine learning
For further details, see the outlines below and my publications page.

If you have been admitted to the NUS PhD program and are looking for a supervisor, feel free to email me to arrange a meeting. Other prospective PhD applicants are also welcome to get in touch, but I apologize that I may not be able to reply to most enquiries. Applications to NUS can be made through the Department of Computer Science, the Department of Mathematics, or the Institute of Data Science.

If you would like to apply for a post-doc or research assistant position, please send me your CV and an outline of your research interests. Applicants should have a strong track record in an area related to my research interests, such as machine learning, information theory, statistics, statistical signal processing, or theoretical computer science.


Research Group

  • Zhaoqiang Liu (post-doc)
  • Daming Cao (post-doc)
  • Qiaoqiao Zhou (post-doc)
  • Thach Bui (post-doc)
  • Eric Han (PhD student)
  • Xu Cai (PhD student)
  • Arpan Losalka (PhD student)
  • Sun Yang (PhD student)
  • Zihan Li (PhD student)
  • Ivan Lau (research assistant)
Former members: Lan Truong (U. Cambridge), Anamay Chaturvedi (Northeastern), Selwyn Gomes (UCSD)


Research Funding

  • (May 2019 - May 2024) Robust Statistical Model Under Model Uncertainty, Singapore National Research Foundation (NRF) Fellowship ($2.29M)
  • (Nov. 2018 - Oct. 2021) Information-Theoretic Methods in Data Science, NUS Early Career Research Award ($500k)
  • (Jan. 2018 - Jan. 2021) Theoretical and Algorithmic Advances in Noisy Adaptive Group Testing, NUS Startup Grant ($180k)

Ongoing Research Projects


Some potential research projects that students and post-docs could pursue with me are listed below; this list is far from exhaustive.

1) Information-theoretic limits of statistical inference and learning problems


The field of information theory was introduced as a means for understanding the fundamental limits of data compression and transmission, and has shaped the design of practical communication systems for decades. This project will pursue the emerging perspective that information theory is not only a theory of communication, but a far-reaching theory of data benefiting diverse inference and learning problems such as estimation, prediction, and optimization. This perspective leads to principled mathematical approaches for certifying the near-optimality of practical algorithms, and for steering practical research towards where the greatest improvements are possible.

Selected relevant publications:

2) Modern methods for high-dimensional estimation and learning


Extensive research over the last one to two decades has led to a variety of powerful techniques for high-dimensional estimation and learning, with the prevailing approach being to introduce low-dimensional modeling assumptions such as sparsity, low-rankness, and graphical model structure. Recently, there has been a paradigm shift towards data-driven techniques, including the replacement of explicit modeling assumptions by implicit generative models based on deep neural networks. In comparison to traditional approaches, this line of work remains in its infancy; this project explores this exciting new research avenue from both a theoretical and a practical perspective.

Selected relevant publications:

3) Robustness considerations in machine learning


Robustness requirements pose many of the most important unsolved challenges in modern machine learning, arising from sources of uncertainty such as mismatched modeling assumptions, corrupted data, and the presence of adversaries. To name a few examples: large distributed learning systems must be able to deal with individual node failures; robotics tasks learned in a simulated environment should be designed to degrade as little as possible when transferred to a real environment; and robustness against adversarial attacks remains a considerable unsolved challenge in deep learning. This project will seek to better understand some of the most practically pertinent sources of uncertainty and develop new algorithms that are robust in the face of this uncertainty, with rigorous guarantees.

Selected relevant publications:

4) Theory and algorithms for group testing


Group testing is a classical sparse estimation problem that seeks to identify "defective" items by testing groups of items in pools, with recent applications including database systems, communication protocols, and COVID-19 testing. A recent line of work has led to significant advances in the theory of group testing, including the development of precise performance limits and practical algorithms for attaining them. This project seeks to push these advances further towards more challenging settings that better account for crucial practical phenomena, including noisy outcomes, testing constraints, prior information, and non-binary measurements.
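To make the basic problem concrete, here is a minimal sketch of noiseless group testing with random pools and the simple COMP decoding rule (any item that appears in a negative pool is cleared; everything else is declared defective). It is an illustration only and is not taken from any particular publication: the function names, the pool-inclusion probability, and the problem sizes are all assumptions chosen for the example.

```python
import random

def run_tests(pools, defectives):
    """Noiseless outcomes: a pool is positive iff it contains a defective item."""
    return [any(item in defectives for item in pool) for pool in pools]

def comp_decode(n_items, pools, outcomes):
    """COMP rule: clear every item seen in a negative pool, declare the rest defective."""
    cleared = set()
    for pool, positive in zip(pools, outcomes):
        if not positive:
            cleared.update(pool)
    return set(range(n_items)) - cleared

def random_pools(n_items, n_tests, p, rng):
    """Place each item in each pool independently with probability p (an assumed design)."""
    return [[i for i in range(n_items) if rng.random() < p] for _ in range(n_tests)]

# Tiny hand-checkable instance: item 2 is defective among items 0..3.
small_pools = [[0, 1], [1, 2], [2, 3]]
small_outcomes = run_tests(small_pools, {2})           # [False, True, True]
small_estimate = comp_decode(4, small_pools, small_outcomes)
# Items 0 and 1 are cleared by the negative first pool; item 3 stays as a false positive.

# Larger random instance (sizes and p are illustrative assumptions).
rng = random.Random(0)
n_items, defectives = 100, {3, 41, 77}
pools = random_pools(n_items, n_tests=40, p=1 / 3, rng=rng)
outcomes = run_tests(pools, defectives)
estimate = comp_decode(n_items, pools, outcomes)
print(sorted(estimate))
```

In the noiseless setting this rule never misses a defective item, but it may return false positives when there are too few tests. Noisy outcomes break the rule entirely, which is one reason the more challenging settings listed above call for different algorithms.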

Selected relevant publications:

5) Theory and algorithms for Bayesian optimization


Bayesian optimization (BO) has recently emerged as a versatile tool for optimizing "black-box" functions, with particular success in automating machine learning algorithms by learning a good set of hyperparameters (e.g., as used for the famous AlphaGo program), as well as other applications such as robotics and materials design. A widespread modeling assumption in BO is that the function is well-approximated by a Gaussian process, whose smoothness properties are dictated by a kernel function. This project seeks to advance the current state-of-the-art theory and algorithms for BO with an emphasis on practical variations that remain less well understood, including model misspecification, adversarial corruptions, and high dimensionality.
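As a rough illustration of the Gaussian process modeling assumption, the following sketch runs a standard upper-confidence-bound (GP-UCB style) selection rule on a one-dimensional grid, using only the Python standard library. It is a toy example under stated assumptions rather than a description of any specific method from my publications: the squared-exponential kernel, its lengthscale, the fixed exploration constant `beta`, the small noise term, and the test function are all chosen purely for illustration.

```python
import math

def rbf(x, y, lengthscale=0.2):
    """Squared-exponential kernel; the lengthscale controls the assumed smoothness."""
    return math.exp(-0.5 * ((x - y) / lengthscale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def gp_posterior(xs, ys, x_new, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at x_new given observations."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    alpha = solve(K, ys)          # K^{-1} y
    v = solve(K, k_star)          # K^{-1} k_*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_new, x_new) - sum(k * w for k, w in zip(k_star, v))
    return mean, max(var, 0.0)    # clamp tiny negative values from round-off

def gp_ucb(f, grid, n_rounds=15, beta=2.0):
    """Repeatedly query the grid point maximizing mean + beta * standard deviation."""
    xs, ys = [grid[0]], [f(grid[0])]
    for _ in range(n_rounds - 1):
        def ucb(x):
            mean, var = gp_posterior(xs, ys, x)
            return mean + beta * math.sqrt(var)
        x_next = max(grid, key=ucb)
        xs.append(x_next)
        ys.append(f(x_next))
    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best]

def black_box(x):
    """Hypothetical objective, standing in for an expensive black-box evaluation."""
    return -(x - 0.7) ** 2

grid = [i / 50 for i in range(51)]
best_x, best_y = gp_ucb(black_box, grid)
print(best_x, best_y)
```

The kernel and its lengthscale encode the smoothness assumption, so a poorly chosen kernel is a simple instance of the model misspecification mentioned above.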

Selected relevant publications: