WONG Lim Soon

KITHCT Chair Professor
Deputy Dean, NUS Graduate School
Director, Integrative Sciences and Engineering Programme

  • B.Sc. (Engineering)(Computing Science) with First Class Honours (Imperial College London)
  • Ph.D. (Computer & Information Science, University of Pennsylvania)

Wong Limsoon is Kwan-Im-Thong-Hood-Cho-Temple Professor in the School of Computing at the National University of Singapore (NUS). He was also a professor (now honorary) of pathology in the Yong Loo Lin School of Medicine at NUS. Before coming to NUS, he was the Deputy Executive Director for Research at A*STAR's Institute for Infocomm Research. He currently works mostly on knowledge discovery technologies and their application to biomedicine. Earlier in his career, he did significant research in database query language theory and finite model theory, as well as significant development work on broad-scale data integration systems. Limsoon is a Fellow of the ACM, inducted for his contributions to database theory and computational biology. His other awards include the 2003 FEER Asian Innovation Gold Award for his work on treatment optimization of childhood leukemias, the 2006 Singapore Youth Award Medal of Commendation for his sustained contributions to science and technology, and the ICDT 2014 Test of Time Award for his work on naturally embedded query languages. He co-founded Molecular Connections in India and has served as its chairman for over a decade, seeing the company grow progressively to some 2,000 information curators, software engineers, and research scientists.

RESEARCH INTERESTS

  • Database Theory and Systems

  • Bioinformatics and Computational Biology

  • Knowledge Discovery and Datamining

RESEARCH PROJECTS

Enabling more sophisticated proteomic profile analysis

Quantitative comparison of samples is central to proteomics. However, biomarkers identified in one batch of samples are often neither consistent nor reproducible in another batch. We developed techniques based on biological networks to identify biomarkers more reproducibly and consistently, and to achieve more reliable proteomics-based diagnosis.


Dealing with confounders in omics analysis

Universality and reproducibility problems are commonly encountered in analyzing omics data, due not only to etiology and human variability but also to batch effects, poor experiment design, inappropriate sample size, and misapplied statistics. Here, we rethink the mechanics of applying statistical tests and design analysis techniques that are robust on omics data.


Transcription factor interaction prediction and classification

Regulatory mechanisms often involve several transcription factors (TFs) binding together and attaching to the DNA as a single complex. However, only a fraction of the regulation partners of each TF is currently known. We developed techniques for predicting physical interactions between TFs, as well as the nature of those interactions (i.e. co-operative, competitive, or other).


From iteration on multiple collections in synchrony to fast general interval joins

The synchrony iterator captures a programming pattern for synchronized iterations. It is a conservative extension that enhances the repertoire of algorithms expressible in comprehension syntax. In particular, efficient general synchronized iterations, e.g. linear-time algorithms for low-selectivity database non-equijoins, become naturally expressible in comprehension syntax.
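As a rough illustration of the kind of synchronized iteration described here, the sketch below performs a band join (a simple non-equijoin) over two sorted lists with a two-pointer sweep. It is a hand-written example, not the synchrony iterator construct from the project itself; the function name `band_join` and the parameter `width` are hypothetical and chosen only for this illustration.

```python
from typing import Iterator, List, Tuple


def band_join(xs: List[float], ys: List[float], width: float) -> Iterator[Tuple[float, float]]:
    """Yield every (x, y) with |x - y| <= width.

    Assumes xs and ys are both sorted in ascending order.
    """
    start = 0  # index of the first y that can still match the current or a later x
    for x in xs:
        # ys below x - width cannot match this x or any later (larger) x,
        # so the lower bound only ever moves forward.
        while start < len(ys) and ys[start] < x - width:
            start += 1
        # Scan forward from the lower bound while y stays inside the band.
        j = start
        while j < len(ys) and ys[j] <= x + width:
            yield (x, ys[j])
            j += 1


pairs = list(band_join([1, 5, 9], [0, 2, 4, 6, 10], 1))
print(pairs)
```

Because the lower bound never moves backwards, the work is proportional to the sizes of the two inputs plus the number of pairs produced, which is what makes this style of join attractive when few pairs match.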

TRL 4

Recovering missing proteins based on biological complexes

We propose a novel ranking strategy for missing-protein recovery based on protein complexes. We postulate that protein complexes provide a good context for inferring a protein's presence and abundance. Notably, the strategy can predict whether a protein is present even when only one sample is available.
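The idea of using complexes as context can be illustrated with a deliberately simple heuristic, which is not the published method: score each unobserved protein by the largest fraction of observed members among the complexes it belongs to, then rank by that score. The function name `rank_missing_proteins`, the coverage score, and the toy data are all assumptions made for this sketch.

```python
from typing import Dict, List, Set, Tuple


def rank_missing_proteins(
    complexes: Dict[str, Set[str]], observed: Set[str]
) -> List[Tuple[str, float]]:
    """Rank unobserved proteins by the best coverage of any complex containing them.

    Coverage of a complex = fraction of its members observed in the sample.
    This is an illustrative heuristic only.
    """
    scores: Dict[str, float] = {}
    for members in complexes.values():
        if not members:
            continue
        coverage = len(members & observed) / len(members)
        # Every unobserved member inherits the complex's coverage as evidence.
        for protein in members - observed:
            scores[protein] = max(scores.get(protein, 0.0), coverage)
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy data: a single sample's observed proteins and three hypothetical complexes.
toy_complexes = {"C1": {"A", "B", "C", "D"}, "C2": {"E", "F"}, "C3": {"D", "G", "H"}}
toy_observed = {"A", "B", "C", "E"}
ranking = rank_missing_proteins(toy_complexes, toy_observed)
print(ranking)
```

Only one sample's list of observed proteins is needed, which mirrors the single-sample applicability noted above.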

TRL 4

RESEARCH GROUPS

TEACHING INNOVATIONS

SELECTED PUBLICATIONS

  • Limsoon Wong. A dichotomy in the intensional expressive power of nested relational calculi augmented with aggregate functions and a powerset operator. Proceedings of 32nd ACM Symposium on Principles of Database Systems, pages 285-295, New York, June 2013.
  • Stefano Perna, Val Tannen, Limsoon Wong. Iterating on multiple collections in synchrony. Journal of Functional Programming, 32:e9, July 2022.
  • Wilson Wen Bin Goh, Limsoon Wong. Advancing clinical proteomics via analysis based on biological complexes: A tale of five paradigms. Journal of Proteome Research, 15(9):3167-3179, July 2016.
  • Weijia Kong, Bertrand Jernhan Wong, Huanhuan Gao, Tiannan Guo, Xianming Liu, Xiaoxian Du, Limsoon Wong, Wilson Wen Bin Goh. PROTREC: A probability-based approach for recovering missing proteins based on biological networks. Journal of Proteomics, 250:104392, January 2022.
  • Wilson Wen Bin Goh, Limsoon Wong. Why breast cancer signatures are no better than random signatures explained. Drug Discovery Today, 23(11):1818-1823, November 2018.
  • Wilson Wen Bin Goh, Chern Han Yong, Limsoon Wong. Are batch effects still relevant in the age of big data? Trends in Biotechnology, 40(9):1029-1040, September 2022.
  • Mohammad Neamul Kabir, Limsoon Wong. EnsembleFam: Towards more accurate protein family prediction in the twilight zone. BMC Bioinformatics, 23:90, March 2022.

AWARDS & HONOURS

  • Fellow of the ACM, 2013

  • ICDT 2014 Test of Time Award, 2014 (with Peter Buneman and Val Tannen)

  • Asian Innovation Award (Gold), 2003 (with Allen Yeoh, Huiqing Liu, and Jinyan Li)

MODULES TAUGHT

  • CS2220 Introduction to Computational Biology
  • CS6222 Advanced Topics in Computational Biology