My research interests fall under the areas of digital
libraries, natural language processing, information retrieval,
and human-computer interaction. Specifically, they include document
structure acquisition, verb analysis, digital library resource
annotation and applied text summarization. My research goal
is to investigate how natural language processing and
information retrieval can be applied to improve scholarly
publication and knowledge discovery.
I run the Web, Information Retrieval /
Natural Language Processing Group (WING) at SoC. We are not
the only group dealing with these topics — Web, DL, IR and
NLP — and our research isn't limited just to these topics,
but it is a good description of the research we do. We have lots
of demos, projects and corpora there, including ones that I have
had a direct hand in coding such as those on webpage
classification, document structure and
reference string parsing. There's also plenty of newer work
done by the students in WING, including a Twitter retweet
predictor and classifier, the Chaptrs photograph organizing
and sharing app (for Macs), the largest SMS corpus (please
contribute!), and the world's best summarization system.
WING is currently affiliated with the China Singapore Institute of Digital
Media (CSIDM) and the NUS-Tsinghua Extreme
Search Centre (NExT).
I'm also a member of, and a potential supervisor for
students in, the NUS Graduate School for
Integrative Sciences and Engineering (NGS).
I also lead our group in collecting resources used to do such
research. Visit the Natural
Language Processing / Information Retrieval research framework
webpage (cte/sunfire) to see what tools we have available and
installed for related research directions and projects.
Conversely, if you're currently doing an FYP or UROP, I've
written some notes on what it's like to
grade them and what you should be doing as students to try to
optimize your grade. If you are doing a thesis proposal as a
Ph.D. student, you might want to read this
- Jin Zhao, Domain Specific Information Retrieval
- Jesse Prabawa Gozali, Intra-event Photo Organization
- Jun Ping Ng, Exploring Temporal Relations in NLP
- Aobo Wang, Informal Chinese Language Processing
- Tao Chen, Weibo Processing
- Xiangnan He, Web 2.0 IR
- Bamdad Bahrani, Reëxamining Slide Alignment
My group also hosts the occasional postgraduate intern from
collaborative projects or one-off internships, which are not
I list WING's graduated graduate students (MS, Ph.D.) here. I
have also directly supervised more than 50 undergraduate projects
and theses. More accurate information about the current
affiliations of our alumni can be found in our LinkedIn
group (viewable only by members). For a more complete list of
past alumni (including undergraduates and system staff), see WING.
- Dr Ziheng Lin, Discourse Parsing, graduated 2012, now a Researcher with SAP.
- Dr Yee Fan Tan, Cost-Sensitive Web-Based Information Acquisition for Record Matching, graduated 2011, now a Chief Systems Architect at KAI Square, Singapore.
- Cong Duy Vu Hoang, Automatic Related Work Summarization, graduated 2010, now a Research Scientist at the Institute of Infocomm Research (I2R), Singapore.
- Dr Long Qiu, Scenario Template Generation, graduated April 2009, now a Research Fellow with the Institute of Infocomm Research (I2R), Singapore.
- Dr Hendra Setiawan, Gapped Constituency Phrase-Based Machine Translation, graduated 2008, now with IBM Research, Watson Labs, New York, NY (formerly a postdoc at the University of Maryland, College Park).
- Dr Hang Cui, Soft Pattern Matching, graduated July 2006, now a Software Engineer - Search Quality with Google, previously with Yahoo! Engineering, Sunnyvale, CA, USA.
I have proposed, managed and collaborated on a number of research grants in Singapore. Here's a non-exhaustive listing of some of my research endeavors. Funding amounts are in Singapore dollars.
- PI, "Data Mining for Supporting Critical Reviews in Evidence Based Nursing" - 98K (2010-2012)
- Co-PI, joint with Philip S Cho (NUS, ARI), Ben Sovacool (NUS, LKYSPP), "Mapping the Technological and cultural landscape of scientific development in Asia" - 225K (2010-2013), from Global Asia Institute
- PI, "Co-training NLP systems and Language Learners" - 234.5K (2008-2014) CSIDM phases I and II
- Co-PI, joint with Tat-Seng Chua and Chew Lim Tan (NUS), "Interactive Media Search" - 1.9M (2007-2010), NRF MDA grant
- Co-PI, joint with Yin Leng Theng, Chunyan Miao (NTU), Ai Chee Tang (SMU) - "Empirical Usability Studies with E-Learning Systems: Towards Executable Cognitive User Models as Design and Usability Evaluation Aids" - 24K (2007), from A*STAR HFE pilot grant
- PI, "Mathematical Equation Indexing, Search and Retrieval" - 39K (2006-2007)
- Co-PI, joint with Chew Lim Tan and Danny Poo (NUS), "Document Information Mining for Digital Libraries" - 23K (2006-2008), from HP Labs
- PI, "Natural Language Query Analysis for Web Queries" - 41K (2006-2007)
- Recipient of 60K (2004), NUS Interdisciplinary Technology Equipment Grant
- PI, "Corpus-Based Query Expansion in Online Public Access Catalogs" - 31K (2003-2006)
- PI, "Towards multi document indicative summarization via automated metadata extraction" - 23K (2003-2006)