  Artificial Intelligence
 

Artificial Intelligence (AI) may be considered one of the newest disciplines in the history of science. The name Artificial Intelligence was coined in 1956, little more than a decade after World War II. Today, AI encompasses a wide range of sub-disciplines, from general topics such as learning and perception to specific areas such as game playing, theorem proving, medical diagnosis, robotic control, and image and text processing. In a sense, AI aims to emulate human intelligence in various problem-solving endeavours involving knowledge processing, such as learning new knowledge, dealing with uncertainty in knowledge, and finding knowledge presented in different formats including texts and images. These examples of knowledge processing correspond to the five areas of AI interest in the School of Computing:

  • Machine Learning
  • Uncertainty Management
  • Image Mining
  • Document Analysis
  • Digital Libraries

Machine Learning

Learning is one of the fundamental capabilities of intelligent systems and has been studied since the dawn of AI. In his landmark 1950 paper “Computing Machinery and Intelligence”, Alan Turing proposed machine learning as a possible way to construct machines that may be able to pass the Turing Test. One of the earliest success stories in AI, Samuel’s checkers player, was constructed with the help of machine learning.

Machine learning methods currently give the best performance in many practical problems in areas such as computer vision, speech, natural language processing and robot control. Along with impressive practical successes, much has been achieved in our understanding of the fundamental issues involved in machines that learn. Learning theory characterises the conditions under which learning can be successful in two main ways: probabilistically, along the lines of Valiant’s Probably Approximately Correct (PAC) framework, and in the limit using recursion-theoretic methods (inductive inference), as initiated by Gold in 1967.
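To make the probabilistic characterisation concrete, the sketch below computes the standard textbook PAC sample bound for the simplest setting: a finite hypothesis class, a learner that outputs any hypothesis consistent with the training data, and a target that lies in the class. The function name and the example numbers are illustrative assumptions, not part of our group's work.

```python
import math


def pac_sample_bound(hypothesis_count: int, epsilon: float, delta: float) -> int:
    """Number of examples sufficient for a consistent learner over a finite
    hypothesis class to reach error <= epsilon with probability >= 1 - delta
    (realizable case): m >= (ln|H| + ln(1/delta)) / epsilon.
    """
    if hypothesis_count < 1:
        raise ValueError("hypothesis_count must be at least 1")
    if not (0.0 < epsilon < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil((math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon)


if __name__ == "__main__":
    # Illustrative values only: 1000 hypotheses, 10% error, 95% confidence.
    print(pac_sample_bound(1000, epsilon=0.1, delta=0.05))
```

The bound grows only logarithmically in the number of hypotheses and in 1/delta, but linearly in 1/epsilon, which is why demanding higher accuracy is the expensive part.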

Our work spans both the theoretical and applied fronts of machine learning. On the theory side, we have been working on non-U-shaped learning, which is currently a much investigated topic in inductive inference. We have also been investigating new approaches such as the inversion of operators in a learning-theoretic context and the modes of data presentation for function learning. On the applied side, we have been looking at the use of unlabeled data in learning, including semi-supervised and unsupervised learning methods. In addition, we have been investigating the use of machine learning for applications such as activity recognition and natural language processing. In natural language processing, our entry in the SemEval 2007 evaluation conference was ranked first in the word sense disambiguation lexical sample task.


Uncertainty Management

Computational technologies for uncertainty management facilitate the identification of relevant information, the acquisition of useful insights, and the implementation of timely actions in complex decision problems. Many recent global events of significant and sometimes catastrophic impact, e.g., business globalisation, disease outbreaks, terrorist attacks, earthquakes and tsunamis, have highlighted the need for decision support systems that can efficiently and systematically help manage uncertainties and minimise surprises. Current technologies cannot adequately address the major challenges brought by substantial contextual, informational and temporal changes in such decision situations. These are the main challenges being addressed at the Medical Computing Laboratory in SoC, with an emphasis on the biomedical domains. Our research group is part of the multidisciplinary Biomedical Decision Engineering (BIDE) team in the University, comprising faculty members, researchers and students from NUS and from various local and overseas institutions and healthcare organisations.

Our long-term research agenda focuses on developing advanced computational technologies to support complex decision making in dynamic environments. Our current research focuses on developing a comprehensive decision making framework that: 1) supports adaptive reasoning in changing, uncertain environments; 2) integrates and learns from distributed information sources; and 3) provides timely decision recommendations with limited resources.

This framework includes: 1) languages for effective specification of decision factors and objectives in a context-sensitive manner, 2) methodologies for reasoning with and learning of decision information from human experts and online databases, and 3) techniques for adapting responses and managing surprises under resource constraints.

Our on-going research projects focus on techniques and systems that provide support for complex medical decisions where relevant information from distributed multimodal sources is integrated to provide recommendations in a timely manner. Our recent foci are:

Decision Modelling with Multimodal Information: Projects in this area aim to combine multimodal information, such as text-based, structured, and/or image data, in support of biomedical decision making. The objective of an initial project is to develop an intelligent human organ segmentation system using 3D medical magnetic resonance images (MRIs) to support medical decision making. We have proposed hybrid image processing algorithms, and evaluated them on a set of kidney images. We continue to explore and develop other image processing algorithms and decision support system models.

Advanced Techniques in Probabilistic Graphical Models: Projects in this area explore and develop analytical techniques in the realm of probabilistic graphical models and influence diagrams. We are currently working on context-aware probabilistic reasoning, multiple-level probabilistic game representation and knowledge discovery using Bayesian learning. The results are being evaluated in selected prototype applications in biomedicine and e-commerce.

Time Critical Decision Modelling: Projects in this area investigate methodologies and build computer-based tools for managing complex decisions under limited resources. Such techniques take into account the dynamic nature of the problem, uncertainties, preferences of decision makers, as well as the time criticality of the problem. They help ensure that the decision models being built are of optimum size for timely recommendations of effective actions. Our ongoing work includes outcome and risk profile analysis, guideline implementation, and learning from imbalanced data in various critical care domains.
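The kind of reasoning these foci build on, combining uncertain evidence with the preferences of a decision maker, can be sketched with a one-test, one-decision model of the sort an influence diagram encodes. This is a minimal illustration only: the function names, probabilities and utility values below are hypothetical and do not come from any of our projects.

```python
def posterior(prior: float, sensitivity: float, specificity: float) -> float:
    """P(disease | positive test) by Bayes' rule."""
    true_pos = sensitivity * prior
    false_pos = (1.0 - specificity) * (1.0 - prior)
    return true_pos / (true_pos + false_pos)


def best_action(p_disease: float, utilities: dict) -> tuple:
    """Pick the action with maximum expected utility.

    utilities maps action -> (utility if diseased, utility if healthy).
    Returns (best action, expected utility of every action).
    """
    expected = {
        action: p_disease * u_sick + (1.0 - p_disease) * u_well
        for action, (u_sick, u_well) in utilities.items()
    }
    return max(expected, key=expected.get), expected


if __name__ == "__main__":
    # Hypothetical numbers chosen purely for illustration.
    utilities = {"treat": (90.0, 70.0), "wait": (20.0, 100.0)}
    p = posterior(prior=0.1, sensitivity=0.9, specificity=0.8)
    print(best_action(0.1, utilities)[0])  # decision before the test result
    print(best_action(p, utilities)[0])    # decision after a positive test
```

With these made-up values the recommended action changes once the test result is taken into account, which is the basic effect that larger, time-critical decision models have to reproduce under resource constraints.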

Highlights of research achievements and potential for commercialisation:

Project ResEasy: Funded by the Infocomm Development Authority (IDA) and The Enterprise Challenge (TEC) in Singapore, this translational research project adopts a new collaborative approach by facilitating a research team, an engineering team, and a clinical team to speed up trial implementation and productisation of our previous research results. The objective is to trial the effectiveness and feasibility of an open, adaptive workbench which implements decision support applications based on a set of generic information management toolboxes that support integration, visualisation, analysis, security and reporting of relevant information. The toolboxes can be generalised to other diseases or conditions directly, and adapted and deployed in multiple sites simultaneously. The ResEasy project initially focuses on facilitating best practices in process management, outcome analysis, and guideline execution in two areas: prospective care of asthma patients, and acute care of Acute Respiratory Distress Syndrome (ARDS) patients.
Collaborators and partners include the Singapore National Asthma Program, Gleneagles Hospital, Hewlett Packard and other engineering firms.

Image Mining

Advances in image acquisition and storage have given rise to huge image databases. Retrieving salient information from these images is a daunting task. Image mining aims to address this problem. Image mining research at the School of Computing stems from our group’s original interests in data mining. Although it is usually used in relation to the analysis of data, data mining, like artificial intelligence, is an umbrella term and is used with varied meanings in a wide range of contexts. Thus, in the context of images, image mining deals with the extraction of knowledge, image data relationships, or other patterns not explicitly stored in the images. It is an interdisciplinary effort that draws on expertise in image processing, information retrieval, data mining, machine learning, databases and artificial intelligence.

The image mining research that is being carried out in the School of Computing focuses on medical images, particularly retina images and brain CT scan images. The main objective is to allow the machine to capture salient features in such images with the view to mine useful information pertaining to medical anomalies.

Retina images provide a window into what is happening inside the human body. Subtle changes in the eye’s retinal vessels can serve as warnings as to whether the patient may be heading towards a stroke. Computers can help trace and track these vessels accurately and quantify the changes in them over time. The RETINA image mining group in the School has been working on retina images over the years and has developed a computer aided screening system for use in polyclinics.

A new image mining application involving brain CT scan images has recently been studied. The current research aims to investigate techniques for fast retrieval of brain CT scan images based on the image content of medical anomalies as well as other textual information associated with the medical conditions. Machine learning paradigms are explored to enable automatic classification of medical images based on image contents and textual data. In addition, text and image mining techniques are also investigated for the training of the machine to do automatic interpretation of image contents.

In the Retina image mining project, we have developed increasingly accurate and robust algorithms to grade the vascular or blood-vessel structure in retinal images automatically. Our approach incorporates techniques from wavelet analysis, texture analysis, and curvature ridge/trench analysis to attain the desired clinical sensitivity. In collaboration with the Department of Ophthalmology and the Singapore Eye Research Institute (SERI), we have developed a user-friendly system called SIVA (Singapore “I” Vessel Assessment) to extract vascular structure information and derive quantitative measures for the description of retinal vessels’ characteristics (Figure 1). The robust system is also flexible and intuitive in gathering feedback for enhancing the accuracy of vessel measurement. We are currently validating the system on approximately 6,000 retinal images from the Singapore Prospective Cohort Study, conducted by the Department of Ophthalmology at NUS, SERI and the Singapore General Hospital.

In addition, we have designed new data mining techniques for the discovery of interesting changes over time. These include a dense periodic pattern miner, a scalable graph miner, and a progressive confident rule miner. In particular, the progressive rule miner looks for rules that capture the state changes of objects leading to a certain end state with increasing confidence. An initial application of this algorithm to the retinal dataset shows that it increases the predictive accuracy for occurrences of maculopathy in diabetic patients.



Figure 1: The SIVA (Singapore “I” Vessel Assessment) System

Document Analysis

Document analysis is the task of examining document content in order to acquire an understanding of its intended meaning. The document content can be in the form of texts, tables, charts, graphics and photographs. Documents may be electronic texts or digitised images. Electronic texts are pervasive in the information world today and can be easily processed by machine, but digitised images are also becoming very popular following recent advances in digital publishing technology. Historically, work in text processing and document image analysis was done quite independently, but in recent years the two communities have developed common interests. The field of information retrieval, which has traditionally dealt with text, has also been looking into information in document images. Conversely, work on web and text documents is now being reported in conferences that traditionally dealt with document images.

The document analysis group in the School of Computing is interested in both electronic texts and images, with the objectives of retrieving relevant documents and extracting textual contents from the documents. We have developed techniques for processing documents across different formats, including pure texts, charts, texts converted by optical character recognition (OCR) with errors, text images with noise and distortion, and multilingual documents.

Text representation is the task of transforming the content of a textual document into a compact representation of its content so that the document may be recognised and classified by a computer. In this research, we have developed a new term weighting scheme based on a Relevance Frequency measure for text classification.
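As a rough illustration of what such a term weighting scheme looks like, the sketch below weights each term by its frequency in the document multiplied by a relevance frequency factor. The text above does not state the formula, so the code assumes the form given in the published literature on relevance frequency, rf = log2(2 + a / max(1, c)), where a and c are the numbers of positive-category and negative-category training documents containing the term. The function names and the toy documents are hypothetical.

```python
import math
from collections import Counter


def relevance_frequency(pos_docs_with_term: int, neg_docs_with_term: int) -> float:
    """Assumed form: rf = log2(2 + a / max(1, c)).

    a = positive-category documents containing the term,
    c = negative-category documents containing the term.
    """
    return math.log2(2.0 + pos_docs_with_term / max(1, neg_docs_with_term))


def tf_rf_vector(doc_tokens: list, pos_docs: list, neg_docs: list) -> dict:
    """Weight each term in doc_tokens by term frequency times rf.

    pos_docs and neg_docs are lists of token sets from the training data.
    """
    weights = {}
    for term, count in Counter(doc_tokens).items():
        a = sum(term in doc for doc in pos_docs)
        c = sum(term in doc for doc in neg_docs)
        weights[term] = count * relevance_frequency(a, c)
    return weights


if __name__ == "__main__":
    # Toy training documents, invented for illustration.
    pos = [{"protein", "binds"}, {"protein", "kinase"}]
    neg = [{"weather", "report"}, {"protein", "shake"}]
    print(tf_rf_vector(["protein", "protein", "report"], pos, neg))
```

Under this assumed form, a term that is concentrated in the positive category receives a larger weight than one spread evenly across both categories, which is how the scheme differs from unsupervised weights such as inverse document frequency.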

In the area of document image analysis, we have developed techniques to correct images distorted by perspective or by warped document surfaces. Two main approaches have been investigated. One is based on the textual content of the document. Here, a curving grid is superimposed over the document in alignment with the distorted text lines. The grid is then regularised, and the textual content straightened in the process. The other approach uses shading information in the document page to model the 3D surface of the document. The 3D model is then subjected to a transformation process to achieve a flattened rendition of the document page (Figure 2). The distortion correction improves text recognition and hence provides for more accurate document information retrieval and extraction.

Another stream of research in document image analysis focuses on identifying the language and script of document content. The identification is based on statistical measures of the image features of characters and scripts. Language/script identification is useful in optical character recognition (OCR) involving multilingual documents: automated identification allows documents to be directed to the respective OCR engines for correct text conversion.

In addition to document text processing, recognition of charts is also being examined in our group. The techniques developed allow us to extract textual information as well as graphical components in charts so that we may derive meaningful interpretation of the charts. A question and answering system is being developed that allows chart information to be incorporated into document textual information to answer questions that involve data implicitly represented in charts.

Highlights of research achievements and potential for commercialisation:

Our technique for text classification based on Relevance Frequency has attracted industry interest. A start-up company is keen to use the technique to perform machine classification of biomedical literature from the PubMed document database.

Our language/script identification technique has been used by a company to develop a document processing system for a government organisation that deals with a large amount of incoming multilingual documents.

Another company is currently exploring the use of our document image analysis techniques to process scanned images of U.S. patent documents. The processing tasks include correction of document skews and distortion, detection and recognition of graphical components, and retrieval of relevant patent documents.

Our text classification method based on Relevance Frequency achieved the best performance, in terms of F-score, in the Protein-Protein Interaction Article Selection sub-task of the BioCreative II evaluation (Critical Assessment for Information Extraction in Biology). The evaluation was held in conjunction with the “Second BioCreAtIvE Challenge Workshop”.

Our group has clinched a prestigious grant – the HP Digital Publishing for University Teaching and Learning grant worth more than US$58,000. The grant is targeted at research activities in developing advanced techniques for document text recognition, storage and retrieval. Only 14 institutions from around the world were selected in 2006 to receive the grant.

Figure 2. (a) Original distorted images; (b) Real shading; (c) Reconstructed shape; (d) Uniform sampled mesh; (e) Textured mesh; (f) Restored image

Digital Libraries

Digital libraries aim to transform the way knowledge is created, transmitted and stored. It is a diverse area with contributors from library sciences, databases, natural language processing, multimedia and information retrieval. Previous efforts in the area have examined the problems of storage and access of vast quantities of human knowledge. While scalability of the storage and retrieval of large data remains a problem, recent efforts are centred on how today’s knowledge workers use and most efficiently access information. Properly integrating information from the World Wide Web into peer-reviewed, manually-selected and authoritative sources available in the digital library is a continued focus.

Our current research applies techniques in machine learning and natural language processing to solve problems in digital libraries and applied information retrieval. While digital library research is very diverse, our focus is on building tools and platforms for the automated, large-scale digital library. We examine the problem of name authority and attribution (e.g., which of the 11 known Wei Wangs is the author of this work?) and large-scale scholarly digital library implementation and fielding. Query analysis is another continuing area of interest, where we bring statistical and syntactic analyses to bear on user queries in the context of library catalogues as well as the Web. Our research on user interface design, in the context of analysing search and browsing interfaces, also helps us develop and understand next-generation methods of information presentation (e.g., using Web 2.0 technologies).

Highlights of research achievements and potential for commercialisation:

Our holistic view of digital library research leads directly to implementation of toolkits and fielded digital library technology. Our research on automated backend implementation in record linkage and terminology handling helps us to conduct more meaningful analysis of scholarly data. Our research on query analysis and user interface design allows us to build real-world, simple and workable interfaces for key application areas in digital libraries such as public access catalogues (Figure 3).

Figure 3: Re-designed prototype library catalogue featuring tabs and overview+detail user interface design

Our current implementation efforts aim to build scholarly libraries for mathematics research, where problems with terminology and equation retrieval arise, and for scholarly presentations, in which alignment and synthetic visual images play a significant role.

Our research has led to two invention disclosures that have attracted industry attention. Our research on synthetic image classification has led to licensing talks with international document processing companies.

While our digital library group is young, we have already established our specialty of cross-disciplinary research. Our publications feature automated analysis of digital data in various modes: web data, image, and query and metadata analysis. We have also established and headed a working group of investigators from several international universities examining issues in scholarly digital anthologies.


The faculty members involved in artificial intelligence research are:

  • HSU Wynne
  • JAIN Sanjay
  • KAN Min Yen
  • LEE Mong Li
  • LEE Wee Sun
  • LEONG Tze Yun
  • STEPHAN Frank
  • TAN Chew Lim


© Copyright 2001-08 National University of Singapore. All Rights Reserved