Research
Seminars
Title : Learning to Classify Texts Using Positive and Unlabeled Data
Speaker : Dr Li Xiaoli
Date : 3 November 2003
Time : 11am to 12noon
Venue : Video Conference Room, S15-04-30, School
of Computing, NUS
Abstract
In traditional text classification, a classifier is built using
labeled training documents of every class. Now we study a different
problem. Given a set P of documents of a particular class (called
positive class) and a set U of unlabeled documents that contains
documents from class P and also other types of documents (called
negative class documents), we want to build a classifier to classify
the documents in U into documents from P and documents not from
P. The key feature of this problem is that there is no labeled negative
document, which makes traditional text classification techniques
inapplicable. In this paper, we propose an effective technique to
solve the problem. It combines the Rocchio method and the SVM technique
for classifier building. Experimental results show that the new
method outperforms existing methods significantly.
Biography
Li Xiaoli is a research fellow in the National University of Singapore
under the Singapore-MIT Alliance (SMA). He received his Ph.D.
degree in Computer Software Theory from the Chinese Academy of Sciences
in 2001. His research interests include Knowledge Discovery and
Data Mining (Text and Web Mining), Machine Learning, Information
Retrieval, and Bioinformatics. He currently works under the direction of
Associate Professor Leong Tze Yun.
Title : Towards Region Inference for Java
Speaker : Dr Qin Shengchao
Date : 27 October 2003
Time : 11am to 12noon
Venue : Video Conference Room, S15-04-30, School
of Computing, NUS.
Abstract
Region-based memory management offers several important advantages
over a garbage-collected heap, including real-time performance, better
data locality and efficient use of limited memory. The concept of
regions was first introduced for a call-by-value functional language
by Tofte and Talpin, and has since been advocated for imperative
and object-oriented languages. Scoped memory, a lexical variant of
regions, is now a core feature of the recently proposed Real-Time
Specification for Java (RTSJ).
Recent work on region-based programming for Java has focused
on region checking, which requires manual effort in choosing regions
with appropriate lifetimes. In this paper, we make a first attempt
at providing an automatic region-inference type system for a core
subset of Java. To provide an inference method that is both precise
and practical, we support region-polymorphic classes and methods,
as well as region-polymorphic recursion for methods. One challenging
aspect of our inference rules is to ensure safe region programming
(without dangling references) in the presence of class subtyping,
method overriding and downcast operations. Our set of region inference
rules can handle these features safely. We provide solutions for
these in a setting that uses global dependency analysis to support
modular compilation.
Biography
Qin Shengchao is a research fellow in the National University of
Singapore under the Singapore-MIT Alliance (SMA). He received
his BSc and PhD from Peking University, China. He is currently working
on Real-Time Java and bounded scope memory, under the direction
of Assoc. Prof. Chin Wei Ngan. More information can be found on
his homepage: http://www.comp.nus.edu.sg/~qinsc
Title : Harnessing Peers for Managing Distributed Data
Speaker : Mr Ng Wee Siong
Date : 20 October 2003 (Monday)
Time : 11am to 12noon
Venue : Video Conference Room, S15-04-30 (School
of Computing)
Abstract
Peer-to-peer (P2P) computing is the sharing of computer resources,
services and information by direct negotiation and exchange between
autonomous and heterogeneous systems. In this talk, we examine the
issues in P2P distributed data sharing systems and
their possible applications. We will look at the architecture of
BestPeer, a generic P2P platform. We then present the design
and evaluation of PeerDB, a P2P distributed data
sharing system built on top of BestPeer. PeerDB distinguishes
itself from existing P2P systems in several ways. First, it is a full-fledged
data management system that supports fine-grained content-based searching.
Second, it brings the power of mobile agents into P2P systems
to perform operations at peers' sites. Third, the PeerDB network is
self-configurable, i.e., a node can dynamically optimize the set
of peers that it communicates with directly, based on some optimization
criterion. By keeping the peers that provide the most information or services
in close proximity (i.e., in direct communication), network bandwidth
can be better utilized and system performance can be optimized.
Fourth, to the end-user, it provides a keyword-based frontend for
searching data without knowing the database schema.
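The self-configuration idea described in the abstract can be pictured with a small sketch. It assumes, purely for illustration, that the optimization criterion is the number of answers each peer returned for recent queries and that a node keeps a fixed number of direct peers; the function name and data layout are hypothetical and are not taken from BestPeer or PeerDB.

```python
def reconfigure_peers(answer_counts, max_direct_peers):
    """Keep the peers that returned the most answers as direct neighbours.

    answer_counts maps a peer name to the number of answers it returned
    (an assumed optimization criterion). Ties are broken by peer name so
    that the result is deterministic.
    """
    ranked = sorted(answer_counts.items(),
                    key=lambda item: (-item[1], item[0]))
    return [peer for peer, _ in ranked[:max_direct_peers]]


recent_answers = {"peerA": 3, "peerB": 10, "peerC": 0, "peerD": 7}
print(reconfigure_peers(recent_answers, 2))
```

With these made-up counts, the node would keep peerB and peerD as its direct peers and drop the rest.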
Biography
Ng Wee Siong is a research fellow in the National University
of Singapore under the Singapore-MIT Alliance (SMA). His current
research interests cover Peer-to-Peer data management, distributed
query processing and database performance issues. The major results
of his research have been published in conferences such as SIGMOD,
ICDE and WWW. He received his BIT (Bachelor of Information Technology)
from Universiti Malaysia Sarawak (UNIMAS).
Title : Algorithmic Issues in Container Terminal Operations
Speaker : Dr Hu Yahong
Date : 6 October 2003
Time : 11am to 12 noon
Venue : Video Conference Room, S15-04-30, School
of Computing, NUS.
Abstract
The container port is very important to Singapore. In this talk, the
project titled “Development of High Capacity Terminal Simulation
System to Handle Mega-Container Vessels” is introduced. The
objective of this project is to evaluate new container handling
technologies, so that mega-container vessels can be handled efficiently.
Components of the simulation system are described.
The storage yard plays a crucial role in container terminals. Automated Storage/Retrieval
Systems (AS/RS) are introduced to store containers in the yard in
order to meet the throughput requirements of mega-vessels. As conventional
AS/RS is not capable of handling heavy sea containers, we proposed
an AS/RS design with separate vertical and horizontal movement
mechanisms. This so-called split-platform AS/RS (SP AS/RS) offers
high throughput and better fault tolerance, and enables flexible AS/RS
rack configurations. Because SP AS/RS is entirely new compared with
conventional designs, many algorithmic issues deserve further research.
Here, load shuttling algorithms are described in detail.
Biography
Hu Yahong is a Research Fellow in the Singapore-MIT Alliance Program,
National University of Singapore. She received her Ph.D., M.S. and
B.S. from Xi’an Jiaotong University, China in 1999, 1995 and
1992 respectively. Her research interests include modeling and simulation,
distributed resource sharing, and remote collaborative design environments.
Title : Incremental Counting Satisfiability in Real-Time Systems
Speaker : Dr Andrei Stefan
Date : 29 September 2003
Time : 10am to 11.30am
Venue : Video Conference Room, S15-04-30, School
of Computing, NUS.
Abstract
In this paper, we embed the incremental computation of the number
of truth assignments of a clausal formula in the verification of
timing constraints of a real-time system. This will tell us how
"far away" the current specification is from satisfying
the safety assertion. The modification of the specification and/or
safety assertions is useful for incremental debugging, in which
bugs in problematic areas are fixed one at a time until the system
is safe. To illustrate this, the very well-known example of the
railroad crossing will be considered.
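To make the idea of counting truth assignments incrementally concrete, here is a minimal sketch. It is a brute-force illustration and not the algorithm presented in the talk. It relies on the identity count(F and C) = count(F) - count(F and not C), where the negation of a clause C is a conjunction of unit clauses. The function names and the integer encoding of literals (k for variable k, -k for its negation) are assumptions made for this example.

```python
from itertools import product


def count_models(clauses, num_vars):
    """Count satisfying assignments of a clausal formula by enumeration."""
    count = 0
    for bits in product([False, True], repeat=num_vars):
        # A clause holds if at least one of its literals is true.
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            count += 1
    return count


def count_after_adding(clauses, num_vars, new_clause, old_count):
    """Update the model count when new_clause is added to the formula.

    Models lost are exactly those of the old formula that falsify
    new_clause, i.e. that satisfy every negated literal of new_clause.
    """
    negated_units = [[-lit] for lit in new_clause]
    return old_count - count_models(clauses + negated_units, num_vars)


# (x1 or x2) and (not x1 or x3), over three variables
formula = [[1, 2], [-1, 3]]
base = count_models(formula, 3)
# Add the clause (not x2 or not x3) and update the count incrementally.
updated = count_after_adding(formula, 3, [-2, -3], base)
print(base, updated)
```

The enumeration is exponential in the number of variables, so this only illustrates the bookkeeping, not a practical counting procedure.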
Biography
Stefan Andrei is a research fellow in the National University of
Singapore under the Singapore-MIT Alliance (SMA). He received
his B.Sc. and M.Sc. in Computer Science from Iasi University, Romania
and his PhD in Natural Science (Computer Science) from Hamburg University,
Germany. He has received the following scholarships: May
1997-July 1997: DAAD scholarship; May 1998-June 1998: TEMPUS S_JEP
11168-96 scholarship; and September 1998-August 2000: World Bank
Joint Japan Graduate Scholarship Program at Fachbereich Informatik,
Hamburg Universitaet, Germany. He is currently working on formal
languages, compilers and real-time systems, under the direction
of Associate Professor Chin Wei Ngan. More details about Andrei
can be found at http://www.infoiasi.ro/~stefan
Title : Hierarchical Multi-Bottleneck Classification Method And Its Application to Gene Microarray Data
Speaker : Dr Xiong Xuejian
Date : 22 September 2003
Time : 11am to 12noon
Venue : Video Conference Room, S15-04-30, School
of Computing, NUS
Abstract
The recent development of DNA microarray technology is
creating a wealth of gene expression data. Typically these datasets
have high dimensionality and considerable variety. Analysis of DNA
microarray expression data is a fast-growing research area that
interfaces various disciplines such as biology, biochemistry, computer
science and statistics. It has been concluded that clustering and classification
techniques can be successfully employed to group genes based on
the similarity of their expression patterns. Here, a hierarchical
multi-bottleneck classification method is proposed and applied
to classify a publicly available gene microarray expression dataset
of the budding yeast Saccharomyces cerevisiae.
Biography
Xuejian Xiong obtained her Ph.D. degree in Information
System from the School of Electrical & Electronic Engineering, Nanyang
Technological University, Singapore in 2003. She is now a research
fellow of the Singapore-MIT Alliance (SMA) at the National University of
Singapore, under the supervision of Associate Professor Tan Kian
Lee. Her research interests are in bioinformatics, machine learning,
data mining, and pattern recognition.
Title : Research Issues In Question Answering
Speaker : Dr Zhang De
Date : 10 September 2003
Time : 10am to 11am
Venue : Video Conference Room,
S15-04-30, School of Computing, NUS
Abstract
A current information retrieval system or search engine such
as Google performs only "document retrieval", i.e., given
some keywords, it returns only the relevant documents that contain
the keywords. However, what a user really wants is often a precise
answer to a question. For example, given the question "Who
was the first American in space?", what a user really wants
is the answer "Alan Shepard", rather than having to read through
many documents that contain the words "first", "American"
and "space" etc. The focus of current question answering
research is a fully-automatic open-domain question answering system,
which can answer factual questions based on very large document
collections such as the Web.
Biography
Dell Zhang is a research fellow in the National University of Singapore
under the Singapore-MIT Alliance (SMA). He received his BEng
and PhD in Computer Science from Southeast University, Nanjing,
China. He is currently working on information retrieval, machine
learning and data mining, under the direction of Associate Professor
Lee Wee Sun.