==================================== List of Research Projects Leong Hon Wai, RAS-Group, SoC, NUS ==================================== Contact: leonghw at comp.nus.edu.sg url: http://www.comp.nus.edu.sg/~leonghw/ We do both fundamental and applied research in combinatorial optimization. The application areas are in computational biology, transportation logistics, resource allocation and scheduling, and other optimization problems. We look for students who are algorithmically/mathematically inclined and who can do good software development. We have a very strong team culture and look for students who are team players. This list contains some of the projects. There are more projects of a similar nature. Email me if interested. Title: ------ Algorithms for peptide sequencing via tandem mass spectrometry (computational biology) Short Description: ------------------ Peptide sequencing is the problem of identification of proteins (determining the sequence of a protein) and recent technological advances in tandem mass spectrometry has made it the method of choice for high-throughput identification of proteins. (See [1], [2], [3] and others for quick introduction.) Recently, we initiated a project on multi-charge peptide sequencing (MCPS) that focusses on mass spectra with multiple charge (> 2). We showed in [1] that significant performance gain can be achieved by considered multi-charge peaks during the peptide sequencing process. There are several possible PhD projects in this area: (1) One project deals with improved algorithms for de novo sequencing for multi-charge mass spectra. (2) Another project deals with more precise peak annotation in multi-charged mass spectra. (3) Another project deals with the important problem of detection of post translation modifications (PTMs). This problem has important implications in the study of translational medicine. A collaborative project with an overseas partner is currently being pursued for this project. We are looking for students who are strong in design of algorithms and mathematical analysis. Currently, two PhD students are working in this area in my research group. This is joint research with Prof. Pevzner at UCSD and Prof Haixu Tang at Indiana University. References: [1] KetFah Chong, Kang Ning and Hon Wai Leong, "Characterization of Multi Charge Mass Spectra for Peptide Sequencing", APBC-2006, (2006), pp. 109-119. [2] Vineet Bafna, Nathan Edwards: "On de novo interpretation of tandem mass spectra for peptide identification.", RECOMB 2003: 9-18 [3] Ari Frank and P.A. Pevzner, "De Novo Peptide Sequencing via Probabilistic Network Modelling." Anal. Chem., 77, pp. 964-973, 2005. Number of RS needed: 2 Title: ------ Fragment Assembly using very short fragments (Computational Biology) Short Description: ------------------ Fragment assembly is an important genome sequencing problem in computational biology. A highly successful approach to fragment assembly is recently proposed by Pevzner et al in [1] and [2]. This project deals with consider fragment assembly using very short fragments (about 100bp each) produced by new fragment generation technology. For these short fragments, frequent repeats in DNA sequences causes problems in the assembly. In this project, we aim to seek efficient solution to this problem. We are looking for students who are strong in design of algorithms and mathematical analysis. Currently, two students are working in related area in my research group. This project is a joint project with Prof Haixu Tang in Indiana University. References: [1] Pevzner PA, Tang H, Waterman MS., "An Eulerian path approach to DNA fragment assembly." Proc Natl Acad Sci, USA 2001 Aug 14;98(17):9748-53. [2] Pevzner PA, Tang H, "Fragment Assembly with Double-barreled Data," BioInformatics, Vol 17, Supp 1, (2001), pp. S225-S233. Number of RS needed: 1 Title: ------ Rearrangements in genomes with unequal content (computational biology) Short Description: ------------------ Whole-genome rearrangement studies are typically separated into two steps: (i) identification of large blocks of sequence shared by the set of genomes; and (ii) comparison of the respective arrangements of these blocks. When studying many very different genomes simultanously, it becomes difficult to identify large blocks in step (i) simply because there are limited number of elements common to all genomes. In other words, many of the similarities that are identified by pairwise comparisons are dropped in step (i) when we restrict to elements common to all genomes. In the past, we have worked on an algorithm to compare the respective arrangements of blocks in multiple genomes (the program is called MGR, Bourque and Pevzner 2002). The algorithm is a heursitic that seeks the most parsimonious rearrangement scenario that best explains the observed arrangements. MGR relies on a polynomial time algorithm to compute the pairwise distance between 2 multichromosomal genomes (Pevzner and Hannenhalli 1995, Tesler 2002). Instead of working on a set of blocks common to all genomes, we would like to adapt the algorithm to work on the different sets of pairwise blocks. Given that MGR already only relies on pairwise comparisons to identify rearrangements implies that this modification is very accessible. The main challenge is to describe how these pairwise blocks are modified by the rearrangements and how to deal with boundary ambiguities since a rearrangement is defined for different sets of blocks on the same genome. One approach would be to add virtual markers to account for missing markers in some of the genomes. Not restricting to common blocks will retain much more information and we expect the recovered scenarios to be more accurate. It will also open new applications to include more genomes, distant genomes and genomes with low quality map or sequence. These developments would be useful not only in conjunction with MGR but also for any other multiple genome rearrangement study. Number of RS needed: 1 Title: ------ Algorithms for Phylogenetic Tree Reconstructions (in Computational Biology) Short Description: ------------------ A phylogenetic tree (or evolutionary tree) is a tree representation of the evolutionary history of a set of species. Over the past decade, there has been intensive reesarch in algorithms for reconstructing phylogenetic tree. A number of different techniques have been used. In this project, we aim to design efficient algorithms for phylogenetic tree reconstruction. The project starts with an up-to-date survey of existing techniques for phylogenetic tree reconstruction and a comparative study of their relative strengths and weaknesses. Number of RS: 1 For other related research topics in computational biology, please come by to see me in S16 06-01. (For more projects, see http://www.comp.nus.edu.sg/~leonghw, ---> under Research Project) Title: ------ Efficient Algorithms for Berth Allocation Planning Problems Short Description: ------------------ The complex operations involved in the management and operation of a world-class container transshipment port in Singapore are complex and involves many different resources and systems. In this project, we undertake to study the problem of planning of some of these operations to improve their effectiveness. Initial study with the berth allocation planning system (BAPS) has produced interesting research problems and results. Some examples of these are the berth partitioning problem, the berth assignment problems. This project aims to extend the current BAPS research with the study of more efficient algorithms for solving these and other related optimization problems in container port operation. Number of RS: 1 Title: ------ Multi-Agent Approach to Combinatorial Optimization (2 students) Short Description: ------------------ The majority of past approaches to combinatorial optimization use a centralized decision making framework where the algorithm is aware of all relevant problem parameters. In this project, we study the an interesting new approach to combinatorial optimization -- the multi-agent approach in which a distributed decision making framework is employed. Such a distributed framework is more flexible and is able to incorporate new changes and new constraints in the dynamic business environment of today. In our multi-agent approach, we model the entities in the problem using software agents, each of which has only a restricted picture of the entire state of the system. Thus, the agents cooperate as well as compete for scarce resources while optimizing some global objective functions. The multi-agent approach has been successfully applied to two logistics problems -- the vehicle routing problem and the inventory routing problem. In this project, we develop a multi-agent algorithm for solving a scheduling problem that involves multiple entities and scarce, shared resources. The challenge will be to appropriate model the various entities using appropriate agents with associated properties, believes and operations. We then explore issues such as the influence of local information, and the appropriate distributed decision making framework. This project will also involve the implementation of the multi-agent approach and benchmarking against the state-of-the-art approaches. Two starting reference: Yizhi Lao, and Hon Wai Leong, "A Multi-Agent Based Approach to the Inventory Routing Problem", Pacific-Rim Intl. Conf. on Artificial Intelligence (PRICAI-02), (2002), LNAI-2417, pp 345-354. Hon Wai Leong and Ming Liu, "A Multi-Agent Algorithm for Vehicle Routing Problem with Time Windows", ACM Symposium on Applied Computing (SAC-2006), AIMS Track, (2006), pp. 106-111. Number of RS: 1 For other related research topics in logistics, please come by to see me in S16 06-01. (For more projects, see http://www.comp.nus.edu.sg/~leonghw, ---> under Research Project) Research Scholars Currently Supervised: --------------------------------------- Ng Hoong Kee, PhD (Jan 2000 -- ) email: nghoong Topic: Multi-Point Range Query: Algorithms and Applications Chong Ket Fah, PhD (Jan 2002 -- ) email: chongket Topic: Computational Proteomics Ning Kang, PhD (Jul 2002 -- ) email: ningkang Topic: Computational Proteomics, SCS related problems Recently Graduated Students: ---------------------------- Tan Jing Song, MSc (Jul 2002 -- Jul 2003) (now doing PhD at U-Penn) Topic: Efficient Algorithms for Dynamic Route Advisory Problems Li Shuai Cheng, MSc (July 2001 -- Jul 2002) (now doing PhD at Waterloo) Topic: Efficient Algoritms for the Berth Assignment Problem Lao Yizhi, MSc (July 2000 -- July 2001) Topic: A Multi-Agent Based Approach to the Inventory Routing Problem. Ong Tat Wee, MSc (2000) Topic: A Graph Partitioning Algorithm for the Berth Allocation Problem