• Next-generation sequencing enables us to rapidly obtain billions of DNA sequences, resulting in huge genomic datasets. At the same time, these datasets must be searched efficiently for arbitrary patterns to support effective analysis. To achieve this, we have developed algorithms for indexing sequence data with compressed data structures. We have developed two general aligners, BatMis and BatAlign...
  • A phylogenetic tree is used to study the evolutionary relationships among a set of taxa and is widely used in many areas of biology. We have developed novel methods to construct and compare phylogenetic trees and networks...
  • The most direct mechanism for controlling gene expression is controlling the initiation of transcription. One crucial point in this process is the binding of transcription factors (TFs) to DNA. We have collaborated with biologists to analyze next-generation sequencing data to identify binding sites of different TFs, predict interactions between genomic regions, and predict the 3D chromatin structure of our genome...
  • Mass spectrometry-based proteomics tends to suffer from consistency and coverage issues that urgently need to be addressed. In this project, we aim to tackle these two challenges by proposing approaches that analyze proteomic profiles in the context of biological networks...
  • Plant metabolites are compounds synthesized by plants for essential functions, such as growth and development, and specific functions, such as pollinator attraction and defense against herbivores...
  • Existing work on gene expression profiling falls short in several respects. Hence, we envision an advanced integrated framework to provide biologically inspired solutions...
  • There is a critical need to address the emergence of drug-resistant varieties of the pathogens behind several infectious diseases. We propose a system-based approach to analyze and counter drug resistance in pathogens, with M. tuberculosis as a test case...
  • Interaction data obtained by high-throughput assays may contain as much as 50% false positives and false negatives. Further progress is needed to distinguish between permanent and transient interactions, and to distinguish protein complexes from functional modules...
  • Our current research includes microarray probe design, pattern searching, genome annotation, protein structure prediction, etc.