Jun SUN     

Research Staff

School of Computing
National University of Singapore
Computing 1 (COM1)
13 Computing Drive
Singapore 117417 
Republic of Singapore

Tel: (65)65162784
Email: sunjun (AT) comp.nus.edu.sg

BIO
I am a research staff member in the School of Computing (SoC), National University of Singapore (NUS), working with Prof. Chew Lim Tan. I received my Ph.D. from the same school in 2011 and my B.Sc. in Computer Science from Harbin Institute of Technology in 2006. My research interests include Machine Translation, Natural Language Processing and Statistical Machine Learning.

Project
  • Kernel Engineering on Parse Trees (Current)
    Our ACL 2009 paper verified the effectiveness of the tree sequence structure in translation equivalence modeling. In this work, we examine tree sequence based features in a wider range of NLP applications. We propose tree sequence based kernels which, in addition to the single-subtree features explored by traditional tree kernels, can capture the structure of a subtree sequence, whether contiguous or non-contiguous. This study aims to offer a new view of structural features in NLP.

    For more details about it, see Tree Sequence Kernel for Natural Language and my doctoral dissertation (coming soon).
  • Syntactic Structure Alignment for SMT (April 2009 - Dec. 2010)
    Most current work in SMT obtains translational equivalences by first conducting word alignment on the plain parallel corpus and then extracting the translational equivalences that are consistent with the word alignment. A decent word alignment is therefore a prerequisite. This pipeline approach has been argued to be vulnerable to errors from the initial word alignment stage. Researchers currently address the problem mainly by improving word alignment. As an alternative, we attempt to conduct syntactic structure alignment directly to obtain syntactic translational equivalences.

    For more details about it, see Exploring Syntactic Structural Features for Sub-Tree Alignment using Bilingual Tree Kernels and Discriminative Induction of Sub-Tree Alignment using Limited Labeled Data.
  • Pisces decoder (August 2007 - March 2009)
    We proposed a series of decoders based on synchronous grammars (STSG, STSSG, SncTSSG) and implemented them in the Pisces framework.

    For more details about it, see A Tree-to-Tree Alignment-based Model for Statistical Machine Translation and A non-contiguous Tree Sequence Alignment-based Model for Statistical Machine Translation.
  • IWSLT2007 (May 2007 - July 2007)
    During my internship at the Institute for Infocomm Research (I2R) in 2007, I contributed to I2R's entry in the IWSLT 2007 evaluation campaign, which took 1st place out of 15 participants in the Chinese-English task, ahead of the second-place system by more than 3 BLEU points.

    For more details about it, see I2R Chinese-English Translation System for IWSLT 2007.
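
The word-alignment pipeline described under Syntactic Structure Alignment for SMT can be illustrated with a small sketch. It shows the standard textbook consistency criterion for extracting phrase pairs from a word alignment, and how one wrong alignment link changes what is extracted. This is not the Pisces decoder or the sub-tree alignment method from the papers above; the function names, the `max_len` limit and the toy sentence pair are assumptions made for illustration.

```python
from typing import List, Set, Tuple

# A word alignment is a set of (source index, target index) links.
Alignment = Set[Tuple[int, int]]


def is_consistent(alignment: Alignment, s_start: int, s_end: int,
                  t_start: int, t_end: int) -> bool:
    """A span pair is consistent if it contains at least one link
    and no link connects a word inside one span to a word outside the other."""
    has_link = False
    for s, t in alignment:
        s_in = s_start <= s <= s_end
        t_in = t_start <= t <= t_end
        if s_in != t_in:      # link crosses the span boundary
            return False
        if s_in:              # here t_in is also True
            has_link = True
    return has_link


def extract_phrase_pairs(src: List[str], tgt: List[str],
                         alignment: Alignment,
                         max_len: int = 4) -> List[Tuple[str, str]]:
    """Enumerate all span pairs up to max_len words (hypothetical limit)
    and keep those consistent with the alignment."""
    pairs = []
    for s_start in range(len(src)):
        for s_end in range(s_start, min(s_start + max_len, len(src))):
            for t_start in range(len(tgt)):
                for t_end in range(t_start, min(t_start + max_len, len(tgt))):
                    if is_consistent(alignment, s_start, s_end, t_start, t_end):
                        pairs.append((" ".join(src[s_start:s_end + 1]),
                                      " ".join(tgt[t_start:t_end + 1])))
    return pairs


if __name__ == "__main__":
    src = ["wo", "ai", "ni"]          # toy example, not from any corpus
    tgt = ["I", "love", "you"]
    correct = {(0, 0), (1, 1), (2, 2)}
    noisy = {(0, 0), (1, 2), (2, 1)}  # two links swapped by an alignment error
    print(extract_phrase_pairs(src, tgt, correct))
    print(extract_phrase_pairs(src, tgt, noisy))
```

With the noisy alignment, the wrong pair ("ai", "you") is extracted and the correct pair ("ai", "love") is lost, which is the kind of error propagation that motivates aligning syntactic structures directly.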


Publication
Professional Activities
Misc
Last modified @ Jan 18 12:40 2012