School of Computing
 
 


Research Topic - Semi-structured Data Repository and Query Processing


1) There has been increased interest in semi-structured data recently with the introduction of XML and related languages and technologies. Although semi-structured data is claimed to be self-describing, it remains important to define a schema for the data. The data models that have been proposed specifically for semi-structured data (e.g. DOM, DataGuides) capture only limited semantics of the data and do not model the semantics expressed in the schema.

Although XML documents can have rather complex internal structures, they can generally be modeled as ordered trees. In most XML query languages, the structure to be matched is expressed as a twig pattern, while the values of XML elements are used as part of selection predicates. Efficiently finding all matches of a twig pattern in an XML database is a major concern of XML query processing. Among the proposed methods, the holistic twig join approach is regarded as an efficient way to match twig patterns because it reduces the size of intermediate results.
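To make the notion of a twig pattern concrete, the following is a minimal sketch in Python. It is a naive recursive matcher, not a holistic twig join, and it only reports the elements that match the root of the pattern rather than every full match. The names TwigNode, matches and find_twig, as well as the sample document, are assumptions introduced for illustration only.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class TwigNode:
    """One node of a twig pattern (hypothetical helper type)."""
    tag: str
    # Each entry is (axis, TwigNode); axis is "/" (parent-child)
    # or "//" (ancestor-descendant).
    children: list = field(default_factory=list)


def matches(elem, q):
    """Return True if the subtree rooted at elem embeds pattern q at elem."""
    if elem.tag != q.tag:
        return False
    for axis, child_q in q.children:
        if axis == "/":
            candidates = list(elem)  # direct children only
        else:
            # all proper descendants, in document order
            candidates = [d for d in elem.iter() if d is not elem]
        if not any(matches(c, child_q) for c in candidates):
            return False
    return True


def find_twig(root, q):
    """Return, in document order, all elements that match the root of q."""
    return [e for e in root.iter() if matches(e, q)]


# Illustrative document: two books, only the first has an author.
DOC = """<bib>
<book><title>XML</title><author><name>Lu</name></author></book>
<book><title>DB</title><editor><name>Ling</name></editor></book>
</bib>"""
root = ET.fromstring(DOC)

# Twig pattern: book[title][author/name]
pattern = TwigNode("book", [
    ("/", TwigNode("title")),
    ("/", TwigNode("author", [("/", TwigNode("name"))])),
])

titles = [b.find("title").text for b in find_twig(root, pattern)]
print(titles)
```

This naive matcher re-scans subtrees for every pattern edge. Holistic approaches instead process the whole pattern together, which is what allows them to keep intermediate results small.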

The project has so far achieved the following:

  1. We defined a data model called ORA-SS that captures richer semantics. Using this semantically rich data model, we investigate and experiment with more efficient storage mechanisms, identify valid views of the base data, define efficient view maintenance algorithms, and propose methods for efficiently querying a semi-structured data repository.
  2. We extended the existing Dewey node labelling technique and used it to improve twig pattern query processing.
  3. Existing node labelling techniques used in twig pattern processing are designed for static XML data. We developed new node labelling techniques that label dynamic XML data without re-labelling any existing nodes when the XML data is updated.
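
As background for items 2 and 3, the following is a minimal sketch of plain Dewey labelling in Python. It is not the extended Dewey scheme or the dynamic labelling schemes developed in the project. The function names dewey_labels, is_ancestor and is_parent, and the sample document, are assumptions introduced for illustration only.

```python
import xml.etree.ElementTree as ET


def dewey_labels(root):
    """Map each element to a Dewey label: the parent's label
    followed by the element's 1-based position among its siblings."""
    labels = {}

    def visit(elem, label):
        labels[elem] = label
        for i, child in enumerate(elem, start=1):
            visit(child, label + (i,))

    visit(root, (1,))
    return labels


def is_ancestor(a, b):
    """a labels a proper ancestor of b iff a is a proper prefix of b."""
    return len(a) < len(b) and b[:len(a)] == a


def is_parent(a, b):
    """a labels the parent of b iff b is a with exactly one more component."""
    return len(b) == len(a) + 1 and b[:len(a)] == a


# Illustrative document with distinct tags so labels can be looked up by tag.
root = ET.fromstring("<a><b><c/></b><d/></a>")
labels = dewey_labels(root)
by_tag = {elem.tag: label for elem, label in labels.items()}

print(by_tag)
print(is_ancestor(by_tag["a"], by_tag["c"]))  # structural test on labels only
```

Structural relationships are decided from the labels alone, without traversing the tree. With plain Dewey labels, however, inserting a new element before b would shift the labels of b, d and all of their descendants, which is the re-labelling cost that labelling schemes for dynamic XML data aim to avoid.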

 

2) Some publications

Jiaheng Lu, Tok Wang Ling, Chee Yong Chan, Ting Chen: From Region Encoding To Extended Dewey: On Efficient Processing of XML Twig Pattern Matching. VLDB 2005: 193-204

Ting Chen, Jiaheng Lu, Tok Wang Ling: On Boosting Holism in XML Twig Pattern Matching using Structural Indexing Techniques. SIGMOD Conference 2005: 455-466

Changqing Li, Tok Wang Ling: QED: a novel quaternary encoding to completely avoid re-labeling in XML updates. CIKM 2005: 501-508

Changqing Li, Tok Wang Ling, Jiaheng Lu, Tian Yu: On reducing redundancy and improving efficiency of XML labeling schemes. CIKM 2005: 225-226

Changqing Li, Tok Wang Ling, Min Hu: Efficient Processing of Updates in Dynamic XML Data. ICDE 2006

 

3) Collaborations with:

Gillian Dobbie, University of Auckland, New Zealand

 

4) Names of the Faculty Members in the research area

Ling Tok Wang


 
