Research Topic - Semi-structured Data Repository and Query Processing
1)
There has been increased interest in semi-structured data recently with
the introduction of XML and related languages and technologies.
Although semi-structured data is often claimed to be self-describing,
it is still important to define a schema for the data. The data models
that have been proposed specifically for semi-structured data (e.g. DOM
and DataGuides) capture only limited semantics of the data and do not
model the semantics expressed in the schema.
Although XML documents can have rather complex internal structures,
they can generally be modeled as ordered trees. In most XML query
languages, the structure to be matched is expressed as a twig pattern
(a small tree of element names connected by parent-child or
ancestor-descendant edges), while the values of XML elements are used in
selection predicates. Efficiently finding all occurrences of a twig
pattern in an XML database is a major concern of XML query processing.
Among the proposed methods, the holistic twig join approach is regarded
as an efficient way to match twig patterns because it reduces the size
of intermediate results.
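To make the notion of a twig pattern concrete, the following is a minimal
sketch of matching a twig pattern against an ordered tree. It is a naive
recursive matcher written for illustration only, not the holistic twig
join approach mentioned above, and it ignores value predicates. The names
Node, Twig, matches_at and find_matches are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """An element of an XML document modeled as an ordered tree."""
    tag: str
    children: List["Node"] = field(default_factory=list)


@dataclass
class Twig:
    """A twig pattern node. Each branch is (axis, sub-pattern), where
    axis is "/" (parent-child) or "//" (ancestor-descendant)."""
    tag: str
    branches: List[Tuple[str, "Twig"]] = field(default_factory=list)


def descendants(node):
    """Yield all proper descendants of `node` in document (pre)order."""
    for child in node.children:
        yield child
        yield from descendants(child)


def matches_at(node, twig):
    """True if the twig can be embedded with its root mapped to `node`."""
    if node.tag != twig.tag:
        return False
    for axis, sub in twig.branches:
        candidates = node.children if axis == "/" else descendants(node)
        if not any(matches_at(c, sub) for c in candidates):
            return False
    return True


def find_matches(root, twig):
    """Return, in document order, every node where the twig root matches."""
    return [n for n in [root, *descendants(root)] if matches_at(n, twig)]


# Hypothetical document: two books, only the first has an author name.
doc = Node("bib", [
    Node("book", [Node("title"), Node("author", [Node("name")])]),
    Node("book", [Node("title")]),
])

# Twig for the path expression book[title]//name.
twig = Twig("book", [("/", Twig("title")), ("//", Twig("name"))])

hits = find_matches(doc, twig)
print(len(hits))  # 1
```

This matcher re-scans subtrees for every candidate node, which is the
kind of repeated work and intermediate result that join-based approaches
over node labels aim to avoid.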
The project has so far achieved the following:
- We defined a data model called ORA-SS that captures richer
  semantics. Using this semantically rich data model, we investigate
  and experiment with more efficient storage mechanisms, identify
  valid views of the base data, define efficient view maintenance
  algorithms, and propose methods for efficiently querying a
  semi-structured data repository.
- We extended the existing Dewey node labelling technique and used it
  to improve twig pattern query processing.
- Existing node labelling techniques used in twig pattern processing
  are designed for static XML data. We developed new node labelling
  techniques that label dynamic XML data without re-labelling any
  existing nodes when the XML data is updated.
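The labelling ideas in the last two items can be illustrated with a short
sketch. It uses plain Dewey labels (each node's label is its parent's
label plus its position among its siblings), not the extended Dewey
scheme developed in the project. The label_between function is an
assumption made for illustration: it uses a fractional midpoint to show
how a node could be inserted without changing existing labels, and it is
not the encoding used in the publications listed below. All function
names are hypothetical.

```python
from fractions import Fraction


def dewey_labels(tree, label=(1,)):
    """Yield (label, tag) pairs in document order.

    `tree` is a nested tuple (tag, [children]); the i-th child of a node
    labelled L receives the label L + (i,).
    """
    tag, children = tree
    yield label, tag
    for i, child in enumerate(children, start=1):
        yield from dewey_labels(child, label + (i,))


def is_ancestor(a, b):
    """a is a proper ancestor of b exactly when a is a proper prefix of b."""
    return len(a) < len(b) and b[:len(a)] == a


def is_parent(a, b):
    """a is the parent of b when b is a extended by one component."""
    return len(b) == len(a) + 1 and b[:-1] == a


def label_between(left, right):
    """Label for a new node inserted between two adjacent siblings.

    Illustrative assumption: the last component becomes the midpoint of
    the neighbours' last components, so no existing label is modified.
    """
    if left[:-1] != right[:-1] or not left[-1] < right[-1]:
        raise ValueError("left and right must be ordered siblings")
    return left[:-1] + (Fraction(left[-1] + right[-1], 2),)


# Hypothetical document tree.
tree = ("book", [("title", []), ("author", [("name", [])])])
labels = {tag: label for label, tag in dewey_labels(tree)}

print(labels["name"])                                 # (1, 2, 1)
print(is_ancestor(labels["book"], labels["name"]))    # True
print(is_ancestor(labels["title"], labels["name"]))   # False

# Insert a new element between title and author without re-labelling.
new_label = label_between(labels["title"], labels["author"])
print(labels["title"] < new_label < labels["author"])  # True
```

With plain integer Dewey labels, inserting a node before existing
siblings would shift their positions and force their labels, and those
of their descendants, to change. Leaving room between labels, as the
midpoint does here, is one simple way to avoid that.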
2) Some publications
Jiaheng Lu, Tok Wang Ling, Chee Yong Chan, Ting Chen: From Region
Encoding To Extended Dewey: On Efficient Processing of XML Twig Pattern
Matching. VLDB 2005: 193-204
Ting Chen, Jiaheng Lu, Tok Wang Ling: On Boosting Holism in XML Twig
Pattern Matching using Structural Indexing Techniques. SIGMOD
Conference 2005: 455-466
Changqing Li, Tok Wang Ling: QED: a novel quaternary encoding to
completely avoid re-labeling in XML updates. CIKM 2005: 501-508
Changqing Li, Tok Wang Ling, Jiaheng Lu, Tian Yu: On reducing
redundancy and improving efficiency of XML labeling schemes. CIKM 2005:
225-226
Changqing Li, Tok Wang Ling, Min Hu: Efficient Processing of Updates in
Dynamic XML Data. ICDE 2006.
3) List of Collaborations with:
Gillian Dobbie, University of Auckland, New Zealand
4) Names of the Faculty Members in the research area
Ling Tok Wang