CCIPX: Curating and Crowdsourcing Information using Probabilistic XML

What is CCIPX?

CCIPX is a collaborative research project between Institut Mines-Telecom, Hanoi University of Science and Technology, and the National University of Singapore. CCIPX is funded by the French Ministry of Foreign Affairs under the STIC-Asia program.


As transparent decision-making on a global scale becomes more difficult, organizations turn to resources made available by the information infrastructure: Web sources and crowds. Yet no tool is readily available to support the task of collecting, integrating, annotating, validating, analyzing, and querying such heterogeneous and disparate data. As a practical example, imagine a tool that would allow the creation and maintenance of an open database of maritime information (ships, harbors, international traffic, etc.) used by industry and researchers for data analysis. Data about countries, harbors, schedules, etc. could be gathered from existing Web sources and then completed, updated, and evaluated by individuals (in a Wikipedia-like model).

Creating a curated database in this way means collecting, annotating, and validating data that comes either from autonomous online resources or from crowds. This requires data models, tools, and techniques that offer the flexibility necessary to manipulate semi-structured and uncertain data. This is the idea underlying probabilistic XML data models, which compactly store semi-structured information together with a probabilistic assessment of the uncertainty in the data.
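To make the idea of a probabilistic XML document concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the model chosen by the project: the `prob` attribute, the sample ship data, and the helper names `edge_prob`, `marginals`, and `prob_some` are all hypothetical. The sketch assumes the simplest kind of model, in which each element exists independently with the given probability whenever its parent exists.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: "prob" is an assumed attribute giving the probability
# that an element exists, given that its parent exists (default 1.0).
DOC = """
<ship name="Example Vessel">
  <harbor prob="0.7">Singapore</harbor>
  <harbor prob="0.4">Da Nang</harbor>
  <schedule prob="0.5">
    <arrival prob="0.8">2013-03-14</arrival>
  </schedule>
</ship>
"""

def edge_prob(node):
    """Conditional probability that `node` exists, given its parent exists."""
    return float(node.get("prob", "1.0"))

def marginals(node, parent_prob=1.0, path=""):
    """Yield (path, text, probability) for every element in the tree.

    Under the independence assumption, the marginal probability of an
    element is the product of the edge probabilities on its root path.
    """
    p = parent_prob * edge_prob(node)
    here = f"{path}/{node.tag}"
    yield here, (node.text or "").strip(), p
    for child in node:
        yield from marginals(child, p, here)

def prob_some(node, tag):
    """Probability that at least one `tag` element exists in the subtree
    rooted at `node`, given that `node` itself exists."""
    if node.tag == tag:
        return 1.0
    none = 1.0  # probability that no child subtree contains a match
    for child in node:
        none *= 1.0 - edge_prob(child) * prob_some(child, tag)
    return 1.0 - none

root = ET.fromstring(DOC)
for path, text, p in marginals(root):
    print(f"{path:25s} {text:12s} {p:.2f}")
print("P(some harbor is known) =", round(prob_some(root, "harbor"), 2))
```

The single annotated tree stands for many possible ordinary XML documents at once, which is what makes the representation compact. Richer probabilistic XML models also allow mutually exclusive alternatives or correlated choices, which this sketch does not cover.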

In this project, we plan to study the requirements that the application scenario described above imposes on the choice of an appropriate probabilistic XML model. We plan to design and implement algorithms, techniques, and tools for the creation, extraction, integration, annotation, validation, analysis, and search of probabilistic XML data. The objective is to design and implement a prototype platform for the creation and maintenance of databases curated by experts and by crowds.


Pierre Senellart

Bogdan Cautis

Talel Abdessalem

Mouhamadou Lamine Ba

Trinh Vu Tuyet

Nguyen Hong Phuong

Stéphane Bressan

Ling Tok Wang

Tang Ruiming

Sebastien Montenez

Antoine Amarilli

Dongxu Shao



First Meeting in Singapore from March 14th to March 15th, 2013

Second Meeting at SoICT 2013 in Da Nang, Vietnam, in December 2013

Third Meeting at Uncrowd 2014 in Bali, Indonesia, in April 2014