CS4221 Course Project

Deadlines

  Proposals due (1 page, softcopy): 5 Feb 2015 (Thurs, Week 4)
  Project submissions (report & ppt) due: 26 Mar 2015 (Thurs, Week 10)
  Presentations and demos (if any): April 2, 9, 16 (Thurs, Weeks 11, 12, and 13)
  Late penalty for all submissions: 3 marks a day

This is a team project with 4 students per team. Each project team needs to submit a short proposal (softcopy), which includes the following information, by 5 Feb 2015 (Thurs of Week 4) to dcsltw@nus.edu.sg for approval:
  1. A project title.
  2. Project team members' names and matriculation numbers.
  3. Type of the project, i.e. implementation, survey and comparison, or preliminary research study.
  4. An abstract which briefly describes the project with some references if any.

Projects

You can pick one of the three types of projects as your course project:

  1. Implementation/Programming. The end result of an implementation project should be a self-contained system or component that is thoroughly tested and of real use. You are welcome to work on a system of your own interest with the consent of the lecturer. You are required to implement and evaluate known algorithms. For example, you can implement a CASE tool for database schema design.

    The deliverables: a 15 page report (double spacing and 12 point font, hardcopy) + code (in CD-ROM) + PowerPoint presentation slides (softcopy) + a 20 minute presentation and demo.

  2. Survey and Comparison. You can also perform an EXTENSIVE survey by reviewing existing work on a topic. You are expected to be critical in your analysis, rather than reiterating what the papers have presented. You should choose this type of project only if you are quite familiar with the subfield you wish to survey; otherwise, you are advised not to do it. A survey should follow the style of ACM Computing Surveys. Extensiveness, comprehensibility, and technical worthiness are the major considerations. The survey should be in a shape that can lead to tutorial material or a journal paper.

    The deliverables: a 15 to 20 page report (double spacing and 12 point font, hardcopy) + PowerPoint presentation slides (softcopy) + a 20 minute presentation.

  3. Preliminary Research Study. You can also choose to work on a research topic. Here, you are expected to be original in your ideas (variations/enhancements of existing methods are acceptable). Due to time constraints, you need not perform an extensive study, but a preliminary study (analytical, simulation, or experimental) is necessary. You should discuss the novelty and significance of your ideas or methods with the lecturer.

    The deliverables: a 15 page report (double spacing and 12 point font, hardcopy) + code (if any, in CD-ROM) + PowerPoint presentation slides (softcopy) + a 20 minute presentation.

Note: Project submissions (report & ppt file) are due on 26 Mar 2015 (Thurs of Week 10). All project presentation slide files will be posted on the course website by 31 Mar 2015 (Tue).

Please send soft copies of your project report and presentation slides, etc. to dcsltw@nus.edu.sg. For implementation projects, please send the code (in CD-ROM) to Ling Tok Wang (room COM2 03-01). Please also include your project number in your email Subject, your project report, and ppt file.


Finding a Project Topic

There are many topics that you may choose from, such as CASE tool implementation, schema translation between different data models, semantics discovery in databases, data/schema integration, view updates, XML query processing, keyword search in relational databases and XML databases, etc. A list of suggested topics is shown below, but this is not an exhaustive list.

  1. Relational database schema design CASE tool (e.g. implement Bernstein's Algorithm with some enhancements, such as the LTK Algorithm)
  2. ER Diagram CASE Tool for database schema design
  3. Discover Semantics in Relational Databases
  4. Translating Relational Database Schemas with database instances to ER Schema Diagrams
  5. Converting relational databases with schemas into object relational databases
  6. Data/schema integration on relational databases
  7. Semantics-based data/schema integration on relational databases
  8. Materialized view maintenance (e.g. Sze Eng Koon's papers)
  9. View updates on relational databases
  10. Comparison of commercial object-relational database systems
  11. Keyword search methods in relational databases with and without ORM-semantics (papers by Zeng Zhong, etc.).
  12. Comparison of semistructured data models (e.g. DOM, OEM, DataGuide, ORA-SS, etc.)
  13. Comparison of XML schema definition languages (e.g. DTD, XML Schema, ORA-SS, etc.)
  14. Graphical query languages for relational databases and XML data (e.g. XML-GL and GLASS from Ni Wei's papers, etc.)
  15. View generation on XML data (e.g. Chen Yabing's papers)
  16. XML updates, dynamic XML data, XML view updates
  17. Semantics discovery in data-centric XML documents (i.e. to obtain ORA-SS like schemas)
  18. Semantics-based XML database integration
  19. Normalization theory for semistructured data
  20. XML database design CASE tool
  21. Translating relational databases and schemas into XML databases and schemas
  22. Translating XML databases and schemas using ORA-SS into relational databases and schemas.
  23. Implement a CASE tool for extracting semistructured schemas from XML documents. The semistructured schema can be expressed by a schema summary, DTD, XML Schema, or ORA-SS schema.
  24. Comparison of the methods for storing and querying XML data in XML database systems: XML-enabled database systems and native XML database systems
  25. Survey on commercial XML systems (e.g. Oracle XML-SQL, IBM DB2 XML Extender, Microsoft SQL Server 2000, etc.)
  26. XML query processing (e.g. PathStack, Twig Pattern, Ordered Twig Pattern, Twig Pattern with Negation, etc.) (e.g. Lu Jiaheng's and Yu Tian's papers)
  27. Comparison of node labeling schemes for static/dynamic XML data (e.g. papers by Li Changqing and Xu Liang, etc.)
  28. Survey, implementation, or research on XML keyword search methods based on LCA, SLCA, or MLCA, etc., with or without considering IDREFs (e.g. papers by Chen Bo, Bao Zhifeng, Wu Huayu, Zeng Yong, etc.)
  29. XML keyword search with ORA-semantics (e.g. papers by Thuy Ngoc).
  30. Translating SQL integrity constraints into triggers
  31. The Universal Relation as a User Interface (ref. Ullman Vol. 2 Chapter 17)
  32. A realistic generic dataset generator
  33. ...
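
As an illustration of the kind of building block an implementation project such as topic 1 might start from, the following is a minimal sketch of attribute closure under a set of functional dependencies. Closure computation is commonly used when deriving a minimal cover, a step in synthesis-style schema design. The function names, the representation of dependencies as pairs of frozensets, and the sample dependencies are assumptions made for illustration only; they are not taken from the course material or from any particular paper.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# A functional dependency X -> Y, stored as (X, Y). Hypothetical representation.
FD = Tuple[FrozenSet[str], FrozenSet[str]]


def fd(lhs: str, rhs: str) -> FD:
    """Build an FD from strings of single-letter attribute names, e.g. fd("CD", "E")."""
    return (frozenset(lhs), frozenset(rhs))


def attribute_closure(attrs: Iterable[str], fds: List[FD]) -> Set[str]:
    """Return every attribute functionally determined by attrs under fds."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its left side is covered and it adds something new.
            if lhs <= closure and not rhs <= closure:
                closure |= rhs
                changed = True
    return closure


# Illustrative dependencies: A -> B, B -> C, CD -> E.
EXAMPLE_FDS = [fd("A", "B"), fd("B", "C"), fd("CD", "E")]

if __name__ == "__main__":
    print(sorted(attribute_closure({"A"}, EXAMPLE_FDS)))
    print(sorted(attribute_closure({"A", "D"}, EXAMPLE_FDS)))
```

A full design tool would build on a routine like this to remove redundant dependencies and extraneous attributes before grouping dependencies into relations; those steps are left out of this sketch.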

To choose a topic, you can consult the following publications:

  1. ACM Transactions on Database Systems
  2. IEEE Transactions on Knowledge and Data Engineering
  3. IEEE Transactions on Software Engineering
  4. IEEE Computer
  5. Data & Knowledge Engineering
  6. Proceedings of the Very Large Data Bases Conference
  7. Proceedings of the ACM SIGMOD Conference
  8. Proceedings of the International Conference on Data Engineering
  9. Proceedings of the International Conference on Conceptual Modeling (ER)
  10. DBLP Bibliography Server
  11. ACM SIGMOD Anthology
  12. World Wide Web Consortium (W3C)
  13. OLAP Council
  14. Object Database Management Systems Portal (ODBMS.ORG)
  15. National University Digital Library
  16. CS4221 webpage
  17. Papers on Object-Relationship-Attribute Model for Semistructured Data (ORA-SS)
  18. ...