Detailed Schedule for CS 6210 (Semester 1 - 2003/2004)

Digital Libraries and Computing for the Humanities

[ * ] - Note: Occasionally, we will hold in-class group discussions on the readings. Make sure to complete the starred readings before class begins, as they will be explicitly discussed.

Quick Links: [ Home ] [ IVLE ] [ Project Info ] [ Schedule ] [ Details ] [ HW 1 ] [ HW 2 ]

(This timetable is subject to change; see the bottom of the page for the last update.)
Week No. Topics and readings Milestones / assignments
S0.
6 Aug
Orientation
Course information, policies and scope. Discussion: What is a digital library?

Readings:

  • Vannevar Bush (1945) As we may think, The Atlantic Monthly (selected parts during class)
  • Lesk (1997), Chapter 1, Evolution of Libraries.
Survey on IVLE.
S1.
13 Aug
Introduction to Library Sciences, Part I
Overview, history and the multifaceted aspects of the physical library, from both a services perspective and a research perspective. Information seeking processes.

Readings:

Propose topic for survey paper.
S2.
20 Aug
Introduction to Library Sciences, Part II
Finishing up the Information seeking process. Reference Interviews and Library Evaluation.

Readings:

  • Papers in your selected survey area.
S3.
27 Aug
Practical Aspects of Information Retrieval
Term Frequency and Inverse Document Frequency. Vector Space Model for Information Retrieval. Practical aspects and implementation details of text retrieval engines.

Readings:

  • Lesk (1997), Chapter 2, Text Access Methods.
  • Lesk (1997), Chapter 5, Sections 5.5-5.8, Knowledge Representation Methods.
  • Witten, Moffat and Bell (1999), Chapter 2, Section 3, Huffman Encoding
  • Witten, Moffat and Bell (1999), Chapter 3.1 - 3.3 (up to Nonparameterized models) & 3.7
  • Witten, Moffat and Bell (1999), Chapter 4.4 - 4.6 (Query processing)
  • Witten, Moffat and Bell (1999), Chapter 5.1 - 5.2 (Index Construction)
Finalize survey paper area.
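As a companion to the S3 topics, here is a minimal sketch of term frequency / inverse document frequency weighting and vector space ranking. It assumes whitespace tokenization and the plain `tf * log(N / df)` weighting; the function names `tf_idf_vectors` and `cosine` and the three toy documents are illustrative and are not taken from the course materials or the assigned texts.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy collection (hypothetical documents, for illustration only).
docs = [
    "digital library retrieval",
    "library cataloging standards",
    "music retrieval in a digital library",
]
vectors = tf_idf_vectors(docs)
```

Because "library" occurs in every toy document, its inverse document frequency is log(1) = 0, so it contributes nothing to any similarity score. Real retrieval engines typically add smoothing and length normalization on top of this basic scheme.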
S4.
3 Sep
Multi-
Exploring the difficulties in exchanging and crosswalking information between formats. Incongruities, granularity differences. Video, audio, geospatial and temporal collections. Domain specific digital libraries and interface needs: law, medicine, botany, astronomy, literature, and scholarly research. Approaches to OCR. Formats: TEI, SGML, XML.

Readings:

  • David Bainbridge, Craig G. Nevill-Manning, Ian H. Witten, Lloyd A. Smith, Rodger J. McNab, (1999) Towards a Digital Library of Popular Music (available from the ACM Digital Library or LINC).
  • Lesk (1997), Chapter 3, Images of Pages.
  • Lesk (1997), Chapter 4, Multimedia Storage and Access.
  • Witten, Moffat and Bell (1999), Chapter 6.1 - 6.2, 6.5 (GIF/PNG section only), 6.6 (JPEG section)
  • Witten, Moffat and Bell (1999), Chapter 7, Section 1
  • Witten, Moffat and Bell (1999), Chapter 8.
S5.
10 Sep
Traditional and automated approaches and cataloguing/indexing services
Principles of cataloging. Historical development and current approaches to concordances (KWIC/QWIC). Studies of existing cataloging standards: Dewey Decimal, Colon Classification, Library of Congress' AACR2. Also: Multimedia, multilingual aspects of cataloging.

Readings:

  • Lesk (1997), Chapter 5, Sections 5.1-5.3, Knowledge Representation Methods.
Extended deadline: Survey paper due.
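To make the keyword-in-context (KWIC) idea from S5 concrete, the following is a minimal sketch of a concordance generator. It assumes whitespace tokenization and a fixed window of words on either side of each hit; the function name `kwic`, the `width` parameter and the sample sentence are hypothetical choices made for illustration.

```python
def kwic(text, keyword, width=3):
    """Return keyword-in-context lines: `width` words on either side of each hit."""
    words = text.split()
    lines = []
    for i, word in enumerate(words):
        # Compare case-insensitively, ignoring trailing punctuation on the token.
        if word.lower().strip(".,;:!?") == keyword.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            lines.append(f"{left} [{word}] {right}".strip())
    return lines


# Hypothetical sample text, for illustration only.
sample = "The library holds books. A digital library holds files."
concordance = kwic(sample, "library", width=2)
```

A printed concordance would normally align the keyword in a central column and sort the lines by the keyword or its right-hand context; that layout step is omitted here.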
S6.
17 Sep
Metadata creation and management
Using metadata. Metadata standards: Dublin Core, AACR2. Warwick Framework. Semantic Web. Also: Multimedia aspects of metadata (MPEG 7).

Readings:

1-page project proposal or project discussion appointment due by Saturday, the 20th.
Homework 1 distributed: Library Evaluation.
S7.
24 Sep
Introduction to bibliometrics
Citation analysis. Social networks, propagation of weights. PageRank and HITS algorithms, hubs and authorities. Case study: CiteSeer algorithm. Filtering and evaluative aspects of the traditional library and computational approaches.

Readings:

Survey paper graded and returned.
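The propagation of weights through a citation graph in S7 can be illustrated with a minimal sketch of PageRank computed by power iteration. It assumes a damping factor of 0.85, a fixed number of iterations, and even redistribution of rank from nodes with no outgoing links; the function name `pagerank` and the four-paper citation graph are hypothetical, and this is not the CiteSeer algorithm.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a graph given as {node: [nodes it cites]}."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        # Each node passes its rank, split equally, to the nodes it cites.
        for node, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[node] / len(targets)
        rank = new_rank
    return rank


# Hypothetical citation graph: A, B and D cite C; C cites A.
citations = {"A": ["C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(citations)
```

In this toy graph, paper C collects weight from three citing papers and passes most of it on to A, so C and A end up ranked above the uncited papers B and D.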
S8.
1 Oct
Usability of OPACs and retrieval engines
Modes of information access: browsing, searching, serendipity and berry-picking search. Behavior of the library patron and differences with web users. Graphical representations of large data sets.

Readings:

  • Lesk (1997), Chapter 7, Usability and Retrieval Evaluation.
  • Hearst, Marti A. (1999) User Interfaces, In Baeza-Yates and Ribeiro-Neto (eds.), Modern Information Retrieval, 1999.
S9.
8 Oct
Computational literary analysis and Digital Library Policy
Federalist Papers (Hamilton and Madison). Fingerprinting and watermarking. Stylistics and text genre classification. Plagiarism detection.
The digital divide and efforts to undo it. Open source licensing. Skywriting and self-archiving. Online publishing, its alternatives, and cost models.

Readings:

  • Shivakumar, Narayanan and Garcia-Molina, Hector (1995), SCAM: A Copy Detection Mechanism for Digital Documents, Proc. of DL 95.
  • Mosteller, Frederick and Wallace, David L. (1963) Inference in an Authorship Problem, J. of American Statistical Association, 58(3) pp. 275-309.
  • Supplemental: Smith, Peter D. (1990) Authorship Studies, In An Introduction to Text Processing, Chapter 12.
  • Supplemental: Biber, Douglas (1989) A typology of English Texts, In Linguistics 27, pp. 3-43.
  • Supplemental: Karlgren, Jussi and Cutting, Douglas (1994) Recognizing Text Genres with Simple Metrics Using Discriminant Analysis, In Proc. of COLING-94.
  • Lesk (1997), Chapters 9-10, Economics and Intellectual Property Rights.
Homework 1 collected.
Homework 2 distributed: Authorship detection.
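As a rough illustration of the stylistic approach behind the S9 authorship topic, the sketch below compares function-word rates between an unattributed text and known writing samples. It is a simplified nearest-profile comparison, not the Bayesian analysis of Mosteller and Wallace and not the Homework 2 specification; the word list `FUNCTION_WORDS`, the function names and the toy samples are all assumptions made for illustration.

```python
from collections import Counter

# Illustrative function-word list (an assumption, not a published feature set).
FUNCTION_WORDS = ("upon", "while", "whilst", "on", "by", "to")


def rates_per_thousand(text, vocabulary=FUNCTION_WORDS):
    """Occurrences of each vocabulary word per 1000 tokens of `text`."""
    tokens = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return {w: 0.0 for w in vocabulary}
    counts = Counter(tokens)
    return {w: 1000.0 * counts[w] / len(tokens) for w in vocabulary}


def closest_author(unknown_text, known_samples):
    """Return the author whose function-word rates are nearest (squared distance)."""
    target = rates_per_thousand(unknown_text)

    def distance(author):
        profile = rates_per_thousand(known_samples[author])
        return sum((target[w] - profile[w]) ** 2 for w in FUNCTION_WORDS)

    return min(known_samples, key=distance)


# Toy samples from two hypothetical authors, for illustration only.
samples = {
    "author_x": "upon the hill upon the road",
    "author_y": "while we wait whilst they go",
}
guess = closest_author("upon reflection the plan rests upon trust", samples)
```

Function words are used because their rates depend little on subject matter, so differences between profiles are more likely to reflect habit than topic. Samples this short are far too small for a reliable attribution in practice.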
S10.
15 Oct
Student Paper Presentations 1
  • Slot 1 (4:00-4:25) - Web Information Seeking (Hui and Shuqiao)
  • Slot 2 (4:25-4:50) - Peer to Peer (Artem and Yingguang)
  • Slot 3 (5:00-5:25) - Clustering (Li and Swee Song)
  • Slot 4 (5:25-5:50) - Intelligent Agents (Wee Hyong and Long)
Homework 1 graded.
S11.
22 Oct
Student Paper Presentations 2
  • Slot 1 (4:00-4:25) - Spatial and Temporal (Steve and Yee Seng)
  • Slot 2 (4:25-4:50) - Music and Speech in DL (Gang and Wendong)
  • Slot 3 (5:00-5:25) - Multimedia Mining and summarization (Edward and Hendra)
  • Slot 4 (5:25-5:50) - Metadata Extraction (Xi and Xiaohang)
Homework 2 collected.
S12.
29 Oct
Project Presentations
No class. Poster presentations in lieu of class.
Final project poster presentation on 8 Nov. Homework 2 graded.

Min-Yen Kan <kanmy@comp.nus.edu.sg>
Created on: Fri Jul 25 09:46:17 2003 | Version: 1.0 | Last modified: Thu Oct 16 09:05:57 2003