We will be using the Integrated Virtual Learning
Environment (IVLE) for forum discussions, announcements, and other
time-sensitive materials. Basic course administration, lecture
and tutorial notes will be available on this publicly accessible website.
See the syllabus page for class notes
and topics covered -- the schedule above will show the most up-to-date
schedule for the class. I will try to announce any changes in class
scheduling through IVLE.
Module Aims and Objectives: A huge quantity of information
is available on the web and the amount is increasing every day. Most
information on the web is still in the form of free text
encoded in HTML and PDF formats. A growing trend is towards
high-level and semantic encoding of such textual information in the
form of XML, and towards the integration of wired and wireless web
There is a need to classify and summarize this information for
display on a wide variety of devices with wide-ranging processing power
and display capabilities, such as PCs and mobile phones.
We need to develop tools to process and manage such information
effectively. This module introduces the concepts and techniques for
the analysis, representation, retrieval, classification and
summarization of unstructured textual information. By the end of this
course, students should have the expertise and competence to design
and implement text processing and mining systems and search engines on
- Modular credits: 4.
- CS 3242 Hypermedia Technologies: this pre-requisite is
not enforced for postgraduate students.
- This course requires programming competency
and some familiarity with the Unix/Linux environment for
its assignments. You are required to have this
knowledge as a prerequisite.
- Teaching Staff: Min-Yen Kan, firstname.lastname@example.org.
Office: AS6 05-12 (x1885). Office hours: Wednesdays 5:00-6:00 pm (before
class, starting in Week 3). By default, emails to me are assumed to
be public, and my replies and your anonymized email will be posted to
IVLE. Please let me know if you do not want the contents of
your email posted; I will be happy to honor your request.
- Workload: 2 lecture hours, 8 hours of
preparation per week. Occasional tutorials will be offered on
subjects, timing TBA. We will try to get lectures archived
(they will not be broadcast live). See IVLE or ask the
webcast technicians for help if you have trouble accessing the
- We will be reading primary materials (e.g., recent conference
papers) for the second half textbook sources. You should have
quick access to all of the texts, in order of relevance to the
- Manning, Schuetze, and Raghavan (2008) Introduction
to Information Retrieval, Cambridge Univ. Press. A
new textbook that was still in draft form two years
ago when we ran this course. Thanks to the authors, the
whole book (PDFs too) is online. Still, it's helpful to
have the printed textbook, and of all the textbooks listed
here, this is the one I would most recommend buying.
[ Check LINC for this book ] (not yet available as of 26 Aug 08)
- Baldi, Frasconi and Smyth (2003) Modeling the
Internet and the Web: Probabilistic Methods and
Algorithms. Wiley, ISBN: 0-470-84906-1 (http://ibook.ics.uci.edu).
This book will be put into the Central Library RBR.
[ Check LINC for this book ]
- Baeza-Yates and Ribeiro-Neto (1999) Modern Information
Retrieval. A well-respected and often-used textbook
for teaching the fundamentals of IR. Slightly dated,
but a well-written, holistic overview of IR. This textbook
is in Central Library RBR.
[ Check LINC for this book ]
Note to NUS-external visitors: Welcome! If you're a fellow
course instructor looking for lecture material, you can see the
syllabus menu item on the left for a preview. Please contact me if
you'd like to use any of my material. Thanks!
Min-Yen Kan <email@example.com>
Sun Jul 13 00:15:27 SGT 2008
| Version: 1.0
| Last modified:
Tue Aug 26 16:54:49 2008