|
1
|
- Orientation
- Module 0
- Min-Yen Kan
|
|
2
|
- A place set apart to contain books for reading, study, or reference.
- (Not applied, e.g. to the shop or warehouse of a bookseller.)
- A building … containing a collection of books for the use of the public
or of some particular portion of it, or of the members of some society
or the like;
- a public institution or establishment, charged with the care of a
collection of books, and the duty of rendering the books accessible to
those who require to use them.
|
|
3
|
- A private commercial establishment for the lending of books, the
borrower paying either a fixed sum for each book lent or a periodical subscription.
- a great mass of learning or knowledge;
- the objects of a person's study, the sources on which he depends for
instruction.
- Computers. An organized collection of routines, esp. of tested routines
suitable for a particular model of computer
- Biology. a collection of sequences of DNA … that represent the genetic material of
a particular organism or tissue
|
|
4
|
- Bush’s “As we may think”
- Writes this at the end of WW II
- _____ was the first computer, born to compute ballistic tables fast
- ______ just invented 5 years ago
- ________ (“display technology”) still a less than perfect process.
- _______ (“storage technology”) was a mature and stable technology.
|
|
5
|
- Director of the Office of Scientific Research and Development
- led 6000 scientists in R&D for WWII
- Predicted many technological advances
- the “memex” is one whose spirit we are implementing
- the purpose was to provide scientists the capability to exchange
information; to have access to the totality of recorded information
|
|
6
|
|
|
7
|
- Integrated computer, keyboard, and desk
- “mechanized private file and library”
- remove drudgery from information retrieval
- suggested implementation was microfilm
- various user operations are suggested
- ________________ was the main purpose
- “the process of tying two items together is the important thing”
- prelude to hypertext...
|
|
8
|
- Information could come pre-associatively indexed, but the key point was
___ ___________
- ____ still does not provide that today
- Bush observes that tools change our way of doing, and expand the
horizons before us
- full impact of WWW and DLs still not known
|
|
9
|
- “a collection of information that is both digitized and organized”
(Lesk)
- there are a number of alternative definitions, but this seems fair enough
- no mention of ________, __________, __________, etc.
- The aim is not just to reform the current library system; rather, we aim to
organize and access the “information overload”
|
|
10
|
- Introduction to libraries √
- Course administration
- Reading and writing research
- To think about
|
|
11
|
- Teaching staff
- Web sites
- Objective
- Syllabus
- Assessment overview
- Homework and discussions
- Survey paper and project
- Any questions?
|
|
12
|
- Lecturer:
- Min-Yen Kan (“Min”)
- kanmy@comp.nus.edu.sg
- Office: S15 05-05
- 6875-1885
- Hours:
|
|
13
|
- http://ivle.nus.edu.sg/
- Discussion forum
- Any questions related to the course should be raised on this forum
- Please do not send emails except for urgent or personal matters
- Announcements
- Work bin: Lecture notes (incomplete!)
- http://www.comp.nus.edu.sg/~cs6210
- Homework specification
- Other supplementary content
|
|
14
|
- Building, using and maintaining large volumes of information
- Contrast computational approaches with traditional library science
methods
- Who?
- Advanced undergraduates and beginning graduate students. Aimed at
IS/CS students; others by permission.
|
|
15
|
- (S0. 6 Aug and S1. 13 Aug)
M0: Orientation; and M1: LIS crash course.
- (S2. 20 Aug)
M2: Multi-(media, lingual, access, needs).
- (S3. 27 Aug)
M3: Cataloging/indexing services.
- (S4. 3 Sep)
M4: Metadata creation and management.
- (S5. 10 Sep and S6. 17 Sep)
M5: Fundamentals of information retrieval.
- (S7. 24 Sep)
M6: Introduction to bibliometrics.
|
|
16
|
- (S8. 1 Oct)
M7: Usability of OPACs and retrieval engines.
- (S9. 8 Oct)
M8: Computational literary analysis.
- (S10. 15 Oct)
M9: The problem of synonymy.
- (S11. 22 Oct)
M10: Topics in digital library policy.
- (S12. 29 Oct)
Final project poster presentations.
|
|
17
|
- Required textbook:
- Lesk (1999) Practical Digital Libraries
- Will be supplemented by readings and excerpts from the following books:
- Baeza-Yates and Ribeiro-Neto (1999) Modern Information Retrieval
- Witten, Bell and Moffat (2003) Managing Gigabytes.
- Chakrabarti (2003) Mining the Web.
- Arms (2003) Digital Libraries.
|
|
18
|
- Class participation is very important. There are no “dumb” questions.
You will only be penalized for “no” questions / comments.
- Possibilities:
- Name tags
- Cold calls
- Small group discussion and presentation
|
|
19
|
- Collaboration is acceptable
- To assure that all collaboration is on the level, ________________________________________________________________________________________
- You will be assessed on the parts that you claim as your own
contribution.
|
|
20
|
- You are free to meet with fellow student(s) and discuss assignments
with them.
- Writing on a board or shared piece of paper is acceptable during the
meeting; however, you ___________________________________________________________________.
- After the meeting, do something else for at least a half-hour (watch an
episode of Gilligan's Island), before working on the assignment.
- This will assure that you are able to reconstruct what you learned from
the meeting, by yourself, using your own brain.
|
|
21
|
- Homework (2) @ 10% = 20%
- Discussion participation 10%
- Survey paper 20%
- Final project
- Presentation 10%
- Write-up / Deliverables 40%
|
|
22
|
- Homework
- Practical aspect of the course
- Two assessments, both individual:
- HW 1: query analysis
- HW 2: authorship detection
- Discussions
- Participation is key
- Come prepared (read ahead of time!)
|
|
23
|
- Each student will pick an area of study to survey 3-5 papers in detail.
- Must be interesting to you
- Journal or conference papers from an authority list
- Limit to 8 pages, better if just 5
- Individual work only
- Give your perspective on area’s future
|
|
24
|
- Students will self-organize into groups for the final projects, shortly
after the survey papers are due.
- Requires original work
- Cooperation and coordination
- Report as a conference submission
- Poster presentation to the public
- Sample topics on the web page
|
|
25
|
- Introduction to libraries √
- Course administration √
- Reading and writing research
- To think about
|
|
26
|
- References:
- http://www.cse.ogi.edu/~dylan/efficientReading.html
- ftp://fast.cs.utah.edu/pub/writing-papers.ps
- This section is partially from Surendar Chandra of the University of Notre Dame.
|
|
27
|
- Understand and learn new contributions
- However…
- Not all papers are “good”
- Not all papers are “interesting”
- Not all papers are “worthwhile” for you
- You have to learn to identify a good paper and spend your time wisely
|
|
28
|
- What is this paper about?
- Read the title and the abstract
- If you still don’t know what this paper is about, then this is a
poorly-written paper.
- Read the conclusion
- Are you now sure you know what this paper is about? If not, it is a poorly-written paper.
- Read the _________
- Read the ___________________
- Read _________________________
|
|
29
|
- See who wrote it, where it was published, and when it was written
(credibility)
- Skim references
- Are the authors aware of relevant related work?
- Do you know the work that they cite?
- Do you know other work that they should have cited?
|
|
30
|
- Approach with scientific skepticism
- Examine the assumptions:
- Do they rely on any uncertain trends?
- Are they reasonable?
- e.g., “Let’s assume that there are billions of powerful computers,
connected by a high speed network, spread across the world, our system
will …”
- e.g., “Our system functions in real-time on a 33 MHz Intel 386 with
640K main memory running Windows 98”
|
|
31
|
- Examine the methods:
- Did they measure what they claim?
- Can they explain what they observed?
- Want an analysis of why the system behaves a certain way, not raw
data.
- Did they have adequate controls?
- Were tests carried out in a standard way? Were the performance metrics
standard?
- If not, do they explain their metrics clearly?
|
|
32
|
- Examine the statistics:
“Lies, d*mned lies and statistics”
- Appropriate statistical tests applied properly?
- Did they do proper error analysis?
- Are the results statistically significant?
- Common mistake: “We performed our experiment once at 4 am and noticed
a tenfold improvement. Thus we conclude that our system is better.”
- Be very careful with percentages
- Method A: 0.01 seconds, our Method: 0.005 seconds
- Our method shows 100% improvement over method A!!
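A minimal sketch of why the percentage above needs care, using the slide's own timings (0.01 s for Method A, 0.005 s for the new method). The function names are illustrative and not from the lecture. The same pair of measurements can be reported as a time reduction, a speedup factor, or a throughput gain, and only the last of these gives "100%":

```python
def time_reduction_pct(baseline_s, new_s):
    """Percent of the baseline running time that was saved."""
    return (baseline_s - new_s) / baseline_s * 100.0


def speedup_factor(baseline_s, new_s):
    """How many times faster the new method is than the baseline."""
    return baseline_s / new_s


def throughput_gain_pct(baseline_s, new_s):
    """Percent increase in work completed per unit time."""
    return (baseline_s / new_s - 1.0) * 100.0


method_a = 0.01   # seconds, from the slide
ours = 0.005      # seconds, from the slide

print(f"time reduction:  {time_reduction_pct(method_a, ours):.0f}%")
print(f"speedup:         {speedup_factor(method_a, ours):.1f}x")
print(f"throughput gain: {throughput_gain_pct(method_a, ours):.0f}%")
```

When reading a paper, check which of these quantities a quoted percentage refers to, and whether a 0.005-second absolute difference matters for the application at all.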
|
|
33
|
- Examine the conclusions:
- Do the conclusions follow logically from the experimental results?
- “We performed our experiments with 8 Palm Pilots and saw a tenfold
improvement. Hence we conclude that our system will scale to millions
of Palm Pilots.”
- What other explanations are there for the observed effects?
- What other conclusions or correlations are there in the data that they
did not point out?
- “Earlier work performed experiments using a 2 Mbit wireless network.
Our system (incidentally) used an 11 Mbit network and saw a fivefold
improvement. So our technique works!!”
|
|
34
|
- Take notes
- Highlight major points
- React to the points in the paper
- Place this work in the context of your own experience
- If you doubt a statement, note your objection
- Summarize what you read
- Good practice: maintain your own bibliography of every paper that you
have read
|
|
35
|
- Write it such that anyone who reads it using the method we just
discussed understands the idea
- Clearly explain what problem you are solving, why it is interesting and
how your solution solves this problem
- Be crisp. Explain what your contributions are, which ideas are yours, and
which ideas are others’
|
|
36
|
- Introduction to libraries √
- Course administration √
- Reading and writing research √
|
|
37
|
- Go to IVLE and help me determine the needs for this course
- Your background and knowledge
- Expectations on what you want to learn
- Optimal office hours
- Please complete before next lecture!
|
|
38
|
- What are the functions of a traditional library?
- Are these the same functions in the digital library?
- How is the digital library different from:
|