Digital Libraries

What is a library?

Introduction

Vannevar Bush (1890-1974)

Design for Memex (c. 1945)

Memex

What is a Digital Library (DL)?

Outline for today

Course administration

Teaching staff

Course web sites

Objective

Discussions

Midterm and Final

Literature survey

Final project

Outline for today

Reading and writing research papers

Why do you read a paper?

Reading a research paper

How to read a paper

How to read a paper - depth

How to read a paper - react

How to write a research paper

Any questions?

To think about for discussion

Coffee Break

What is information retrieval?

Searching in books

Information retrieval

What to index?

Trading precision for size

Indexing output

Trading precision for size, redux

Is fine-grained indexing worthwhile?

Inverted file compression

Building the index – Memory-based inversion

Sort-based inversion

Sort-based inversion: example

Using a first pass for the lexicon

Lexicon-based inversion

Inversion – Summary of Techniques

Query Matching

Boolean Model

Deciding ranking

Term Frequency

Inverse Document Frequency

This is TF*IDF

Calculating Similarity

Cosine Similarity

Calculating the ranked list

Accumulator Storage

Selecting r entries from accumulators

To think about