Fundamentals of Information Retrieval

What is information retrieval?

Searching in books

Information retrieval

What to index?

Trading precision for size

Indexing output

Trading precision for size, redux

Is fine-grained indexing worthwhile?

Inverted file compression

Building the index – Memory-based inversion

Sort-based inversion

Sort-based inversion: example

Using a first pass for the lexicon

Lexicon-based inversion

Inversion – Summary of Techniques

Query Matching

Boolean Model

Deciding ranking

Term Frequency

Inverse Document Frequency

This is TF*IDF

Calculating Similarity

Cosine Similarity

Calculating the ranked list

Accumulator Storage

Selecting r entries from accumulators

To think about