10 Aug 2004
CS 5244: Orientation
Lexicon-based inversion
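• Partition the inversion into k = |I|/|M| smaller problems (index size over main-memory size)
  - Build 1/k of the inverted index on each pass (e.g., terms a-b, b-c, …, y-z)
  - k is tuned so that each part fits in the machine's main memory
  - Only the boundary words between parts need to be remembered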
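A minimal Python sketch of the multi-pass idea, under stated assumptions: the vocabulary is already known (say, from an earlier lexicon-building pass), documents are whitespace-tokenized strings, and partitions are cut by term count rather than by postings volume. The names `choose_boundaries`, `partition_of`, and `invert_in_passes` are hypothetical, introduced only for illustration.

```python
from bisect import bisect_right
from collections import Counter, defaultdict


def choose_boundaries(vocabulary, k):
    """Pick at most k-1 boundary words splitting the sorted vocabulary into ranges.

    Simplification: ranges hold equal numbers of terms; a real system would
    balance by postings volume so that each range fits in memory.
    """
    words = sorted(vocabulary)
    step = max(1, -(-len(words) // k))  # ceiling division, never zero
    return [words[i] for i in range(step, len(words), step)]


def partition_of(term, boundaries):
    """Return the partition number of a term; a boundary word starts the next range."""
    return bisect_right(boundaries, term)


def invert_in_passes(docs, boundaries):
    """Scan the whole collection once per partition, yielding each partial index."""
    for p in range(len(boundaries) + 1):
        partial = defaultdict(list)
        for doc_id, text in enumerate(docs):
            for term, freq in Counter(text.split()).items():
                if partition_of(term, boundaries) == p:
                    # Postings are (d, f_dt) pairs in increasing document order.
                    partial[term].append((doc_id, freq))
        yield dict(partial)
```

Only the `boundaries` list has to stay in memory between passes; each yielded partial index can be written out before the next pass starts.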
• Can be paired with a disk strategy
  - First pass: create k temporary files and write the tuples (t, d, f_{d,t}) of each partition to its own file
  - Each subsequent pass builds one part of the index from one temporary file
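The disk-paired variant could be sketched as follows, assuming the boundary words are already chosen and documents are whitespace-tokenized strings. The tab-separated file layout and the names `write_partition_files` and `build_from_file` are assumptions made for this illustration, not part of the original method description.

```python
import csv
import os
from bisect import bisect_right
from collections import Counter, defaultdict


def write_partition_files(docs, boundaries, directory):
    """First pass: route each (t, d, f_dt) tuple to the file of its partition."""
    k = len(boundaries) + 1
    paths = [os.path.join(directory, f"part{p}.tsv") for p in range(k)]
    files = [open(path, "w", newline="", encoding="utf-8") for path in paths]
    try:
        writers = [csv.writer(f, delimiter="\t") for f in files]
        for doc_id, text in enumerate(docs):
            for term, freq in Counter(text.split()).items():
                # bisect_right maps a term to the range it falls in.
                p = bisect_right(boundaries, term)
                writers[p].writerow([term, doc_id, freq])
    finally:
        for f in files:
            f.close()
    return paths


def build_from_file(path):
    """Later pass: build one part of the index from a single temporary file."""
    index = defaultdict(list)
    with open(path, newline="", encoding="utf-8") as f:
        for term, doc_id, freq in csv.reader(f, delimiter="\t"):
            index[term].append((int(doc_id), int(freq)))
    return dict(index)
```

With this layout the document collection is parsed only once; every later pass reads just one temporary file, which is sized to fit in memory.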