Which features work best for web searches?
• Discriminate between query types using mutual information (MI), for queries of two or more words
• MI = P(x,y) / (P(x) P(y)): collocation strength corrected for chance
• High MI corresponds to a navigational task
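A minimal sketch of the MI ratio above, assuming document-frequency counts are available for each query term and for the pair. The function name, its parameters, and the counts are hypothetical; the slide gives only the ratio itself.

```python
def mutual_information(pair_count, x_count, y_count, total):
    """Estimate P(x,y) / (P(x) P(y)) from raw counts.

    pair_count: documents containing both x and y
    x_count, y_count: documents containing x, containing y
    total: documents in the collection
    """
    if x_count <= 0 or y_count <= 0 or total <= 0:
        return 0.0
    # (pair/total) / ((x/total) * (y/total)) simplifies to pair * total / (x * y)
    return (pair_count * total) / (x_count * y_count)


# Hypothetical counts: the two terms almost always appear together.
ratio = mutual_information(pair_count=900, x_count=1000, y_count=1200, total=1_000_000)
print(ratio)  # 750.0
```

A ratio near 1 means the terms co-occur about as often as chance predicts; a ratio far above 1 suggests a fixed collocation, which the slide associates with navigational queries. Any cutoff between the two is left open here, since the source does not state one.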

Navigational (known item, home page finding)
• Relevant pages are mostly entry (root) pages
• Useful features: anchor text and URL information
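One way URL information might be turned into a feature is path depth, sketched below. The helper names `url_depth` and `is_entry_page` are hypothetical, and treating "no path segments and no query string" as an entry page is an assumption, not a rule from the slide.

```python
from urllib.parse import urlparse


def url_depth(url):
    """Count the non-empty path segments of a URL."""
    return len([seg for seg in urlparse(url).path.split("/") if seg])


def is_entry_page(url):
    """Assumed heuristic: a root page has no path segments and no query string."""
    return url_depth(url) == 0 and not urlparse(url).query


print(is_entry_page("http://example.com/"))                 # True
print(is_entry_page("http://example.com/docs/intro.html"))  # False
```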

Informational (topic relevance)
• Relevant pages are mostly nested pages
• Useful features: content information (e.g., TF × IDF)
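A small sketch of a TF × IDF content score, assuming raw term frequency and a natural-log inverse document frequency. The slide does not say which weighting variant is meant, so this is one common choice; the function name and toy corpus are hypothetical.

```python
import math
from collections import Counter


def tf_idf(term, doc_tokens, corpus):
    """Raw term frequency times log(N / df) for one term in one document.

    corpus is a list of token lists; doc_tokens is one of those lists.
    """
    tf = Counter(doc_tokens)[term]
    df = sum(1 for tokens in corpus if term in tokens)
    if tf == 0 or df == 0:
        return 0.0
    return tf * math.log(len(corpus) / df)


# Toy corpus of three tokenized documents (illustrative only).
corpus = [
    ["web", "search", "ranking"],
    ["web", "page"],
    ["anchor", "text"],
]
score_rare = tf_idf("search", corpus[0], corpus)   # term in 1 of 3 documents
score_common = tf_idf("web", corpus[0], corpus)    # term in 2 of 3 documents
```

Here the rarer term scores higher than the more common one, which is the behavior that makes the weight useful for judging topic relevance.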