24 Aug 2004
CS 5244: Indexing / Classification
Focused Probing: Sampling
Sampling proceeds in rounds. In each round, the rules associated with each node are turned into queries for the database:
- Transform each rule into a query
- For each query:
  - Send to database
  - Record number of matches
  - Retrieve top-k matching documents
- At the end of round:
  - Analyze matches for each category
  - Choose category to focus on

Output:
- Representative document sample
- Actual frequencies for some "important" words
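
One sampling round can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method as given in the lecture: `TOY_DB`, `search`, `probing_round`, and `RULES` are hypothetical names, the database is a small in-memory list rather than a remote search interface, each rule is treated as a conjunction of words, and the category to focus on is simply the one with the most matches (the slide does not say how the choice is made).

```python
from collections import Counter

# Toy stand-in for a remote text database (assumption: a real database
# would be reachable only through its search interface).
TOY_DB = [
    "heart disease and blood pressure",
    "cancer treatment and heart surgery",
    "basketball playoffs tonight",
    "new cancer drug trial",
    "soccer world cup final",
]

def search(db, query, k):
    """Return (number of matches, top-k matching documents) for a conjunctive query."""
    terms = query.split()
    hits = [doc for doc in db if all(t in doc.split() for t in terms)]
    return len(hits), hits[:k]

def probing_round(db, rules, k=2):
    """Run one round over `rules`, a dict mapping category -> list of queries."""
    matches = Counter()   # total number of matches per category
    sample = []           # retrieved documents, deduplicated, in retrieval order
    word_freq = {}        # match counts observed for single-word queries
    for category, queries in rules.items():
        for query in queries:
            count, docs = search(db, query, k)   # send to database
            matches[category] += count           # record number of matches
            if len(query.split()) == 1:
                word_freq[query] = count
            for doc in docs:                     # keep top-k matching documents
                if doc not in sample:
                    sample.append(doc)
    # End of round: analyze matches per category and choose where to focus.
    # Assumption: pick the category with the most matches.
    focus = max(matches, key=matches.get)
    return focus, matches, sample, word_freq

RULES = {"Health": ["heart", "cancer"], "Sports": ["soccer", "basketball"]}
focus, matches, sample, word_freq = probing_round(TOY_DB, RULES)
```

Here `sample` corresponds to the document sample and `word_freq` to the actual frequencies of the probe words. A further round would repeat the same procedure with the rules of the subcategories under `focus`.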