11 Oct 2005
CS 5244 - Computational Document Analysis
Putting the constraints together
Document Frequency Ratios (coverage of a term within a genre, or within a genre+subject pair)
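One possible reading of these ratios, sketched in Python: the fraction of documents in a genre (or in a genre+subject pair) that contain the term. The function name, the input layout, and the toy documents are assumptions made for illustration and are not from the slide. The slide's own definitions and the weight formula are not reproduced here.

```python
from collections import defaultdict


def document_frequency_ratios(docs):
    """Compute term coverage per genre and per (genre, subject) pair.

    docs: iterable of (genre, subject, terms) tuples (hypothetical layout).
    Returns two dicts:
      genre_ratio[(term, genre)]          = docs in genre containing term / docs in genre
      pair_ratio[(term, genre, subject)]  = same, restricted to one subject in the genre
    """
    genre_total = defaultdict(int)   # number of documents per genre
    pair_total = defaultdict(int)    # number of documents per (genre, subject)
    genre_df = defaultdict(int)      # document frequency of term within genre
    pair_df = defaultdict(int)       # document frequency of term within (genre, subject)

    for genre, subject, terms in docs:
        genre_total[genre] += 1
        pair_total[(genre, subject)] += 1
        for term in set(terms):      # count each term once per document
            genre_df[(term, genre)] += 1
            pair_df[(term, genre, subject)] += 1

    genre_ratio = {key: n / genre_total[key[1]] for key, n in genre_df.items()}
    pair_ratio = {key: n / pair_total[(key[1], key[2])] for key, n in pair_df.items()}
    return genre_ratio, pair_ratio


# Toy collection, invented for illustration only.
docs = [
    ("faq", "health", {"question", "answer", "doctor"}),
    ("faq", "sports", {"question", "answer", "team"}),
    ("news", "health", {"doctor", "report"}),
    ("news", "sports", {"team", "report", "question"}),
]
genre_ratio, pair_ratio = document_frequency_ratios(docs)
```

In this toy data, "question" appears in every "faq" document regardless of subject, while "doctor" appears only in the health ones; that contrast is the kind of signal the ratios are meant to expose.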
Use these to define the weight
Where σ is a penalty (“deviation”) factor for terms that are spread widely over different subjects
What are some negative aspects of this approach?