Putting the constraints together
Document Frequency Ratios
(coverage of a term within a genre, or within a genre+subject pair)
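As a rough illustration of the ratios named above, here is a minimal sketch. It assumes "coverage" means the fraction of documents in a genre (or in a genre+subject pair) that contain the term. The function name `document_frequency_ratios` and the dictionary keys `genre`, `subject` and `terms` are hypothetical and chosen only for this example.

```python
from collections import Counter


def document_frequency_ratios(docs, term):
    """Return (ratio per genre, ratio per (genre, subject)) for `term`.

    Assumption: each doc is a dict with hypothetical keys
    'genre', 'subject' and 'terms' (a set of the terms it contains).
    """
    genre_total, genre_hits = Counter(), Counter()
    pair_total, pair_hits = Counter(), Counter()

    for doc in docs:
        genre = doc["genre"]
        pair = (doc["genre"], doc["subject"])
        genre_total[genre] += 1
        pair_total[pair] += 1
        if term in doc["terms"]:
            genre_hits[genre] += 1
            pair_hits[pair] += 1

    # Coverage = documents containing the term / all documents in the group.
    by_genre = {g: genre_hits[g] / n for g, n in genre_total.items()}
    by_pair = {p: pair_hits[p] / n for p, n in pair_total.items()}
    return by_genre, by_pair


# Toy corpus, invented for illustration only.
docs = [
    {"genre": "news", "subject": "sports", "terms": {"match", "goal"}},
    {"genre": "news", "subject": "sports", "terms": {"goal"}},
    {"genre": "news", "subject": "politics", "terms": {"vote"}},
    {"genre": "blog", "subject": "sports", "terms": {"goal", "fun"}},
]
by_genre, by_pair = document_frequency_ratios(docs, "goal")
```

In this toy corpus "goal" covers two of the three news documents, but all of the news+sports documents and none of the news+politics ones, which is the kind of contrast the two ratios are meant to expose.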
What are some negative aspects of this approach?
Use these to define the weight
Where σ is a penalty (“deviation”) factor for terms that are spread widely over different subjects