27 Aug 2003
CS 6210: Module 3
25/31
Calculating Similarity
○ Euclidean distance: a poor choice
    ● $M(Q, D_d) = \sqrt{\sum_t |w_{q,t} - w_{d,t}|^2}$
    ● This is a dissimilarity measure (larger means less alike); use its reciprocal as a similarity score
    ● Has a problem with long documents. Why?
○ We actually don't care about the vectors' lengths, just their directions
    ● Want to measure the difference in direction
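
A minimal sketch of the point above, under stated assumptions: the three-term weight vectors are made up for illustration, and the cosine of the angle between the vectors is used as one common way to compare direction (the slide itself does not name a measure). The long document has the same term proportions as the query, only with ten times the weight, so its direction is identical but its Euclidean distance is large.

```python
import math

def euclidean_distance(q, d):
    """M(Q, D_d): square root of the summed squared weight differences."""
    return math.sqrt(sum((wq - wd) ** 2 for wq, wd in zip(q, d)))

def cosine(q, d):
    """Cosine of the angle between q and d; depends only on direction.

    Assumed here as the direction measure; not taken from the slide.
    """
    dot = sum(wq * wd for wq, wd in zip(q, d))
    norm_q = math.sqrt(sum(w * w for w in q))
    norm_d = math.sqrt(sum(w * w for w in d))
    return dot / (norm_q * norm_d)

# Hypothetical term weights over a three-term vocabulary.
query = [1.0, 1.0, 0.0]
short_doc = [1.0, 1.0, 0.0]    # same weights as the query
long_doc = [10.0, 10.0, 0.0]   # same proportions, ten times the weight

print(euclidean_distance(query, short_doc))  # zero: identical vectors
print(euclidean_distance(query, long_doc))   # large, despite the same direction
print(cosine(query, short_doc))              # approximately 1
print(cosine(query, long_doc))               # approximately 1: length is ignored
```

Scaling a vector changes its Euclidean distance to the query but not the angle between them, which is why a direction-based measure does not penalize long documents.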