CS6210: Document clustering in DLs
Distance Functions
• Single Link -- $O(n^2)$
  Distance = minimum document distance between 2 clusters.
• Complete Link -- $O(n^3)$
  Distance = maximum distance between 2 clusters.
• Group Average -- $O(n^2)$
  Distance = average document distance between 2 clusters.
• Distance function -- cosine measure
  $\mathrm{Cosine}(d_1, d_2) = \dfrac{d_1 \cdot d_2}{\|d_1\| \, \|d_2\|}$
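A minimal Python sketch of the cosine measure, assuming each document is a plain list of term weights over a shared vocabulary. The function name `cosine` and the handling of all-zero vectors are illustrative assumptions, not part of the slide.

```python
import math


def cosine(d1, d2):
    """Cosine measure between two equal-length term-weight vectors."""
    # Dot product d1 . d2
    dot = sum(a * b for a, b in zip(d1, d2))
    # Euclidean norms ||d1|| and ||d2||
    norm1 = math.sqrt(sum(a * a for a in d1))
    norm2 = math.sqrt(sum(b * b for b in d2))
    # Assumption: a document with no terms is treated as similar to nothing.
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0
    return dot / (norm1 * norm2)
```

Because the measure normalizes by vector length, `cosine([1, 2], [2, 4])` and `cosine([1, 2], [1, 2])` give the same value: only the direction of the term vectors matters, not document length.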
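The three cluster distances above can be sketched as follows. This is an illustration under stated assumptions: clusters are lists of term-weight vectors, the document distance is taken as 1 minus the cosine measure (the slide names the cosine measure but does not say how it is turned into a distance), and group average is computed over between-cluster pairs only. All function names are hypothetical.

```python
import math
from itertools import product


def cosine_distance(d1, d2):
    """Assumed document distance: 1 - cosine measure."""
    dot = sum(a * b for a, b in zip(d1, d2))
    norm1 = math.sqrt(sum(a * a for a in d1))
    norm2 = math.sqrt(sum(b * b for b in d2))
    if norm1 == 0.0 or norm2 == 0.0:
        return 1.0  # assumption: an empty document is maximally distant
    return 1.0 - dot / (norm1 * norm2)


def single_link(c1, c2, dist=cosine_distance):
    """Minimum document distance over all cross-cluster pairs."""
    return min(dist(a, b) for a, b in product(c1, c2))


def complete_link(c1, c2, dist=cosine_distance):
    """Maximum document distance over all cross-cluster pairs."""
    return max(dist(a, b) for a, b in product(c1, c2))


def group_average(c1, c2, dist=cosine_distance):
    """Average document distance over all cross-cluster pairs."""
    total = sum(dist(a, b) for a, b in product(c1, c2))
    return total / (len(c1) * len(c2))


# Two tiny clusters of 2-term documents.
cluster_a = [[1, 0], [1, 1]]
cluster_b = [[0, 1]]
print(single_link(cluster_a, cluster_b))
print(complete_link(cluster_a, cluster_b))
print(group_average(cluster_a, cluster_b))
```

These helpers compare one pair of clusters by brute force. The complexities quoted on the slide refer to the full agglomerative clustering procedure, which this sketch does not implement.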