CS6210: Document clustering in DLs
Evaluation
* Agglomerative hierarchical clustering is superior to k-means.
* Speed is important.
* Fast algorithms are preferred:
  * Bisecting k-means
* Suffix tree clustering
  * Linear time complexity
  * Suffix tree is built incrementally
  * O. Zamir and O. Etzioni. Web document clustering: A feasibility demonstration. In SIGIR, 1998.
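
As an illustration of the bisecting k-means idea named above, here is a minimal sketch, not a reference implementation. It assumes documents are already represented as numeric vectors (for example term-weight vectors) and uses squared Euclidean distance. The function names (`two_means`, `bisecting_kmeans`), the `trials` parameter, and the rule of always splitting the cluster with the largest sum of squared errors are assumptions made for this example; splitting the largest cluster is another common choice.

```python
import random


def _mean(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def _sqdist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def _sse(points):
    """Sum of squared distances from each point to the cluster mean."""
    center = _mean(points)
    return sum(_sqdist(p, center) for p in points)


def two_means(points, rng, iters=20):
    """Split points into two groups with plain 2-means (Lloyd iterations)."""
    centers = rng.sample(points, 2)  # two distinct positions as initial centers
    groups = ([], [])
    for _ in range(iters):
        groups = ([], [])
        for p in points:
            idx = 0 if _sqdist(p, centers[0]) <= _sqdist(p, centers[1]) else 1
            groups[idx].append(p)
        if not groups[0] or not groups[1]:
            break  # degenerate split (e.g. duplicate points); caller skips it
        new_centers = [_mean(g) for g in groups]
        if new_centers == centers:
            break  # assignments have stabilised
        centers = new_centers
    return groups


def bisecting_kmeans(points, k, trials=5, seed=0):
    """Repeatedly bisect the highest-SSE cluster until k clusters exist.

    Assumption for this sketch: each bisection is tried `trials` times and
    the split with the lowest total SSE is kept.
    """
    rng = random.Random(seed)
    clusters = [list(points)]
    while len(clusters) < k:
        splittable = [i for i, c in enumerate(clusters) if len(c) >= 2]
        if not splittable:
            break  # nothing left that can be split
        target = max(splittable, key=lambda i: _sse(clusters[i]))
        best = None
        for _ in range(trials):
            a, b = two_means(clusters[target], rng)
            if not a or not b:
                continue
            cost = _sse(a) + _sse(b)
            if best is None or cost < best[0]:
                best = (cost, a, b)
        if best is None:
            break  # every trial was degenerate (all points identical)
        clusters[target:target + 1] = [best[1], best[2]]
    return clusters
```

Each bisection only runs 2-means on one cluster, which is why this variant is cheaper than agglomerative clustering, whose pairwise merging is at least quadratic in the number of documents. For text, cosine similarity on normalised vectors would be a more typical choice than the Euclidean distance assumed here.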