CS6210: Document clustering in DLs
K-means Clustering
• Takes an input parameter k and partitions a set of n documents into k clusters.
• Intracluster similarity is high: documents in the same cluster resemble one another.
• Intercluster similarity is low: documents in different clusters differ.
• Cluster similarity is measured with respect to the mean of the documents in a cluster, known as the cluster's centroid.
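
The points above can be illustrated with a minimal sketch of the standard iterative (Lloyd-style) k-means procedure. It makes several assumptions the slide does not state: documents are already represented as equal-length numeric vectors (for example term counts), closeness to a centroid is measured by squared Euclidean distance, and the first k documents seed the centroids. The names `kmeans`, `centroid`, and `squared_distance` are illustrative only.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two document vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(vectors: Sequence[Sequence[float]]) -> Vector:
    """Component-wise mean of the document vectors in one cluster."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def kmeans(
    docs: Sequence[Sequence[float]], k: int, max_iter: int = 100
) -> Tuple[List[int], List[Vector]]:
    """Partition docs into k clusters; return (labels, centroids)."""
    if not 1 <= k <= len(docs):
        raise ValueError("k must be between 1 and the number of documents")

    # Assumption: seed the centroids with the first k documents.
    centroids = [list(d) for d in docs[:k]]
    labels = [-1] * len(docs)

    for _ in range(max_iter):
        # Assignment step: each document joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(d, centroids[j]))
            for d in docs
        ]
        if new_labels == labels:
            break  # no document changed cluster, so the partition is stable
        labels = new_labels

        # Update step: recompute each centroid as the mean of its members.
        for j in range(k):
            members = [d for d, label in zip(docs, labels) if label == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = centroid(members)

    return labels, centroids


# Four toy "documents" over a two-term vocabulary.
docs = [[2, 0], [0, 2], [3, 0], [0, 3]]
labels, centroids = kmeans(docs, k=2)
print(labels)     # → [0, 1, 0, 1]
print(centroids)  # → [[2.5, 0.0], [0.0, 2.5]]
```

In the toy run, documents dominated by the first term end up in one cluster and documents dominated by the second term in the other, and each centroid is the mean of its cluster's vectors. For real document collections, cosine similarity on normalized vectors is a common alternative to Euclidean distance.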