K-means Clustering
• Takes an input parameter k and partitions a set
of n documents into k clusters.
• Intracluster similarity is high.
• Intercluster similarity is low.
• Cluster similarity is measured with respect to the
mean value of the documents in a cluster,
known as the cluster's centroid.
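
The points above can be illustrated with a minimal sketch of the usual iterative k-means procedure: assign each document to its nearest centroid, recompute each centroid as the mean of its members, and repeat until the assignments stop changing. Several details here are assumptions rather than anything stated on the slide: documents are represented as plain numeric vectors, similarity is taken to be Euclidean distance, the first k documents seed the centroids, and the names `kmeans`, `mean_vector`, and `nearest_centroid` are hypothetical helpers.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def nearest_centroid(vector, centroids):
    """Index of the centroid closest to `vector` (Euclidean distance, an assumption)."""
    distances = [math.dist(vector, c) for c in centroids]
    return distances.index(min(distances))


def kmeans(docs, k, max_iters=100):
    """Partition `docs` (a list of numeric vectors) into k clusters.

    Returns (labels, centroids), where labels[i] is the cluster index of docs[i].
    """
    if not 1 <= k <= len(docs):
        raise ValueError("k must be between 1 and the number of documents")

    # Assumption: seed the centroids with the first k documents.
    centroids = [list(d) for d in docs[:k]]
    labels = None

    for _ in range(max_iters):
        # Assignment step: each document joins its nearest centroid.
        new_labels = [nearest_centroid(d, centroids) for d in docs]
        if new_labels == labels:
            break  # assignments are stable, so the centroids will not move
        labels = new_labels

        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [d for d, lab in zip(docs, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = mean_vector(members)

    return labels, centroids


# Toy "documents": six 2-D vectors forming two visibly separate groups.
docs = [[1.0, 1.0], [1.0, 2.0], [8.0, 8.0], [9.0, 8.0], [2.0, 1.0], [8.0, 9.0]]
labels, centroids = kmeans(docs, k=2)
print(labels)  # [0, 0, 1, 1, 0, 1]
```

In this toy run the three documents near (1, 1) end up in one cluster and the three near (8, 8) in the other, so distances within a cluster are small and distances between clusters are large. Seeding from the first k documents keeps the sketch deterministic; random or more careful seeding is common in practice and can change which partition is found.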