• Takes an input parameter k and partitions a set of n documents into k clusters.
• Intracluster similarity is high.
• Intercluster similarity is low.
• Cluster similarity is measured with respect to the mean value of the documents in a cluster, known as the cluster's centroid.
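
The properties listed above describe centroid-based partitioning in the style of k-means. The following is a minimal sketch, not a definitive implementation. It assumes the documents have already been converted to numeric vectors (for example, term weights) and that cosine similarity is the similarity measure; neither choice is stated in the slide. The names `cosine`, `mean_vector`, and `cluster_documents` are hypothetical, as is the random choice of initial centroids.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors: the centroid."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cluster_documents(docs, k, max_iters=100, seed=0):
    """Partition docs (a list of vectors) into k clusters.

    Returns (labels, centroids), where labels[i] is the cluster index of docs[i].
    """
    if not 1 <= k <= len(docs):
        raise ValueError("k must be between 1 and the number of documents")
    rng = random.Random(seed)
    # Assumption: the initial centroids are k distinct documents chosen at random.
    centroids = [list(docs[i]) for i in rng.sample(range(len(docs)), k)]
    labels = None
    for _ in range(max_iters):
        # Assignment step: each document joins the centroid it is most similar to.
        new_labels = [
            max(range(k), key=lambda c: cosine(doc, centroids[c])) for doc in docs
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid becomes the mean of its member documents.
        for c in range(k):
            members = [doc for doc, label in zip(docs, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean_vector(members)
    return labels, centroids


if __name__ == "__main__":
    # Four toy "documents" over a two-term vocabulary (made-up values).
    docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    labels, centroids = cluster_documents(docs, k=2)
    print(labels)
    print(centroids)
```

On the toy data, the first two documents should end up in one cluster and the last two in the other, which matches the goal of high intracluster and low intercluster similarity. The cluster indices themselves are arbitrary and depend on the initial centroids.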