Partitioning Algorithms a Partitioning method: Construct a partition of n data into a set of k clusters a Given: a set of data and the number k Find: a partition of K clusters that optimizes the chosen partitioning criterion ◆ Globally optimal a Intractable for many objective functions a Ergo, exhaustively enumerate all partitions e Effective heuristic methods: K-means and K- medoids algorithms
Partitioning Algorithms ◼ Partitioning method: Construct a partition of n data into a set of K clusters ◼ Given: a set of data and the number K ◼ Find: a partition of K clusters that optimizes the chosen partitioning criterion ◆ Globally optimal Intractable for many objective functions Ergo, exhaustively enumerate all partitions ◆ Effective heuristic methods: K-means and Kmedoids algorithms
K-Means a Assumes data points are real- valued vectors a Clusters based on centroids(aka the center of gravity or mean) of points in a cluster, c u(c) x x∈C a Reassignment of instances to clusters is based on distance to the current cluster centroids o(Or one can equivalently phrase it in terms of similarities)
K-Means ◼ Assumes data points are real-valued vectors. ◼ Clusters based on centroids (aka the center of gravity or mean) of points in a cluster, c: ◼ Reassignment of instances to clusters is based on distance to the current cluster centroids. ◆ (Or one can equivalently phrase it in terms of similarities) = x c x c | | 1 μ(c)
K-Means Algorithm Select K random data C1, C2,.CK] from data X1, 2… XN as seeds a Until clustering converges(or other stopping criterion) /partitioning For each point X Assign X; to the cluster c such that dist(x c)is minimal //NeXt, update the centroid of each cluster For each cluster c=(c)
K-Means Algorithm ◼ Select K random data {c1 , c2 ,… cK} from data {x1 , x2 ,… xN} as seeds. ◼ Until clustering converges (or other stopping criterion): //partitioning For each point xi : Assign xi to the cluster cj such that dist(xi , cj ) is minimal. //Next, update the centroid of each cluster For each cluster cj cj = (cj )
K Means Example(K-2) Pick seeds Reassign clusters Compute centroids Reassign clusters X Compute centroids Reassign clusters Converged
K Means Example (K=2) Pick seeds Reassign clusters Compute centroids x x Reassign clusters x x Compute centroids Reassign clusters Converged!
Example Assign Gate each the objects°23 cluster to most means similar reassign reassign K=2 Arbitrarily choose K object as initial cluster center Update the means 012345678910
Example 0123456789 10 0 1 2 3 4 5 6 7 8 9 10 0123456789 10 0 1 2 3 4 5 6 7 8 9 10 0123456789 10 0 1 2 3 4 5 6 7 8 9 10 0123456789 10 0 1 2 3 4 5 6 7 8 9 10 0123456789 10 0 1 2 3 4 5 6 7 8 9 10 K=2 Arbitrarily choose K object as initial cluster center Assign each objects to most similar center Update the cluster means Update the cluster means reassign reassign