The K-Means Clustering Method
◼ Example (K = 2)
[Figure: a sequence of scatter plots, both axes running from 0 to 10, illustrating the following steps]
◼ Arbitrarily choose K objects as the initial cluster centers
◼ Assign each object to the most similar center
◼ Update the cluster means
◼ Reassign the objects, then update the cluster means again
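The loop in the figure (assign, update, reassign) can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names `assign`, `update`, and `kmeans`, the `max_iter` cap, the stopping rule (stop when no label changes), and the rule that an empty cluster keeps its previous mean are all assumptions of this sketch, not part of the slides.

```python
import math

def assign(points, means):
    """For each point, return the index of the nearest mean (Euclidean distance)."""
    return [min(range(len(means)), key=lambda j: math.dist(p, means[j]))
            for p in points]

def update(points, labels, means):
    """Recompute each mean as the average of the points assigned to it.
    Assumption of this sketch: an empty cluster keeps its previous mean."""
    new_means = []
    for j, old_mean in enumerate(means):
        members = [p for p, label in zip(points, labels) if label == j]
        if members:
            new_means.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            new_means.append(old_mean)
    return new_means

def kmeans(points, initial_means, max_iter=100):
    """Alternate update and reassign steps until the labels stop changing."""
    means = [tuple(m) for m in initial_means]
    labels = assign(points, means)              # first assignment
    for _ in range(max_iter):
        means = update(points, labels, means)   # update the cluster means
        new_labels = assign(points, means)      # reassign
        if new_labels == labels:                # nothing moved: stop
            break
        labels = new_labels
    return labels, means
```

Ties between equally distant means go to the lower-indexed cluster here, which is one arbitrary but deterministic choice.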
K-Means
◼ Consider the following 6 two-dimensional data points:
◼ x1: (0, 0), x2: (1, 0), x3: (1, 1), x4: (2, 1), x5: (3, 1), x6: (3, 0)
◼ If k = 2 and the initial means are (0, 0) and (2, 1) (using Euclidean distance),
◼ use K-means to cluster the points.
K-Means
◼ Now we know the initial means:
◼ mean_one (0, 0) and mean_two (2, 1).
◼ We use the Euclidean distance to calculate the distance between each point and each mean.
◼ For example:
◼ Point x1 (0, 0) is exactly the initial mean_one, so we can put x1 directly into cluster one.
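As a small illustration of the distance used here, the sketch below computes the Euclidean distance from x1 to each initial mean. The helper name `euclidean` is a hypothetical choice for this example; the coordinates are the ones given in the slides.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

mean_one, mean_two = (0, 0), (2, 1)   # initial means from the example
x1 = (0, 0)

print(euclidean(x1, mean_one))  # 0.0, since x1 coincides with mean_one
print(euclidean(x1, mean_two))  # sqrt(5), roughly 2.236
```

Because the distance to mean_one is zero, no comparison is needed for x1, which matches the shortcut taken on the slide.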
K-Means
◼ Next we check point x2 (1, 0):
Distance1 (x2 and mean_one) = $\sqrt{(1-0)^2 + (0-0)^2} = 1$
Distance2 (x2 and mean_two) = $\sqrt{(1-2)^2 + (0-1)^2} = \sqrt{2}$
Distance1 < Distance2, so point x2 is closer to mean_one. Thus x2 belongs to cluster one.
K-Means
◼ Similarly, for point x3 (1, 1):
Distance1 (x3 and mean_one) = $\sqrt{(1-0)^2 + (1-0)^2} = \sqrt{2}$
Distance2 (x3 and mean_two) = $\sqrt{(1-2)^2 + (1-1)^2} = 1$
Distance1 > Distance2, so point x3 is closer to mean_two. Thus x3 belongs to cluster two.
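The hand calculations for x1, x2, and x3 can be checked with a short script. This is only a sketch of the first assignment pass against the initial means; the dictionary layout and the variable names `means`, `points`, and `assignment` are assumptions made for illustration.

```python
import math

# Initial means and the three points examined so far (values from the example).
means = {"cluster one": (0, 0), "cluster two": (2, 1)}
points = {"x1": (0, 0), "x2": (1, 0), "x3": (1, 1)}

assignment = {}
for name, p in points.items():
    # Euclidean distance from this point to each mean.
    distances = {cluster: math.dist(p, m) for cluster, m in means.items()}
    # Pick the cluster whose mean is nearest.
    assignment[name] = min(distances, key=distances.get)

print(assignment)
# {'x1': 'cluster one', 'x2': 'cluster one', 'x3': 'cluster two'}
```

The same loop applies unchanged to x4, x5, and x6 once they are added to `points`.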