Complete versus partial A complete clustering assigns every object to a cluster. a partial clustering does not assign every object to a cluster. The motivation of partial clustering is that some objects in a data set may not belong to well-defined groups Instead they may represent noise or outliers 8/20/2016 PATTERN RECOGNITION
Complete versus partial A complete clustering assigns every object to a cluster. A partial clustering does not assign every object to a cluster. The motivation of partial clustering is that some objects in a data set may not belong to well-defined groups. Instead, they may represent noise or outliers. 8/20/2016 PATTERN RECOGNITION 11
K-means K-means is a prototype-based clustering technique which creates a one-level partitioning of the data objects Specifically, K-means defines a prototype in terms of the centroid of a group of points K-means is typically applied to objects in a continuous n- dimensional space 8/20/2016 PATTERN RECOGNITION
K-means K-means is a prototype-based clustering technique which creates a one-level partitioning of the data objects. Specifically, K-means defines a prototype in terms of the centroid of a group of points. K-means is typically applied to objects in a continuous ndimensional space. 8/20/2016 PATTERN RECOGNITION 12
K-means The basic K-means algorithm is summarized below 1. Select K points as initial centroids 2. Repeat a. Form k clusters by assigning each point to its closest centroid b Recompute the centroid of each cluster. 3. Until centroids do not change 8/20/2016 PATTERN RECOGNITION
K-means The basic K-means algorithm is summarized below 1. Select K points as initial centroids 2. Repeat a. Form K clusters by assigning each point to its closest centroid. b. Recompute the centroid of each cluster. 3. Until centroids do not change. 8/20/2016 PATTERN RECOGNITION 13
K-means We first choose k initial centroids where k is a user-defined parameter, namely, the number of clusters desired Each point is then assigned to the closest centroid Each collection of points assigned to a centroid is a cluster. The centroid of each cluster is then updated based on the points assigned to the cluster. We repeat the assignment and update steps until the centroids remain the same 8/20/2016 PATTERN RECOGNITION
K-means We first choose K initial centroids, where K is a user-defined parameter, namely, the number of clusters desired. Each point is then assigned to the closest centroid. Each collection of points assigned to a centroid is a cluster. The centroid of each cluster is then updated based on the points assigned to the cluster. We repeat the assignment and update steps until the centroids remain the same. 8/20/2016 PATTERN RECOGNITION 14
K-means These steps are illustrated in the following figures Starting from three centroids, the final clusters are found in four assignment-update steps 8/20/2016 PATTERN RECOGNITION
K-means These steps are illustrated in the following figures. Starting from three centroids, the final clusters are found in four assignment-update steps. 8/20/2016 PATTERN RECOGNITION 15