
Hong Kong University of Science and Technology: Clustering (PPT lecture slides)


The K-Means Clustering Method
◼ Example (K = 2)
[Figure: a sequence of scatter plots on 0 to 10 axes illustrating the steps below]
◼ Arbitrarily choose K objects as initial cluster centers
◼ Assign each object to the most similar center
◼ Update the cluster means
◼ Reassign the objects, then update the cluster means again
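The steps in this example can be sketched in plain Python. This is a minimal illustration, not code from the slides: the function names, the `seed` and `max_iter` parameters, the stopping rule (stop when no assignment changes), and the choice to keep the old center when a cluster becomes empty are all assumptions.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def k_means(points, k, max_iter=100, seed=0):
    """Return (centers, labels) after alternating assignment and mean update."""
    rng = random.Random(seed)
    # Arbitrarily choose k objects as the initial cluster centers.
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assign each object to the most similar (nearest) center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:  # no object changed cluster, so stop
            break
        labels = new_labels
        # Update the cluster means (assumption: an empty cluster keeps its center).
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = mean_point(members)
    return centers, labels

# Hypothetical data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centers, labels = k_means(points, k=2)
print(labels)
```

On data this well separated, the first three points should end up sharing one label and the last three the other, although which group is labelled 0 depends on the random initial choice.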


Comments on the K-Means Method
◼ Strength: Relatively efficient: O(tkn), where n is the number of objects, k is the number of clusters, and t is the number of iterations. Normally, k, t << n.
◼ Comment: Often terminates at a local optimum.
◼ Weakness
  ◼ Applicable only when the mean is defined; what about categorical data?
  ◼ Need to specify k, the number of clusters, in advance
  ◼ Unable to handle noisy data and outliers well
  ◼ Not suitable for discovering clusters with non-convex shapes
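The local-optimum comment can be made concrete with a small sketch. The four-point data set, the function names, and the two choices of initial centers are hypothetical and not taken from the slides; the sketch only shows that the same assign-and-update loop can settle on different partitions depending on where it starts.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def run_k_means(points, centers, max_iter=100):
    """Iterate from the given initial centers; return (centers, labels, sse)."""
    centers = list(centers)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments are stable
            break
        labels = new_labels
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    # Sum of squared errors of every point to its own cluster center.
    sse = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return centers, labels, sse

# Corners of a 4-by-1 rectangle: the natural clusters are the left and right pairs.
corners = [(0, 0), (0, 1), (4, 0), (4, 1)]
_, _, sse_good = run_k_means(corners, [(0, 0), (4, 0)])  # one start on each side
_, _, sse_bad = run_k_means(corners, [(0, 0), (0, 1)])   # both starts on the left
print(sse_good, sse_bad)  # 1.0 16.0
```

The second run converges to a top/bottom split that no single reassignment can improve, even though its error is sixteen times larger. Running from several initializations and keeping the lowest-error result is one common mitigation.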


Robustness
[Figure: scatter plot of the points below; axis ticks run from 0 to 4.5 on one axis and from 1 to 1000 (logarithmic) on the other]

        X       Y
        1       2
        2       4
        3       3
        400     2
Mean    101.5   2.75
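A short check of the numbers on this slide: a single outlier (X = 400) drags the mean far away from the other three points. The comparison with the median is an addition for illustration only and is not part of the slide.

```python
from statistics import mean, median

# The four (X, Y) points from the slide; 400 is the outlier.
xs = [1, 2, 3, 400]
ys = [2, 4, 3, 2]

print(mean(xs), mean(ys))      # 101.5 2.75
print(median(xs), median(ys))  # 2.5 2.5
```

The mean of X lands at 101.5, nowhere near any actual point, while the median stays among the three typical values.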


Variations of the K-Means Method
◼ A few variants of k-means differ in
  ◼ Selection of the initial k means
  ◼ Dissimilarity calculations
  ◼ Strategies to calculate cluster means
◼ Handling categorical data: k-modes (Huang '98)
  ◼ Replacing means of clusters with modes
  ◼ Using new dissimilarity measures to deal with categorical objects
  ◼ Using a frequency-based method to update modes of clusters
◼ A mixture of categorical and numerical data: k-prototype method
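The two k-modes ingredients listed above can be sketched as follows, assuming the simple-matching dissimilarity (count of mismatched attributes) and an attribute-wise most-frequent-category mode. The function names and the toy records are hypothetical.

```python
from collections import Counter

def matching_dissimilarity(a, b):
    """Number of attributes on which two categorical objects differ."""
    return sum(x != y for x, y in zip(a, b))

def cluster_mode(objects):
    """Attribute-wise most frequent category among the objects of one cluster."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*objects))

# Hypothetical cluster of three categorical records (colour, size, shape).
cluster = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "small", "round"),
]
mode = cluster_mode(cluster)
print(mode)  # ('red', 'small', 'round')
print(matching_dissimilarity(mode, ("blue", "large", "round")))  # 2
```

Plugging these two functions into the assign-and-update loop of k-means, in place of the squared distance and the mean, gives the basic shape of k-modes.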

K-Modes: See J. X. Huang's paper online (Data Mining and Knowledge Discovery Journal, Springer)

3. The k-means algorithm

The k-means algorithm (MacQueen, 1967; Anderberg, 1973), one of the mostly used clustering algorithms, is classified as a partitional or nonhierarchical clustering method (Jain and Dubes, 1988). Given a set of numeric objects X and an integer number k (≤ n), the k-means algorithm searches for a partition of X into k clusters that minimises the within groups sum of squared errors (WGSS). This process is often formulated as the following mathematical program problem P (Selim and Ismail, 1984; Bobrowski and Bezdek, 1991):

$$\text{Minimise}\quad P(W, Q) = \sum_{l=1}^{k} \sum_{i=1}^{n} w_{i,l}\, d(X_i, Q_l)$$

subject to

$$\sum_{l=1}^{k} w_{i,l} = 1, \quad 1 \le i \le n$$

$$w_{i,l} \in \{0, 1\}, \quad 1 \le i \le n,\ 1 \le l \le k$$

where W is an n × k partition matrix, Q = {Q_1, Q_2, ..., Q_k} is a set of objects in the same object domain, and d(·, ·) is the squared Euclidean distance between two objects.
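As a rough illustration of the objective P(W, Q) from Huang's formulation, the sketch below evaluates it for an explicit 0/1 partition matrix W and a set of cluster representatives Q. The data, the function names, and the constraint check are assumptions made for the example.

```python
def squared_euclidean(x, q):
    """d(x, q): squared Euclidean distance between two objects."""
    return sum((a - b) ** 2 for a, b in zip(x, q))

def objective(X, W, Q):
    """P(W, Q): sum over clusters l and objects i of W[i][l] * d(X[i], Q[l])."""
    n, k = len(X), len(Q)
    # Partition-matrix constraints: entries in {0, 1}, each row sums to 1.
    for row in W:
        assert len(row) == k
        assert all(w in (0, 1) for w in row)
        assert sum(row) == 1
    return sum(
        W[i][l] * squared_euclidean(X[i], Q[l])
        for l in range(k)
        for i in range(n)
    )

# Hypothetical data: three nearby objects and one far away.
X = [(1.0, 2.0), (2.0, 4.0), (3.0, 3.0), (9.0, 9.0)]
Q = [(2.0, 3.0), (9.0, 9.0)]          # Q[0] is the mean of the first three objects
W = [[1, 0], [1, 0], [1, 0], [0, 1]]  # row i marks the cluster of object i
print(objective(X, W, Q))  # 4.0
```

Because each row of W contains a single 1, the double sum reduces to adding up each object's squared distance to its own cluster representative, which is the WGSS.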
