6 Nominal Attributes
◼ A generalization of the binary variable in that it can take more than 2 states, e.g., red, yellow, blue, green
◼ Method 1: Simple matching
◼ m: # of matches, p: total # of variables
◼ d(i, j) = (p − m) / p
◼ Method 2: use a large number of binary variables
◼ creating a new binary variable for each of the M nominal states
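The simple matching measure above can be sketched in a few lines of Python. The function name and the example objects (tuples of nominal attribute values) are hypothetical, chosen only for illustration:

```python
def simple_matching_dissimilarity(obj_i, obj_j):
    """d(i, j) = (p - m) / p for nominal attributes:
    p = total # of variables, m = # of matching values."""
    p = len(obj_i)
    m = sum(1 for a, b in zip(obj_i, obj_j) if a == b)
    return (p - m) / p

# Hypothetical objects described by two nominal attributes (color, shape):
# one of two attributes matches, so d = (2 - 1) / 2 = 0.5
print(simple_matching_dissimilarity(("red", "round"), ("red", "square")))
```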
7 Other measures of cluster distance
◼ Minimum distance: d(Ci, Cj) = min { |p − p'| : p ∈ Ci, p' ∈ Cj }
◼ Max distance: d(Ci, Cj) = max { |p − p'| : p ∈ Ci, p' ∈ Cj }
◼ Mean distance: d(Ci, Cj) = |mi − mj|, where mi, mj are the means of Ci, Cj
◼ Average distance: d(Ci, Cj) = (1 / (ni · nj)) · Σ_{p ∈ Ci} Σ_{p' ∈ Cj} |p − p'|, where ni, nj are the cluster sizes
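The four cluster-distance measures can be sketched directly from their definitions. This is a minimal illustration using Euclidean distance between 2-D points; the function names and example clusters are hypothetical:

```python
from itertools import product

def euclid(p, q):
    """Euclidean distance |p - q| between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def centroid(cluster):
    """Mean point of a cluster (component-wise average)."""
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))

def min_distance(ci, cj):
    return min(euclid(p, q) for p, q in product(ci, cj))

def max_distance(ci, cj):
    return max(euclid(p, q) for p, q in product(ci, cj))

def mean_distance(ci, cj):
    # Distance between the cluster means mi and mj
    return euclid(centroid(ci), centroid(cj))

def average_distance(ci, cj):
    # Average over all ni * nj pairwise distances
    return sum(euclid(p, q) for p, q in product(ci, cj)) / (len(ci) * len(cj))
```

Note that minimum and maximum distance look only at one extreme pair, while mean and average distance summarize all points, which makes them less sensitive to outliers.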
8 Major clustering methods
◼ Partition based (K-means)
◼ Produces sphere-like clusters
◼ Good when the number of clusters is known
◼ Good for small and medium-sized databases
◼ Hierarchical methods (agglomerative or divisive)
◼ Produces trees of clusters
◼ Fast
◼ Density based (DBSCAN)
◼ Produces arbitrary-shaped clusters
◼ Good when dealing with spatial clusters (maps)
◼ Grid-based
◼ Produces clusters based on grids
◼ Fast for large, multidimensional databases
◼ Model-based
◼ Based on statistical models
◼ Allows objects to belong to several clusters
9 The K-Means Clustering Method: for numerical attributes
◼ Given k, the k-means algorithm is implemented in four steps:
◼ Partition objects into k non-empty subsets
◼ Compute seed points as the centroids of the clusters of the current partition (the centroid is the center, i.e., mean point, of the cluster)
◼ Assign each object to the cluster with the nearest seed point
◼ Go back to Step 2; stop when there are no new assignments
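The four steps above can be sketched as a minimal k-means loop. This is an illustrative implementation with hypothetical function and variable names, using random points as initial seeds and squared Euclidean distance for the nearest-centroid assignment:

```python
import random

def k_means(points, k, max_iters=100, seed=0):
    """Minimal k-means sketch following the four steps on the slide."""
    rng = random.Random(seed)
    # Step 1/2: pick k initial seed points (here: random objects)
    centroids = rng.sample(points, k)
    clusters = []
    for _ in range(max_iters):
        # Step 3: assign each object to the cluster with the nearest seed point
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k),
                      key=lambda i: sum((a - b) ** 2
                                        for a, b in zip(p, centroids[i])))
            clusters[idx].append(p)
        # Step 2 (repeated): recompute seed points as cluster centroids
        new_centroids = [
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
        # Step 4: stop when there are no new assignments (centroids unchanged)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

# Two well-separated groups converge to their mean points
pts = [(0, 0), (0, 1), (10, 10), (10, 11)]
cents, _ = k_means(pts, 2)
print(sorted(cents))
```

Note that k-means is sensitive to the initial seeds; production implementations typically run several random restarts or use smarter seeding.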
10 The mean point
◼ Example points (X, Y): (1, 2), (2, 4), (3, 3), (4, 2)
◼ Mean point: (2.5, 2.75)
◼ The mean point can be a virtual point (it need not coincide with any actual data point)
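The mean point on the slide is just the component-wise average of the four example points, which is easy to verify:

```python
# The four example points (X, Y) from the slide
points = [(1, 2), (2, 4), (3, 3), (4, 2)]

mean_x = sum(x for x, _ in points) / len(points)
mean_y = sum(y for _, y in points) / len(points)

# (2.5, 2.75) is not one of the original points: a "virtual" point
print((mean_x, mean_y))
```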