Data Mining Association Analysis: Basic Concepts and Algorithms (Chapter 6, Introduction to Data Mining)

© Tan, Steinbach, Kumar, Introduction to Data Mining, 4/18/2004

Association Rule Mining Task

Given a set of transactions T, the goal of association rule mining is to find all rules having
– support ≥ minsup threshold
– confidence ≥ minconf threshold

Brute-force approach:
– List all possible association rules
– Compute the support and confidence for each rule
– Prune rules that fail the minsup and minconf thresholds
⇒ Computationally prohibitive!
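As a rough illustration of the brute-force approach, the sketch below enumerates every possible rule X → Y over a small data set, computes support and confidence for each, and prunes by the two thresholds. The transaction list is the five-basket example used in this deck; the function names and the threshold values (minsup = 0.4, minconf = 0.6) are assumptions chosen for illustration only.

```python
from itertools import combinations

# Five example market-basket transactions (the table used in this deck).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def brute_force_rules(transactions, minsup, minconf):
    """Enumerate all rules, score each one, and keep those passing both thresholds."""
    items = sorted(set().union(*transactions))
    n_candidates = 0
    kept = []
    # Every itemset with at least two items ...
    for k in range(2, len(items) + 1):
        for itemset in combinations(items, k):
            full = frozenset(itemset)
            # ... split in every way into a non-empty antecedent and consequent.
            for r in range(1, k):
                for lhs in combinations(itemset, r):
                    n_candidates += 1
                    lhs = frozenset(lhs)
                    rhs = full - lhs
                    s = support(full, transactions)
                    s_lhs = support(lhs, transactions)
                    c = s / s_lhs if s_lhs > 0 else 0.0
                    if s >= minsup and c >= minconf:
                        kept.append((lhs, rhs, s, c))
    return n_candidates, kept


n_candidates, kept = brute_force_rules(transactions, minsup=0.4, minconf=0.6)
print(n_candidates, "candidate rules,", len(kept), "kept")
```

With only d = 6 distinct items this already scores 602 candidate rules (3^d − 2^(d+1) + 1), and each score needs a pass over the transactions, which hints at why the approach does not scale.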

Mining Association Rules

TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke

Example of Rules:
{Milk, Diaper} → {Beer} (s=0.4, c=0.67)
{Milk, Beer} → {Diaper} (s=0.4, c=1.0)
{Diaper, Beer} → {Milk} (s=0.4, c=0.67)
{Beer} → {Milk, Diaper} (s=0.4, c=0.67)
{Diaper} → {Milk, Beer} (s=0.4, c=0.5)
{Milk} → {Diaper, Beer} (s=0.4, c=0.5)

Observations:
• All the above rules are binary partitions of the same itemset: {Milk, Diaper, Beer}
• Rules originating from the same itemset have identical support but can have different confidence
• Thus, we may decouple the support and confidence requirements
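The s and c values listed for these rules can be checked directly from the five transactions. The sketch below is one way to do that, assuming the usual definitions: s is the fraction of transactions containing the whole itemset, and c is that count divided by the count for the antecedent. The helper name `sigma` is an assumption for illustration.

```python
from itertools import combinations

transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def sigma(itemset):
    """Support count: number of transactions containing every item of `itemset`."""
    return sum(itemset <= t for t in transactions)


itemset = frozenset({"Milk", "Diaper", "Beer"})
n = len(transactions)

# Map (antecedent, consequent) -> (support, confidence rounded to 2 places).
rules = {}
for r in range(1, len(itemset)):
    for lhs in combinations(sorted(itemset), r):
        lhs = frozenset(lhs)
        rhs = itemset - lhs
        s = sigma(itemset) / n            # identical for every partition
        c = sigma(itemset) / sigma(lhs)   # varies with the antecedent
        rules[(lhs, rhs)] = (s, round(c, 2))

for (lhs, rhs), (s, c) in rules.items():
    print(sorted(lhs), "->", sorted(rhs), f"s={s}, c={c}")
```

The support is computed once from the full itemset, while only the denominator of the confidence changes from rule to rule. That is the sense in which the two requirements can be decoupled.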

Mining Association Rules

Two-step approach:
1. Frequent Itemset Generation
   – Generate all itemsets whose support ≥ minsup
2. Rule Generation
   – Generate high confidence rules from each frequent itemset, where each rule is a binary partitioning of a frequent itemset

Frequent itemset generation is still computationally expensive
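The two steps can be sketched as two separate functions. This is a minimal sketch under stated assumptions: step 1 is done here by exhaustive enumeration purely for illustration (which is exactly the expensive part), the function names are hypothetical, and the thresholds minsup = 0.4 and minconf = 0.6 are arbitrary choices applied to the deck's five example transactions.

```python
from itertools import combinations

transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def frequent_itemsets(transactions, minsup):
    """Step 1: map each frequent itemset to its support (exhaustive search)."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    frequent = {}
    for k in range(1, len(items) + 1):
        for cand in combinations(items, k):
            cand = frozenset(cand)
            s = sum(cand <= t for t in transactions) / n
            if s >= minsup:
                frequent[cand] = s
    return frequent


def generate_rules(frequent, minconf):
    """Step 2: split each frequent itemset into antecedent -> consequent."""
    rules = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                # Any subset of a frequent itemset is at least as frequent,
                # so its support is already stored in `frequent`.
                c = s / frequent[lhs]
                if c >= minconf:
                    rules.append((lhs, itemset - lhs, s, c))
    return rules


frequent = frequent_itemsets(transactions, minsup=0.4)
rules = generate_rules(frequent, minconf=0.6)
print(len(frequent), "frequent itemsets,", len(rules), "rules")
```

Step 2 never rescans the transactions: every confidence is a ratio of two supports already found in step 1.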

Frequent Itemset Generation

Itemset lattice (one level per line):
null
A  B  C  D  E
AB  AC  AD  AE  BC  BD  BE  CD  CE  DE
ABC  ABD  ABE  ACD  ACE  ADE  BCD  BCE  BDE  CDE
ABCD  ABCE  ABDE  ACDE  BCDE
ABCDE

Given d items, there are 2^d possible candidate itemsets
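The lattice and the 2^d count can be reproduced for the five items A to E with a few lines of Python. This is only an illustrative sketch; the variable names are assumptions.

```python
from itertools import combinations

items = ["A", "B", "C", "D", "E"]

# Level k of the lattice holds all itemsets with exactly k items.
lattice = [
    ["".join(c) for c in combinations(items, k)]
    for k in range(len(items) + 1)
]
lattice[0] = ["null"]  # the empty itemset is drawn as "null" on the slide

sizes = [len(level) for level in lattice]
total = sum(sizes)

for level in lattice:
    print("  ".join(level))
print(sizes, total)
```

The level sizes are the binomial coefficients C(5, k), and they sum to 2^5 = 32.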

Frequent Itemset Generation

Brute-force approach:
– Each itemset in the lattice is a candidate frequent itemset
– Count the support of each candidate by scanning the database
– Match each transaction against every candidate
– Complexity ~ O(NMw) ⇒ Expensive since M = 2^d !!!

TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke

Figure labels: N is the number of transactions, M is the number of candidates in the list, and w is the maximum transaction width.
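To make the N × M matching cost concrete, the sketch below counts support for every candidate by comparing each transaction against each candidate and tallies the number of comparisons. It is an illustrative sketch only; the function name is hypothetical, and the empty itemset is included so that the candidate count matches M = 2^d.

```python
from itertools import combinations

transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def brute_force_counts(transactions):
    """Return (support count per candidate, number of transaction/candidate matches)."""
    items = sorted(set().union(*transactions))
    # M = 2^d candidates: every subset of the d items, including the empty one.
    candidates = [
        frozenset(c)
        for k in range(len(items) + 1)
        for c in combinations(items, k)
    ]
    counts = dict.fromkeys(candidates, 0)
    comparisons = 0
    for t in transactions:          # N transactions
        for cand in candidates:     # M candidates
            comparisons += 1
            if cand <= t:           # subset test against one transaction
                counts[cand] += 1
    return counts, comparisons


counts, comparisons = brute_force_counts(transactions)
print(len(counts), "candidates,", comparisons, "comparisons")
```

For this toy data set N = 5 and d = 6, so M = 64 and the loop performs 320 subset tests. Each additional item doubles M.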
