Closed patterns and max-Patterns ■ Exercise.DB={<a1…a1003 1/ ago> 50 a Min_sup=1 What is the set of closed itemset? ■<a1…,a100>:1 <a aco>:2 What is the set of max-pattern <a1n…ta100 What is the set of all patterns? February 4, 2021 Data Mining: Concepts and Techniques 8
February 4, 2021 Data Mining: Concepts and Techniques 8 Closed Patterns and Max-Patterns ◼ Exercise. DB = {<a1 , …, a100>, < a1 , …, a50>} ◼ Min_sup = 1. ◼ What is the set of closed itemset? ◼ <a1 , …, a100>: 1 ◼ < a1 , …, a50>: 2 ◼ What is the set of max-pattern? ◼ <a1 , …, a100>: 1 ◼ What is the set of all patterns? ◼ !!
Chapter 5: Mining Frequent Patterns Association and correlations Basic concepts and a road map Efficient and scalable frequent itemset mining methods Mining various kinds of association rules From association mining to correlation analysIs Constraint-based association mining Summary February 4, 2021 Data Mining: Concepts and Techniques
February 4, 2021 Data Mining: Concepts and Techniques 9 Chapter 5: Mining Frequent Patterns, Association and Correlations ◼ Basic concepts and a road map ◼ Efficient and scalable frequent itemset mining methods ◼ Mining various kinds of association rules ◼ From association mining to correlation analysis ◼ Constraint-based association mining ◼ Summary
Scalable methods for Mining frequent patterns The downward closure property of frequent patterns a Any subset of a frequent itemset must be frequent If ibeer, diaper, nuts is frequent, so is ibeer diaper i. e, every transaction having beer, diaper, nuts also contains (beer, diaper Scalable mining methods: Three major approaches Apriori (agrawal srikant@VLDB94 Freq. pattern growth(ePgrowth-Han, Pei yin @SIGMOD00) Vertical data format approach(Charm-Zaki Hsiao @SDM02) February 4, 2021 Data Mining: Concepts and Techniques 10
February 4, 2021 Data Mining: Concepts and Techniques 10 Scalable Methods for Mining Frequent Patterns ◼ The downward closure property of frequent patterns ◼ Any subset of a frequent itemset must be frequent ◼ If {beer, diaper, nuts} is frequent, so is {beer, diaper} ◼ i.e., every transaction having {beer, diaper, nuts} also contains {beer, diaper} ◼ Scalable mining methods: Three major approaches ◼ Apriori (Agrawal & Srikant@VLDB’94) ◼ Freq. pattern growth (FPgrowth—Han, Pei & Yin @SIGMOD’00) ◼ Vertical data format approach (Charm—Zaki & Hsiao @SDM’02)
Apriori: a candidate Generation-and-Test approach Apriori pruning principle: If there is any itemset which is infrequent, its superset should not be generated /tested (Agrawal srikant @VLDB 94, Mannila, et al.@ KDD 94) Method Initially, scan DB once to get frequent 1-itemset Generate length( k+1)candidate itemsets from length k frequent itemsets Test the candidates against dB Terminate when no frequent or candidate set can be generated February 4, 2021 Data Mining: Concepts and Techniques 11
February 4, 2021 Data Mining: Concepts and Techniques 11 Apriori: A Candidate Generation-and-Test Approach ◼ Apriori pruning principle: If there is any itemset which is infrequent, its superset should not be generated/tested! (Agrawal & Srikant @VLDB’94, Mannila, et al. @ KDD’ 94) ◼ Method: ◼ Initially, scan DB once to get frequent 1-itemset ◼ Generate length (k+1) candidate itemsets from length k frequent itemsets ◼ Test the candidates against DB ◼ Terminate when no frequent or candidate set can be generated
The apriori algorithm-An Example Supin =2 ItemsetIsup Database TDB Itemset I sup IAN 2 Tid Items B}3 IAS 4C}3 {B} A, C D B, C. E can IC 20 2333 30A, B, C, E 但}3 3 40 B, E 2L Itemset sup 2 Itemset L,「 ItemsetIsup {A,B} {A,C} 匚AC}2 2nd scan A, B {A,C} tB, C 2 B,C}2 {A,E} 1B, Ey 3 {B, {B,C} C, El 2 {B,E} C. Itemset 3rd scan Itemset I sup AB, CE B, C,E] 2 February 4, 2021 Data Mining: Concepts and Techniques 12
February 4, 2021 Data Mining: Concepts and Techniques 12 The Apriori Algorithm—An Example Database TDB 1 st scan C1 L1 L2 C2 C2 2 nd scan C3 3 L3 rd scan Tid Items 10 A, C, D 20 B, C, E 30 A, B, C, E 40 B, E Itemset sup {A} 2 {B} 3 {C} 3 {D} 1 {E} 3 Itemset sup {A} 2 {B} 3 {C} 3 {E} 3 Itemset {A, B} {A, C} {A, E} {B, C} {B, E} {C, E} Itemset sup {A, B} 1 {A, C} 2 {A, E} 1 {B, C} 2 {B, E} 3 {C, E} 2 Itemset sup {A, C} 2 {B, C} 2 {B, E} 3 {C, E} 2 Itemset {B, C, E} Itemset sup {B, C, E} 2 Supmin = 2