Basic Concepts: Association Rules Tid Items bought Find all the rulesⅩ→ywit Beer, Nuts diaper 0000 minimum support and confidence Beer, Coffee, diaper Beer, Diaper, eggs support s, probability that a Nuts, Eggs, Milk transaction contains xu y 50Nuts, Coffee, Diaper, Eggs,Milk confidence, c conditional Customer Customer probability that a transaction lyS diaper having x also contains r Let minsup= 50%, minconf=50% Freg. Pat: Beer: 3, Nuts: 3, Diaper: 4, Eggs: 3, Customer Beer, Diaper]: 3 buys beer Association rules: (many more!) Beer> Diaper(60%, 100%) Diaper> Beer(60%,75%) 6
6 Basic Concepts: Association Rules ◼ Find all the rules X → Y with minimum support and confidence ◼ support, s, probability that a transaction contains X Y ◼ confidence, c, conditional probability that a transaction having X also contains Y Let minsup = 50%, minconf = 50% Freq. Pat.: Beer:3, Nuts:3, Diaper:4, Eggs:3, {Beer, Diaper}:3 Customer buys diaper Customer buys both Customer buys beer 40 Nuts, Eggs, Milk 50 Nuts, Coffee, Diaper, Eggs, Milk 30 Beer, Diaper, Eggs 20 Beer, Coffee, Diaper 10 Beer, Nuts, Diaper Tid Items bought ◼ Association rules: (many more!) ◼ Beer → Diaper (60%, 100%) ◼ Diaper → Beer (60%, 75%)
Basic Concepts: Frequent Patterns and Association rules Itemset X={X1,…× Find all the rules x with minimum support and confidence support,, s,probability sup port(X→Y)=P(r∪F) that a transaction contains x∪Y confidence, c conditional probability that a confidence(→=P(F|X) transaction having x also contains y
7 Basic Concepts: Frequent Patterns and Association Rules ◼ Itemset X = {x1 , …, xk} ◼ Find all the rules X → Y with minimum support and confidence ◼ support, s, probability that a transaction contains X Y ◼ confidence, c, conditional probability that a transaction having X also contains Y sup port(X Y) = P(X Y ) confidence(X Y ) = P(Y | X )
Basic Concepts: Frequent Patterns and Association Rules Transaction-id Items bought Let Supmin =50%, confmin =50% 10 A,B D 20 A,C,D Minimum support number is 3 30 A, D,E B, EF Freg. Pat: A: 3, B: 3, D: 4 E 3, AD: 3 50 B, C.DEF Association rules Customer Customer buys both A→D(60%,100%)3/5,1) buys diaper D→A(60%75%)3/5,3/4) Strong association rule Customer buys beer 8
8 Basic Concepts: Frequent Patterns and Association Rules Let supmin = 50%, confmin = 50% Minimum support number is 3 Freq. Pat.: {A:3, B:3, D:4, E:3, AD:3} Association rules: A → D (60%, 100%)(3/5,1) D → A (60%, 75%)(3/5,3/4) Strong association rule Customer buys diaper Customer buys both Customer buys beer Transaction-id Items bought 10 A, B, D 20 A, C, D 30 A, D, E 40 B, E, F 50 B, C, D, E, F
Association Rule Mining Task Given a set of transactions T, the goal of association rule mining is to find all rules having support> minsup threshold confidence minconf threshold Brute-force approach a List all possible association rules Compute the support and confidence for each rule Prune rules that fail the minsup and minconf thresholds Computationally prohibitive
9 Association Rule Mining Task ◼ Given a set of transactions T, the goal of association rule mining is to find all rules having ◼ support ≥ minsup threshold ◼ confidence ≥ minconf threshold ◼ Brute-force approach: ◼ List all possible association rules ◼ Compute the support and confidence for each rule ◼ Prune rules that fail the minsup and minconf thresholds Computationally prohibitive!
Mining Association Rules Example of Rules TD ltems Bread. milk MIlk,Diaper]> Beer)(s=0. 4, C=0.67) Bread, Diaper, Beer, eggs MIlk, Beer]->Diaper](s=0.4, C=1.0) 2345 Milk, Diaper, Beer, Coke [Diaper, Beer]->Milk](s=0.4, C=0.67) Bread, Milk, Diaper, beer Beer]>Milk, Diaper](s=0.4,C=0.67) DIaper)>Milk, Beer)(s=0. 4, C=0.5) Bread, Milk, Diaper, Coke (Milk)>Diaper, Beer (s=0.4, c=0.5) Observations All the above rules are binary partitions of the same itemset Milk, Diaper, Beery Rules originating from the same itemset have identical support but can have different confidence Thus, we may decouple the support and confidence requirements
10 Mining Association Rules Example of Rules: {Milk,Diaper} → {Beer} (s=0.4, c=0.67) {Milk,Beer} → {Diaper} (s=0.4, c=1.0) {Diaper,Beer} → {Milk} (s=0.4, c=0.67) {Beer} → {Milk,Diaper} (s=0.4, c=0.67) {Diaper} → {Milk,Beer} (s=0.4, c=0.5) {Milk} → {Diaper,Beer} (s=0.4, c=0.5) TID Items 1 Bread, Milk 2 Bread, Diaper, Beer, Eggs 3 Milk, Diaper, Beer, Coke 4 Bread, Milk, Diaper, Beer 5 Bread, Milk, Diaper, Coke Observations: • All the above rules are binary partitions of the same itemset: {Milk, Diaper, Beer} • Rules originating from the same itemset have identical support but can have different confidence • Thus, we may decouple the support and confidence requirements