COMP 578 Data Warehousing & Data Mining
Ch 2 Discovering Association Rules
Keith C.C. Chan
Department of Computing
The Hong Kong Polytechnic University
The Association Rule (AR) Mining Problem
◼ Given a database of transactions.
◼ Each transaction is a list of items.
  ◼ E.g., the items purchased by a customer in one visit.
◼ Find all rules that correlate the presence of one set of items with that of another set of items.
  ◼ E.g., 30% of people who buy diapers also buy beer.
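As a minimal sketch of the problem setup, the snippet below represents each transaction as a set of items and measures how often baskets containing one item set also contain another. The toy data and the helper name `fraction_also_containing` are illustrative assumptions, not part of the course material.

```python
# Hypothetical toy database: each transaction is the set of items
# bought by one customer in one visit.
transactions = [
    {"diapers", "beer", "milk"},
    {"diapers", "bread"},
    {"beer", "chips"},
    {"diapers", "beer"},
    {"milk", "bread"},
]


def fraction_also_containing(transactions, antecedent, consequent):
    """Among transactions containing every item in `antecedent`,
    return the fraction that also contain every item in `consequent`."""
    antecedent, consequent = set(antecedent), set(consequent)
    matching = [t for t in transactions if antecedent <= t]
    if not matching:
        # No basket contains the antecedent, so the rule never applies.
        return 0.0
    return sum(1 for t in matching if consequent <= t) / len(matching)


ratio = fraction_also_containing(transactions, {"diapers"}, {"beer"})
print(f"{ratio:.0%} of diaper baskets also contain beer")
```

A statement such as "30% of people who buy diapers also buy beer" is this kind of ratio computed over a real transaction database.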
Motivation & Applications (1)
◼ If we can find such associations, we will be able to answer:
  ◼ ??? → beer (What should the company do to boost beer sales?)
  ◼ Diapers → ??? (What other products should the store stock up on?)
◼ Attached mailing in direct marketing.
Motivation & Applications (2)
◼ Originally for marketing, to understand purchasing trends.
  ◼ What products or services do customers tend to purchase at the same time, or later on?
◼ Use market basket analysis to plan:
  ◼ Coupons and discounting:
    ◼ Do not offer simultaneous discounts on beer and diapers if they tend to be bought together.
    ◼ Discount one to pull in sales of the other.
  ◼ Product placement:
    ◼ Place products that have a strong purchasing relationship close together.
    ◼ Alternatively, place such products far apart to increase traffic past other items.
Measure of Interestingness
◼ For a data mining algorithm to mine for interesting association rules, users have to define a measure of “interestingness”.
◼ Two popular interestingness measures have been proposed:
  ◼ Support and Confidence
  ◼ Lift Ratio (Interest)
◼ MineSet from SGI uses the terms predictability and prevalence instead of support and confidence.
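The slide names these measures without defining them, so the sketch below uses the commonly cited definitions as an assumption: support of X → Y is the fraction of transactions containing both X and Y, confidence is support(X ∪ Y) / support(X), and lift is confidence divided by support(Y). The toy baskets and function names are hypothetical.

```python
# Hypothetical toy baskets; each set is one transaction.
baskets = [
    {"diapers", "beer", "milk"},
    {"diapers", "bread"},
    {"beer", "chips"},
    {"diapers", "beer"},
    {"milk", "bread"},
]


def support(transactions, items):
    """Fraction of all transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(transactions, lhs, rhs):
    """Estimate of P(rhs | lhs): support(lhs and rhs) / support(lhs)."""
    lhs_support = support(transactions, lhs)
    if lhs_support == 0:
        return 0.0
    return support(transactions, set(lhs) | set(rhs)) / lhs_support


def lift(transactions, lhs, rhs):
    """Confidence relative to the baseline frequency of rhs.
    Values above 1 suggest a positive association, below 1 a negative one."""
    rhs_support = support(transactions, rhs)
    if rhs_support == 0:
        return 0.0
    return confidence(transactions, lhs, rhs) / rhs_support


rule = ({"diapers"}, {"beer"})
print("support   ", round(support(baskets, rule[0] | rule[1]), 3))
print("confidence", round(confidence(baskets, *rule), 3))
print("lift      ", round(lift(baskets, *rule), 3))
```

Under these definitions, a rule can have high confidence simply because its right-hand side is common; lift corrects for that by comparing against the baseline frequency.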