Data Mining (Cont.)

Descriptive Patterns
  Associations
    Find books that are often bought by "similar" customers. If a new such customer buys one such book, suggest the others too.
    Associations may be used as a first step in detecting causation
      E.g., association between exposure to chemical X and cancer
  Clusters
    E.g., typhoid cases were clustered in an area surrounding a contaminated well
    Detection of clusters remains important in detecting epidemics

Database System Concepts - 6th Edition  20.12  ©Silberschatz, Korth and Sudarshan
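The book-suggestion idea can be illustrated with a small co-occurrence count. This is only a minimal sketch under stated assumptions: the function names `pair_counts` and `suggest`, the `min_support` threshold, and the sample purchase data are all hypothetical, and the sketch treats every customer as "similar" rather than first grouping customers by similarity.

```python
from collections import Counter
from itertools import combinations


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of books."""
    counts = Counter()
    for basket in baskets:
        # Sort so each pair is always stored in the same order.
        for a, b in combinations(sorted(set(basket)), 2):
            counts[(a, b)] += 1
    return counts


def suggest(book, baskets, min_support=2):
    """Return books bought together with `book` in at least `min_support` baskets."""
    suggestions = []
    for (a, b), n in pair_counts(baskets).items():
        if n < min_support:
            continue
        if a == book:
            suggestions.append((b, n))
        elif b == book:
            suggestions.append((a, n))
    # Most frequently co-purchased first; ties broken alphabetically.
    suggestions.sort(key=lambda item: (-item[1], item[0]))
    return [title for title, _ in suggestions]


# Hypothetical purchase histories, one set of titles per customer.
baskets = [
    {"Database Systems", "Data Mining", "SQL Basics"},
    {"Database Systems", "Data Mining"},
    {"Database Systems", "SQL Basics"},
    {"Cooking 101", "Data Mining"},
]
print(suggest("Database Systems", baskets))
```

A pair that appears together only once (such as "Cooking 101" and "Data Mining" here) falls below the assumed threshold and is not suggested. As the slide notes, such an association is at most a first hint and does not by itself establish causation.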
Classification Rules

Classification rules help assign new objects to classes.
  E.g., given a new automobile insurance applicant, should he or she be classified as low risk, medium risk or high risk?
Classification rules for the above example could use a variety of data, such as educational level, salary, age, etc.
  ∀ person P, P.degree = masters and P.income > 75,000 ⇒ P.credit = excellent
  ∀ person P, P.degree = bachelors and (P.income ≥ 25,000 and P.income ≤ 75,000) ⇒ P.credit = good
Rules are not necessarily exact: there may be some misclassifications
Classification rules can be shown compactly as a decision tree.
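The two rules above can be written as nested tests, which also hints at how they would read as a decision tree: split on degree first, then on income. This is a minimal sketch; the function name `classify_credit` and the choice to return `None` when neither rule applies are assumptions, since the slide gives no rule for those cases.

```python
from typing import Optional


def classify_credit(degree: str, income: float) -> Optional[str]:
    """Apply the two slide rules; return None when no rule matches."""
    # First split on degree, then on income, as a decision tree would.
    if degree == "masters":
        if income > 75_000:
            return "excellent"
    elif degree == "bachelors":
        if 25_000 <= income <= 75_000:
            return "good"
    # Assumption: cases not covered by either rule are left unclassified.
    return None


print(classify_credit("masters", 90_000))
print(classify_credit("bachelors", 40_000))
print(classify_credit("bachelors", 90_000))
```

Because the rules are not exhaustive, a real classifier would need further branches (or a default class) for the uncovered cases, and even covered cases may be misclassified.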