Chapter 1. Introduction
◼ Why Data Mining?
◼ What Is Data Mining?
◼ A Multi-Dimensional View of Data Mining
◼ What Kinds of Data Can Be Mined?
◼ What Kinds of Patterns Can Be Mined?
◼ What Kinds of Technologies Are Used?
◼ What Kinds of Applications Are Targeted?
◼ Major Issues in Data Mining
◼ A Brief History of Data Mining and Data Mining Society
◼ Summary
What Is Data Mining?
◼ Data mining (knowledge discovery from data)
  ◼ Extraction of interesting (non-trivial, implicit, previously unknown, and potentially useful) patterns or knowledge from huge amounts of data
◼ Alternative names
  ◼ Knowledge discovery (mining) in databases (KDD), knowledge extraction, data/pattern analysis, data archeology, data dredging, information harvesting, business intelligence, etc.
◼ Watch out: Is everything “data mining”?
  ◼ Simple search and query processing
  ◼ (Deductive) expert systems
What is (not) Data Mining?
What is Data Mining?
– Certain names are more prevalent in certain US locations (O’Brien, O’Rourke, O’Reilly… in the Boston area)
– Group together similar documents returned by a search engine according to their context (e.g. Amazon rainforest, Amazon.com)
What is not Data Mining?
– Look up a phone number in a phone directory
– Query a Web search engine for information about “Amazon”
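The contrast above can be made concrete with a small sketch. It assumes a made-up toy directory; the records, the function names `lookup_phone` and `surname_lift`, and the `min_count` threshold are all hypothetical and chosen only for illustration. The first function is a plain lookup (not data mining). The second computes a simple lift score to surface surnames that are over-represented in a city, which is one very basic way to discover the kind of pattern the slide describes.

```python
from collections import Counter

# Hypothetical toy directory: (surname, city, phone). All values are made up.
directory = [
    ("O'Brien", "Boston", "555-0101"),
    ("O'Reilly", "Boston", "555-0102"),
    ("O'Rourke", "Boston", "555-0103"),
    ("O'Brien", "Boston", "555-0104"),
    ("Smith", "Boston", "555-0105"),
    ("Smith", "Denver", "555-0106"),
    ("Garcia", "Denver", "555-0107"),
    ("Nguyen", "Denver", "555-0108"),
    ("O'Brien", "Denver", "555-0109"),
    ("Johnson", "Denver", "555-0110"),
]


def lookup_phone(surname, city):
    """Not data mining: a plain query that returns facts already stored."""
    return [phone for s, c, phone in directory if s == surname and c == city]


def surname_lift(min_count=2):
    """A very simple mining step: surnames over-represented in a city.

    lift = P(surname | city) / P(surname). Values above 1 suggest the
    surname is more prevalent in that city than in the data overall.
    Pairs seen fewer than min_count times are ignored as too sparse.
    """
    total = len(directory)
    by_surname = Counter(s for s, _, _ in directory)
    by_city = Counter(c for _, c, _ in directory)
    by_pair = Counter((s, c) for s, c, _ in directory)

    results = {}
    for (s, c), n in by_pair.items():
        if n < min_count:
            continue
        results[(s, c)] = (n / by_city[c]) / (by_surname[s] / total)
    return results


if __name__ == "__main__":
    print(lookup_phone("Smith", "Denver"))
    for (surname, city), lift in surname_lift().items():
        print(f"{surname} in {city}: lift = {lift:.2f}")
```

The lookup only returns what was stored, while the lift computation yields a statement about the data that no single record contains. On real data a support threshold and a significance check would matter far more than they do in this toy.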
Applications
◼ Banking: loan/credit card approval
  ◼ predict good customers based on old customers
◼ Customer relationship management:
  ◼ identify those who are likely to leave for a competitor
◼ Targeted marketing:
  ◼ identify likely responders to promotions
◼ Fraud detection: telecommunications, financial transactions
  ◼ identify fraudulent events from an online stream of events
◼ Manufacturing and production:
  ◼ automatically adjust knobs when process parameters change
Applications (continued)
◼ Medicine: disease outcome, effectiveness of treatments
  ◼ analyze patient disease history: find relationships between diseases
◼ Molecular/Pharmaceutical: identify new drugs
◼ Scientific data analysis:
  ◼ identify new galaxies by searching for subclusters
◼ Web site/store design and promotion:
  ◼ find affinity of visitors to pages and modify layout