14 Data Rich but Information Poor
• Databases are too big.
• Data mining can help discover knowledge.
• Terrorbytes
15 What is Data Mining? (1)
• Knowledge Discovery in Databases (KDD).
• Discover useful patterns from large data warehouses.
• Nontrivial extraction of implicit, previously unknown, and potentially useful information from data.
– Example: 95% of the salespeople, male or female, who are located in Toronto, are over 6 feet tall, and cannot speak French have made over 1 million in sales every year for the last 5 years.
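The example pattern above has the shape of a rule: among the salespeople who meet a set of conditions, some share also reach a sales target. As a minimal sketch, that share can be computed as below. The records, the field names, and the helper `rule_confidence` are all made up for illustration and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class Salesperson:
    # Hypothetical record layout, assumed for this sketch only.
    city: str
    height_ft: float
    speaks_french: bool
    yearly_sales: list  # sales figures for each of the last 5 years


def matches_conditions(p):
    """Left-hand side of the rule: Toronto, over 6 feet, no French."""
    return p.city == "Toronto" and p.height_ft > 6 and not p.speaks_french


def meets_outcome(p):
    """Right-hand side of the rule: over 1 million in every listed year."""
    return all(s > 1_000_000 for s in p.yearly_sales)


def rule_confidence(people):
    """Share of condition-matching records that also meet the outcome."""
    matching = [p for p in people if matches_conditions(p)]
    if not matching:
        return None  # the rule does not apply to any record
    hits = sum(1 for p in matching if meets_outcome(p))
    return hits / len(matching)


# Toy data, invented for the example.
people = [
    Salesperson("Toronto", 6.2, False, [1.2e6] * 5),
    Salesperson("Toronto", 6.1, False, [1.5e6] * 5),
    Salesperson("Toronto", 6.3, False, [0.9e6, 1.1e6, 1.2e6, 1.3e6, 1.4e6]),
    Salesperson("Toronto", 5.8, False, [2.0e6] * 5),
    Salesperson("Montreal", 6.4, True, [0.5e6] * 5),
]

confidence = rule_confidence(people)
print(f"Rule holds for {confidence:.0%} of matching salespeople")
```

A data mining system would search over many candidate condition sets rather than check one hand-written rule. This sketch only shows how a single pattern's percentage is measured.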
16 What is Data Mining? (2)
[Figure: Data Warehousing and Data Mining. Labeled components: Data Sources, Data Warehouse, Data Mining, Knowledge Base]
17 Data Mining vs. Statistical Inference
[Figure: two histograms, "Female Age Distribution" and "Male Age Distribution"; x-axis: Age, y-axis: N]
Can you tell the differences?
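18 Data Mining vs. Statistical Inference (2)
[Figure: pie chart by clinical department. Legend: Internal Medicine, Acupuncture, Tui Na (massage therapy), Oncology, Gynecology, Respiratory Medicine, Diabetes, Digestive System, Rheumatology, Nephrology, Geriatrics, Neurology. Slice labels: 36%, 11%, 22%, 8%, 6%, 3%, 3%, 2%, 2%, 2%, 1%, 1%, 1%, 0%]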