Copyright © 2014 Pearson Education, Inc. Slide 4-6
Data Mining Concepts and Definitions
Why Data Mining?
▪ More intense competition at the global scale.
▪ Recognition of the value in data sources.
▪ Availability of quality data on customers, vendors, transactions, Web, etc.
▪ Consolidation and integration of data repositories into data warehouses.
▪ The exponential increase in data processing and storage capabilities, and the decrease in their cost.
▪ Movement toward conversion of information resources into nonphysical form.
Slide 4-7
Definition of Data Mining
▪ "The nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data stored in structured databases." (Fayyad et al., 1996)
▪ Keywords in this definition: process, nontrivial, valid, novel, potentially useful, understandable.
▪ Data mining: a misnomer?
▪ Other names: knowledge extraction, pattern analysis, knowledge discovery, information harvesting, pattern searching, data dredging, …
Slide 4-8
Data Mining at the Intersection of Many Disciplines
[Figure: DATA MINING at the center, at the intersection of Statistics, Management Science & Information Systems, Artificial Intelligence, Databases, Pattern Recognition, Machine Learning, and Mathematical Modeling.]
Slide 4-9
Data Mining Characteristics/Objectives
▪ The source of data for DM is often a consolidated data warehouse (not always!).
▪ The DM environment is usually a client-server or a Web-based information systems architecture.
▪ Data is the most critical ingredient for DM, and it may include soft/unstructured data.
▪ The miner is often an end user.
▪ Striking it rich requires creative thinking.
▪ Data mining tools’ capabilities and ease of use are essential (Web, parallel processing, etc.).
Slide 4-10
Application Case 4.1
Smarter Insurance: Infinity P&C Improves Customer Service and Combats Fraud with Predictive Analytics
Questions for Discussion
1. How did Infinity P&C improve customer service with data mining?
2. What were the challenges, the proposed solution, and the obtained results?
3. What was their implementation strategy? Why is it important to produce results as early as possible in data mining studies?