Data Mining in Business Intelligence
Pyramid layers, top to bottom; the potential to support business decisions increases toward the top:
◼ Decision Making (End User)
◼ Data Presentation: Visualization Techniques (Business Analyst)
◼ Data Mining: Information Discovery (Data Analyst)
◼ Data Exploration: Statistical Summary, Querying, and Reporting
◼ Data Preprocessing/Integration, Data Warehouses
◼ Data Sources: Paper, Files, Web documents, Scientific experiments, Database Systems (DBA)
KDD Process: A Typical View from ML and Statistics
Input Data → Data Pre-Processing → Data Mining → Post-Processing
◼ Data Pre-Processing: data integration, normalization, feature selection, dimension reduction, …
◼ Data Mining: pattern discovery, association & correlation, classification, clustering, outlier analysis, …
◼ Post-Processing: pattern evaluation, pattern selection, pattern interpretation, pattern visualization, …
◼ This is a view from typical machine learning and statistics communities
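The three stages above can be sketched as a small pipeline. This is only an illustrative sketch, not a method prescribed by the slides: it assumes min-max normalization for pre-processing, z-score outlier analysis for mining, and a simple top-k ranking for pattern selection. The function names, the threshold, and the sample readings are all hypothetical.

```python
from statistics import mean, pstdev

def preprocess(values):
    """Pre-processing (assumed step): min-max normalize values into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def mine(values, z_threshold=2.0):
    """Mining (assumed step): outlier analysis by z-score.

    Returns (index, z) pairs for points farther than z_threshold
    population standard deviations from the mean.
    """
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    scored = [(i, (v - mu) / sigma) for i, v in enumerate(values)]
    return [(i, z) for i, z in scored if abs(z) > z_threshold]

def postprocess(patterns, top_k=3):
    """Post-processing (assumed step): rank by |z| and keep the top_k patterns."""
    return sorted(patterns, key=lambda p: abs(p[1]), reverse=True)[:top_k]

def kdd_pipeline(raw):
    """Chain the stages: input data -> pre-processing -> mining -> post-processing."""
    return postprocess(mine(preprocess(raw)))

# Hypothetical sensor readings with one obviously unusual value at index 5.
readings = [10.0, 11.0, 9.5, 10.5, 10.2, 55.0, 9.8, 10.1]
outliers = kdd_pipeline(readings)
print(outliers)
```

Each stage is a plain function so that a different technique (for example clustering instead of outlier analysis) could be swapped in without touching the other stages.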
Which View Do You Prefer?
◼ Which view do you prefer?
  ◼ KDD vs. ML/Stat. vs. Business Intelligence
  ◼ Depending on the data, applications, and your focus
◼ Data Mining vs. Data Exploration
  ◼ Business intelligence view
    ◼ Warehouse, data cube, reporting but not much mining
  ◼ Business objects vs. data mining tools
  ◼ Supply chain example: mining vs. OLAP vs. presentation tools
  ◼ Data presentation vs. data exploration
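To make the mining vs. OLAP contrast concrete, here is a minimal sketch on made-up supply-chain data. It assumes that "OLAP" means a roll-up aggregation the analyst asks for explicitly, and that "mining" means discovering product pairs that are frequently ordered together. The records, field layout, and function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical shipment records: (region, order_id, product, units)
shipments = [
    ("east", 1, "bolts", 100),
    ("east", 1, "nuts", 100),
    ("east", 2, "bolts", 50),
    ("west", 3, "bolts", 70),
    ("west", 3, "nuts", 70),
    ("west", 3, "washers", 30),
    ("west", 4, "washers", 20),
]

def rollup_units_by_region(rows):
    """OLAP-style query: aggregate a known measure along a known dimension."""
    totals = defaultdict(int)
    for region, _order, _product, units in rows:
        totals[region] += units
    return dict(totals)

def frequent_pairs(rows, min_support=2):
    """Mining-style query: find product pairs co-occurring in >= min_support orders."""
    baskets = defaultdict(set)
    for _region, order, product, _units in rows:
        baskets[order].add(product)
    counts = Counter()
    for items in baskets.values():
        # Sort so that each pair has one canonical ordering.
        counts.update(combinations(sorted(items), 2))
    return {pair: n for pair, n in counts.items() if n >= min_support}

region_totals = rollup_units_by_region(shipments)
pairs = frequent_pairs(shipments)
print(region_totals)
print(pairs)
```

The roll-up answers a question the analyst already knew to ask, while the pair count surfaces a relationship that was not specified in advance.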
Chapter 1. Introduction
◼ Why Data Mining?
◼ What Is Data Mining?
◼ A Multi-Dimensional View of Data Mining
◼ What Kinds of Data Can Be Mined?
◼ What Kinds of Patterns Can Be Mined?
◼ What Kinds of Technologies Are Used?
◼ What Kinds of Applications Are Targeted?
◼ Major Issues in Data Mining
◼ A Brief History of Data Mining and Data Mining Society
◼ Summary
Multi-Dimensional View of Data Mining
◼ Data to be mined
  ◼ Database data (extended-relational, object-oriented, heterogeneous, legacy), data warehouse, transactional data, stream, spatiotemporal, time-series, sequence, text and web, multi-media, graphs & social and information networks
◼ Knowledge to be mined (or: Data mining functions)
  ◼ Characterization, discrimination, association, classification, clustering, trend/deviation, outlier analysis, etc.
  ◼ Descriptive vs. predictive data mining
  ◼ Multiple/integrated functions and mining at multiple levels
◼ Techniques utilized
  ◼ Data-intensive, data warehouse (OLAP), machine learning, statistics, pattern recognition, visualization, high-performance, etc.
◼ Applications adapted
  ◼ Retail, telecommunication, banking, fraud analysis, bio-data mining, stock market analysis, text mining, Web mining, etc.
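The descriptive vs. predictive distinction above can be illustrated with a toy example. This is a sketch under assumptions: characterization is reduced to a per-class mean, and prediction to a nearest-class-mean rule, neither of which the slides specify. The customer data and function names are hypothetical.

```python
from statistics import mean

# Hypothetical labeled records: (monthly_spend, segment)
customers = [
    (20.0, "low"), (25.0, "low"), (30.0, "low"),
    (80.0, "high"), (90.0, "high"), (100.0, "high"),
]

def characterize(records):
    """Descriptive mining: summarize each class by its mean spend."""
    by_label = {}
    for spend, label in records:
        by_label.setdefault(label, []).append(spend)
    return {label: mean(vals) for label, vals in by_label.items()}

def predict(summary, spend):
    """Predictive mining: assign the label whose class mean is closest to spend."""
    return min(summary, key=lambda label: abs(summary[label] - spend))

summary = characterize(customers)        # describes the data already seen
label_for_new = predict(summary, 70.0)   # infers a label for an unseen record
print(summary, label_for_new)
```

The same summary serves both purposes here: read on its own it describes the existing data, and applied to a new record it makes a prediction.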