Chongqing University: "Data Warehouse and Data Mining" course PPT slides (English edition), Chapter 9 Outlier Analysis

◼ Outlier and Outlier Analysis ◼ Outlier Detection Methods ◼ Statistical Approaches ◼ Proximity-Based Approaches ◼ Clustering-Based Approaches ◼ Classification Approaches ◼ Summary

Document content (about 49 pages)

Slide 1: Chapter 9. Outlier Analysis
◼ Outlier and Outlier Analysis
◼ Outlier Detection Methods
◼ Statistical Approaches
◼ Proximity-Based Approaches
◼ Clustering-Based Approaches
◼ Classification Approaches
◼ Summary

Slide 2: What Is Outlier Discovery?
◼ What are outliers?
  ◼ A set of objects that are considerably dissimilar from the remainder of the data
  ◼ Example: Sports: Michael Jordan, Wayne Gretzky, ...
◼ Problem: Define and find outliers in large data sets
◼ Applications:
  ◼ Credit card fraud detection
  ◼ Telecom fraud detection
  ◼ Customer segmentation
  ◼ Medical analysis
  ◼ Network intrusion detection
  ◼ Fault detection

Slide 3: What Are Outliers?
◼ Outlier: A data object that deviates significantly from the normal objects, as if it were generated by a different mechanism
  ◼ Ex.: Unusual credit card purchase; sports: Michael Jordan, Wayne Gretzky, ...
◼ Outliers are different from noise
  ◼ Noise is random error or variance in a measured variable
  ◼ Noise should be removed before outlier detection
◼ Outliers are interesting: they violate the mechanism that generates the normal data
◼ Outlier detection vs. novelty detection: a novel pattern is treated as an outlier at an early stage, but is later merged into the model

Slide 4: Anomaly Detection
◼ Challenges
  ◼ How many outliers are there in the data?
  ◼ The method is unsupervised
    ◼ Validation can be quite challenging (just like for clustering)
  ◼ Finding a needle in a haystack
◼ Working assumption:
  ◼ There are considerably more "normal" observations than "abnormal" observations (outliers/anomalies) in the data

Slide 5: Anomaly Detection Schemes
◼ General steps
  ◼ Build a profile of the "normal" behavior
    ◼ The profile can be patterns or summary statistics for the overall population
  ◼ Use the "normal" profile to detect anomalies
    ◼ Anomalies are observations whose characteristics differ significantly from the normal profile
◼ Types of anomaly detection schemes
  ◼ Graphical & Statistical-based
  ◼ Distance-based
  ◼ Model-based
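
The two general steps on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in the course: the profile is taken to be the mean and sample standard deviation of a presumed-normal reference sample, "differs significantly" is taken to mean more than 3 standard deviations from the mean, and the names `NormalProfile`, `build_profile`, `detect_anomalies` and the data values are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass(frozen=True)
class NormalProfile:
    """Summary statistics describing the "normal" population (assumed form)."""
    center: float
    spread: float


def build_profile(reference):
    """Step 1: summarize presumed-normal data by its mean and sample std dev."""
    if len(reference) < 2:
        raise ValueError("need at least two reference values")
    return NormalProfile(center=mean(reference), spread=stdev(reference))


def detect_anomalies(profile, observations, threshold=3.0):
    """Step 2: flag observations that differ significantly from the profile.

    The threshold of 3 standard deviations is an assumed convention.
    """
    if profile.spread == 0:
        # Degenerate profile: any value different from the center is anomalous.
        return [x for x in observations if x != profile.center]
    return [x for x in observations
            if abs(x - profile.center) / profile.spread > threshold]


reference = [10, 11, 9, 10, 12, 10, 11, 9, 10, 8]  # illustrative data only
profile = build_profile(reference)
anomalies = detect_anomalies(profile, [10, 13, 95, 9, 2])
print(anomalies)  # [95, 2]
```

Building the profile from a separate reference sample, rather than from the data being screened, keeps a single extreme value from inflating the standard deviation and masking itself. That choice relies on the working assumption that normal observations greatly outnumber anomalies.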
