
University of Electronic Science and Technology of China: Big Data Analysis and Mining, course teaching resources (lecture slides), Lecture 2 Basic Concepts (Foundations of Data Mining)


Document contents (about 116 pages)

Example: Payment Prediction Revisit

• Find the mapping function or model to answer whether one's salary is more than 50k.

Age   Edu. year   HoursPerWeek   Pay
25    7           40             <50k
38    9           50             ≥50k
28    12          40             ≥50k
24    10          40             <50k
55    4           10             ?

If the solid points represent "salary < 50k" and the hollow ones "≥50k", we can use a line to separate those points. Which one is better?
– The blue one.
– Why? It gives the minimum error on the predicted (separated) results.
– A good model should minimize the loss on training data.
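As a minimal sketch of "pick the line with the minimum error", the snippet below counts training mistakes for two candidate separators on the four labeled rows of the payment table. The two lines, the choice of Age and Edu. year as the two plotted features, and the names `predict` and `training_errors` are assumptions made for illustration; they are not taken from the slide's figure.

```python
# Labeled rows from the payment table: (age, edu_year, label); label 1 means ">=50k".
# Only two features are used so that each model is a line in a 2-D plane.
train = [(25, 7, 0), (38, 9, 1), (28, 12, 1), (24, 10, 0)]


def predict(w, b, x):
    """Linear separator: predict 1 (>=50k) when w . x + b >= 0, else 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else 0


def training_errors(w, b, rows):
    """Number of rows whose predicted label differs from the true label."""
    return sum(1 for age, edu, y in rows if predict(w, b, (age, edu)) != y)


# Hypothetical candidate lines (assumed, not the ones drawn on the slide).
candidates = {
    "A": ((1.0, 0.0), -26.5),  # predicts >=50k when age >= 26.5
    "B": ((0.0, 1.0), -8.5),   # predicts >=50k when edu_year >= 8.5
}

errors = {name: training_errors(w, b, train) for name, (w, b) in candidates.items()}
best = min(errors, key=errors.get)
print(errors, "-> choose line", best)
```

Under these assumed lines, A separates all four training rows while B misclassifies the row with 10 education years, so A is preferred on training error alone.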

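Loss Function

To measure the predicted results, we introduce the loss function $L(Y, F(X))$, which is a non-negative function.
– 0-1 loss: $L(y, F(x|\theta)) = \begin{cases} 0, & y = F(x|\theta) \\ 1, & y \neq F(x|\theta) \end{cases}$
– Squared loss: $L(y, F(x|\theta)) = (y - F(x|\theta))^2$
– Absolute loss: $L(y, F(x|\theta)) = |y - F(x|\theta)|$
– Log loss: $L(y, P(y|x, \theta)) = -\log P(y|x, \theta)$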
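The four losses can be written as small functions. This is a sketch rather than a reference implementation: the function names are chosen here for illustration, `y_pred` stands for the model output F(x|θ), and `p_true` stands for the probability P(y|x, θ) that the model assigns to the true label.

```python
import math


def zero_one_loss(y, y_pred):
    """0 when the prediction equals the label, 1 otherwise."""
    return 0 if y == y_pred else 1


def squared_loss(y, y_pred):
    """Squared difference between label and prediction."""
    return (y - y_pred) ** 2


def absolute_loss(y, y_pred):
    """Absolute difference between label and prediction."""
    return abs(y - y_pred)


def log_loss(p_true):
    """Negative log of the probability assigned to the true label."""
    if not 0.0 < p_true <= 1.0:
        raise ValueError("p_true must be in (0, 1]")
    return -math.log(p_true)


print(zero_one_loss(1, 0), squared_loss(3, 1), absolute_loss(3, 5), log_loss(0.5))
```

All four return non-negative values, and each is zero exactly when the prediction is perfect (for the log loss, when the true label gets probability 1).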

Training Loss and Test Loss

Training loss: loss on training data
Test loss: loss on test data

[Figure: performance on training data of three models]
[Figure: performance on training data and test data]

Who wins?
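To make the "who wins?" question concrete, the sketch below scores three toy models on a training set and on a separate test set using the mean squared loss. The data points and the three models (a constant, a fixed line, and a lookup that memorizes the training set) are made up for illustration and do not come from the slide's figures.

```python
# Synthetic data (assumed): roughly y = 2x, with small noise on the training labels.
train = [(0.0, 0.1), (1.0, 1.9), (2.0, 4.2), (3.0, 5.8), (4.0, 8.1)]
test = [(0.5, 1.0), (1.5, 3.0), (2.5, 5.0), (3.5, 7.0)]

mean_y = sum(y for _, y in train) / len(train)


def constant_model(x):
    """Too simple: always predicts the mean training label."""
    return mean_y


def linear_model(x):
    """A fixed line y = 2x (hypothetical parameters)."""
    return 2.0 * x


def memorizer(x):
    """Too flexible: returns the label of the nearest training point."""
    return min(train, key=lambda p: abs(p[0] - x))[1]


def mean_squared_loss(model, data):
    """Average squared loss of a model over a data set."""
    return sum((y - model(x)) ** 2 for x, y in data) / len(data)


models = {"constant": constant_model, "linear": linear_model, "memorizer": memorizer}
train_loss = {name: mean_squared_loss(m, train) for name, m in models.items()}
test_loss = {name: mean_squared_loss(m, test) for name, m in models.items()}

print("best on training data:", min(train_loss, key=train_loss.get))
print("best on test data:", min(test_loss, key=test_loss.get))
```

On this toy data the memorizing model has zero training loss, yet the simple line has the lowest test loss, which is why the training loss alone does not decide the winner.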

Generalization

Empirical risk: $R(F) = \frac{1}{N} \sum_{i=1}^{N} L(y_i, F(x_i))$

Note: A good model cannot take only the training loss into account and minimize the empirical risk. Instead, it should improve generalization.

[Figure: three panels, each showing the model, the true function, and the samples]

Model Selection: To avoid underfitting and overfitting
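One common way to state what "generalization" asks for is to set the empirical risk beside the expected risk over the unknown data distribution. The symbols $R_{\mathrm{exp}}$ and $P(x, y)$ below are notation assumed here for illustration; the slide itself only defines $R(F)$.

```latex
% Empirical risk: average loss over the N training samples (as on the slide).
R(F) = \frac{1}{N} \sum_{i=1}^{N} L\bigl(y_i, F(x_i)\bigr)

% Expected risk (assumed notation): average loss over the data distribution P(x, y).
R_{\mathrm{exp}}(F) = \mathbb{E}_{(x, y) \sim P}\bigl[ L\bigl(y, F(x)\bigr) \bigr]

% Generalization gap: how far the training estimate is from the expected risk.
\bigl| R_{\mathrm{exp}}(F) - R(F) \bigr|
```

Read this way, an underfit model has a large $R(F)$ already, while an overfit model has a small $R(F)$ but a large gap; the test loss serves as an estimate of $R_{\mathrm{exp}}(F)$ because the distribution itself is not observed.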
