UESTC Graduate Course "Machine Learning"
Lecture 8: Unsupervised Learning
郝家胜 (Jiasheng Hao), Ph.D., Associate Professor
Email: hao@uestc.edu.cn
School of Automation Engineering, Center for Robotics
University of Electronic Science and Technology of China, Chengdu 611731
Reference: Zhou Zhihua (周志华), "Machine Learning" (《机器学习》)
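Introduction
Supervised and unsupervised learning:
· Supervised training
  - Every sample in the training set has already been labeled with its class.
· Unsupervised training (unsupervised learning)
  - Uses unlabeled training samples.
  - Our goal is to discover the particular structure hidden in the data.
[Figure: scatter plot of unlabeled samples, axes X1 and X2]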
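As a minimal sketch of what "discovering structure" in unlabeled samples can look like, the following pure-Python k-means groups six 2-D points into two clusters. The slide does not name an algorithm at this point; k-means, the sample points, and the farthest-point initialization are all assumptions chosen for illustration.

```python
import math


def kmeans(points, k, max_iters=100):
    """Plain k-means on a list of 2-D tuples; returns (centers, labels)."""
    # Deterministic initialization (an assumption, for reproducibility):
    # start from the first point, then add the point farthest from all
    # centers chosen so far.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )

    labels = None
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centers, labels


# Hypothetical unlabeled samples (x1, x2); no class labels are given.
points = [(1.0, 1.2), (0.8, 0.9), (1.1, 1.0), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centers, labels = kmeans(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The cluster indices carry no meaning of their own; only the grouping does. Attaching a name to each group is a separate step that needs outside knowledge.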
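Unsupervised recognition has a wide range of applications:
· Collecting and labeling a large sample set is time-consuming and labor-intensive.
  - Example: recordings of speech.
· Working in the reverse direction: first train on a large set of unlabeled samples, then label the resulting groups by hand.
  - Example: data mining applications.
· When the properties of the patterns to be classified change over time, unsupervised methods can substantially improve classifier performance.
  - Example: an automatic food classifier, where the food changes with the seasons.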
Example
· News clustering: Google News, for example, collects news stories from the web, groups them into clusters by topic, and shows the stories of each cluster together.
[Figure: Google News screenshot grouping coverage of a single story, the Northern California Camp Fire (death toll 42), from Fox News, The New York Times, Los Angeles Times, WRAL.com, and ABC10.com]
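The grouping of headlines by topic can be sketched with a very small example. This is not how Google News works; it is a single-pass "leader" clustering over word-count vectors with cosine similarity, and the headlines, stopword list, and threshold are hypothetical values chosen for illustration.

```python
import math
import re
from collections import Counter

# Tiny stopword list (an assumption; real systems use far richer text features).
STOPWORDS = {"the", "a", "in", "of", "to", "as", "on", "for", "and", "is"}


def vectorize(text):
    """Bag-of-words vector: lowercase word counts without stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster_headlines(headlines, threshold=0.25):
    """Assign each headline to the most similar cluster, or start a new one."""
    clusters = []  # each cluster: {"centroid": Counter, "items": [str, ...]}
    for headline in headlines:
        vec = vectorize(headline)
        best = max(clusters, key=lambda c: cosine(vec, c["centroid"]), default=None)
        if best is not None and cosine(vec, best["centroid"]) >= threshold:
            best["items"].append(headline)
            best["centroid"].update(vec)  # accumulate word counts
        else:
            clusters.append({"centroid": Counter(vec), "items": [headline]})
    return [c["items"] for c in clusters]


# Hypothetical headlines, not taken from the slide.
headlines = [
    "Wildfire death toll rises in northern California",
    "California wildfire becomes deadliest in state history",
    "Central bank raises interest rates again",
    "Interest rates rise as central bank fights inflation",
]
groups = cluster_headlines(headlines)
for group in groups:
    print(group)
```

No topic labels are supplied anywhere: the two wildfire headlines and the two interest-rate headlines end up together purely because they share words. The result of leader clustering depends on input order and on the threshold, which is one reason practical systems prefer more robust methods.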
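Example
· Catching unknown fraud and money-laundering attacks early: the DataVisor solution can automatically detect many kinds of new attacks without training labels or historical fraud samples, uncover unknown systemic and large-scale risks, and stop attackers before they do damage. DataVisor's anti-fraud work covers fraudulent behavior such as malicious registration, account takeover, loan fraud, and artificial volume inflation. Its strength is feature computation: accurate data cleaning, field extraction, field splitting, and field combination. Clustering on these features makes it possible to catch fraud rings efficiently and stop fraud in time. Applied to certain fraud scenarios, DataVisor's unsupervised learning can reach an accuracy of up to 99%.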