18 Candidate Algorithms That Passed Review

§ Sequential Patterns
14. GSP: Srikant, R. and Agrawal, R. 1996. Mining Sequential Patterns: Generalizations and Performance Improvements. In Proceedings of the 5th International Conference on Extending Database Technology, 1996.
15. PrefixSpan: J. Pei, J. Han, B. Mortazavi-Asl, H. Pinto, Q. Chen, U. Dayal and M-C. Hsu. PrefixSpan: Mining Sequential Patterns Efficiently by Prefix-Projected Pattern Growth. In ICDE '01.

§ Integrated Mining
16. CBA: Liu, B., Hsu, W. and Ma, Y. M. Integrating classification and association rule mining. KDD-98.

§ Rough Sets
17. Finding reduct: Zdzislaw Pawlak, Rough Sets: Theoretical Aspects of Reasoning about Data, Kluwer Academic Publishers, Norwell, MA, 1992.

§ Graph Mining
18. gSpan: Yan, X. and Han, J. 2002. gSpan: Graph-Based Substructure Pattern Mining. In ICDM '02.

Ten Classic Algorithms

1. C4.5 (extends the ID3 algorithm)
2. The k-means algorithm
3. Support vector machines
4. The Apriori algorithm
5. Expectation-Maximization (EM) algorithm
6. PageRank
7. AdaBoost
8. kNN: k-nearest neighbor classification
9. Naive Bayes
10. CART: Classification and Regression Trees
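
Decision Tree Basics

• The girl's parents arrange blind dates
• The girl grows tired of the hassle
• The girl proposes a decision tree
• The parents screen the candidate men with it

Decision tree on the slide (each leaf is "meet" or "don't meet"):
Age: > 30 → don't meet; <= 30 → check looks
Looks: ugly → don't meet; handsome or average → check income
Income: high → meet; medium → check civil servant; low → don't meet
Civil servant: yes → meet; no → don't meet

Source: EricZhang's Tech Blog (http://leoo2sk.cnblogs.com)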
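
The blind-date screening tree on this slide can be read as a chain of nested conditions. The sketch below is one way to write it down, following the branches as far as the slide's figure can be read. The function name `should_meet`, its parameter names, the string values, and the "medium" income level are assumptions made for illustration.

```python
def should_meet(age, looks, income, civil_servant):
    """Walk the screening tree from the root; return True for 'meet'.

    looks:  "handsome", "average" or "ugly"   (assumed encoding)
    income: "high", "medium" or "low"         (assumed encoding)
    """
    # Root: age is checked first.
    if age > 30:
        return False
    # Second level: looks.
    if looks == "ugly":
        return False
    # Third level: income.
    if income == "high":
        return True
    if income == "low":
        return False
    # Medium income: the civil-servant question decides.
    return bool(civil_servant)


if __name__ == "__main__":
    print(should_meet(age=26, looks="average", income="medium", civil_servant=True))
```

Each `if` corresponds to one internal node of the tree, and each `return` to a leaf, so the order of the tests is the order in which the attributes appear from root to leaf.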

Decision Tree Basics: Example

No.  Headache  Muscle pain  Temperature    Flu
1    yes (1)   yes (1)      normal (0)     N (0)
2    yes (1)   yes (1)      high (1)       Y (1)
3    yes (1)   yes (1)      very high (2)  Y (1)
4    no (0)    yes (1)      normal (0)     N (0)
5    no (0)    no (0)       high (1)       N (0)
6    no (0)    yes (1)      very high (2)  N (1)
7    yes (1)   no (0)       high (1)       Y (1)
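
A common way to decide which attribute of such a table to test first is information gain, the criterion used by ID3. The sketch below computes it for the three attributes of the flu table; it is an illustration under stated assumptions, not necessarily the method the slides go on to use. The names `ROWS`, `ATTRIBUTES`, `entropy`, and `information_gain` are chosen here for illustration. Row 6's flu label appears as "N(1)" in the table, and the sketch takes the numeric code 1, which is an assumption.

```python
from collections import Counter
from math import log2

# Flu table rows: (headache, muscle_pain, temperature, flu).
# Row 6's label is printed as "N(1)"; the numeric code 1 is assumed here.
ROWS = [
    (1, 1, 0, 0),
    (1, 1, 1, 1),
    (1, 1, 2, 1),
    (0, 1, 0, 0),
    (0, 0, 1, 0),
    (0, 1, 2, 1),
    (1, 0, 1, 1),
]
ATTRIBUTES = ["headache", "muscle_pain", "temperature"]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attr_index):
    """Entropy of the labels minus the weighted entropy after splitting."""
    labels = [row[-1] for row in rows]
    groups = {}
    for row in rows:
        # Group the class labels by the value of the chosen attribute.
        groups.setdefault(row[attr_index], []).append(row[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


gains = {name: information_gain(ROWS, i) for i, name in enumerate(ATTRIBUTES)}
best = max(gains, key=gains.get)

if __name__ == "__main__":
    for name, gain in gains.items():
        print(f"{name}: {gain:.3f}")
    print("split first on:", best)
```

Under the row-6 assumption, temperature has the largest gain because the "normal" and "very high" groups each contain a single class, so it would be placed at the root. If row 6 is read as "no flu" instead, the ranking changes, which is why the assumption is called out.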

Decisions in Life and Work (Do It or Not?)

• Always test first the condition that is most decisive for the outcome.
• Example: should we play badminton outdoors? Wind is the most decisive factor.