Statistical Learning Theory and Applications
Lecture 7: Non-Linear Classification Model - Ensemble Methods
Instructor: Quan Wen, SCSE@UESTC, Fall 2021
Outline (Level 1)
1 Basic principle
2 Multiple classifier combination
3 Bagging
4 Boosting
1. Basic principle

● In any application, we can choose among several learning algorithms.
● The No Free Lunch Theorem: no single learning algorithm produces the most accurate learner in every domain.
● Try many and choose the one with the best cross-validation results, as sketched below.
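A minimal sketch of this selection procedure, assuming scikit-learn is available; the three candidate learners and the synthetic dataset are illustrative choices, not from the lecture:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic data standing in for an arbitrary application domain.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

candidates = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "k-NN": KNeighborsClassifier(),
    "decision tree": DecisionTreeClassifier(random_state=0),
}

# Keep the learner with the best 5-fold cross-validation accuracy.
for name, clf in candidates.items():
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: mean CV accuracy = {scores.mean():.3f}")
```

Which learner wins depends entirely on the data; that is exactly the point of the No Free Lunch Theorem.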
Rationale - 1

● On the other hand …
● Each learning model comes with a set of assumptions, and thus a bias.
● Learning is an ill-posed problem (finite data): each model converges to a different solution and fails under different circumstances.
● Why not combine multiple learners intelligently, which may lead to improved results?
● Why does it work? Suppose there are 25 base classifiers, each with error rate ε = 0.35.
● If the base classifiers are identical, and thus fully dependent, the ensemble misclassifies exactly the same samples as each base classifier, so its error rate stays at 0.35.
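By contrast, if the 25 base classifiers erred independently, a majority vote would be wrong only when 13 or more of them are wrong at once. A minimal sketch of this binomial-tail computation (the independence assumption is an idealization, added here for illustration):

```python
from math import comb

n = 25      # number of base classifiers
eps = 0.35  # error rate of each base classifier

# Majority vote errs only when more than half (k >= 13) of the
# independent base classifiers err: a binomial tail probability.
ensemble_error = sum(
    comb(n, k) * eps**k * (1 - eps)**(n - k)
    for k in range(n // 2 + 1, n + 1)
)

print(f"single classifier error: {eps:.2f}")            # 0.35
print(f"majority-vote error    : {ensemble_error:.3f}")  # about 0.06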