Rationale - 2

Assume the classifiers are independent, i.e., their errors are uncorrelated. Then the ensemble makes a wrong prediction only if more than half of the base classifiers predict incorrectly.

Probability that the ensemble classifier makes a wrong prediction:

\sum_{i=13}^{25} \binom{25}{i} \underbrace{\varepsilon^{i}}_{\text{wrong probability}} \underbrace{(1-\varepsilon)^{25-i}}_{\text{correct probability}} = 0.06

Note: i ≥ 13, n = 25, ε = 0.35 (binomial distribution).
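The 0.06 figure can be checked directly by summing the binomial tail. A minimal sketch (the function name is illustrative, not from the slides):

```python
from math import comb

def ensemble_error(n=25, eps=0.35):
    """Probability that a majority of n independent base classifiers,
    each with error rate eps, are wrong at the same time."""
    k = n // 2 + 1  # smallest wrong-majority count (13 for n = 25)
    return sum(comb(n, i) * eps**i * (1 - eps)**(n - i)
               for i in range(k, n + 1))

print(round(ensemble_error(), 3))  # roughly 0.06, far below the base error of 0.35
```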
Works if …

The base classifiers should be independent.
The base classifiers should do better than a classifier that performs random guessing (error < 0.5).
In practice, it is hard to make the base classifiers perfectly independent. Nevertheless, improvements have been observed in ensemble methods even when the base classifiers are slightly correlated.
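The error < 0.5 condition can be verified numerically: with the 25-classifier binomial model and independence assumed, the majority vote only improves on the base error when that error is below 0.5 (a sketch, not from the slides):

```python
from math import comb

def ensemble_error(eps, n=25):
    """Wrong-majority probability for n independent base classifiers,
    each with base error rate eps."""
    return sum(comb(n, i) * eps**i * (1 - eps)**(n - i)
               for i in range(n // 2 + 1, n + 1))

for eps in (0.35, 0.5, 0.6):
    print(eps, round(ensemble_error(eps), 3))
# Below 0.5 the ensemble beats the base error; at 0.5 it matches it;
# above 0.5 the ensemble is even worse than a single classifier.
```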
Rationale

One important note: when we generate multiple base learners, we want them to be reasonably accurate, but we do not require them to be very accurate individually. They are not, and need not be, optimized separately for best accuracy. The base learners are chosen not for their accuracy, but for their simplicity.
Outline (Level 1)

1 Basic principle
2 Multiple classifier combination
3 Bagging
4 Boosting
2. Multiple classifier combination

Average results from different models.

Why?
  Better classification performance than individual classifiers
  More resilience to noise

Why not?
  Time consuming
  Overfitting
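For class labels, "averaging" the models' results usually means majority voting over their per-sample predictions. A minimal sketch (the predictions below are made up for illustration):

```python
from collections import Counter

def majority_vote(predictions):
    """Combine label predictions by majority vote.

    predictions: one row per base classifier, each row holding that
    classifier's predicted label for every sample."""
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*predictions)]  # vote column-wise, per sample

# Hypothetical outputs of three base classifiers on three samples.
preds = [[0, 1, 1],
         [0, 0, 1],
         [1, 1, 1]]
print(majority_vote(preds))  # -> [0, 1, 1]
```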