Boosting of classifiers Ata Kaban. Motivation & beginnings Suppose we have a learning algorithm that is guaranteed with high probability to be slightly.

Boosting of classifiers Ata Kaban

Motivation & beginnings Suppose we have a learning algorithm that is guaranteed with high probability to be slightly better than random guessing – we call this a weak learner – E.g. if an email contains the work “money” then classify it as spam, otherwise as non-spam Is it possible to use this weak learning algorithm to create a strong classifier with error rate close to 0? Ensemble learning – the wisdom of crowds – More heads are better than one

Motivation & beginnings

Idea Use the weak learning algorithm to produce a collection of weak classifiers – Modify the input each time when asking for a new weak classifier Weight the training points differently Find a good way to combine them

Main idea behind Adaboost Iterative algorithm Maintains a distribution of weights over the training examples Initially weights are equal At successive iterations the weight of misclassified examples is increased This forces the algorithm to focus on the examples that have not been classified correctly in previous rounds Take a linear combination of the predictions of the weak learners, with coefficients proportional to the performance of the weak learner.

Demo – first two rounds

Demo (cont’d) – third round …

Demo (cont’d) – the final classifier

Pseudo-code

Details for pseudo-code

The weights of training points

The combination coefficients

Comments One can show that the training error of Adaboost drops exponentially fast as the rounds progress The more rounds the more complex the final classifier is, so overfitting can happen In practice overfitting is rarely observed and Adaboost tends to have excellent generalisation performance

Typical behaviour

Practical advantages of Adaboost Can construct arbitrarily complex decision regions Generic: Can use any classifier as weak learner, we only need it to be slightly better than random guessing Simple to implement Fast to run Adaboost is one of the ‘top 10’ algorithms in data mining

Caveats Adaboost can fail if there is noise in the class labels (wrong labels) It can fail if the weak-learners are too complex It can fail of the weak-learners are no better than random guessing

Topics not covered Other combination schemes for classifiers – E.g. Bagging Combinations for unsupervised learning

Further readings Robert E. Schapire. The boosting approach to machine learning. In D. D. Denison, M. H. Hansen, C. Holmes, B. Mallick, B. Yu, editors, Nonlinear Estimation and Classification. Springer, 2003. Robi Polikar. Ensemble Based Systems in Decision Making, IEEE Circuits and Systems Magazine, 6(3), pp. 21-45, 2006. http://users.rowan.edu/~polikar/RESEARCH/PUBLICATIONS/csm06.pdf http://users.rowan.edu/~polikar/RESEARCH/PUBLICATIONS/csm06.pdf Thomas G. Dietterich. An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization. Machine Learning, 40(2): 139-158, 2000. Collection of papers: http://www.cs.gmu.edu/~carlotta/teaching/CS-795-s09/info.html

Boosting of classifiers Ata Kaban. Motivation & beginnings Suppose we have a learning algorithm that is guaranteed with high probability to be slightly.

Similar presentations

Presentation on theme: "Boosting of classifiers Ata Kaban. Motivation & beginnings Suppose we have a learning algorithm that is guaranteed with high probability to be slightly."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Boosting of classifiers Ata Kaban. Motivation & beginnings Suppose we have a learning algorithm that is guaranteed with high probability to be slightly.

Similar presentations

Presentation on theme: "Boosting of classifiers Ata Kaban. Motivation & beginnings Suppose we have a learning algorithm that is guaranteed with high probability to be slightly."— Presentation transcript:

Similar presentations

About project

Feedback