1
Adaboost Team G Youngmin Jun (ym_jun@naver.com)
Shekhroz Khudoyarov Jaewan Choi Alexandre Larzat
2
Contents
Introduction
Explanation of AdaBoost
Mathematical Formula
Image Explanation
How to solve
3
Overview of boosting
Introduced by Schapire and Freund in the 1990s.
"Boosting": convert a weak learning algorithm into a strong one.
Main idea: combine many weak classifiers to produce a powerful committee.
Algorithms: AdaBoost (adaptive boosting), Gentle AdaBoost, BrownBoost, Gradient Tree Boosting, XGBoost.
R. Schapire and Y. Freund won the 2003 Gödel Prize (one of the most prestigious awards in theoretical computer science).
Prize-winning paper (which introduced AdaBoost): "A decision-theoretic generalization of on-line learning and an application to boosting," Journal of Computer and System Sciences, 1997, 55:
4
What is AdaBoost?
Boosting is an approach to machine learning based on the idea of creating a highly accurate prediction rule by combining many relatively weak and inaccurate rules.
AdaBoost, short for Adaptive Boosting, is a machine learning meta-algorithm. It can be used in conjunction with many other types of learning algorithms to improve performance. The outputs of the other learning algorithms ("weak learners") are combined into a weighted sum that represents the final output of the boosted classifier.
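The weighted sum described above can be illustrated with a minimal sketch. The function name `boosted_predict`, the example votes, and the example weights are hypothetical and chosen only for illustration; breaking a tie at zero toward +1 is also an assumption.

```python
def boosted_predict(weak_outputs, alphas):
    """Combine weak-learner outputs (+1 or -1) into one boosted prediction."""
    # Weighted sum of the weak learners' votes.
    score = sum(a * h for a, h in zip(alphas, weak_outputs))
    # The sign of the sum is the boosted output (ties go to +1 by assumption).
    return 1 if score >= 0 else -1


# Three hypothetical weak learners vote -1, +1, +1 with illustrative weights.
print(boosted_predict([-1, +1, +1], [0.4, 0.6, 0.9]))
```

A learner with a large weight can be outvoted only when the other learners together carry more weight, which is what lets accurate weak learners dominate the final output.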
5
AdaBoost Terms
Learner = Hypothesis = Classifier
Classifier: h, with output in [-1, +1]
Weak learner: < 50% error over any distribution
Strong classifier: thresholded linear combination of weak learner outputs
[Figure: error rate line with ticks at 0.5 and 1.0, marking the weak classifier, the strong classifier, and the discarded part]
6
Introduction of Adaboost
Weight
7
The Main Algorithm of Adaboost
1. Initialize every weight to 1/m.
2. Repeat for J iterations (loop index j):
   a. Find the classifier with the minimum error.
   b. Sum the weights $D_j(i)$ of the misclassified examples to obtain the overall error $\epsilon_j$.
   c. If $\epsilon_j \geq 0.5$, stop, because the classifier is no better than random guessing.
   d. Otherwise ($\epsilon_j < 0.5$), compute $\alpha_j = \log\frac{1 - \epsilon_j}{\epsilon_j}$.
   e. Once $\alpha_j$ is set, update the weights to $D_{j+1}(i)$.
3. Form the strong classifier as a linear combination of the weak classifiers.
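The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the presenters' implementation: it assumes one-dimensional inputs and threshold classifiers ("stumps") as the weak learners, and it uses the $\frac{1}{2}\ln$ form of $\alpha$ and the exponential weight update from the formal description later in the deck. All function names and the toy data are hypothetical.

```python
import math


def stump_predict(x, threshold, polarity):
    """Return `polarity` left of the threshold and -polarity right of it."""
    return polarity if x < threshold else -polarity


def weighted_error(xs, ys, weights, threshold, polarity):
    """Sum of the weights of the examples that the stump misclassifies."""
    return sum(w for x, y, w in zip(xs, ys, weights)
               if stump_predict(x, threshold, polarity) != y)


def adaboost_train(xs, ys, rounds):
    m = len(xs)
    weights = [1.0 / m] * m                    # initialize every weight to 1/m
    pts = sorted(set(xs))
    # Candidate thresholds: midpoints between consecutive distinct inputs.
    candidates = [(a + b) / 2 for a, b in zip(pts, pts[1:])]
    model = []                                 # (alpha, threshold, polarity)
    for _ in range(rounds):
        # Pick the stump with the minimum weighted error.
        err, thr, pol = min(
            (weighted_error(xs, ys, weights, t, p), t, p)
            for t in candidates for p in (+1, -1))
        if err >= 0.5:                         # no better than random: stop
            break
        err = max(err, 1e-12)                  # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, thr, pol))
        # Raise the weights of misclassified examples, lower the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(x, thr, pol))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]  # renormalize to sum to 1
    return model


def adaboost_predict(model, x):
    """Sign of the alpha-weighted sum of the stump outputs."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in model)
    return 1 if score >= 0 else -1


# Hypothetical toy data: no single threshold separates the two classes.
toy_x = list(range(10))
toy_y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
toy_model = adaboost_train(toy_x, toy_y, rounds=3)
print([adaboost_predict(toy_model, x) for x in toy_x])
```

Searching only midpoints between observed inputs keeps the candidate set small; any threshold between two neighbouring points classifies the training set identically.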
8
The Main Algorithm of Adaboost
9
Algorithm of AdaBoost
Given a training set with two classes:
$$T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$$
where $x_i \in \mathbb{R}^n$ and $y_i \in \{-1, 1\}$. The procedure of AdaBoost can be described as follows:
Input: training set $T$
Output: the final classifier $G(x)$
10
Algorithm of AdaBoost
Initialize the weights of the training examples:
$$D_1 = (w_{11}, \ldots, w_{1i}, \ldots, w_{1n}), \quad w_{1i} = \frac{1}{n}, \quad i = 1, 2, \ldots, n$$
For $m = 1, 2, \ldots, M$ (where $M$ is the number of weak classifiers):
Fit a classifier $G_m(x)$ to the training data using the weights $w_{mi}$.
Compute the misclassification error of $G_m(x)$:
$$e_m = P(G_m(x_i) \neq y_i) = \sum_{i=1}^{n} w_{mi}\, I(G_m(x_i) \neq y_i) \quad (1)$$
11
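Algorithm of AdaBoost
Compute the weight $\alpha_m$ for this classifier $G_m(x)$:
$$\alpha_m = \frac{1}{2} \ln \frac{1 - e_m}{e_m} \quad (2)$$
Update the weights of the training examples:
$$D_{m+1} = (w_{m+1,1}, \ldots, w_{m+1,i}, \ldots, w_{m+1,n}) \quad (3)$$
$$w_{m+1,i} = \frac{w_{m,i}}{Z_m} \exp(-\alpha_m y_i G_m(x_i)) \quad (4)$$
where
$$Z_m = \sum_{i=1}^{n} w_{mi} \exp(-\alpha_m y_i G_m(x_i)) \quad (5)$$
is the normalization factor that rescales the weights $w_{m+1,i}$ so that they sum to 1.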
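As a worked step that the slide does not spell out, the normalization sum $Z_m$ can be written in closed form. The sketch below assumes the weights $w_{mi}$ already sum to 1, so the misclassified examples carry total weight $e_m$ and the correctly classified ones carry $1 - e_m$; it also uses $y_i G_m(x_i) = +1$ for a correct prediction and $-1$ for a wrong one.

```latex
\begin{align*}
Z_m &= (1 - e_m)\, e^{-\alpha_m} + e_m\, e^{\alpha_m},
      \qquad e^{\alpha_m} = \sqrt{\frac{1 - e_m}{e_m}} \\
    &= (1 - e_m)\sqrt{\frac{e_m}{1 - e_m}} + e_m\sqrt{\frac{1 - e_m}{e_m}}
     = 2\sqrt{e_m (1 - e_m)} \\[4pt]
w_{m+1,i} &=
  \begin{cases}
    \dfrac{w_{m,i}}{2\,(1 - e_m)} & \text{if } G_m(x_i) = y_i \\[8pt]
    \dfrac{w_{m,i}}{2\, e_m}      & \text{if } G_m(x_i) \neq y_i
  \end{cases}
\end{align*}
```

Under this form the misclassified examples always end up with total weight 1/2 after the update, which is why the same classifier is no better than chance on the next round.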
12
Algorithm of AdaBoost
The final classifier $G(x)$ is the sign of the weighted sum of the classifier outputs from the $M$ iterations, each weighted by its $\alpha_m$:
$$G(x) = \operatorname{sign}(f(x)) = \operatorname{sign}\left(\sum_{m=1}^{M} \alpha_m G_m(x)\right) \quad (6)$$
$\alpha_m$ stands for the weight of the $m$-th classifier. According to Equation (2), $\alpha_m \geq 0$ when $e_m \leq \frac{1}{2}$. In addition, $\alpha_m$ increases as $e_m$ decreases. Therefore, classifiers with lower classification error have higher weights in the final classifier.
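The two properties of $\alpha_m$ stated above follow directly from its definition. A short derivation, assuming $0 < e_m < 1$ so that the logarithm is defined:

```latex
\begin{align*}
\alpha_m \geq 0
  &\iff \ln\frac{1 - e_m}{e_m} \geq 0
   \iff \frac{1 - e_m}{e_m} \geq 1
   \iff e_m \leq \frac{1}{2} \\[4pt]
\frac{d\alpha_m}{d e_m}
  &= \frac{1}{2}\left(-\frac{1}{1 - e_m} - \frac{1}{e_m}\right)
   = -\frac{1}{2\, e_m (1 - e_m)} < 0
\end{align*}
```

The negative derivative confirms that a smaller error always yields a larger classifier weight.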
13
Example of AdaBoost
Example of classification: circles and triangles
14
Example of AdaBoost
$\varepsilon_1 = 0.30$, $\alpha_1 = 0.42$
15
Example of AdaBoost
$\varepsilon_2 = 0.21$, $\alpha_2 = 0.65$
16
Example of AdaBoost
$\varepsilon_3 = 0.14$, $\alpha_3 = 0.92$
17
Example of AdaBoost
$\varepsilon_1 = 0.30$, $\alpha_1 = 0.42$
$\varepsilon_2 = 0.21$, $\alpha_2 = 0.65$
$\varepsilon_3 = 0.14$, $\alpha_3 = 0.92$
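The $\alpha$ values in this example can be checked against the formula $\alpha = \frac{1}{2}\ln\frac{1-\varepsilon}{\varepsilon}$. The sketch below is only a sanity check; the function name `classifier_weight` is hypothetical.

```python
import math


def classifier_weight(error):
    """alpha = 0.5 * ln((1 - error) / error), defined for 0 < error < 1."""
    return 0.5 * math.log((1.0 - error) / error)


# Error rates as printed on the slides (rounded to two decimals there).
for eps in (0.30, 0.21, 0.14):
    print(f"eps = {eps:.2f} -> alpha = {classifier_weight(eps):.4f}")
```

The first value reproduces 0.42. The other two land within about 0.01 of the slide values, which would be consistent with the slides computing $\alpha$ from unrounded error rates; that explanation is an assumption.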
18
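Example
Training data:
[Table: training data with rows X, Y, and $\omega_i$; every initial weight $\omega_i$ is 0.1]
First iteration:
Best threshold is 2.5, with
$$h_1(x) = \begin{cases} +1 & \text{if } x < 2.5 \\ -1 & \text{if } x > 2.5 \end{cases}$$
$$E_1 = 3 \cdot 0.1 = 0.3$$
$$\alpha_1 = 0.5 \cdot \ln\frac{1 - 0.3}{0.3} = 0.423649$$
$$\omega_i^{(s+1)} = \begin{cases} \omega_i^{(s)} \cdot e^{-0.423649} \approx \omega_i^{(s)} \cdot 0.6547 & \text{if correct} \\ \omega_i^{(s)} \cdot e^{0.423649} \approx \omega_i^{(s)} \cdot 1.5275 & \text{if wrong} \end{cases}$$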
19
Example
Updated data:
[Table: rows X, Y, pre-normalization $\omega_i$, and updated $\omega_i$]
$$Z = 7 \cdot 0.065465 + 3 \cdot 0.152753 \approx 0.9165$$
$$H(x) = \operatorname{sign}(0.423649 \cdot h_1(x))$$
3 errors
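The arithmetic of the first update can be reproduced with a short script. The X and Y rows of the table are not readable in this transcript, so the data below is an assumption: ten points with labels chosen so that the threshold 2.5 produces exactly 3 errors, as stated on the slide. The variable names are hypothetical.

```python
import math

# ASSUMPTION: hypothetical data standing in for the unreadable table rows.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
weights = [0.1] * len(xs)                 # initial weights from the slide


def h1(x):
    """First weak classifier: threshold at 2.5."""
    return 1 if x < 2.5 else -1


wrong = [h1(x) != y for x, y in zip(xs, ys)]
e1 = sum(w for w, bad in zip(weights, wrong) if bad)      # weighted error
alpha1 = 0.5 * math.log((1 - e1) / e1)                    # classifier weight

# Pre-normalization weights: shrink correct examples, grow wrong ones.
prenorm = [w * math.exp(alpha1 if bad else -alpha1)
           for w, bad in zip(weights, wrong)]
z = sum(prenorm)                                          # normalization factor
updated = [w / z for w in prenorm]

print(f"E1 = {e1:.4f}, alpha1 = {alpha1:.6f}, Z = {z:.6f}")
print([round(w, 6) for w in updated])
```

After normalization the three misclassified points carry half of the total weight between them, which is what pushes the second iteration toward a different threshold.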
20
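Example
Training data:
[Table: rows X, Y, and $\omega_i$ after the first update]
Second iteration:
Best threshold is 8.5 (smallest error), with
$$h_2(x) = \begin{cases} +1 & \text{if } x < 8.5 \\ -1 & \text{if } x > 8.5 \end{cases}$$
$$E_2 = 3 \cdot 0.07143 = 0.21429$$
$$\alpha_2 = 0.5 \cdot \ln\frac{1 - 0.21429}{0.21429} = 0.64963$$
$$\omega_i^{(s+1)} = \begin{cases} \omega_i^{(s)} \cdot e^{-0.64963} \approx \omega_i^{(s)} \cdot 0.5222 & \text{if correct} \\ \omega_i^{(s)} \cdot e^{0.64963} \approx \omega_i^{(s)} \cdot 1.9148 & \text{if wrong} \end{cases}$$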
21
Example
Updated data:
[Table: rows X, Y, pre-normalization $\omega_i$, and updated $\omega_i$]
$$Z = 4 \cdot 0.037302 + 3 \cdot 0.087039 + 3 \cdot 0.136775 \approx 0.8207$$
$$H(x) = \operatorname{sign}(0.423649 \cdot h_1(x) + 0.64963 \cdot h_2(x))$$
Still 3 errors
22
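Example
Training data:
[Table: rows X, Y, and $\omega_i$ after the second update]
Third iteration:
Best threshold is 5.5 (smallest error), with
$$h_3(x) = \begin{cases} +1 & \text{if } x > 5.5 \\ -1 & \text{if } x < 5.5 \end{cases}$$
$$E_3 = 4 \cdot 0.04545 \approx 0.1818$$
$$\alpha_3 = 0.5 \cdot \ln\frac{1 - 0.1818}{0.1818} \approx 0.752$$
$$\omega_i^{(s+1)} = \begin{cases} \omega_i^{(s)} \cdot e^{-0.752} & \text{if correct} \\ \omega_i^{(s)} \cdot e^{0.752} & \text{if wrong} \end{cases}$$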
23
Example
Training data:
[Table: rows X and Y]
$$H(x) = \operatorname{sign}(0.423649 \cdot h_1(x) + 0.64963 \cdot h_2(x) + 0.752 \cdot h_3(x))$$
0 errors
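The error counts reported across the three iterations can be checked by replaying the example with the three thresholds fixed. As before, the data is an assumption, because the table rows are not readable in this transcript; the labels are hypothetical values chosen to match the thresholds and error counts on the slides.

```python
import math

# ASSUMPTION: hypothetical data standing in for the unreadable table rows.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

# The three weak classifiers named on the slides.
stumps = [
    lambda x: 1 if x < 2.5 else -1,   # h1
    lambda x: 1 if x < 8.5 else -1,   # h2
    lambda x: 1 if x > 5.5 else -1,   # h3
]

alphas = []


def ensemble(x):
    """Sign of the weighted vote of the stumps trained so far."""
    score = sum(a * h(x) for a, h in zip(alphas, stumps))
    return 1 if score >= 0 else -1


weights = [1.0 / len(xs)] * len(xs)
errors_per_round = []
for h in stumps:
    # Weighted error and classifier weight for this round.
    e = sum(w for x, y, w in zip(xs, ys, weights) if h(x) != y)
    alphas.append(0.5 * math.log((1 - e) / e))
    # Exponential update followed by renormalization.
    weights = [w * math.exp(-alphas[-1] * y * h(x))
               for x, y, w in zip(xs, ys, weights)]
    z = sum(weights)
    weights = [w / z for w in weights]
    # Training errors of the combined classifier after this round.
    errors_per_round.append(sum(ensemble(x) != y for x, y in zip(xs, ys)))

print([round(a, 4) for a in alphas], errors_per_round)
```

On the assumed data this reports 3, 3, and 0 training errors after the first, second, and third rounds, matching the counts on the slides.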
24
Reference https://infinitescript.com/2016/09/adaboost/
machine-learning/ %EB%98%91%EB%98%91%ED%95%98%EB%8B%A42-adaboost-ba
Yoav Freund, Robert Schapire, A Short Introduction to Boosting
Robert Schapire, The Boosting Approach to Machine Learning; Princeton University
Yoav Freund, Robert Schapire, A decision-theoretic generalization of on-line learning and an application to boosting
Pengyu Hong, Statistical Machine Learning lecture notes
25
Thank you for listening!