1
AdaBoost Algorithm and Its Application to Object Detection (Fayin Li)
2
Motivation and Outline
– Object detection can be cast as a classification problem (object / non-object).
– Rowley, Baluja and Kanade use a two-layer neural network to detect faces; Sung and Poggio use an SVM-based approach to detect faces and pedestrians.
– There are too many features: how can features be selected efficiently?
– Outline: the AdaBoost algorithm and its application to face / pedestrian detection.
3
AdaBoost Algorithm
Combine the results of multiple “weak” classifiers into a single “strong” classifier by reusing or selecting data and adaptively re-weighting the samples before combining them.
Given: training data (x_1, y_1), …, (x_m, y_m), where x_i ∈ X and y_i ∈ Y = {−1, +1}.
– For t = 1, …, T: train a weak classifier h_t : X → Y on the training data, then modify the training set in some way.
– The final hypothesis H(x) is some combination of all the weak hypotheses: H(x) = f(h_1(x), …, h_T(x)).
Question: how should the training set be modified, and how should the hypotheses be combined? (Bagging answers this with random resampling and a majority vote for the final hypothesis; a sketch follows below.)
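As a point of contrast with the re-weighting that AdaBoost uses, here is a minimal sketch of the bagging answer mentioned above: bootstrap resampling plus an unweighted majority vote. The helper `train_weak` and labels in {−1, +1} are assumptions for illustration.

```python
import random

def bagging(train_weak, data, labels, T=10):
    """Train T weak classifiers on bootstrap samples; combine them by majority vote."""
    n = len(data)
    classifiers = []
    for _ in range(T):
        idx = [random.randrange(n) for _ in range(n)]   # random selection, with replacement
        h = train_weak([data[i] for i in idx], [labels[i] for i in idx])
        classifiers.append(h)
    # Final hypothesis H(x): an unweighted vote over the weak classifiers (labels in {-1, +1})
    return lambda x: 1 if sum(h(x) for h in classifiers) >= 0 else -1
```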
4
AdaBoost Algorithm
Boosting makes two main modifications:
1. Instead of a random sample of the training data, use a weighted sample to focus on the most difficult examples.
2. Instead of combining the classifiers with an equal vote, use a weighted vote.
5
Updating the weights of examples
[Figure: weak classifiers 1, 2 and 3 trained in sequence; the weights of misclassified examples are increased after each round, and the final classifier is a linear combination of the weak classifiers.]
6
AdaBoost Algorithm
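The algorithm listing on this slide did not survive as text, so the following is a minimal sketch of the standard discrete AdaBoost loop described on the surrounding slides (Freund and Schapire's formulation), with an exhaustive single-feature decision stump standing in as the weak learner; all identifiers are illustrative and the stump search is deliberately naive.

```python
import math

def train_stump(X, y, w):
    """Weak learner: pick the (feature, threshold, parity) with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted(set(row[j] for row in X)):
            for parity in (+1, -1):
                pred = [parity if row[j] >= thr else -parity for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, parity)
    err, j, thr, parity = best
    return (lambda x, j=j, thr=thr, parity=parity: parity if x[j] >= thr else -parity), err

def adaboost(X, y, T=20):
    """X: list of feature vectors, y: labels in {-1, +1}. Returns H(x) = sign(sum_t alpha_t h_t(x))."""
    m = len(X)
    w = [1.0 / m] * m                          # D_1(i) = 1/m
    hyps = []
    for _ in range(T):
        h, eps = train_stump(X, y, w)          # weak classifier h_t and its weighted error eps_t
        eps = min(max(eps, 1e-12), 1 - 1e-12)  # guard against degenerate errors
        alpha = 0.5 * math.log((1 - eps) / eps)
        # Re-weight: decrease weights of correctly classified examples, increase the rest
        w = [wi * math.exp(-alpha * yi * h(xi)) for wi, xi, yi in zip(w, X, y)]
        Z = sum(w)                             # normalization factor Z_t
        w = [wi / Z for wi in w]
        hyps.append((alpha, h))
    return lambda x: 1 if sum(a * h(x) for a, h in hyps) >= 0 else -1
```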
7
How to Choose α_t
The training error of the strong classifier is bounded by the product of the normalization factors Z_t, where Z_t = Σ_i D_t(i) exp(−α_t y_i h_t(x_i)).
If sample x_i is classified wrongly, the corresponding exponential term is at least 1, so the bound dominates the 0/1 error; thus minimizing each Z_t minimizes this error bound.
Therefore α_t should be chosen to minimize Z_t, and the “weak classifier” should be modified to minimize Z_t instead of the squared error.
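For reference, the bound this slide alludes to (the standard one in Freund and Schapire's analysis) can be written out as follows, with f(x) = Σ_t α_t h_t(x), H(x) = sign(f(x)) and D_1(i) = 1/m; the inequality holds because a misclassified example has y_i f(x_i) ≤ 0 and hence exp(−y_i f(x_i)) ≥ 1.

```latex
% Why minimizing Z_t minimizes the training-error bound (standard AdaBoost analysis).
\begin{align}
  Z_t &= \sum_i D_t(i)\, e^{-\alpha_t y_i h_t(x_i)}, \qquad
  D_{t+1}(i) = \frac{D_t(i)\, e^{-\alpha_t y_i h_t(x_i)}}{Z_t}, \\
  D_{T+1}(i) &= \frac{e^{-y_i f(x_i)}}{m \prod_t Z_t}
  \;\Longrightarrow\; \sum_i e^{-y_i f(x_i)} = m \prod_t Z_t, \\
  \frac{1}{m}\bigl|\{\, i : H(x_i) \neq y_i \,\}\bigr|
  &\le \frac{1}{m}\sum_i e^{-y_i f(x_i)} = \prod_t Z_t .
\end{align}
```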
8
Computing α_t Analytically
If we restrict h_t(x) ∈ {−1, +1}, then y_i h_t(x_i) = +1 if the example is classified correctly and −1 otherwise.
Let ε_t = Σ_{i : h_t(x_i) ≠ y_i} D_t(i). Setting dZ_t/dα_t = 0 gives α_t = ½ ln((1 − ε_t)/ε_t).
The weight update rule is then D_{t+1}(i) = D_t(i) exp(−α_t y_i h_t(x_i)), after which the weights are normalized.
If an example is classified correctly its weight is decreased, and the weights of misclassified examples are increased.
In practice, a different loss function can be defined for the weak hypothesis; Freund and Schapire define a pseudo-loss function.
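A small numeric check of these formulas, using toy numbers that are not from the slides: four equally weighted examples, one of which the weak classifier gets wrong, so ε_t = 0.25.

```python
import math

# Toy illustration of the closed-form alpha_t and the weight update.
# Four examples, D_t(i) = 0.25 each; suppose h_t gets the last example wrong.
D = [0.25, 0.25, 0.25, 0.25]
correct = [True, True, True, False]

eps = sum(d for d, c in zip(D, correct) if not c)        # eps_t = 0.25
alpha = 0.5 * math.log((1 - eps) / eps)                  # ~0.549

unnorm = [d * math.exp(-alpha if c else alpha) for d, c in zip(D, correct)]
Z = sum(unnorm)                                          # Z_t = 2*sqrt(eps*(1-eps)) ~ 0.866
D_next = [u / Z for u in unnorm]
# Correct examples drop to ~0.167 each; the misclassified one rises to 0.5.
print(round(alpha, 3), [round(x, 3) for x in D_next])
```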
9
A Variant AdaBoost Algorithm (Paul Viola)
If we restrict the weak hypotheses to h_t(x) ∈ {0, 1}, then, similarly to the above, we minimize the normalization factor Z_t.
Denote by ε_fn the weighted error on false negatives, by ε_fp the weighted error on false positives, by W_+ the total weight of the positive examples, and by W_− the total weight of the negative examples; minimizing Z then amounts to choosing, for each weak classifier, the threshold and parity that minimize the combined weighted error of the false positives and false negatives.
The weights are updated similarly to the above (correctly classified examples are down-weighted by β_t = ε_t/(1 − ε_t)), and the decision function is H(x) = 1 if Σ_t α_t h_t(x) ≥ ½ Σ_t α_t, with α_t = log(1/β_t), and 0 otherwise.
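Below is a sketch of this {0, 1}-output variant as it appears in the Viola-Jones detector: β_t = ε_t/(1 − ε_t), correctly classified examples have their weights multiplied by β_t, α_t = log(1/β_t), and the strong classifier fires when the weighted votes reach half of the total vote mass. The `train_weak` helper and its return signature are assumptions for illustration.

```python
import math

def viola_boost(X, y, train_weak, T=50):
    """y in {0, 1}; train_weak(X, y, w) returns (h, eps) with h(x) in {0, 1} and weighted error eps."""
    m = len(X)
    w = [1.0 / m] * m
    stages = []
    for _ in range(T):
        total = sum(w)
        w = [wi / total for wi in w]                     # normalize the weights
        h, eps = train_weak(X, y, w)
        beta = eps / max(1.0 - eps, 1e-12)
        # e_i = 0 if example i is classified correctly, 1 otherwise:  w_i <- w_i * beta^(1 - e_i)
        w = [wi * (beta if h(xi) == yi else 1.0) for wi, xi, yi in zip(w, X, y)]
        stages.append((math.log(1.0 / max(beta, 1e-12)), h))
    half_vote = 0.5 * sum(a for a, _ in stages)
    # Decision function: H(x) = 1  iff  sum_t alpha_t h_t(x) >= (1/2) sum_t alpha_t
    return lambda x: 1 if sum(a * h(x) for a, h in stages) >= half_vote else 0
```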
10
A variant of AdaBoost for aggressive feature selection (Paul Viola)
11
Feature Selection
For each round of boosting:
– Evaluate each rectangle filter on each example.
– Sort the examples by filter value.
– Select the best threshold for each filter (minimum Z).
– Select the best filter/threshold pair (= feature).
– Reweight the examples.
With M filters, T thresholds, N examples and learning time L:
– Naïve wrapper method: O(MT L(MTN)).
– AdaBoost feature selector: O(MN).
(A sketch of the per-filter threshold search follows below.)
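One way to realize the “sort by filter value, then pick the best threshold” step with a single pass per filter, in the spirit of Viola and Jones: sweep the sorted responses while maintaining the cumulative positive and negative weight below the candidate threshold. All identifiers are illustrative.

```python
def best_threshold(values, labels, weights):
    """values: one filter's responses; labels in {0, 1}; weights sum to 1.
    Returns (threshold, parity, weighted_error), where the error of a candidate threshold is
    min(S+ + (T- - S-), S- + (T+ - S+)): S+/S- are the positive/negative weights below the
    threshold and T+/T- the total positive/negative weights."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    T_pos = sum(w for w, l in zip(weights, labels) if l == 1)
    T_neg = sum(w for w, l in zip(weights, labels) if l == 0)
    S_pos = S_neg = 0.0
    best = (float("inf"), None, None)                    # (error, threshold, parity)
    for i in order:
        err_pos_above = S_pos + (T_neg - S_neg)          # parity +1: predict 1 when value >= threshold
        err_neg_above = S_neg + (T_pos - S_pos)          # parity -1: predict 1 when value <  threshold
        if err_pos_above < best[0]:
            best = (err_pos_above, values[i], +1)
        if err_neg_above < best[0]:
            best = (err_neg_above, values[i], -1)
        if labels[i] == 1:
            S_pos += weights[i]
        else:
            S_neg += weights[i]
    return best[1], best[2], best[0]
```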
12
Discussion of AdaBoost Learning
– Efficient: in each round, the entire dependence on previously selected features is compactly encoded in the example weights, so each weak classifier can be evaluated in constant time.
– The error of the strong classifier approaches zero exponentially in the number of rounds.
– AdaBoost achieves large margins rapidly.
– No parameters to tune (except T).
– The weak classifier can be a decision tree, a nearest-neighbor rule, a simple rule of thumb, etc. (Paul Viola restricts each weak hypothesis to a single feature.)
– The weak classifier should not be too strong, and the hypotheses should not be too complex (complex hypotheses generalize poorly).
13
Boosting Cascade
– Similar to a decision tree: a smaller and more efficient boosted classifier can be learned that rejects many negative sub-windows while detecting almost all positives.
– Simpler classifiers reject the majority of negatives; more complex classifiers then achieve low false positive rates.
– Design choices: the number of classifier stages, the number of features in each stage, and the threshold of each stage.
– The false positive rate of a K-stage cascade is the product of the per-stage rates, F = ∏ f_i, and the total detection rate is D = ∏ d_i.
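To make the two product formulas concrete, here is a tiny calculation with illustrative per-stage targets (numbers of the same order as those discussed by Viola and Jones, not taken from this slide): ten stages at 99% detection and 30% false positives each.

```python
# Overall cascade rates are products over the K stages:
#   F = prod(f_i)  (false positive rate),   D = prod(d_i)  (detection rate)
K, f_i, d_i = 10, 0.30, 0.99          # illustrative per-stage targets
F = f_i ** K                          # ~5.9e-06
D = d_i ** K                          # ~0.904
print(f"cascade false positive rate ~ {F:.1e}, detection rate ~ {D:.3f}")
```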
14
Cascading Classifiers for Object Detection
– Given a nested set of classifier hypothesis classes, this amounts to computational risk minimization.
– Each classifier is tuned for a 100% detection rate; cascading then reduces the false positive rate.
[Figure: ROC sketch of % detection vs. % false positives, plus the cascade diagram: each image sub-window passes through Classifier 1, 2, 3 in turn; a T result forwards it to the next stage, an F result rejects it as non-object, and only windows passing every stage are labeled object.]
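A sketch of the control flow in the diagram: each stage either rejects the sub-window immediately (the F branch) or hands it to the next classifier (the T branch), and only windows surviving every stage are reported as objects. The `passes` method on a stage object is an assumed interface, not from the slides.

```python
def classify_window(stages, window):
    """Run one image sub-window through the cascade; reject at the first failing stage."""
    for stage in stages:
        if not stage.passes(window):  # F branch: rejected early and cheaply
            return False              # NON-object
    return True                       # passed every classifier: object
```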
15
Some Results