1 Popular Ensemble Methods: An Empirical Study David Opitz and Richard Maclin Presented by Scott Wespi 5/22/07

2 Outline: Ensemble Methods; Classifier Ensembles; Bagging vs. Boosting; Results; Conclusion

3 Ensemble Methods: sets of individually trained classifiers whose predictions are combined when classifying new data. Two popular examples: Bagging (1996) and Boosting (1996). The question studied: how are bagging and boosting influenced by the learning algorithm, i.e., decision trees vs. neural networks? (Note: the paper is from 1999.)

4 Classifier Ensembles. Goal: highly accurate individual classifiers that disagree with one another as much as possible. Bagging and boosting create this disagreement by training each member on a different resampling of the data.
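The combination step on this slide can be sketched as a plurality vote over the members' predictions. This is a minimal illustration, not the paper's code; the three toy "classifiers" are hypothetical stand-ins chosen so they disagree on some inputs.

```python
from collections import Counter

def ensemble_predict(classifiers, x):
    """Combine member predictions by plurality (majority) vote."""
    votes = [clf(x) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]

# Three toy threshold "classifiers" that disagree near the boundary.
clf_a = lambda x: 1 if x > 0 else 0
clf_b = lambda x: 1 if x > 1 else 0
clf_c = lambda x: 1 if x > -1 else 0

print(ensemble_predict([clf_a, clf_b, clf_c], 0.5))  # two of three vote 1
```

The vote only helps when the members err on different inputs, which is why the slide stresses disagreement.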

5 Bagging vs. Boosting
Training data: 1, 2, 3, 4, 5, 6, 7, 8
Bagging training sets:
Set 1: 2, 7, 8, 3, 7, 6, 3, 1
Set 2: 7, 8, 5, 6, 4, 2, 7, 1
Set 3: 3, 6, 2, 7, 5, 6, 2, 2
Set 4: 4, 5, 1, 4, 6, 4, 3, 8
Boosting training sets:
Set 1: 2, 7, 8, 3, 7, 6, 3, 1
Set 2: 1, 4, 5, 4, 1, 5, 6, 4
Set 3: 7, 1, 5, 8, 1, 8, 1, 4
Set 4: 1, 1, 6, 1, 1, 3, 1, 5
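The contrast on this slide can be sketched as follows: bagging draws each training set uniformly with replacement (a bootstrap sample), while boosting draws examples with probability proportional to a weight that grows for hard examples. A minimal sketch; the weight values below are made up for illustration, mimicking how example 1 dominates Set 4 above.

```python
import random

def bootstrap_sample(data, rng):
    """Bagging: draw len(data) examples uniformly WITH replacement."""
    return [rng.choice(data) for _ in data]

def weighted_sample(data, weights, rng):
    """Boosting-style resampling: draw examples in proportion to their weights."""
    return rng.choices(data, weights=weights, k=len(data))

rng = random.Random(0)
data = [1, 2, 3, 4, 5, 6, 7, 8]
print(bootstrap_sample(data, rng))

# Suppose example 1 has been misclassified repeatedly, so its weight has grown:
weights = [5, 1, 1, 1, 1, 1, 1, 1]
print(weighted_sample(data, weights, rng))  # example 1 tends to appear often
```

Both samplers return sets the same size as the original data, matching the slide's eight-element sets.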

6 Ada-Boosting vs. Arcing
Ada-Boosting: every example starts with weight 1/N; an example's weight increases every time it is skipped or misclassified.
Arcing (arc-x4): if m_i is the number of times the ith example has been misclassified so far, the probability of selecting example i for the next training set is p_i = (1 + m_i^4) / sum_j (1 + m_j^4).
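The arc-x4 selection rule above can be sketched in a few lines (the function name is ours, not the paper's):

```python
def arc_x4_probs(miscounts):
    """Arc-x4 resampling probabilities: p_i = (1 + m_i**4) / sum_j (1 + m_j**4),
    where miscounts[i] is how often example i has been misclassified so far."""
    weights = [1 + m ** 4 for m in miscounts]
    total = sum(weights)
    return [w / total for w in weights]

# Example 0 misclassified twice, the rest never:
# weights = [17, 1, 1, 1], total = 20, so p = [0.85, 0.05, 0.05, 0.05]
print(arc_x4_probs([2, 0, 0, 0]))
```

The fourth power makes repeatedly misclassified examples dominate the next sample quickly, which is one reason arcing and Ada-boosting concentrate on hard (and noisy) examples.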

7 Test-set error rates (%). Left five columns are neural networks (stan = single network, simple = simple network ensemble, bag = bagging, arc = arcing, ada = Ada-boosting); right four are decision trees (stan = single tree, bag, arc, ada). A few cells in the hypo row could not be recovered from the slide (marked ?).

Dataset         | NN stan | simple | bag  | arc  | ada  | DT stan | bag  | arc  | ada
breast-cancer-w |     3.4 |    3.5 |  3.4 |  3.8 |  4.0 |     5.0 |  3.7 |  3.5 |  3.5
credit-a        |    14.8 |   13.7 | 13.8 | 15.8 | 15.7 |    14.9 | 13.4 | 14.0 | 13.7
credit-g        |    27.9 |   24.7 | 24.2 | 25.2 | 25.3 |    29.6 | 25.2 | 25.9 | 26.7
diabetes        |    23.9 |   23.0 | 22.8 | 24.4 | 23.3 |    27.8 | 24.4 | 26.0 | 25.7
glass           |    38.6 |   35.2 | 33.1 | 32.0 | 31.1 |    31.3 | 25.8 | 25.5 | 23.3
heart-cleveland |    18.6 |   17.4 | 17.0 | 20.7 | 21.1 |    24.3 | 19.5 | 21.5 | 20.8
hepatitis       |    20.1 |   19.5 | 17.8 | 19.0 | 19.7 |    21.2 | 17.3 | 16.9 | 17.2
house-votes-84  |     4.9 |    4.8 |  4.1 |  5.1 |  5.3 |     3.6 |  3.6 |  5.0 |  4.8
hypo            |     6.4 |    6.2 |  6.2 |    ? |    ? |     0.5 |    ? |    ? |  0.4
ionosphere      |     9.7 |    7.5 |  9.2 |  7.6 |  8.3 |     8.1 |  6.4 |  6.0 |  6.1
iris            |     4.3 |    3.9 |  4.0 |  3.7 |  3.9 |     5.2 |  4.9 |  5.1 |  5.6
kr-vs-kp        |     2.3 |    0.8 |  0.8 |  0.4 |  0.3 |     0.6 |  0.6 |  0.3 |  0.4
labor           |     6.1 |    3.2 |  4.2 |  3.2 |  3.2 |    16.5 | 13.7 | 13.0 | 11.6
letter          |    18.0 |   12.8 | 10.5 |  5.7 |  4.6 |    14.0 |  7.0 |  4.1 |  3.9
promoters-936   |     5.3 |    4.8 |  4.0 |  4.5 |  4.6 |    12.8 | 10.6 |  6.8 |  6.4
ribosome-bind   |     9.3 |    8.5 |  8.4 |  8.1 |  8.2 |    11.2 | 10.2 |  9.3 |  9.6
satellite       |    13.0 |   10.9 | 10.6 |  9.9 | 10.0 |    13.8 |  9.9 |  8.6 |  8.4
segmentation    |     6.6 |    5.3 |  5.4 |  3.5 |  3.3 |     3.7 |  3.0 |  1.7 |  1.5
sick            |     5.9 |    5.7 |  5.7 |  4.7 |  4.5 |     1.3 |  1.2 |  1.1 |  1.0
sonar           |    16.6 |   15.9 | 16.8 | 12.9 | 13.0 |    29.7 | 25.3 | 21.5 | 21.7
soybean         |     9.2 |    6.7 |  6.9 |  6.7 |  6.3 |     8.0 |  7.9 |  7.2 |  6.7
splice          |     4.7 |    4.0 |  3.9 |  4.0 |  4.2 |     5.9 |  5.4 |  5.1 |  5.3
vehicle         |    24.9 |   21.2 | 20.7 | 19.1 | 19.7 |    29.4 | 27.1 | 22.5 | 22.9

8 Neural Networks: Ada-Boosting, Arcing, Bagging (the white bar represents one standard deviation)

9 Decision Trees

10 Composite Error Rates

11 Neural Networks: Bagging vs Simple

12 Ada-Boost: Neural Networks vs. Decision Trees (the box represents the reduction in error)

13 Arcing

14 Bagging

15 Noise: noise hurts boosting the most

16 Conclusions
Performance depends on the data set and the base classifier.
In some cases, an ensemble can overcome the bias of its component learning algorithm.
Bagging is more consistent than boosting.
Boosting can give much better results than bagging on some data sets.

