Evidence Contrary to the Statistical View of Boosting
David Mease & Abraham Wyner

What is the Statistical View? The idea presented in:
– J. Friedman, T. Hastie, and R. Tibshirani. Additive logistic regression: A statistical view of boosting. Annals of Statistics, 28:337–374, 2000.
– Under this view, AdaBoost fits an additive logistic regression model by stagewise optimization of the exponential loss, making it similar in spirit to LogitBoost.
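To make this interpretation concrete, here is a minimal sketch (not the authors' code) of discrete AdaBoost with decision stumps. Under the statistical view, each round adds one term alpha_m * h_m(x) to an additive model F(x), taking a forward stagewise step on the exponential loss E[exp(-y F(x))]; the weight update below is exactly that reweighting. Labels are assumed to be coded as -1/+1.

```python
# Minimal sketch of discrete AdaBoost with decision stumps (illustrative only).
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, n_rounds=100):
    """Fit an additive model F(x) = sum_m alpha_m * h_m(x); y must be in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                  # observation weights
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = np.clip(np.sum(w * (pred != y)), 1e-10, 1 - 1e-10)  # weighted error
        if err >= 0.5:                       # weak-learning condition violated
            break
        alpha = 0.5 * np.log((1 - err) / err)   # stagewise coefficient
        w *= np.exp(-alpha * y * pred)       # exponential-loss reweighting
        w /= w.sum()
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(X, stumps, alphas):
    F = sum(a * s.predict(X) for s, a in zip(stumps, alphas))  # additive model F(x)
    return np.sign(F)
```

Since only the sign of F(x) is used for classification, the factor of 1/2 in alpha (the coefficient the exponential-loss derivation produces) yields the same classifier as the usual AdaBoost.M1 weighting.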

The Challenge
The challenge centers on areas where this view fails to explain important characteristics of AdaBoost:
– The statistical view of boosting focuses on only one aspect of the algorithm: the optimization.
– The paper does not explain why the statistical view fails; it merely presents empirical evidence to the contrary.

Areas of Deficiency for the Statistical View
– The stagewise nature of the algorithm.
– The empirical variance reduction that can be observed on hold-out samples; under the statistical view, this variance reduction seems to happen only incidentally.
– AdaBoost's strong resistance to overfitting, which is lost in the regression formulation.
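The last two items can be observed directly. Below is a sketch, using scikit-learn's AdaBoostClassifier and its staged_predict method, that tracks hold-out misclassification error round by round; the dataset (make_hastie_10_2), sample size, and round count are illustrative choices, not taken from the paper. In runs like this the test error often keeps falling, or stays flat, long after the training error reaches zero.

```python
# Sketch: monitor hold-out misclassification error as boosting rounds accumulate.
import numpy as np
from sklearn.datasets import make_hastie_10_2
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_hastie_10_2(n_samples=4000, random_state=0)   # labels in {-1, +1}
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The default base learner is a depth-1 decision stump.
clf = AdaBoostClassifier(n_estimators=400, random_state=0).fit(X_train, y_train)

# staged_predict yields the ensemble's predictions after each boosting round.
test_err = [np.mean(p != y_test) for p in clf.staged_predict(X_test)]
train_err = [np.mean(p != y_train) for p in clf.staged_predict(X_train)]
print(f"round  10: train {train_err[9]:.3f}, test {test_err[9]:.3f}")
print(f"round 400: train {train_err[-1]:.3f}, test {test_err[-1]:.3f}")
```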

Practical Advice
– AdaBoost is one of the most successful boosting algorithms.
– Do not assume that newer, regularized, and modified versions of boosting are necessarily better; try standard AdaBoost alongside these newer algorithms.
– If classification is the goal, monitor the misclassification error on hold-out (or cross-validation) samples; see the sketch after this list.
– Much of the evidence presented is counter-intuitive, so keep an open mind when experimenting with AdaBoost. If stumps are causing overfitting, be willing to try larger trees: intuition may suggest that larger trees will overfit, but the paper's experiments show that is not necessarily true.
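A sketch implementing this advice: run standard AdaBoost with both stumps and larger trees as base learners and compare hold-out misclassification error directly. The `estimator` keyword assumes scikit-learn >= 1.2 (older releases call it `base_estimator`); the depths, sample size, and round count are illustrative.

```python
# Sketch: stumps vs. larger trees as AdaBoost base learners, judged on hold-out error.
import numpy as np
from sklearn.datasets import make_hastie_10_2
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_hastie_10_2(n_samples=4000, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

for depth in (1, 8):   # decision stumps vs. larger trees
    clf = AdaBoostClassifier(
        estimator=DecisionTreeClassifier(max_depth=depth),  # sklearn >= 1.2
        n_estimators=200,
        random_state=1,
    ).fit(X_train, y_train)
    err = np.mean(clf.predict(X_test) != y_test)
    print(f"max_depth={depth}: hold-out misclassification error {err:.3f}")
```

Whichever configuration wins on the hold-out sample is the one to keep; the point of the advice is to let that measurement, not intuition about tree size, decide.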