Regression Using Boosting Vishakh Advanced Machine Learning Fall 2006

Introduction
● Classification with boosting
  – Well-studied
  – Theoretical bounds and guarantees
  – Empirically tested
● Regression with boosting
  – Rarely used
  – Some bounds and guarantees
  – Very little empirical testing

Project Description
● Study existing algorithms & formalisms
  – AdaBoost.R (Freund & Schapire, 1997)
  – SquareLev.R (Duffy & Helmbold, 2002)
  – SquareLev.C (Duffy & Helmbold, 2002)
  – ExpLev (Duffy & Helmbold, 2002)
● Verify their effectiveness by testing on an interesting dataset
  – Football Manager 2006

A Few Notes
● Want PAC-like guarantees.
● Can't directly transfer the classification machinery:
  – Simply re-weighting the distribution over iterations doesn't work.
  – We can instead modify the sample labels and still remain consistent with the original function class.
● Each algorithm performs gradient descent on a potential function (see the sketch after this slide).
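To make the "gradient descent on a potential" view concrete, the schematic below shows the step shared by the leveraging algorithms on the following slides. It is only a sketch: the callables potential_grad and fit_base_learner are placeholders of my own, not part of any published algorithm, and the arrays are assumed to be NumPy-style vectors.

def squared_error_grad(F, y):
    # Gradient of 0.5 * sum((y - F)^2) with respect to F is -(y - F),
    # so the negative gradient is simply the residual y - F.
    return -(y - F)

def leveraging_step(F, y, X, potential_grad, fit_base_learner):
    # Relabel every example with the negative gradient of the potential,
    # evaluated at the current ensemble prediction F, then fit the base
    # learner to these pseudo-labels so its output approximates a
    # descent direction for the potential.
    pseudo_labels = -potential_grad(F, y)
    return fit_base_learner(X, pseudo_labels)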

SquareLev.R
● Squared-error regression.
● Uses a regression algorithm as the base learner.
● Modifies the labels, not the distribution.
● The potential function is the variance of the residuals.
● Each new label is proportional to the negative gradient of the potential function.
● Each iteration, the mean squared error decreases by a multiplicative factor.
● Can get arbitrarily small squared error as long as the correlation between the residuals and the base learner's predictions stays above a threshold.
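A minimal sketch of this loop, assuming scikit-learn regression trees as the base learner. The line-search step and stopping condition below are simplifications for illustration, not a faithful reproduction of Duffy & Helmbold's algorithm.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

def squarelev_r(X, y, n_rounds=50, max_depth=3):
    F = np.zeros(len(y))                # current ensemble prediction
    ensemble = []                       # list of (coefficient, base regressor)
    for _ in range(n_rounds):
        residuals = y - F
        labels = residuals - residuals.mean()        # centered residuals as new labels
        h = DecisionTreeRegressor(max_depth=max_depth).fit(X, labels)
        pred = h.predict(X)
        denom = np.dot(pred, pred)
        if denom == 0:                               # base learner returned all zeros
            break
        alpha = np.dot(residuals, pred) / denom      # line-search step on the squared error
        ensemble.append((alpha, h))
        F = F + alpha * pred
    return ensemble

def predict(ensemble, X):
    return sum(alpha * h.predict(X) for alpha, h in ensemble)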

SquareLev.C
● Squared-error regression.
● Uses a base classifier.
● Modifies both the labels and the distribution.
● The potential function is based on the residuals.
● Each new label is the sign of the instance's residual.
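A rough sketch of the same idea with a classifier as the base learner: the new labels are residual signs and the distribution is the normalized residual magnitudes. The step size below is a placeholder of my own, not the coefficient from the paper.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def squarelev_c(X, y, n_rounds=50, max_depth=2):
    F = np.zeros(len(y))
    ensemble = []
    for _ in range(n_rounds):
        residuals = y - F
        if np.allclose(residuals, 0):
            break
        signs = np.where(residuals >= 0, 1.0, -1.0)   # new labels: sign of each residual
        weights = np.abs(residuals)
        weights = weights / weights.sum()             # distribution favours large residuals
        h = DecisionTreeClassifier(max_depth=max_depth)
        h.fit(X, signs, sample_weight=weights)
        pred = h.predict(X)                           # predictions in {-1, +1}
        alpha = np.dot(residuals, pred) / len(y)      # placeholder step toward the residuals
        ensemble.append((alpha, h))
        F = F + alpha * pred
    return ensemble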

ExpLev
● Attempts to get small residuals at every point.
● Uses an exponential potential.
● Where AdaBoost pushes all instances to a positive margin, ExpLev pushes all instances to have small residuals.
● Uses a base regressor (outputs in [-1,+1]) or a base classifier (outputs in {-1,+1}).
● The two-sided potential uses exponentials of the residuals.
● The base learner must perform well on the relabeled instances.
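The two-sided potential can be illustrated with the small helper below. The scaling constant s and the way these weights would feed the base learner are assumptions here, following Duffy & Helmbold only loosely.

import numpy as np

def explev_potential(y, F, s=1.0):
    # Residuals of the current ensemble prediction F.
    r = y - F
    # One exponential term per side: w_plus is large when the prediction is
    # far below the label, w_minus when it is far above.  The base learner
    # is then asked to predict +1 where w_plus dominates and -1 where
    # w_minus dominates, shrinking whichever side of the potential is large.
    w_plus = np.exp(s * r)
    w_minus = np.exp(-s * r)
    potential = np.sum(w_plus + w_minus)
    return w_plus, w_minus, potential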

Naive Approach
● Directly translate AdaBoost to the regression setting.
● Reweight using a threshold on the squared error.
● Used as a baseline against which to test the other approaches.
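A sketch of this baseline: an example counts as "misclassified" when its squared error exceeds a threshold tau, and the usual AdaBoost reweighting is applied. The value of tau and the weighted-mean combination are choices made for this sketch, not taken from any paper.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

def naive_adaboost_regression(X, y, n_rounds=20, tau=0.05, max_depth=3):
    n = len(y)
    D = np.full(n, 1.0 / n)                          # distribution over examples
    learners, alphas = [], []
    for _ in range(n_rounds):
        h = DecisionTreeRegressor(max_depth=max_depth)
        h.fit(X, y, sample_weight=D)
        mistakes = (h.predict(X) - y) ** 2 > tau     # "error" = squared error above tau
        eps = np.dot(D, mistakes)
        if eps <= 0 or eps >= 0.5:                   # same stopping rule as AdaBoost
            break
        alpha = 0.5 * np.log((1 - eps) / eps)
        learners.append(h)
        alphas.append(alpha)
        D = D * np.exp(alpha * np.where(mistakes, 1.0, -1.0))   # up-weight large errors
        D = D / D.sum()
    return learners, alphas

def predict_weighted_mean(learners, alphas, X):
    preds = np.array([h.predict(X) for h in learners])
    w = np.array(alphas)[:, None]
    return (w * preds).sum(axis=0) / w.sum()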

Dataset
● Data from Football Manager 2006
  – Very popular game
  – Statistically driven
● Features are player attributes.
● Labels are average performance ratings over a season.
● Predict performance levels and use the learned model to guide game strategy.

Work so far
● Conducted a literature survey.
● Studied the methods and their formal guarantees and bounds.
● Implementation is still underway.

Conclusions
● Interesting approaches to, and analyses of, boosting for regression are available.
● There is insufficient real-world verification.
● Further work:
  – Regression on noisy data
  – Formal results under more relaxed assumptions