Published by Anthony Robinson. Modified over 9 years ago.
1
Regression Using Boosting Vishakh (vv2131@columbia.edu) Advanced Machine Learning Fall 2006
2
Introduction ● Classification with boosting – Well-studied – Theoretical bounds and guarantees – Empirically tested ● Regression with boosting – Rarely used – Some bounds and guarantees – Very little empirical testing
3
Project Description ● Study existing algorithms & formalisms – AdaBoost.R (Freund & Schapire, 1997) – SquareLev.R (Duffy & Helmbold, 2002) – SquareLev.C (Duffy & Helmbold, 2002) – ExpLev (Duffy & Helmbold, 2002) ● Verify effectiveness by testing on an interesting dataset. – Football Manager 2006
4
A Few Notes ● Want PAC-like guarantees ● Can't directly transfer processes from classification – Simply re-weighting distribution over iterations doesn't work. – Can modify samples and still remain consistent with original function class. ● Performing gradient descent on a potential function.
5
SquareLev.R ● Squared error regression. ● Uses regression algorithm for base learner. ● Modifies labels, not distribution. ● Potential function uses variance of residuals. ● New label proportional to negative gradient of potential function. ● Each iteration, mean squared error decreases by a multiplicative factor. ● Can get arbitrarily small squared error as long as correlation between residuals and predictions > threshold.
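The relabeling loop described on this slide can be sketched roughly as follows. This is a minimal illustration, not the published algorithm verbatim: the stump base learner `fit_stump`, the function name `squarelev_r_sketch`, and the least-squares choice of step size `alpha` are all assumptions made for the example. The potential is taken to be the variance of the residuals, and the new labels are the centered residuals, which point along the negative gradient of that potential.

```python
import numpy as np

def fit_stump(X, y):
    """Hypothetical base regressor: least-squares stump on a single feature."""
    best = None
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j])[:-1]:   # keeps both sides non-empty
            left = X[:, j] <= thr
            lo, hi = y[left].mean(), y[~left].mean()
            err = np.sum((y - np.where(left, lo, hi)) ** 2)
            if best is None or err < best[0]:
                best = (err, j, thr, lo, hi)
    if best is None:                          # every feature is constant
        c = y.mean()
        return lambda Z: np.full(len(Z), c)
    _, j, thr, lo, hi = best
    return lambda Z: np.where(Z[:, j] <= thr, lo, hi)

def squarelev_r_sketch(X, y, rounds=20):
    """Boost a base regressor by modifying labels, not the distribution."""
    F = np.zeros(len(y))                      # master predictions on the sample
    ensemble, variances = [], []
    for _ in range(rounds):
        r = y - F                             # current residuals
        rc = r - r.mean()                     # new labels: centered residuals
        variances.append(rc @ rc / len(y))    # potential before this round
        f = fit_stump(X, rc)
        p = f(X)
        pc = p - p.mean()
        denom = pc @ pc
        if denom == 0:                        # base learner returned a constant
            break
        alpha = (rc @ pc) / denom             # assumed: least-squares step size
        ensemble.append((alpha, f))
        F = F + alpha * p
    bias = (y - F).mean()                     # remove the remaining mean residual

    def predict(Z):
        out = np.full(len(Z), bias)
        for a, f in ensemble:
            out = out + a * f(Z)
        return out

    return predict, variances
```

With this step size, each round multiplies the residual variance by one minus the squared correlation between the centered residuals and the centered base predictions, so the recorded `variances` never increase.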
7
SquareLev.C ● Squared error regression. ● Uses a base classifier. ● Modifies labels and distribution. ● Potential function uses residuals. ● New label is the sign of the instance's residual.
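A rough sketch of the classifier-based variant is given below. It follows only what the slide states (labels are residual signs, and the distribution is modified as well); weighting each instance by the magnitude of its centered residual, the stump classifier `fit_classifier_stump`, and the least-squares step size are assumptions made for illustration rather than details confirmed by the slides.

```python
import numpy as np

def fit_classifier_stump(X, labels, weights):
    """Hypothetical base classifier: threshold stump with outputs in {-1, +1},
    chosen to minimise the weighted classification error."""
    best = (np.inf, 0, 0.0, 1.0)
    total = weights.sum()
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            pred = np.where(X[:, j] <= thr, -1.0, 1.0)
            err = weights[pred != labels].sum()
            # Try the stump and its flipped version.
            for sign, e in ((1.0, err), (-1.0, total - err)):
                if e < best[0]:
                    best = (e, j, thr, sign)
    _, j, thr, sign = best
    return lambda Z: sign * np.where(Z[:, j] <= thr, -1.0, 1.0)

def squarelev_c_sketch(X, y, rounds=20):
    """Boost a base classifier for squared-error regression."""
    F = np.zeros(len(y))
    ensemble, variances = [], []
    for _ in range(rounds):
        r = y - F
        rc = r - r.mean()                         # centered residuals
        variances.append(rc @ rc / len(y))
        total = np.abs(rc).sum()
        if total == 0:                            # sample already fit exactly
            break
        labels = np.where(rc >= 0, 1.0, -1.0)     # new label: sign of residual
        weights = np.abs(rc) / total              # assumed: weight by |residual|
        h = fit_classifier_stump(X, labels, weights)
        p = h(X)
        pc = p - p.mean()
        denom = pc @ pc
        if denom == 0:                            # classifier is constant
            break
        alpha = (rc @ pc) / denom                 # assumed: least-squares step
        ensemble.append((alpha, h))
        F = F + alpha * p
    bias = (y - F).mean()

    def predict(Z):
        out = np.full(len(Z), bias)
        for a, h in ensemble:
            out = out + a * h(Z)
        return out

    return predict, variances
```

Under this weighting, minimising the weighted classification error is the same as maximising the inner product between the residuals and the classifier's outputs, which is why a classifier can stand in for a regressor here.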
8
ExpLev ● Attempts to get small residuals at each point. ● Uses exponential potential. ● AdaBoost pushes all instances to positive margin. ● ExpLev pushes all instances to have small residuals. ● Uses base regressor ([-1,+1]) or classifier ({-1,+1}). ● Two-sided potential uses exponents of residuals. ● Base learner must perform well with relabeled instances.
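To make the two-sided potential concrete, the sketch below evaluates one plausible form of it and the relabeling direction it induces. The exact form and the scale parameter `s` are assumptions for illustration; the slides only state that the potential uses exponents of the residuals on both sides.

```python
import numpy as np

def two_sided_potential(residuals, s=1.0):
    """Assumed form: sum_i exp(s*r_i) + exp(-s*r_i) - 2, zero when all r_i = 0."""
    r = np.asarray(residuals, dtype=float)
    return np.sum(np.exp(s * r) + np.exp(-s * r) - 2.0)

def relabel(residuals, s=1.0):
    """Negative gradient of the potential with respect to the master
    predictions F, where r = y - F. Large residuals of either sign get
    exponentially large labels pointing back towards zero residual."""
    r = np.asarray(residuals, dtype=float)
    return s * (np.exp(s * r) - np.exp(-s * r))

# Usage: an over-predicted point (r < 0) gets a negative label,
# an under-predicted point (r > 0) gets a positive one.
y = np.array([1.0, -0.5, 0.2])
F = np.array([0.3, 0.1, 0.2])
labels = relabel(y - F, s=0.7)
```

Because the potential penalises positive and negative residuals symmetrically, driving it down forces every residual towards zero, in contrast to AdaBoost's one-sided potential, which only pushes margins in the positive direction.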
10
Naive Approach ● Directly translate AdaBoost to the regression setting. ● Use thresholding of squared error to reweight. ● Use as a baseline for judging the other approaches.
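One way the naive baseline might look is sketched below. The slides do not give details, so everything beyond "threshold the squared error and reweight as in AdaBoost" is an assumption: the threshold `tau`, the weighted stump `fit_weighted_stump`, and the choice to combine base regressors by an alpha-weighted average are hypothetical.

```python
import numpy as np

def fit_weighted_stump(X, y, w):
    """Hypothetical base regressor: weighted least-squares stump."""
    best = None
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j])[:-1]:   # keeps both sides non-empty
            left = X[:, j] <= thr
            lo = np.average(y[left], weights=w[left])
            hi = np.average(y[~left], weights=w[~left])
            err = np.sum(w * (y - np.where(left, lo, hi)) ** 2)
            if best is None or err < best[0]:
                best = (err, j, thr, lo, hi)
    if best is None:                          # every feature is constant
        c = np.average(y, weights=w)
        return lambda Z: np.full(len(Z), c)
    _, j, thr, lo, hi = best
    return lambda Z: np.where(Z[:, j] <= thr, lo, hi)

def naive_boosted_regression(X, y, tau, rounds=20):
    """AdaBoost-style reweighting where an instance counts as 'wrong'
    when its squared error exceeds the (assumed) threshold tau."""
    n = len(y)
    w = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(rounds):
        f = fit_weighted_stump(X, y, w)
        miss = (y - f(X)) ** 2 > tau          # thresholded squared error
        err = np.clip(w[miss].sum(), 1e-10, 1 - 1e-10)
        if err >= 0.5:                        # no better than chance: stop
            break
        alpha = 0.5 * np.log((1 - err) / err)
        ensemble.append((alpha, f))
        w = w * np.exp(np.where(miss, alpha, -alpha))
        w = w / w.sum()
    if not ensemble:                          # fall back to the sample mean
        const = y.mean()
        return lambda Z: np.full(len(Z), const)
    alphas = np.array([a for a, _ in ensemble])

    def predict(Z):
        preds = np.stack([f(Z) for _, f in ensemble])
        return alphas @ preds / alphas.sum()  # assumed: alpha-weighted average

    return predict
```

The result depends heavily on `tau`: set it too low and almost every instance counts as an error, so boosting stops immediately; set it too high and the weights barely change.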
11
Dataset ● Data from Football Manager 2006 – Very popular game – Statistically driven ● Features are player attributes. ● Labels are average performance ratings over a season. ● Predict performance levels and use learned model to guide game strategy.
13
Work so far ● Conducted survey ● Studied methods and formal guarantees and bounds. ● Implementation still underway.
14
Conclusions ● Interesting approaches and analyses of boosting regression available. ● Insufficient real-world verification. ● Further work – Regressing noisy data – Formal results for more relaxed assumptions