CS 2750: Machine Learning Line Fitting + Bias-Variance Trade-off
Prof. Adriana Kovashka University of Pittsburgh January 26, 2017
Generalization
How well does a model learned on the training set (labels known) generalize to a new test set (labels unknown)?
Slide credit: L. Lazebnik
Generalization
Components of expected loss:
Noise in our observations: unavoidable
Bias: how much the average model over all training sets differs from the true model; error due to inaccurate assumptions/simplifications made by the model
Variance: how much models estimated from different training sets differ from each other
Underfitting: model is too “simple” to represent all the relevant class characteristics; high bias and low variance; high training error and high test error
Overfitting: model is too “complex” and fits irrelevant characteristics (noise) in the data; low bias and high variance; low training error and high test error
Adapted from L. Lazebnik
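This decomposition can be checked empirically. A minimal sketch (illustrative data and polynomial orders, not the slides' exact setup): fit polynomials on many independently drawn training sets of noisy sin(2πx) samples, then measure bias² (gap between the average fit and the true curve) and variance (spread of individual fits around their average) on a test grid.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_f(x):
    return np.sin(2 * np.pi * x)

# Fit a polynomial of the given order on many independently drawn
# training sets, then measure over a fixed test grid:
#   bias^2   = mean squared gap between the *average* fit and true_f
#   variance = how much individual fits spread around that average
def bias_variance(order, n_train=25, n_sets=300, noise=0.3):
    x_test = np.linspace(0.05, 0.95, 50)  # stay away from the edges
    preds = np.empty((n_sets, x_test.size))
    for s in range(n_sets):
        x = rng.uniform(0, 1, n_train)
        t = true_f(x) + rng.normal(0, noise, n_train)
        preds[s] = np.polyval(np.polyfit(x, t, order), x_test)
    avg = preds.mean(axis=0)
    bias_sq = float(np.mean((avg - true_f(x_test)) ** 2))
    variance = float(np.mean(preds.var(axis=0)))
    return bias_sq, variance

b0, v0 = bias_variance(order=0)  # too "simple": high bias, low variance
b9, v9 = bias_variance(order=9)  # too "complex": low bias, high variance
```

The constant (order-0) model underfits, so its bias dominates; the order-9 model tracks each training set's noise, so its variance dominates.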
Bias-Variance Trade-off
Models with too few parameters are inaccurate because of a large bias (not enough flexibility). Models with too many parameters are inaccurate because of a large variance (too much sensitivity to the sample).
Purple dots = possible test points
Red dots = training data (all that we see before we ship off our model!)
Green curve = true underlying model
Blue curve = our predicted model/fit
Adapted from D. Hoiem
Polynomial Curve Fitting
Slide credit: Chris Bishop
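The running example on these slides (from Bishop) fits an order-M polynomial to N noisy samples of sin(2πx). A minimal sketch with `np.polyfit`; the values of N, M, and the noise level are illustrative, not the slides' exact data:

```python
import numpy as np

rng = np.random.default_rng(0)

# N noisy observations of sin(2*pi*x), as in Bishop's example
N, M = 10, 3
x = np.linspace(0, 1, N)
t = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, N)

# np.polyfit returns the coefficient vector w that minimizes the
# sum-of-squares error between the polynomial's outputs and t
w = np.polyfit(x, t, M)
y = np.polyval(w, x)  # fitted values at the training inputs
```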
Sum-of-Squares Error Function
Slide credit: Chris Bishop
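The error function on this slide is E(w) = ½ Σₙ {y(xₙ, w) − tₙ}², where y(x, w) is the polynomial with coefficient vector w. A direct implementation (the tiny check values are illustrative):

```python
import numpy as np

# Sum-of-squares error from the slide:
#   E(w) = 1/2 * sum_n ( y(x_n, w) - t_n )^2
def sum_of_squares_error(w, x, t):
    residuals = np.polyval(w, x) - t
    return 0.5 * np.sum(residuals ** 2)

# Tiny check: the line y = x against targets [0, 0] at x = [0, 1]
# has residuals [0, 1], so E(w) = 0.5.
e = sum_of_squares_error([1.0, 0.0], np.array([0.0, 1.0]), np.array([0.0, 0.0]))
```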
0th Order Polynomial Slide credit: Chris Bishop
1st Order Polynomial Slide credit: Chris Bishop
3rd Order Polynomial Slide credit: Chris Bishop
9th Order Polynomial Slide credit: Chris Bishop
Over-fitting
Root-Mean-Square (RMS) error: E_RMS = √(2 E(w*) / N)
Slide credit: Chris Bishop
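Bishop's RMS error divides by N so data sets of different sizes are comparable, and takes a square root so the error is on the same scale as the targets t. A sketch (the check values are illustrative):

```python
import numpy as np

# RMS error: E_RMS = sqrt(2 * E(w) / N), where E(w) is the
# sum-of-squares error. Dividing by N makes training and test
# sets of different sizes comparable.
def rms_error(w, x, t):
    residuals = np.polyval(w, x) - t
    e = 0.5 * np.sum(residuals ** 2)  # sum-of-squares error E(w)
    return np.sqrt(2.0 * e / len(x))

# Tiny check: line y = x, targets [0, 0] at x = [0, 1]
# gives E(w) = 0.5 and E_RMS = sqrt(2 * 0.5 / 2) = sqrt(0.5).
r = rms_error([1.0, 0.0], np.array([0.0, 1.0]), np.array([0.0, 0.0]))
```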
Data Set Size: 9th Order Polynomial Slide credit: Chris Bishop
Data Set Size: 9th Order Polynomial Slide credit: Chris Bishop
Regularization
Penalize large coefficient values by adding a penalty term to the error:
Ẽ(w) = ½ Σₙ {y(xₙ, w) − tₙ}² + (λ/2) ‖w‖²
(Remember: we want to minimize this expression.)
Adapted from Chris Bishop
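This regularized sum-of-squares error (ridge regression) has the closed-form minimizer w = (ΦᵀΦ + λI)⁻¹Φᵀt, where Φ is the polynomial design matrix. A minimal sketch; the data and λ values are illustrative:

```python
import numpy as np

# Regularized least squares (ridge) for an order-M polynomial:
#   E~(w) = 1/2 * sum_n (y(x_n, w) - t_n)^2 + (lam / 2) * ||w||^2
# Closed-form minimizer: w = (Phi^T Phi + lam * I)^{-1} Phi^T t
def ridge_polyfit(x, t, M, lam):
    Phi = np.vander(x, M + 1, increasing=True)  # columns 1, x, ..., x^M
    A = Phi.T @ Phi + lam * np.eye(M + 1)
    return np.linalg.solve(A, Phi.T @ t)

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
t = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, 10)

w_unreg = ridge_polyfit(x, t, M=9, lam=1e-10)      # effectively no penalty:
                                                   # fits the noise, huge coefficients
w_reg = ridge_polyfit(x, t, M=9, lam=np.exp(-3))   # penalty shrinks the coefficients
```

Increasing λ trades a little extra bias for a large reduction in variance, which is exactly the knob the following slides turn.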
Regularization: Slide credit: Chris Bishop
Regularization: Slide credit: Chris Bishop
Polynomial Coefficients
Slide credit: Chris Bishop
Polynomial Coefficients
[Table: coefficient values with no regularization vs. huge regularization; heavy regularization drives the large coefficients toward zero.]
Adapted from Chris Bishop
Regularization: vs. Slide credit: Chris Bishop
Training vs. test error
[Figure: error vs. model complexity. Training error decreases with complexity; test error first falls, then rises. Low complexity: underfitting (high bias, low variance). High complexity: overfitting (low bias, high variance).]
Slide credit: D. Hoiem
The effect of training set size
[Figure: test error vs. model complexity, for few vs. many training examples; high bias / low variance at low complexity, low bias / high variance at high complexity.]
Slide credit: D. Hoiem
The effect of training set size
Fixed prediction model
[Figure: error vs. number of training examples; testing error decreases and training error increases as the training set grows, with the gap between the two curves labeled the generalization error.]
Adapted from D. Hoiem
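These learning curves can be reproduced numerically. A sketch for a fixed model (a cubic polynomial; all data and sizes are illustrative): average training and test RMS error over many training sets of each size.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_sin(n, noise=0.3):
    x = rng.uniform(0, 1, n)
    return x, np.sin(2 * np.pi * x) + rng.normal(0, noise, n)

x_test, t_test = noisy_sin(500)  # held-out test set

# For a fixed cubic model, average training and test RMS error
# over many training sets of a given size n.
def learning_point(n, reps=100, M=3):
    tr, te = [], []
    for _ in range(reps):
        x, t = noisy_sin(n)
        w = np.polyfit(x, t, M)
        tr.append(np.sqrt(np.mean((np.polyval(w, x) - t) ** 2)))
        te.append(np.sqrt(np.mean((np.polyval(w, x_test) - t_test) ** 2)))
    return float(np.mean(tr)), float(np.mean(te))

tr_small, te_small = learning_point(n=8)
tr_large, te_large = learning_point(n=200)
```

With more training data, test error falls toward the model's asymptotic error while training error rises toward it, as in the figure.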
Choosing the trade-off between bias and variance
Need validation set (separate from the test set)
[Figure: error vs. model complexity; validation error first falls, then rises, while training error keeps decreasing; high bias / low variance at low complexity, low bias / high variance at high complexity.]
Slide credit: D. Hoiem
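A minimal sketch of model selection with a validation set; the data are illustrative, and here the "complexity" knob is the polynomial order:

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_sin(n, noise=0.25):
    x = rng.uniform(0, 1, n)
    return x, np.sin(2 * np.pi * x) + rng.normal(0, noise, n)

x_train, t_train = noisy_sin(30)
x_val, t_val = noisy_sin(30)  # validation set, kept separate from the test set

# Fit each candidate order on the training set, score it on the
# validation set, and keep the order with the lowest validation error.
def val_rms(M):
    w = np.polyfit(x_train, t_train, M)
    return np.sqrt(np.mean((np.polyval(w, x_val) - t_val) ** 2))

orders = list(range(10))
best_M = min(orders, key=val_rms)
```

The untouched test set is then used once, at the end, to report the chosen model's error.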
Bias-variance (Bishop Sec. 3.2)
Figure from Chris Bishop
How to reduce variance?
Get more training data
Regularize the parameters
Choose a simpler classifier
Slide credit: D. Hoiem
Remember…
Three kinds of error:
Inherent: unavoidable
Bias: due to over-simplifications
Variance: due to inability to perfectly estimate parameters from limited data
Try simple classifiers first; use increasingly powerful classifiers with more training data (bias-variance trade-off)
Adapted from D. Hoiem