Basis Expansions and Generalized Additive Models (1)


Basis Expansions and Generalized Additive Models (1) Regression and shrinkage Basis expansion Piecewise polynomials

Linear regression Simple linear regression: E(y) = α + βx, where α is the intercept and β is the slope; the fit is a line. With multiple predictors: E(y) = α + β1x1 + β2x2 + … + βkxk; the fit is a hyperplane.

Loss function The model is y = α + β1x1 + β2x2 + … + βkxk + ε, with ε ~ N(0, σ²). The least-squares loss function is L(α, β) = Σi (yi − α − β1xi1 − … − βkxik)². The βj, j = 1, 2, ..., k, are called "partial regression coefficients": βj represents the average increase in y per unit increase in xj, with all other variables held constant.

Loss function Take the partial derivatives of the loss with respect to α and each βj, set them to zero, and solve the resulting set of linear equations (the normal equations): Σi (yi − α − Σj βjxij) = 0 and Σi xij (yi − α − Σl βlxil) = 0 for j = 1, ..., k.

The Matrix approach In matrix form the loss function is L(β) = (y − Xβ)ᵀ(y − Xβ), and setting its gradient to zero gives the solution β̂ = (XᵀX)⁻¹Xᵀy.
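A minimal numerical sketch of this closed-form fit (the simulated data below are made up for illustration, and lstsq is used rather than forming the inverse explicitly, for numerical stability):

```python
import numpy as np

# Toy data: n observations, k predictors (illustrative values only)
rng = np.random.default_rng(0)
n, k = 100, 3
X = rng.normal(size=(n, k))
beta_true = np.array([1.5, -2.0, 0.5])
y = 0.7 + X @ beta_true + rng.normal(scale=0.5, size=n)

# Add a column of ones for the intercept, then solve the least-squares problem,
# i.e. beta_hat = (X'X)^{-1} X'y
X1 = np.column_stack([np.ones(n), X])
beta_hat, *_ = np.linalg.lstsq(X1, y, rcond=None)
print(beta_hat)  # first entry is the intercept estimate
```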

Geometric interpretation The fitted vector ŷ = Xβ̂ is the orthogonal projection of y onto the column space of X. https://commons.wikimedia.org/wiki/File:OLS_geometric_interpretation.svg

Shrinkage methods The expected prediction error of a model contains variance and bias components, plus the irreducible error. Under the model Y = f(X) + ε with Var(ε) = σ², the expected prediction error at a point x0 decomposes as EPE(x0) = σ² + Bias²(f̂(x0)) + Var(f̂(x0)).

Shrinkage methods Bias-variance trade-off: by introducing a little bias into the model, we can sometimes reduce a lot of the variance, making the overall EPE much smaller. Shrinkage methods shrink the coefficient estimates towards zero; this can significantly reduce their variance (uncertainty), and hence the prediction variance. Irrelevant predictors are essentially removed by receiving zero (or extremely small) coefficients.

Shrinkage methods Ridge regression: In multiple linear regression we minimize the least-squares loss Σi (yi − α − β1xi1 − … − βkxik)². In contrast, ridge regression minimizes the penalized loss Σi (yi − α − β1xi1 − … − βkxik)² + λ Σj βj².

Shrinkage methods It is best to apply ridge regression after standardizing the predictors. When λ = 0, ridge regression reduces to least squares; as λ grows large, the ridge coefficient estimates approach zero.
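A minimal sketch of this estimator, assuming standardized predictors and an unpenalized intercept (the function ridge_fit and its interface are illustrative, not a standard library routine):

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate on standardized predictors (intercept not penalized)."""
    # Standardize predictors and center the response
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    k = Xs.shape[1]
    # beta_hat = (Xs'Xs + lam * I)^{-1} Xs'yc
    beta = np.linalg.solve(Xs.T @ Xs + lam * np.eye(k), Xs.T @ yc)
    intercept = y.mean()  # with centered y, the intercept is just the mean of y
    return intercept, beta

# lam = 0 recovers the least-squares coefficients; increasing lam shrinks them toward zero.
```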


Shrinkage methods Lasso: the loss function is Σi (yi − α − β1xi1 − … − βkxik)² + λ Σj |βj|. The ℓ1 penalty has the effect of forcing some of the coefficient estimates to be exactly zero when the tuning parameter λ is sufficiently large.
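A short illustration using scikit-learn's Lasso; the simulated data and the value alpha=0.5 are arbitrary, and scikit-learn's alpha plays the role of λ up to its own scaling convention:

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(size=200)  # only two relevant predictors

# Standardize the predictors, then fit the lasso
Xs = StandardScaler().fit_transform(X)
fit = Lasso(alpha=0.5).fit(Xs, y)
print(fit.coef_)  # coefficients of the irrelevant predictors are typically exactly zero
```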


Shrinkage methods The two penalties can equivalently be written as constraints. Lasso: minimize Σi (yi − α − β1xi1 − … − βkxik)² subject to Σj |βj| ≤ s. Ridge: minimize the same loss subject to Σj βj² ≤ s.


Basis expansion f(X) = E(Y | X) can often be nonlinear and non-additive in X. However, linear models are easy to fit and interpret. By augmenting the data with transformed inputs hm(X), we may construct a model f(X) = Σm βm hm(X) that is linear in the new features, and thereby achieve non-linear regression/classification.

Basis expansion Some widely used transformations: hm(X) = Xm, m = 1, ..., p → the original linear model; hm(X) = Xj², hm(X) = XjXk, or higher-order polynomials → augment the inputs with polynomial terms (but the number of basis functions grows exponentially in the degree of the polynomial: O(p^d) for a degree-d polynomial); hm(X) = log(Xj), ... → other nonlinear transformations; hm(X) = I(Lm ≤ Xk < Um), breaking the range of Xk into non-overlapping regions → piecewise constant.
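A small sketch of a polynomial basis expansion followed by an ordinary linear fit (the sine-shaped toy data and the choice of degree 3 are arbitrary, for illustration only):

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
x = np.linspace(0, 1, 50).reshape(-1, 1)
y = np.sin(4 * x).ravel() + 0.1 * rng.normal(size=50)

# Expand x into polynomial basis functions h_m(x) = x^m, then fit a model
# that is linear in the basis but nonlinear in x
H = PolynomialFeatures(degree=3, include_bias=False).fit_transform(x)
model = LinearRegression().fit(H, y)
y_hat = model.predict(H)
```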

Basis expansion More often, we use basis expansions as a device to achieve more flexible representations for f(X). Polynomials are global: tweaking the functional form to suit one region can cause the function to flap about madly in remote regions. (Figure: red, degree-6 polynomial; blue, degree-7 polynomial.)

Basis expansion Piecewise polynomials and splines allow for local polynomial representations. Problem: the number of basis functions can grow too large to fit using limited data. Solution: restriction methods, which limit the class of functions in advance. Example: the additive model, f(X) = Σj fj(Xj), where each fj is expanded in its own basis (sketched below).
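A tiny sketch of the restriction idea, assuming some one-dimensional basis function basis_1d (a hypothetical placeholder for, e.g., a polynomial or spline basis):

```python
import numpy as np

def additive_design(X, basis_1d):
    """Additive-model restriction: expand each predictor X_j separately with a
    one-dimensional basis and concatenate, so the fit has the form sum_j f_j(X_j)."""
    return np.column_stack([basis_1d(X[:, j]) for j in range(X.shape[1])])

# An ordinary least-squares fit on this design matrix is then an additive model.
```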

Basis expansion Selection methods: allow large numbers of basis functions, adaptively scan the dictionary, and include only those basis functions hm(·) that contribute significantly to the fit of the model. Example: multivariate adaptive regression splines (MARS). Regularization methods: use the entire dictionary but restrict the coefficients. Example: ridge regression. The lasso combines both regularization and selection.

Piecewise Polynomials Assume X is one-dimensional. Divide the domain of X into contiguous intervals, and represent f(X) by a separate polynomial in each interval. Simplest case: piecewise constant, with one indicator basis function hm(X) = I(ξm−1 ≤ X < ξm) per region; the least-squares fit sets each coefficient to the mean of y in that region.
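A sketch of the corresponding indicator basis (the function name and knot handling are illustrative):

```python
import numpy as np

def piecewise_constant_basis(x, knots):
    """Indicator basis: one column I(xi_{m-1} <= x < xi_m) per region defined by the knots."""
    edges = np.concatenate(([-np.inf], np.asarray(knots, dtype=float), [np.inf]))
    return np.column_stack(
        [((x >= lo) & (x < hi)).astype(float) for lo, hi in zip(edges[:-1], edges[1:])]
    )

# Regressing y on this basis (without an intercept) returns the mean of y in each region.
```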

Piecewise Polynomials Piecewise linear: three additional basis functions are needed, one per region, hm+3(X) = hm(X)·X for m = 1, 2, 3, so that each interval gets its own slope as well as its own level.

Piecewise Polynomials Piecewise linear, requiring continuity at the knots: a basis that builds the constraints in directly is h1(X) = 1, h2(X) = X, h3(X) = (X − ξ1)+, h4(X) = (X − ξ2)+, where t+ denotes the positive part of t.
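A sketch of this basis (the helper name is hypothetical; knots is the list of knot locations ξ1, ξ2, ...):

```python
import numpy as np

def pw_linear_continuous_basis(x, knots):
    """Basis 1, x, (x - xi_1)_+, (x - xi_2)_+, ... for a piecewise linear fit that is
    continuous at the knots (each positive-part term adds a slope change at its knot)."""
    cols = [np.ones_like(x), x] + [np.clip(x - k, 0.0, None) for k in knots]
    return np.column_stack(cols)
```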

Piecewise Polynomials (Figure, lower-right panel.) Cubic spline: piecewise cubic polynomials constrained to be continuous and to have continuous first and second derivatives at the knots.
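A sketch of the truncated power basis for a cubic spline with arbitrary knots (the helper name is illustrative):

```python
import numpy as np

def cubic_spline_basis(x, knots):
    """Truncated power basis for a cubic spline: 1, x, x^2, x^3, plus (x - xi_k)_+^3
    for each knot. This enforces continuity of the function and of its first and
    second derivatives at the knots."""
    cols = [np.ones_like(x), x, x**2, x**3]
    cols += [np.clip(x - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(cols)

# Example: a least-squares fit of y on cubic_spline_basis(x, [0.3, 0.6]) is a
# cubic regression spline with knots at 0.3 and 0.6.
```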