Data mining and statistical learning, lecture 3

Outline
- Ordinary least squares regression
- Ridge regression

Ordinary least squares regression (OLS)

Model: y = β₀ + β₁x₁ + … + βₚxₚ + ε, with inputs x₁, …, xₚ and response y

Terminology:
β₀: intercept (or bias)
β₁, …, βₚ: regression coefficients (or weights)

The response variable responds directly and linearly to changes in the inputs.

Least squares regression

Assume that we have observed a training set of data (xᵢ, yᵢ), i = 1, …, N, where xᵢ = (xᵢ₁, …, xᵢₚ).

Estimate the β coefficients by minimizing the residual sum of squares

RSS(β) = Σᵢ (yᵢ − β₀ − Σⱼ xᵢⱼβⱼ)²

Matrix formulation of OLS regression

Differentiating the residual sum of squares and setting the first derivatives equal to zero, we obtain

Xᵀ(y − Xβ) = 0

where X is the N × (p+1) matrix with one row per observed input vector (a leading 1 accounting for the intercept) and y is the N-vector of observed responses.

Parameter estimates and predictions

Least squares estimates of the parameters:
β̂ = (XᵀX)⁻¹Xᵀy

Predicted values:
ŷ = Xβ̂ = X(XᵀX)⁻¹Xᵀy
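
As an illustrative sketch (Python rather than the SAS used later in the lecture, and with simulated data), the least squares estimates and predicted values can be computed directly from the normal equations:

```python
import numpy as np

# Hypothetical simulated data: N = 50 cases, p = 3 inputs plus an intercept column
rng = np.random.default_rng(0)
n, p = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
beta_true = np.array([2.0, 1.0, -0.5, 0.3])
y = X @ beta_true + 0.1 * rng.normal(size=n)

# Least squares estimates: solve (X^T X) beta = X^T y
# (numerically preferable to forming the inverse explicitly)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Predicted values y_hat = X beta_hat
y_hat = X @ beta_hat
```

With little noise, the estimates land close to the coefficients used to generate the data.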

Different sources of inputs

- Quantitative inputs
- Transformations of quantitative inputs
- Numeric or dummy coding of the levels of qualitative inputs
- Interactions between variables (e.g. X₃ = X₁X₂)

Example of dummy coding: a qualitative input with three levels A, B, and C can be represented by two indicators, X₁ = 1 if the level is A (0 otherwise) and X₂ = 1 if the level is B (0 otherwise), so that level C corresponds to (0, 0).
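
A minimal sketch of dummy coding, using a hypothetical qualitative input with levels A, B, and C (C taken as the reference level):

```python
# Hypothetical three-level qualitative input
levels = ["A", "B", "C"]
observations = ["A", "C", "B", "A", "C"]

# One indicator column per non-reference level:
# x1 = 1 if the level is A, x2 = 1 if the level is B; level C is coded (0, 0)
dummies = [[1 if obs == lev else 0 for lev in levels[:-1]]
           for obs in observations]
print(dummies)  # [[1, 0], [0, 0], [0, 1], [1, 0], [0, 0]]
```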

An example of multiple linear regression

Response variable: Requested price of used Porsche cars (1000 SEK)

Inputs:
X₁ = Manufacturing year
X₂ = Mileage (km)
X₃ = Model (0 or 1)
X₄ = Equipment (1, 2, 3)
X₅ = Colour (Red, Black, Silver, Blue, White, Green)

Price of used Porsche cars

Response variable: Requested price of used Porsche cars (1000 SEK)

Inputs:
X₁ = Manufacturing year
X₂ = Mileage (km)

Interpretation of multiple regression coefficients

Assume that y = β₀ + β₁x₁ + … + βₚxₚ + ε and that the regression coefficients are estimated by ordinary least squares regression.

Then the multiple regression coefficient β̂ⱼ represents the additional contribution of xⱼ on y, after xⱼ has been adjusted for x₀, x₁, …, xⱼ₋₁, xⱼ₊₁, …, xₚ.

Confidence intervals for regression parameters

Assume that y = β₀ + β₁x₁ + … + βₚxₚ + ε, where the X-variables are fixed and the error terms are i.i.d. and N(0, σ²).

Then β̂ⱼ ~ N(βⱼ, vⱼσ²), and an approximate confidence interval for βⱼ is β̂ⱼ ± z⁽¹⁻ᵅ⁾ √vⱼ σ̂, where vⱼ is the jth diagonal element of (XᵀX)⁻¹.
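
As a sketch, approximate 95% intervals β̂ⱼ ± 1.96·√vⱼ·σ̂ can be computed numerically; the sample size, coefficients, and noise level below are assumed for illustration:

```python
import numpy as np

# Hypothetical simulated data with known coefficients
rng = np.random.default_rng(1)
n, p = 200, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
beta_true = np.array([1.0, 0.5, -0.25])
y = X @ beta_true + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat

# Unbiased estimate of sigma uses N - p - 1 degrees of freedom
sigma_hat = np.sqrt(resid @ resid / (n - p - 1))

# Standard errors: sqrt(v_j) * sigma_hat, with v_j the jth diagonal of (X^T X)^{-1}
se = sigma_hat * np.sqrt(np.diag(XtX_inv))

# Approximate 95% confidence intervals (normal quantile z = 1.96)
lower, upper = beta_hat - 1.96 * se, beta_hat + 1.96 * se
```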

Interpretation of software outputs

Adding new independent variables to a regression model alters at least one of the old regression coefficients unless the columns of the X-matrix are orthogonal, i.e. xⱼᵀxₖ = 0 for all j ≠ k.

Stepwise Regression: Price (1000 SEK) versus Year, Mileage (km), ...

The p-value refers to a t-test of the hypothesis that the regression coefficient of the last entered x-variable is zero.

Classical statistical model selection techniques are model-based. In data mining, model selection is data-driven.
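
A simplified sketch of the forward-stepwise idea on simulated data (variables are entered by the drop in residual sum of squares, rather than by the t-test p-values reported in the software output):

```python
import numpy as np

# Hypothetical data: only columns 0 and 2 carry signal
rng = np.random.default_rng(2)
n = 100
X = rng.normal(size=(n, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(size=n)

def rss_with(cols):
    """RSS of an OLS fit of y on an intercept plus the listed columns."""
    Z = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    r = y - Z @ beta
    return r @ r

selected, remaining = [], [0, 1, 2, 3]
for _ in range(2):  # enter two variables
    best = min(remaining, key=lambda j: rss_with(selected + [j]))
    selected.append(best)
    remaining.remove(best)
print(selected)  # the two signal-carrying columns, strongest first
```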

Stepwise Regression: Price (1000 SEK) versus Year, Mileage (km), ...
- model validation by visual inspection of residuals

Residual = Observed − Predicted

The Gram-Schmidt procedure for regression by successive orthogonalization and simple linear regression

1. Initialize z₀ = x₀ = 1.
2. For j = 1, …, p, compute
   zⱼ = xⱼ − Σₖ (⟨zₖ, xⱼ⟩ / ⟨zₖ, zₖ⟩) zₖ,  k = 0, …, j−1
   where ⟨·, ·⟩ denotes the inner product (the sum of coordinate-wise products).
3. Regress y on zₚ to obtain the multiple regression coefficient β̂ₚ.
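
The three steps above can be sketched as follows (simulated data; the check at the end illustrates that the simple regression of y on zₚ reproduces the multiple regression coefficient of xₚ):

```python
import numpy as np

# Hypothetical simulated data, with x0 = 1 as the first column
rng = np.random.default_rng(3)
n, p = 60, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
y = X @ np.array([1.0, 2.0, -1.0, 0.5]) + 0.2 * rng.normal(size=n)

Z = np.zeros_like(X)
Z[:, 0] = X[:, 0]                       # step 1: z0 = x0 = 1
for j in range(1, p + 1):               # step 2: orthogonalize x_j against z_0..z_{j-1}
    zj = X[:, j].copy()
    for k in range(j):
        zj -= (Z[:, k] @ X[:, j]) / (Z[:, k] @ Z[:, k]) * Z[:, k]
    Z[:, j] = zj

# Step 3: simple linear regression of y on z_p
beta_p = (Z[:, p] @ y) / (Z[:, p] @ Z[:, p])

# Agrees with the multiple regression coefficient of x_p
beta_full = np.linalg.lstsq(X, y, rcond=None)[0]
```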

Prediction of a response variable using correlated explanatory variables
- daily temperatures in Stockholm, Göteborg, and Malmö

Absorbance records for ten samples of chopped meat

1 response variable (protein)
100 predictors (absorbance at 100 wavelengths or channels)

The predictors are strongly correlated with each other.

Absorbance records for 240 samples of chopped meat

The target is only weakly correlated with each individual predictor.

Ridge regression

The ridge regression coefficients minimize a penalized residual sum of squares:

β̂^ridge = argmin over β of { Σᵢ (yᵢ − β₀ − Σⱼ xᵢⱼβⱼ)² + λ Σⱼ βⱼ² }

or, equivalently, minimize Σᵢ (yᵢ − β₀ − Σⱼ xᵢⱼβⱼ)² subject to Σⱼ βⱼ² ≤ s.

Normally, inputs are centred prior to the estimation of the regression coefficients.

Matrix formulation of ridge regression for centred inputs

β̂^ridge = (XᵀX + λI)⁻¹Xᵀy

If the inputs are orthonormal, the ridge estimates are just a scaled version of the least squares estimates: β̂^ridge = β̂ / (1 + λ).

Shrinking enables estimation of regression coefficients even if the number of parameters exceeds the number of cases (Figure 3.7).
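
A numerical sketch of ridge regression in matrix form, β̂^ridge = (XᵀX + λI)⁻¹Xᵀy, on simulated centred data (the data and the choice λ = 2 are assumptions for illustration):

```python
import numpy as np

# Hypothetical simulated data, centred as the slide recommends
rng = np.random.default_rng(4)
n, p = 80, 5
X = rng.normal(size=(n, p))
X = X - X.mean(axis=0)                  # centre the inputs
y = X @ rng.normal(size=p) + rng.normal(size=n)
y = y - y.mean()

lam = 2.0
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)

# The penalty shrinks the coefficient vector relative to OLS
print(np.linalg.norm(beta_ridge) < np.linalg.norm(beta_ols))  # True
```

Unlike the OLS normal equations, XᵀX + λI is invertible even when p exceeds N, which is what makes estimation possible in that case.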

Ridge regression – pros and cons

Ridge regression is particularly useful if the explanatory variables are strongly correlated with each other. The variance of the estimated regression coefficients is reduced at the expense of (slightly) biased estimates.

The Gauss-Markov theorem

Consider a linear regression model in which:
– the inputs are regarded as fixed
– the error terms are i.i.d. with mean 0 and variance σ².

Then the least squares estimator of a parameter aᵀβ has variance no bigger than that of any other linear unbiased estimator of aᵀβ.

Biased estimators may nevertheless have smaller variance and mean squared error!
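
The last remark can be made concrete with a small computation on an assumed toy design matrix with two nearly collinear columns (σ = 1): for fixed X, the total mean squared errors of OLS and of a ridge estimator are both available in closed form, and the biased ridge estimator comes out far ahead.

```python
import numpy as np

# Assumed toy design: two almost identical columns, so X^T X is near-singular
X = np.array([[1.0, 1.01], [1.0, 0.99], [-1.0, -1.02], [-1.0, -0.98]])
beta = np.array([1.0, 1.0])
lam = 1.0
XtX = X.T @ X

# OLS is unbiased: total MSE = sigma^2 * trace((X^T X)^{-1})
mse_ols = np.trace(np.linalg.inv(XtX))

# Ridge: variance term sigma^2 * trace(A X^T X A) plus squared bias
# lam^2 * beta^T A^2 beta, with A = (X^T X + lam I)^{-1}
A = np.linalg.inv(XtX + lam * np.eye(2))
var_ridge = np.trace(A @ XtX @ A)
bias2 = lam**2 * beta @ A @ A @ beta
mse_ridge = var_ridge + bias2

print(mse_ridge < mse_ols)  # True
```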

SAS code for an ordinary least squares regression

proc reg data=mining.dailytemperature outest=dtempbeta;
  model daily_consumption = stockholm g_teborg malm_;
run;

SAS code for ridge regression

proc reg data=mining.dailytemperature outest=dtempbeta
         ridge=0 to 10 by 1;
  model daily_consumption = stockholm g_teborg malm_;
proc print data=dtempbeta;
run;