3.2 OLS Fitted Values and Residuals
-after obtaining OLS estimates, we can obtain fitted or predicted values for y: yhat_i = B0hat + B1hat*x_i1 + B2hat*x_i2 + ... + Bkhat*x_ik
-given our actual and predicted values for y, as in the simple regression case we can obtain residuals: uhat_i = y_i - yhat_i
-a positive uhat indicates underprediction (y > yhat) and a negative uhat indicates overprediction (y < yhat)

3.2 OLS Fitted Values and Residuals
-We can extend the single variable case to obtain important properties of the fitted values and residuals:
1) The sample average of the residuals is zero; therefore ybar = yhatbar
2) The sample covariance between each independent variable and the OLS residuals is zero
-therefore the sample covariance between the OLS fitted values and the OLS residuals is also zero, since the fitted values are constructed from our independent variables and the OLS estimates

3.2 OLS Fitted Values and Residuals
3) The point (x1bar, x2bar, ..., xkbar, ybar) is always on the OLS regression line: ybar = B0hat + B1hat*x1bar + ... + Bkhat*xkbar
Notes: These properties come from the FOCs in (3.13):
-the first FOC says that the sum of the residuals is zero, which proves (1)
-the remaining FOCs imply zero covariance between each independent variable and uhat, which proves (2)
-(3) follows from (1)
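As a numerical illustration (not from the original slides; the data and variable names below are made up), the following numpy sketch fits a two-regressor OLS model and checks properties (1)-(3):

import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])        # design matrix with intercept
bhat = np.linalg.lstsq(X, y, rcond=None)[0]      # OLS estimates B0hat, B1hat, B2hat
yhat = X @ bhat                                  # fitted values
uhat = y - yhat                                  # residuals

print(uhat.mean())                               # property (1): essentially zero
print(x1 @ uhat, x2 @ uhat, yhat @ uhat)         # property (2): zero covariance with the x's and with yhat
print(y.mean() - bhat @ [1.0, x1.mean(), x2.mean()])   # property (3): (xbar, ybar) lies on the line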

3.2 "Partialling Out"
-In multiple regression analysis, we don't need explicit formulas to obtain the OLS estimates of the Bj's
-However, explicit formulas can reveal interesting properties
-In the two independent variable case: B1hat = sum(rhat_i1 * y_i) / sum(rhat_i1^2)
-Where the rhat_i1 are the residuals from regressing x1 on x2, ie: the regression x_i1 = gamma0hat + gamma1hat*x_i2 + rhat_i1

3.2 "Partialling Out"
-rhat_i1 is the part of x_i1 that is uncorrelated with x_i2
-equivalently, rhat_i1 is x_i1 after x_i2's effects have been "partialled out" or "netted out"
-thus B1hat measures x1's effect on y after x2 has been partialled out
-In a regression with k independent variables, the residuals come from a regression of x1 on ALL the other x's
-in this case B1hat measures x1's effect on y after all the other x's have been partialled out
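A minimal numerical check of this partialling-out result (illustrative simulated data, assumed two-regressor setup):

import numpy as np

rng = np.random.default_rng(1)
n = 500
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)               # x1 is correlated with x2
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

# B1hat from the full multiple regression of y on (1, x1, x2)
X = np.column_stack([np.ones(n), x1, x2])
b1_multiple = np.linalg.lstsq(X, y, rcond=None)[0][1]

# Step 1: regress x1 on x2 and keep the residuals rhat1
Z = np.column_stack([np.ones(n), x2])
rhat1 = x1 - Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]

# Step 2: B1hat = sum(rhat1 * y) / sum(rhat1^2)
b1_partialled = (rhat1 @ y) / (rhat1 @ rhat1)

print(b1_multiple, b1_partialled)                # the two estimates coincide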

3.2 Comparing Simple and Multiple Regressions
-In two special cases, OLS will estimate the same coefficient on x1 with one or two independent variables
-Write the simple and multiple regressions as: ytilde = B0tilde + B1tilde*x1 and yhat = B0hat + B1hat*x1 + B2hat*x2
-The relationship between the two estimates of B1 becomes: B1tilde = B1hat + B2hat*delta1
-Where delta1 is the slope coefficient from the simple regression of x2 on x1 (proof in the Appendix)

3.2 Comparing Simple and Multiple Regressions
-Therefore we have two cases where B1tilde and B1hat will be equal:
1) The partial effect of x2 on yhat is zero in the sample: B2hat = 0
2) x1 and x2 are uncorrelated in the sample, so that delta1 = 0

3.2 Comparing Simple and Multiple Regressions
-Although these two cases are rare, they do highlight the situations where the two estimates of B1 will be similar:
-when B2hat is small
-when there is little correlation between x1 and x2
-In the case of k independent variables, B1hat will equal the simple regression estimate if:
1) the OLS coefficients on all the other x's are zero
2) x1 is uncorrelated with all the other x's
-Likewise, small coefficients or little correlation will lead to small differences between the estimates of B1
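A quick simulated check of the B1tilde = B1hat + B2hat*delta1 relationship (all data and names below are illustrative):

import numpy as np

rng = np.random.default_rng(2)
n = 500
x2 = rng.normal(size=n)
x1 = 0.5 * x2 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

# Multiple regression: y on (1, x1, x2)
_, b1hat, b2hat = np.linalg.lstsq(np.column_stack([np.ones(n), x1, x2]), y, rcond=None)[0]

# Simple regression: y on (1, x1) only
_, b1tilde = np.linalg.lstsq(np.column_stack([np.ones(n), x1]), y, rcond=None)[0]

# delta1: slope from regressing x2 on x1
_, delta1 = np.linalg.lstsq(np.column_stack([np.ones(n), x1]), x2, rcond=None)[0]

print(b1tilde, b1hat + b2hat * delta1)           # identical up to floating-point rounding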

3.2 Wedding Example
-Assuming decisions in a wedding could be quantified, suppose wedding decisions are first regressed on the bride's opinions alone, and then the groom's opinions are added as a second explanatory variable
-Since B2hat (the coefficient on the groom's opinions) is relatively small, the estimates of B1 are similar in both regressions
-Although the bride and groom could have similar opinions, it's the bride's opinion that often matters in weddings

3.2 Goodness-of-Fit
-Exactly as in the simple regression case, the TOTAL SUM OF SQUARES (SST), the EXPLAINED SUM OF SQUARES (SSE) and the RESIDUAL SUM OF SQUARES (SSR) are defined as:
SST = sum((y_i - ybar)^2)
SSE = sum((yhat_i - ybar)^2)
SSR = sum(uhat_i^2)

3.2 Sum of Squares
-SST still measures the sample variation in y
-SSE still measures the sample variation in yhat (the fitted component)
-SSR still measures the sample variation in uhat (the residual component)
-Total variation in y is still the sum of the total variation in yhat and the total variation in uhat: SST = SSE + SSR

3.2 SS's and R^2
-If the total variation in y is nonzero, we can solve for R^2: R^2 = SSE/SST = 1 - SSR/SST
-R^2 can also be shown to equal the squared sample correlation coefficient between the actual y_i and the fitted yhat_i (remember that ybar = yhatbar)
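The sketch below (simulated, illustrative data) computes the sums of squares and verifies that the three ways of obtaining R^2 agree:

import numpy as np

rng = np.random.default_rng(3)
n = 200
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
yhat = X @ np.linalg.lstsq(X, y, rcond=None)[0]
uhat = y - yhat

sst = np.sum((y - y.mean())**2)                  # total sum of squares
sse = np.sum((yhat - y.mean())**2)               # explained sum of squares
ssr = np.sum(uhat**2)                            # residual sum of squares

print(np.isclose(sst, sse + ssr))                # SST = SSE + SSR
print(sse / sst, 1 - ssr / sst, np.corrcoef(y, yhat)[0, 1]**2)   # three equal versions of R^2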

3.2 R^2 Notes:
-R^2 NEVER decreases, and usually increases, when a variable is added to the regression, because SSR never increases when a variable is added
-even adding a useless variable (as long as it varies) will generally increase R^2, as the sketch below shows
-R^2 is therefore a poor way to decide whether to include a variable
-one should instead ask whether a variable has a nonzero effect on y in the population (a theory question)
-this question is somewhat testable, as we will see in Chapter 4
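A small demonstration of this point (hypothetical simulated data): adding a regressor that has nothing to do with y still nudges R^2 up.

import numpy as np

rng = np.random.default_rng(4)
n = 100
x1 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + rng.normal(size=n)
junk = rng.normal(size=n)                        # a useless variable, unrelated to y

def r_squared(X, y):
    yhat = X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return 1 - np.sum((y - yhat)**2) / np.sum((y - y.mean())**2)

r2_without = r_squared(np.column_stack([np.ones(n), x1]), y)
r2_with = r_squared(np.column_stack([np.ones(n), x1, junk]), y)
print(r2_without, r2_with)                       # r2_with >= r2_without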

3.2 R^2 Example
-Consider a regression in which the percentage of gambling winnings or losses is explained by gambling skill and gambling experience, with R^2 = 0.24
-skill and experience account for 24% of the variation in gambling outcomes
-this may sound low, but a major gambling factor, luck, is unmeasurable and has a big impact
-other omitted factors can also have an impact

3.2 R^2 Notes:
-Even if R^2 is low, it is still possible that OLS gives reliable estimates of each variable's ceteris paribus effect on y
-these variables may not explain much of the variation in y, but we can still analyze how an increase or decrease in each of them will affect y
-a low R^2 simply reflects that the variation in y is hard to explain: it is difficult to predict individual behaviour, since people aren't as rational as would be convenient

3.2 Regression through the Origin
-If common sense or economic theory states that B0 should be zero, we can estimate the regression through the origin: ytilde = B1tilde*x1 + B2tilde*x2 + ... + Bktilde*xk
-here the tildes distinguish these estimates from the usual OLS estimates that include an intercept
-in this case it is possible for the conventional R^2 to be negative: ybar alone may explain more of y than the included variables
-(3.29) avoids this problem, but no common procedure exists
-note also that if the true intercept B0 is not zero, these OLS slope coefficients are biased
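The following sketch (made-up data with a large true intercept) shows how forcing the regression through the origin can make the conventional R^2 negative:

import numpy as np

rng = np.random.default_rng(5)
n = 100
x = rng.normal(size=n)
y = 10.0 + 0.1 * x + rng.normal(size=n)          # large true intercept, weak slope

btilde = (x @ y) / (x @ x)                       # slope-only regression through the origin
uhat = y - btilde * x

# the conventional R^2 = 1 - SSR/SST can fall below zero here,
# because ybar fits the data better than the origin-constrained line
print(1 - np.sum(uhat**2) / np.sum((y - y.mean())**2))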

3.3 The Expected Value of the OLS Estimators
-As in the simple regression model, we will look at FOUR assumptions that are needed to prove that the multiple regression OLS estimators are unbiased
-these assumptions are more complicated with more independent variables
-remember that these statistical properties have nothing to do with a specific sample, but hold in repeated random sampling
-an individual sample's regression could still be a poor estimate

Assumption MLR.1 (Linear in Parameters)
The model in the population can be written as: y = B0 + B1*x1 + B2*x2 + ... + Bk*xk + u   (3.31)
Where B0, B1, ..., Bk are the unknown parameters (constants) of interest and u is an unobservable random error or disturbance term
(Note: MLR stands for multiple linear regression)

Assumption MLR.1 Notes (Linear in Parameters)
-(3.31) is also called the POPULATION MODEL or the TRUE MODEL, since our actual estimated model may differ from it
-the population model is linear in the parameters (the B's)
-since the variables themselves can enter non-linearly (ie: squares and logs), this model is very flexible
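For instance (a standard textbook-style illustration, not taken from these slides), the model log(wage) = B0 + B1*educ + B2*exper + B3*exper^2 + u is nonlinear in the variables wage and exper, yet linear in the parameters B0, ..., B3, so it satisfies MLR.1; a model such as y = B0 + x^B1 + u would not.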

Assumption MLR.2 (Random Sampling)
We have a random sample of n observations, {(x_i1, x_i2, ..., x_ik, y_i): i = 1, 2, ..., n}, following the population model in Assumption MLR.1.

Assumption MLR.2 Notes (Random Sampling)
-Combining MLR.1 with MLR.2 gives us: y_i = B0 + B1*x_i1 + B2*x_i2 + ... + Bk*x_ik + u_i
-where u_i contains the unobserved factors affecting y_i
-where Bkhat is an estimator of Bk, ie: the OLS estimates give the fitted equation yhat_i = B0hat + B1hat*x_i1 + ... + Bkhat*x_ik
-We've already seen that the residuals average out to zero and that the sample correlation between each independent variable and the residuals is zero
-our next assumption makes OLS well defined

Assumption MLR.3 (No Perfect Collinearity) In the sample (and therefore in the population), none of the independent variables is constant, and there are no exact linear relationships among the independent variables.

Assumption MLR.3 Notes (No Perfect Collinearity)
-MLR.3 is more complicated than its simple regression counterpart, since there can now be relationships among several independent variables
-if an independent variable is an exact linear combination of the other independent variables, PERFECT COLLINEARITY exists and the OLS estimates cannot be computed
-some collinearity, or correlation between the independent variables, is expected and allowed, as long as it is not perfect
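A minimal sketch of perfect collinearity (hypothetical example: the same income variable entered twice in different units), showing that the design matrix loses full column rank, so the normal equations have no unique solution:

import numpy as np

rng = np.random.default_rng(6)
n = 50
income_dollars = rng.normal(50000, 10000, size=n)
income_thousands = income_dollars / 1000         # an exact linear function of the first regressor

X = np.column_stack([np.ones(n), income_dollars, income_thousands])
print(np.linalg.matrix_rank(X))                  # rank 2, but 3 columns: X'X is singular
# solving the normal equations (X'X)b = X'y would fail here: the OLS estimates are not uniquely defined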