Lecture 4 Econ 488

Ordinary Least Squares (OLS) Objective of OLS  Minimize the sum of squared residuals, Σᵢ eᵢ², where eᵢ = Yᵢ − Ŷᵢ is the residual for observation i. Remember that OLS is not the only possible estimator of the βs. But OLS is the best estimator under certain assumptions…
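To make the objective concrete, here is a minimal sketch in Python/NumPy (made-up data, not part of the original slides): it fits a bivariate line by OLS and checks that the fitted coefficients do minimize the sum of squared residuals.

```python
import numpy as np

# Made-up data: true intercept 2.0, true slope 0.5
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=200)
Y = 2.0 + 0.5 * X + rng.normal(0, 1, size=200)

# Closed-form OLS for the bivariate model: slope = cov(X, Y) / var(X)
b1_hat = np.cov(X, Y, ddof=1)[0, 1] / np.var(X, ddof=1)
b0_hat = Y.mean() - b1_hat * X.mean()

# Sum of squared residuals at the OLS estimates
residuals = Y - (b0_hat + b1_hat * X)
ssr = np.sum(residuals ** 2)
print(b0_hat, b1_hat, ssr)

# Any other candidate line (here, a slightly different slope) gives a larger SSR
ssr_alt = np.sum((Y - (b0_hat + (b1_hat + 0.1) * X)) ** 2)
assert ssr_alt > ssr
```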

Classical Assumptions 1. Regression is linear in parameters 2. Error term has zero population mean 3. Error term is not correlated with X’s 4. No serial correlation 5. No heteroskedasticity 6. No perfect multicollinearity and we usually add: 7. Error term is normally distributed

Assumption 1: Linearity The regression model:  A) is linear It can be written as This doesn’t mean that the theory must be linear For example… suppose we believe that CEO salary is related to the firm’s sales and CEO’s tenure. We might believe the model is:

Assumption 1: Linearity The regression model:  B) is correctly specified The model must have the right variables No omitted variables The model must have the correct functional form This is all untestable  We need to rely on economic theory.

Assumption 1: Linearity The regression model:  C) must have an additive error term The model must have + ε i

Assumption 2: E(εᵢ) = 0 Error term has a zero population mean, E(εᵢ) = 0. Each observation has a random error with a mean of zero. What if E(εᵢ) ≠ 0? This is actually fixed by adding a constant (AKA intercept) term.

Assumption 2: E(εᵢ) = 0 Example: Suppose instead the mean of εᵢ was −4. Then we know E(εᵢ + 4) = 0. We can add 4 to the error term and subtract 4 from the constant term: Yᵢ = β₀ + β₁Xᵢ + εᵢ becomes Yᵢ = (β₀ − 4) + β₁Xᵢ + (εᵢ + 4).

Assumption 2: E(εᵢ) = 0 Yᵢ = β₀ + β₁Xᵢ + εᵢ, Yᵢ = (β₀ − 4) + β₁Xᵢ + (εᵢ + 4). We can rewrite: Yᵢ = β₀* + β₁Xᵢ + εᵢ*, where β₀* = β₀ − 4 and εᵢ* = εᵢ + 4. Now E(εᵢ*) = 0, so we are OK.
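A small numeric check of this argument, with hypothetical parameter values (not from the slides): if the error has mean −4 but the regression includes a constant, the slope is still recovered and only the estimated intercept shifts by 4.

```python
import numpy as np

# Hypothetical model: true intercept 3, true slope 2, but E(eps) = -4 instead of 0
rng = np.random.default_rng(2)
X = rng.uniform(0, 10, size=5_000)
eps = rng.normal(-4.0, 1.0, size=5_000)
Y = 3.0 + 2.0 * X + eps

# OLS with a constant term: the nonzero error mean is absorbed by the intercept
A = np.column_stack([np.ones_like(X), X])
b_hat, *_ = np.linalg.lstsq(A, Y, rcond=None)
print(b_hat)  # intercept close to 3 - 4 = -1, slope still close to 2
```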

Assumption 3: Exogeneity Important!! All explanatory variables are uncorrelated with the error term: E(εᵢ|X₁ᵢ, X₂ᵢ, …, X_Kᵢ) = 0. Explanatory variables are determined outside of the model (they are exogenous).

Assumption 3: Exogeneity What happens if assumption 3 is violated? Suppose we have the model Yᵢ = β₀ + β₁Xᵢ + εᵢ, and suppose Xᵢ and εᵢ are positively correlated: when Xᵢ is large, εᵢ tends to be large as well.

Assumption 3: Exogeneity [Figures: the “true” line, the data, and the estimated line.]

Assumption 3: Exogeneity Why would x and ε be correlated? Suppose you are trying to study the relationship between the price of a hamburger and the quantity sold across a wide variety of Ventura County restaurants.

Assumption 3: Exogeneity We estimate the relationship using the following model: salesᵢ = β₀ + β₁priceᵢ + εᵢ. What’s the problem?

Assumption 3: Exogeneity What’s the problem?  What else determines sales of hamburgers?  How would you decide between buying a burger at McDonald’s ($0.89) or a burger at TGI Fridays ($9.99)?  Quality differs  sales i = β 0 +β 1 price i +ε i  quality isn’t an X variable even though it should be.  It becomes part of ε i

Assumption 3: Exogeneity What’s the problem?  But price and quality are highly positively correlated  Therefore x and ε are also positively correlated.  This means that the estimate of β 1 will be too high  This is called “Omitted Variables Bias” (More in Chapter 6)

Assumption 4: No Serial Correlation Serial Correlation: The error terms across observations are correlated with each other i.e. ε 1 is correlated with ε 2, etc. This is most important in time series If errors are serially correlated, an increase in the error term in one time period affects the error term in the next.

Assumption 4: No Serial Correlation The assumption that there is no serial correlation can be unrealistic in time series Think of data from a stock market…

Assumption 4: No Serial Correlation Stock data is serially correlated!
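A brief sketch of what serially correlated errors look like, using hypothetical AR(1) parameters (not from the lecture): each period’s error carries over part of the previous period’s error.

```python
import numpy as np

# Hypothetical AR(1) errors: eps_t = rho * eps_{t-1} + white noise
rng = np.random.default_rng(4)
T, rho = 500, 0.8
eps = np.zeros(T)
for t in range(1, T):
    eps[t] = rho * eps[t - 1] + rng.normal(0, 1)

# Consecutive errors are strongly correlated, violating Assumption 4
print(np.corrcoef(eps[:-1], eps[1:])[0, 1])  # close to rho = 0.8
```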

Assumption 5: Homoskedasticity Homoskedasticity: The error has a constant variance This is what we want…as opposed to Heteroskedasticity: The variance of the error depends on the values of Xs.

Assumption 5: Homoskedasticity Homoskedasticity: The error has constant variance

Assumption 5: Homoskedasticity Heteroskedasticity: Spread of error depends on X.

Assumption 5: Homoskedasticity Another form of Heteroskedasticity
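A small contrast between the two cases, with made-up numbers: in the homoskedastic draw the error spread is the same everywhere, while in the heteroskedastic draw the spread grows with X.

```python
import numpy as np

# Made-up regressor and two error processes, both with mean zero
rng = np.random.default_rng(5)
X = rng.uniform(1, 10, size=2_000)
eps_homo = rng.normal(0, 1, size=X.size)        # constant variance
eps_hetero = rng.normal(0, 1, size=X.size) * X  # spread grows with X

# Compare the error spread for small and large X in each case
for label, eps in [("homoskedastic", eps_homo), ("heteroskedastic", eps_hetero)]:
    print(label, eps[X < 5].std(), eps[X >= 5].std())
# Homoskedastic: the two standard deviations are similar.
# Heteroskedastic: the spread is much larger where X is large.
```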

Assumption 6: No Perfect Multicollinearity Two variables are perfectly collinear if one can be determined perfectly from the other (i.e. if you know the value of x, you can always find the value of z). Example: If we regress income on age, and include both age in months and age in years.  But age in years = age in months/12  e.g. if we know someone is 246 months old, we also know that they are 20.5 years old.

Assumption 6: No Perfect Multicollinearity What’s wrong with this? incomeᵢ = β₀ + β₁agemonthsᵢ + β₂ageyearsᵢ + εᵢ. What is β₁? It is the change in income associated with a one-unit increase in “age in months,” holding age in years constant.  But if you hold age in years constant, age in months doesn’t change!

Assumption 6: No Perfect Multicollinearity β₁ = Δincome/Δagemonths, holding Δageyears = 0. If Δageyears = 0, then Δagemonths = 0, so β₁ = Δincome/0. It is undefined!

Assumption 6: No Perfect Multicollinearity When an independent variable is a perfect linear combination of more than one of the other independent variables, it is called Perfect Multicollinearity. Example: Total Cholesterol, HDL, and LDL, where Total Cholesterol = LDL + HDL. Can’t include all three as independent variables in a regression. Solution: Drop one of the variables.
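A quick numeric illustration of the age example (hypothetical data): with both age in months and age in years in the design matrix, the columns are linearly dependent, so OLS has no unique solution until one of them is dropped.

```python
import numpy as np

# Hypothetical ages: months is exactly 12 times years
rng = np.random.default_rng(6)
n = 100
age_years = rng.uniform(20, 60, size=n)
age_months = 12 * age_years

X_both = np.column_stack([np.ones(n), age_months, age_years])
X_drop = np.column_stack([np.ones(n), age_years])  # drop one collinear variable

# With both variables, only 2 of the 3 columns are linearly independent,
# so (X'X) is singular and the OLS coefficients are not uniquely defined.
print(np.linalg.matrix_rank(X_both))  # 2 out of 3 columns
print(np.linalg.matrix_rank(X_drop))  # 2 out of 2 columns: full rank, OLS is fine
```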

Assumption 7: Normally Distributed Error

This is not required for OLS, but it is important for hypothesis testing. More on this assumption next time.

Putting it all together Last class, we talked about how to compare estimators. We want: 1. β̂ is unbiased: on average, the estimator is equal to the population value. 2. β̂ is efficient: the variance of the estimator is as small as possible.

Putting it all together

Gauss-Markov Theorem Given OLS assumptions 1 through 6, the OLS estimator of βₖ is the minimum variance estimator from the set of all linear unbiased estimators of βₖ, for k = 0, 1, 2, …, K. OLS is BLUE: the Best Linear Unbiased Estimator.

Gauss-Markov Theorem What happens if we add assumption 7? Given assumptions 1 through 7, OLS is the best unbiased estimator, even out of the non-linear estimators. OLS is “BUE”?

Gauss-Markov Theorem With Assumptions 1-7, OLS is: 1. Unbiased: E(β̂ₖ) = βₖ. 2. Minimum Variance: the sampling distribution is as tight as possible. 3. Consistent: as n → ∞, the estimators converge to the true parameters. As n increases, variance gets smaller, so each estimate approaches the true value of β. 4. Normally Distributed: you can apply statistical tests to them.
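A small Monte Carlo sketch of unbiasedness and consistency, with hypothetical parameter values (not from the lecture): across repeated samples the OLS slope is centered on the true value, and its spread shrinks as n grows.

```python
import numpy as np

# Hypothetical model: true slope 0.5; repeat OLS on fresh samples of growing size
rng = np.random.default_rng(7)
true_b1 = 0.5

for n in (25, 100, 400):
    slopes = []
    for _ in range(2_000):
        X = rng.uniform(0, 10, size=n)
        Y = 2.0 + true_b1 * X + rng.normal(0, 1, size=n)
        A = np.column_stack([np.ones(n), X])
        slopes.append(np.linalg.lstsq(A, Y, rcond=None)[0][1])
    slopes = np.array(slopes)
    # Mean stays near 0.5 (unbiased); standard deviation shrinks with n (consistent)
    print(n, slopes.mean().round(3), slopes.std().round(3))
```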