Lecture 8: Ordinary Least Squares Estimation
BUEC 333, Summer 2009
Simon Woodcock


From Last Day
- Recall our population regression function: Y_i = β_0 + β_1 X_1i + β_2 X_2i + … + β_K X_Ki + ε_i
- Because the coefficients (β) and the errors (ε_i) are population quantities, we don't observe them.
- Sometimes our primary interest is the coefficients themselves: β_k measures the marginal effect of variable X_ki on the dependent variable Y_i.
- Sometimes we're more interested in predicting Y_i: if we have sample estimates of the coefficients (the β̂'s), we can calculate predicted values Ŷ_i = β̂_0 + β̂_1 X_1i + … + β̂_K X_Ki.
- In either case, we need a way to estimate the unknown β's. That is, we need a way to compute the β̂'s from a sample of data.
- It turns out there are lots of ways to estimate the β's (i.e., to compute the β̂'s). By far the most common method is called ordinary least squares (OLS).

What OLS does
- Recall that we can write Y_i = Ŷ_i + e_i, where the e_i are the residuals.
  - The residuals are the sample counterpart to the population errors ε_i.
  - They measure how far our predicted values (Ŷ_i) are from the true Y_i; think of them as prediction mistakes.
- We want to estimate the β's in a way that makes the residuals as small as possible, i.e., we want the predicted values as close to the truth as possible.
- OLS minimizes the sum of squared residuals: Σ e_i² = Σ (Y_i − Ŷ_i)², summing over i = 1, …, n (a small sketch of this objective follows below).
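To make the objective concrete, here is a minimal Python sketch (not part of the original slides) of the quantity OLS minimizes; the function name and the toy numbers are illustrative only.

```python
import numpy as np

def sum_squared_residuals(y, y_hat):
    """Sum of squared prediction mistakes -- the quantity OLS minimizes."""
    e = y - y_hat              # residuals e_i = Y_i - Y_hat_i
    return np.sum(e ** 2)

# Toy illustration: two candidate sets of predicted values for the same Y.
y = np.array([3.0, 5.0, 7.0, 9.0])
good_fit = np.array([3.1, 4.8, 7.2, 8.9])
bad_fit = np.array([5.0, 3.0, 9.0, 7.0])   # mistakes cancel on average, but are large

print(sum_squared_residuals(y, good_fit))  # 0.10 -- small
print(sum_squared_residuals(y, bad_fit))   # 16.0 -- much larger, even though the residuals sum to zero
```

The second candidate also shows why squaring matters: its residuals average out to zero, yet the fit is clearly worse, which the squared criterion detects.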

Why OLS?
- OLS is "easy": computers do it routinely, and if you had to do OLS by hand, you could.
- Minimizing squared residuals is better than just minimizing residuals:
  - We could minimize the sum (or average) of residuals, but the positive and negative residuals would cancel out, and we might end up with really bad predicted values (huge positive and negative "mistakes" that cancel out; draw a picture).
  - Squaring penalizes "big" mistakes (big e_i) more than "little" mistakes (small e_i).
  - By minimizing the sum of squared residuals, we get a zero average residual (mistake) as a bonus.
- OLS estimates are unbiased, and are the most efficient in the class of linear unbiased estimators (more about this later).

How OLS works
- Suppose we have a linear regression model with one independent variable: Y_i = β_0 + β_1 X_i + ε_i.
- The OLS estimates of β_0 and β_1 are the values that minimize the sum of squared residuals: Σ (Y_i − β̂_0 − β̂_1 X_i)².
- You all know how to solve for the OLS estimates: differentiate this expression with respect to β̂_0 and β̂_1, set the derivatives equal to zero, and solve.
- The solutions to this minimization problem are (look familiar?):
  β̂_1 = Σ (X_i − X̄)(Y_i − Ȳ) / Σ (X_i − X̄)²
  β̂_0 = Ȳ − β̂_1 X̄
  (these formulas are checked numerically in the sketch below).
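As a sanity check on the summation formulas, here is a short Python sketch (not from the slides) that computes the two estimates directly and compares them with numpy's built-in least squares fit; the simulated data and seed are arbitrary.

```python
import numpy as np

def ols_simple(x, y):
    """OLS estimates for Y_i = b0 + b1*X_i + e_i via the summation formulas."""
    x_bar, y_bar = x.mean(), y.mean()
    b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Simulated data with known coefficients, just to verify the formulas.
rng = np.random.default_rng(333)
x = rng.uniform(0, 100, size=500)
y = 2.0 + 0.5 * x + rng.normal(0, 5, size=500)

b0, b1 = ols_simple(x, y)
print(b0, b1)               # should be close to 2.0 and 0.5
print(np.polyfit(x, y, 1))  # numpy returns [slope, intercept]; should match the values above
```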

OLS in practice
- Knowing the summation formulas for the OLS estimates is useful for understanding how OLS estimation works.
  - Once we add more than one independent variable, these summation formulas become cumbersome.
  - In practice, we never do least squares calculations by hand (that's what computers are for).
- In fact, doing least squares regression in EViews is a piece of cake; time for an example.

An example
- Suppose we are interested in how an NHL hockey player's salary varies with the number of points they score.
  - It's natural to think variation in salary is related to variation in points scored.
  - Our dependent variable (Y_i) will be SALARY_USD; our independent variable (X_i) will be POINTS.
- After opening the EViews workfile, there are two ways to set up the equation:
  1. Select SALARY_USD and then POINTS (the order is important), then right-click one of the selected objects and choose OPEN -> AS EQUATION, or
  2. QUICK -> ESTIMATE EQUATION, and then in the EQUATION SPECIFICATION dialog box type: salary_usd points c (the first variable in the list is the dependent variable; the remaining variables are the independent variables, including the intercept c).
- You'll see a drop-down box for the estimation METHOD; notice that least squares (LS) is the default. Click OK.
- It's as easy as that. Your results should look like the next slide (a rough Python equivalent is also sketched below)...
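For readers without EViews, a rough equivalent of the same regression using Python's statsmodels is sketched below; the file name nhl_salaries.csv and its column names are hypothetical stand-ins for the course workfile, not part of the lecture materials.

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical data file standing in for the EViews workfile.
data = pd.read_csv("nhl_salaries.csv")

y = data["salary_usd"]                 # dependent variable
X = sm.add_constant(data["points"])    # adds the intercept term (the "c" in EViews)

results = sm.OLS(y, X).fit()           # ordinary least squares, as in the EViews LS default
print(results.summary())               # coefficients, std. errors, t-statistics, p-values
```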

Estimation Results

What the results mean
- The column labeled "Coefficient" gives the least squares estimates of the regression coefficients. So our estimated model is: SALARY_USD = 335,602 + 41,801*POINTS.
  - That is, players who scored zero points earned $335,602 on average.
  - For each point scored, players were paid an additional $41,801 on average.
  - So the "average" 100-point player was paid $4,515,702 (see the quick check below).
- The column labeled "Std. Error" gives the standard error (the square root of the sampling variance) of each regression coefficient.
  - The OLS estimates are functions of the sample data, and hence are random variables; more on their sampling distribution later.
- The column labeled "t-Statistic" is a test statistic for the null hypothesis that the corresponding regression coefficient is zero (more about this later).
- The column labeled "Prob." is the p-value associated with this test.
- Ignore the rest for now.
- Now let's see if anything changes when we add a player's age and years of NHL experience to our model.
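A quick check of the arithmetic behind the quoted prediction, using the rounded estimates from the output (illustrative only):

```python
intercept = 335_602    # estimated salary (in USD) at zero points
slope = 41_801         # estimated additional salary per point scored
print(intercept + slope * 100)   # predicted salary of a 100-point player: 4515702
```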

Another Example

What's Changed: The Intercept
- You'll notice that the estimated coefficient on POINTS and the intercept have changed. This is because they now measure different things.
- In our original model (without AGE and YEARS_EXP among the independent variables), the intercept (c) measured the average SALARY_USD when POINTS was zero ($335,602).
  - That is, the intercept estimated E(SALARY_USD | POINTS = 0).
  - This quantity puts no restriction on the values of AGE and YEARS_EXP.
- In the new model (including AGE and YEARS_EXP among the independent variables), the intercept measures the average SALARY_USD when POINTS, AGE, and YEARS_EXP are all zero ($419,897.80).
  - That is, the new intercept estimates E(SALARY_USD | POINTS = 0, AGE = 0, YEARS_EXP = 0).

What's Changed: The Slope
- In our original model (excluding AGE and YEARS_EXP), the coefficient on POINTS was an estimate of the marginal effect of POINTS on SALARY_USD, i.e., d(SALARY_USD)/d(POINTS).
  - This quantity puts no restriction on the values of AGE and YEARS_EXP (implicitly, we are allowing them to vary along with POINTS); it's a total derivative.
- In the new model (which includes AGE and YEARS_EXP), the coefficient on POINTS measures the marginal effect of POINTS on SALARY_USD holding AGE and YEARS_EXP constant, i.e., ∂(SALARY_USD)/∂(POINTS). That is, it's a partial derivative.
- The point: what your estimated regression coefficients measure depends on what is (and isn't) in your model! (The side-by-side sketch below illustrates this.)
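The contrast between the total and partial effect can also be seen by fitting both specifications side by side. This hedged sketch reuses the same hypothetical file and column names as before; the age and years_exp columns are assumed, not taken from the lecture's workfile.

```python
import pandas as pd
import statsmodels.api as sm

# Hypothetical columns: salary_usd, points, age, years_exp.
data = pd.read_csv("nhl_salaries.csv")
y = data["salary_usd"]

simple = sm.OLS(y, sm.add_constant(data[["points"]])).fit()
multiple = sm.OLS(y, sm.add_constant(data[["points", "age", "years_exp"]])).fit()

# The coefficient on points changes once age and experience enter the model,
# because it then holds those variables fixed instead of letting them vary.
print(simple.params["points"])     # total effect: age and experience free to vary with points
print(multiple.params["points"])   # partial effect: age and experience held constant
```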