
What is the MPC?

Learning Objectives
1. Use linear regression to establish the relationship between two variables
2. Show that the regression line is the line of best fit in a precise sense
3. Show that the line links the conditional expectations of the variables
4. Take a more formal approach to hypothesis testing

Consumption Function
The Keynesian consumption function: higher income today means higher consumption today.
C = a + b*Y
The slope b is the marginal propensity to consume (MPC).
Econometrics quantifies economic relationships: what are "a" and "b"?

Look at some data
Look at individual-level data: individual.dta
Stata: scatter cons nmwage
This gives a scatter plot with the first variable (cons) on the vertical axis and the second variable (nmwage) on the horizontal axis.

Look at data

Two Obvious Facts
1. We observe many households at different income levels, and there is clearly a positive relationship between income and consumption.
2. cons depends on income, but households with the same income will not have the same consumption: other factors also influence consumption.

How do we Calculate the MPC?
Draw a line through the data.
- Many lines are possible.
- Intuition tells us that an "average" line would be a better estimate. We will show why this intuition is correct later.
- Any line we draw (even the "best") will not go through all the points, so there will be deviations from the line.

Conditional Expectation
As an alternative to the line, we could follow the logic of the gender example from the previous section and look at conditional expectations.
Recall that we answered the question of gender discrimination by comparing the average wage of two groups:
- the expected wage conditional on being a man or a woman
- we used the "summ if" command
Formally:
- E(hwage|gender==1)=
- E(hwage|gender==2)=

Conditional Expectation
We can apply the same logic to the consumption function. Divide the sample into two groups:
- Rich: nmwage>1000
- Poor: nmwage<1000
- generate rich=(nmwage>1000)
Compare the average consumption of each group using summ if.

Conditional Expectation
We get average consumption conditional on being rich or poor:
- E(Cons|Rich)=
- E(Cons|Poor)=
We can measure the marginal propensity to consume by also taking the average income of each group:
- E(nmwage|Rich)=
- E(nmwage|Poor)=

Conditional Expectation
As you move from "poor" to "rich":
- your income rises by E(nmwage|Rich) - E(nmwage|Poor) = 661
- and your consumption rises by E(Cons|Rich) - E(Cons|Poor) = 490
So an estimate of the MPC would be 490/661, which is 0.74.
This is a simple and intuitive method that builds on the logic of the gender example. But...
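The two-group calculation can be sketched in a few lines of Python. This is only an illustration: the data are simulated rather than taken from individual.dta, and the intercept (150), slope (0.75), noise level and the helper name `two_group_mpc` are assumptions made for the example. The last line reproduces the arithmetic from the slide.

```python
import random

# Synthetic data: NOT the lecture's individual.dta. The slope 0.75 and the
# noise level are made-up values chosen for illustration only.
random.seed(0)
income = [random.uniform(200, 2000) for _ in range(2000)]
cons = [150 + 0.75 * y + random.gauss(0, 100) for y in income]


def mean(values):
    return sum(values) / len(values)


def two_group_mpc(cons, income, cutoff=1000):
    """Gap in mean consumption between rich and poor, divided by the gap in mean income."""
    rich_c = [c for c, y in zip(cons, income) if y > cutoff]
    poor_c = [c for c, y in zip(cons, income) if y <= cutoff]
    rich_y = [y for y in income if y > cutoff]
    poor_y = [y for y in income if y <= cutoff]
    return (mean(rich_c) - mean(poor_c)) / (mean(rich_y) - mean(poor_y))


mpc_hat = two_group_mpc(cons, income)
print(round(mpc_hat, 2))

# The slide's own numbers: consumption gap 490, income gap 661.
print(round(490 / 661, 2))  # 0.74
```

On the simulated data the estimate should land close to the assumed slope, which is the sense in which the ratio of the two gaps estimates the MPC.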

Obvious Problem
The division between rich and poor was entirely arbitrary.
- It is not natural like gender.
We throw away information by forcing individuals into one group or another.
Why not have 3 groups, or any number of groups you like?
- Intuitively, the more the better
- 10 group example
But a large number of groups would make the calculations tedious and would still leave out some information.

10 Income Groups
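A sketch of the ten-group version, again on simulated data rather than individual.dta. The data-generating values and the helper `group_means` are assumptions for the example. Households are sorted by income, cut into ten equal-sized groups, and the mean income and mean consumption of each group are printed. The slope between neighbouring group means is a local estimate of the MPC.

```python
import random

random.seed(1)
# Synthetic stand-in for the lecture's data; the parameters are assumptions.
income = [random.uniform(200, 2000) for _ in range(5000)]
cons = [150 + 0.75 * y + random.gauss(0, 100) for y in income]


def group_means(cons, income, n_groups=10):
    """Sort by income, cut into equal-sized groups, return (mean income, mean cons) pairs."""
    pairs = sorted(zip(income, cons))
    size = len(pairs) // n_groups
    means = []
    for g in range(n_groups):
        start = g * size
        # The last group absorbs any leftover observations.
        end = (g + 1) * size if g < n_groups - 1 else len(pairs)
        chunk = pairs[start:end]
        mean_y = sum(y for y, _ in chunk) / len(chunk)
        mean_c = sum(c for _, c in chunk) / len(chunk)
        means.append((mean_y, mean_c))
    return means


means = group_means(cons, income)
for mean_y, mean_c in means:
    print(round(mean_y), round(mean_c))

# Slope between neighbouring group means: a local estimate of the MPC.
slopes = [(c2 - c1) / (y2 - y1) for (y1, c1), (y2, c2) in zip(means, means[1:])]
print([round(s, 2) for s in slopes])
```

With more groups each local slope is noisier because each group mean rests on fewer observations. That is one way to see why a single line through all the conditional means is an attractive compromise.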

Compromise
Imagine there is an infinity of groups, but the conditional means are all related.
Specifically, they have a linear relationship:
- E(cons|nmwage) = a + b*nmwage
From now on we will write this in more general notation:
- $E(Y \mid X) = \beta_1 + \beta_2 X$

Comment
Note that this is a restriction, and it may not be true in the real world.
We impose it on the model.
- It looks reasonable in the consumption example
If it isn't true then there might be a problem:
- the line is only a linear approximation
- GIGO (garbage in, garbage out)
The relationship doesn't have to be linear, but it does have to be parametric.
- We will see more on this later

So to Recap…
We have data that appear to illustrate a relationship between two variables.
Intuitively, we will put a line through the data that represents the data in some way.
What way? Two ways:
1. The line links all the conditional means
2. We choose the particular line that is closest to the data in a defined way
These turn out to be the same.

Draw a line to represent the data Show three data points for illustration

An Explanation
Change in notation to be more general:
- Y is the LHS or dependent variable
- X is the RHS or independent variable
$E(Y \mid X_i)$ is the conditional mean, so it does not describe every observation:
- $Y_i = E(Y \mid X_i) + u_i$
- $u_i$ represents the deviation of each individual observation from the conditional mean
Substituting the linear conditional mean:
$Y_i = \beta_1 + \beta_2 X_i + u_i$

What is $u_i$?
Any factor other than income (X) which influences consumption (Y):
- individual tastes and unpredictability
- approximation error because of the assumption of a linear relationship
Later we will model this as a random variable, perhaps with a normal distribution.
- Remember our warnings about the bell curve

OLS Estimation
Find the line of "best fit".
Use the method of Ordinary Least Squares (OLS) to estimate $\beta_1$ and $\beta_2$.
Objective: find estimates of $\beta_1$ and $\beta_2$ that minimise the distance between the regression line and the actual data points, i.e. minimise the error terms.
Minimise the sum of squared deviations, i.e. $\min \sum_i u_i^2$
- Aside: why not absolute deviations or others?

Algebra of OLS
$\min \sum_i u_i^2$, i.e. $\min (u_1^2 + u_2^2 + u_3^2 + \dots + u_n^2)$
$Y_i = \beta_1 + \beta_2 X_i + u_i \Rightarrow u_i = Y_i - \beta_1 - \beta_2 X_i$
$\sum_i u_i^2 = \sum_i (Y_i - \beta_1 - \beta_2 X_i)^2 = S(\beta_1, \beta_2)$
So the sum of squared errors is a function of $\beta_1$ and $\beta_2$:
$\min S(\beta_1, \beta_2) = \min \sum_i (Y_i - \beta_1 - \beta_2 X_i)^2$

To find the minimum of a function: differentiate with respect to each argument and set the derivative equal to 0, i.e. find the point where the slope with respect to each argument is 0.
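For reference, the differentiation step can be written out as follows. This is the standard textbook derivation, not a transcription of the slide. Here $S(\beta_1, \beta_2) = \sum_i (Y_i - \beta_1 - \beta_2 X_i)^2$ is the sum of squared errors, $b_1$ and $b_2$ denote the values that set both derivatives to zero, and $\bar{X}$, $\bar{Y}$ are the sample means.

```latex
\begin{aligned}
\frac{\partial S}{\partial \beta_1}\bigg|_{(b_1, b_2)}
  &= -2 \sum_i \left( Y_i - b_1 - b_2 X_i \right) = 0 \\
\frac{\partial S}{\partial \beta_2}\bigg|_{(b_1, b_2)}
  &= -2 \sum_i X_i \left( Y_i - b_1 - b_2 X_i \right) = 0 \\[4pt]
\text{From the first condition:}\quad
  b_1 &= \bar{Y} - b_2 \bar{X} \\
\text{Substituting into the second:}\quad
  b_2 &= \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_i (X_i - \bar{X})^2}
\end{aligned}
```

The two first-order conditions also say that the fitted deviations sum to zero and are uncorrelated with X in the sample.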

An Explanation
$b_1$, $b_2$ are the Ordinary Least Squares (OLS) estimators of the true population parameters $\beta_1$, $\beta_2$.
- $b_2$ is the estimator of the slope coefficient: the slope coefficient measures the effect on Y of a one-unit change in X
- $b_1$ is the estimator of the intercept: the value of Y which occurs if X=0
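A minimal sketch of the estimators in Python, using the usual closed-form expressions for simple regression (slope = sum of cross-deviations divided by sum of squared X deviations, intercept = mean of Y minus slope times mean of X). The four data points and the function names `ols` and `ssr` are made up for the example so that the result can be checked by hand.

```python
def ols(x, y):
    """Return (b1, b2): OLS intercept and slope for y = b1 + b2*x + u."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = s_xy / s_xx
    b1 = y_bar - b2 * x_bar
    return b1, b2


def ssr(x, y, b1, b2):
    """Sum of squared deviations of the data from the line b1 + b2*x."""
    return sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))


# Tiny made-up data set, small enough to verify by hand.
x = [1, 2, 3, 4]
y = [3, 5, 4, 8]

b1, b2 = ols(x, y)
print(round(b1, 3), round(b2, 3))  # 1.5 1.4

# Any other line has a larger sum of squared deviations.
print(ssr(x, y, b1, b2) < ssr(x, y, b1, b2 + 0.1))  # True
print(ssr(x, y, b1, b2) < ssr(x, y, b1 + 0.5, b2))  # True
```

Here the means are 2.5 and 5, the cross-deviations sum to 7 and the squared X deviations sum to 5, so the slope is 7/5 = 1.4 and the intercept is 5 - 1.4*2.5 = 1.5.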

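OLS in Stata
regress cons nmwage
[Stata regression output: the analysis-of-variance table (Model, Residual, Total with SS, df, MS), the summary statistics (Number of obs, F(1, 1328), Prob > F, R-squared, Adj R-squared, Root MSE) and the coefficient table (Coef., Std. Err., t, P>|t|, 95% Conf. Interval) for nmwage and _cons. Slide annotations mark the estimated coefficients and the residual sum of squares, $\sum_i u_i^2$.]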

The Answer
The regression gives us a measure of the MPC.
The OLS estimate of the MPC is the estimated coefficient on nmwage.
What use is this?
- Prediction
- Causation
- Statistical inference

Prediction
We can use the regression to make predictions.
What would consumption be if income were 2500?
- Cons = $b_1 + b_2 \times 2500$, using the estimated coefficients
- This is equal to 1953
Be careful: this is the predicted conditional mean.
- It is the next point on the line
- It is what people with an income of 2500 would consume on average
- What they will actually consume is unknown because we don't observe their $u_i$
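The prediction step is just the fitted line evaluated at a new income. The sketch below uses hypothetical coefficients (intercept 100, slope 0.75) that are not the lecture's estimates, so its result differs from the 1953 quoted on the slide. The function name `predict_mean` and the residual of -120 are likewise made up.

```python
# Hypothetical coefficients for illustration; NOT the lecture's estimates.
B1 = 100.0  # intercept (assumed)
B2 = 0.75   # slope, i.e. the MPC (assumed)


def predict_mean(income, b1=B1, b2=B2):
    """Predicted conditional mean of consumption at a given income."""
    return b1 + b2 * income


prediction = predict_mean(2500)
print(prediction)  # 1975.0

# An individual household's consumption differs from the prediction by its
# own unobserved u_i; here u_i = -120 is a made-up value.
actual = prediction + (-120.0)
print(actual)  # 1855.0
```

The first number is what households with that income consume on average under the assumed line. The second shows why a single household's consumption need not match it.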

Predicted Consumption
[Figure: predicted consumption (Predicted Cons) plotted together with actual consumption]

Causation
Remember that all this only really identifies variables that move together.
It doesn't show causation. We need theory for that.
- Obvious in the gender example (wages don't cause changes in gender)
- Not obvious here: causation can run both ways

Statistical Inference
This estimate is generated from a sample.
Recall that the issue is whether we can use this fact about the sample to make statements about the world (the "population").
The same issues of statistical inference arise in the context of regression.
- OLS estimates are sample statistics, just like the sample average wages in the gender example

More on the Residual ($u_i$)
The residual is the difference between the line (the conditional expectation) and the actual data.
Think of every individual's consumption as being made up of two bits:
- Conditional expectation
- Residual
The conditional expectation is the same for everyone with the same X (income).
The residual is potentially different even for those with the same income.
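The split into a conditional-expectation part and a residual part can be illustrated with a fitted line. The sketch below uses simulated data (the data-generating values are assumptions, not the lecture's) and the standard closed-form OLS formulas. It then checks two properties of residuals from a fitted OLS line with an intercept: they sum to zero, and they are uncorrelated with X in the sample.

```python
import random

random.seed(7)
# Synthetic (income, consumption) pairs; the parameters are assumed.
x = [random.uniform(200, 2000) for _ in range(200)]
y = [150 + 0.75 * xi + random.gauss(0, 100) for xi in x]


def ols(x, y):
    """Return (b1, b2): OLS intercept and slope for y = b1 + b2*x + u."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = s_xy / s_xx
    return y_bar - b2 * x_bar, b2


b1, b2 = ols(x, y)
fitted = [b1 + b2 * xi for xi in x]                  # conditional-expectation part
residuals = [yi - fi for yi, fi in zip(y, fitted)]   # what is left over

# Both sums are zero up to floating-point rounding.
sum_resid = sum(residuals)
sum_x_resid = sum(xi * ei for xi, ei in zip(x, residuals))
print(abs(sum_resid) < 1e-6, abs(sum_x_resid) < 1e-3)
```

Two households with the same income share the same fitted value, so any difference in their consumption shows up entirely in their residuals.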

Random Variable
The residual is unknown in advance, so we model it as a random variable.
Think of consumption as being determined by a systematic bit plus a roll of the dice.
See the diagram:
- Actual consumption (expectation + residual) is distributed around the mean
- All the means are linked

Each distribution is a slice in the data

Distribution of Y for two different “slices” of X

Empirical Distribution
We can use the hist command in Stata to look at this, just as we got the distribution of hwage for men and women:
hist cons, by(rich) norm
We could do the same for any income group:
- hist cons if nmwage 900, norm
All OLS does is draw a line through all the means.
Imagine laying all these distributions side by side.

The “Slice” Around nmwage=1000

[Figure: distribution of Y, f(Y|X), centred on E(Y|X) at X=600, X=900 and X=1200]

Putting it all together
We usually assume that the residual is a normal random variable.
- This seems reasonable in this case
- But remember our concerns about the normal distribution
So the full model is:
- $Y_i = \beta_1 + \beta_2 X_i + u_i$
- where $E(Y \mid X_i) = \beta_1 + \beta_2 X_i$
- and $u_i \sim N(0, \sigma^2)$
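One way to see the full model at work is to simulate from it and check that OLS recovers the parameters. Everything in this sketch is an assumption made for illustration: the "true" values (intercept 150, slope 0.75, error standard deviation 100), the sample size and the uniform distribution of X. None of it comes from the lecture's data.

```python
import random

random.seed(42)

# Assumed "true" parameters for the simulation; none come from the lecture.
BETA_1, BETA_2, SIGMA = 150.0, 0.75, 100.0

n = 2000
x = [random.uniform(200, 2000) for _ in range(n)]
# Y_i = beta_1 + beta_2 * X_i + u_i with u_i drawn from N(0, sigma^2)
y = [BETA_1 + BETA_2 * xi + random.gauss(0, SIGMA) for xi in x]


def ols(x, y):
    """Return (b1, b2): OLS intercept and slope for y = b1 + b2*x + u."""
    m = len(x)
    x_bar = sum(x) / m
    y_bar = sum(y) / m
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = s_xy / s_xx
    return y_bar - b2 * x_bar, b2


b1, b2 = ols(x, y)

# Estimate sigma from the residuals, dividing by n - 2 because two
# parameters were estimated.
residuals = [yi - b1 - b2 * xi for xi, yi in zip(x, y)]
sigma_hat = (sum(e * e for e in residuals) / (n - 2)) ** 0.5

print(round(b1, 1), round(b2, 3), round(sigma_hat, 1))
```

The printed estimates should sit near the assumed values without matching them exactly, which is the point of the statistical inference discussion: the estimates are sample statistics.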