Econ 326 Lecture 19.

Slides:



Advertisements
Similar presentations
There are at least three generally recognized sources of endogeneity. (1) Model misspecification or Omitted Variables. (2) Measurement Error.
Advertisements

Economics 20 - Prof. Anderson1 Multiple Regression Analysis y =  0 +  1 x 1 +  2 x  k x k + u 7. Specification and Data Problems.
Irwin/McGraw-Hill © Andrew F. Siegel, 1997 and l Chapter 12 l Multiple Regression: Predicting One Factor from Several Others.
1 Lecture 2: ANOVA, Prediction, Assumptions and Properties Graduate School Social Science Statistics II Gwilym Pryce
1Prof. Dr. Rainer Stachuletz Multiple Regression Analysis y =  0 +  1 x 1 +  2 x  k x k + u 7. Specification and Data Problems.
Econ 140 Lecture 131 Multiple Regression Models Lecture 13.
1 MF-852 Financial Econometrics Lecture 8 Introduction to Multiple Regression Roy J. Epstein Fall 2003.
Further Inference in the Multiple Regression Model Prepared by Vera Tabakova, East Carolina University.
Ordinary Least Squares
Lecture 5 Correlation and Regression
Chapter 13: Inference in Regression
Chapter 11 Simple Regression
JDS Special program: Pre-training1 Carrying out an Empirical Project Empirical Analysis & Style Hint.
1 Research Method Lecture 6 (Ch7) Multiple regression with qualitative variables ©
How do Lawyers Set fees?. Learning Objectives 1.Model i.e. “Story” or question 2.Multiple regression review 3.Omitted variables (our first failure of.
Specification Error I.
CHAPTER 14 MULTIPLE REGRESSION
Random Regressors and Moment Based Estimation Prepared by Vera Tabakova, East Carolina University.
Statistics and Econometrics for Business II Fall 2014 Instructor: Maksym Obrizan Lecture notes III # 2. Advanced topics in OLS regression # 3. Working.
Multiple Linear Regression ● For k>1 number of explanatory variables. e.g.: – Exam grades as function of time devoted to study, as well as SAT scores.
Stat 112 Notes 9 Today: –Multicollinearity (Chapter 4.6) –Multiple regression and causal inference.
I271B QUANTITATIVE METHODS Regression and Diagnostics.
Lecture 6 Feb. 2, 2015 ANNOUNCEMENT: Lab session will go from 4:20-5:20 based on the poll. (The majority indicated that it would not be a problem to chance,
Heteroscedasticity Chapter 8
Announcements Reminder: Exam 1 on Wed.
F-tests continued.
Chapter 4 Basic Estimation Techniques
Introduction to Statistics for the Social Sciences SBS200, COMM200, GEOG200, PA200, POL200, or SOC200 Lecture Section 001, Spring 2016 Room 150 Harvill.
QM222 Nov. 28 Presentations Some additional tips on the project
Multiple Regression Analysis: Further Issues
Multiple Regression Analysis: Further Issues
Chow test.
Please hand in Project 4 To your TA.
Multiple Regression Analysis: Further Issues
Testing Multiple Linear Restrictions: The F-test
Prediction, Goodness-of-Fit, and Modeling Issues
Further Inference in the Multiple Regression Model
More on Specification and Data Issues
Week 14 Chapter 16 – Partial Correlation and Multiple Regression and Correlation.
QM222 A1 Nov. 27 More tips on writing your projects
QM222 A1 How to proceed next in your project Multicollinearity
Elementary Statistics
More on Specification and Data Issues
Instrumental Variables and Two Stage Least Squares
Further Inference in the Multiple Regression Model
Introduction to Statistics for the Social Sciences SBS200, COMM200, GEOG200, PA200, POL200, or SOC200 Lecture Section 001, Spring 2016 Room 150 Harvill.
I271B Quantitative Methods
CHAPTER 26: Inference for Regression
Multiple Regression Analysis with Qualitative Information
Chapter 6: MULTIPLE REGRESSION ANALYSIS
Prepared by Lee Revere and John Large
Introduction to Statistics for the Social Sciences SBS200 - Lecture Section 001, Spring 2017 Room 150 Harvill Building 9:00 - 9:50 Mondays, Wednesdays.
Instrumental Variables and Two Stage Least Squares
Undergraduated Econometrics
Correlation and Regression-III
Serial Correlation and Heteroscedasticity in
Instrumental Variables and Two Stage Least Squares
Tutorial 1: Misspecification
Chapter 7: The Normality Assumption and Inference with OLS
Seminar in Economics Econ. 470
Product moment correlation
Multiple Regression Analysis: Further Issues
Multiple Regression Analysis with Qualitative Information
Instrumental Variables Estimation and Two Stage Least Squares
The Multiple Regression Model
Regression and Categorical Predictors
More on Specification and Data Issues
Chapter 9 Dummy Variables Undergraduated Econometrics Page 1
Serial Correlation and Heteroscedasticity in
Presentation transcript:

Econ 326 Lecture 19

Today Please turn in Problem Set 9 We will discuss the paper you read for today “Traffic Congestion and Infant Health: Evidence from EZ Pass”

April Fools! Problem Set 9 and “Traffic Congestion” reading are not due until next Wednesday. Today: Go over some hard parts on Exam 2. Final paper Schedule for the rest of the semester. Chapter 9

Exam 2 Not yet graded. Feedback?

Final Project Announcement: McMyler Prize Starting this spring, the Economics department will award the McMyler Prize to the current student who wrote the best Econometrics paper over the past two completed semesters (e.g. Spring 2014, Fall 2015). $500 cash prize If you are graduating this year, you won’t be eligible, unfortunately, since the winner will be chosen next Spring.

Final Project By now, you should have your data in Stata format and bring it with you to every lab session. If you feel you are behind, it is even more important to come and work on your data with assistance from myself and the TAs. The next 2 Stata labs will focus on problem-solving, tips for developing your empirical specifications, and tools to create tables displaying your results. The last 2 Stata labs will be devoted to student presentations (5 min. each, all required to attend.)

Phases of your final paper Handout: Format and Content Guidelines I am returning your Proposal with comments, as you can revise and include parts of it in your final paper (in Sections 1, 2, and 4, for example). The main purpose of Phase 2 was for you to learn about the related literature and figure out where your project fits into it. A shortened version of your literature review can be included in Section 2. (Summarize what is known about your topic, and describe the key approach/findings of each paper in 2 sentences, at most 3.) Phase 3 (due April 20) will form the basis of sections III and V. Phase 4: Final Paper due on May 6 (in place of a final exam)

More on Specification and Data Issues Chapter 9 Wooldridge: Introductory Econometrics: A Modern Approach, 5e

Multiple Regression Analysis: Specification and Data Issues Tests for functional form misspecification One can always test whether explanatory should appear as squares or higher order terms by testing whether such terms can be excluded Another option is to use a general specification test such as RESET Regression specification error test (RESET) The idea of RESET is to include squares and possibly higher order fitted values in the regression. Test for the exclusion of these terms. If they cannot be exluded, this is evidence for omitted higher order terms and interactions, i.e. for misspecification of functional form.

RESET test 1. Regress Y on the X’s according to your model. 2. Create the fitted values 𝑌 as a new variable (yhat) 3. Create squared and cubed versions of this variable, e.g. yhat2 and yhat3 4. Regress Y on the X’s along with yhat2 and yhat3. Q. How can we test for the joint significance of yhat2 and yhat3? A. Step 4 is your unrestricted model, Step 1 is your restricted model. Do a basic F-test with q=2.

Multiple Regression Analysis: Specification and Data Issues Example: Housing price equation The small p-value means your F-statistic is above the critical value at the 5% level, and you reject the null hypothesis that 𝑝𝑟𝑖𝑐𝑒 2 and 𝑝𝑟𝑖𝑐𝑒 3 can be excluded. Discussion Since the predicted values are linear in the X variables, 𝑝𝑟𝑖𝑐𝑒 2 and 𝑝𝑟𝑖𝑐𝑒 3 are nonlinear functions of the X variables. RESET provides little guidance as to where misspecification comes from, but it‘s a hint that you should try another specification. Add 𝑝𝑟𝑖𝑐𝑒 2 and 𝑝𝑟𝑖𝑐𝑒 3 to get the unrestricted results. Evidence for misspecification

Multiple Regression Analysis: Specification and Data Issues What do you do when you know there’s an important omitted variable that you cannot observe? Example: Natural ability, when you are trying to study the effect of education on earnings. Idea: find a proxy variable for ability which is able to control for ability differences between individuals so that the coefficients of the other variables will not be biased. A possible proxy for ability is the IQ score or some other test score.

Multiple Regression Analysis: Specification and Data Issues Using proxy variables for unobserved explanatory variables Example: Omitted ability in a wage equation General approach to using proxy variables Replace by proxy In general, the estimates for the returns to education and experience will be biased because one has omit the unobservable ability variable. Omitted variable, e.g. ability Regression of the omitted variable on its proxy

Multiple Regression Analysis: Specification and Data Issues Assumptions necessary for the proxy variable method to work The proxy is “ just a proxy“ for the omitted variable, it does not belong into the population regression, i.e. it is uncorrelated with its error The proxy variable is a “ good“ proxy for the omitted variable, i.e. using other variables in addition will not help to predict the omitted variable If the error and the proxy were correlated, the proxy would actually have to be included in the population regression function Otherwise x1 and x2 would have to be included in the regression for the omitted variable

Multiple Regression Analysis: Specification and Data Issues Under these assumptions, the proxy variable method works: Discussion of the proxy assumptions in the wage example Assumption 1: Should be fullfilled as IQ score is not a direct wage determinant; what matters is how able the person proves at work Assumption 2: Most of the variation in ability should be explainable by variation in IQ score, leaving only a small rest to educ and exper In this regression model, the error term is uncorrelated with all explanatory variables. As a consequence, all coefficients will be correctly estimated using OLS. The coefficents for the explanatory variables x1 and x2 will be correctly identified. The coefficient for the proxy va-riable may also be of interest (it is a multiple of the coefficient of the omitted variable).

Multiple Regression Analysis: Specification and Data Issues As expected, the measured return to education decreases if IQ is included as a proxy for unobserved ability. The coefficient for the proxy suggests that ability differences between indivi-duals are important (e.g. + 15 points IQ score are associated with a wage increase of 5.4 percentage points). Even if IQ score imperfectly soaks up the variation caused by ability, inclu-ding it will at least reduce the bias in the measured return to education. No significant interaction effect bet-ween ability and education.

Multiple Regression Analysis: Specification and Data Issues Using lagged dependent variables as proxy variables In many cases, omitted unobserved factors may be proxied by the value of the dependent variable from an earlier time period Example: City crime rates Including the past crime rate will at least partly control for the many omitted factors that also determine the crime rate in a given year Another way to interpret this equation is that one compares cities which had the same crime rate last year; this avoids comparing cities that differ very much in unobserved crime factors