Lecture 7 (Ch14) Pooled Cross Sections and Simple Panel Data Methods


Research Method Lecture 7 (Ch14): Pooled Cross Sections and Simple Panel Data Methods

An independently pooled cross section This type of data is obtained by sampling randomly from a population at different points in time (usually in different years). You can pool the data from different years and run regressions. However, you usually include year dummies.

Panel data These are cross-section data collected at different points in time, but they follow the same individuals over time. You can do a bit more with panel data than with a pooled cross section. You usually include year dummies as well.

Pooling independent cross sections across time As long as the data are collected independently, pooling them over time causes few problems. However, the distributions of the independent variables may change over time; for example, the distribution of education changes over time. To account for such changes, you usually include dummy variables for each year (year dummies), except for one year chosen as the base year. Often the coefficients on the year dummies are themselves of interest.
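The year-dummy bookkeeping described above can be sketched in pure Python; the survey years, records, and field names below are made up for illustration.

```python
# Build year dummies for a pooled cross section, omitting the base year.
base_year = 1972
years = [1972, 1974, 1976]           # hypothetical survey years
records = [                          # hypothetical pooled observations
    {"year": 1972, "kids": 3},
    {"year": 1974, "kids": 2},
    {"year": 1976, "kids": 2},
]

for rec in records:
    for yr in years:
        if yr == base_year:
            continue                 # the base year gets no dummy
        rec[f"y{yr}"] = 1 if rec["year"] == yr else 0

# Each record now carries y1974 and y1976 dummies but no y1972 dummy.
```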

Example 1 Suppose you want to examine changes in the fertility rate over time after controlling for various characteristics. The next slide shows the OLS estimates of the determinants of fertility over time (data: FERTIL1.dta). The data are collected every other year, and the base year for the year dummies is 1972.

Dependent variable: number of kids per woman

The number of children a woman has in 1982 is 0.49 less than in the base year. A similar result is found for 1984. The year dummies show significant drops in the fertility rate over time.

Example 2 CPS78_85.dta contains wage data collected in 1978 and 1985. We estimate an earnings equation that includes education, experience, experience squared, a union dummy, a female dummy, and a year dummy for 1985. Suppose you want to see whether the gender gap has changed over time; you then include an interaction between female and year85, that is, you estimate the following.

log(wage) = β0 + β1(educ) + β2(exper) + β3(expersq) + β4(union) + β5(female) + β6(year85) + β7(year85)(female) + u
You can check whether the gender wage gap in 1985 differs from the base year (1978) by testing whether β7 equals zero. The gender gap in each period is given by:
- gender gap in the base year (1978) = β5
- gender gap in 1985 = β5 + β7

The coefficient on the interaction term (year85)(female) is positive and significant at the 10% level. So the gender gap appears to have narrowed over time: gender gap in 1978 = -0.319; gender gap in 1985 = -0.319 + 0.088 = -0.231.
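Using the coefficients reported above, the implied gap in each year is simple arithmetic:

```python
# Gender wage gap implied by the estimates on this slide.
beta5 = -0.319   # coefficient on female (gap in the 1978 base year)
beta7 = 0.088    # coefficient on the (year85)(female) interaction

gap_1978 = beta5
gap_1985 = beta5 + beta7   # 1985 gap = base-year gap + interaction

print(round(gap_1978, 3), round(gap_1985, 3))
```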

Policy analysis with pooled cross sections: the difference-in-difference estimator I explain a typical policy analysis with pooled cross-section data, called difference-in-difference estimation, using an example.

Example: Effects of a garbage incinerator on housing prices This example is based on a study of housing prices in North Andover, Massachusetts. The rumor that a garbage incinerator would be built in North Andover began after 1978, and construction of the incinerator began in 1981. You want to examine whether the incinerator affected housing prices.

Our hypothesis is the following. Hypothesis: The price of houses located near the incinerator would fall relative to the price of more distant houses. For illustration, define a house to be near the incinerator if it is within 3 miles, and create the following dummy variable: nearinc = 1 if the house is 'near' the incinerator, = 0 otherwise.

The most naïve analysis would be to run the following regression using only the 1981 data: price = β0 + β1(nearinc) + u where price is the real price (i.e., deflated using the CPI to express it in 1978 constant dollars). Using KIELMC.dta, the result is the following. But can we conclude from this estimation that the incinerator negatively affected housing prices?

To see this, estimate the same equation using the 1978 data. Note this is before the rumor of the incinerator began. The price of houses near the location where the incinerator was to be built was already lower than that of houses farther away. So the negative coefficient simply means that the garbage incinerator was built in a location where housing prices were low.

Now, compare the two regressions (year 1978 and year 1981). Compared to 1978, the price penalty for houses near the incinerator is greater in 1981. Perhaps the increase in the price penalty in 1981 is caused by the incinerator. This is the basic idea of the difference-in-difference estimator.

The difference-in-difference estimator in this example may be computed as follows (I will show the more general case later on):
difference-in-difference estimator = (coefficient for nearinc in 1981) - (coefficient for nearinc in 1978) = -30,688.27 - (-18,824.37) = -11,863.9
So the incinerator decreased house prices on average by $11,864.

Note that, in this example, the coefficient for nearinc in 1978 equals (average price of houses near the incinerator) - (average price of houses not near the incinerator), because the regression includes only one dummy variable (just recall Ex.1 of homework 2); the same holds for 1981. Therefore, the difference-in-difference estimator in this example can be written as a difference between two such differences of group averages. This is the reason why the estimator is called the difference-in-difference estimator.

Difference-in-difference estimator: more general case The difference-in-difference estimator can be obtained by running the following single equation on the pooled sample:
price = β0 + β1(nearinc) + β2(year81) + δ1(year81)(nearinc) + u
where δ1 is the difference-in-difference estimator.

This form is more general since, in addition to the policy dummy (nearinc), you can include more variables that affect the housing price, such as the number of bedrooms. When you include more variables, δ1 can no longer be expressed in a simple difference-in-difference format. However, the interpretation does not change, and therefore it is still called the difference-in-difference estimator.
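A minimal pure-Python sketch, with made-up prices, of the equivalence just described: in the regression with only the two dummies and their interaction, the interaction coefficient δ1 equals the difference of group-mean differences.

```python
# DiD via OLS (normal equations solved by Gaussian elimination) versus
# DiD via group means. All prices below are hypothetical, in $1000s.

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def ols(X, y):
    """OLS coefficients via the normal equations X'X b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

# Hypothetical pooled sample: (nearinc, year81, price)
sample = [
    (0, 0, 100.0), (0, 0, 110.0), (1, 0, 80.0), (1, 0, 90.0),
    (0, 1, 130.0), (0, 1, 140.0), (1, 1, 95.0), (1, 1, 105.0),
]
X = [[1.0, n, y81, n * y81] for n, y81, _ in sample]
y = [p for _, _, p in sample]
b0, b1, b2, d1 = ols(X, y)

# Group-mean version of the same estimator.
def cell(n, y81):
    prices = [p for ni, yi, p in sample if ni == n and yi == y81]
    return sum(prices) / len(prices)

did = (cell(1, 1) - cell(0, 1)) - (cell(1, 0) - cell(0, 0))
print(round(d1, 4), round(did, 4))
```

With only the two dummies and their interaction, the regression is saturated, so the fitted values reproduce the four cell means exactly and the two computations must agree.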

Natural experiment (or quasi-experiment) The difference-in-difference estimator is frequently used to evaluate the effect of government policy. Often a government policy affects one group of people while leaving another group unaffected. This type of policy change is called a natural experiment. For example, the change in the spousal tax deduction system in Japan, which took place in 1995, affected married couples but did not affect single people.

The group of people affected by the policy is called the treatment group; those not affected are called the control group. Suppose you want to know how the change in the spousal tax deduction affected the hours worked by women, and suppose you have pooled data on workers in 1994 and 1995. The next slide shows the typical procedure for conducting the difference-in-difference analysis.

Step 1: Create the treatment dummy such that Dtreat = 1 if the person is affected by the policy change, = 0 otherwise. Step 2: Run the following regression:
(Hours worked) = β0 + β1(Dtreat) + β2(year95) + δ1(year95)(Dtreat) + u
where δ1 is the difference-in-difference estimator: it shows the effect of the policy change on women's hours worked.

Two-period panel data analysis Motivation: remember the effects of the employee training grant on the scrap rate. You estimated the following model for the 1987 data and did not find evidence that receiving the grant reduces the scrap rate.

The reason we did not find a significant effect is probably an endogeneity problem: companies with low-ability workers tend to apply for the grant, which creates a positive bias in the estimation. If you observed the average ability of the workers, you could eliminate the bias by including the ability variable. But since you cannot observe ability, you have the following situation, where ability is in the error term v: v = (β3ability + u) is called the composite error term.

Because ability and grant are (negatively) correlated, this causes a bias in the coefficient for (grant). We predicted the direction of the bias in the following way: the estimated coefficient equals the true effect of the grant plus a bias term, which is the effect of ability on the scrap rate times a term whose sign is determined by the correlation between ability and grant. The true negative effect of the grant is cancelled out by the positive bias term; thus, the bias makes it difficult to find the effect.
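The verbal argument above is the standard omitted-variable bias formula; a sketch in the slide's notation (treating v = β3·ability + u as the composite error):

```latex
\operatorname{E}\bigl[\hat{\beta}_{grant}\bigr]
  = \beta_{grant} + \beta_{3}\,\tilde{\delta},
\qquad
\tilde{\delta}
  = \frac{\operatorname{Cov}(grant,\ ability)}{\operatorname{Var}(grant)} .
```

Here β3 < 0 (higher ability lowers the scrap rate) and Cov(grant, ability) < 0, so the bias term β3·δ̃ is positive and offsets the true negative effect of the grant.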

Now you know that there is a bias. Is there anything we can do to correct for it? When you have panel data, we can eliminate the bias. I will explain the method using this example and generalize it later.

Eliminating bias using two-period panel data Now, go back to the equation. The grant is administered in 1988. Suppose you have a panel of firms for two periods, 1987 and 1988. Further assume that the average ability of workers does not change over time, so (ability) is interpreted as the innate ability of workers, such as IQ.

When you have the two-period panel data, the equation can be written as
yit = β0 + β1(grant)it + β4(ability)i + uit
where i indexes the firm and t indexes the period. Since ability is constant over time, (ability) carries only the i index. Now use a shorthand notation for β4(ability)i: since (ability) is assumed constant over time, write β4(ability)i = ai. Then the above equation can be written as
yit = β0 + β1(grant)it + ai + uit

ai is called the fixed effect, or the unobserved effect. If you want to emphasize that it is an unobserved firm characteristic, you can call it the firm fixed effect as well. uit is called the idiosyncratic error. Now, the bias in OLS occurs because the fixed effect is correlated with (grant), so if we can get rid of the fixed effect, we can eliminate the bias. This is the basic idea. In the next slide, I show the procedure known as first-differenced estimation.

First, for each firm, take the first difference, that is, compute
∆yi = yi,1988 - yi,1987, ∆(grant)i = (grant)i,1988 - (grant)i,1987, ∆ui = ui,1988 - ui,1987
It follows that
∆yi = β1∆(grant)i + ∆ui
since both β0 and ai drop out. This is the first-differenced equation.

So, by taking the first difference, you eliminate the fixed effect. If ∆uit is not correlated with ∆(grant)it, estimating the first-differenced model by OLS produces unbiased estimates; if we have controlled for enough time-varying variables, it is reasonable to assume they are uncorrelated. Note that this model does not have a constant. Now, estimate this model using JTRAIN.dta.
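For a single regressor, OLS through the origin on the differenced data has a closed form; a sketch with made-up firm data (the numbers are hypothetical, with y standing in for the scrap rate):

```python
# First-differenced estimator for a two-period panel with one regressor:
# regress dy on dx with no constant, so b1 = sum(dx*dy) / sum(dx*dx).

firms = [
    # (y_1987, y_1988, grant_1988); no firm had a grant in 1987
    (1.20, 0.80, 1),
    (0.90, 0.95, 0),
    (1.50, 1.00, 1),
    (0.70, 0.75, 0),
]

dy = [y88 - y87 for y87, y88, _ in firms]
dx = [g88 - 0 for _, _, g88 in firms]   # grant_1988 - grant_1987

b1 = sum(x * y for x, y in zip(dx, dy)) / sum(x * x for x in dx)
print(round(b1, 3))
```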

When you use the `nocons' option, Stata omits the constant term. Now the coefficient on grant is negative and significant at the 10% level.

Note that, when you use this method in your research, it is a good idea to tell your audience what the potential fixed effect would be and whether it is correlated with the explanatory variables. In this example, unobserved ability is potentially an important source of the fixed effect. Of course, one can never tell exactly what the fixed effect is, since it is the aggregate of all the unobserved effects. However, if you explain what is contained in the fixed effect, your audience can understand the potential direction of the bias and why you need to use the first-differenced method.

General case The model in a more general situation can be written as
Yit = β0 + β1xit1 + β2xit2 + … + βkxitk + ai + uit
where ai is the fixed effect. If ai is correlated with any of the explanatory variables, the estimated coefficients will be biased. So take the first difference to eliminate ai, then estimate the following model by OLS:
∆Yit = β1∆xit1 + β2∆xit2 + … + βk∆xitk + ∆uit

Note that when you take the first difference, the constant term is also eliminated, so you should use the `nocons' option in Stata when you estimate the model. When some variables are time-invariant, they are eliminated as well; if the treatment variable does not change over time, you cannot use this method.

First differencing for more than two periods You can use first differencing with more than two periods; you just difference adjacent periods successively. For example, suppose you have 3 periods. Then, for the dependent variable, compute ∆yi2 = yi2 - yi1 and ∆yi3 = yi3 - yi2. Do the same for the x-variables, then run the regression.
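The successive differencing just described can be sketched in pure Python (the unit IDs, years, and values below are made up):

```python
# For each unit, difference adjacent periods, then pool the differenced rows.
panel = {  # unit id -> [(year, y, x), ...], sorted by year
    1: [(1980, 10.0, 0), (1981, 12.0, 1), (1982, 11.0, 1)],
    2: [(1980,  8.0, 0), (1981,  8.5, 0), (1982,  9.0, 1)],
}

diffed = []
for uid, rows in panel.items():
    for (t0, y0, x0), (t1, y1, x1) in zip(rows, rows[1:]):
        diffed.append((uid, t1, y1 - y0, x1 - x0))

# Two units x two adjacent-period pairs = 4 differenced observations.
```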

Exercise The data set ezunem.dta contains city-level unemployment claim statistics for the state of Indiana, along with information about whether each city has an enterprise zone. An enterprise zone is an area that encourages business and investment through reduced taxes and restrictions. Enterprise zones are usually created in economically depressed areas with the purpose of increasing economic activity and reducing unemployment.

Using ezunem.dta, you are asked to estimate the effect of enterprise zones on city-level unemployment claims. Use the log of unemployment claims as the dependent variable. Ex1. First estimate the following model using OLS:
log(unemployment claims)it = β0 + β1(Enterprise zone)it + β(year dummies)it + vit
Discuss whether the coefficient for the enterprise zone is biased or not. If you think it is biased, what is the direction of the bias? Ex2. Estimate the model using the first-difference method. Did it change the result? Was your prediction of the bias correct?

OLS results

First differencing

The do file used to generate the results.

tsset city year
reg luclms ez d81 d82 d83 d84 d85 d86 d87 d88
gen lagluclms = luclms - L.luclms
gen lagez = ez - L.ez
gen lagd81 = d81 - L.d81
gen lagd82 = d82 - L.d82
gen lagd83 = d83 - L.d83
gen lagd84 = d84 - L.d84
gen lagd85 = d85 - L.d85
gen lagd86 = d86 - L.d86
gen lagd87 = d87 - L.d87
gen lagd88 = d88 - L.d88
reg lagluclms lagez lagd81 lagd82 lagd83 lagd84 lagd85 lagd86 lagd87 lagd88, nocons

The assumptions for the first-difference method Assumption FD1 (Linearity): For each i, the model is written as
yit = β0 + β1xit1 + … + βkxitk + ai + uit

Assumption FD2: We have a random sample from the cross section. Assumption FD3: There is no perfect collinearity; in addition, each explanatory variable changes over time for at least some i in the sample.

Assumption FD4 (Strict exogeneity): E(uit|Xi, ai) = 0 for each i, where Xi is shorthand for all the explanatory variables for the ith individual over all time periods. This means that uit is uncorrelated with the current year's explanatory variables as well as with other years' explanatory variables.

The unbiasedness of first difference method Under FD1 through FD4, the estimated parameters for the first difference method are unbiased.

Assumption FD5 (Homoskedasticity): Var(∆uit|Xi) = σ2 Assumption FD6 (No serial correlation within the ith individual): Cov(∆uit, ∆uis) = 0 for t ≠ s. Note that FD2 assumes random sampling across different individuals but does not assume randomness within each individual, so you need an additional assumption to rule out serial correlation.