LT5: Review Sam Marden
1. Working with summary data
2. More Stats Refresher (b)
3. Panel Data (a)
We are trying to learn whether the Aid to Families with Dependent Children (AFDC) program (which provided block grants to states to support programs targeted at low-income women with children) affected birth weights. You run the OLS regression:
LowBirthWeight = a + b*AFDCPct + u
where AFDCPct is the share of the state's population on AFDC-supported welfare programs and LowBirthWeight is the percentage of children born with low birth weight.
i. What do you expect b_hat to be? Why?
The causal effect is maybe weakly negative. But there is omitted variable bias (OVB): AFDCPct is correlated with poverty, and because cov(poverty, LowBirthWeight) > 0 and cov(poverty, AFDCPct) > 0, b_hat will be biased upwards. The bias is probably stronger than the causal effect.
ii. Do you think this is likely to be the causal effect of the welfare program?
Unlikely, because of the OVB described in (i): b_hat picks up the effect of poverty as well as the effect of the program.
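The OVB argument can be checked with a small simulation. This is only a sketch: the effect sizes (b_true, g_true), the 0.8 loading of AFDCPct on poverty, and the standardised units are all made-up assumptions, not estimates from any data.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible


def cov(x, y):
    """Sample covariance of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


n = 5000
b_true = -0.2  # assumed weakly negative causal effect of AFDCPct
g_true = 1.0   # assumed positive effect of poverty on LowBirthWeight

# Poverty is unobserved and raises AFDC participation: cov(poverty, AFDCPct) > 0.
poverty = [random.gauss(0, 1) for _ in range(n)]
afdc_pct = [0.8 * p + random.gauss(0, 1) for p in poverty]
low_birth_weight = [
    b_true * a + g_true * p + random.gauss(0, 1)
    for a, p in zip(afdc_pct, poverty)
]

# Short regression of LowBirthWeight on AFDCPct alone (poverty omitted).
b_hat = cov(afdc_pct, low_birth_weight) / cov(afdc_pct, afdc_pct)

# OVB formula: bias = g * cov(poverty, AFDCPct) / var(AFDCPct).
ovb = g_true * cov(poverty, afdc_pct) / cov(afdc_pct, afdc_pct)

print(f"true b = {b_true}, b_hat = {b_hat:.3f}, b + OVB = {b_true + ovb:.3f}")
```

Under these assumed numbers the upward bias outweighs the negative causal effect, so b_hat comes out positive even though the true b is negative.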
3. Panel Data (b)
Like a boss, you add some controls for doctors per capita, hospital beds per capita and income.
i. What is the (likely) causal effect of each of these variables?
ii. How is your estimate of b_hat likely to change when you control for each of these factors?
iii. What would need to be true for the new estimate of b_hat to be a consistent estimator of the program's effect?
iv. Suppose you add state fixed effects. What problem do they help solve? What would you expect to happen to b_hat when you include state FE?
Part i: common sense.
Part ii: think about OVB.
Part iii: cov(x, e) = 0 (what does this mean?).
Part iv: state FE take care of all time-invariant differences between states, so b is identified only off ‘within’-state variation. It is not clear what the direction of the change should be.
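What ‘within’ variation buys you can be sketched on a simulated state-year panel. Everything here is hypothetical: the panel dimensions, the effect sizes, and in particular the assumption that the time-invariant state effect pushes both x and y up, which is what makes pooled OLS too high in this example (in real data the direction is not known in advance).

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible


def slope(x, y):
    """OLS slope from a regression of y on x with an intercept."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def demean_by_group(values, groups):
    """Subtract each group's mean from its observations (within transformation)."""
    totals, counts = {}, {}
    for v, g in zip(values, groups):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


n_states, n_years = 50, 20
b_true = -0.2  # assumed causal effect of x (think AFDCPct) on y

state, x, y = [], [], []
for s in range(n_states):
    alpha = random.gauss(0, 1)  # time-invariant state effect, unobserved
    for _ in range(n_years):
        x_st = alpha + random.gauss(0, 1)                        # x correlated with alpha
        y_st = b_true * x_st + 2.0 * alpha + random.gauss(0, 1)  # alpha also shifts y
        state.append(s)
        x.append(x_st)
        y.append(y_st)

b_pooled = slope(x, y)  # ignores the state effects
b_within = slope(demean_by_group(x, state), demean_by_group(y, state))  # state FE

print(f"pooled: {b_pooled:.3f}, within (state FE): {b_within:.3f}, true: {b_true}")
```

Demeaning by state is numerically the same as including a dummy for every state, which is why the state effect drops out of the within estimate.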
Question 4: The Wald Estimator (a)
What is the meaning of E[y_i^c | T]?
E: the expectation of, i.e. the ‘population’ mean.
y_i^c: test scores for school i if it were not treated.
T: conditional on being part of the treated group.
So E[y_i^c | T] is the expected test score of a school in the treated group, had it not got the treatment.
Question 4: The Wald Estimator (b)
What does Ê[y_i^c | C] mean? It's the sample analogue of “the expected test score of a school in the control group, had it not got the treatment.”
What is the value of Ê[y_i^c | C]? 60
Question 4: The Wald Estimator (c)
Question 4: The Wald Estimator (d)
With random assignment of schoolbooks within the treatment and the control group we obtain the ATE (think about why this is true). How would our estimates of the ATE
1. Be biased if only the control schools with books were a non-random sample (within the control group)?
2. Be biased if only the ‘treated’ schools without books were a non-random sample (within the treatment group)?
3. What is the overall bias?
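For reference, the Wald estimator divides the difference in average test scores between the two groups by the difference in the share of schools that actually have books. The sketch below uses invented scores and book indicators for four schools per group; none of these numbers come from the question's data, and wald_estimate is a hypothetical helper name.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def wald_estimate(y_treat, d_treat, y_control, d_control):
    """(Difference in mean outcomes) / (difference in share actually treated)."""
    outcome_gap = mean(y_treat) - mean(y_control)
    takeup_gap = mean(d_treat) - mean(d_control)
    return outcome_gap / takeup_gap


# Hypothetical data: test scores and whether the school actually has books (1/0).
scores_t = [70, 66, 72, 64]  # schools assigned to the treatment group
books_t = [1, 1, 1, 0]       # one 'treated' school did not get books
scores_c = [60, 62, 58, 64]  # schools assigned to the control group
books_c = [0, 0, 1, 0]       # one control school has books anyway

wald = wald_estimate(scores_t, books_t, scores_c, books_c)
print(f"Wald estimate of the effect of books on test scores: {wald}")
```

If every treated school had books and no control school did, the denominator would be 1 and the Wald estimate would collapse to the simple difference in means.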
Question 5 We run a regression and use it to predict house prices. It turns out that our predictions are too low for the most expensive houses and too high for the cheapest. What gives?
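One mechanical explanation (a sketch, and not necessarily the only answer intended) is that OLS residuals are positively correlated with the outcome whenever the fit is imperfect: cov(y, residual) = var(residual) > 0. Houses with the highest prices are therefore disproportionately those with large positive errors, and the cheapest those with large negative errors. The simulation below uses an invented price equation with floor size as the only regressor.

```python
import random

random.seed(2)  # fixed seed so the sketch is reproducible

n = 2000
# Hypothetical data: the price equation and noise level are assumptions.
size = [random.uniform(50, 250) for _ in range(n)]
price = [300 + 2.0 * s + random.gauss(0, 80) for s in size]

# OLS of price on size with an intercept.
mean_size, mean_price = sum(size) / n, sum(price) / n
b = sum((s - mean_size) * (p - mean_price) for s, p in zip(size, price)) / sum(
    (s - mean_size) ** 2 for s in size
)
a = mean_price - b * mean_size

# Residual = actual price minus predicted price.
resid = [p - (a + b * s) for s, p in zip(size, price)]

# Average residual among the cheapest and the most expensive 10% of houses.
ranked = sorted(zip(price, resid))
k = n // 10
avg_resid_cheapest = sum(r for _, r in ranked[:k]) / k
avg_resid_priciest = sum(r for _, r in ranked[-k:]) / k

print(f"cheapest 10%: mean residual {avg_resid_cheapest:.1f} (prediction too high)")
print(f"priciest 10%: mean residual {avg_resid_priciest:.1f} (prediction too low)")
```

Note that the model here is correctly specified, so the pattern appears even without any misspecification such as a missing nonlinearity.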
Question 6 (a)
We obtain the following regression results:
DaysIll_i = *FluShot_i
1. What is the interpretation of the coefficients?
2. What is the biggest problem with interpreting things causally?
Question 6 (b) and (c) Take 3.
(b) Is HMO membership a good instrument for getting a flu shot?
(c) Is being visited by a health worker who talks about flu and flu shots a good instrument for getting a flu shot?
Conditions for a good instrument: (1) relevance, (2) exogeneity.
Both probably satisfy relevance. We can check this anyway.
Neither probably satisfies exogeneity, e.g.
b) There is selection into HMOs: members may be poorer, sicker, or different in other ways. Also, HMOs focus on preventative care, which may affect days sick other than through the flu shot.
c) The health worker talks about the risk of flu. People may become more careful, e.g. washing their hands. This will also affect the number of days sick.
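The exogeneity problem in (c) can be illustrated with a simulation in which the health worker's visit z raises flu-shot take-up but also reduces days ill directly (say, through hand washing). All numbers are assumptions: B_TRUE, the take-up probabilities, and the size of the direct effect are invented for the sketch.

```python
import random

random.seed(3)  # fixed seed so the sketch is reproducible

B_TRUE = -2.0  # assumed causal effect of a flu shot on days ill


def cov(x, y):
    """Sample covariance of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def iv_estimate(direct_effect, n=20000):
    """Simulate data and return the IV estimate cov(z, y) / cov(z, x)."""
    z, shot, days_ill = [], [], []
    for _ in range(n):
        z_i = 1 if random.random() < 0.5 else 0                  # visited or not
        shot_i = 1 if random.random() < 0.3 + 0.4 * z_i else 0   # relevance: visit raises take-up
        # direct_effect != 0 means z affects days ill other than via the shot.
        y_i = 10.0 + B_TRUE * shot_i + direct_effect * z_i + random.gauss(0, 2)
        z.append(z_i)
        shot.append(shot_i)
        days_ill.append(y_i)
    return cov(z, days_ill) / cov(z, shot)


iv_valid = iv_estimate(direct_effect=0.0)     # exclusion restriction holds
iv_invalid = iv_estimate(direct_effect=-1.0)  # visit also changes behaviour

print(f"true effect: {B_TRUE}")
print(f"IV with valid instrument:   {iv_valid:.2f}")
print(f"IV with invalid instrument: {iv_invalid:.2f}")
```

With the direct channel switched on, the IV estimate attributes the hand-washing effect to the flu shot, scaled up by one over the first-stage difference in take-up, so the shot looks much more effective than it is.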