By Randall Munroe, xkcd.com Econometrics: The Search for Causal Relationships.

Sexual Content on TV is linked to teen pregnancy (LA Times, 11/3/08)
Teenagers who watch a lot of television programs that contain sexual content are more than twice as likely to be involved in a pregnancy, according to a study published today in the journal Pediatrics.
The teens who watched the most sexual content on TV (the 90th percentile) were twice as likely to have become pregnant or caused a pregnancy compared to the teens who watched the least amount of sexual content on TV (the 10th percentile).
Parents should consider limiting their teen's exposure to sexual content on TV, said the study's lead author, Anita Chandra, a behavioral scientist at RAND, a nonprofit research organization. Television producers should consider more realistic depictions of the consequences of sex in their scripts...

Lasting Effects Found From Spanking Children: Antisocial Behavior Is Increased, Study Says (Washington Post, August 15, 1997, page A3)
Spanking children is apt to cause more long-term behavioral problems than most parents who use that approach to discipline may realize, a new study reports. Children who get spanked regularly are more likely over time to cheat or lie, to be disobedient at school and to bully others, and have less remorse for what they do wrong, according to the study by researchers at the University of New Hampshire. It is being published this month in the medical journal Archives of Pediatrics and Adolescent Medicine. "When parents use corporal punishment to reduce antisocial behavior, the long-term effect tends to be the opposite," the study concludes.

How can we identify causal effects? Angrist & Pischke’s 5 tools:
1. Ideal case: randomized trial
2. Easiest (but least persuasive?): regression
Natural experiments:
3. Instrumental variables
4. Regression discontinuity
5. Differences-in-differences

1. Randomized, Controlled Trials (RCTs)
Good benchmark.
Increasingly popular in economics, especially development (“randomistas”) and behavioral.

Question to always ask: What is the point of this table?

Treatment and Control Groups
Treatment group receives the intervention you want to study (Ex: health insurance, new drug, bed nets).
Control group should be statistically identical to the treatment group, except they don’t get treated.
Study participants are randomly assigned to the groups.

The Math
Where:
$n$ ≡ number in group
$Y_i$ ≡ outcome for person $i$ (health)
$D_i$ ≡ treatment status for $i$ (= 1 if insurance, 0 otherwise)
$Y_{1i}$ ≡ $i$’s outcome if treated
$Y_{0i}$ ≡ $i$’s outcome if not treated
(We can only observe one of $Y_{1i}$ and $Y_{0i}$ for any given person.)

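The Math
What we want to know is $\mathrm{Avg}_n[Y_{1i} - Y_{0i}]$. Suppose health insurance improves everyone’s health by $k$, so that $Y_{1i} = Y_{0i} + k$. Then equation 1.2 becomes:
$$\mathrm{Avg}_n[Y_{1i} \mid D_i = 1] - \mathrm{Avg}_n[Y_{0i} \mid D_i = 0] = k + \left\{ \mathrm{Avg}_n[Y_{0i} \mid D_i = 1] - \mathrm{Avg}_n[Y_{0i} \mid D_i = 0] \right\}$$
The first term, $k$, is the true causal effect. The term in braces is the “selection” effect: the difference in health between the insured and uninsured groups if nobody had insurance.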

The Math
Table 1.1 suggests that the selection effect in this case would probably be positive, so selection would cause us to overstate the effect of having insurance.
Note: the law of large numbers implies that $\mathrm{Avg}_n[Y_i]$ converges to $E[Y_i]$ as $n$ goes to infinity.

The Math
What does random assignment do for us? It sets $E[Y_{0i} \mid D_i = 1] = E[Y_{0i} \mid D_i = 0]$ (fix notes!). So there is no selection effect, and the difference between the groups is the pure causal effect we want (if the groups are large enough).
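The decomposition into a causal effect plus a selection effect can be checked with a small simulation. This is only a sketch under made-up assumptions: the constant effect `k = 2.0`, the unobserved `trait` that drives both baseline health and the choice to insure, and all numeric values are hypothetical and do not come from the slides or from Table 1.1.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
k = 2.0  # assumed constant causal effect of insurance on health (hypothetical)

# Hypothetical unobserved trait that raises baseline health Y0.
trait = rng.normal(size=n)
y0 = 10 + trait + rng.normal(size=n)   # outcome if not treated
y1 = y0 + k                            # outcome if treated

def diff_in_means(d):
    """Observed difference in average outcomes between D=1 and D=0."""
    y = np.where(d == 1, y1, y0)       # only one potential outcome is observed
    return y[d == 1].mean() - y[d == 0].mean()

def selection_effect(d):
    """Difference in average Y0 between the groups (not observable in real data)."""
    return y0[d == 1].mean() - y0[d == 0].mean()

# Self-selection: people with a higher trait are more likely to buy insurance.
d_selected = (trait + rng.normal(size=n) > 0).astype(int)
# Random assignment: a coin flip, unrelated to the trait.
d_random = rng.integers(0, 2, size=n)

naive = diff_in_means(d_selected)
randomized = diff_in_means(d_random)

print(f"self-selected: difference = {naive:.3f}, selection = {selection_effect(d_selected):.3f}")
print(f"randomized:    difference = {randomized:.3f}, selection = {selection_effect(d_random):.3f}")
```

With self-selection the observed difference overstates `k` by exactly the selection term; with random assignment the selection term is close to zero and the difference in means is close to `k`.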

Question to always ask: What is the point of this table?

A well-constructed randomized trial can identify causal effects very well. So why don’t we always use this method?
Expensive
Time consuming (and outcome may not be realized for years)
Ethical concerns
Logistically difficult
External validity

2. Regression Analysis Relationship between college rank and earnings:

2. Regression Analysis
We want to draw a line through those points that describes the relationship:
$y_i = \alpha + \beta x_i + e_i$
$\text{wage}_i = \alpha + \beta\,\text{rank}_i + e_i$
Parameters without hats are the true population values; parameters with hats are sample estimates.
$\alpha$ is the intercept (wage if rank = 0).
$\beta$ is the slope: how much wage increases with a one-unit increase in rank.

2. Regression Analysis
$e_i$ is the error term: how individual $i$’s wage differs from what we would predict for them based on the values of $\alpha$ and $\beta$ and their college rank.

2. Regression Analysis
We use statistical software packages such as Stata, R, or SAS to estimate $\alpha$ and $\beta$. With Ordinary Least Squares (OLS) regression, the software chooses the estimates $\hat{\alpha}$ and $\hat{\beta}$ that minimize the sum of the squared residuals $\hat{e}_i$.
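For a single regressor, the OLS estimates have a closed form, so the minimization can be sketched in a few lines. The slides name Stata, R, and SAS; Python with NumPy is swapped in here purely for illustration, and the five `rank`/`wage` data points are invented.

```python
import numpy as np

def ols_fit(x, y):
    """Return (alpha_hat, beta_hat) minimizing the sum of squared residuals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_dev = x - x.mean()
    # Slope: sample covariance of x and y divided by sample variance of x.
    beta_hat = (x_dev * (y - y.mean())).sum() / (x_dev ** 2).sum()
    # Intercept: the fitted line passes through the point of means.
    alpha_hat = y.mean() - beta_hat * x.mean()
    return alpha_hat, beta_hat

# Hypothetical data: college rank and wage (arbitrary units).
rank = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
wage = np.array([52.0, 49.0, 47.0, 44.0, 43.0])

alpha_hat, beta_hat = ols_fit(rank, wage)
residuals = wage - (alpha_hat + beta_hat * rank)

print(f"alpha_hat = {alpha_hat:.2f}, beta_hat = {beta_hat:.2f}")
print(f"sum of squared residuals = {(residuals ** 2).sum():.3f}")
```

A property worth noticing is that the OLS residuals sum to zero whenever the model includes an intercept.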

2. Regression Analysis
How are kids who go to more highly ranked schools different from those who don’t? Examples: family income (FI), gender, intelligence.
When something in the error term (like FI) is correlated with $x$, our estimates of $\beta$ are biased. This means that $E[\hat{\beta}] \neq \beta$: even on average, we can’t expect to get the right answer.

2. Regression Analysis
Compare the regression model with and without FI:
With FI: $\text{wage}_i = \alpha + \beta_1\,\text{rank}_i + \beta_2\,\text{FI}_i + e_i$
Without FI: $\text{wage}_i = \alpha + \beta_1\,\text{rank}_i + e_i$

Summary of Direction of Bias

                  Corr(x1, x2) > 0    Corr(x1, x2) < 0
$\beta_2 > 0$     Positive bias       Negative bias
$\beta_2 < 0$     Negative bias       Positive bias
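The direction-of-bias summary follows from the standard omitted variable bias formula, sketched here for the two-regressor case. The symbols $\tilde{\beta}_1$ (the slope from the regression that leaves out $x_2$) and $\tilde{\delta}_1$ (the slope from regressing $x_2$ on $x_1$) are notation introduced for this sketch and do not appear in the slides.

```latex
% True model:
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u
% Regression that omits x_2 yields a slope \tilde{\beta}_1 with
E[\tilde{\beta}_1] = \beta_1 + \beta_2 \tilde{\delta}_1,
\qquad
\tilde{\delta}_1 = \frac{\widehat{\mathrm{Cov}}(x_1, x_2)}{\widehat{\mathrm{Var}}(x_1)}
% so the bias is
\mathrm{Bias}(\tilde{\beta}_1) = \beta_2 \tilde{\delta}_1
```

Because $\tilde{\delta}_1$ has the same sign as $\mathrm{Corr}(x_1, x_2)$, the sign of the bias is the product of the sign of $\beta_2$ and the sign of the correlation, which gives the four cells of the summary.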

2. Regression Analysis
Compare the regression model with and without FI:
With FI: $\text{wage}_i = \alpha + \beta_1\,\text{rank}_i + \beta_2\,\text{FI}_i + e_i$
Without FI: $\text{wage}_i = \alpha + \beta_1\,\text{rank}_i + e_i$
Since corr(FI, rank) > 0 and $\beta_2 > 0$, we expect positive bias in our estimate of $\beta_1$ when FI is omitted.
Ex. 2: Family size. Ex. 3: Gender.

2. Regression Analysis
Solution: Include the omitted variable in the model. This gives us the “ceteris paribus” interpretation we’re after. We “hold constant” family income. Then $\beta_1$ gives us the effect of rank for people with the same family income.
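The effect of including the omitted variable can be illustrated with simulated data. This is a sketch under assumed values: the true coefficients (`beta1 = 1.0`, `beta2 = 0.5`), the strength of the link between FI and rank, and the helper `ols` are all hypothetical and chosen only so that corr(FI, rank) > 0 and $\beta_2 > 0$, as in the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
beta1, beta2 = 1.0, 0.5   # assumed true effects of rank and FI (hypothetical)

fi = rng.normal(size=n)                    # family income (standardized)
rank = 0.8 * fi + rng.normal(size=n)       # rank is positively correlated with FI
wage = 2.0 + beta1 * rank + beta2 * fi + rng.normal(size=n)

def ols(columns, y):
    """Least-squares coefficients: intercept first, then one per column."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

short_reg = ols([rank], wage)       # FI omitted, so it sits in the error term
long_reg = ols([rank, fi], wage)    # FI included, so it is held constant

print(f"slope on rank without FI: {short_reg[1]:.3f}")
print(f"slope on rank with FI:    {long_reg[1]:.3f}  (true value {beta1})")
```

In this setup the regression without FI overstates the slope on rank, while the regression that holds FI constant recovers a value close to the assumed `beta1`.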

2. Regression Analysis What this looks like:

Question to always ask: What is the point of this table?