Comments on Midterm | Comments on HW4 | Final Project | Regression Example | Sensitivity Analysis? | Quiz
STA 320: Design and Analysis of Causal Studies
Dr. Kari Lock Morgan and Dr. Fan Li
Department of Statistical Science, Duke University

Midterm Comments
2: why gain scores?
3: Probabilistic
4a: covariates must be pre-treatment
4b: matching best when n_T << n_C
4c: estimand, not estimate (ATT)
5b: estimation or test depends on the question
5b: Neyman relies on the CLT and large n
6c: if balance is good, no more subclasses
6f: average of subclass estimates / original SE
7b:
7e:

Hypothesis Test Conclusions (HW4)
Generic decision:
o in terms of the null
o reject or do not reject the null
Interpret in context:
o in terms of the alternative
o have or do not have evidence for the alternative

Not statistically significant
Do not reject H0
We do not reject the null that increasing the minimum wage has no effect on employment
We do not have evidence that increasing the minimum wage affects employment

Statistically significant
Reject H0
We reject the null that the job training program has no effect on wages
We have evidence that the job training program increases wages

Final Project
Details here
Due in class on the last day of class, 4/23
30% of final grade
Topic can be anything related to causal inference
Good link for finding data:
Maximum length: 10 pages
Individual – no communication or help

Why Regression (Without Thinking) Is Dangerous
Fan Li
March 26, 2014

A Toy Example
Population: patients with heart attack
Treatment (W): 1 = surgery; 0 = medication
Outcome (Y): prognosis after 1 month
One covariate (X), severity: 1 = severe; 0 = mild/average. All other covariates are matched between groups.
Sample: n patients admitted to a hospital
Goal: (1) estimate the effect of surgery compared to medication; (2) predict the prognosis of a new patient
It so happens that in the observed sample, all patients with X = 1 get W = 1, and all patients with X = 0 get W = 0.
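To make the lack of overlap concrete, here is a minimal simulation sketch of the toy setup; the sample size, coefficients, and noise level are illustrative assumptions, not values from the slides.

```python
# Hypothetical simulation of the toy example: severity X fully determines
# treatment W, so only the combinations (W=0, X=0) and (W=1, X=1) are observed.
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = rng.integers(0, 2, size=n)        # severity: 1 = severe, 0 = mild/average
W = X.copy()                          # no overlap: every severe patient gets surgery
# assumed (made-up) outcome model: surgery helps, severity hurts
Y = 2.0 + 1.0 * W - 1.5 * X + rng.normal(0.0, 0.5, size=n)

# only (0, 0) and (1, 1) ever appear among the observed (W, X) pairs
print(np.unique(np.column_stack([W, X]), axis=0))
```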

Regression
An idiosyncratic way is to write down the following regression model for the observed data:
Y ~ a + b*W + c*X
For goal (1): fit the model to the sample; the coefficient b is the "treatment effect".
For goal (2): for a new patient, plug in W and X to get a predicted Y.
Question: what if the new patient has (W = 0, X = 1) or (W = 1, X = 0)? What is odd here?
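As a sketch of what the careless analyst would do (on the same hypothetical simulated data, regenerated here so the snippet stands alone): fit Y ~ a + b*W + c*X by least squares and "predict" for an unobserved combination. Because W and X coincide in this sample, the design matrix is rank-deficient, so b and c are not separately identified and the prediction is pure extrapolation.

```python
# Naive fit of Y = a + b*W + c*X on the no-overlap toy sample (hypothetical data).
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = rng.integers(0, 2, size=n)
W = X.copy()                          # W and X are identical in this sample
Y = 2.0 + 1.0 * W - 1.5 * X + rng.normal(0.0, 0.5, size=n)

design = np.column_stack([np.ones(n), W, X])      # columns: intercept, W, X
coefs, _, rank, _ = np.linalg.lstsq(design, Y, rcond=None)
print("design matrix rank:", rank)                # 2, not 3: W and X are collinear
print("minimum-norm least-squares coefficients:", coefs)

# "prediction" for a severe patient on medication (W=0, X=1): this (W, X)
# combination never appears in the data, so the number rests entirely on
# the model's additivity assumption.
print("extrapolated prediction:", np.array([1.0, 0.0, 1.0]) @ coefs)
```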

What is odd?
There is no interaction W*X in the model.
No interaction: we are effectively (implicitly) assuming the effect of W is additive.
In fact, in the observed data W and X coincide (complete imbalance of X between the W groups), so the columns W, X, and W*X carry the same information. Even if an interaction term were included, there is no information in the data to estimate its coefficient.
What does the regression model here really assume? This becomes clear if we adopt a potential outcome approach.
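A quick check on the same hypothetical simulation confirms the point: the interaction column W*X is identical to W (and to X) in this sample, so a model that includes it has a singular design and its coefficient cannot be estimated from these data.

```python
# The interaction column carries no new information in the no-overlap sample.
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = rng.integers(0, 2, size=n)
W = X.copy()

print(np.array_equal(W * X, W))                   # True: W*X == W == X here
design = np.column_stack([np.ones(n), W, X, W * X])
print("rank:", np.linalg.matrix_rank(design), "of", design.shape[1], "columns")  # rank 2 of 4
```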

Potential Outcome Approach
One regression for each potential outcome:
Y(0) ~ a0 + c0*X
Y(1) ~ a1 + c1*X
Recall the "fancy" way of writing the observed outcome Y in terms of Y(0) and Y(1):
Y = Y(1)*W + Y(0)*(1-W)
Plugging the two potential outcome regressions into this form, we get a regression for Y:
Y = (a1 + c1*X)*W + (a0 + c0*X)*(1-W)
  = a0 + (a1 - a0)*W + c0*X + (c1 - c0)*X*W
Clearly the interaction should be there. Without the interaction, one implicitly assumes c1 = c0.
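The same point can be seen by fitting the two potential-outcome regressions separately, again a sketch on the hypothetical simulated data: Y(1) ~ a1 + c1*X can only be fit on the treated units and Y(0) ~ a0 + c0*X only on the controls, and because X is constant within each group here, neither slope is identified. The potential-outcome formulation makes explicit what the data cannot tell us.

```python
# Group-wise regressions for the two potential outcomes on the hypothetical toy data.
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = rng.integers(0, 2, size=n)
W = X.copy()
Y = 2.0 + 1.0 * W - 1.5 * X + rng.normal(0.0, 0.5, size=n)

for w in (0, 1):
    grp = (W == w)
    design = np.column_stack([np.ones(grp.sum()), X[grp]])   # columns: intercept, X
    rank = np.linalg.matrix_rank(design)
    print(f"group W={w}: observed X values = {np.unique(X[grp])}, "
          f"design rank = {rank} (need rank 2 to estimate both a{w} and c{w})")
```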

But under the implicit additivity assumption (equivalently, a homogeneous or linear effect), the previous model can produce a point prediction anyway. This is not magic; it is extrapolation based on an untestable assumption. Regression (or any model) comes as a package: you need to know and acknowledge what assumptions, explicit or implicit, come with the model.

Take home message
The potential outcome approach forces you to look at the data, think hard, and be honest about the assumptions, in this case (1) overlap and (2) linearity (or additivity). Immediately resorting to OLS regression misses all of this.
Causal inference is a way of thinking, not a particular model or procedure (like OLS regression). This does not mean you cannot use regressions for causal inference; in fact, they are routinely used. But the point of causality is independent of whichever particular model you use; it is more about how to formulate the problem.