Analysis of Experimental Data II Christoph Engel.

linear model
I. treatment effect
II. continuous explanatory variable
III. heteroskedasticity
IV. control variables
V. interaction effects
VI. outliers
VII. endogeneity
VIII. small and big problems

I. treatment effect

pro:
- (usually) more statistical power
- greater flexibility: control variables, heteroskedasticity, instrumental variables, time series and panel models, non-linear functional form
- automatic estimate of effect size and (in principle) marginal effect

contra:
- more assumptions

data generation

    set obs 1000
    gen uid = _n
    gen error = rnormal()
    gen treat = (uid > 500)
    gen dv = 5 + 2*treat + error
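The same data-generating process can be sketched outside Stata; the following is an illustrative numpy translation (variable names mirror the Stata snippet, while the seed and the least-squares call are my additions), which also estimates the treatment effect by OLS:

```python
import numpy as np

# illustrative Python equivalent of the Stata data-generation step
rng = np.random.default_rng(0)

n = 1000
uid = np.arange(1, n + 1)
error = rng.standard_normal(n)
treat = (uid > 500).astype(float)    # 500 baseline, 500 treatment obs
dv = 5 + 2 * treat + error

# OLS with a constant and the treatment dummy: the slope estimate is the
# treatment effect, the constant is the baseline mean
X = np.column_stack([np.ones(n), treat])
beta, *_ = np.linalg.lstsq(X, dv, rcond=None)
cons, effect = beta
```

With 500 observations per group, the estimates land close to the true values (cons ≈ 5, treatment effect ≈ 2).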

data

non-parametric

parametric

ttest
- hardly ever used with experimental data
- no effect size
- assumes normality

(linear) regression

reference category: baseline (cons ≈ 5); mean in the treatment: cons + coef = 6.992

(linear) regression: reliability of estimates

(linear) regression: explained variance

regression model  explanandum  depvar(i)  explanans  indepvars(i)  explanation  cons  coef

regression model

fundamental assumption
- error is uncorrelated with explanatory variables
- graphical way of testing: residuals vs. predicted values should be orthogonal

plot

II. continuous explanatory variable

data generating process: dv = 5 + .5*level + error

regression

interpretation
- in a linear model, coef = marginal effect (take the first derivative wrt level)
- prediction: a one-unit increase of level leads to a .495 increase of dv

orthogonality of error

prediction

    reg dv level
    predict preddv
    two (sc dv level) (sc preddv level, c(l))
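The prediction step can be mimicked in numpy as a sketch (the DGP follows the slide; the uniform support of level and the seed are my assumptions). It also verifies the orthogonality property the slides use as a graphical diagnostic:

```python
import numpy as np

# sketch: regress dv on a continuous regressor, form fitted values,
# and check that residuals are orthogonal to the regressor
rng = np.random.default_rng(1)

n = 1000
level = rng.uniform(0, 10, n)        # assumed support for illustration
dv = 5 + 0.5 * level + rng.standard_normal(n)

X = np.column_stack([np.ones(n), level])
beta, *_ = np.linalg.lstsq(X, dv, rcond=None)
preddv = X @ beta                    # fitted values (Stata: predict preddv)
resid = dv - preddv

# OLS residuals are orthogonal to every regressor by construction
orthogonality = resid @ level
```

Plotting dv and preddv against level (as the Stata `twoway` call does) would show the fitted line through the scatter; the dot product above is the numeric counterpart of the graphical check.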

regression

significance
- intuitive criterion: H0 = the regressor has no explanatory power, i.e. its coefficient is zero
- is 0 within the confidence interval?

how to construct?
- coef -/+ 1.96 * SE
- SE = sqrt(corresponding diagonal entry of the variance-covariance matrix)
- not very intuitive
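A hand-built version of that interval, as a numpy sketch (the DGP and the large-sample 1.96 critical value are assumptions for illustration; Stata reports the t-based interval automatically):

```python
import numpy as np

# construct a 95% confidence interval for the slope by hand:
# SE is the square root of the corresponding diagonal entry of the
# estimated variance-covariance matrix sigma^2 * (X'X)^{-1}
rng = np.random.default_rng(2)

n = 1000
level = rng.uniform(0, 10, n)
dv = 5 + 0.5 * level + rng.standard_normal(n)

X = np.column_stack([np.ones(n), level])
beta = np.linalg.solve(X.T @ X, X.T @ dv)
resid = dv - X @ beta
sigma2 = resid @ resid / (n - X.shape[1])   # unbiased error variance
vcov = sigma2 * np.linalg.inv(X.T @ X)
se_level = np.sqrt(vcov[1, 1])

# large-sample 95% interval: coef -/+ 1.96 * SE
lo, hi = beta[1] - 1.96 * se_level, beta[1] + 1.96 * se_level
```

Since 0 is far outside this interval, the level regressor is clearly significant here.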

intuitive approximation
- assuming the error is orthogonal and has mean 0

graph

what goes wrong?
- 6.3 % below 0
- the procedure attributes the entire unexplained variance to the level regressor

III. heteroskedasticity

data generating process: dv = 5 + .5*level + .1*level*error

estimation

problem
- probably even bias / inconsistency
- at any rate, the standard errors are wrong: SE of level underestimated, SE of cons overestimated

solution
- (heteroskedasticity-)robust standard errors
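Robust (White/sandwich) standard errors can be built by hand, which makes the fix transparent; the following numpy sketch uses the heteroskedastic DGP from the slide (in Stata the analogue would simply be the `robust` option of `regress`):

```python
import numpy as np

# hand-rolled classical vs. heteroskedasticity-robust standard errors
rng = np.random.default_rng(3)

n = 1000
level = rng.uniform(0, 10, n)
error = rng.standard_normal(n)
dv = 5 + 0.5 * level + 0.1 * level * error   # error variance grows with level

X = np.column_stack([np.ones(n), level])
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ dv
resid = dv - X @ beta

# classical SEs assume a single sigma^2 for all observations ...
sigma2 = resid @ resid / (n - 2)
se_classical = np.sqrt(np.diag(sigma2 * XtX_inv))

# ... robust SEs use each observation's squared residual instead:
# sandwich (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}
meat = (X * resid[:, None] ** 2).T @ X
se_robust = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
```

Because the error variance rises with level, the robust SE of the level coefficient comes out larger than the (underestimated) classical one, exactly as the slide describes.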

technically

assuming homoskedasticity:
- all obs are iid
- variance / sd / se the same all over
- (and all covariance terms are 0)

so the variance-covariance matrix of the error is

    σ 0 0 0 0
    0 σ 0 0 0
    0 0 σ 0 0
    0 0 0 σ 0
    0 0 0 0 σ

by contrast, under heteroskedasticity each observation has its own variance:

    σ1 0  0  0  0
    0  σ2 0  0  0
    0  0  σ3 0  0
    0  0  0  σ4 0
    0  0  0  0  σ5

IV. control variables

data generating process
- two-dimensional, orthogonal
- rare in experimental data
- but correlation of indepvars is no problem if it is not very pronounced (if it is → multicollinearity)
- dv = 5 + 2*treat + .5*level + error

omitted variables
- if orthogonal, no problem with consistency
- but the SEs are wrong
- and cons is wrong
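This can be illustrated directly: omitting an orthogonal control leaves the treatment coefficient consistent, but the constant absorbs the omitted term's mean and the residual variance (hence the SEs) is inflated. A numpy sketch under the slide's DGP (the uniform level and the seed are my assumptions):

```python
import numpy as np

# full model vs. model omitting the orthogonal control "level"
rng = np.random.default_rng(4)

n = 1000
treat = (np.arange(n) >= n // 2).astype(float)
level = rng.uniform(0, 10, n)                 # orthogonal to treat
dv = 5 + 2 * treat + 0.5 * level + rng.standard_normal(n)

def ols(X, y):
    """least-squares coefficients (hypothetical helper for this sketch)"""
    return np.linalg.lstsq(X, y, rcond=None)[0]

b_full = ols(np.column_stack([np.ones(n), treat, level]), dv)
b_short = ols(np.column_stack([np.ones(n), treat]), dv)

# treat coefficient stays near 2 in both models;
# the short model's constant absorbs 0.5 * mean(level), about 2.5
```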

prediction

same with collinearity
- data generating process as before, but: replace treat = treat + .1*level

consistency affected

V. interaction effects

data generating process: dv = 5 + 2*treat + .5*level - .25*treat*level + error
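With an interaction, the treatment effect is no longer a single number: here it is 2 - .25*level, so it shrinks as level grows. A numpy sketch of estimating it at different levels (the evaluation points 0 and 8 are chosen for illustration):

```python
import numpy as np

# treatment effect under an interaction: effect(level) = 2 - .25*level
rng = np.random.default_rng(5)

n = 2000
treat = (np.arange(n) >= n // 2).astype(float)
level = rng.uniform(0, 10, n)
dv = 5 + 2 * treat + 0.5 * level - 0.25 * treat * level + rng.standard_normal(n)

X = np.column_stack([np.ones(n), treat, level, treat * level])
beta, *_ = np.linalg.lstsq(X, dv, rcond=None)

# estimated treatment effect at level = 0 and at level = 8:
# d dv / d treat = beta_treat + beta_interaction * level
effect_at_0 = beta[1]
effect_at_8 = beta[1] + 8 * beta[3]
```

At level = 0 the effect is close to 2; at level = 8 it is close to 0, so whether "the" treatment effect is significant depends on where you test it.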

regression

prediction

testing net effect
- is something relevant happening in the treatment at the beginning?

testing the treatment effect at various levels
- is there a treatment effect at the beginning?
- is there one in the end?

everywhere?

VI. outliers

data generating process:

    dv = 5 + .5*level + error
    replace dv = 1000 if uid > 995
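Five contaminated observations are enough to wreck the OLS estimates; dropping (or, better, modelling) them restores the fit. A numpy sketch of the slide's contamination (the seed and the crude cutoff for identifying the outliers are illustrative assumptions):

```python
import numpy as np

# five outliers (dv = 1000) distort the whole regression
rng = np.random.default_rng(6)

n = 1000
level = rng.uniform(0, 10, n)
dv = 5 + 0.5 * level + rng.standard_normal(n)
dv[-5:] = 1000.0                 # Stata: replace dv = 1000 if uid > 995

X = np.column_stack([np.ones(n), level])
b_contaminated, *_ = np.linalg.lstsq(X, dv, rcond=None)

# re-estimate without the outliers (crude cutoff for illustration only)
keep = dv < 100
b_clean, *_ = np.linalg.lstsq(X[keep], dv[keep], rcond=None)
```

The contaminated fit is pulled far away from the true line, while the clean fit recovers cons ≈ 5 and slope ≈ .5; in practice, finding the reason for the outliers (e.g. an endgame effect) beats mechanical trimming.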

a serious problem

what to do?
- think of an endgame effect
- proximate cause: highest level (last period)
- relatively good, but level insignificant

transform dv

best: 1/sqrt(dv)
- good for cons
- after retransformation, very poor for level

find reason / contingency

problem solved

VII. endogeneity
- immaterial for the treatment effect: randomization prevents it
- easily relevant when explaining the treatment effect
- data generating process:

    level = 2 + .5*trait + error
    dv = 5 + 2*treat + .5*level + error
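Because the same error enters both equations, level is endogenous and OLS on level is inconsistent. A numpy sketch of two-stage least squares under this DGP (treating trait as the available instrument is an assumption made for illustration; only the coefficients are computed here, since proper 2SLS standard errors need a further correction):

```python
import numpy as np

# endogeneity: the same error drives level and dv
rng = np.random.default_rng(7)

n = 2000
treat = (np.arange(n) >= n // 2).astype(float)
trait = rng.uniform(0, 10, n)
error = rng.standard_normal(n)
level = 2 + 0.5 * trait + error
dv = 5 + 2 * treat + 0.5 * level + error     # same error: level is endogenous

def ols(X, y):
    """least-squares coefficients (hypothetical helper for this sketch)"""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# naive OLS: the level coefficient is biased upward (around .8 here
# instead of the true .5)
b_ols = ols(np.column_stack([np.ones(n), treat, level]), dv)

# 2SLS: first stage predicts level from the instrument; second stage
# replaces level with its fitted values
Z = np.column_stack([np.ones(n), treat, trait])
level_hat = Z @ ols(Z, level)
b_2sls = ols(np.column_stack([np.ones(n), treat, level_hat]), dv)
```

The 2SLS coefficient on level returns to the true .5, while the treatment coefficient is consistent in both cases, illustrating why endogeneity matters for explaining, not for detecting, the treatment effect.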

inconsistency

2SLS

VIII. small and big problems
- heteroskedasticity: estimates consistent; robust SE
- non-normality (of the error term): (law of large numbers); alternative functional form
- non-independence: dgp induced; match with the statistical model

(small and big problems)
- (omitted variables): decontextualisation
- outliers: capture by specification; (transform dv)
- endogeneity: (randomization); (create) an iv