A trial of incentives to attend adult literacy classes
Carole Torgerson, Greg Brooks, Jeremy Miles, David Torgerson


A trial of incentives to attend adult literacy classes Carole Torgerson, Greg Brooks, Jeremy Miles, David Torgerson Classes randomised to incentive or no incentive. Outcome variable: number of sessions attended.

Classes randomised to incentive or no incentive. Two groups of 14 classes, labeled “X” and “Y” in this data set. Blinded for analysis.
Group X: 77 students
Group Y: 86 students


Compare mean number of sessions ignoring clustering (Stata version 8):

. ttest sessions, by(group)

Two-sample t test with equal variances
Columns: Group | Obs, Mean, Std. Err., Std. Dev., [95% Conf. Interval]
Rows: X, Y, combined, diff
Degrees of freedom: 150
Ho: mean(X) - mean(Y) = diff = 0
Ha: diff < 0; Ha: diff != 0; Ha: diff > 0
(Numeric values of the output are not preserved in this transcript.)

The P value indicates a highly significant difference! But it is wrong: it ignores the clustering!

Compare mean number of sessions ignoring clustering, regression:

. regress sessions group

F(1, 150) = 7.78
Coefficient table for sessions: rows group and _cons; columns Coef., Std. Err., t, P>|t|, [95% Conf. Interval]
(Other numeric values of the output are not preserved in this transcript.)

The P value is identical to that from the two-sample t method. It is still wrong: it ignores the clustering!
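A short sketch of why the two analyses must agree. With a single two-level group indicator, the least-squares slope is the difference between the two group means, and its t statistic is the equal-variance two-sample t statistic, so the regression F with 1 numerator degree of freedom is the square of that t. The symbols are notation introduced here for illustration, not taken from the slides, and the sign of the slope depends on how group is coded.

```latex
\left|\hat{\beta}_{\text{group}}\right| = \left|\bar{y}_Y - \bar{y}_X\right|,
\qquad
t = \frac{\hat{\beta}_{\text{group}}}{\operatorname{SE}\!\left(\hat{\beta}_{\text{group}}\right)},
\qquad
F(1, 150) = t^{2}
\;\Rightarrow\;
|t| = \sqrt{7.78} \approx 2.79
```

Because both methods treat the 152 students as independent observations, they share the same flaw.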

Compare mean number of sessions including clustering, two-sample t method on cluster means:

. ttest sessions, by(group)

Two-sample t test with equal variances
Columns: Group | Obs, Mean, Std. Err., Std. Dev., [95% Conf. Interval]
Rows: the two groups, combined, diff
Degrees of freedom: 26
Ho: mean(1) - mean(2) = diff = 0
Ha: diff < 0; Ha: diff != 0; Ha: diff > 0
(Numeric values of the output are not preserved in this transcript.)

The difference is not significant. Almost correct: it takes the data structure into account, but not the variation in class size.
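The slides do not show how the class-level data set was built. The following is a minimal sketch of one way to do it, not necessarily what the authors did. It assumes the individual-level data in memory contain the variables sessions, group and class used in the commands on these slides. The name learner for the class-size count is borrowed from the weighted regression command in these slides, and treating it as the number of students per class is an assumption.

```stata
* Sketch only: reduce individual records to one row per class.
* Assumes sessions, group and class exist in the data in memory.
preserve

* Class mean of sessions, plus the number of students with a
* non-missing sessions value (stored as learner; hypothetical usage).
collapse (mean) sessions (count) learner = sessions, by(class group)

* Each observation is now a class, so the t test has
* (number of classes - 2) degrees of freedom.
ttest sessions, by(group)

restore
```

Using preserve and restore keeps the individual-level data available for the later analyses.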

Compare mean number of sessions including clustering, regression method, weighted by class size:

. regress sessions group [aweight=learner]

F(1, 26) = 2.77
Coefficient table for sessions: rows group and _cons; columns Coef., Std. Err., t, P>|t|, [95% Conf. Interval]
(Other numeric values of the output, including the sum of weights, are not preserved in this transcript.)

The difference is not significant. Correct: it takes the data structure into account, including the variation in class size.
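One way to see what the weighting does: within a group, the average of the class means weighted by class size equals the mean over all individual students, whereas the unweighted average gives a small class the same influence as a large one. Here n_j (the size of class j, assumed to be what learner holds), \bar{y}_j (the mean of class j) and J (the number of classes in the group) are notation introduced for this sketch.

```latex
\frac{\sum_{j=1}^{J} n_j\,\bar{y}_j}{\sum_{j=1}^{J} n_j}
  = \frac{\sum_{j=1}^{J}\sum_{i=1}^{n_j} y_{ij}}{\sum_{j=1}^{J} n_j}
\qquad\text{versus}\qquad
\frac{1}{J}\sum_{j=1}^{J} \bar{y}_j
```

The two expressions coincide only when every class has the same number of students.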

Compare individual number of sessions including clustering, robust standard error method (Huber-White sandwich method):

. regress sessions group, cluster(class)

Regression with robust standard errors
Number of obs = 152
F(1, 27) = 2.79
Number of clusters (class) = 28
Coefficient table for sessions (robust standard errors): rows group and _cons; columns Coef., Std. Err., t, P>|t|, [95% Conf. Interval]
(Other numeric values of the output are not preserved in this transcript.)

The difference is not significant. Correct: it takes the data structure into account. Very similar estimate and P value to the method using means.

I can do that using SPSS. So what is the advantage?

We can use subject-level covariates.
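To see directly what the clustering option changes, the naive and cluster-robust fits can be shown side by side. This is a sketch under the assumption that the individual-level variables sessions, group and class are in memory. The stored-result names naive and clustered are arbitrary labels chosen here.

```stata
* Sketch only: same model, two ways of computing standard errors.
* Ordinary standard errors, treating students as independent.
regress sessions group
estimates store naive

* Huber-White sandwich standard errors, clustered on class.
regress sessions group, cluster(class)
estimates store clustered

* Coefficients are identical; only the standard errors differ.
estimates table naive clustered, se
```

More recent Stata releases document the same request as vce(cluster class); the cluster(class) form shown on the slides is the older spelling.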

Mid-score = reading score before randomisation.

Compare individual number of sessions including clustering, robust standard error method, adjusting for mid-score:

. regress sessions group midscl, cluster(class)

Regression with robust standard errors
Number of obs = 152
Number of clusters (class) = 28
F(2, 27) and the other summary statistics are not preserved in this transcript.
Coefficient table for sessions (robust standard errors): rows group, midscl and _cons; columns Coef., Std. Err., t, P>|t|, [95% Conf. Interval]

The difference is significant. Correct: it takes the data structure into account. Adjustment produces a true significant difference.