Quantifying What It Would Take to Change an Inference: Toward a Pragmatic Sociology. Ken Frank. Duncan Lecture, ASA, August. R Shiny app KonFound-it! (konfound-it.com/)
Background: Technical Papers

- Frank, K.A., Maroulis, S., Duong, M., and Kelcey, B. What Would It Take to Change an Inference? Using Rubin's Causal Model to Interpret the Robustness of Causal Inferences. Educational Evaluation and Policy Analysis, Vol. 35.
- Frank, K.A., Sykes, G., Anagnostopoulos, D., Cannata, M., Chard, L., Krause, A., and McCrory, R. Extended Influence: National Board Certified Teachers as Help Providers. Educational Evaluation and Policy Analysis, Vol. 30(1): 3-30.
- *Frank, K.A. and Min, K. Indices of Robustness for Sample Representation. Sociological Methodology, Vol. 37. (*co-first authors)
- Pan, W., and Frank, K.A. "An Approximation to the Distribution of the Product of Two Dependent Correlation Coefficients." Journal of Statistical Computation and Simulation, 74.
- Pan, W., and Frank, K.A. "A Probability Index of the Robustness of a Causal Inference." Journal of Educational and Behavioral Statistics, 28.
- Frank, K. "Impact of a Confounding Variable on the Inference of a Regression Coefficient." Sociological Methods and Research, 29(2).

Recent applications:

- Dietz, T., Frank, K.A., Whitley, C.T., Kelly, J., & Kelly, R. Political Influences on Greenhouse Gas Emissions from US States. Proceedings of the National Academy of Sciences, 112(27).
- Frank, K.A., Penuel, W.R., and Krause, A. What Is a "Good" Social Network for Policy Implementation? The Flow of Know-How for Organizational Change. Journal of Policy Analysis and Management, 34(2).
Quick Examples of Quantifying the Robustness to Invalidate an Inference

Downloads:
- spreadsheet for calculating indices [KonFound-it!©]
- quick examples
- powerpoint for replacement of cases
- powerpoint for correlation framework
- powerpoint for comparison of frameworks
- the value of controlling for pretests
- published empirical examples

R Shiny app for sensitivity analysis: http://konfound-it.com (KonFound-it!)

The whole R package is at: project.org/web/packages/konfound/

Stata:
. ssc install konfound
. ssc install moss
. ssc install matsort
. ssc install indeplist

Commands: konfound (for a procedure you ran in Stata), pkonfound (for a published example), mkonfound (for multiple analyses).
"Public Sociology endeavors to bring sociology into dialogue with audiences beyond the academy, an open dialogue in which both sides deepen their understanding of public issues. But what is its relation to the rest of sociology? It is the opposite of Professional Sociology – a scientific sociology created by and for sociologists – inspired by public sociology but, equally, without which public sociology would not exist. The relation between professional and public sociology is, thus, one of antagonistic interdependence." [Emphasis added.]

"The bulk of public sociology is indeed of an organic kind – sociologists working with a labor movement, neighborhood associations, communities of faith, immigrant rights groups, human rights organizations. Between the organic public sociologist and a public is a dialogue, a process of mutual education." (Burawoy 2004, page 8)

I say: robustness analysis can provide professional sociologists with a more precise and more interpretable language for pragmatic dialogue with the public.
Policy, Public, and Pragmatic Sociologies

[Diagram: Policy Sociology ($ → Action) and Public Sociology (X → Y), with Pragmatic Sociology asking "Is there enough evidence to act?" against the objection "You cannot prove!"]
Wilson, The Truly Disadvantaged
Robustness Analysis

Sensitivity analysis assesses how estimates change as a result of alternative specifications:
- changing covariates
- changing samples
- changing estimation techniques

Robustness analysis quantifies what it would take to change an inference, based on hypothetical data or conditions:
- unobserved covariates
- unobserved samples
Change the Discourse

Can you make a causal inference from an observational study? Of course you can. You just might be wrong. It's causal inference, not determinism. But what would it take for the inference to be wrong?
History of Robustness Analyses
A dialogue with the public!
Rosenbaum Summarizes the First Sensitivity Analysis

"The first sensitivity analysis in an observational study was conducted by Cornfield, et al. [6] for certain observational studies of cigarette smoking as a cause of lung cancer; see also [10]. Although the tobacco industry and others had often suggested that cigarettes might not be the cause of high rates of lung cancer among smokers, that some other difference between smokers and nonsmokers might be the cause, Cornfield, et al. found that such an unobserved characteristic would need to be a near perfect predictor of lung cancer and about nine times more common among smokers than among nonsmokers. While this sensitivity analysis does not rule out the possibility that such a characteristic might exist, it does clarify what a scientist must logically be prepared to assert in order to defend such a claim." (Rosenbaum, P. R. (2005). Sensitivity analysis in observational studies. Encyclopedia of Statistics in Behavioral Science, 4.)

See also:
- DiPrete, T. & Gangl, M. (2004). Assessing Bias in the Estimation of Causal Effects: Rosenbaum Bounds on Matching Estimators and Instrumental Variables Estimation with Imperfect Instruments. Sociological Methodology, 34.
- Lin, D.Y., Psaty, B.M., & Kronmal, R.A. (1998). Assessing the Sensitivity of Regression Results to Unmeasured Confounders in Observational Studies. Biometrics, 54(3).
- Robins, J., Rotnitzky, A., and Scharfstein, D. (2000). "Sensitivity Analysis for Selection Bias and Unmeasured Confounding in Missing Data and Causal Inference Models." Pp. 1-95 in Statistical Models in Epidemiology, the Environment and Clinical Trials (The IMA Volumes in Mathematics and Its Applications), edited by E. Halloran and D. Berry. New York: Springer-Verlag.
- VanderWeele, T.J. and Arah, O.A. (2011). Bias Formulas for Sensitivity Analysis of Unmeasured Confounding for General Outcomes, Treatments, and Confounders. Epidemiology, 22(1).
Policy, Public, and Pragmatic Sociologies: Smoking and Lung Cancer
[Diagram: smoke → cancer. Policy Sociology ($ → Action); Pragmatic Sociology asks "Is there enough evidence to act?" (Cornfield) against the tobacco companies' and Fisher's "You cannot prove!"]
Realizing the Potential of Robustness Analysis
- General linear model
- Account for sampling error (e.g., statistical significance)
- More accessible terms and language: correlations and people
- Beyond *, **, and ***: p-values rely on a sampling distribution, must be interpreted relative to standard errors, and lose information for modest and high levels of robustness

Rosenbaum, P. R. (2005). Sensitivity analysis in observational studies. Encyclopedia of Statistics in Behavioral Science, 4.
Case Replacement

Motivation: the file-drawer problem in meta-analysis: how many studies must be in the file drawer to make the combined results of observed and unobserved studies have a p-value of .05?

Extensions:
- Apply beyond meta-analysis
- Consider cases replaced as well as added
- Apply to internal and external validity
- Counterfactual framework
- Consider multiple mechanisms of bias (not just publication bias)
- Use any threshold for decision-making
- Develop in a full statistical hypothesis-testing framework (general linear model if desired)

Frank, K.A., Maroulis, S., Duong, M., and Kelcey, B. What Would It Take to Change an Inference? Using Rubin's Causal Model to Interpret the Robustness of Causal Inferences. Educational Evaluation and Policy Analysis, Vol. 35. *Frank, K.A. and Min, K. Indices of Robustness for Sample Representation. Sociological Methodology, Vol. 37. (*co-first authors)
Replacement of Cases Framework
How much bias must there be to invalidate an inference?

- Concerns about internal validity: what percentage of cases would you have to replace with counterfactual cases (with zero effect) to invalidate the inference?
- Concerns about external validity: what percentage of cases would you have to replace with cases from an unsampled population (with zero effect) to invalidate the inference?
[Figure: the estimated effect compared with the threshold δ#; the gap between them is the % bias necessary to invalidate the inference.]
Framework for Interpreting % Bias to Invalidate an Inference: Rubin's Causal Model and the Counterfactual

I have a headache. I take an aspirin (treatment). My headache goes away (outcome). Q: Is it because I took the aspirin? We'll never know – it is counterfactual – for the individual. This is the Fundamental Problem of Causal Inference (Holland 1986, JASA).
Approximating the Counterfactual with Observed Data

But how well does the observed data approximate the counterfactual? The fundamental problem of causal inference is that we cannot simultaneously observe a case under treatment and under control. [Table residue: treated and control cases with observed and counterfactual values.] The difference between the counterfactual values and the observed values for the control implies that the treatment effect of 4.00 is overestimated as 6 when using observed control cases with a mean of 4.

Holland, Paul W. 1986. "Statistics and Causal Inference." Journal of the American Statistical Association, 81: 945-70.
Using the Counterfactual to Interpret % Bias to Invalidate the Inference: Replacement with Average Values

How many cases would you have to replace with zero-effect counterfactuals to change the inference? Assume the threshold is 4 (δ# = 4) and the estimate is 6:

1 - δ#/estimate = 1 - 4/6 = .33 (1/3)

The inference would be invalid if you replaced 33% of the cases (here, 1 case) with counterfactuals for which there was no treatment effect.

New estimate = (1 - % replaced) × (original estimate) + % replaced × (no effect) = (1 - .33) × 6 = (2/3) × 6 = 4

(Note: the paired t for the counterfactual data is not relevant because the counterfactual data are constant – no correlation between pairs, so no correction in the standard error of the paired t.)
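The slide's arithmetic can be sketched in a few lines; a minimal Python illustration of the % bias index and the replacement identity (not the konfound package itself), using the toy estimate of 6 and threshold of 4:

```python
def pct_bias_to_invalidate(estimate, threshold):
    """Fraction of the estimate that must be due to bias for the
    estimate to fall to the threshold: 1 - threshold/estimate."""
    return 1 - threshold / estimate

def new_estimate(estimate, pct_replaced, replacement_effect=0.0):
    """Expected estimate after replacing a fraction of cases with
    cases carrying the replacement effect (zero for counterfactuals)."""
    return (1 - pct_replaced) * estimate + pct_replaced * replacement_effect

pct = pct_bias_to_invalidate(6, 4)     # 1/3: replace a third of the cases
shifted = new_estimate(6, pct)         # pulls the estimate down to the threshold, 4
```

Replacing exactly that fraction with zero-effect cases moves the estimate to the threshold, which is what "% bias to invalidate" means.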
Which Cases to Replace?

Think of it as an expectation: if you randomly replaced 1/3 of the cases and repeated this 1,000 times, on average the new estimate would be 4.

Assumptions:
- constant treatment effect
- conditioning on covariates and interactions already in the model
- cases carry equal weight

Current extensions include selective replacement, spillover, and weighted observations.
[Figure: the estimated effect compared with the threshold δ#; the gap is the % bias necessary to invalidate the inference.] To invalidate the inference, replace 33% of cases with counterfactual cases with zero effect.
Using an Unsampled Population to Invalidate the Inference (External Validity)

How many cases would you have to replace with cases with zero effect to change the inference? Assume the threshold is δ# = 4 and the estimate is 6:

1 - δ#/estimate = 1 - 4/6 = .33 (1/3)
[Figure: the estimated effect compared with the threshold δ#.] To invalidate the inference, replace 33% of cases with cases from an unsampled population with zero effect.
Example of Calculating the % Bias to Invalidate an Inference in an Observational Study: The Effect of Kindergarten Retention on Reading and Math Achievement (Hong and Raudenbush 2005)

1. What is the average effect of kindergarten retention policy? (Example used here.) Should we expect to see a change in children's average learning outcomes if a school changes its retention policy?

2. What is the average impact of a school's retention policy on children who would be promoted if the policy were adopted? Use principal stratification. (Propensity-based questions, not explored here.)

Hong, G. and Raudenbush, S. (2005). Effects of Kindergarten Retention Policy on Children's Cognitive Growth in Reading and Mathematics. Educational Evaluation and Policy Analysis, Vol. 27, No. 3, pp. 205-224. See page 208.
Data: Early Childhood Longitudinal Study, Kindergarten cohort (ECLS-K)

- US National Center for Education Statistics (NCES); nationally representative
- Kindergarten and 1st grade observed Fall 1998, Spring 1998, Spring 1999
- Student background and educational experiences
- Math and reading achievement (dependent variable); experience in class
- Parenting information and style
- Teacher assessment of student
- School conditions

Analytic sample (1,080 schools that do retain some children): 471 kindergarten retainees, 10,255 promoted students.
Estimated Effect of Retention on Reading Scores (Hong and Raudenbush)
Possible Confounding Variables (note: they controlled for these)

- Gender
- Two-parent household
- Poverty
- Mother's level of education (especially relevant for reading achievement)
- Extensive pretests measured in the Spring of 1999 (at the beginning of the second year of school): standardized measures of reading ability, math ability, and general knowledge; indirect assessments of literature, math, and general knowledge that include aspects of a child's process as well as product; teacher's rating of the child's skills in language, math, and science
Obtain df, Estimated Effect, and Standard Error

Estimated effect = -9.01; standard error = .68; n = 7,639; df > 500, so t critical = -1.96.

From: Hong, G. and Raudenbush, S. (2005). Effects of Kindergarten Retention Policy on Children's Cognitive Growth in Reading and Mathematics. Educational Evaluation and Policy Analysis, Vol. 27, No. 3, pp. 205-224.
Calculating % Bias to Invalidate the Inference

1) Calculate the threshold δ#. The estimated effect is statistically significant if |estimated effect| / standard error > |t critical|, i.e., |estimated effect| > |t critical| × standard error = δ#. Here δ# = 1.96 × .68 = 1.33.

2) Record the |estimated effect| = 9.01.

3) The % bias to invalidate the inference is 1 - δ#/|estimated effect| = 1 - 1.33/9.01 = .85.

85% of the estimate would have to be due to bias to invalidate the inference. You would have to replace 85% of the cases with counterfactual cases with 0 effect of retention on achievement to invalidate the inference.
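The three steps collapse into one small function; a minimal Python sketch of the index (the function name is mine, not the konfound API), applied to the retention estimate:

```python
def pct_bias_to_invalidate(estimate, se, t_critical=1.96):
    """Percent of the estimate that must be due to bias to invalidate
    the inference: 1 - threshold/|estimate|, threshold = t_crit * se."""
    threshold = t_critical * se          # delta-sharp
    return 1 - threshold / abs(estimate)

# Hong and Raudenbush's kindergarten retention estimate and standard error
pct = pct_bias_to_invalidate(-9.01, 0.68)   # about 0.85
```

The same function reproduces the slide's 85% when given the published estimate and standard error.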
In R Shiny app KonFound-it! (konfound-it.com/)

Inputs: estimated effect, standard error, number of observations, number of covariates. Take out your phone and try it!
konfound-it.com: R and Stata
In R Shiny app KonFound-it!
% Bias Necessary to Invalidate the Inference

1 - δ#/estimate = 1 - 1.33/9.01 = 85%

85% of the estimate must be due to bias to invalidate the inference; 85% of the cases must be replaced with null-hypothesis cases to invalidate the inference. [Figure labels: estimated effect; threshold δ#.]
Example: Replacement of Cases with Counterfactual Data to Invalidate the Inference of an Effect of Kindergarten Retention

[Figure: original distribution; original cases that were not replaced; replacement counterfactual cases with zero effect, for retained and promoted groups.] The retained group shifts more because the counterfactual data were assigned the mean of the overall data, which is dominated by the promoted group.
Evaluation of % Bias Necessary to Invalidate the Inference

1) Pragmatic, contentious sociology: 50% cut-off – for every case you remove, I get to keep one.

2) Compare the bias necessary to invalidate the inference with the bias accounted for by background characteristics: 1% of the estimated effect is accounted for by background characteristics (including mother's education), once controlling for pretests. Other sources of bias would have to be 85 times more important than background characteristics. (Rosenbaum, P. R. (1986). Dropping out of high school in the United States: An observational study. Journal of Educational Statistics, 11(3).)

3) Compare with the % bias necessary to invalidate the inference in other studies, using a correlation metric that adjusts for differences in scale.
Compare with Bias in Other Observational Studies
% Bias to Invalidate the Inference for Observational Studies online in EEPA, July 24-Nov 15, 2012

[Figure: distribution of % bias to invalidate across studies, as in Rosenbaum, Dropping out of high school in the United States: An observational study, Journal of Educational Statistics, 11; the estimated kindergarten retention effect is marked.]
% Bias to Invalidate versus p-value: A Better Language?

[Figure: % bias to invalidate plotted against p-value thresholds (*, **, ***) for the kindergarten retention and Catholic schools estimates. df = 973, based on Morgan's analysis of Catholic school effects; the functional form is not sensitive to df.]
Beyond *, **, and ***

p-values:
- sampling distribution framework
- must be interpreted relative to standard errors
- information lost for modest and high levels of robustness

% bias to invalidate:
- counterfactual framework
- interpreted in terms of case replacement
- information along a continuous distribution

Rosenbaum, P. R. (2005). Sensitivity analysis in observational studies. Encyclopedia of Statistics in Behavioral Science, 4.
% Bias Necessary to Invalidate the Inference and the Confidence Interval

[Figure: when the lower bound of the confidence interval is "far from 0," the estimate exceeds the threshold δ# by a large amount; the % bias to invalidate shrinks as the confidence interval moves closer to 0.]
% Bias to Invalidate the Inference and Bounding

Bounding (e.g., Altonji et al., Manski) produces statements such as: "If unobserved covariates are similar to observed covariates, the lowest estimate would be xxx" (a focus on the estimate). This is very different from: "The unobserved factors would have to account for xxx% of the estimate to invalidate the inference" (a pragmatic focus on the inference).

Altonji, J.G., Elder, T., and Taber, C. (2005). An Evaluation of Instrumental Variable Strategies for Estimating the Effects of Catholic Schooling. Journal of Human Resources, 40(4). Manski, C.F. (1990). Nonparametric Bounds on Treatment Effects. The American Economic Review, 80(2).
Policy Sociology: Ordered Thresholds Relative to Transaction Costs

Definition of threshold: the point at which evidence from a study would make one indifferent to policy choices.

1. Changing beliefs, without a corresponding change in action
2. Changing action for an individual (or family)
3. Increasing investments in an existing program
4. Initial investment in a pilot program where none exists
5. Dismantling an existing program and replacing it with a new program
Range of Thresholds for Kindergarten Retention Inference
Application of % Bias to Invalidate an Inference to a Randomized Experiment with Concern about External Validity

Chetty, R., Hendren, N., & Katz, L.F. (2016). The Effects of Exposure to Better Neighborhoods on Children: New Evidence from the Moving to Opportunity Experiment. American Economic Review, 106(4).
OLS for Intent to Treat

Intent to treat (ITT) is just OLS; treatment on the treated (TOT) is IV.
Note that the TOT standard error is more than double the ITT standard error. OLS with sensitivity! There are 52 controls in the model with controls.
[Figure: estimated effect and threshold.] 20% of the estimated effect would have to be due to bias to invalidate the inference of an effect of MTO for children under 13 on total earnings.
Fundamental Problem of Inference and Approximating the Unsampled Population with Observed Data (External Validity)

How many cases would you have to replace with cases with zero effect to change the inference? Assume the threshold is δ# = 4 and the estimate is 6: 1 - δ#/estimate = 1 - 4/6 = .33 (1/3).
Interpretation of % Bias Necessary to Invalidate the Inference: Sample Representativeness for Moving to Opportunity

To invalidate the inference: 20% of the estimate must be due to sampling bias to invalidate the inference of an effect of Moving to Opportunity on all earnings for those under 13. You would have to replace 20% of the cases (about 585 people) with cases in which MTO had no effect to invalidate the inference. We have quantified the discourse about the concern of external validity.
Pragmatism and the Fundamental Problem of External Validity

Pre-experiment population ≠ post-experiment population. The fundamental problem of external validity: the more influential a study, the more the pre and post populations differ, and the less the results apply to the post-experimental population (Ben-David; Kuhn). All the more so if the difference is due to the design (Burtless, 1995). [Diagram: Public Sociology, $ → Action, X → Y.]
Applications

Regression based:
- Propensity scores; "doubly robust" (implies a linear model at the final stage)
- Diff-in-diff
- Regression discontinuity: linear model, with proper functional form, in the final stage
- Comparative interrupted time series: confound may be associated with pre vs. post intervention or with treatment groups

Counterfactual comparisons (Hong, Pearl): all those that applied to regression, plus any time differences in means are compared, e.g., propensity matching.

Weighted least squares (logistic, multilevel):
- must replace a case with cases equaling the same weight
- must assume weights don't change
- must believe the standard errors if statistical significance is used as a threshold
The Impact of a Confounding Variable: Correlational Framework

Accounting for the dual relationships associated with an alternative factor. [Path diagram: Smoking (X) → Lung Cancer (Y), with a confounding variable (CV) related to both through rcv·x and rcv·y; impact = rcv·x × rcv·y.]
The Impact of a Confounding Variable: Accounting for the Dual Relationships Associated with an Alternative Factor, Linear Relationships

[Path diagram: Retention (X) → Achievement (Y), with rx·y and rxy|cv; mother's education (CV) related to both through rcv·x and rcv·y; impact = rcv·x × rcv·y. Each correlation is weighted in proportion to the size of the other.]
How Regression Works: Partial Correlation

Partial correlation: the correlation between x and y, where x and y have been controlled for the confounding variable. Implied by the multivariate estimator B = [X'X]⁻¹X'Y.

SPSS:
CORRELATIONS /VARIABLES=y s confound /PRINT=TWOTAIL NOSIG /MISSING=PAIRWISE.

SAS:
proc corr data=one; var y s confound; run;

[Correlation matrix residue for momed, retention, and achievement: values -.075, -.152, .203.]
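The partial correlation has a closed form; a minimal Python sketch, where the mapping of the slide's three correlations to variable pairs is my assumption (r(retention, achievement) = -.152, r(momed, retention) = -.075, r(momed, achievement) = .203):

```python
import math

def partial_r(r_xy, r_xcv, r_ycv):
    """Correlation between x and y after controlling for cv."""
    return (r_xy - r_xcv * r_ycv) / math.sqrt((1 - r_xcv**2) * (1 - r_ycv**2))

# Hypothetical assignment of the slide's correlation values:
# r(retention, achievement), r(momed, retention), r(momed, achievement)
r = partial_r(-0.152, -0.075, 0.203)   # roughly -0.140
```

The adjustment is small here because mother's education is only weakly correlated with retention, which anticipates the large % bias to invalidate found above.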
Use the Impact of a Confounding Variable on a Regression Coefficient for Robustness Analysis

Statements such as:
- An omitted variable must have an impact of qqq to invalidate the inference of an effect of X on Y.
- An omitted variable must be correlated at rrr with X and Y to invalidate an inference of an effect of X on Y.

I call this the Impact Threshold of a Confounding Variable (ITCV).
Quantifying the Robustness of the Inference: Calculating the Impact Threshold of a Confounding Variable on a Regression Coefficient

Step 1: Establish the correlation between the predictor of interest and the outcome.
Step 2: Define a threshold for inference.
Step 3: Calculate the threshold for the impact necessary to invalidate the inference.

Frank, K. "Impact of a Confounding Variable on the Inference of a Regression Coefficient." Sociological Methods and Research, 29(2).
Obtain t Critical, Estimated Effect, and Standard Error: Hong and Raudenbush, Kindergarten Retention

df > 500, so t critical = -1.96. Estimated effect = -9.01; standard error = .68.

From: Hong, G. and Raudenbush, S. (2005). Effects of Kindergarten Retention Policy on Children's Cognitive Growth in Reading and Mathematics. Educational Evaluation and Policy Analysis, Vol. 27, No. 3, pp. 205-224.
Establish Correlation and Threshold for Kindergarten Retention on Achievement

Step 1: calculate the treatment effect as a correlation: r = t / √(t² + n - q - 1), with t taken from the HLM: t = -9.01/.68 = -13.25; n is the sample size (7,639); q is the number of parameters estimated (predictor + omitted variable = 2).

Step 2: calculate the threshold for inference (e.g., statistical significance): r# = t critical / √(t critical² + n - q - 1), where t critical is the critical value for df > 200. r# can also be defined in terms of effect sizes.
Step 3a: Calculate the Impact Necessary to Invalidate the Inference

Assume rx·cv = ry·cv (this maximizes the difference between rxy and rxy|cv). Then impact = rx·cv × ry·cv = rx·cv² = ry·cv². Set rx·y|cv = r# and solve for the impact to find the impact threshold of a confounding variable: ITCV = (rxy - r#) / (1 - |r#|). Here ITCV < 0 because rxy < 0. The inference would be invalid if |impact of a confounding variable| > .130.
Step 3b: Component Correlations and Interpretation

ITCV = .130. If rx·cv = ry·cv = r, then impact = rx·cv × ry·cv = r², so r = √.130 = .361. An omitted variable must be correlated at .361 with retention and with achievement (with opposite signs) to invalidate the inference of an effect of retention on achievement. To conceptualize multiple omitted variables, think of a latent variable, with declining conditional impacts.
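Steps 1 through 3b can be sketched end to end; a minimal Python sketch of the formulas as given on these slides (the function name and signature are mine, not the konfound package):

```python
import math

def itcv(estimate, se, n, q=2, t_critical=1.96):
    """Impact threshold of a confounding variable (Frank-style sketch):
    convert the effect and the significance threshold to correlations,
    then solve for the impact that moves the partial to the threshold."""
    df = n - q - 1
    t = estimate / se
    r_xy = t / math.sqrt(t**2 + df)                    # effect as a correlation
    r_thresh = t_critical / math.sqrt(t_critical**2 + df)
    if r_xy < 0:
        r_thresh = -r_thresh                           # match sign of the effect
    impact = (r_xy - r_thresh) / (1 - abs(r_thresh))   # ITCV
    r_cv = math.sqrt(abs(impact))                      # component if r_xcv = r_ycv
    return impact, r_cv

# kindergarten retention: estimate -9.01, se .68, n = 7,639, q = 2
impact, r_cv = itcv(-9.01, 0.68, 7639)
# slide values: |ITCV| about .130, component correlations about .361
```

Running it with the retention numbers reproduces the slide's thresholds.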
Impact Threshold for a Confounding Variable: Path Diagram
Look at Both Correlations

[Figure: contours of rx·cv × ry·cv; impacts above the blue line invalidate the inference. The smallest impact that invalidates the inference is at rx·cv = ry·cv = .361.]

Imbens, G.W. (2003). Sensitivity to Exogeneity Assumptions in Program Evaluation. American Economic Review, 93(2), page 130. Carnegie, N.B., Harada, M., and Hill, J.L. (2016). Assessing Sensitivity to Unmeasured Confounding Using a Simulated Potential Confounder. Journal of Research on Educational Effectiveness, 9(3).
Other Features of the Impact Threshold for a Confounding Variable (Frank, 2000)

- Covariates already in the model: can report the ITCV as partial or zero-order correlations (Cinelli and Hazlett, "Making Sense of Sensitivity: Extending Omitted Variable Bias," prefer zero-order, relative to Imbens 2003)
- Suppression (or type II error for an estimate that is not significant)
- Derived initially in terms of the estimate and standard error
- Can apply to an unreliably measured covariate
Path Diagram is Fundamentally Sociological
Comparison of Frameworks

Case replacement:
- Good for experimental settings (treatments) or linear models
- Think in terms of cases (people or schools), counterfactual or from an unsampled population
- Assumes equal effect of replacing any case (or weighted cases with weighted replacements)
- Good for comparing across studies

Correlational:
- Uses causal diagrams (Pearl)
- Linear models only
- Think in terms of correlations and variables
- ITCV good for internal validity; not good for comparing between studies (different thresholds make comparison difficult)
New Directions: Integrated Framework

- Full Bayesian approach that applies to internal and external validity (Tenglong Li)
- Value added: for a teacher or school rated as ineffective, how many students would you have to replace with average students to achieve an effective rating? (Qinyun Lin)
- Application to comparative interrupted time series (CITS) (Spiro Maroulis)
- Mediation (Qinyun Lin)
- Non-linear (e.g., logistic) or multilevel regression: inference made from estimate/standard error (not deviance); can be thought of as weighted least squares; assumes weights not related to replacement of cases
- Propensity score matching: how many bad matches would you have to have to invalidate the inference? (Rubin liked this idea)
[Slide: network contexts for application: district administrators, networked improvement communities, professional learning communities, social media, formal leaders, assignment of students to teachers.]
Replacement of Cases in a Value-Added Example

When we replace students, the change in the VAM comes from the differences between the original students and the replacement students; the other students, who are not replaced, do NOT change their scores in the replacement process. A teacher could argue that her VAM would have been better with different students. How many?

This figure illustrates the student-replacement idea with toy data. Teacher Ashley has a value-added score of 0.14, which could be the average of her students' gain scores in a very simple value-added setting, but the threshold of proficiency is 0.15. Assume Ashley has 20 students, whose gain scores have the distribution shown as the black and grey parts of the figure. Hypothetically, to improve Ashley's value-added score from 0.14 to 0.15, we can replace two students (the grey parts) with two grade-average students whose gain scores are 0.16 (the white parts with black outline). This counterfactual thought experiment tells us that by replacing only two students with grade-average students, Ashley could reach the threshold of proficiency. Or: replacing 10% of her students with grade-average students gets her above the threshold.
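The counting argument above can be written as a small search; a hedged Python sketch in which the individual gain scores are hypothetical, chosen only to match the slide's toy numbers (20 students, mean .14, replacements at .16, threshold .15):

```python
def students_to_replace(scores, replacement_score, threshold):
    """Smallest number of lowest-scoring students to swap for
    grade-average students before the class mean reaches the threshold."""
    s = sorted(scores)
    for k in range(len(s) + 1):
        new = [replacement_score] * k + s[k:]
        if sum(new) / len(new) >= threshold - 1e-9:   # tolerance for float sums
            return k
    return None  # unreachable even after replacing everyone

# Hypothetical gain scores for "Ashley": 20 students averaging .14
scores = [0.05, 0.07] + [0.14] * 2 + [0.15] * 16
k = students_to_replace(scores, replacement_score=0.16, threshold=0.15)
# two students, i.e., 10% of the class, matching the slide's example
```

The greedy choice (replace the lowest scores first) gives the minimal count because each swap raises the mean by the largest possible amount.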
Case Replacement with Spillover for SUTVA Violations

[Figures: distributions of gain scores before and after replacement. Green: original students who were replaced; red: replacement students, who can trigger spillover effects; black: original students who stay in the class.]

Consider Ashley's 20 students again. The first figure represents the 20 students' distribution of gain scores before replacement. We then replace the students indicated by the green part with the students represented by the red part. These two groups of students have comparable gain scores but different background characteristics. Because of those different characteristics, the new students bring spillover effects to the students who stay in the class; that is why the students indicated in black move slightly to the right. This is exactly the spillover effect, and it is also the change in the teacher's value-added score after the replacement.
Policy, Public, and Pragmatic Sociologies

[Diagram: Policy Sociology ($ → Action) and Public Sociology (X → Y), with Pragmatic Sociology asking "Is there enough evidence to act?" (Cornfield) against Fisher's "You cannot prove!"]