
A Re-Evaluation of The Tennessee STAR Project




1 A Re-Evaluation of The Tennessee STAR Project
Alexander Lebedinsky, PhD, and Adam Pendry

2 What is The STAR Project?
Conducted in 1985, the Tennessee student teacher achievement ratio (STAR) project was a random treatment control group study. It’s implementation was designed to isolate the treatment effect of smaller class sizes on student test scores. Students & Teachers Small 13-17 Regular 22-25 Regular +Aide 22-25

3 Further Details In total, 80 schools from across the state of Tennessee were involved. These schools signed up voluntarily and were not randomly chosen. A school had to have a minimum of 57 students to be eligible (enough to fill one class of each type). The first year saw 6,500 students distributed across 330 classrooms.

4 Importance
Mosteller (1995), "The Tennessee Study of Class Size in the Early School Grades": cited 554 times as of June 17, 2014.
Krueger (1999), "Experimental Estimates of Education Production Functions": cited 1,282 times as of June 17, 2014.
Boozer and Cacciola (2001), "Inside the 'Black Box' of Project STAR: Estimation of Peer Effects Using Experimental Data": cited 112 times as of June 17, 2014.

5 Criticism Parental involvement and selection bias.
Given that the students' parents were informed about the experiment, some opted for their children to be in the treatment group. This led to a selection bias toward the smaller class sizes. Attrition and addition rates: throughout the school year students would enter and exit the project, leading to imperfect conditions. Although not too prevalent in kindergarten, later grades were far more susceptible.

6 Precursor to Our Research ‘Bad Apples’
How do bad apples affect classroom performance?

7 Findings After calculating leave-out means to determine 'bad apples', we stumbled upon this:

Student   StudentScore   ClassAvg   ClassSd
1         21.93          6.52       6.02
2         20.35          6.62       6.27
3         19.65          6.66       6.37
4         11.86          7.15       7.05
5          7.83          7.40
6          6.84          7.46
7          6.19          7.50       7.14
8          5.40          7.55       7.13
9          5.36
10         5.17          7.57       7.12
11         4.78          7.59       7.11
12         3.89          7.65       7.09
13         2.53          7.73       7.03
14         1.91          7.77       7.00
15         1.37          7.80       6.97
16         1.07          7.82       6.95
17         0.08          7.88       6.88
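A minimal sketch of this 'bad apple' screen in pandas, assuming a student-level dataset with class_id and score columns (illustrative names, not the STAR originals): for each student it computes the average and standard deviation of the other students in the same class.

```python
import pandas as pd

def leave_out_class_stats(df: pd.DataFrame) -> pd.DataFrame:
    """For every student, mean and sd of the other students in the same class."""
    out = df.copy()
    g = out.groupby("class_id")["score"]
    n = g.transform("count")
    total = g.transform("sum")
    sq_total = g.transform(lambda s: (s ** 2).sum())
    # Leave-out mean: class total minus the student's own score, over n - 1.
    out["class_avg_loo"] = (total - out["score"]) / (n - 1)
    # Leave-out sd from the leave-out sum of squares (population form).
    var_loo = (sq_total - out["score"] ** 2) / (n - 1) - out["class_avg_loo"] ** 2
    out["class_sd_loo"] = var_loo.clip(lower=0) ** 0.5
    return out
```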

8 Shifting Focus The previous findings inspired the question: is STAR truly random assignment? We seek to measure the non-randomness using the first two moments. Two-sample t-tests: using leave-out means and unequal variances. Ratio of variances: compare ratios of groups defined by the previous t-test statistics.

9 Two Sample T-test & Leave-Out Means
[Diagram: two example class averages compared against their school leave-out means of 71.1 and 63.1, giving t-stats of -5.3 and 2.8.] When we run the t-test we compare the excluded class average with the leave-out mean of the rest of the school. The results from this example show t-stats of -5.3 and 2.8 respectively; under our cut-off of 3 standard deviations the first class is flagged, while the second falls just under our specification of a 'flagged' class.
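A sketch of the flagging rule under the same illustrative column names (school, class_id, score): each class is compared against all other classes in its school with an unequal-variance (Welch) two-sample t-test, and flagged when |t| exceeds the cut-off of 3.

```python
import pandas as pd
from scipy import stats

def flag_classes(df: pd.DataFrame, cutoff: float = 3.0) -> pd.DataFrame:
    """Welch t-test of each class against the rest of its school."""
    rows = []
    for (school, class_id), cls in df.groupby(["school", "class_id"]):
        rest = df[(df["school"] == school) & (df["class_id"] != class_id)]
        # equal_var=False gives the unequal-variance (Welch) version used on the slide.
        t_stat, _ = stats.ttest_ind(cls["score"], rest["score"], equal_var=False)
        rows.append({"school": school, "class_id": class_id,
                     "t_stat": t_stat, "flagged": abs(t_stat) > cutoff})
    return pd.DataFrame(rows)
```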

10 Original Dataset Results: Means
For a cut-off t-stat of 3, 36 classes were flagged across 17 schools. 16 small classes: 13 of 16 (81.25%) were positive outliers, more than 3 standard errors above the school leave-out mean. 20 regular and regular-with-aide classes: 15 of 20 (75%) were negative outliers, more than 3 standard errors below the school leave-out mean.

11 Ratio of Variances Calculated as the ratio between school variance and class variance, with the larger variance in the denominator. Assuming no selection bias, we expect the ratio of class to school variance to be near 1.0. Assuming a selection bias, we would expect classes to have a smaller variance and thus a ratio below 1.0.
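A small sketch of this calculation, again with illustrative inputs; it returns a value of at most 1.0 because the larger variance is placed in the denominator.

```python
import pandas as pd

def variance_ratio(class_scores: pd.Series, school_scores: pd.Series) -> float:
    """Ratio of class to school variance, larger variance in the denominator."""
    v_class = class_scores.var(ddof=1)
    v_school = school_scores.var(ddof=1)
    # Near 1.0 under random assignment; below 1.0 if classes were selected
    # from a narrower slice of the school's score distribution.
    return min(v_class, v_school) / max(v_class, v_school)
```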

12 Ratio of Variances [Figure: hypothetical test-score distributions (0-100 scale) for a Small class and a Regular class.] Assuming a selection bias where small classes were picked from the top end of the distribution and vice-versa.

13 Ratio of Variances [Figure: the Small- and Regular-class score distributions superimposed on the school distribution, with the school variance marked; test scores on a 0-100 scale.] Here we see the distributions of Small and Regular classes superimposed on that of the school.

14 Original Dataset Results: Variance Ratios
When comparing the ratio of variances between the flagged and non-flagged classes: flagged classes had an average ratio of 0.85, while non-flagged classes had an average ratio of 0.97. The combined results of both moments suggest some type of selection bias occurring in the dataset.

15 Simulation What if the STAR experiment were repeated one million times with built-in random assignment? What proportion of classes are flagged? What does the ratio of variances look like? Are smaller classes more likely to be positive outliers?

16 Simulation: Setup Using the original data.
Reshuffle the students within each of the 79 schools. Re-calculate the t-stats to flag any outlying classes. Repeat the entire operation one million times. From the final results, calculate the expected proportion of flagged classes and compare the mean values for the variance ratios.
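A condensed sketch of this loop, assuming the same illustrative columns and the flag_classes helper sketched earlier; the one-million iteration count is left as a parameter.

```python
import numpy as np
import pandas as pd

def simulate_flag_counts(df: pd.DataFrame, n_iter: int = 1_000_000,
                         seed: int = 0) -> pd.Series:
    """Reshuffle scores within schools, re-flag classes, record flagged counts."""
    rng = np.random.default_rng(seed)
    counts = []
    for _ in range(n_iter):
        shuffled = df.copy()
        # Permute students within each school; class sizes stay fixed because
        # the class_id column is left untouched.
        shuffled["score"] = df.groupby("school")["score"].transform(
            lambda s: rng.permutation(s.to_numpy())
        )
        flags = flag_classes(shuffled)  # t-stat flagging sketched earlier
        counts.append(int(flags["flagged"].sum()))
    return pd.Series(counts)
```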

17 Simulation: Student Scores
Given that we are using the original data, we had to remove the treatment effect prior to re-shuffling. To do this we use the residuals from Krueger's (1999) original regression. Using these residuals as a base 'intelligence' score for each student, we then add back the beta-coefficient values of the student characteristics to calculate the new percentile score. This method allows us to reshuffle the students in the absence of the treatment effect.

18 Simulation: Student Scores
Example students (characteristics feeding the score reconstruction):
Student 1: SES 1, Female 0, Race 1; score 54
Student 2: SES 0, Female 1, Race 1; score 34
Student 3: SES 0, Female 0, Race 0; score 67
Student 4: SES 1, Female 0, Race 1; score 23
Student 5: SES 1, Female 1, Race 0; score 78
Each score is built from the array of student residuals via
StudentScore = α + β1·SocioEconStatus + β2·Female + β3·Race + residual,
with β2 = 4.705 and β3 = 9.540.
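A minimal sketch of that reconstruction: the student's residual from Krueger's (1999) regression plus the contributions of their own characteristics. The β2 and β3 values mirror the slide; the intercept and the SES coefficient are left as parameters because they are not shown here.

```python
def treatment_free_score(residual: float, ses: int, female: int, race: int,
                         alpha: float, b_ses: float,
                         b_female: float = 4.705, b_race: float = 9.540) -> float:
    """Krueger residual plus characteristic contributions, with no class-size term."""
    return alpha + b_ses * ses + b_female * female + b_race * race + residual
```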

19 Simulation: Results When we held the treatment effect at 5.55, the highest frequency of flagged classes was 8. At two times the treatment effect, the highest frequency was 14; at three times the treatment effect, it was 32. Despite having a treatment effect three times larger than the original, we still did not see 36 flagged classes.

20 Conclusion: Randomization
Using a combination of the original data and a simulation: given that we did not see 36 flagged classes in the simulation of one million iterations, we conclude that the probability of such an event occurring under random assignment is virtually zero.

21 Conclusion: Regression
Results of the original regression:
Variable                 Parameter Estimate   Standard Error   P-Value
Intercept                 54.62               2.94             <.0001
Small Class Size           5.55               0.76
Regular Class Size         0.22               0.73             0.77
White                      9.54               1.27
Female                     4.70               0.60
Socio-Economic Status    -13.21
Teacher Race              -1.09               1.21             0.37
Teacher Exp.               0.27               0.06
Teacher Degree            -1.01               0.79             0.20

Results with flagged schools removed:
Variable                 Parameter Estimate   Standard Error   P-Value
Intercept                 54.48               3.59             <.0001
Small Class Size           2.92               0.96             0.0024
Regular Class Size         1.05               0.92             0.25
White                     10.15               1.62
Female                     4.41               0.76
Socio-Economic Status    -13.86               0.87
Teacher Race              -0.13               2.23             0.95
Teacher Exp.               0.16               0.08             0.05
Teacher Degree            -1.19               0.93             0.20

Removing the flagged schools cuts the estimate of the treatment effect roughly in half.

22 Further Investigation
Test other variables for randomization: gender, race, socio-economic status, and age. Using the simulation data, run a regression for each iteration and calculate a treatment-effect beta, then plot the distribution of the treatment effect for hypothesis testing. Remove flagged classes from each simulation, measure the reduction in the treatment effect, and compare it with the reduction seen in the original data.
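A sketch of the proposed per-iteration regression, assuming each simulated dataset carries score and small_class columns (illustrative names) and using statsmodels for the OLS fit; the resulting distribution of betas can be compared against the original estimate of 5.55.

```python
import numpy as np
import statsmodels.api as sm

def simulated_treatment_betas(simulated_datasets) -> np.ndarray:
    """OLS treatment-effect beta for every simulated dataset."""
    betas = []
    for sim in simulated_datasets:
        X = sm.add_constant(sim[["small_class"]])
        fit = sm.OLS(sim["score"], X).fit()
        betas.append(fit.params["small_class"])
    return np.asarray(betas)

# Share of simulated betas at least as large as the original estimate of 5.55:
# p_value = (simulated_treatment_betas(sims) >= 5.55).mean()
```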

