Analyzing Stability in Colorado K-12 Public Schools Final Presentation Shelby Harold INFO 3110
What is Stability? Stability measures the share of students who stayed within a school district during the year. Students leave a school district for many reasons: moving houses, opting into a better district, dropping out, graduating early, etc. The stability rate is the percentage of students who remained in a district without interruption throughout the school year. Stability Rate = Count of Students Who Remained in the District ÷ Students at the Beginning of the Year. For example, if a school started with 100 students and 87 of those students remained in the district at the end of the year, the stability rate would be 87%. Even if other kids transferred in, it would still be 87%.
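The definition above can be sketched in a few lines of Python. The function name and its input checks are illustrative assumptions, not part of the Colorado Department of Education's methodology.

```python
def stability_rate(students_at_start: int, students_remaining: int) -> float:
    """Percentage of the starting enrollment that stayed all year.

    Hypothetical helper: only students counted at the start of the year
    appear in either number, so transfers in do not change the result.
    """
    if students_at_start <= 0:
        raise ValueError("students_at_start must be positive")
    if not 0 <= students_remaining <= students_at_start:
        raise ValueError("students_remaining must be between 0 and students_at_start")
    return 100.0 * students_remaining / students_at_start


# Slide example: 100 students at the start of the year, 87 of them remained.
rate = stability_rate(100, 87)
print(f"Stability rate: {rate:.1f}%")
```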
Why is it Important? Having a stable student base is essential for a school’s future success Schools that have higher stability rates perform better Teachers make better connections with the students and can tailor their teaching accordingly
Why is this important to me? My passion is reforming public K-12 schools in America. I am interested in what makes our schools work and what doesn't. If increasing stability is the way to better our public schools, then investigating what affects stability is important.
My Data Data collected and published by Colorado Department of Education Combination of 19 data sets Experimental Region: School Districts (179)
More Data Stuff SUBGROUPS: Race, Gender, Grade Level, Students with Disabilities, English as Second Language, Economically Disadvantaged, Homeless Students. VARIABLES: Average Salary of Teachers, Average Salary of Principals, Funding per Pupil, Percentage of Free and Reduced Lunch, Graduation Rate, Urban vs. Rural, Average Home Value.
Overall Aim of the Project Analyze what factors affect stability. Which subgroups are the most unstable? NOT analyzing the effects of stability, but rather what may affect it.
Basic Descriptive Statistics of the Stability Rates in Colorado Mean: 81.52 SE Mean: .429 Standard Deviation: 5.744 Minimum: 63.4 Q1: 77.8 Median: 82.6 Q3: 85.6 Maximum: 93.7
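As a quick consistency check on the numbers above, the standard error of the mean should equal the standard deviation divided by the square root of the sample size. The sketch below assumes all 179 districts from the "My Data" slide contributed a stability rate; that sample size is an assumption, since the statistics slide does not state n.

```python
import math

# Values copied from the descriptive statistics slide.
std_dev = 5.744
q1, q3 = 77.8, 85.6

# Assumption: every one of the 179 districts has a stability rate.
n_districts = 179

se_mean = std_dev / math.sqrt(n_districts)  # standard error of the mean
iqr = q3 - q1                               # interquartile range

print(f"SE of the mean: {se_mean:.3f}")
print(f"Interquartile range: {iqr:.1f}")
```

Under that assumption the computed standard error rounds to the reported .429, so the reported values are consistent with each other.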
Dot Plot of Overall Stability Rate The data looks pretty normally distributed so unfortunately there might not be a need for non-parametric statistics.
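Is it really normally distributed? In Minitab, there is a normality test that determines how well a given set of data follows a normal distribution. The null hypothesis of that test is that the data are normal, so a very low p-value is evidence against normality, not for it. The dot plot looks roughly bell-shaped, but a low p-value here means the data depart from a perfectly normal distribution.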
Correlation of Variables I constructed a correlation chart to determine if any of my variables were related. If two variables have too high of a correlation, it could distort regression results. There is no need for two variables that are highly correlated; one will do.
Correlated Variables Pearson Correlation: -.66 Pearson Correlation: .564 Pearson Correlation: -.66
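The correlation screening described above can be sketched as follows. The variable names, the data values, and the 0.5 cutoff are all hypothetical placeholders; the slides do not say which pairs produced the reported correlations or what cutoff was used.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical district-level values, for illustration only.
variables = {
    "frl_pct": [20.0, 35.0, 50.0, 65.0, 80.0],
    "grad_rate": [92.0, 88.0, 85.0, 78.0, 74.0],
    "funding_k": [9.1, 8.7, 9.4, 8.9, 9.2],
}

THRESHOLD = 0.5  # assumed cutoff for "too highly correlated"

names = list(variables)
flagged = []
for i, first in enumerate(names):
    for second in names[i + 1:]:
        r = pearson_r(variables[first], variables[second])
        if abs(r) >= THRESHOLD:
            flagged.append((first, second, round(r, 3)))

for first, second, r in flagged:
    print(f"{first} vs {second}: r = {r}")
```

Any pair that is flagged is a candidate for dropping one of the two variables before running a regression.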
Test 1: Is the mean stability rate different between urban and rural schools? Null: The mean stability rates of urban and rural school districts are equal. Alternative: The mean stability rates of urban and rural school districts are different. Two Sample Test Binary Variable: Urban and Rural Continuous: Stability Rate
Dot Plot Urban schools look like they might be slightly more stable. Let’s see!
Two Sample Test As predicted, the mean of the urban schools was higher. However, the difference is not statistically significant. The p-value is too high and the confidence interval contains 0. Therefore, we fail to reject the null hypothesis. We do not have evidence of a difference in the means of urban and rural schools.
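The decision rule above (a high p-value and a confidence interval that contains 0 go together) can be illustrated with a small sketch. This is not Minitab's procedure: it uses a large-sample normal approximation instead of the t-distribution so that it runs on the Python standard library alone, and the urban and rural values are made up.

```python
import math
from statistics import NormalDist, mean, variance


def two_sample_test(a, b, confidence=0.95):
    """Compare two means with unequal variances (normal approximation).

    Returns the difference in means, a two-sided p-value, and a
    confidence interval for the difference.
    """
    diff = mean(a) - mean(b)
    std_err = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = diff / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    critical = NormalDist().inv_cdf(0.5 + confidence / 2)
    return diff, p_value, (diff - critical * std_err, diff + critical * std_err)


# Hypothetical stability rates, for illustration only.
urban = [83.1, 80.4, 85.0, 79.2, 84.3, 81.7]
rural = [81.0, 78.5, 84.1, 80.2, 82.6, 79.9]

diff, p_value, (ci_low, ci_high) = two_sample_test(urban, rural)
print(f"difference in means: {diff:.2f}")
print(f"p-value: {p_value:.3f}")
print(f"95% CI: ({ci_low:.2f}, {ci_high:.2f})")

if p_value > 0.05:
    print("Fail to reject the null: no evidence of a difference in means.")
else:
    print("Reject the null: the means differ.")
```

With samples this small the normal approximation is rough; with a few hundred districts it gets close to the t-based result.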
Test 2: Does funding have an effect on stability rate? Null: There is no correlation between funding per pupil and stability rate. Alternative: There is a correlation between funding per pupil and stability rate. Simple Linear Regression Continuous Input: Funding Per Pupil Continuous Output: Stability Rate In other words, does funding per pupil predict stability rate?
Scatter Plot It’s not looking so good Let’s run the regression anyways!
Simple Linear Regression Output The p-value of .852 is way too high!! R-squared is super low Because the p-value is higher than .1, we fail to reject the null. We cannot say that there is a correlation between funding per pupil and stability. We cannot use funding as a predictor.
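For readers who want to see what the regression output summarizes, here is a minimal least-squares sketch. The funding and stability values are invented for illustration, and the function computes only the slope, intercept, and R-squared, not the p-value that Minitab reports.

```python
def simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical funding per pupil (dollars) and stability rates (percent).
funding = [8200.0, 8900.0, 9400.0, 10100.0, 11000.0, 12500.0]
stability = [82.0, 79.5, 84.1, 80.3, 83.2, 81.0]

slope, intercept, r_squared = simple_linear_regression(funding, stability)
print(f"slope: {slope:.6f}, intercept: {intercept:.2f}, R-squared: {r_squared:.3f}")
```

An R-squared near 0 means the fitted line explains almost none of the variation in stability rate, which is the pattern the slide describes for funding.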
What single factor is the best predictor of stability? Since funding was not a good predictor, I wanted to investigate what the single best predictor among my variables is. So I ran simple linear regressions on stability rate against each of my variables. The p-values are as follows… Graduation Rate: 0 Average Principal Salary: 0.05 Average Teacher Salary: 0.011 Drop-out rate: 0 Free and Reduced Lunch Percentage: 0 Average Home Value: 0.1 Urban v. Rural: 0.186 Clearly, I picked the worst predictor in my first regression analysis! Graduation Rate, Drop-Out Rate, and FRL Percentage would all be good predictors of stability.
Test 3: What variables affect stability rate? Null: None of the variables is a significant predictor of stability rate (all coefficients are zero). Alternative: At least one variable is a significant predictor (at least one coefficient is not zero). Multiple Linear Regression Single continuous output: Stability rate Multiple continuous inputs: All other variables
Multiple Linear Regression JMP Multiple Linear Regression Output
Eliminating Variables I used backward stepwise elimination to remove any insignificant variables. The output below shows what I was left with. The R-squared value is still not the best, but all the remaining variables have significant p-values.
Equation Using the significant variables, we can develop an equation to predict stability rates. Stability Rate = 80.062 + (0.000141 * Average Teacher Salary) - (1.0091 * Total Dropout Rate) - (0.0587 * Free and Reduced Lunch Percentage)
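The fitted equation can be applied directly, as in the sketch below. The coefficients come from the slide; the function name, the assumed units (salary in dollars, the two rates in percentage points), and the example district are illustrative assumptions.

```python
# Coefficients copied from the regression equation on the slide.
INTERCEPT = 80.062
COEF_TEACHER_SALARY = 0.000141   # per dollar of average teacher salary (assumed unit)
COEF_DROPOUT_RATE = -1.0091      # per percentage point of dropout rate (assumed unit)
COEF_FRL_PCT = -0.0587           # per percentage point of FRL (assumed unit)


def predict_stability_rate(avg_teacher_salary, dropout_rate, frl_pct):
    """Predicted stability rate (percent) from the fitted equation."""
    return (
        INTERCEPT
        + COEF_TEACHER_SALARY * avg_teacher_salary
        + COEF_DROPOUT_RATE * dropout_rate
        + COEF_FRL_PCT * frl_pct
    )


# Hypothetical district: $50,000 average salary, 2% dropout rate, 40% FRL.
predicted = predict_stability_rate(50_000, 2.0, 40.0)
print(f"Predicted stability rate: {predicted:.2f}%")
```

Because the R-squared is modest, a prediction like this is a rough estimate rather than a precise forecast for any one district.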
Test 4: Which subgroups are the most unstable? Using ANOVA
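A one-way ANOVA compares the variation between group means to the variation within groups. The sketch below computes the F statistic by hand on made-up groups; it does not produce a p-value or the Tukey pairwise comparisons, which came from the statistics software.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA.

    Returns (f_statistic, df_between, df_within).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)

    df_between = k - 1
    df_within = n_total - k
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return f_statistic, df_between, df_within


# Hypothetical stability rates for three subgroups, for illustration only.
groups = [
    [84.0, 82.5, 86.1, 83.3],
    [80.2, 78.9, 81.5, 79.4],
    [70.1, 68.4, 73.0, 71.2],
]

f_statistic, df_between, df_within = one_way_anova_f(groups)
print(f"F = {f_statistic:.2f} with ({df_between}, {df_within}) degrees of freedom")
```

A large F statistic says the group means are spread out relative to the noise inside each group, which is the signal that at least one subgroup differs.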
Gender
Gender There is no significant difference in stability rate between male and female students.
Race
ANOVA Test for Race There could be a significant difference between races
Graphic Representation of Race
Notable Conclusions on Race Hispanic and White students could have a similar mean. The biggest difference was between White students and Native Hawaiian students. Overall, Native Hawaiians were the most unstable race.
Grade Level
Tukey Pairwise Comparisons for Grade
Notable Grade Level Conclusions Even though some grades have different means, the majority of the grades could have similar means. The most unstable group is half-day kindergarten, with a stability rate of 18.74%. Seniors have the highest mean stability rate.
Types of Students
ANOVA Test for Types of Students ESL Students are in a group of their own Migrant and Homeless students could have similar means
Notable Type of Student Conclusions Most of the subgroups of students could have a mean similar to the overall stability rate. Homeless students and migrant students are far less stable than other types of students. It might be useful to partner with a homeless shelter to keep students engaged and in one location.
My Favorite Findings The mean stability rate is only 81.5%! There is no statistically significant difference in the means between urban and rural schools. I assumed urban schools would be much more unstable because of the types of jobs. I thought funding would have an impact on stability: if funding was low, students would leave the district. It turned out this was the worst predictor of stability. Grade level has very little to do with stability. Gender has no significant effect on stability. If school boards want to increase stability, their policies should target Native Hawaiian students, migrant students, and homeless students.