1
Examining Achievement Gaps
Liru Zhang, Delaware Department of Education
Shudong Wang, NWEA
Paper presented at the 2016 National Conference on Student Assessment (NCSA)
2
Background
Closing achievement gaps has been a major challenge for public education. Many have called for caution to avoid misleading results and inappropriate interpretations when test scores are used to report achievement gaps (Hoover & Han, 1995; Willingham & Cole, 1997; Hill, 2000; Holland, 2001, 2002; Spratt, 2002). Over the past decade or two, researchers have proposed different methodologies for valid comparisons between subgroups and meaningful interpretations of analysis results, rather than relying only on the difference in mean scores to determine achievement gaps. These include the t-test, Effect Size (Cohen, 1988), Cumulative Distribution Function (Holland, 2001), Log-Odds Ratio (Hill, 2000), Time Series Analysis (McClave and Benson, 1988), DIF, ANOVA, Time Series Regression Analysis, and model-based approaches (e.g., structural modeling).
3
About This Presentation
This presentation demonstrates different methods for examining achievement gaps, using empirical data. The methods include:
(1) Descriptive statistics
(2) Cohort analyses, with effect size and log-odds ratio for the magnitude of achievement gaps
(3) Time series analyses for the trend of achievement gaps over time
(4) Cumulative distribution function analyses for the patterns of achievement gaps
4
Descriptive Statistics
Group             Score   N      Range   Min.    Max.    Mean    SD
African American  SS      2321   560     2280    2840    n/s     103.42
                  Theta          7.05    -2.96   4.1     -0.23   1.3
                  SEM            164     19      183     43.73   15.24
White             SS      3905   582     n/s     2862    n/s     119.67
                  Theta          7.34    n/s     4.38    0.64    1.51
                  SEM            115     18      133     36.55   13.53
(n/s = value not shown in the source)

- N-count
- LOSS/HOSS (lowest/highest obtainable scale score) fix the minimum and maximum scores and influence the mean, SD, and SEM …
- SD indicates how spread out the test scores are
- With the Rasch model, a commonly used criterion is for the SEM to be 1/4 to 1/3 ( ) of the SD: AA = 44/103 = .42; W = 37/120 = .30
5
Effect Size
Guidelines (Cohen, 1988): d = 0.20, small; d = 0.50, medium; d = 0.80, large.
AERA strongly recommends that, when a statistically significant difference between two groups is reported (e.g., from a t-test), the actual mean difference also be considered, because large samples can make even small differences statistically significant.
Effect Size (d) is calculated between comparison groups to quantify the achievement gap for aggregated test scores.
The value of d varies around zero. If mean (a) is higher than mean (b), d is positive; if mean (b) is higher than mean (a), d is negative.
6
Effect Size: Grade 3 Math
Female: N = 4247, Mean = 432, SD = 42.8
Male: N = 4499, Mean = 436, SD = 45.7
Mean Diff; t-value (p<.20); Effect Size: 0.09 (small)
Guidelines (Cohen, 1988): d = 0.20, small; d = 0.50, medium; d = 0.80, large
The significance level of the t-test is consistent with the ES result.
Interpretation: the mean difference is about 9% of an SD.
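The Grade 3 effect size can be approximately reproduced from the summary statistics on this slide. The sketch below assumes Cohen's d with a pooled standard deviation; the slides do not say which SD was used in the denominator, so the function name `cohens_d` and the pooling choice are illustrative assumptions rather than the authors' procedure.

```python
import math


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Cohen's d for group a minus group b, using a pooled SD (an assumption)."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


# Grade 3 math summary statistics from the slide (male minus female).
d_grade3 = cohens_d(436, 45.7, 4499, 432, 42.8, 4247)
print(round(d_grade3, 2))  # 0.09
```

Swapping the group order flips the sign of d, which is why the direction of the comparison should always be stated alongside the value.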
7
Effect Size: Grade 8 Math
A.A.: N = 3197, Mean = 476, SD = 31.7
White: N = 5601, Mean = 509, SD = 38.7
Mean Diff; t-value (p<.001); Effect Size: 0.89 (large)
Guidelines (Cohen, 1988): d = 0.20, small; d = 0.50, medium; d = 0.80, large
The significance level of the t-test is consistent with the ES result.
Interpretation: the mean difference between African American and White students is about 89% of an SD.
8
Percentage of Proficiency Level (%)
Group    Grade  Content  % Proficient          Log-Odds Ratio
                         Yes    No     Diff    Value   Diff
Female   3      Math     72.7   27.3   1.0     0.98    0.05
Male                     73.7   26.4           1.03
Female   10     Reading  74.7   25.4   -9.4    1.08    -0.45
Male                     65.3   34.7           0.63
A. A.    8               27.5   72.5   36.7    -0.97   1.55
White                    64.2   35.8           0.59
A. A.                    39.3   60.7   31.2    -0.44   1.30
White                    70.5   29.6           0.87

The Log-Odds Ratio converts the percentage of proficient students in each subgroup to an interval scale, giving a reliable comparison of gaps for aggregated data. It is obtained by taking the natural logarithm of the ratio of the proportion of proficient students to the proportion of not-proficient students.
9
Log-Odds Ratio
The Log-Odds Ratio is commonly used to convert percentages to an interval scale in test equating and other statistical applications, which allows a meaningful comparison no matter where on the scale the change occurs (Hill, 2000). It is obtained by taking the natural logarithm of the ratio of the proportion of proficient students to the proportion of not-proficient students, and it gives a reliable comparison of gaps for aggregated data.
10
Log-Odds Ratio
It is easier to increase by 10 percentage points if the starting point is in the 40-50% range than if it is in the % range. A change of 10 percentage points in one range is not equivalent to the same change in another range (Spratt, 2002). For example, the changes from 60% to 70% and from 80% to 90% are both 10 percentage points, but the corresponding difference in Log-Odds Ratio is .442 (0.847 - 0.405 = .442) for the former and .811 (2.197 - 1.386 = .811) for the latter.
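The two changes compared above (60% to 70% and 80% to 90%) can be checked with a few lines of Python. This is a minimal sketch; the helper name `log_odds` is illustrative and not taken from the presentation.

```python
import math


def log_odds(p):
    """Natural log of the odds p / (1 - p) for a proportion p in (0, 1)."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1 - p))


# Same 10-point change in percent proficient, at two places on the scale.
change_60_to_70 = log_odds(0.70) - log_odds(0.60)
change_80_to_90 = log_odds(0.90) - log_odds(0.80)
print(round(change_60_to_70, 3), round(change_80_to_90, 3))  # 0.442 0.811
```

The same percentage-point change corresponds to a larger move on the log-odds scale the closer the starting point is to 0% or 100%.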
11
Log-Odds Ratio: Grade 8 Math
Year One
African American: 27.5% (Yes); 72.5% (No)
White: 64.2% (Yes); 35.8% (No)
% Diff.: 36.7%
Log-Odds Ratio: A.A. = -0.97; White = 0.59
Log-Odds Ratio Diff.: 1.55

Year Two
African American: 32.0% (Yes); 68.0% (No)
White: 66.2% (Yes); 33.7% (No)
% Diff.: 34.0%
Log-Odds Ratio: A.A. = -0.75; White = 0.68
Log-Odds Ratio Diff.: 1.43

Log-Odds Ratio Diff. between the two years: 0.12
Interpretations:
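The year-over-year comparison on this slide can be recomputed directly from the reported percentages. The sketch below is illustrative (the helper name is not from the presentation); individual group values may differ from the slide in the second decimal because the slide's figures were presumably computed from unrounded percentages.

```python
import math


def log_odds_from_percents(pct_yes, pct_no):
    """Natural log of (percent proficient / percent not proficient)."""
    return math.log(pct_yes / pct_no)


# Grade 8 math percentages as reported on the slide; gap = White minus A.A.
year_one_gap = log_odds_from_percents(64.2, 35.8) - log_odds_from_percents(27.5, 72.5)
year_two_gap = log_odds_from_percents(66.2, 33.7) - log_odds_from_percents(32.0, 68.0)
gap_change = year_one_gap - year_two_gap
print(round(year_one_gap, 2), round(year_two_gap, 2), round(gap_change, 2))  # 1.55 1.43 0.12
```

A positive `gap_change` means the gap on the log-odds scale was smaller in Year Two than in Year One.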
12
Time Series Analysis
Time Series Analysis (McClave and Benson, 1988) is applied to analyze the trend of achievement gaps over time. The Effect Size (d) can be used to calculate the Index Number (I). The base-year index number serves as the baseline for longitudinal comparisons.
Index Number (I) = (d (target year) / d (base year)) * 100
Where:
d (target year): the d-value of the year to be examined
d (base year): the d-value of the baseline year
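As a minimal sketch of the Index Number formula, the snippet below indexes a short series of effect sizes to the first year. The d-values are hypothetical placeholders chosen for illustration only; they are not results from the presentation.

```python
def index_number(d_target, d_base):
    """Index Number I = (d_target / d_base) * 100, relative to the base year."""
    if d_base == 0:
        raise ValueError("base-year effect size must be non-zero")
    return d_target / d_base * 100


# Hypothetical effect sizes by year (illustrative only, not from the presentation).
effect_sizes = {"Year 1": 0.90, "Year 2": 0.85, "Year 3": 0.81}
base = effect_sizes["Year 1"]
indices = {year: round(index_number(d, base), 1) for year, d in effect_sizes.items()}
print(indices)
```

The base year always indexes to 100, so a value below 100 in a later year indicates a gap smaller than the baseline gap, and a value above 100 a larger one.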
13
Time Series Analysis
A linear trend is represented as a straight line:
- A positive linear trend indicates that, overall, the average performance of a group of students forms a gradually rising line;
- A negative linear trend indicates a gradually declining line, even when there is wide variation in individual scores or when the comparison of the first and last scores does not show a statistically significant difference.
A quadratic trend is represented as a simple curve:
- A positive quadratic trend indicates that scores form a curve with one or both ends higher than the center;
- A negative quadratic trend indicates a simple curve with the center higher than one or both ends.
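One common way to check the direction of linear and quadratic trends in a short, equally spaced series is with orthogonal polynomial contrasts. The sketch below uses that technique as an assumption; the slides do not state how the trends were estimated, and the index numbers are hypothetical. It reports direction only, not statistical significance.

```python
# Orthogonal polynomial contrast coefficients for five equally spaced time points.
LINEAR = (-2, -1, 0, 1, 2)
QUADRATIC = (2, -1, -2, -1, 2)


def trend_contrast(values, coefficients):
    """Weighted sum of the series; the sign gives the direction of the trend."""
    if len(values) != len(coefficients):
        raise ValueError("series length must match the number of coefficients")
    return sum(c * v for c, v in zip(coefficients, values))


# Hypothetical index numbers for five consecutive years (illustrative only).
declining = [100.0, 97.0, 95.0, 92.0, 90.0]
u_shaped = [100.0, 92.0, 90.0, 93.0, 101.0]

print(trend_contrast(declining, LINEAR))    # negative: gradually declining line
print(trend_contrast(u_shaped, QUADRATIC))  # positive: ends higher than the center
```

A negative linear contrast matches the "negative linear trend" described above, and a positive quadratic contrast matches a curve whose ends sit above its center.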
14
Time Series Analysis in Mathematics by Race
Vertical scale: Index Number. Horizontal scale: year.
15
Time Series Analysis in Reading by Gender
Vertical scale: Index Number. Horizontal scale: year.
16
Cumulative Distribution Function
Cumulative Distribution Function (Holland, 2002) is used to examine the patterns of achievement gaps and the changes in gaps between comparison groups over time. The CDF curve is denoted by the following formula:
F(x) = P(X ≤ x)
where the right-hand side represents the probability that the random variable X takes on a value less than or equal to x.
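An empirical version of F(x) = P(X ≤ x) is simply the share of scores at or below x. The sketch below builds that function for two groups and evaluates it at a few cut points; the scores and cut points are hypothetical and serve only to show the mechanics.

```python
from bisect import bisect_right


def empirical_cdf(scores):
    """Return F, where F(x) is the proportion of scores less than or equal to x."""
    ordered = sorted(scores)
    n = len(ordered)
    if n == 0:
        raise ValueError("at least one score is required")

    def cdf(x):
        # bisect_right counts how many sorted scores are <= x.
        return bisect_right(ordered, x) / n

    return cdf


# Hypothetical scale scores for two comparison groups (illustrative only).
group_a = [430, 445, 451, 470, 500, 508, 515, 529, 540, 560]
group_b = [420, 440, 450, 451, 480, 495, 508, 520, 529, 535]
cdf_a, cdf_b = empirical_cdf(group_a), empirical_cdf(group_b)

for cut in (450, 500, 530):  # hypothetical cut points
    print(cut, cdf_a(cut), cdf_b(cut))
```

At each cut point, the vertical distance between the two curves is the difference in the proportion of each group scoring at or below that point.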
17
Cumulative Distribution Function (CDF)
Both the mean and the standard deviation influence CDFs. Changes in the mean shift the location of the CDF along the score scale; changes in the SD change the slope of the CDF. The achievement gap can be displayed by examining the CDFs of the comparison groups for their shape, their slope, the space between the two curves, and the location of the cut scores for the proficiency levels. The gaps depend on the range of test scores, so the CDFs provide a more comprehensive pattern of achievement gaps. Using the whole distribution of scores leads to more complex ideas about the gaps than the simple difference or ‘gap’ between two averaged scores (Holland, 2002).
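The effect of the mean and SD on a CDF can be illustrated with normal curves. The sketch below assumes, purely for illustration, that scores are normally distributed with the Grade 8 math means and SDs reported earlier in the deck; real score distributions need not be normal. The evaluation point 493 is the lowest of the three Grade 8 cut scores listed in the deck, and the slides do not say which performance level it marks.

```python
from statistics import NormalDist

# Normal approximations (an assumption) using the Grade 8 math means and SDs.
african_american = NormalDist(mu=476, sigma=31.7)
white = NormalDist(mu=509, sigma=38.7)

cut = 493  # lowest Grade 8 cut score listed in the deck
below_aa = african_american.cdf(cut)
below_white = white.cdf(cut)
print(round(below_aa, 2), round(below_white, 2))

# The CDF is steepest at the mean, where its slope equals the density there;
# the group with the smaller SD has the steeper curve.
slope_aa = african_american.pdf(african_american.mean)
slope_white = white.pdf(white.mean)
print(slope_aa > slope_white)
```

The difference in means moves one curve to the right of the other, while the difference in SDs makes one curve rise more sharply, so the gap between the curves is not constant across the score scale.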
18
Grade 3 Mathematics by Gender
Cut scores: 451, 508, 529 (17 vs. 21; 74 vs. 80; 89 vs. 92 at the respective cut scores)
19
Grade 3 Mathematics by Gender
Cut scores: 451, 508, 529
20
Grade 8 Mathematics (cut scores 493, 531, 549)
21
Grade 8 Mathematics by Race