Using Hierarchical Growth Models to Monitor School Performance: The Effects of the Model, Metric, and Time on the Validity of Inferences
The 34th Annual National Conference on Large-Scale Assessment, June 20–23, 2004, Boston, MA
Pete Goldschmidt, Kilchan Choi, Felipe Martinez
UCLA Graduate School of Education & Information Studies, Center for the Study of Evaluation, National Center for Research on Evaluation, Standards, and Student Testing
Purpose: We use several formulations of multilevel models to investigate four related issues:
1. whether (and when) the metric matters when using longitudinal models to make valid inferences about school quality;
2. whether different longitudinal models yield consistent inferences regarding school quality;
3. the tradeoff between additional time points and missing data; and
4. the stability of school quality inferences across longitudinal models using differing numbers of occasions.
We examine three types of models:
Longitudinal Growth Panel Models
Longitudinal School Productivity Models
Longitudinal Program Evaluation Models
Longitudinal Growth Panel Models (LGPM)
Research Questions: Are inferences affected by the test metric (scale scores vs. NCEs)?
Estimates of growth
Estimates of school effects
Estimates of program effects
LGPM: Longitudinal Panel Design
Keep track of students' achievement from one grade to the next, e.g., collect achievement scores at Grades 2, 3, 4, and 5 for students in a school.
Focus on students' developmental processes: what do students' growth trajectories look like?
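For concreteness, a minimal sketch of fitting such a growth panel model, assuming a hypothetical long-format file panel_scores.csv with columns student_id, grade_time (0–3 for Grades 2–5), and scale_score; this uses Python's statsmodels as an illustration, not necessarily the software used in the study.

```python
# Two-level growth model: occasions nested within students, with a
# random intercept and random slope on time for each student.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("panel_scores.csv")  # hypothetical long-format panel data

# scale_score_it = b0 + b1*grade_time + u0_i + u1_i*grade_time + e_it
model = smf.mixedlm(
    "scale_score ~ grade_time",
    data=df,
    groups=df["student_id"],   # student-level random effects
    re_formula="~grade_time",  # random slope on time
)
result = model.fit()
print(result.summary())
```

The fixed grade_time coefficient is the average yearly growth; the student-level random effects describe how individual growth trajectories vary around it.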
Choice of Metric
Scale Scores (IRT-based):
Vertically equated across grades and years
Theoretically represent growth on a continuum that can measure academic progress over time
Change from year to year is an absolute measure of academic progress
Normal Curve Equivalents:
Change represents a relative position from year to year, not absolute growth in achievement
Relative standing compared to a norming population
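As a concrete illustration of the difference: an NCE is a national percentile rank mapped onto a normal scale with mean 50 and standard deviation 21.06, so year-to-year change in NCEs reflects movement relative to the norming population rather than absolute growth. A minimal sketch (the function name is ours):

```python
from scipy.stats import norm

def percentile_to_nce(percentile_rank: float) -> float:
    """Map a national percentile rank (1-99) onto the NCE scale (mean 50, SD 21.06)."""
    return 50.0 + 21.06 * norm.ppf(percentile_rank / 100.0)

print(percentile_to_nce(50))  # 50.0: the median of the norming population
print(percentile_to_nce(75))  # ~64.2: the same relative standing whenever it occurs
```

A student whose NCE is unchanged from one year to the next has kept pace with the norming population, even though the underlying scale score has grown.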
Monte Carlo study results (table and figure titles):
Student Characteristics
Sampling Conditions for Monte Carlo Study
Summary Parameter Estimates Compared
Summary of Estimates Compared Using Rank-Order Correlations
Summary of Results Describing SAT-9 Reading Achievement
Percent Reduction in Between-School Variation in Growth
Correlations between Estimated Coefficients – Model 4
Correlation Pattern between Sampling Condition and Model – Reading SAT-9 Growth
Comparison of Relative Bias to the Effect Size of Growth
Relationship between Relative Bias in NCEs for Initial Status
Relationship between Relative Bias in NCEs for Growth
Longitudinal School Productivity Model (LSPM)
Research Questions: Are inferences affected by the test metric (scale scores vs. NCEs)?
Estimates of growth
Estimates of school effects
Estimates of "Type A" and "Type B" effects
LSPM: Multiple-Cohorts Design (Willms & Raudenbush, 1989; Bryk et al., 1998)
Monitor student performance at a school for a particular grade over successive years, e.g., collect achievement scores for 3rd-grade students attending a school in 1999, 2000, and 2001.
Focus on schools' improvement over subsequent years.
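A small sketch of how the multiple-cohort data might be assembled, assuming hypothetical per-year files of 3rd-grade scores; Time is coded 0 for the first cohort year so the school intercept can be read as initial status.

```python
import pandas as pd

frames = []
for time_code, year in enumerate([1999, 2000, 2001]):
    cohort = pd.read_csv(f"grade3_{year}.csv")  # hypothetical per-year files
    cohort["Time"] = time_code                  # 0 for the 1999 cohort
    cohort["year"] = year
    frames.append(cohort)

# One row per student i tested in school j in cohort year t
stacked = pd.concat(frames, ignore_index=True)
```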
Research Question
To what extent does the choice of the metric matter when the focus is school improvement over time (NCE vs. scale score)?
A multiple-cohort school productivity model is used as the basis for inferences about school performance.
3-Level Hierarchical Model for Measuring School Improvement
Model I: Unconditional School Improvement Model
Level-1 (within-cohort) model: Y_ijt = β_jt0 + r_ijt
β_jt0: performance for school j (j = 1, ..., J) at cohort t (t = 0, 1, 2, 3, 4)
Level-2 (between-cohort, within-school) model: β_jt0 = π_j0 + π_j1·Time_tj + u_jt
π_j0: status in the first year (i.e., Time_tj = 0), or initial status, for school j
π_j1: yearly improvement/growth rate over the span of time for school j
Level-3 (between-school) model: π_j0 = γ_00 + V_j0; π_j1 = γ_10 + V_j1
γ_00: grand-mean initial status
γ_10: grand-mean growth rate
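Substituting the Level-2 and Level-3 equations into Level-1 gives the combined form of Model I, which separates the fixed average trajectory from the school-, cohort-, and student-level random parts:

Y_ijt = γ_00 + γ_10·Time_tj + V_j0 + V_j1·Time_tj + u_jt + r_ijt

Here γ_00 + γ_10·Time_tj is the average school improvement trajectory; V_j0 and V_j1 are school j's deviations in initial status and improvement rate; u_jt is the cohort-specific deviation within school j; and r_ijt is the student-level residual.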
Model II: Student Characteristics
Level-1 (within-cohort) model: Y_ijt = β_jt0 + β_jt1·SPED_ijt + β_jt2·LowSES_ijt + β_jt3·LEP_ijt + β_jt4·Girl_ijt + β_jt5·Minority_ijt + r_ijt
Level-2 (between-cohort, within-school) model: β_jt0 = π_j0 + π_j1·Time_tj + u_jt
Level-3 (between-school) model: π_j0 = γ_00 + V_j0; π_j1 = γ_10 + V_j1
Model III: Student Characteristics & School Intervention Indicator
Level-1 (within-cohort) model: Y_ijt = β_jt0 + β_jt1·SPED_ijt + β_jt2·LowSES_ijt + β_jt3·LEP_ijt + β_jt4·Girl_ijt + β_jt5·Minority_ijt + r_ijt
Level-2 (between-cohort, within-school) model: β_jt0 = π_j0 + π_j1·Time_tj + u_jt
Level-3 (between-school) model: π_j0 = γ_00 + γ_01·LAAMP_j + V_j0; π_j1 = γ_10 + γ_11·LAAMP_j + V_j1
Model IV: Full Model
Level-1 (within-cohort) model: Y_ijt = β_jt0 + β_jt1·SPED_ijt + β_jt2·LowSES_ijt + β_jt3·LEP_ijt + β_jt4·Girl_ijt + β_jt5·Minority_ijt + r_ijt
Level-2 (between-cohort, within-school) model: β_jt0 = π_j0 + π_j1·Time_tj + u_jt
Level-3 (between-school) model:
π_j0 = γ_00 + γ_01·(%Minority_j) + γ_02·(%LowSES_j) + γ_03·(%LEP_j) + γ_04·LAAMP_j + V_j0
π_j1 = γ_10 + γ_11·(%Minority_j) + γ_12·(%LowSES_j) + γ_13·(%LEP_j) + γ_14·LAAMP_j + V_j1
Comparison of Key Parameters: NCE vs. Scale Score
Type A effect: includes the effects of school policies and practice, educational context, and wider social influences.
Type B effect: includes the effects of tractable policies and practices, but excludes factors that lie outside the control of the school.
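One way these effects are commonly operationalized, following Willms & Raudenbush (1989) and consistent with the models above (our reading rather than an explicit statement in the slides): the Type A effect for school j corresponds to the school-level residuals (V_j0, V_j1) from a model that adjusts only for student characteristics (Model II), while the Type B effect corresponds to those residuals once school composition and context (%Minority, %LowSES, %LEP) are also controlled (Model IV).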
NCE Results vs. Scale Score Results
School ranking (rank-order correlation)
School improvement/growth rate parameter (correlation between estimates)
Effect size (parameter estimate / s.d. of outcome)
Statistical significance of the effect of the school intervention indicator variable
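A minimal sketch of how such a comparison might be computed, assuming two hypothetical vectors of estimated school growth rates, one from the NCE analysis and one from the scale-score analysis:

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

nce_growth = np.loadtxt("school_growth_nce.txt")      # hypothetical estimates
scale_growth = np.loadtxt("school_growth_scale.txt")  # hypothetical estimates

rho, _ = spearmanr(nce_growth, scale_growth)  # agreement of school rankings
r, _ = pearsonr(nce_growth, scale_growth)     # agreement of the estimates themselves
print(f"rank-order correlation = {rho:.3f}, correlation = {r:.3f}")
```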
Conclusion
NCE vs. scale score, for the purpose of measuring school improvement under the multiple-cohort design: differences are minimal in terms of:
school ranking
school improvement/growth rate
effect size
statistical significance of the effect of the school intervention indicator
Conclusion (cont'd)
Results are consistent across sampling conditions, models, and content area.
Results on samples, missing data, and number of occasions (table and figure titles):
Demographic
Samples
Reduction in Standard Error (SE) for Average Growth between Schools
100% Sample: SE Reduction, Average Growth
85%-70%-55% Samples: SE Reduction, Average Growth
Reduction in Standard Error (SE) for Average Status between Schools
100% Sample: SE Reduction, Average Status
85%-70%-55% Samples: SE Reduction, Average Status
Reduction in Tau for Average Growth between Schools
100% Sample: Tau Reduction, Average Growth
85%-70%-55% Samples: Tau Reduction, Average Growth
Reduction in Tau for Average Status between Schools
100% Sample: Tau Reduction, Average Status
85%-70%-55% Samples: Tau Reduction, Average Status
Program Effect on Average Growth between Schools by Sample and Number of Occasions