1
Exploring the Full-Information Bifactor Model in Vertical Scaling With Construct Shift Ying Li and Robert W. Lissitz
2
Contents: Questions, Methods, Results
3
Vertical scaling rests on two assumptions:
1. Unidimensionality of the tests at each grade level
2. Invariance of the test construct across grades
When these assumptions are violated (✗), the bifactor model offers an alternative for vertical scaling.
4
Questions
1. Computational simplicity for estimation
2. Ease of interpretation
3. Vertical scaling problems across grades
5
Purpose
- To propose and evaluate a bifactor model for IRT vertical scaling that can incorporate construct shift across grades while extracting a common scale
- To evaluate the robustness of the UIRT model in parameter recovery
- To compare parameter estimates from the bifactor and UIRT models
6
Method: Bifactor Model
Gibbons and Hedeker (1992) generalized the work of Holzinger and Swineford (1937) to derive a full-information bifactor model for dichotomously scored item response data.
$\theta_0$: general factor (ability)
$a_{i0}$: discrimination parameter for the general factor
$\theta_s$: group-specific factor (ability)
$a_{is}$: discrimination parameter for the group-specific factor
$d_i$: overall multidimensional item difficulty
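With these parameters, the item response function in its common logistic form (the logistic link is an assumption here; Gibbons and Hedeker originally derived the model in normal-ogive form) is

$$P(x_{ij} = 1 \mid \theta_0, \theta_s) = \frac{1}{1 + \exp\!\left[-(a_{i0}\theta_0 + a_{is}\theta_s + d_i)\right]},$$

where each item $i$ loads on the general factor and on exactly one group-specific factor $s$.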
7
Simulation
- Common-item design
- Bifactor model for generating data
- Concurrent calibration: stable results
8
Simulated factors:
- Sample size
- Number or percentage of common items (test length: 60)
- Variance of the grade-specific factor: degree of construct shift
- 100 replications
9
Data generation
Item discrimination parameters for the general dimension were set deliberately and repeatedly at 1.2, 1.4, 1.6, 1.8, 2.0, and 2.2, and fixed at 1.7 for the grade-specific dimensions.
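A minimal Python sketch of this generating scheme, assuming a logistic bifactor response function and N(0, 1) difficulties (both are assumptions; the slides specify only the discrimination values, and all names below are illustrative, not the authors' actual generating code):

import numpy as np

rng = np.random.default_rng(seed=0)

def simulate_bifactor_responses(n_persons, n_items=60, var_specific=0.5):
    # General-dimension discriminations repeat the slide's sequence.
    a_general = np.tile([1.2, 1.4, 1.6, 1.8, 2.0, 2.2], n_items // 6)
    # Grade-specific discriminations fixed at 1.7, per the slide.
    a_specific = np.full(n_items, 1.7)
    # Difficulties ~ N(0, 1): an assumption, not given on the slides.
    d = rng.normal(0.0, 1.0, n_items)
    theta_g = rng.normal(0.0, 1.0, n_persons)                    # general ability
    theta_s = rng.normal(0.0, np.sqrt(var_specific), n_persons)  # grade-specific ability
    # Logistic bifactor IRF: P = 1 / (1 + exp(-(a0*th0 + as*ths + d)))
    logits = np.outer(theta_g, a_general) + np.outer(theta_s, a_specific) + d
    p = 1.0 / (1.0 + np.exp(-logits))
    return (rng.random((n_persons, n_items)) < p).astype(int)

# Example: one grade group of 1,000 examinees with moderate construct shift.
responses = simulate_bifactor_responses(n_persons=1000, var_specific=0.5)
print(responses.shape)  # (1000, 60)

Varying var_specific mimics the study's manipulation of the grade-specific factor variance, i.e., the degree of construct shift.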
10
Identification of the Bifactor Model Estimation
- For the general dimension, the variance of the general latent dimension was fixed to 1, and the discrimination parameters (loadings) were freely estimated.
- For the grade-specific dimensions (s = 1, 2, ..., k), the discrimination parameters (loadings) were fixed to the true parameter value of 1.7, so that the variances of the grade-specific dimensions could be freely estimated.
- The common items answered by multiple groups were restricted to have unique item parameters in the multiple-group concurrent calibration.
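Compactly, the identification constraints amount to

$$\mathrm{Var}(\theta_0) = 1,\ a_{i0}\ \text{free};\qquad a_{is} = 1.7\ \text{(fixed)},\ \mathrm{Var}(\theta_s)\ \text{free},\quad s = 1, \dots, k.$$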
11
Model Estimation
Multiple-group concurrent calibration was implemented. The computer program IRTPRO (Cai, Thissen, & du Toit, 2011), using marginal maximum likelihood (MML) estimation with an EM algorithm, was used to estimate the models.
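For context (this equation is not on the slide), MML maximizes the likelihood of the observed response patterns with the latent abilities integrated out,

$$L = \prod_{j=1}^{N} \int \prod_{i} P_i(\boldsymbol{\theta})^{x_{ij}} \left[1 - P_i(\boldsymbol{\theta})\right]^{1 - x_{ij}} g(\boldsymbol{\theta})\, d\boldsymbol{\theta},$$

and the bifactor structure lets Gibbons and Hedeker's dimension-reduction result evaluate this with a series of two-dimensional quadratures, whatever the number of group-specific factors.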
12
Evaluation criteria
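The criteria are presumably the standard Monte Carlo measures over the R = 100 replications; for a generic parameter $\eta$ (a reconstruction, since the slide's formulas were not captured):

$$\mathrm{Bias} = \frac{1}{R}\sum_{r=1}^{R}(\hat{\eta}_r - \eta),\qquad \mathrm{RMSE} = \sqrt{\frac{1}{R}\sum_{r=1}^{R}(\hat{\eta}_r - \eta)^2},\qquad \mathrm{SE} = \sqrt{\frac{1}{R}\sum_{r=1}^{R}\left(\hat{\eta}_r - \bar{\hat{\eta}}\right)^2}.$$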
13
Results
Note: RMSE = root mean square error; SE = standard error; SS = sample size; CI = common item; VR = variance of grade-specific factor.
14
Person parameters
Person parameter estimates on the general dimension were recovered better than those on the grade-specific dimensions when the degree of construct shift was small or moderate. As sample size increased, estimation accuracy increased.
15
Group parameters
Group means tended to be overestimated.
16
UIRT
Discrimination parameters: overestimated. Difficulty parameters: well recovered. As construct shift increased, person and group mean estimates became less accurate.
17
ANOVA Effects for the Simulated Factors
Three-way tests of between-subjects effects (ANOVA) on bias, RMSE, and SE, across sample size, degree of construct shift, and percentage of common items:
- Bifactor model: sample size had small to moderate effects; construct shift had no to small effects, though large construct shift inflated the SE of the grade-specific variance parameter; the percentage of common items produced small bias in d.
- UIRT: sample size had small to moderate effects; large construct shift produced large effects on θ and group mean ability, with large bias and SE in d.
18
Comparison
UIRT overestimated the discrimination parameters, and its person and group mean parameter estimates were less accurate.
19
Real data
Fall 2006 Michigan mathematics assessments, Grades 3, 4, and 5; 4,000 randomly selected examinees; bifactor vs. UIRT.
20
Variance estimation
21
Correlation between bifactor and UIRT estimates: R = 0.983
22
Discussion
- As sample size increases, estimation accuracy and stability increase.
- As the variance of the grade-specific dimension increases, stability decreases.
- Be cautious about construct shift.
- Future work: polytomous/mixed item formats, incorporating covariates, longitudinal studies.
23
Open questions
- Should common items measure two group-specific abilities?
- The item discrimination parameters were fixed to the true value.
- Multidimensional IRT?