DIF detection using OLR

Paul K. Crane, MD MPH Internal Medicine University of Washington

Outline
Statistical background
DIFdetect package
What do we do when we find DIF?
DIF adjustments to PARSCALE code
How good are adjusted scores?
Discussion

Statistical background
Recall the definition of DIF: demographic characteristic(s) interfere with the expected relationship between ability level and responses to an item.
This is a conditional definition: we have to control for ability level, or else we cannot differentiate between DIF and differential test impact.

Logistic regression applied to DIF detection
Swaminathan and Rogers (1990)
Tested two models:
P(Y=1|X, group) = f(β1X + β2*group + β3*X*group)
P(Y=1|X) = f(β1X)
Compared the difference in the –2 log likelihoods of these two models to a chi-squared distribution with 2 df
Uniform and non-uniform DIF tested at the same time
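The 2-df comparison can be sketched in a few lines of Python. This is an illustration only, not code from the original talk or from DIFdetect: the function name and the log-likelihood values are made up, and the two models are assumed to have been fitted elsewhere.

```python
import math

def lr_test_2df(loglik_full, loglik_reduced):
    """Likelihood-ratio test of the full model (ability, group, ability*group)
    against the reduced model (ability only)."""
    # Difference in -2 log likelihood between the reduced and full models.
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    # The chi-squared survival function with 2 df has the closed form exp(-x/2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value

# Hypothetical log likelihoods, for illustration only.
stat, p_value = lr_test_2df(loglik_full=-512.4, loglik_reduced=-518.9)
print(f"LR statistic = {stat:.2f}, p = {p_value:.4f}")
```

A small p-value here says only that some form of DIF (uniform, non-uniform, or both) is present, since both extra terms are tested together.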

Camilli and Shepard (1994)
Recommended a two-step procedure: first test for non-uniform DIF, then for uniform DIF
P(Y=1|X, group) = f(β1X + β2*group + β3*X*group)
P(Y=1|X, group) = f(β1X + β2*group)
P(Y=1|X) = f(β1X)
The –2 log likelihoods of each adjacent pair of models are compared to determine non-uniform DIF and uniform DIF in two separate steps
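A minimal sketch of the two-step comparison, assuming the three nested models have already been fitted and only their log likelihoods are at hand. The function names, the α of 0.05, and the example values are assumptions for illustration, not part of the original procedure's specification.

```python
import math

def chi2_sf_1df(x):
    """Survival function of the chi-squared distribution with 1 df."""
    return math.erfc(math.sqrt(max(x, 0.0) / 2.0))

def two_step_dif(ll_interaction, ll_group, ll_base, alpha=0.05):
    """Step 1: interaction model vs. group model (non-uniform DIF).
    Step 2: group model vs. ability-only model (uniform DIF)."""
    nonuniform_stat = 2.0 * (ll_interaction - ll_group)
    uniform_stat = 2.0 * (ll_group - ll_base)
    p_nonuniform = chi2_sf_1df(nonuniform_stat)
    p_uniform = chi2_sf_1df(uniform_stat)
    return {
        "p_nonuniform": p_nonuniform,
        "p_uniform": p_uniform,
        "nonuniform_dif": p_nonuniform < alpha,
        "uniform_dif": p_uniform < alpha,
    }

# Hypothetical log likelihoods, for illustration only.
result = two_step_dif(ll_interaction=-510.0, ll_group=-510.5, ll_base=-515.0)
print(result)
```

Each step adds a single parameter, so each comparison uses 1 df rather than the 2 df of the simultaneous test.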

Millsap and Everson (1994)
Dismissive of “observed score” techniques such as logistic regression
X contains several items that have DIF, so adjusting for X is theoretically problematic
Advocated latent approaches such as IRT for DIF detection
Very influential publication

Zumbo (1999)
Extended the Swaminathan and Rogers framework to ordinal logistic regression to handle polytomous items
Did not address the latent trait; also used a single step rather than two steps

Crane, van Belle, Larson (2004)
Pointed out that the logistic regression model is a re-parameterization of the IRT model as long as IRT-derived θ estimates are used as ability scores
Addressed multiple hypothesis testing of non-uniform DIF; found no difference between four different adjustment techniques
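As a worked illustration of the re-parameterization, assume a two-parameter logistic (2PL) IRT model with discrimination a and difficulty b, and a logistic regression on θ with an intercept β0. The symbols a, b, and β0 are notation introduced here; the models on these slides are written without an intercept.

```latex
P(Y=1\mid\theta)
  = \frac{1}{1+e^{-a(\theta-b)}}
  = \frac{1}{1+e^{-(\beta_0+\beta_1\theta)}},
\qquad \beta_1 = a, \quad \beta_0 = -ab .
```

Under these assumptions the slope β1 plays the role of the discrimination and −β0/β1 the role of the difficulty. Some IRT parameterizations include a scaling constant of about 1.7 in the exponent, which rescales this mapping.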

Crane et al. (2004) – 2
Biggest change was in the specific criteria for uniform DIF
Recognized that non-uniform and uniform DIF are analogous to effect modification and confounding
Employed epidemiological thinking about how to detect confounding relationships from the data

Crane et al. (2004) – 3
Same models used (though now with θ, not X)
P(Y=1|θ, group) = f(β1θ + β2*group)
P(Y=1|θ) = f(β1’θ)
Determine the impact of including the group term on the magnitude of the relationship between θ and item responses
Determine the size of |(β1-β1’)/β1|. If this is large, uniform DIF (confounding) is present
Maldonado and Greenland simulation study on confounder selection strategies
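The change-in-estimate criterion can be sketched as follows. The coefficients would come from fitting the two models above; the values used here are invented, and the 10% threshold is only a placeholder, since the talk itself treats the appropriate cut-off as an open question.

```python
def proportional_change(beta1_with_group, beta1_without_group):
    """Return |(β1 - β1') / β1|, where β1 comes from the model that includes
    the group term and β1' from the model with θ alone."""
    if beta1_with_group == 0:
        raise ValueError("β1 is zero; the proportional change is undefined")
    return abs((beta1_with_group - beta1_without_group) / beta1_with_group)

def flags_uniform_dif(beta1_with_group, beta1_without_group, threshold=0.10):
    """Flag uniform DIF when the change exceeds a threshold (assumed value)."""
    return proportional_change(beta1_with_group, beta1_without_group) > threshold

# Hypothetical coefficients, for illustration only.
change = proportional_change(beta1_with_group=1.20, beta1_without_group=1.05)
print(f"proportional change = {change:.3f}")
print("uniform DIF flagged:", flags_uniform_dif(1.20, 1.05))
```

Unlike a likelihood-ratio test, this criterion does not depend directly on sample size, which is the appeal of borrowing it from confounder selection.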

Work still pending
“Optimal” criteria for uniform and non-uniform DIF are unknown
Adjust α for multiple hypotheses? How many multiple hypotheses?
Effect size for non-uniform DIF? In huge data sets, a significant interaction term is likely
What proportional change in β1 is significant uniform DIF (UDIF)?

DIFdetect package
Can be downloaded from the web
STATA-based, user-friendly package

Outline revisited
Statistical background
DIFdetect package
What do we do when we find DIF?
DIF adjustments to PARSCALE code
How good are adjusted scores?
Discussion

What to do when we find DIF?
In educational settings, items with DIF are often discarded
Unattractive option for us
Tests are too short as it is; lose variation
Lose precision
DIF doesn’t mean that the item doesn’t measure the underlying construct at all, just that it does so differently in different groups

What do we do – 2
Need a technique that incorporates items found to have DIF differently from DIF-free items
Precedent for this approach in Reise, Widaman, and Pugh (1993)
Constrain parameters for DIF-free items to be identical across groups
Estimate parameters for items found with DIF separately in the appropriate groups

Compensatory DIF
Compensatory DIF occurs when DIF in some items leads to erroneous findings in other items
Both false-positive and false-negative DIF findings
Iterative process for each covariate until a stable solution is reached (i.e., the same items are identified with DIF on separate runs of DIFdetect)
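The iterative process can be sketched as a loop that stops when two consecutive runs flag the same items. This is only a schematic: `detect` is a hypothetical stand-in for re-estimating ability with the currently flagged items accounted for and re-running DIF detection; it is not DIFdetect's actual interface.

```python
def iterate_until_stable(detect, max_rounds=20):
    """Repeat DIF detection until the flagged item set stops changing.

    `detect` takes the set of items currently treated as having DIF and
    returns the set of items flagged on the next run.
    """
    flagged = frozenset()
    for round_number in range(1, max_rounds + 1):
        new_flagged = frozenset(detect(flagged))
        if new_flagged == flagged:
            return flagged, round_number
        flagged = new_flagged
    raise RuntimeError("no stable solution within max_rounds")

def toy_detect(accounted_for):
    """Toy detector: accounting for item3 unmasks DIF in item7."""
    if not accounted_for:
        return {"item3"}
    return {"item3", "item7"}

stable_items, rounds = iterate_until_stable(toy_detect)
print(sorted(stable_items), "after", rounds, "runs")
```

The toy detector mimics a false negative: item7 is flagged only once item3's DIF has been accounted for in the ability estimate.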

Adjustments to PARSCALE
Create a new dataset that treats items according to their DIF status
No DIF 1 DIF 2 No DIF 3 Group 1 Missing Group 2 Group 3

Modified data set
0001 12XX2
0002 12XX4
0003 01XX3
…
0132 1X2X2
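One way to picture the creation of group-specific "virtual" items is the sketch below, in which an item with DIF is split into one column per group, with the missing code X wherever a person does not belong to that group. The function name, group labels, and responses are assumptions for illustration; the authors' own STATA code is not shown in the talk.

```python
MISSING = "X"

def split_dif_item(responses, groups, levels):
    """Split one item with DIF into one virtual item per group level.

    Each person keeps the observed response in the column for their own
    group and gets the missing code in every other column.
    """
    columns = {level: [] for level in levels}
    for response, group in zip(responses, groups):
        for level in levels:
            columns[level].append(response if group == level else MISSING)
    return columns

# Hypothetical responses to one item with DIF across two education groups.
responses = ["2", "4", "3", "1"]
groups = ["low", "low", "high", "high"]
virtual = split_dif_item(responses, groups, ["low", "high"])
print(virtual["low"])
print(virtual["high"])
```

DIF-free items stay as single columns shared by everyone, so their parameters are estimated from the whole sample.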

PARSCALE code
Need new lines (new blocks) for all new items that we create
We are automating this step as an extension to DIFdetect
Current best advice is to use a huge table in Word
Creation of new items is easy; we have STATA code for creation of virtual items

Preparation of data for PARSCALE

Reminder of PARSCALE tips
When outfiling from STATA, use wide format
Use commas
Change missing values to .x
Open the file in Word and replace “.x” with X
Remember to change 2-digit numbers to their appropriate letters
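The recoding steps could also be scripted rather than done by hand in Word. The sketch below is one possible approach, not the authors' workflow: it assumes each response occupies a single character, that missing values become X, and that 2-digit scores map to letters starting at 10 → A (the slides do not spell out the letter mapping). The function names and the 4-digit ID format are likewise assumptions.

```python
import string

MISSING = "X"

def encode_response(value):
    """Encode one item response as a single character."""
    if value is None:
        return MISSING
    if 0 <= value <= 9:
        return str(value)
    if 10 <= value <= 32:
        # Assumed mapping: 10 -> A, 11 -> B, ...; stops before X (33),
        # which would collide with the missing code.
        return string.ascii_uppercase[value - 10]
    raise ValueError(f"cannot encode response {value!r} in one character")

def encode_row(person_id, values):
    """Build one fixed-width data line: 4-digit ID, a space, then responses."""
    return f"{person_id:04d} " + "".join(encode_response(v) for v in values)

row = encode_row(1, [1, 2, None, None, 2])
print(row)  # 0001 12XX2
```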

It gets complicated…
This is the CASI, first run of education DIF, after looking at gender and age:

Table helps with PARSCALE code

Adjusted scores related to dementia and CIND
In the ACT study, controlling for CASI score (continuous): odds ratio of 2.9 ( ) for low DIF-adjusted IRT score (among those with low CASI scores)
Adjusted for gender, education, and age
Strict 2-stage sample design leads to verification bias
In the CSHA, controlling for 3MS score (continuous): weighted odds ratio of 1.6 ( ) for dementia for low DIF-adjusted IRT score, and 1.4 ( ) for CIND
Adjusted for education and language
Sampling and weighting to deal with verification bias

Incorporation of adjusted scores into analyses
Here we are in novel territory
Is there a reason not to adjust scores for DIF?
Questions and comments

Comparison of OLR with other techniques
OLR is more flexible (can look at continuous constructs, e.g., education, without dichotomizing or grouping)
DIFdetect is very fast
When using IRT-derived θ scores, OLR is a re-parameterization of IRT analyses
DIFdetect OLR incorporates epidemiology concepts of confounding and effect modification
Teresi (ed.) special issue of Medical Care to come out
