Differential Item Functioning in Mplus. Summer School Week 2.


Differential Item Functioning
Differential item functioning (DIF) occurs when people from different groups (e.g. gender or ethnicity) with the same underlying latent trait score have a different probability of responding to an item in a particular way.
Group differences in item responses (or on latent variables) do not reflect DIF per se, e.g. females scoring higher than males on a particular item or scale. DIF is present only if people from different groups with the same underlying ability (or trait level) have a different probability of response.
References: Reise, Widaman & Pugh (1993), Psych Bulletin, Vol 114; Embretson, S.E. & Reise, S.P. (2000), Item Response Theory for Psychologists.
Definition from Laura Gibbons: 'when a demographic characteristic interferes with relationship expected between ability level and responses to an item'
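The distinction between a group difference and DIF can be sketched numerically. The example below is only an illustration under assumed values: the logistic response curve, its parameters `a` and `b`, and the trait scores in `GROUP_A` and `GROUP_B` are all hypothetical and do not come from the slides. Both groups share one response curve (no DIF), yet their observed endorsement rates differ because their trait distributions differ.

```python
import math

def p_endorse(theta, a=1.0, b=0.0):
    """Probability of endorsing the item at trait level theta.

    The same curve is used for every group, so there is no DIF by construction.
    a (discrimination) and b (threshold) are hypothetical values.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical trait scores: group B sits one unit higher on the latent trait.
GROUP_A = [-1.0, 0.0, 1.0]
GROUP_B = [0.0, 1.0, 2.0]

def observed_rate(thetas):
    """Average endorsement probability across the members of a group."""
    return sum(p_endorse(t) for t in thetas) / len(thetas)

rate_a = observed_rate(GROUP_A)
rate_b = observed_rate(GROUP_B)

if __name__ == "__main__":
    print(f"observed rate, group A: {rate_a:.3f}")
    print(f"observed rate, group B: {rate_b:.3f}")
```

Group B endorses the item more often, but anyone with the same trait score has the same response probability in either group, which is why the difference alone says nothing about DIF.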

DIF – Measurement Non-Invariance
If the probability of item response is the same (among different sub-groups with the same underlying ability), measurement invariance is assumed.
If the probability of response is different (among different sub-groups with the same underlying ability), then measurement non-invariance is assumed.

Types of DIF
Uniform: DIF occurs uniformly at all levels along the latent trait.
Non-uniform: DIF does not occur equally at all points on the latent trait, e.g. gender differences in response may only be evident at high or low levels of the construct.
Crane et al (2004) describe uniform DIF as analogous to 'confounding' in epidemiology and non-uniform DIF as analogous to 'effect modification', i.e. an interaction between trait level, group assignment and item responses.
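The two patterns can be sketched with item characteristic curves. This is a minimal illustration, assuming a two-parameter logistic curve for a binary item; the slides themselves use probit and graded response models for ordinal items, and every parameter value below is hypothetical.

```python
import math

def icc(theta, a, b):
    """Two-parameter logistic item characteristic curve, P(u = 1 | theta)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item parameters: a = discrimination, b = threshold.
REFERENCE = {"a": 1.5, "b": 0.0}
UNIFORM_FOCAL = {"a": 1.5, "b": 0.5}     # same slope, shifted threshold
NONUNIFORM_FOCAL = {"a": 0.7, "b": 0.0}  # different slope, curves cross at theta = 0

def gap(theta, focal):
    """Reference-minus-focal difference in response probability at theta."""
    return icc(theta, **REFERENCE) - icc(theta, **focal)

THETAS = [-2.0, -1.0, 1.0, 2.0]
uniform_gaps = [gap(t, UNIFORM_FOCAL) for t in THETAS]
nonuniform_gaps = [gap(t, NONUNIFORM_FOCAL) for t in THETAS]

if __name__ == "__main__":
    for t, u, n in zip(THETAS, uniform_gaps, nonuniform_gaps):
        print(f"theta={t:+.1f}  uniform gap={u:+.3f}  non-uniform gap={n:+.3f}")
```

With a shifted threshold the gap keeps the same sign at every trait level; with a different slope the gap changes sign, which is the interaction between trait level and group described above.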

Example of Item with Uniform DIF From: Jones R (2006), Medical Care Volume 44, Number 11 Suppl 3, (Figure 2)

Example of Item with Non-Uniform DIF From: Mellenbergh, G. (1989)

Definition of DIF (Mellenbergh, 1989)
[Diagram: Item, Group]

Definition of DIF (Mellenbergh, 1989)
An item is unbiased if P(u = 1 | G, θ) = P(u = 1 | θ), i.e. the probability of an item response depends only on the values x of the variable X (the trait, written θ in the formula).
[Diagram: Item, Group, Trait]

Definition of DIF (Mellenbergh, 1989)
An item is biased if P(u = 1 | G, θ) ≠ P(u = 1 | θ), i.e. the probability of an item response depends on the combination of values x of the variable X (the trait, written θ in the formula) and values g of the variable G (the group).
[Diagram: Item, Group, Trait]

Differential Item Functioning
– Important first step in the evaluation of test bias
– For construct validity, items of a scale should ideally have little or no DIF
– Items should function in the same way across subgroups of respondents who have the same underlying ability (or level on the latent trait)
– Presence of DIF may compromise comparisons across subgroups and give misleading results
– DIF may confound interpretation of observed variables
Camilli and Shepard, 1994

Methods to identify DIF
Non-parametric
– Mantel-Haenszel (MH) (Holland & Thayer, 1988)
Parametric methods
– Logistic regression (Zumbo, 1999)
– Ordinal logistic regression (Crane et al, 2004)
– MIMIC models (Muthen, 2004)
– Multiple group models
– IRT based methods (Thissen, 1991)
Good review by Teresi (2006), Medical Care Vol 44
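As an illustration of the first method in the list, the Mantel-Haenszel common odds ratio can be computed from 2x2 tables formed at each matched score level. This is a minimal sketch, not a full MH test: the function names and the counts in `NO_DIF` and `WITH_DIF` are hypothetical, and the delta-scale transformation (-2.35 times the log odds ratio) is the commonly used ETS convention rather than something stated in the slides.

```python
import math

def mantel_haenszel_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio across matched score levels.

    strata: iterable of (ref_yes, ref_no, focal_yes, focal_no) counts,
    one tuple per level of the matching variable (e.g. total score).
    """
    numerator = 0.0
    denominator = 0.0
    for ref_yes, ref_no, focal_yes, focal_no in strata:
        n = ref_yes + ref_no + focal_yes + focal_no
        if n == 0:
            continue  # an empty stratum carries no information
        numerator += ref_yes * focal_no / n
        denominator += ref_no * focal_yes / n
    if denominator == 0:
        raise ValueError("odds ratio undefined: denominator is zero")
    return numerator / denominator

def mh_delta(odds_ratio):
    """Delta-scale transformation of the MH odds ratio (ETS convention)."""
    return -2.35 * math.log(odds_ratio)

# Hypothetical counts for two score levels.
NO_DIF = [(10, 10, 10, 10), (15, 5, 15, 5)]
WITH_DIF = [(20, 10, 10, 20), (8, 4, 4, 8)]

if __name__ == "__main__":
    for label, strata in (("no DIF", NO_DIF), ("with DIF", WITH_DIF)):
        alpha = mantel_haenszel_odds_ratio(strata)
        print(f"{label}: odds ratio = {alpha:.2f}, delta = {mh_delta(alpha):.2f}")
```

An odds ratio near 1 means the two groups have similar odds of endorsing the item once matched on score; values far from 1 suggest uniform DIF.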

What to do if DIF present
Remove items?
1) OK if you have a large item pool and the item can be replaced with an item that has similar threshold / discrimination parameters.
2) But dropping items might adversely affect the content validity of the instrument.
3) You may end up with an instrument that is not comparable to other research using that instrument.
Look for causes of DIF. What do all the DIF items have in common, e.g.
– Are they all negatively or positively worded?
– Are they all at the end of the study?
– Readability etc.
How do they differ from the invariant items?

How to adjust for DIF
– Adjust for DIF in the model: in Mplus this can be done by adding a direct effect between the covariate and the item.
– Crane et al (2004, 2006): a) items without DIF have item parameters estimated from the whole sample (anchors); b) items with DIF have parameters estimated separately in different subgroups.

Two Examples of Identifying DIF
– Mplus: MIMIC model (Multiple Indicators, Multiple Causes): uniform DIF
– Stata: DIFd program (Crane et al, 2004): non-uniform and uniform DIF

Mplus Example - MIMIC Model: BCS70 Externalising (Conduct) Scale
Mother's rating of teenager on Rutter Scale, age 16. Ordinal 3-category scale (0 = does not apply, 1 = applies somewhat, 2 = certainly applies).
– 03 Teenager often destroys belongings
– 04 Teenager frequently fights with others
– 10 Teenager sometimes takes others' things
– 14 Teenager is often disobedient
– 18 Teenager often tells lies
– 19 Teenager bullies others

CFA Model for BCS70 Externalising
[Path diagram: latent variable F1 (Conduct problems) measured by observed items RUT03, RUT04, RUT10, RUT14, RUT18, each with a residual ε; SEX]

MIMIC Model: Stages of identifying potential DIF
1. Run CFA model without covariates
2. Include MIMIC model (add covariate but no direct effects)
3. Add paths from covariate to indicator constrained to 0, i.e. assuming there is no direct effect (Y1 on …)
4. Check modification indices
5. Add a direct path from the covariate to the indicator with the highest modification index, then rerun the model
6. Repeat steps 4 & 5 until there are no further significant modification indices, then evaluate model fit and the significance of the direct effects
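Steps 4 to 6 amount to a forward-selection loop, which can be sketched as follows. This is an illustration only: `fit_mimic` stands in for an actual Mplus run and is hypothetical, the modification indices returned by `fake_fit` are made-up numbers, and the threshold of 10 is an assumption that mirrors the MOD(10) option used in the slides' OUTPUT command.

```python
from typing import Callable, Dict, List

def select_direct_effects(
    fit_mimic: Callable[[List[str]], Dict[str, float]],
    mi_threshold: float = 10.0,
) -> List[str]:
    """Free direct effects one at a time, guided by modification indices.

    fit_mimic(freed) is assumed to refit the model with direct effects
    covariate -> item freed for every item in `freed`, and to return the
    modification index of each direct effect that is still fixed at 0.
    """
    freed: List[str] = []
    while True:
        indices = fit_mimic(freed)
        candidates = {
            item: mi
            for item, mi in indices.items()
            if item not in freed and mi >= mi_threshold
        }
        if not candidates:
            return freed
        # Free the path with the largest modification index, then refit.
        freed.append(max(candidates, key=candidates.get))

def fake_fit(freed: List[str]) -> Dict[str, float]:
    """Stand-in for a model run; all values are invented for illustration."""
    table = {
        (): {"rut03": 60.0, "rut04": 25.0, "rut10": 2.0},
        ("rut03",): {"rut04": 12.0, "rut10": 1.0},
        ("rut03", "rut04"): {"rut10": 0.5},
    }
    return table[tuple(freed)]

if __name__ == "__main__":
    print(select_direct_effects(fake_fit))
```

Freeing one path at a time matters because modification indices change after every refit, so the second-largest index in the first run is not guaranteed to stay large.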

Stage 1-3: Mplus CFA
  USEVARIABLES are rut03 rut04 rut10 rut14 rut18 sex;
  CATEGORICAL are rut03 rut04 rut10 rut14 rut18;
  MISSING are all ( );
  ANALYSIS:
    ESTIMATOR IS wlsmv;
    ITERATIONS = 1000;
    CONVERGENCE = ;
  MODEL:
    CONDUCT by rut03 rut04 rut10 rut14 rut18;  ! define latent variable
    CONDUCT on sex;                            ! MIMIC model: regress latent variable on sex
    rut03-rut18 on sex@0;                      ! assume no direct effect of sex on item
  OUTPUT: SAMPSTAT STANDARDIZED RES MOD(10);

CFA MIMIC Model
[Path diagram: covariate SEX, latent variable Conduct problems, observed items RUT03, RUT04, RUT10, RUT14, RUT18; path label 'On']

Check MOD indices
Mplus output columns: M.I., E.P.C., Std E.P.C., StdYX E.P.C.
ON Statements: RUT03 ON SEX, RUT04 ON SEX [values missing from transcript]
Include the item with the largest MI as a direct effect in the model:
  RUT03 on SEX;
  RUT04-RUT18 ON SEX@0;
Recheck mod indices and repeat if necessary.

Stage 4: Mplus MIMIC DIF
  USEVARIABLES are rut03 rut04 rut10 rut14 rut18 sex;
  CATEGORICAL are rut03 rut04 rut10 rut14 rut18;
  MISSING are all ( );
  ANALYSIS:
    ESTIMATOR IS wlsmv;
    ITERATIONS = 1000;
    CONVERGENCE = ;
  MODEL:
    CONDUCT by rut03 rut04 rut10 rut14 rut18;  ! define latent variable
    CONDUCT on sex;                            ! MIMIC model: regress latent variable on sex
    rut04-rut18 on sex@0;                      ! assume no direct effect of sex on item
    rut03 on sex;                              ! adds direct effect of sex on item 03
    RUT04 ON sex;                              ! MI in run 4b so may not be required ??
  OUTPUT: SAMPSTAT STANDARDIZED RES MOD(10);

CFA MIMIC Model (DIF)
[Path diagram: covariate SEX, latent variable Conduct problems, observed EXT items RUT03, RUT04, RUT10, RUT14, RUT18; path label 'On']

CFA MIMIC model fit
Model                          Chi Square
1. CFA                         37.9 (df=3)
2 & 3. MIMIC                   131.8 (df=7)
4a. MIMIC (1 direct effect)    51.9 (df=6)
4b. MIMIC (2 direct effects)   38.1 (df=5)
CFI, TLI, RMSEA, WRMR: [values missing from transcript]

Mplus results
Output columns: Estimate, S.E., Est./S.E., P-Value, Std [values missing from transcript]
Model (2) Initial MIMIC model (no direct effects)
  CONDUCT ON SEX
Model 4(b) Add 2 direct effects
  CONDUCT ON SEX
  RUT03 ON SEX
  RUT04 ON SEX
Is this practically meaningful?

In a Graded Response Model...
  ANALYSIS:
    TYPE = general missing h1;
    ESTIMATOR = mlr;
    ALGORITHM = integration;
  MODEL:
    RUT16EX BY rut03 rut04 rut10 rut14 rut18;  ! rut19;
    RUT16EX on sex;
    ! rut03-rut18 on sex@0;
  OUTPUT: residual modindices(1.00) sampstat standardized tech1 tech5 cinterval;

In a Graded Response Model...
  ANALYSIS:
    TYPE = general missing h1;
    ESTIMATOR = mlr;
    ALGORITHM = integration;
  MODEL:
    RUT16EX BY rut03 rut04 rut10 rut14 rut18;  ! rut19;
    RUT16EX on sex;
    rut04-rut18 on sex@0;
    rut03 ON sex;
  OUTPUT: residual modindices(1.00) sampstat standardized tech1 tech5 cinterval;
Odds ratios
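The "Odds ratios" note can be made concrete with a small calculation. This sketch assumes the direct effect of sex on the item is estimated on the logit scale, which is the usual case for categorical items under maximum likelihood with a logit link, so that exponentiating the coefficient gives a (cumulative) odds ratio. The function name and the example coefficient and standard error are hypothetical, since the estimates did not survive in the transcript.

```python
import math

def odds_ratio_with_ci(logit_coef, std_err, z=1.96):
    """Convert a logit-scale coefficient to an odds ratio with a Wald interval.

    Returns (odds_ratio, lower, upper). The interval is computed on the
    logit scale and then exponentiated; z=1.96 gives roughly 95% coverage.
    """
    return (
        math.exp(logit_coef),
        math.exp(logit_coef - z * std_err),
        math.exp(logit_coef + z * std_err),
    )

if __name__ == "__main__":
    # Hypothetical direct effect of sex on an item: b = 0.40, SE = 0.10.
    estimate, lower, upper = odds_ratio_with_ci(0.40, 0.10)
    print(f"OR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Reading the direct effect as an odds ratio is one way to approach the question of whether a statistically significant DIF effect is practically meaningful.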

Exercise
Work through the MIMIC modelling stages:
1. ...using the multivariate probit regression model implemented by WLSMV (equivalent to the normal ogive IRT model for polytomous items)
2. ...using the Graded Response Model implemented by full-information maximum likelihood (MLR)
Note: references are included at the end of the next (DifDetect) presentation