Differential Item Functioning
Anatomy of the name

DIFFERENTIAL
– Differential calculus? No: comparing two groups
ITEM
– Focus on ONE item at a time
– Not the whole test
FUNCTIONING
– All we have is the item performance (1 or 0)
– Not about the content or format of the item

Is there any differential item functioning between the groups?
Why do we care about DIF?

Validation process of the test
– The test should be bias-free against minorities
– Necessary but not sufficient: inference or interpretation beyond the statistical data must be involved

Bias? DIF? Impact?
– DIF: conditional on ability
– Bias: pejorative in nature
– Impact: not conditional on ability
Definition of DIF

An item has no DIF if the probability of getting the item right depends only on ability, not on group membership. An item has DIF if, at the same ability level, that probability also depends on group membership.
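In symbols (a standard statement of this definition; the notation X for the item score, θ for ability, and G for group is assumed here, not taken from the slides):

```latex
% No DIF: once ability is fixed, group membership carries no information
P(X = 1 \mid \theta, G = g) = P(X = 1 \mid \theta) \quad \text{for every group } g
% DIF: at some ability level the focal and base probabilities differ
\exists\, \theta :\; P(X = 1 \mid \theta, G = \mathrm{focal}) \neq P(X = 1 \mid \theta, G = \mathrm{base})
```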
Causes & Types of DIF

Causes
– Construct-irrelevant variance
– Opportunity to learn

Types
– Adverse
– Benign
[Slide diagram, "Causes (K-12)": construct-irrelevant variance paired with adverse DIF (MP responsibility); opportunity to learn paired with benign DIF (field/client responsibility).]
Some DIF Examples
– Meaning of "ascend" in an MCAS vocabulary test
– Potato salad example in an NAEP biology test
– Train schedule in an urban area in an LSAT logical reasoning problem
– Color of a lemon, from ETS
Empirical Evidence

A DIF index is a kind of function.
Inputs:
– Item response vector
– Total score
– Group indicator
Output:
– A number called the DIF index
Feverish World of DIF

Every categorical data analysis method can be used, since the DIF index is simply a mathematical function with an item response vector as its main input.
– Mantel-Haenszel method
– Standardization method
– Logistic regression method
– Dimensionality analysis
– IRT-based methods
One question, many answers
– Mantel-Haenszel method: differences in the constant odds ratio
– Standardization method: differences in proportion correct
– Logistic regression method: the group-variable coefficient estimate
– Dimensionality analysis: a second dimension in the data
– IRT-based methods: the area between two ICCs
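As one concrete illustration, here is a minimal sketch of the logistic regression approach (the function name, variable names, and the use of statsmodels are assumptions, not from the slides). The item response is regressed on the matching score plus a group indicator; a group coefficient near zero suggests no uniform DIF.

```python
import numpy as np
import statsmodels.api as sm

def logistic_dif(item_correct, total_score, group):
    """Logistic regression DIF check for one item.

    item_correct : 0/1 responses to the studied item
    total_score  : matching variable (e.g., total test score)
    group        : 0 = base/reference group, 1 = focal group
    """
    X = sm.add_constant(np.column_stack([total_score, group]))
    fit = sm.Logit(item_correct, X).fit(disp=0)
    # params[2] is the group coefficient; a value near 0 means no uniform DIF
    return fit.params[2], fit.pvalues[2]
```

Adding a score-by-group interaction term to the model would additionally test for non-uniform DIF.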
[Figure: the area between two item characteristic curves (ICCs), one for males and one for females.]
DIF in MP

Standardization method
– Index describing the degree of DIF: the standardized P-difference
Comparing groups
– Male vs. female
– White vs. Black
– White vs. Hispanic
Minimum of 200 examinees in a group
Classification of DIF
– A: standardized P-difference within [-0.05, 0.05] (negligible)
– B: within [-0.10, -0.05) or (0.05, 0.10] (low)
– C: outside [-0.10, 0.10] (high)
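The rule is simple enough to state directly in code (a sketch; the function name is mine):

```python
def dif_category(std_p_dif):
    """Map a standardized P-difference to the A/B/C DIF categories."""
    size = abs(std_p_dif)
    if size <= 0.05:
        return "A"  # negligible
    if size <= 0.10:
        return "B"  # low
    return "C"      # high
```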
Some more Jargon

Matching variable
– Conditional variable
– Total score, theta score, or an external measure
Focal group
– The study group
Base group
– The reference group
[Figures: the White group (base group) and the Black group (focal group), each with the item of interest. We can now study this item of interest for both the White group and the Black group.]
Impact vs. DIF

Impact
– Difference between two groups in performance at the item level (and at the total score level)
DIF
– Difference between two groups in performance at the item level AFTER the groups are matched on ability
Standardized P-Difference
1) Match the groups by score level
2) At every score level, get the proportion correct for each group
3) Apply a weight to each difference in proportion correct
4) Accumulate these weighted differences across all score levels
5) Divide the sum of the weighted differences by the sum of the weights
Formal Definition of Standardized P-Difference

$$\mathrm{STD\ P\text{-}DIF} = \frac{\sum_{m} w_m \,(P_{fm} - P_{bm})}{\sum_{m} w_m}$$

where
– $w_m$: weighting factor at score level m
– $P_{fm}$: proportion correct of the focal group at score level m
– $P_{bm}$: proportion correct of the base group at score level m
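A minimal sketch of this computation in Python (function and variable names are mine; taking w_m to be the focal-group count at each score level is an assumption, though it is the usual choice in the standardization method):

```python
import numpy as np

def standardized_p_dif(item, score, group, max_score):
    """Standardized P-difference for one dichotomous item.

    item  : array of 0/1 responses to the studied item
    score : matching total score for each examinee
    group : array with 'f' for focal and 'b' for base examinees
    """
    num = den = 0.0
    for m in range(max_score + 1):
        focal = item[(score == m) & (group == "f")]
        base = item[(score == m) & (group == "b")]
        if len(focal) == 0 or len(base) == 0:
            continue  # the score level must occur in both groups
        w = len(focal)  # assumed weight: focal-group count at this level
        num += w * (focal.mean() - base.mean())
        den += w
    return num / den
```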
Does it work?

If we know in advance which items have DIF, we can test whether the method catches them. We simulated data from a 40-item test in which one item had DIF: it was made more difficult for one group than for the other. We then ran the standardized P-difference procedure on every item. Ideally, the method makes the right decision on each one.
Data Simulation plan

Examinees
– 2,000 examinees in the focal group and 8,000 in the base group
– Focal group ability: ~N(0, 1)
– Base group ability: ~N(1, 1)
Items
– 40 MC items only
– 41 score levels (from 0 to 40)
DIF setting
– Only 1 item has DIF
– Its focal-group difficulty parameter is 1.0 higher than its base-group one
– All other items have the same parameters for both groups
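A sketch of this simulation under a Rasch (1PL) model; the slide does not name an IRT model, so the logistic form, the item-difficulty distribution, and which item carries the DIF are assumptions here:

```python
import numpy as np

rng = np.random.default_rng(0)
N_FOCAL, N_BASE, N_ITEMS = 2000, 8000, 40
DIF_ITEM = 0  # index of the DIF item (arbitrary choice for this sketch)

# Abilities as specified on the slide
theta = np.concatenate([rng.normal(0, 1, N_FOCAL),   # focal group
                        rng.normal(1, 1, N_BASE)])   # base group
focal = np.arange(theta.size) < N_FOCAL

# Item difficulties (assumed ~N(0,1)); the DIF item is 1.0 harder for the focal group
b = rng.normal(0, 1, N_ITEMS)
b_matrix = np.tile(b, (theta.size, 1))
b_matrix[focal, DIF_ITEM] += 1.0

# Rasch model: P(correct) = logistic(theta - difficulty)
p = 1 / (1 + np.exp(-(theta[:, None] - b_matrix)))
responses = (rng.random(p.shape) < p).astype(int)
scores = responses.sum(axis=1)
```

Feeding each column of responses, together with scores and the group labels, into the standardized P-difference sketch above should flag only the DIF item.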
[Figures: simulation results for ITEM 26, for ITEM 27, and for ITEM 26 and ITEM 27 together.]
Some more complexity?

Double differential functioning?
– Discrimination parameter or point-biserial correlation
How big is big?
– Hypothesis testing
Spoiled onion in the basket?
– Purification of the matching criterion
Polytomous items?
– Testlet DIF