You got WHAT on that test? Using SAS PROC LOGISTIC and ODS to identify ethnic group Differential Item Functioning (DIF) in professional certification exam questions. Steve Grilli, Life Office Management Association
PRESENTATION OUTLINE: Introduce LOMA; Intro to Educational Stats; LOMA’s SAS Item Analysis Program; “DIF” Defined; Logistic Regression; LOMA’s SAS DIF Identification Program; Conclusions
About LOMA Founded in an international association of insurance and financial services companies Located in Atlanta, GA... with local partners around the world Purpose: to facilitate information sharing, improve company operations and management, provide industry-specific employee development
LOMA By the Numbers: 80+ years of experience; 1,200+ members in 80 countries; 13 professional education programs; courses available in 7 languages; 100,000+ annual examination enrollments; more than 10,000 attendees at conferences and meetings each year; 1,200 individuals serve on more than 50 LOMA committees
Educational Statistics: “Classical” Item Analysis vs. IRT. Item Response Theory (IRT): the three-parameter logistic (3PL) model, a generalization of the Rasch model, with discrimination parameter a, difficulty parameter b, and “pseudo-guessing” parameter c. Used in Computer Adaptive Testing (CAT).
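For reference, the three-parameter model named on this slide is usually written as below. This is the standard textbook form, not a formula taken from the slides; the item subscript i and the omission of any scaling constant are assumptions made for illustration.

```latex
% Probability that an examinee with ability \theta answers item i correctly
% a_i: discrimination, b_i: difficulty, c_i: pseudo-guessing (lower asymptote)
P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + e^{-a_i(\theta - b_i)}}
```

Setting c_i = 0 and holding a_i constant across items reduces this to the one-parameter (Rasch) form.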
Classical Item Analysis: the biserial correlation between performance on a dichotomous test item (X = 1 if the student answered it correctly; 0 otherwise) and a continuous variable, the score on the entire exam.
ITEM ANALYSIS – SAS CODE
/* CALCULATE BISERIAL CORRELATIONS FOR AN ARRAY OF EXAM QUESTIONS */
DATA NEXT;
  SET PXDAT; SET ADD; SET YI;
  ARRAY P PX1-PX&R;            /* per-item values passed to PROBIT (proportion correct) */
  ARRAY ZCAL 3 Z1-Z&R;         /* normal deviate for each P */
  ARRAY BISA 3 BISA1-BISA&R;   /* normal density (ordinate) at ZCAL */
  ARRAY BIS 3 BIS1-BIS&R;      /* biserial correlations */
  ARRAY YI YIMEAN1-YIMEAN&R;
  DO OVER P;
    ZCAL = PROBIT(P);
    BISA = .39894/EXP((ZCAL*ZCAL)/2);   /* .39894 = 1/SQRT(2*PI) */
  END;
  DO OVER BIS;
    BIS = ((YI-YMEAN)/YSTD)*(P/BISA);
  END;
RUN;
PROC TRANSPOSE DATA=NEXT OUT=BIS PREFIX=BIS;
  VAR BIS1-BIS&R;
RUN;
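Read term by term, the DATA step above appears to compute the usual biserial formula. The notation below is an assumption introduced for illustration: p_i stands for P, z_i for ZCAL, \varphi(z_i) for BISA, \bar{Y}_i for the YIMEAN value of item i (presumably the mean exam score of examinees who answered item i correctly), and \bar{Y}, s_Y for YMEAN and YSTD.

```latex
% z_i is the standard normal deviate for p_i (ZCAL = PROBIT(P))
z_i = \Phi^{-1}(p_i)

% Standard normal density at z_i (BISA = .39894 / EXP(ZCAL**2 / 2))
\varphi(z_i) = \frac{1}{\sqrt{2\pi}}\, e^{-z_i^{2}/2}

% Biserial correlation for item i (BIS = ((YI - YMEAN)/YSTD) * (P/BISA))
r_{\mathrm{bis},i} = \frac{\bar{Y}_i - \bar{Y}}{s_Y} \cdot \frac{p_i}{\varphi(z_i)}
```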
ITEM ANALYSIS – SAS OUTPUT
ITEM ANALYSIS PAPER EXAMS COURSE 290 FORM 1265 04M
COURSE: 290 ITEM: 1
  Response columns: 1 2* OMIT
  1,180 UPPER 3RD
  1,181 MIDDLE 3RD
  1,180 LOWER 3RD
  3,541 TOTAL
  BISERIAL CORRELATION: CONFIDENCE:
COURSE: 290 ITEM:
  Response columns: * 5 6 OMIT
  1,180 UPPER 3RD
  1,181 MIDDLE 3RD
  1,180 LOWER 3RD
  3,541 TOTAL
  BISERIAL CORRELATION: CONFIDENCE: 100.0
ITEM ANALYSIS – SAS OUTPUT
PAPER ITEM ANALYSIS EXCEPTION REPORT COURSE 290 FORM 1265 04M
ERROR CODES
  E1: BISERIAL CORRELATION LESS THAN .200
  E2: FEWER THAN 50% OF THE UPPER GROUP CHOSE RIGHT ANSWER
  E3: 25% OR MORE OF UPPER GROUP CHOSE A SPECIFIC DISTRACTOR
  E4: DISCRIMINATION CONFIDENCE LESS THAN 90% (50 OR MORE STUDENTS)
  (NOTE PROBLEM ANSWERS IN PARENTHESIS FOR E2 AND E3)
ITEM  PROBLEMS
53    E1 E4
71    E3(1)
DIFFERENTIAL ITEM FUNCTIONING (DIF): “an item displays DIF if examinees from different groups have differing probabilities or likelihoods of success on the item after conditioning or matching on the ability the item is intended to measure” -- NCME. DIF is a necessary but not a sufficient condition for item bias. Item bias exists when members of one group are less likely to answer an item correctly because of some aspect of the item or the testing situation that is not relevant to the purpose of the testing.
TYPES OF DIF: There are two types of DIF, uniform and non-uniform. Uniform DIF occurs when one group’s advantage is roughly constant across the ability scale. Non-uniform DIF occurs when the advantage varies at different ability levels; i.e., ability and group membership interact.
DIF DETECTION: Experts recommend the use of logistic regression to detect DIF. LOMA chose this method for its conceptual clarity, its ability to detect non-uniform DIF, and the ease with which existing SAS software could be employed in its detection.
LOGISTIC REGRESSION
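The formula originally shown on this slide is not preserved in the text. As a placeholder, the general logistic regression model is sketched below in its standard form; the symbols x_1, ..., x_k and \beta_0, ..., \beta_k are generic notation, not taken from the slides.

```latex
% Probability of a correct response (Y = 1) given predictors x_1, ..., x_k
P(Y = 1 \mid x) = \frac{e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k}}
                       {1 + e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k}}

% Equivalent log-odds (logit) form, linear in the predictors
\ln\!\left(\frac{P(Y = 1 \mid x)}{1 - P(Y = 1 \mid x)}\right)
  = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k
```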
LOMA DIF LOGISTIC MODEL: Theta is the ability measure (score on the exam); E is education (1 if BA or higher, 0 otherwise); G is group membership, generally US vs. China; Theta × G is the interaction term that tests for non-uniform DIF; G × E is the interaction of group and education.
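The model equation itself is not preserved in the slide text. A plausible reconstruction from the terms listed above, and from the MODEL statement in the SAS code, is sketched here; the coefficient names \beta_0 through \beta_5 and the response symbol u are assumptions introduced for illustration.

```latex
% Log-odds of a correct response u = 1 on a single item
\ln\!\left(\frac{P(u = 1)}{1 - P(u = 1)}\right)
  = \beta_0 + \beta_1\,\theta + \beta_2\,E + \beta_3\,G
    + \beta_4\,(\theta \times G) + \beta_5\,(G \times E)

% Reading the DIF type off the retained terms:
%   no DIF:          \beta_3 = 0 and \beta_4 = 0  (score alone predicts success)
%   uniform DIF:     \beta_3 \neq 0, \beta_4 = 0  (constant group effect)
%   non-uniform DIF: \beta_4 \neq 0               (group effect varies with ability)
```

Under this reading, an item whose selected model keeps only the score shows no DIF, one that also keeps the group term shows uniform DIF, and one that keeps the group-by-score interaction shows non-uniform DIF.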
DIF LOGISTIC MODEL: SAS CODE
PROC LOGISTIC DESCENDING;
  ODS OUTPUT TypeIII=MODEL&I GlobalTests=GT&I;
  CLASS EDCODE (PARAM=REF REF='A')
        GRP (PARAM=REF REF='US');
  MODEL RES&I = GRADE EDCODE GRP GRP*GRADE GRP*EDCODE /
        SELECTION=STEPWISE INCLUDE=1
        SLE=.01 SLS=.01 HIER=MULTIPLE;
RUN;
SAS DIF PROGRAM: OUTPUT
DIFFERENTIAL ITEM FUNCTIONING REPORT COURSE M
REFERENCE GROUP: UNITED STATES   FOCAL GROUP: CHINA
ITEM  MODEL PREDICTORS            DIF TYPE     LR CHI SQ  CONFIDENCE
1     SCORE                                               %
2     SCORE                                               %
3     SCORE                                               %
4     SCORE, GROUP                UNIFORM                 %
5     SCORE, GROUP                UNIFORM                 %
6     SCORE, GROUP                UNIFORM                 %
7     SCORE, GROUP                UNIFORM                 %
8     SCORE                                               %
9     SCORE, GROUP, GROUP*SCORE   NON-UNIFORM             %
SAS DIF PROGRAM: FILE OUTPUT
[290,M04,17] 1=US v CH -- S -- NONE
[290,M04,18] 1=US v CH -- S -- NONE
[290,M04,19] 1=US v CH -- S, G, G*S -- NON-U
[290,M04,20] 1=US v CH -- S, ED, G -- U (2)
Item: S0420-E (QID=9780)  CR: 4  DiffiEst: 81  Codes: mc; 0, r  TextRef: O&S, c. 11, pp  Mandatory: Y
Most of an insurer's customers can be characterized as either external or internal. However, some customers have characteristics of both internal and external customers. One example of an insurance customer who has characteristics of both internal and external customers is
  (1) a third-party administrator
  (2) a policy beneficiary
  (3) an individual policyowner
  (4) a general agent
Columns: Item, Region, Group, N, 1, 2, 3, 4*, 5, 6, Omit, CRR, Bis., Conf., E, DIF
S /04  All     (Upper 3rd, Middle 3rd, Lower 3rd, Total)  Bis. 0.466  Conf. 100.0  DIF: US v CH -- S, ED, G -- U (2)
S /04  US/Can  (Upper 3rd, Middle 3rd, Lower 3rd, Total)  Bis. 0.264  Conf. 100.0  E4
S /04  Int'l   (Upper 3rd, Middle 3rd, Lower 3rd, Total)  Bis. 0.527
pp. 288, 289
CONCLUSIONS: DIF needs to be monitored because of increasing globalization. SAS PROC LOGISTIC and ODS provide a simple and effective means of DIF detection.