1 IRT basics: Theory and parameter estimation Wayne C. Lee, David Chuah, Patrick Wadlington, Steve Stark, & Sasha Chernyshenko
2 Overview
How do I begin a set of IRT analyses?
What do I need? Software and data.
What do I do? Input/syntax files; examination of output.
On-line!
3 "Eye-ARE-What?"
Item response theory (IRT) is a set of probabilistic models that describes the relationship between a respondent's standing on a construct (a.k.a. latent trait; e.g., extraversion, cognitive ability, affective commitment) and his or her probability of giving a particular response to an individual item.
4 But what does that buy you?
Provides more information than classical test theory (CTT).
Classical test statistics depend on the particular set of items and the sample examined; IRT item parameters are, in principle, invariant across samples (up to a scale transformation).
IRT can also be used to examine item bias/measurement equivalence and provides conditional standard errors of measurement.
5 Before we begin… Data preparation
Raw data must be recoded where necessary: negatively worded items must be reverse coded so that all items in the scale point in the same (positive) direction.
Dichotomization (optional): reducing multiple response options to two values (0/1; wrong/right).
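A minimal sketch of this recoding step, assuming pandas, a 1-5 Likert response matrix, and hypothetical column and file names:

import pandas as pd

# Hypothetical raw data: 1-5 Likert responses; "agr3" and "agr7" are negatively worded.
responses = pd.read_csv("agreeableness_raw.csv")
reverse_keyed = ["agr3", "agr7"]

# Reverse code: on a 1-5 scale, new = (max + min) - old = 6 - old.
responses[reverse_keyed] = 6 - responses[reverse_keyed]

# Optional dichotomization: 4 or 5 ("agree") -> 1, otherwise -> 0.
dichotomous = (responses.filter(like="agr") >= 4).astype(int)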
6 Calibration and validation files
The data are split into two separate files:
a calibration sample for estimating the IRT parameters, and
a validation sample for assessing the fit of the model to the data.
Data files for the programs discussed here must be in ASCII/text format.
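A sketch of the random split, assuming a pandas data frame of recoded responses and made-up file names; each half is written out as plain ASCII text:

import pandas as pd

responses = pd.read_csv("agreeableness_recoded.csv")   # hypothetical recoded data

# Half of the respondents for calibration, the remainder for validation.
calibration = responses.sample(frac=0.5, random_state=1)
validation = responses.drop(calibration.index)

calibration.to_csv("agr_cal.dat", index=False)
validation.to_csv("agr_val.dat", index=False)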
7 Investigating dimensionality
The models presented make a common assumption of unidimensionality.
Hattie (1985) reviewed 30 techniques for assessing it.
Some propose examining the ratio of the 1st eigenvalue to the 2nd eigenvalue (Lord, 1980).
On-line we describe how to examine the eigenvalues following Principal Axis Factoring (PAF).
8 PAF and scree plots
If the data are dichotomous, factor analyze tetrachoric correlations, which assume a continuum underlies the item responses.
Look for a dominant first factor (see the sketch below).
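A sketch of the eigenvalue check, assuming a respondents-by-items matrix in a hypothetical text file; for simplicity it uses Pearson correlations, whereas tetrachoric correlations (which need a dedicated routine) are recommended for dichotomous items:

import numpy as np

items = np.loadtxt("agr_cal_matrix.txt")      # hypothetical respondents x items matrix
corr = np.corrcoef(items, rowvar=False)       # Pearson; substitute tetrachorics for 0/1 data

# Sort eigenvalues from largest to smallest and inspect the 1st-to-2nd ratio.
eigenvalues = np.sort(np.linalg.eigvalsh(corr))[::-1]
print("Eigenvalues:", np.round(eigenvalues, 3))
print("Ratio of 1st to 2nd eigenvalue:", eigenvalues[0] / eigenvalues[1])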
9 Two models presented
The Three Parameter Logistic model (3PL): for dichotomous data, e.g., cognitive ability tests.
Samejima's Graded Response model (SGR): for polytomous data where options are ordered along a continuum, e.g., Likert scales.
Both are common models among applied psychologists.
10 The 3PL model
Three item parameters:
a = item discrimination
b = item extremity/difficulty
c = lower asymptote ("pseudo-guessing")
Theta refers to the latent trait.
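A sketch of the 3PL item response function, using the common logistic form with the 1.7 scaling constant (consistent with the LOGIT/1.7 column in the BILOG output later); the parameter values are invented:

import numpy as np

def p_3pl(theta, a, b, c, D=1.7):
    """Probability of a keyed/correct response under the 3PL model."""
    return c + (1.0 - c) / (1.0 + np.exp(-D * a * (theta - b)))

# Hypothetical item: good discrimination, average difficulty, some guessing.
theta = np.linspace(-3, 3, 7)
print(np.round(p_3pl(theta, a=1.5, b=0.0, c=0.15), 3))

At very low theta the probability approaches c; at high theta it approaches 1.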
11 Effect of the “a” parameter Small “a,” poor discrimination
12 Effect of the “a” parameter Larger “a,” better discrimination
13 Effect of the “b” parameter Low “b,” “easy item”
14 Effect of the "b" parameter
Higher "b," more difficult item; "b" is inversely related to the CTT p-value.
15 Effect of the “c” parameter c=0, asymptote at zero
16 Effect of the "c" parameter
With c > 0, even "low ability" respondents have some probability of endorsing the correct response.
17 Estimating 3PL parameters
We use the DOS version of BILOG (Scientific Software).
It places multiple files in the directory, but their overall size is small, and it makes it easier to estimate parameters for a large number of scales or experimental groups.
The data file must be saved as ASCII text: an ID number followed by the individual responses.
The input file is also ASCII text (see the following slides).
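A sketch of writing the data file in the layout the BILOG input file's FORTRAN statement (4A1,10A1) expects: a 4-character ID followed immediately by 10 one-character responses per record; the IDs and responses here are made up:

# Hypothetical respondents: (ID, ten dichotomous responses).
records = [
    ("0001", [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]),
    ("0002", [0, 0, 1, 0, 0, 1, 0, 0, 1, 0]),
]

with open("AGR2_CAL.DAT", "w") as out:
    for resp_id, answers in records:
        # 4 ID characters, then the responses, with no delimiters (matches 4A1,10A1).
        out.write(resp_id.rjust(4) + "".join(str(a) for a in answers) + "\n")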
18 BILOG input file (*.BLG)
AGREEABLENESS CALIBRATION FOR IRT TUTORIAL.
>COMMENT
>GLOBAL DFN='AGR2_CAL.DAT', NIDW=4, NPARM=3, OFNAME='OMIT.KEY', SAVE;
>SAVE SCO = 'AGR2_CAL.SCO', PARM = 'AGR2_CAL.PAR', COV = 'AGR2_CAL.COV';
>LENGTH NITEMS=(10);
>INPUT SAMPLE=99999;
(4A1,10A1)
>TEST TNAME=AGR;
>CALIB NQPT=40, CYC=100, NEW=30, CRIT=.001, PLOT=0;
>SCORE MET=2, IDIST=0, RSC=0, NOPRINT;
The first line is the title line.
19 BILOG input file (*.BLG): in >GLOBAL, DFN gives the data file name, NIDW the number of characters in the ID field, NPARM the number of parameters, and OFNAME the file used for missing (omitted) responses.
20 BILOG input file (*.BLG): >SAVE requests files for scoring (SCO), parameters (PARM), and covariances (COV).
21 BILOG input file (*.BLG): NITEMS gives the number of items and SAMPLE the sample size.
22 BILOG input file (*.BLG): (4A1,10A1) is the FORTRAN statement for reading the data; TNAME gives the name of the scale/measure.
23 BILOG input file (*.BLG): >CALIB gives the estimation specifications (not the defaults for BILOG).
24 BILOG input file (*.BLG): >SCORE requests maximum likelihood scoring with no prior distribution of scale scores and no rescaling.
25 Phase one output file (*.PH1)
CLASSICAL ITEM STATISTICS FOR SUBTEST AGR

                                                      ITEM*TEST CORRELATION
ITEM  NAME   NUMBER   NUMBER   PERCENT  LOGIT/1.7    PEARSON   BISERIAL
             TRIED    RIGHT
---------------------------------------------------------------------------
  1   0001   1500.0   1158.0    0.772      0.72       0.535     0.742
  2   0002   1500.0    991.0    0.661      0.39       0.421     0.545
  3   0003   1500.0   1354.0    0.903      1.31       0.290     0.500
  4   0004   1500.0   1187.0    0.791      0.78       0.518     0.733
  5   0005   1500.0    970.0    0.647      0.36       0.566     0.728
  6   0006   1500.0   1203.0    0.802      0.82       0.362     0.519
  7   0007   1500.0    875.0    0.583      0.20       0.533     0.674
  8   0008   1500.0    810.0    0.540      0.09       0.473     0.594
  9   0009   1500.0   1022.0    0.681      0.45       0.415     0.542
 10   0010   1500.0    869.0    0.579      0.19       0.426     0.538
---------------------------------------------------------------------------
These statistics can indicate problems in parameter estimation.
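A sketch for recomputing two of these classical statistics (the PERCENT column and a Pearson item-total correlation) from the 0/1 calibration data as a rough sanity check; the file name is an assumption, and BILOG's exact item-test correlation may differ slightly:

import numpy as np

data = np.loadtxt("agr_cal_matrix.txt")       # hypothetical respondents x items 0/1 matrix
total = data.sum(axis=1)

for j in range(data.shape[1]):
    p = data[:, j].mean()                      # proportion answering in the keyed direction
    r = np.corrcoef(data[:, j], total)[0, 1]   # Pearson item-test correlation
    print(f"item {j + 1}: percent = {p:.3f}, pearson r = {r:.3f}")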
26 Phase two output file (*.PH2)
CYCLE  12:  LARGEST CHANGE = 0.00116
            -2 LOG LIKELIHOOD = 15181.4541
CYCLE  13:  LARGEST CHANGE = 0.00071
            [FULL NEWTON STEP]
            -2 LOG LIKELIHOOD = 15181.2347
CYCLE  14:  LARGEST CHANGE = 0.00066
Check for convergence.
27 Phase three output file (*.PH3)
Theta estimation: scoring of individual respondents.
Required for DTF analyses.
28 Parameter file (specified, *.PAR)
AGREEABLENESS CALIBRATION FOR IRT TUTORIAL.
>COMMENT
 1 10 10
0001AGR 111   1.130784   1.533393  -0.737439   0.652148   0.147203   0.101834   0.185726   0.135455   0.078989   0.053688
0002AGR 211   0.360630   0.870309  -0.414371   1.149018   0.132796   0.087236   0.097709   0.098866   0.129000   0.054461
0003AGR 311   1.474175   0.743095  -1.983831   1.345723   0.197127   0.108974   0.084487   0.250499   0.153003   0.087578
0004AGR 411   1.196368   1.256263  -0.952323   0.796012   0.090901   0.087856   0.114710   0.123613   0.072684   0.042937
0005AGR 511   0.544388   1.403904  -0.387767   0.712300   0.056774   0.071490   0.133486   0.080438   0.067727   0.026086
0006AGR 611   0.892399   0.777440  -1.147869   1.286273   0.173882   0.093109   0.082096   0.152846   0.135828   0.075829
0007AGR 711   0.174395   1.369223  -0.127368   0.730341   0.088135   0.083777   0.159712   0.085084   0.085190   0.032376
The FORTRAN format (32X,2F12.6,12X,F12.6) picks out the "a", "b", and "c" columns from each item line.
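A sketch of pulling a, b, and c out of each item line using the fixed-width layout implied by that format statement (skip 32 characters, read two 12-character fields, skip 12 characters, read one more); the column positions are taken from the format shown on the slide, and header lines that do not fit the layout are simply skipped:

def read_3pl_parameters(path="AGR2_CAL.PAR"):
    """Extract (a, b, c) per item following the format (32X,2F12.6,12X,F12.6)."""
    params = []
    with open(path) as parfile:
        for line in parfile:
            try:
                a = float(line[32:44])
                b = float(line[44:56])
                c = float(line[68:80])
            except ValueError:
                continue    # title or header lines do not match the fixed-width layout
            params.append((a, b, c))
    return params

print(read_3pl_parameters())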
29 PARTO3PL output (*.3PL)
0001AGR 111   1.130784   1.533393  -0.737439   0.652148   0.147203
0002AGR 211   0.360630   0.870309  -0.414371   1.149018   0.132796
0003AGR 311   1.474175   0.743095  -1.983831   1.345723   0.197127
0004AGR 411   1.196368   1.256263  -0.952323   0.796012   0.090901
0005AGR 511   0.544388   1.403904  -0.387767   0.712300   0.056774
0006AGR 611   0.892399   0.777440  -1.147869   1.286273   0.173882
0007AGR 711   0.174395   1.369223  -0.127368   0.730341   0.088135
0008AGR 811   0.042231   0.979045  -0.043135   1.021403   0.056546
0009AGR 911   0.441586   0.839144  -0.526234   1.191691   0.129646
0010AGR 1011  0.104452   0.879683  -0.118738   1.136773   0.101087
Each row lists the item label followed by the estimated item parameters, including "a", "b", and "c".
30 Scoring and covariance files
Like the *.PAR file, these must be specifically requested.
*.COV provides the parameters as well as the variances/covariances between the parameters; it is necessary for DIF analyses.
*.SCO provides ability score information for each respondent.
31 Samejima's Graded Response model
Used when options are ordered along a continuum, as with Likert scales.
v = response to the polytomously scored item i
k = a particular option
a = discrimination parameter
b = extremity parameter
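A sketch of the SGR category probabilities for a single item, assuming the usual logistic form in which each boundary probability is 1 / (1 + exp(-a(theta - b_k))) and each option's probability is the difference between adjacent boundaries (note that MULTILOG's reported "a", shown later, additionally absorbs a 1.7 scaling factor); the parameter values are invented:

import numpy as np

def sgr_category_probs(theta, a, b):
    """P(option k | theta), k = 1..len(b)+1, under Samejima's graded response model."""
    b = np.asarray(b, dtype=float)                  # ordered extremity parameters
    cum = 1.0 / (1.0 + np.exp(-a * (theta - b)))    # P(response in option k or higher)
    boundaries = np.concatenate(([1.0], cum, [0.0]))
    return boundaries[:-1] - boundaries[1:]

# Hypothetical 5-option Likert item.
print(np.round(sgr_category_probs(theta=0.5, a=2.0, b=[-2.0, -1.0, 0.0, 1.5]), 3))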
32 Sample SGR plot
Curves for the "low option" and the "high option"; low discrimination (a = 0.4).
33 Sample SGR plot
Better discrimination (a = 2).
34 Running MULTILOG
We use MULTILOG for DOS; the example uses a DOS batch file.
INFORLOG is used with MULTILOG and is typically interactive; here the process is automated with a batch file and an input file (described on-line).
Two input files: *.IN1 (parameter estimation) and *.IN2 (scoring).
35 The first input file (*.IN1)
CALIBRATION OF AGREEABLENESS GRADED RESPONSE MODEL
>PRO IN RA NI=10 NE=1500 NCHAR=4 NG=1;
>TEST ALL GR NC=(5,5,5,5,5,5,5,5,5,5);
>EST NC=50;
>SAVE;
>END;
5
01234
1111111111
2222222222
3333333333
4444444444
5555555555
(4A1,10A1)
The first line is the title line.
36 The first input file (*.IN1): >PRO gives the number of items (NI), examinees (NE), characters in the ID field (NCHAR), and a single group (NG=1).
37 The first input file (*.IN1): GR requests the SGR model; NC gives the number of options for each item.
38 The first input file (*.IN1): >EST NC=50 gives the number of cycles for estimation; >END marks the end of the command syntax.
39 The first input file (*.IN1): the "5" and "01234" lines give the five characters denoting the five options.
40 The first input file (*.IN1): the lines of 1s through 5s give the recoding of the options for MULTILOG.
41 The second input file (*.IN2)
SCORING AGREEABLENESS SCALE SGR MODEL
>PRO SCORE IN RA NI=10 NE=1500 NCHAR=4 NG=1;
>TEST ALL GR NC=(5,5,5,5,5,5,5,5,5,5);
>START;
Y
>SAVE;
>END;
5
12345
1111111111
2222222222
3333333333
4444444444
5555555555
(4A1,10A1)
SCORE requests scoring; the "Y" answers Yes to INFORLOG (the parameters are in a separate file).
42 Running MULTILOG
Run the batch file: *.IN1 produces *.LS1 (the *.lis file renamed as *.ls1).
Check that the data were read in and the model was specified correctly; the file also provides a report of the estimation procedure along with the estimated item parameters.
Things of note…
ITEM 1: 5 GRADED CATEGORIES
         P(#)  ESTIMATE  (S.E.)
A         1      1.99    (0.12)
B( 1)     2     -3.03    (0.18)
B( 2)     3     -2.35    (0.11)
B( 3)     4     -0.98    (0.06)
B( 4)     5      2.01    (0.10)

@THETA:   -2.0  -1.5  -1.0  -0.5   0.0   0.5   1.0   1.5   2.0
I(THETA): 1.08  1.04  1.05  0.81  0.49  0.35  0.47  0.79  0.99

OBSERVED AND EXPECTED COUNTS/PROPORTIONS IN CATEGORY(K):
               1     2     3     4     5
OBS. FREQ.    21    44   277  1050   108
OBS. PROP.  0.01  0.03  0.18  0.70  0.07
EXP. PROP.  0.01  0.03  0.19  0.70  0.07

Note: the "a" estimate includes a 1.7 scaling factor. The observed frequencies for each option can indicate whether sparse options should be collapsed.
44 Scoring output
*.IN2 produces *.LS2.
The last portion of the file contains the person parameters (estimated theta, standard error, the number of iterations used, and the respondent's ID number).
45 What now? Review
Data requirements for IRT.
Two models: 3PL (dichotomous) and SGR (polytomous); more on-line!
MODFIT can plot IRFs and ORFs and assess model-data fit: input the parameters and the validation sample.