The ABC’s of Pattern Scoring
Dr. Cornelia Orr

Vocabulary
  - Measurement (psychometrics is a specialized application)
  - Classical test theory
  - Item response theory (IRT), also known as latent trait theory
  - 1-, 2-, and 3-parameter IRT models
  - Pattern scoring

General & Specialized
  - Measurement
      - Assigning numbers to objects or events
      - Ex.: time, height, earthquakes, hurricanes, stock market
  - Psychometrics
      - Assigning numbers to psychological characteristics
      - Ex.: personality, IQ, opinion, interests, knowledge

Different Theories of Psychometrics
  - Classical Test Theory
      - Item discrimination values
      - Item difficulty values (p-values)
      - Guessing (penalty)
      - Number correct scoring
  - Item Response Theory
      - Item discrimination values
      - Item difficulty values
      - Guessing (pseudo-guessing) values
      - Pattern scoring
  - Similar constructs, different derivations

Different Methods of Scoring
  - Number-Correct Scoring
      - Simple mathematics
      - Raw scores (# of points)
      - Mean, SD, SEM, % correct
      - Number right scale
      - Score conversions: scale scores, percentile ranks, etc.
  - Pattern Scoring
      - Complex mathematics
      - Maximum likelihood estimates
      - Item statistics, student’s answer pattern, SEM
      - Theta scale (mean = 0, standard deviation = 1)
      - Score conversions: scale scores, percentile ranks, etc.

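The score conversion step from the theta scale to a reporting scale is often a linear transformation. A minimal Python sketch of that idea, assuming a hypothetical slope of 50 and intercept of 300 (neither value comes from the slides) and an illustrative reporting range of 100 to 500:

```python
def theta_to_scale_score(theta, slope=50.0, intercept=300.0, lo=100, hi=500):
    """Map a theta estimate (mean 0, SD 1) onto a reporting scale.

    slope, intercept, lo and hi are illustrative assumptions, not values
    taken from any real testing program.
    """
    raw = slope * theta + intercept
    # Round to a whole scale score and clip to the reporting range.
    return int(min(hi, max(lo, round(raw))))


for theta in (-5.0, -1.0, 0.0, 0.5, 5.0):
    print(theta, theta_to_scale_score(theta))
```

Extreme theta estimates are clipped to the ends of the range, which is one way a reporting scale can limit the set of possible scores.
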
Comparison: Number Correct and Pattern Scoring
  - Similarities
      - The relationship of derived scores is the same. For example, a scale score obtained on a test corresponds to the same percentile under both methods.
  - Differences
      - Methods of deriving scores
      - The number of scale scores possible
          - Number right: limited by the number of items
          - IRT: unlimited, or limited only by the scale (e.g., 100-500)

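The difference in the number of possible scores can be counted directly. For right/wrong items, n items allow only n + 1 distinct number-correct scores, but 2^n distinct answer patterns. A small Python sketch, assuming a 5-item test; whether every pattern actually receives its own scale score depends on the model and on rounding, which this count does not address:

```python
from itertools import product

n_items = 5  # illustrative test length (an assumption)

# Every possible right/wrong answer pattern for n_items dichotomous items.
patterns = list(product((0, 1), repeat=n_items))
# The set of number-correct scores those patterns collapse into.
raw_scores = {sum(p) for p in patterns}

print(len(raw_scores))  # distinct number-correct scores (0 through 5)
print(len(patterns))    # distinct answer patterns
```
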
Choosing the Scoring Method
  - Which model?
  - Simple vs. complex?
  - Best estimates?
  - Advantages/disadvantages?
  - Ex.: Why does the same number correct get different scale scores?
  - Ex.: Flat screen TV: how do they do that?

Disadvantages of IRT and Pattern Scoring
  - Complex mathematics (technical)
  - Difficult to explain
  - Difficult to understand
  - It doesn’t add up!
  - Perceived as hocus-pocus

Advantages of IRT and Pattern Scoring
  - Better estimates of an examinee’s ability: the score that is most likely, given the student’s responses to the questions on the test (maximum likelihood scoring)
  - More information about students and items is used
  - More reliability than number right scoring
  - Less measurement error (SEM)

Item Characteristic Curve (ICC)
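An item characteristic curve gives the probability of a correct answer as a function of ability. The Python sketch below assumes the three-parameter logistic (3PL) form; the slides name 1-, 2-, and 3-parameter models but do not state the formula, so the functional form, the scaling constant `D`, and the parameter values used here are all assumptions.

```python
import math


def icc_3pl(theta, a, b, c, D=1.0):
    """Probability of a correct response under the 3PL model.

    a: discrimination, b: difficulty, c: pseudo-guessing (lower asymptote).
    D is a scaling constant; D=1.0 is an assumption (1.7 is also common).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


# Illustrative parameters on a scale-score metric (hypothetical values).
a, b, c = 0.02, 300.0, 0.2
for theta in (100, 200, 300, 400, 500):
    print(theta, round(icc_3pl(theta, a, b, c), 3))
```

At `theta == b` the curve sits halfway between the guessing floor `c` and 1; a larger `a` makes the curve steeper around that point.
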
Examples: Effects of Item Discrimination
5 items:

  No  Type  a       b        c
  1   MC    0.0250  300.000  0.2
  2   MC    0.0200  300.000  0.2
  3   MC    0.0150  300.000  0.2
  4   MC    0.0100  300.000  0.2
  5   MC    0.0050  300.000  0.2

4 examinees’ response patterns (1 = correct):

  Pattern  SEM  SS
  12345
  11100    39   300
  01110    46   278
  00111    61   258
  10011    94   260

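The effect in this example can be explored with a rough Python sketch of pattern scoring. It assumes a 3PL model with a plain logistic (D = 1), a grid search over whole scale-score points from 100 to 500, and an SEM taken as 1 / sqrt(test information). None of these choices is stated in the slides, so the sketch is not expected to reproduce the SEM and SS values shown; it only illustrates that, with the same number correct, answering the more discriminating items correctly yields a higher estimate.

```python
import math

# Item parameters (a, b, c) from the example: all five items share b and c.
ITEMS = [(0.025, 300.0, 0.2), (0.020, 300.0, 0.2), (0.015, 300.0, 0.2),
         (0.010, 300.0, 0.2), (0.005, 300.0, 0.2)]


def p_correct(theta, a, b, c):
    """3PL probability of a correct answer (plain logistic, D = 1: an assumption)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def log_likelihood(theta, pattern):
    """Log-likelihood of a 0/1 answer string such as '11100' at a given theta."""
    total = 0.0
    for (a, b, c), u in zip(ITEMS, pattern):
        p = p_correct(theta, a, b, c)
        total += math.log(p) if u == "1" else math.log(1.0 - p)
    return total


def ml_score(pattern, lo=100, hi=500):
    """Grid-search maximum likelihood estimate over whole scale-score points."""
    return max(range(lo, hi + 1), key=lambda t: log_likelihood(t, pattern))


def sem(theta):
    """Approximate SEM as 1 / sqrt(test information) under the 3PL model."""
    info = 0.0
    for a, b, c in ITEMS:
        p = p_correct(theta, a, b, c)
        info += a * a * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2
    return 1.0 / math.sqrt(info)


for pattern in ("11100", "01110", "00111", "10011"):
    est = ml_score(pattern)
    print(pattern, est, round(sem(est), 1))
```

All four patterns have three items correct, so number-correct scoring would give them the same score; the likelihood-based estimates differ because each item contributes according to its discrimination.
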
Examples: Effects of Item Difficulty
5 items:

  No     Type  a       b        c
  1   1  MC    0.0150  250.000  0.1
  2   1  MC    0.0150  275.000  0.1
  3   1  MC    0.0150  300.000  0.1
  4   1  MC    0.0150  325.000  0.1
  5   1  MC    0.0150  350.000  0.1

4 examinees’ response patterns (1 = correct):

  Pattern  SEM  SS
  12345
  11100    43   300
  01110    43   305
  00111    43   299
           43   310

Missing easy items can result in a lower score.

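The statement that missing easy items can lower a score can be checked with a small Python sketch. It assumes a 3PL model with a plain logistic (D = 1) and a grid search over whole scale-score points from 100 to 500; the slides do not give the scoring details, so the estimates it prints are not expected to match the SS column. Only the three patterns that are legible in the example are scored.

```python
import math

# (a, b, c) for the five items in the example: equal a and c, rising b.
ITEMS = [(0.015, b, 0.1) for b in (250.0, 275.0, 300.0, 325.0, 350.0)]


def p_correct(theta, a, b, c):
    """3PL probability of a correct answer (plain logistic, D = 1: an assumption)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def ml_score(pattern, lo=100, hi=500):
    """Grid-search maximum likelihood estimate over whole scale-score points."""
    def log_likelihood(theta):
        return sum(
            math.log(p_correct(theta, a, b, c) if u == "1"
                     else 1.0 - p_correct(theta, a, b, c))
            for (a, b, c), u in zip(ITEMS, pattern)
        )
    return max(range(lo, hi + 1), key=log_likelihood)


# Same number correct (3 of 5), but different items answered correctly.
for pattern in ("11100", "01110", "00111"):
    print(pattern, sum(int(u) for u in pattern), ml_score(pattern))
```

Under these assumptions, a correct answer to a hard item paired with misses on easy items is partly attributed to guessing (c > 0), so the pattern that misses the two easiest items receives the lowest estimate of the three.
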