1
Modern Test Theory
Item Response Theory (IRT)
2
Limitations of classical test theory
An examinee's ability is defined in terms of a particular test.
The difficulty of a test item is defined in terms of a particular group of test-takers.
In short, "examinee characteristics and test item characteristics cannot be separated: each can be interpreted only in the context of the other" (Hambleton et al., 1991, p. 3).
Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. Newbury Park, CA: SAGE Publications, Inc.
3
Joe and the 8-item test
[Figure: Joe's fixed ability plotted against three 8-item tests: a very easy test (score: 8), a very hard test (score: 0), and a narrow hard test (score: 3).]
Adapted from: Wright, B. D., & Stone, M. H. (1979). Best test design. Chicago: MESA Press.
4
Non-linearity of scores
[Figure: Joe's and Tom's ability levels shown against Items 1–8 in three panels, with raw scores of 0, 8, and 4, illustrating that raw scores are not a linear function of ability.]
5
Latent trait and performance
[Diagram: Under classical test theory, a latent variable (true score) underlies the Form 1, Form 2, and Form 3 scores, each with its own error term. Under item response theory, the latent variable underlies the individual item responses (scored 1/0) to Item 1, Item 2, and Item 3.]
Embretson, S. E. (1999). Issues in the measurement of cognitive abilities. In S. E. Embretson & S. L. Hershberger (Eds.), The new rules of measurement (pp. 1-15). Mahwah, NJ: Lawrence Erlbaum Associates.
6
Item Response Theory (IRT)
The performance of an examinee on a test item can be predicted (explained) by latent traits.
As a person's level of the underlying trait increases, the probability of a correct response to an item increases.
This relationship between person and item can be visualized by an item characteristic curve (ICC) (Hambleton et al., 1991).
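To make the ICC idea concrete, here is a small sketch (not part of the original slides) that plots one item characteristic curve under a two-parameter logistic model; the discrimination and difficulty values are invented for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical item parameters (illustrative only)
a = 1.2   # discrimination: steepness of the curve
b = 0.5   # difficulty: trait level where P(correct) = 0.5

theta = np.linspace(-4, 4, 200)              # latent trait (ability) scale
p = 1.0 / (1.0 + np.exp(-a * (theta - b)))   # 2PL probability of a correct response

plt.plot(theta, p)
plt.xlabel("Latent trait (theta)")
plt.ylabel("P(correct response)")
plt.title("Item characteristic curve (hypothetical item)")
plt.show()
```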
7
Understanding Item Characteristic Curves
Imagine a continuum of vocabulary knowledge: Sleepy … Somnolent … Oscitant
Thorndike, R. M. (1999). IRT and intelligence testing: Past, present, and future. In S. E. Embretson & S. L. Hershberger (Eds.), The new rules of measurement (pp. 17-36). Mahwah, NJ: Lawrence Erlbaum Associates.
8
Understanding ICC (2) (Thorndike, 1999, p. 20)
9
Item Difficulty
10
Item Discrimination
11
3-Parameter Model
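The slide itself shows a graph; for reference, the three-parameter logistic model described in Hambleton et al. (1991) gives the probability of a correct response to item i for an examinee with ability θ as:

```latex
P_i(\theta) = c_i + (1 - c_i)\,\frac{e^{a_i(\theta - b_i)}}{1 + e^{a_i(\theta - b_i)}}
```

Here b_i is the item's difficulty, a_i its discrimination, and c_i the lower asymptote (pseudo-guessing) parameter; some texts, including Hambleton et al. (1991), also include a scaling constant D ≈ 1.7 in the exponent. Fixing c_i = 0 gives the two-parameter model, and additionally fixing all a_i equal gives the one-parameter (Rasch) model, matching the assumptions listed on the "Assumptions of IRT" slide.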
12
Vocabulary ICC revisited
13
Basic IRT concept
Prob(item passed) = function[(trait level) - (item difficulty)]
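A minimal sketch of this idea, assuming the one-parameter (Rasch) form in which the function is the logistic: the probability of passing depends only on the difference between trait level and item difficulty. The values used are illustrative, not taken from the slides.

```python
import math

def prob_pass(trait_level: float, item_difficulty: float) -> float:
    """Rasch (1PL) probability that a person passes an item:
    a logistic function of (trait level - item difficulty)."""
    return 1.0 / (1.0 + math.exp(-(trait_level - item_difficulty)))

# Illustrative values: person ability 1.0, item difficulties from easy to hard
for b in (-2.0, 0.0, 1.0, 2.0):
    print(f"difficulty {b:+.1f}: P(pass) = {prob_pass(1.0, b):.2f}")
```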
14
Assumptions of IRT
Unidimensionality – only one ability is measured by the set of items on a test
Local independence – an examinee's responses to any two items are statistically independent once ability is taken into account
1-parameter model – no guessing, and item discrimination is the same for all items
2-parameter model – no guessing
15
Advantages of IRT
Sample-free item calibration
Test-free person measurement
Item banking facility
Computer delivery of tests
Test tailoring facility
Score reporting facility
Item bias detection
Henning, G. (1987). A guide to language testing: Development, evaluation, research. Boston: Heinle & Heinle.
16
Linking items across test forms
As long as there are some common items (linking items), person ability estimates can be placed on a common scale even when examinees take different items.
[Figure: items common to Test A and Test B] (Henning, 1987, p. 133)
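The slide does not spell out how the linking is done; one standard approach, sketched here under a Rasch-type model with invented difficulty estimates, is mean-shift linking: the average difference in the common items' difficulty estimates gives a constant that places Test B's calibration on Test A's scale.

```python
# Illustrative common-item (mean-shift) linking under a Rasch-style model.
# The difficulty estimates below are made up for the example.

# Difficulty estimates for the linking items, as calibrated on each form
common_on_A = {"item7": -0.40, "item12": 0.10, "item19": 0.95}
common_on_B = {"item7": -0.10, "item12": 0.45, "item19": 1.20}

# Linking constant: mean difference (A - B) over the common items
shift = sum(common_on_A[i] - common_on_B[i] for i in common_on_A) / len(common_on_A)

# Any Test B difficulty can now be expressed on Test A's scale
b_on_B = 0.60                    # a non-common Test B item
b_on_A_scale = b_on_B + shift
print(f"shift = {shift:+.2f}, Test B item 0.60 -> {b_on_A_scale:+.2f} on Test A's scale")
```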
17
Score reporting facility (McNamara, 1996, p. 201)
18
Test tailoring facility
An untailored standardized test gives maximum information near its mean.
Imagine that a university required a score above 67 to be admitted and above 82 to be exempt from language classes.
A tailored test can be "loaded" with items that provide maximum information at the cut-points.
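As a hedged sketch of how item selection at cut-points might work: the 2PL item information formula below is standard, but the item bank, parameters, and the cut-points' locations on the ability scale are assumptions made for this example.

```python
import math

def p_2pl(theta: float, a: float, b: float) -> float:
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta: float, a: float, b: float) -> float:
    """Standard 2PL item information: a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

# Hypothetical item bank: (name, discrimination a, difficulty b) on the theta scale
bank = [("v01", 1.0, -1.5), ("v02", 1.4, 0.2), ("v03", 1.2, 0.9),
        ("v04", 0.8, 1.6), ("v05", 1.6, 0.5), ("v06", 1.1, 2.0)]

# Suppose the two cut scores correspond to these points on the theta scale (assumption)
cut_points = [0.3, 1.2]

# Rank items by the total information they provide at the cut-points
scored = sorted(bank,
                key=lambda item: sum(item_information(t, item[1], item[2]) for t in cut_points),
                reverse=True)
print("Most informative items at the cut-points:", [name for name, _, _ in scored[:3]])
```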
19
Computerized testing
Computer-delivered tests
– Tests which use a computer rather than pencil and paper for test content delivery
– Items can take advantage of the computer's multimedia capabilities
Computer-adaptive tests
– Test is created "on the fly" to match the examinee's ability level
Web-based tests
– Delivered over the World Wide Web
– Test-takers can access the test from anywhere
20
Adaptive testing Sands, W. A., & Waters, B. K. (1997). Introduction to ASVAB and CAT. In W. A. Sands & B. K. Waters & J. R. McBride (Eds.), Computerized adaptive testing (pp. 3-10). Washington: American Psychological Association.
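The slide presumably reproduces the adaptive-testing flowchart from Sands and Waters (1997). The sketch below is only a minimal rendering of the general idea under a Rasch-style model (give the most informative remaining item, update the provisional ability estimate from the response, and repeat), using made-up item difficulties and a crude step-halving update in place of a proper maximum-likelihood estimate.

```python
import random

# Hypothetical item pool: item name -> Rasch difficulty (illustrative values)
pool = {"q1": -2.0, "q2": -1.0, "q3": -0.5, "q4": 0.0, "q5": 0.5, "q6": 1.0, "q7": 2.0}

def get_response(item: str) -> bool:
    """Stand-in for administering an item; here we just simulate a response."""
    return random.random() < 0.5

theta = 0.0      # provisional ability estimate
step = 1.0       # crude update step, halved after each item

for _ in range(5):                       # fixed-length test for simplicity
    # Under the Rasch model, the most informative item is the one closest to theta
    item = min(pool, key=lambda name: abs(pool[name] - theta))
    correct = get_response(item)
    del pool[item]                       # do not reuse the item
    theta += step if correct else -step  # move toward the examinee's level
    step /= 2                            # smaller adjustments as evidence accumulates

print(f"Provisional ability estimate: {theta:+.2f}")
```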
21
CAT advantages
Increased efficiency
– More able examinees are not bored with easy questions
– Less able examinees are not frustrated with overly difficult questions
Immediate feedback is possible
Examinees can work at their own pace
Audiovisual material can be incorporated
Potential for "on demand" testing
22
CAT challenges
Technical sophistication required to develop and administer a CAT
Need for a large item pool
Overexposure of the best items
Ensuring consistency of measures and content across candidates
Public perception of computer-based scores as either
– Completely infallible, or
– Completely bogus