MEASUREMENT
Goal: To develop reliable and valid measures using state-of-the-art measurement models
Members: Chang, Berdes, Gehlert, Gibbons, Schrauf, Weiss
Why Item Response Theory?
Classical Test Theory (Traditional) | Item Response Theory (Modern)
Measures of precision are fixed for all scores | Precision measures vary across scores
Longer scales increase reliability | Shorter, targeted scales can be equally reliable (Short Form)
Scale properties are sample dependent | Item & scale properties are invariant within a linear transformation (DIF)
Comparing person scores depends on the item set | Person scores are comparable across different item sets (CAT)
Comparing respondents requires parallel scales | Different scales can be placed on a common metric (Instrument Linking/Equating)
Mixed item formats lead to unbalanced impact on total scale scores | Easily handles mixed item formats
Summed scores are on an ordinal scale | Scores are on an interval scale
(no counterpart) | Graphical tools for item and scale analysis
Item Response Theory (IRT)
- A family of mathematical descriptions of what happens when a person meets a test or survey question
- Relates characteristics of items (item parameters) and characteristics of persons (person latent traits) to the probability of a correct or rating/categorical response
- Models test-taking behavior at the item level
Item response theory (IRT) is a statistical theory consisting of mathematical models that express the probability of endorsing a particular response to a test or survey item as a function of the abilities or latent traits of the persons and of certain characteristics of the item.
Item-Person Map
[Figure: persons and items (Q) located on a common continuum. Person latent trait runs from Poor to Good; item location runs from Likely ("easy") to Unlikely ("hard").]
Chang & Gehlert (2002).
Dichotomous Unidimensional IRT Models
- 1-PL (Rasch): difficulty (b)
- 2-PL: adds discrimination (a)
- 3-PL: adds guessing (c)
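The three dichotomous models can be written as one logistic function, with the simpler models as special cases. The sketch below uses the common logistic form and omits any scaling constant (an assumption; some software includes a factor of 1.7). The function name and the example item parameters are hypothetical.

```python
import math

def irt_probability(theta, b, a=1.0, c=0.0):
    """Probability of a correct/endorsed response under the 3-PL model.

    theta: person latent trait
    b: item difficulty (location)
    a: item discrimination (slope)
    c: guessing parameter (lower asymptote)
    With c=0 this reduces to the 2-PL; with c=0 and a=1 it has the 1-PL (Rasch) form.
    """
    logistic = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return c + (1.0 - c) * logistic

# Hypothetical item: difficulty 0.5, discrimination 1.2, guessing 0.2
for theta in (-2.0, 0.0, 0.5, 2.0):
    print(theta, round(irt_probability(theta, b=0.5, a=1.2, c=0.2), 3))
```

When theta equals b, the probability sits halfway between the guessing floor c and 1, which is why b is read as the item's location on the trait continuum.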
Polytomous IRT Models
- 1-PL (threshold): Partial Credit, Rating Scale
- 2-PL (threshold & discrimination): Nominal, Graded Response, Generalized Partial Credit
Example item: Vigorous activities, such as running, lifting heavy objects, participating in strenuous sports
Response categories: 1 = Yes, limited a lot; 2 = Yes, limited a little; 3 = No, not limited at all
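As one illustration of a polytomous model, the sketch below computes category probabilities for a three-category item under the graded response model, one of the 2-PL-type models listed above. The discrimination and threshold values are hypothetical and are not estimates for the example item.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities under the graded response model.

    thresholds: ordered boundary locations (one fewer than the number
    of response categories).
    """
    def boundary(b):
        # Probability of responding in a category above boundary b
        return 1.0 / (1.0 + math.exp(-a * (theta - b)))

    # Cumulative curves: 1 below the lowest category, 0 above the highest
    cumulative = [1.0] + [boundary(b) for b in thresholds] + [0.0]
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]

# Hypothetical parameters for a 3-category item
# (1 = limited a lot, 2 = limited a little, 3 = not limited at all)
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.5])
for category, p in enumerate(probs, start=1):
    print(category, round(p, 3))
```

Each category probability is the difference between two adjacent cumulative curves, so the probabilities sum to 1 at every trait level.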
Potential Advantages of Using IRT in "Geriatric" Pain Assessment
- Refine existing instruments
- Evaluate item and scale characteristics
- Evaluate different response formats
- Detect differential item functioning
- Evaluate person fit (clinical diagnosis)
- Equate/link instruments
- Establish item banks and brief forms
- Develop computerized adaptive testing
Item Banking and CAT
[Diagram: item pools (sets of questions A through F, plus new items) are calibrated with IRT into an item bank (catalogued; hierarchically structured), which in turn supports CAT and brief forms.]
Principles of Adaptive Testing
- IRT pre-calibrated item bank
- Initial item selection
- Test scoring method
- Item selection during test administration
- Stopping rules
In procedural terms, adaptive testing requires:
- A procedure for estimating a person's trait or ability level
- A procedure for choosing, from an available item bank, the item that is maximally informative at a person's current trait-level estimate
- A termination rule used to discontinue item administration
Item Bank
- Set of carefully IRT-calibrated questions
- Items cover the entire latent trait continuum
- Items represent differing amounts of the trait
- Items represent differing amounts of information
- Basis for tailored/adaptive testing
- Items can be selected to maximize precision and retain clinical relevance
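To make "differing amounts of information" concrete, the sketch below computes item information under a 2-PL model and picks the most informative item at a given trait level. The 2-PL choice, the three-item bank, and all parameter values are hypothetical, chosen only for illustration.

```python
import math

def p_2pl(theta, a, b):
    """2-PL probability of endorsing an item."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2-PL item at trait level theta."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

# Hypothetical calibrated bank: name -> (discrimination a, difficulty b)
bank = {"easy": (1.0, -1.5), "medium": (1.8, 0.0), "hard": (1.2, 1.5)}

def most_informative(theta, bank):
    """Name of the bank item with the highest information at theta."""
    return max(sorted(bank), key=lambda name: item_information(theta, *bank[name]))

for theta in (-2.0, 0.0, 2.0):
    print(theta, most_informative(theta, bank))
```

An item is most informative near its own difficulty, and more discriminating items carry more information there, which is what lets a bank be targeted to each respondent.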
Item Banking is Inter-disciplinary
- Psychometricians
- Information scientists
- Clinicians/healthcare providers
- Outcomes researchers
- Content experts
- …
Approaches to Develop Item Banks
- Top-Down Approach
- Bottom-Up Approach
Development and Maintenance of an Item Bank
- How to best calibrate existing items?
- Model selection
- Whose item parameters to use?
- Standardization?
- Generic vs. disease-specific
- Item parameter drift
- Anchor or re-calibrate?
- How to write and best test new items?
Adaptive Test
An adaptive test is a tailored, individualized measure which involves selecting a set of test items for each individual that best measures the psychological characteristics of that person (Weiss, 1985).
Weiss DJ. Adaptive testing by computer. J Consult Clin Psychol. Dec 1985;53(6):774-789.
Why Computerized Adaptive Testing?
- Adaptive testing selects questions based on previous responses
- Tailored item and test difficulties
- Eliminates floor and ceiling effects
- Requires fewer questions to arrive at an accurate estimate
- Automates question administration, data recording, scoring, and prompt reporting
- Allows for immediate feedback
Adaptive testing is a process of test administration in which items are selected on the basis of the examinee's responses to previously administered items.
CAT is a special type of computerized testing that targets the "difficulty" of questions to the "ability" of examinees.
CAT Algorithm
1. Administer an item of median difficulty (or a screening item).
2. Score the item.
3. Estimate the latent trait (theta).
4. Check whether the termination criterion is satisfied. If yes, stop. If no, choose and administer the next item with maximum information, then return to step 2.
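The loop above can be sketched in a few lines. This is a minimal illustration, not the procedure of any particular CAT system: it assumes a 2-PL model, an expected a posteriori (EAP) trait estimate on a grid with a standard normal prior, and a stopping rule based on the posterior standard deviation or a maximum test length. The bank, the simulated respondent, and all names are hypothetical.

```python
import math

GRID = [g / 10.0 for g in range(-40, 41)]  # theta grid from -4 to 4

def p_2pl(theta, a, b):
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def information(theta, a, b):
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def eap(responses, bank):
    """Posterior mean and SD of theta from (item_id, score) pairs, N(0, 1) prior."""
    weights = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)  # unnormalized standard normal prior
        for item_id, score in responses:
            p = p_2pl(t, *bank[item_id])
            w *= p if score == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)

def run_cat(bank, answer, se_target=0.4, max_items=10):
    """Administer items adaptively; returns (theta, se, administered item ids)."""
    remaining = set(bank)
    # Step 1: start with the item of median difficulty
    by_difficulty = sorted(bank, key=lambda i: bank[i][1])
    item = by_difficulty[len(by_difficulty) // 2]
    responses = []
    while True:
        remaining.discard(item)
        responses.append((item, answer(item)))        # Step 2: score the item
        theta, se = eap(responses, bank)              # Step 3: estimate theta
        # Step 4: termination criterion
        if se <= se_target or len(responses) >= max_items or not remaining:
            return theta, se, [i for i, _ in responses]
        # Otherwise pick the unused item with maximum information at theta
        item = max(sorted(remaining), key=lambda i: information(theta, *bank[i]))

# Hypothetical bank: item id -> (a, b), difficulties from -2 to 2
bank = {i: (1.5, -2.0 + 0.5 * i) for i in range(9)}

def simulated_answer(item_id):
    """Hypothetical respondent who endorses every item easier than theta = 1."""
    return 1 if bank[item_id][1] < 1.0 else 0

theta, se, used = run_cat(bank, simulated_answer)
print(round(theta, 2), round(se, 2), used)
```

The stopping rule here combines a precision target with a maximum length; other termination criteria could be substituted in the same place.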
Increase of Accuracy of Ability or Latent Trait Estimation in CAT
For each item added to the test, the width of the interval decreases.
[Figure: interval estimates of ability (θ) after Item 1, Items 1-2, Items 1-3, Items 1-4, and Items 1-5.]
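The narrowing interval follows from the way information accumulates: under the usual approximation, the standard error of the trait estimate is one over the square root of the summed item information. The sketch below assumes a 2-PL model and an approximate 95% interval; the five items and their parameters are hypothetical.

```python
import math

def information(theta, a, b):
    """Fisher information of a 2-PL item at trait level theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def interval_widths(theta, items, z=1.96):
    """Width of the approximate 95% interval after each successive item."""
    widths, total = [], 0.0
    for a, b in items:
        total += information(theta, a, b)        # information is additive
        widths.append(2.0 * z / math.sqrt(total))  # SE = 1 / sqrt(total)
    return widths

# Hypothetical items (a, b) administered in order to a person near theta = 0
items = [(1.5, 0.0), (1.5, 0.5), (1.5, -0.5), (1.5, 0.25), (1.5, -0.25)]
for n, width in enumerate(interval_widths(0.0, items), start=1):
    print("items 1-%d" % n if n > 1 else "item 1", round(width, 2))
```

Because every item contributes a positive amount of information, the width can only shrink as items are added, with the largest gains coming from items located near the person's trait level.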
Potential Problems with CAT in Pain and Health Outcomes Measurement
- Context effects
- Unbalanced content
- Time frame
- Response categories
- Multidimensionality
What kind of short form?
Example 1 (graded statements): Question 1
0 I do not feel sad.
1 I feel sad.
2 I am sad all the time and I can't snap out of it.
3 I am so sad or unhappy that I can't stand it.
Example 2 (frequency ratings): 1. I was bothered by things that usually don't bother me
Rarely or none of the time (less than 1 day)
Some or a little of the time (1-2 days)
Occasionally or a moderate amount of time (3-4 days)
All of the time (5-7 days)
Example 3 (dichotomous): Are you basically satisfied with your life? True/False
MORE Research Still Needed for Effective CAT Implementation
- Item production
- Item statistics
- Item exposure
- Maintaining a valid bank of items for test construction
- Fairness
- Delivery options
- Effects of modes of administration
- Cost-benefit considerations
Infrastructure of a National Geriatric Pain Item Bank
[Diagram: a National "Central" Item Bank with Collector, Analyzer, Builder, and Retriever components, fed through Consortium Approval, IRT Analyses, and Item Parameters. Subscribers: Public, Individual Researchers, Pharm. Industries, Non-profit Institutions, Government Agencies. Services: Customized Information Retrieval; CAT; (automated) Brief Form.]
An Integrated Solution for Pain and Outcomes Assessments
Chang, C.-H., & Yang, D. (2003, April 15). Patient-Reported Outcomes Information Technology: The PROsIT™ System. ISPOR CONNECTIONS, 9(2), 5-6.