Computerized Adaptive Testing in Clinical Substance Abuse Practice: Issues and Strategies. Barth Riley, Lighthouse Institute, Chestnut Health Systems
Overview: CAT Basics; CAT in Clinical Assessment; Triage of individuals to support clinical decision making; Measuring Multiple Dimensions; Identifying Persons with Atypical Presentation of Symptoms
Evidence-Based Practice: Requires accurate diagnosis, treatment placement, and outcomes monitoring, with assessment over a wide range of domains. The cost of evidence-based assessment is: time; respondent burden; increased staff resources (including training).
Improving Efficiency: The use of screeners and short-form instruments has significantly improved the efficiency of the assessment process. They can help determine whether a full assessment is warranted, but they are not a substitute for a full assessment: lack of precision; floor and ceiling effects; limited content validity.
CAT Basics
CAT Process [Figure: typical pattern of responses. Items move between decreased, middle and increased difficulty as responses are scored correct or incorrect. After each response the score is calculated and the next best item is selected based on item difficulty; the estimate is shown with a band of +/- 1 Std. Error.]
Logical Components of CAT: Start Rule; Item Selection; Measure Estimation; Stop Rule(s)
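The four components can be sketched as one loop. This is a minimal illustration, not the software used in the studies described here. It assumes dichotomous items under a Rasch model and an EAP (posterior mean) estimator with a normal prior; the names `p_endorse`, `eap`, `run_cat`, `bank` and `answer` are hypothetical.

```python
import math

def p_endorse(theta, b):
    """Rasch model: probability of endorsing an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def eap(responses, prior_sd=2.0):
    """Measure estimation: posterior mean and SD on a grid of logits.
    responses is a list of (difficulty, score) pairs, score in {0, 1}."""
    grid = [g / 10.0 for g in range(-60, 61)]
    weights = []
    for t in grid:
        w = math.exp(-0.5 * (t / prior_sd) ** 2)  # normal prior (assumption)
        for b, x in responses:
            p = p_endorse(t, b)
            w *= p if x == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(var)

def run_cat(bank, answer, start=0.0, se_stop=0.5, max_items=None):
    """bank maps item id -> difficulty; answer maps item id -> 0 or 1."""
    theta, se = start, float("inf")          # start rule
    remaining = dict(bank)
    responses, administered = [], []
    limit = max_items or len(bank)
    # stop rules: precision reached, item limit reached, or bank exhausted
    while remaining and len(administered) < limit and se > se_stop:
        # item selection: under the Rasch model the most informative item
        # is the one whose difficulty is closest to the current measure
        item = min(remaining, key=lambda i: abs(remaining[i] - theta))
        b = remaining.pop(item)
        responses.append((b, answer(item)))
        administered.append(item)
        theta, se = eap(responses)            # measure estimation
    return theta, se, administered
```

A call such as `run_cat(bank, answer, se_stop=0.6)` returns the final measure, its standard error and the ordered list of administered items. A production CAT would typically use a polytomous model and a calibrated item bank.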
The Start Rule: Used to select the first item. What measure is assigned to the respondent prior to selecting the first item? It can be an arbitrary value (0 on the logit scale) or can be based on previously gathered information.
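Item Information [Figure: information curve for an item with difficulty = 0.5. Information is at its maximum at trait level = 0.5 and falls off where the item is too difficult or too easy for the respondent.]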
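The shape of the information curve can be checked numerically. The sketch below assumes a dichotomous Rasch item, for which information is p(1 - p); the function name `item_information` is hypothetical.

```python
import math

def item_information(theta, b):
    """Rasch item information p * (1 - p) at trait level theta
    for an item with difficulty b (both in logits)."""
    p = 1.0 / (1.0 + math.exp(-(theta - b)))
    return p * (1.0 - p)

# Information for an item of difficulty 0.5 at several trait levels.
curve = {theta: item_information(theta, 0.5) for theta in (-1.5, -0.5, 0.5, 1.5, 2.5)}
```

Under this model the curve peaks at 0.25 where the trait level equals the item difficulty and is symmetric around that point.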
CAT in Clinical Assessment
Clinical Decision Making: How severe are the symptoms? What type of treatment is most appropriate? Can CAT be used to answer these questions more efficiently?
Strategy: Starting rules: use screener measures to set the initial measure and select the first item. Variable stop rules: tight precision around cut points; less precision away from cut points.
Riley, Conrad, Dennis & Bezruczko, 2007: Used CAT to place persons into low, moderate and high levels of substance abuse and dependency. The Substance Problem Scale (SPS) is a 16-item instrument measuring recency of substance use, e.g., "When was the last time you drank alcohol?"
Defining Cut Points: Cut points can be established by examining where persons with different levels of severity fall on the measurement continuum.
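Once cut points are fixed, placing a person into a group is a lookup on the measurement continuum. In the sketch below the cut values (-1.0 and 1.0 logits) and the function name `triage_group` are hypothetical placeholders, not the values used in the study.

```python
from bisect import bisect_right

def triage_group(measure, cut_points=(-1.0, 1.0),
                 labels=("low", "moderate", "high")):
    """Return the severity group for a measure in logits.
    cut_points must be sorted; a measure equal to a cut point
    falls into the higher group."""
    return labels[bisect_right(cut_points, measure)]
```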
The Start Rules: Random: randomly select an item between -0.5 and 0.5 logits of severity. Screener: select the most informative item relative to the measure on a previously administered screener (SDScr).
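The two start rules can be sketched as follows. The sketch assumes a Rasch-type bank, where the most informative item is the one whose severity is closest to the screener measure; `random_start`, `screener_start` and the example bank are hypothetical.

```python
import random

def random_start(bank, rng, low=-0.5, high=0.5):
    """Random rule: pick any item whose severity lies in [low, high] logits."""
    candidates = sorted(i for i, b in bank.items() if low <= b <= high)
    return rng.choice(candidates)

def screener_start(bank, screener_measure):
    """Screener rule: pick the item closest in severity to the measure
    obtained from a previously administered screener."""
    return min(bank, key=lambda i: abs(bank[i] - screener_measure))

# Hypothetical four-item bank: item id -> severity in logits.
example_bank = {"a": -1.2, "b": -0.3, "c": 0.4, "d": 1.5}
first_random = random_start(example_bank, random.Random(0))
first_screened = screener_start(example_bank, 1.1)
```

With a screener measure of 1.1 logits the first item is "d", whereas the random rule can only return "b" or "c".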
The Variable Stop Rule: Stop rules were set for the low, mid and high ranges of severity. The mid-range stop rule was set to SE = 0.35 for all simulations. Low- and high-range stop rule: SE = 0.5 to 0.75.
CAT Standard Error: Middle range, where decisions are made and precision is controlled. High and low ranges, where there is little impact on clinical decisions and precision is allowed to vary more.
CAT Algorithm [Flowchart: Start rule using screener -> Select item -> Administer item -> Estimate measure, SE -> High range? (yes: high range stop rule; no: Mid range? yes: mid range stop rule; no: low range stop rule) -> Stop? (yes: end test; no: select next item).]
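The variable stop rule reduces to choosing an SE threshold from the range in which the current estimate falls. In this sketch the mid-range threshold of 0.35 comes from the simulations described above, while the mid-range bounds (-1.0 to 1.0 logits), the single outer threshold of 0.5 and the name `should_stop` are assumptions for illustration.

```python
def should_stop(theta, se, mid_low=-1.0, mid_high=1.0,
                mid_se=0.35, outer_se=0.5):
    """Return True when the standard error is small enough for the
    severity range that contains the current estimate."""
    in_mid_range = mid_low <= theta <= mid_high
    threshold = mid_se if in_mid_range else outer_se
    return se <= threshold
```

With these settings an estimate of 2.0 logits with SE = 0.40 ends the test, while an estimate of 0.0 with the same SE does not.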
Results: The screener starting rule improved CAT efficiency by 7 percent. CAT reduced the number of required items by 13 to 66%. CAT to full-measure correlations ranged from .87 to .99. Agreement between CAT and the full measure in classifying persons into treatment groups (kappa coefficients) ranged from .66 to .71.
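For readers unfamiliar with kappa, the sketch below computes Cohen's kappa for two sets of group assignments, such as CAT-based and full-measure-based placements. The function name and the six example cases are hypothetical and are not data from the study.

```python
from collections import Counter

def cohens_kappa(a, b):
    """Cohen's kappa: agreement between two classifications of the
    same persons, corrected for agreement expected by chance."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (observed - expected) / (1.0 - expected)

# Made-up placements for six persons.
cat_groups = ["low", "low", "mid", "mid", "high", "high"]
full_groups = ["low", "mid", "mid", "mid", "high", "high"]
kappa = cohens_kappa(cat_groups, full_groups)  # 0.75 for these six cases
```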
Results: Variable stop rules improved efficiency by 15-38%. Efficiency depended on the definition of the mid range of severity. The screener start rule and variable stop rules resulted in accurate and efficient estimation of substance abuse severity.
Measuring Multiple Dimensions
Assessment on Multiple Dimensions: Instruments often measure multiple constructs. In CAT, treating a multidimensional item bank as unidimensional is problematic: some subdimensions may not be adequately measured, particularly if subdimensions are not highly correlated with each other.
Strategy: Content Balancing. Set an item “quota” for each subscale: the maximum number of subscale items to administer during the CAT. An item is selected if its subscale quota has not been met and it provides maximum information.
Internal Mental Distress Scale: The IMDS consists of the following subscales: Depression Symptom Scale; Anxiety/Fear Symptom Scale; Traumatic Distress Scale; Homicidal/Suicidal Scale.
Variations of Content Balancing: Screener: administers screener items first; no further content balancing. Mixed: administers screener items, then uses content balancing for remaining items. Full: uses content balancing throughout the CAT session.
Variations of Content Balancing: In mixed and full content balancing, the following target number of items is administered from the IMDS subscales: Depression: 5; Anxiety: 5; Trauma: 5; Homicidal/Suicidal: 3.
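Content-balanced item selection can be sketched as "most informative item among subscales whose quota is still open". The quotas below are the targets listed above; the Rasch information function, the function name `select_item` and the bank layout are assumptions for illustration.

```python
import math

IMDS_QUOTAS = {"Depression": 5, "Anxiety": 5, "Trauma": 5,
               "Homicidal/Suicidal": 3}

def select_item(bank, administered, theta, quotas):
    """bank maps item id -> (subscale, difficulty). Returns the id of the
    most informative unused item whose subscale quota has not been met,
    or None if no item is eligible."""
    counts = {}
    for item in administered:
        subscale = bank[item][0]
        counts[subscale] = counts.get(subscale, 0) + 1

    def information(item):
        p = 1.0 / (1.0 + math.exp(-(theta - bank[item][1])))
        return p * (1.0 - p)

    eligible = [i for i in bank
                if i not in administered
                and counts.get(bank[i][0], 0) < quotas[bank[i][0]]]
    return max(eligible, key=information) if eligible else None
```

Once a subscale reaches its quota, its remaining items are skipped even if they would be the most informative, which is what forces the CAT to sample sparsely targeted subscales such as Homicidal/Suicidal.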
Content Balancing Results. Columns: Scale, N Items, None, Screener, Mixed, Full. Depression: ≥ 1: 99.1%, 100%; ≥ %76.7% 100%. Homicidal/Suicidal: ≥ % 100%; ≥ 3: 8.2%, 7.8%, 100%. Anxiety: ≥ 1: 100%; ≥ 3: 100%. Trauma: ≥ 1: 100%; ≥ 3: 99.7%, 100%.
CAT to Full-Scale Correlations. Columns: Scale, None, Screener, Mixed, Full. Rows: IMDS, Depression, Homicidal/Suicidal, Anxiety, Trauma, Average r.
Placement into Triage Groups. Columns: Measure, None, Screener, Mixed, Full. Rows: IMDS, Depression, Homicidal/Suicidal, Anxiety, Trauma, Average Kappa.
Results: Content balancing had the greatest impact on the homicidal/suicidal scale. Mixed content balancing provided the best overall results.
Identifying Persons with Atypical Presentation of Symptoms
Implications: Clients sometimes endorse severe clinical symptoms that are not reflected by overall scores on standard assessments. Misfit in clinical assessment can reflect: difficulty understanding the assessment; cross-cultural effects; differential effects of treatment on some symptoms but not others; unusual symptom profiles.
Clinical Implications: Results reveal subgroups who endorse severe symptoms without endorsing milder symptoms: an atypical suicide profile; substance dependence symptoms without abuse symptoms; persons who commit serious crimes (murder, rape) but have not committed less serious criminal offenses.
Person Fit Statistics: Person fit statistics are the most common means of detecting atypical responders. [Figures: a typical (IRT-predicted) pattern of responding and an example of an atypical response pattern.]
Fit Statistics in CAT: They become less sensitive as the number of administered items decreases. In CAT, items are usually selected so that each possible response to the item is about equally likely. Items for which unusual responses are given may not be administered by the CAT.
Outfit by Number of Items Admin.
Items   < 0.75 Proto Typical   Typical   > 1.33 Atypical
        %                      48.1%     21.7%
        %                      51.1%     14.6%
8       38.4%                  53.2%     8.4%
4       58.2%                  40.0%     1.8%
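The outfit categories above can be illustrated with a small calculation. The sketch assumes dichotomous Rasch items, for which person outfit is the mean of squared standardized residuals; the cut values 0.75 and 1.33 are the ones shown in the table, and the function names and example patterns are hypothetical.

```python
import math

def person_outfit(theta, responses):
    """Outfit mean-square for one person.
    responses is a list of (difficulty, score) pairs, score in {0, 1}."""
    squared_residuals = []
    for b, x in responses:
        p = 1.0 / (1.0 + math.exp(-(theta - b)))
        squared_residuals.append((x - p) ** 2 / (p * (1.0 - p)))
    return sum(squared_residuals) / len(squared_residuals)

def atypicalness(outfit):
    """Map an outfit value to the categories used in the table."""
    if outfit < 0.75:
        return "proto typical"
    if outfit > 1.33:
        return "atypical"
    return "typical"

# A person at 0 logits who endorses the two milder items only, and a
# person who endorses the two most severe items only.
expected_pattern = [(-2.0, 1), (-1.0, 1), (1.0, 0), (2.0, 0)]
reversed_pattern = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]
labels = (atypicalness(person_outfit(0.0, expected_pattern)),
          atypicalness(person_outfit(0.0, reversed_pattern)))
```

Because every term is divided by p(1 - p), unexpected responses to items far from the person's measure dominate the statistic. Those are exactly the items a CAT tends not to administer, which is consistent with the loss of sensitivity at short test lengths.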
Strategies: Item selection strategies. Unidimensional approach: examine response patterns for items representing a second-order construct, such as internal mental distress; fit statistics detect all atypical symptom patterns. Multidimensional approach: compare subdimension measures; detection of a specific response pattern. Is the person's level of suicide ideation greater than their level of depression? How big a difference in measures? Combination of the above.
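One way to put a number on "how big a difference in measures" is to scale the difference by its standard error. This is only a sketch: it assumes both subscale measures are on a common logit scale with independent errors, and the function name `subscale_gap` and the example values are hypothetical.

```python
import math

def subscale_gap(hs_measure, hs_se, dep_measure, dep_se):
    """Return (difference, standardized difference) between the
    homicidal/suicidal and depression measures."""
    diff = hs_measure - dep_measure
    z = diff / math.sqrt(hs_se ** 2 + dep_se ** 2)
    return diff, z

# Hypothetical person: high suicide ideation, low depression.
diff, z = subscale_gap(1.5, 0.6, -0.5, 0.8)
```

For this example the difference is 2.0 logits and the standardized difference is about 2.0. Which threshold should flag a profile as atypical is a clinical decision that the sketch does not settle.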
Does Item Selection Matter?
Atypicalness Category   None    Screener   Mixed   Full    IMDS
Proto Typical           26.7%   34.6%      48.3%   50.5%   49.2%
Typical                 69.0%   58.7%      40.8%   38.9%   38.4%
Atypical                 4.3%    6.5%      10.9%   10.6%   12.4%
Kappa
CAT to Full-Measure Person Fit
CAT* Statistic                     Full-Measure Outfit
Outfit                             r = .73
Ei                                 r = .31
Homicidal/Suicidal – Depression    r = .08
Logistic Regression, Correct %     91.6%
* Using full content balancing
Suicide-Depression Profile
CAT* Statistic                     Full Measure H/S(a) - Depression
Outfit                             r = .11
Ei                                 r = -.54
H/S - Depression                   r = .92
Multiple Regression                R^2 = .86
* Using full content balancing
(a) Homicidal-Suicidal Scale measure
Conclusions: Fit statistics and examination of subscale scores appear to capture different response patterns. Using effective item selection methods in conjunction with multiple measures of person fit improves our ability to detect atypical symptom patterns.
Potential of CAT in Clinical Practice: Reduce respondent burden; reduce staff resources; reduce data fragmentation; streamline complex assessment procedures; assist in clinical decision making; identify persons with atypical profiles.
Contact Information A copy of this presentation will be at: For information on this method and a paper on it, please contact Barth Riley at