Computerized Adaptive Testing in Clinical Substance Abuse Practice: Issues and Strategies
Barth Riley, Lighthouse Institute, Chestnut Health Systems

Overview: CAT basics; CAT in clinical assessment; triage of individuals to support clinical decision making; measuring multiple dimensions; identifying persons with atypical presentation of symptoms.

Evidence-Based Practice: Requires accurate diagnosis, treatment placement, and outcomes monitoring, with assessment over a wide range of domains. The cost of evidence-based assessment is time, respondent burden, and increased staff resources (including training).

Improving Efficiency: The use of screeners and short-form instruments has significantly improved the efficiency of the assessment process. They can help determine whether a full assessment is warranted, but they are not a substitute for a full assessment because of lack of precision, floor and ceiling effects, and limited content validity.

CAT Basics

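CAT Process (figure): A typical pattern of responses, starting at an item of middle difficulty and moving to items of increased or decreased difficulty as each response is marked correct or incorrect. After each response the score is calculated (shown +/- 1 standard error) and the next best item is selected based on item difficulty.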

Logical Components of CAT: Start Rule; Item Selection; Measure Estimation; Stop Rule(s).
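The four components can be sketched as a single loop. This is a minimal illustration and not the presenter's implementation: it assumes a dichotomous Rasch model, an EAP estimate under a standard normal prior, and a single fixed standard-error stop rule. The names `run_cat`, `estimate`, `bank`, and `answer` are hypothetical.

```python
import math

def p_endorse(theta, b):
    """Rasch probability of endorsing an item of difficulty b (assumed model)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_information(theta, b):
    """Rasch item information at trait level theta."""
    p = p_endorse(theta, b)
    return p * (1.0 - p)

def estimate(responses):
    """Measure estimation: EAP mean and posterior SD on a grid from -4 to 4 logits.

    The standard normal prior is an assumption made for this sketch.
    """
    grid = [g / 10.0 for g in range(-40, 41)]
    weights = []
    for t in grid:
        w = math.exp(-0.5 * t * t)
        for b, x in responses:
            p = p_endorse(t, b)
            w *= p if x == 1 else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(var)

def run_cat(bank, answer, start_theta=0.0, se_target=0.5, max_items=None):
    """bank maps item id -> difficulty; answer(item) returns 0 or 1."""
    theta, se = start_theta, float("inf")      # start rule: initial measure
    remaining = dict(bank)
    responses = []
    max_items = max_items or len(bank)
    # stop rule: precision reached, item cap reached, or bank exhausted
    while remaining and len(responses) < max_items and se > se_target:
        # item selection: most informative remaining item at the current measure
        item = max(remaining, key=lambda i: item_information(theta, remaining[i]))
        b = remaining.pop(item)
        responses.append((b, answer(item)))
        theta, se = estimate(responses)        # measure estimation
    return theta, se, len(responses)

# Hypothetical 16-item bank and a respondent who endorses every item below 1 logit.
bank = {f"item{i}": -3.0 + 0.4 * i for i in range(16)}
theta, se, n_items = run_cat(bank, lambda item: 1 if bank[item] < 1.0 else 0, se_target=0.6)
print(round(theta, 2), round(se, 2), n_items)
```

A Bayesian estimate is used here only because it stays finite when all responses so far are identical; a maximum-likelihood estimate would need extra handling for that case.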

The Start Rule: Used to select the first item. What measure is assigned to the respondent prior to selecting the first item? It can be an arbitrary value (0 on the logit scale) or can be based on previously gathered information.

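Item Information (figure): Information curve for an item with difficulty = 0.5. The item is too difficult for respondents at lower trait levels and too easy for respondents at higher trait levels; information is at its maximum at trait level = 0.5.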
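The shape of this curve can be written out under the assumption of a dichotomous Rasch model (the slide does not name the model). The symbols \theta for the trait level and b_i for the item difficulty are notation introduced here.

```latex
P_i(\theta) = \frac{e^{\theta - b_i}}{1 + e^{\theta - b_i}}, \qquad
I_i(\theta) = P_i(\theta)\,\bigl(1 - P_i(\theta)\bigr)
```

Under this assumption the information is largest when \theta = b_i, where P_i = 0.5 and I_i = 0.25. For the item in the figure (b_i = 0.5), a respondent at \theta = 2.5 has P_i of about 0.88 and I_i of about 0.10, which is why items far from the respondent's level contribute little.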

CAT in Clinical Assessment

Clinical Decision Making How severe are the symptoms? What type of treatment is most appropriate? Can CAT be used to answer these questions more efficiently?

Strategy: Starting rules: use screener measures to set the initial measure and select the first item. Variable stop rules: tight precision around cut points, less precision away from cut points.

Riley, Conrad, Dennis & Bezruczko (2007) used CAT to place persons into low, moderate and high levels of substance abuse and dependency. The Substance Problem Scale (SPS) is a 16-item instrument measuring recency of substance use. Example item: When was the last time you drank alcohol?

Defining Cut Points Cut points can be established by examining where persons with different levels of severity fall onto the measurement continuum.

The Start Rules Random: Randomly select an item between -0.5 and 0.5 logits of severity. Screener: Select most informative item relative to measure on a previously administered screener (SDScr).
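The two start rules can be sketched as follows. This is an illustration under assumptions: a Rasch model, where the most informative item is the one whose difficulty is closest to the current measure, and a hypothetical `bank` mapping item ids to difficulties in logits.

```python
import random

def random_start(bank, low=-0.5, high=0.5, rng=random):
    """Random rule: pick any item whose severity lies between low and high logits."""
    candidates = [item for item, b in bank.items() if low <= b <= high]
    if not candidates:
        raise ValueError("no item falls inside the starting range")
    return rng.choice(candidates)

def screener_start(bank, screener_measure):
    """Screener rule: pick the item closest in difficulty to the screener measure.

    Under the Rasch assumption this is the most informative item.
    """
    return min(bank, key=lambda item: abs(bank[item] - screener_measure))

# Hypothetical four-item bank.
bank = {"a": -1.2, "b": -0.3, "c": 0.4, "d": 1.5}
print(random_start(bank))          # "b" or "c"
print(screener_start(bank, 1.1))   # "d"
```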

The Variable Stop Rule: Stop rules were set for the low, mid and high ranges of severity. The mid-range stop rule was set to SE=0.35 for all simulations. Low- and high-range stop rules: SE=0.5 to 0.75.
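A variable stop rule of this kind can be sketched as a lookup of the standard-error target by severity range. The mid-range boundaries (-1.0 and 1.0 logits) are placeholders chosen for illustration; the slides do not state the boundaries used in the study. The function names are hypothetical.

```python
def se_target(theta, mid_low=-1.0, mid_high=1.0,
              low_se=0.5, mid_se=0.35, high_se=0.5):
    """Return the SE a measure must reach before the CAT may stop.

    mid_low and mid_high are assumed boundaries of the mid range of severity.
    """
    if theta < mid_low:
        return low_se
    if theta > mid_high:
        return high_se
    return mid_se

def should_stop(theta, se, **ranges):
    """Stop once the current SE is at or below the target for the current range."""
    return se <= se_target(theta, **ranges)

print(should_stop(0.0, 0.40))   # False: mid range needs SE <= 0.35
print(should_stop(2.0, 0.40))   # True: high range only needs SE <= 0.5
```

Because the target is looked up from the current estimate, the rule can tighten or loosen as the estimate moves across a range boundary during the session.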

CAT Standard Error (figure): Middle range, where decisions are made and precision is controlled. High and low ranges, where there is little impact on clinical decisions and precision is allowed to vary more.

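CAT Algorithm (flowchart): Start rule using screener, then select item, administer item, and estimate the measure and SE. Stop? If yes, end the test; if no, select the next item. The stop decision depends on the current measure: the high-range stop rule applies if the measure is in the high range, the mid-range stop rule if it is in the mid range, and the low-range stop rule otherwise.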

Results: The screener starting rule improved CAT efficiency by 7 percent. CAT reduced the number of required items by 13 to 66%. CAT to full-measure correlations ranged from .87 to .99. Agreement in classification of persons into treatment groups based on CAT and full measure (kappa coefficients) ranged from .66 to .71.

Results: Variable stop rules improved efficiency by 15-38%. Efficiency depended on the definition of the mid range of severity. The screener start rule and variable stop rules resulted in accurate and efficient estimation of substance abuse severity.

Measuring Multiple Dimensions

Assessment on Multiple Dimensions: Instruments often measure multiple constructs. In CAT, treating a multidimensional item bank as unidimensional is problematic: some subdimensions may not be adequately measured, particularly if subdimensions are not highly correlated with each other.

Strategy: Content Balancing. Set an item “quota” for each subscale: the maximum number of subscale items to administer during the CAT. An item is selected if its subscale quota has not been met and it provides maximum information.
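The selection rule can be sketched as a filter followed by a maximum-information pick. This is an illustration under assumptions: Rasch item information, and a hypothetical `bank` mapping each item id to a (subscale, difficulty) pair.

```python
import math

def information(theta, b):
    """Rasch item information at trait level theta (assumed model)."""
    p = 1.0 / (1.0 + math.exp(-(theta - b)))
    return p * (1.0 - p)

def select_item(theta, bank, administered, quotas):
    """Pick the most informative unused item whose subscale quota is not yet met.

    Returns None when every quota is filled or the bank is exhausted.
    """
    counts = {}
    for item in administered:
        scale = bank[item][0]
        counts[scale] = counts.get(scale, 0) + 1
    eligible = [
        item for item, (scale, _) in bank.items()
        if item not in administered and counts.get(scale, 0) < quotas.get(scale, 0)
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda item: information(theta, bank[item][1]))

# Hypothetical three-item bank with a quota of one item per subscale.
bank = {"d1": ("depression", 0.0), "d2": ("depression", 0.1), "h1": ("homicidal_suicidal", 2.0)}
quotas = {"depression": 1, "homicidal_suicidal": 1}
print(select_item(0.0, bank, ["d1"], quotas))   # "h1": d2 is more informative but its quota is met
```

The example shows the trade-off the strategy accepts: a less informative item is administered so that a weakly represented subscale is still measured.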

Internal Mental Distress Scale: The IMDS consists of the following subscales: Depression Symptom Scale; Anxiety/Fear Symptom Scale; Traumatic Distress Scale; Homicidal/Suicidal Scale.

Variations of Content Balancing. Screener: administers screener items first; no further content balancing. Mixed: administers screener items, then uses content balancing for remaining items. Full: uses content balancing throughout the CAT session.

Variations of Content Balancing: In mixed and full content balancing, the following target number of items is administered from the IMDS subscales: Depression: 5; Anxiety: 5; Trauma: 5; Homicidal/Suicidal: 3.

Content Balancing Results
Columns: Scale, N Items, None, Screener, Mixed, Full
Depression, ≥ 1: 99.1%, 100%
Depression, ≥ 3: 79.1%, 76.7%, 100%
Homicidal/Suicidal, ≥ 1: 20.5%, 100%
Homicidal/Suicidal, ≥ 3: 8.2%, 7.8%, 100%
Anxiety, ≥ 1: 100%
Anxiety, ≥ 3: 100%
Trauma, ≥ 1: 100%
Trauma, ≥ 3: 99.7%, 100%

CAT to Full-Scale Correlations
Columns: Scale, None, Screener, Mixed, Full
IMDS: 0.982, 0.978, 0.971
Depression: 0.957, 0.937, 0.956
Homicidal/Suicidal: 0.599, 0.828, 0.964, 0.945
Anxiety: 0.962, 0.947, 0.956, 0.957
Trauma: 0.968, 0.974, 0.972, 0.969
Average r: 0.894, 0.934, 0.965, 0.960

Placement into Triage Groups
Measure              None    Screener   Mixed   Full
IMDS                 .867    .871       .863    .841
Depression           .909    .911       .753    .749
Homicidal/Suicidal   .312    .067       .917    .902
Anxiety              .803    .759       .811    .790
Trauma               .836    .850       .847    .837
Average Kappa        .745    .692       .838    .824
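Agreement figures like these can be computed from two sets of group placements. The sketch below assumes unweighted Cohen's kappa; the slides say only "kappa coefficients", so the exact variant used in the study is not confirmed here. The example placements are invented for illustration.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # observed proportion of exact agreement
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # agreement expected by chance from the two sets of marginal counts
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both lists use a single identical category
    return (observed - expected) / (1.0 - expected)

# Hypothetical triage placements from the CAT and from the full measure.
cat_groups = ["low", "low", "mid", "high"]
full_groups = ["low", "mid", "mid", "high"]
print(round(cohen_kappa(cat_groups, full_groups), 3))
```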

Results: Content balancing had the greatest impact on the homicidal/suicidal scale. Mixed content balancing provided the best overall results.

Identifying Persons with Atypical Presentation of Symptoms

Implications: Clients sometimes endorse severe clinical symptoms that are not reflected by overall scores on standard assessments. Misfit in clinical assessment can reflect difficulty understanding the assessment, cross-cultural effects, differential effects of treatment on some symptoms but not others, or unusual symptom profiles.

Clinical Implications: Results reveal subgroups who endorse severe symptoms without endorsement of milder symptoms: an atypical suicide profile; substance dependence symptoms without abuse symptoms; persons who commit serious crimes (murder, rape) who have not committed less serious criminal offenses.

Person Fit Statistics: Person fit statistics are the most common means of detecting atypical responders. Here is a typical (predicted by IRT) pattern of responding: 11111101000000000. Here is an example of an atypical response pattern: 110111110100000111.

Fit Statistics in CAT: Fit statistics become less sensitive as the number of administered items decreases. In CAT, items are usually selected so that each possible response to the item is equally likely. Items for which unusual responses are given may not be administered by the CAT.

Outfit by Number of Items
Admin. Items   < 0.75 (Proto Typical)   0.75-1.33 (Typical)   > 1.33 (Atypical)
16             30.2%                    48.1%                 21.7%
12             34.3%                    51.1%                 14.6%
8              38.4%                    53.2%                 8.4%
4              58.2%                    40.0%                 1.8%
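The outfit categories above can be illustrated with a small calculation. This sketch assumes a dichotomous Rasch model and treats the person measure as known; in practice it is estimated. Outfit is computed as the mean squared standardized residual, and the cut-offs of 0.75 and 1.33 are the ones in the table. The item difficulties and response patterns are invented for illustration.

```python
import math

def outfit(theta, difficulties, responses):
    """Outfit mean square: average squared standardized residual across items."""
    total = 0.0
    for b, x in zip(difficulties, responses):
        p = 1.0 / (1.0 + math.exp(-(theta - b)))   # Rasch expected response
        total += (x - p) ** 2 / (p * (1.0 - p))    # squared standardized residual
    return total / len(responses)

def fit_category(value):
    """Map an outfit value to the categories used in the table above."""
    if value < 0.75:
        return "proto typical"
    if value <= 1.33:
        return "typical"
    return "atypical"

difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]   # hypothetical items, mildest first
expected_pattern = [1, 1, 1, 0, 0]           # endorses mild items only
unusual_pattern = [0, 1, 1, 0, 1]            # skips the mildest, endorses the most severe

for pattern in (expected_pattern, unusual_pattern):
    value = outfit(0.0, difficulties, pattern)
    print(round(value, 2), fit_category(value))
```

The unusual pattern gets a large outfit value because its two unexpected responses fall on the items furthest from the person's level, which is where outfit is most sensitive. A CAT that never administers those distant items cannot register such responses.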

Strategies: Item selection strategies. Unidimensional approach: examine response patterns for items representing a second-order construct, such as internal mental distress; fit statistics detect all atypical symptom patterns. Multidimensional approach: compare subdimension measures to detect a specific response pattern. Is the person's level of suicide ideation greater than their level of depression? How big a difference in measures? Combination of the above.

Does Item Selection Matter?
Atypicalness Category   None    Screener   Mixed   Full    IMDS
Proto Typical           26.7%   34.6%      48.3%   50.5%   49.2%
Typical                 69.0%   58.7%      40.8%   38.9%   38.4%
Atypical                4.3%    6.5%       10.9%   10.6%   12.4%
Kappa                   .27     .32        .48     .50     --

CAT to Full-Measure Person Fit
CAT* Statistic                    Full-Measure Outfit
Outfit                            r = .73
Ei                                r = .31
Homicidal/Suicidal - Depression   r = .08
Logistic Regression               Correct % = 91.6%
* Using full content balancing

Suicide-Depression Profile
CAT* Statistic        Full Measure H/S (a) - Depression
Outfit                r = .11
Ei                    r = -.54
H/S - Depression      r = .92
Multiple Regression   R^2 = .86
* Using full content balancing
(a) Homicidal-Suicidal Scale measure

Conclusions: Fit statistics and examination of subscale scores appear to capture different response patterns. Using effective item selection methods in conjunction with multiple measures of person fit improves our ability to detect atypical symptom patterns.

Potential of CAT in Clinical Practice: Reduce respondent burden; reduce staff resources; reduce data fragmentation; streamline complex assessment procedures; assist in clinical decision making; identify persons with atypical profiles.

Contact Information: A copy of this presentation will be at www.chestnut.org/li/posters. For information on this method and a paper on it, please contact Barth Riley at bbriley@chestnut.org.
