Measurement Challenges in Addiction Treatment Research. Michael L. Dennis, Ph.D., Chestnut Health Systems, Bloomington, IL.


Measurement Challenges in Addiction Treatment Research
Michael L. Dennis, Ph.D.
Chestnut Health Systems, Bloomington, IL
Presentation at the International Conference on Outcome Measurement, September 11, 2008, Bethesda, MD. This presentation was supported by National Institute on Drug Abuse (NIDA) grant no. R37 DA11323 and Center for Substance Abuse Treatment (CSAT), Substance Abuse and Mental Health Services Administration (SAMHSA) contract The opinions are those of the author and do not reflect official positions of the consortium or government. Available online at or by contacting Joan Unsicker at 720 West Chestnut, Bloomington, IL 61701, phone: (309) , fax: (309) ,

Objectives are to...
• Examine why more traditional clinical-trials researchers need to care about measurement
• Provide explicit, practical examples of how addressing measurement in addiction research can help improve it

Since the early 1960s, Jacob Cohen and colleagues have suggested that clinical trials research should:
• Focus on statistical power, which is
  - the probability of finding what you are looking for given that it is there
• Combine data from multiple clinical trials into meta-analyses, which can be used
  - as a more stable estimate of truth
  - to evaluate the accuracy of our early estimates and how methods can be improved
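To make the definition of statistical power concrete, here is a minimal sketch that approximates power for a two-group comparison of means. It is an illustration only: the function name `approx_power`, the normal approximation, and the two-sided alpha of .05 are assumptions, not something taken from the presentation.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means.

    d is the standardized effect size (Cohen's d). A normal approximation
    is used, and the negligible chance of rejecting in the wrong direction
    is ignored.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)          # critical value of the test
    noncentrality = d * sqrt(n_per_group / 2)  # expected value of the z statistic
    return z.cdf(noncentrality - z_crit)


if __name__ == "__main__":
    for n in (20, 64, 100):
        print(n, round(approx_power(0.5, n), 2))
```

Under these assumptions a medium effect (d = 0.5) needs roughly 64 people per group to reach about 80% power, which is why studies with small samples can easily fall below the 50% "coin flip" level.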

In a review of over 200 meta-analyses of medical, social and legal studies published between , Lipsey consistently found
• Less than a third of the individual articles coded even mentioned
  - the statistical power of their core contrast
  - the reliability, validity, or sensitivity of their outcome measure
• That, relative to the final effect size estimated from the meta-analysis, the studies averaged less than 50% power
  - in other words, it was more accurate to flip a coin than to use statistical tests the way they were being used "on average" in the published literature

Movement to Improve the Methodological Quality of Clinical Trials Research
• In 1993 a group of 30 experts (medical journal editors, clinical trialists, epidemiologists, and methodologists) met in Ottawa to try to identify methodological gaps in the literature
• In 1996 this growing group issued the Consolidated Standards of Reporting Trials (CONSORT)
• Since 2000, NIH has required a DSMB on all Phase 3 and multi-site Phase 2 studies (Notice OD-00-38), which also pushes CONSORT
• Today virtually every major medical, psychiatric, psychological, criminological, and social journal has signed onto CONSORT

Basic ways to increase power
• Increase sample size
• Increase observations
• Target a higher severity/less heterogeneous sample
• Increase implementation
• Reduce measurement error
• Reduce unexplained variance (which may be systematic)
• More accurately model error and unexplained variance in analysis
Slide annotations:
  - While the most common approach, these are also the most expensive and logistically difficult to do
  - Today's focus

Figure: Observed effect size as a function of "true" effect size (Cohen's d) and reliability of the dependent variable
• Reference line: no measurement error
• "Observed" effect size goes down with lower reliability
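The relationship in this figure can be sketched with the classical test theory attenuation formula, in which the observed standardized effect equals the true effect times the square root of the outcome's reliability. This is an assumption: the transcript does not state which formula was used to draw the figure, and the function name `observed_effect_size` is hypothetical.

```python
from math import sqrt


def observed_effect_size(true_d, reliability):
    """Attenuate a true Cohen's d for unreliability in the dependent variable.

    Assumes the classical test theory result that measurement error inflates
    the outcome's standard deviation by 1 / sqrt(reliability).
    """
    if not 0 < reliability <= 1:
        raise ValueError("reliability must be in (0, 1]")
    return true_d * sqrt(reliability)


if __name__ == "__main__":
    for rel in (1.0, 0.9, 0.7, 0.4):
        print(rel, round(observed_effect_size(0.5, rel), 3))
```

With perfect reliability the observed and true effect sizes coincide; as reliability drops, the same true effect looks progressively smaller.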

Figure: Sample size required for 80% power as a function of "true" effect size (Cohen's d) and reliability of the dependent variable
• A reliability of .7 doubles sample size requirements
• Increasing reliability from .4 to .7 cuts sample size requirements by over 50%
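A rough sketch of how reliability feeds into sample size requirements follows. It uses the usual normal-approximation formula for a two-group comparison and attenuates the effect size for unreliability. The function name `n_per_group` and the `exponent` parameter are assumptions added for illustration, because the transcript does not say how the figure's values were computed.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(true_d, reliability=1.0, alpha=0.05, power=0.80, exponent=0.5):
    """Approximate n per group for a two-sided, two-group comparison of means.

    exponent=0.5 applies the classical attenuation d * sqrt(reliability);
    exponent=1.0 treats the observed effect as d * reliability (assumption).
    """
    z = NormalDist()
    observed_d = true_d * reliability ** exponent
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return ceil(2 * z_sum ** 2 / observed_d ** 2)


if __name__ == "__main__":
    for rel in (1.0, 0.7, 0.4):
        print(rel, n_per_group(0.5, rel), n_per_group(0.5, rel, exponent=1.0))
```

Under the classical formula the required sample grows by about 1/.7, or roughly 1.4 times, at a reliability of .7. The "doubles" statement above corresponds more closely to `exponent=1.0` (1/.7² is about 2.0), so the exponent is left as a parameter rather than fixed.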

Figure: Impact of Comprehensive Data Collection Protocol Certification on Measurement Issues (Cohen's d \a)
Measures shown:
• Proportion of Inconsistencies (100%)*
• Duration (in Minutes)*
• Denial/Misrepresentation (Staff Rating)*
• Context Effect (Staff Report)
• Proportion of Missing Data (100%)
• Atypicalness (Outfit in Logits)
• Randomness (Infit in Logits)
\a Cohen's d = (Post Certification - Pre Certification) / Pooled STD
* p<.05
Source: GAIN coordinating center

Staff Experience Matters as well
• Major improvement over the first 15 interviews
• Most improvements have occurred by 60 interviews
Source: GAIN coordinating center

Key Advantages of Creating Scales and Indices for Clinical Research
• One of the lowest cost ways to reduce measurement error and increase statistical power
• Reduces clinical omissions and backtracking for validity checks
• Increases conceptual robustness and interpretability, and makes results easier to explain to others
• Facilitates profiling over a large number of items

Figure: Impact of Number of Items on Reliability (Alpha) Observed by Average Inter-item Correlation
• Generally target .7 to .9
• Behavioral measures (e.g., how many days, times) have high reliability and max out around 3-5 items
• Symptom counts related to a syndrome or latent construct usually max out in 5-13 items
• Covert scales (e.g., MMPI), summative indices, and other measures with low inter-item R may take 30 items (or more)
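The relationship between the number of items, the average inter-item correlation, and alpha can be sketched with the standardized alpha (Spearman-Brown) formula, alpha = k·r / (1 + (k - 1)·r). The function names below are hypothetical, and the figure itself may have been drawn from observed data rather than from this formula.

```python
from math import ceil


def standardized_alpha(n_items, avg_r):
    """Standardized Cronbach's alpha from item count and mean inter-item r."""
    return n_items * avg_r / (1 + (n_items - 1) * avg_r)


def items_needed(avg_r, target_alpha):
    """Smallest number of items that reaches target_alpha at a given mean r."""
    k = target_alpha * (1 - avg_r) / (avg_r * (1 - target_alpha))
    return ceil(round(k, 9))  # round first to guard against float noise


if __name__ == "__main__":
    for r in (0.5, 0.3, 0.1):
        print(r, items_needed(r, 0.8), round(standardized_alpha(10, r), 2))
```

With a mean inter-item correlation of .5, four items are enough to reach an alpha of .8, while a mean correlation of .1 needs 36 items, which matches the general pattern described above.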

Note that you can also create a summary measure across different sources of data. Source: Lennox et al. 2006 (CFI=.98)

Formal Measurement Models Can Be Used to
• Place people along a more reliable/sensitive ruler (aka common or latent factor)
• Look at the slope/discrimination of items (primarily 2 parameter IRT)
• Relate items in terms of their average severity
• Look at the match/mismatch of people and item locations (primarily Rasch / 1 parameter IRT)
• Study real differences by primary substance, gender, race, age or other groups
• Identify potential bias at the item and test level by gender, race or other groups
• Identify atypical patterns of answers (e.g., outfit)
• Identify random response patterns (e.g., infit) or less valid response patterns
• Replace missing data (whether small amounts or due to computer adaptive testing)
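As a minimal sketch of what these models compute, the block below gives the logistic item response function (Rasch when every discrimination is 1, 2 parameter IRT otherwise) and the standard infit and outfit mean squares for one person's dichotomous responses. All names are hypothetical, and person and item locations are assumed to be already estimated; this is not the GAIN scoring code.

```python
from math import exp


def prob_endorse(theta, difficulty, discrimination=1.0):
    """Probability of endorsing an item at person location theta (in logits)."""
    return 1.0 / (1.0 + exp(-discrimination * (theta - difficulty)))


def person_fit(theta, difficulties, responses):
    """Return (infit, outfit) mean squares for one person under the Rasch model.

    Outfit is the unweighted mean of squared standardized residuals, so it is
    sensitive to surprising answers on items far from the person's location.
    Infit weights each squared residual by the item's variance, so it is more
    sensitive to noise on items near the person's location.
    """
    sq_resid, variances = [], []
    for b, x in zip(difficulties, responses):
        p = prob_endorse(theta, b)
        sq_resid.append((x - p) ** 2)
        variances.append(p * (1 - p))
    infit = sum(sq_resid) / sum(variances)
    outfit = sum(r / v for r, v in zip(sq_resid, variances)) / len(sq_resid)
    return infit, outfit


if __name__ == "__main__":
    items = [-1.0, 0.0, 1.0]                    # item severities in logits
    print(person_fit(0.0, items, [1, 1, 0]))    # expected pattern
    print(person_fit(0.0, items, [0, 1, 1]))    # unexpected pattern
```

A person who endorses the severe item but not the mild one produces mean squares well above 1, which is the kind of pattern flagged as atypical or random.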

Figure: Impact of Item Discrimination (aka steepness of slope) on Sample Size Requirements
• 16-36% reduction in sample size
• IRT is generally more efficient if the items have low or varied discrimination
• Rasch is generally more efficient if the items have high discrimination

Raw v Rasch v IRT Scales (my take)
• Raw, Rasch and IRT scales are generally correlated over .95 and vary by less than 5% in sample size requirements/power
• Raw scales are the easiest to calculate (even by hand) and get most of the benefit. On the down side, items are not weighted equally, raw scales rarely help you build theory, and they require separate approaches to handle missing data
• Rasch scales focus on highly discriminating items, fitting the data to a common measurement model that is very efficient when comparing items, people and theories. On the down side, they assume your focus is on building an interval ruler, that item slopes are similar, and that you want to compare subgroups of people with each other or over time
• IRT scales focus on fitting the measurement model to the data (the opposite of Rasch), explaining additional variance by adding parameters for slope and guessing, and are particularly useful when you have preexisting items with a wide range of discrimination. On the down side, they are more difficult to calculate and require multiple iterations and larger sample sizes

Structure of GAIN's Psychopathology Measures and Validity Checks
Example of how scales can also be inter-related and used for validation
• Higher scores associated with alcohol and drug abuse medication (methadone, naltrexone, Antabuse, buprenorphine) and/or substance-induced legal, mental health, physical health, and withdrawal problems
• Higher scores associated with greater dysfunction (e.g., dropping out of school, unemployment, financial problems, homelessness)
• Higher scores associated with mental health treatment (e.g., antidepressants, selective serotonin reuptake inhibitors (SSRIs), monoamine oxidase inhibitors (MAOIs), sedatives) and/or a history of traumatic victimization, and/or high levels of stress
• Higher scores associated with mental health treatment (e.g., Ritalin, Adderall, lithium), special/alternative education, school or work problems, gambling and other evidence of impulse control problems, and/or anti-social/borderline personality disorders
• Higher scores associated with arrests, detention/jail time, probation, parole, size of drug habit

Figure: Internalizing Disorder Subscale Item Calibrations when Considering Diagnoses Separately (in logits, by increasing severity; panels: Depression, Anxiety, Trauma, Suicidal)
• Most common has narrow range of variation
• Small to major gaps in measure

Figure: Internalizing Disorder Subscale Item Calibrations Considered as a Second Order Factor (in logits, by increasing severity; subscales: Suicidal, Trauma, Anxiety, Depression, Somatic)

On-Going Debates About SUD Concept
• Formal assumption that symptoms of "physiological dependence" (either tolerance or withdrawal) are markers of high severity
• Debate about whether "abuse" symptoms should be dropped, thought of as early dependence, or thought of as moderate/high severity markers that warrant treatment even in the absence of a full syndrome
• Debate about whether to treat diagnostic orphans (1-2 symptoms of dependence) as abuse or continue to ignore them
• Concern about whether the current symptoms (which were based primarily on adult data) are appropriate for use with adolescents
• Concern about the sensitivity to change

Sample Characteristics

                       Adolescents: <18   Young Adult:   Adults: 26+
                       (n=2474)           (n=344)        (n=661)
Male                   74%                58%            47%
Caucasian              48%                54%            29%
African American       18%                27%            63%
Hispanic               12%                7%             2%
Average Age
Substance Disorder     85%                82%            90%
Internal Disorder      53%                62%            67%
External Disorder      63%                45%            37%
Crime/Violence         64%                51%            34%
Residential Tx         31%                56%            74%
Current CJ/JJ invol.   69%                74%            45%

Note: all significant, p < .01

Item Relationships Across Substances
Figure: item calibrations on the Rasch Severity Measure; average item severity (0.00)
Item calibrations: Desp. PH/MH (+0.10), Give up act. (+0.05), Can't stop (+0.05), Time Cons. (-0.21), Loss of Control (-0.10), Hazardous (-0.03), Despite Legal (+0.10), Role Failure (-0.12), Fights/troub. (0.17), Tolerance (0.00), Withdrawal (+0.34)
• Physiological Sx: while Withdrawal is high severity, Tolerance is only moderate
• Dependence Sx: other dependence symptoms are spread over the continuum
• Abuse Sx: abuse symptoms are also spread over the continuum
• 1st dimension explains 75% of variance (2nd explains 1.2%)

Symptom Severity Varied by Drug
Figure: Rasch Severity Measure for each symptom (Time Cons., Role Failure, Fights/troub., Loss of Control, Hazardous, Tolerance, Can't stop, Give up act., Desp. PH/MH, Despite Legal, Withdrawal) by drug (ALC, AMP, CAN, COC, OPI)
Average severity by drug: AVG (0.00), ALC (-0.44), AMP (+0.89), CAN (-0.67), COC (-0.22), OPI (+0.44)
• Easier to endorse hazardous use for ALC/CAN
• Easier to endorse fighting/trouble for ALC/CAN
• Easier to endorse time consuming for CAN
• Easier to endorse moderate Sx for COC/OPI
• Easier to endorse despite legal problem for ALC/CAN
• Easier to endorse Withdrawal for AMP/OPI
• Withdrawal much less likely for CAN

Symptom Severity Varied Even More By Age
Figure: Rasch Severity Measure for each symptom (Time Cons., Role Failure, Fights/troub., Loss of Control, Hazardous, Tolerance, Can't stop, Give up act., Desp. PH/MH, Despite Legal, Withdrawal) by age group
• Adults more likely to endorse most symptoms
• More likely to lead to fights among Adol/YA
• Hazardous use more likely among Adol/YA
• Continued use in spite of legal problems more likely among Adol/YA

Rasch Severity by Past Month Status
Figure: Rasch Severity Measure by group: None; Diagnostic Orphan in early remission; Diagnostic Orphan; Lifetime SUD in early remission; Lifetime SUD in CE 45+ days; Abuse Only; Dependence Only; Both Abuse and Dependence
• Diagnostic orphans (1-2 dependence symptoms) are lower, but still overlap with other clinical groups

Severity by Past Year Symptom Count
Figure: Rasch Severity Measure by past year symptom count
1. Better gradation
2. Still a lot of overlap in range

Severity by Weighted (past month=2, past year=1) Number of Substance x SUD Symptoms
Figure: Rasch Severity Measure by weighted symptom count
1. Better gradation
2. Much less overlap in range

Construct Validity (i.e., does it matter?)
Figure: comparison of three severity measures against five criteria
Criteria: Frequency of Use; Past Week Withdrawal; Emotional Problems; Recovery Environment; Social Risk
Measures compared: DSM diagnosis \a; Symptom Count Continuous \b; Weighted Symptom Rasch \c
\a Categorized as past year physiological dependence, non-physiological dependence, abuse, other
\b Raw past year symptom count (0-11)
\c Symptoms weighted by recency (2=past month, 1=2-12 months ago, 0=other)
• Past year symptom count did better than DSM
• Rasch does a little better still
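The two count-based measures in the footnotes can be sketched as follows. The data layout (a dictionary mapping each symptom to a recency label) and all names are hypothetical. The sketch covers a single substance, whereas the weighted measure in the presentation was also taken across substances, and it stops short of the Rasch calibration step.

```python
# Recency weights as given in footnote \c: 2 = past month,
# 1 = 2-12 months ago, 0 = other.
RECENCY_WEIGHTS = {"past_month": 2, "past_year": 1, "other": 0}


def raw_past_year_count(symptoms):
    """Count symptoms present at any point in the past year (footnote \\b)."""
    return sum(1 for recency in symptoms.values()
               if recency in ("past_month", "past_year"))


def recency_weighted_count(symptoms):
    """Sum of symptoms weighted by how recently they occurred (footnote \\c)."""
    return sum(RECENCY_WEIGHTS[recency] for recency in symptoms.values())


if __name__ == "__main__":
    # Hypothetical respondent; "past_year" here means 2-12 months ago.
    respondent = {
        "tolerance": "past_month",
        "withdrawal": "other",
        "hazardous_use": "past_year",
        "role_failure": "past_month",
    }
    print(raw_past_year_count(respondent))     # 3
    print(recency_weighted_count(respondent))  # 5
```

Two people with the same raw count can differ on the weighted count when one person's symptoms are more recent, which is what gives the weighted version its finer gradation.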

Implications for SUD Concept
• "Tolerance" is not a good marker of high severity; withdrawal (and substance-induced health problems) are
• "Abuse" symptoms are consistent with the overall syndrome and represent moderate severity or "other reasons to treat in the absence of the full blown syndrome"
• Diagnostic orphans are lower severity, but relevant
• Pattern of symptoms varies by substance and age, but all symptoms are relevant
• "Adolescents" experienced the same range of symptoms, though they (and young adults) were particularly more likely to be involved with the law, use in hazardous situations, and get into fights at lower severity
• Symptom counts appear to be more useful than the current DSM approach to categorizing severity
• While weighting by recency and drug delineated severity, it did not improve construct validity