Evaluating the Significance of Individual Change

Presentation transcript:

Evaluating the Significance of Individual Change
May 14, 2018 (3:00 – 4:00 PDT)
RCMAR Methods Seminar Series
Ron D. Hays, Ph.D.
drhays@ucla.edu
Hays, R. D., Brodsky, M., Johnston, M. F., Spritzer, K. L., & Hui, K. (2005). Evaluating the statistical significance of health-related quality of life change in individual patients. Evaluation and the Health Professions, 28, 160-171.

Physical Functioning and Emotional Well-Being at Baseline for 54 Patients at UCLA-Center for East West Medicine
[Figure: mental and physical scores at baseline.]
MS = multiple sclerosis; ESRD = end-stage renal disease; GERD = gastroesophageal reflux disease.


Significant Improvement in all but 1 of SF-36 Scales (Change is in T-score metric)
Scale    Change   t-test   prob.
PF-10    1.7      2.38     .0208
RP-4     4.1      3.81     .0004
BP-2     3.6      2.59     .0125
GH-5     2.4      2.86     .0061
EN-4     5.1      4.33     .0001
SF-2     4.7      3.51     .0009
RE-3     1.5      0.96     .3400
EWB-5    4.3      3.20     .0023
PCS      2.8      3.23     .0021
MCS      3.9      2.82     .0067

Effect Size
ES = (Follow-up – Baseline) / SDbaseline
Cohen’s Rule of Thumb:
ES = 0.20 Small
ES = 0.50 Medium
ES = 0.80 Large
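The effect size on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the scores are made up, the sample (n - 1) standard deviation is an assumption, and treating Cohen's values as lower bounds for each label is one common reading of the rule of thumb.

```python
from statistics import mean, stdev

def effect_size(baseline, followup):
    """Mean change divided by the baseline standard deviation."""
    # stdev() is the sample (n - 1) SD; that choice is an assumption here.
    return (mean(followup) - mean(baseline)) / stdev(baseline)

def cohen_label(es):
    """Label an effect size, treating Cohen's values as lower bounds (assumption)."""
    size = abs(es)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "below small"

# Hypothetical T-scores for five patients (not data from the study).
baseline = [40.0, 45.0, 50.0, 55.0, 60.0]
followup = [43.0, 47.0, 54.0, 57.0, 64.0]

es = effect_size(baseline, followup)
print(round(es, 2), cohen_label(es))
```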

Effect Sizes for Changes in SF-36 Scores
[Figure: effect sizes for the eight SF-36 scales and the two summary scores. Plotted values in ascending order: 0.11, 0.13, 0.21, 0.24, 0.30, 0.35, 0.35, 0.36, 0.41, 0.53.]
PFI = Physical Functioning; Role-P = Role-Physical; Pain = Bodily Pain; Gen H = General Health; Energy = Energy/Fatigue; Social = Social Functioning; Role-E = Role-Emotional; EWB = Emotional Well-being; PCS = Physical Component Summary; MCS = Mental Component Summary.

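Defining a Responder: Reliable Change Index (RCI)
RCI = (X2 - X1) / (SQR (2) * SEM), where SEM = SDbl * SQR (1- rxx)
Note: X1 = baseline score, X2 = follow-up score, SDbl = standard deviation at baseline, rxx = reliability, SQR = square root

Significant Change: |RCI| >= 1.96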
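A small sketch of the Reliable Change Index may help here. It assumes the usual definition, change divided by the square root of 2 times the standard error of measurement, which is consistent with the 2.77 multiplier on the following slide. The function names and the patient's scores are hypothetical.

```python
import math

def sem(sd_baseline, reliability):
    """Standard error of measurement: SDbl * sqrt(1 - rxx)."""
    return sd_baseline * math.sqrt(1.0 - reliability)

def reliable_change_index(baseline, followup, sd_baseline, reliability):
    """Change score divided by the standard error of the difference."""
    se_difference = math.sqrt(2.0) * sem(sd_baseline, reliability)
    return (followup - baseline) / se_difference

def is_significant(rci, critical=1.96):
    """Two-tailed test at the .05 level."""
    return abs(rci) >= critical

# Hypothetical patient on a T-score metric (SD = 10) with reliability 0.90.
rci = reliable_change_index(45.0, 54.0, sd_baseline=10.0, reliability=0.90)
print(round(rci, 2), is_significant(rci))
```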

Amount of Change in Observed Score Needed To Be Statistically Significant
= 2.77 * SDbl * SQR (1- rxx)
“Coefficient of reproducibility”
Note: SDbl = standard deviation at baseline, rxx = reliability, and 2.77 = 1.96 * SQR (2)

How Reliability Relates to Amount of Individual Change Needed
rxx     SQR (1- rxx)   Change Needed
0.70    0.55           1.5 SD
0.80    0.45           1.2 SD
0.90    0.32           0.9 SD
0.95    0.22           0.6 SD
0.97    0.17           0.5 SD
Change Needed = 2.77 * SQR (1- rxx) * SDbl
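The relationship between reliability and the change needed can be recomputed directly from the formula. The sketch below is illustrative only: `change_needed` is a hypothetical helper, and passing an SD of 1 expresses the threshold in baseline SD units.

```python
import math

def change_needed(sd_baseline, reliability, z=1.96):
    """Smallest observed change that is significant: z * sqrt(2) * SD * sqrt(1 - rxx)."""
    return z * math.sqrt(2.0) * sd_baseline * math.sqrt(1.0 - reliability)

# With sd_baseline = 1.0 the result is in baseline SD units.
table = []
for rxx in (0.70, 0.80, 0.90, 0.95, 0.97):
    root = math.sqrt(1.0 - rxx)
    needed = change_needed(1.0, rxx)
    table.append((rxx, round(root, 2), round(needed, 1)))
    print(f"{rxx:.2f}  {root:.2f}  {needed:.1f} SD")
```

On a T-score metric (SD = 10), `change_needed(10.0, 0.90)` gives the threshold in T-score points.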

Amount of Change Needed for Significant Individual Change
[Figure: change needed, in effect size units, for the eight SF-36 scales and the two summary scores. Plotted values: 1.33, 0.67, 0.72, 1.01, 1.13, 1.07, 0.71, 1.26, 0.62, 0.73.]
PFI = Physical Functioning; Role-P = Role-Physical; Pain = Bodily Pain; Gen H = General Health; Energy = Energy/Fatigue; Social = Social Functioning; Role-E = Role-Emotional; EWB = Emotional Well-being; PCS = Physical Component Summary; MCS = Mental Component Summary.

7-31% Improve Significantly
Scale    % Improving   % Declining   Difference
PF-10    13%           2%            +11%
RP-4     31%                         +29%
BP-2     22%           7%            +15%
GH-5                   0%            +7%
EN-4     9%
SF-2     17%           4%            +13%
RE-3     15%
EWB-5    19%
PCS      24%                         +17%
MCS      11%

Observational Study of ~1800 Chiropractic Patients in Treatment for Chronic Low Back or Neck Pain
The reliabilities of the PROMIS-29 v2.0 measures in this sample were 0.85 or higher. Baseline means indicate that the sample of patients with chronic low back or neck pain is similar to the U.S. general population on emotional distress and better on social health, but has worse physical functioning and more pain, fatigue, and sleep disturbance.

From baseline to the end of the study 3 months later, we found no change in emotional distress but small, statistically significant improvements on all the other measures. Note that improvements of about 3 points on the SF-36 summary scores over 3 months due to manipulation were found in the UK BEAM (back pain, exercise, and manipulation) randomized trial.

Between 13% and 30% of the sample got significantly better (“responders”) on the different health measures according to the coefficient of reproducibility, which requires a change of at least 2.77 times the standard error of measurement (SEM) and is equivalent to the Reliable Change Index.
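Classifying responders with the 2.77 x SEM rule could look like the sketch below. The scores are invented for illustration, the reliability of 0.85 is the lower bound reported for this sample, and the code assumes a measure on which higher scores are better. For measures such as pain or fatigue the labels would need to be flipped.

```python
import math
from collections import Counter

def classify_change(baseline, followup, sem, multiplier=2.77):
    """Label a patient's change relative to the 2.77 * SEM threshold."""
    threshold = multiplier * sem
    change = followup - baseline
    if change >= threshold:
        return "improved"   # assumes higher scores mean better health
    if change <= -threshold:
        return "declined"
    return "unchanged"

# SEM on the T-score metric (SD = 10) for a reliability of 0.85.
sem_t = 10.0 * math.sqrt(1.0 - 0.85)

# Hypothetical (baseline, follow-up) T-scores, not data from the study.
patients = [(40.0, 52.0), (50.0, 55.0), (60.0, 48.0), (45.0, 56.0), (55.0, 50.0)]

labels = [classify_change(b, f, sem_t) for b, f in patients]
counts = Counter(labels)
percent_improved = 100.0 * counts["improved"] / len(patients)
print(labels, percent_improved)
```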

Item Responses and Trait Levels
[Figure: Person 1, Person 2, and Person 3 located along the trait continuum.]

Computer Adaptive Testing (CAT)

The PROMIS Metric: T Score
Mean = 50, SD = 10
Referenced to US General Pop.
T = 50 + (z * 10)
www.healthmeasures.net
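The T-score transformation is a simple linear rescaling of a z-score. As a minimal sketch (the function names are illustrative):

```python
def t_score(z):
    """Convert a z-score to the T-score metric (mean 50, SD 10)."""
    return 50.0 + 10.0 * z

def z_score(t):
    """Convert a T-score back to a z-score."""
    return (t - 50.0) / 10.0

# One SD above the reference mean corresponds to a T-score of 60.
print(t_score(1.0), round(z_score(56.1), 2))
```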

Normal Distribution
Within 1 SD = 68.3%; 2 SDs = 95.4%; 3 SDs = 99.7%
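These coverage figures can be checked with the error function in Python's standard library. This is only a verification sketch. `coverage_within` is a hypothetical helper name.

```python
import math

def coverage_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return math.erf(k / math.sqrt(2.0))

percentages = {k: round(100.0 * coverage_within(k), 1) for k in (1, 2, 3)}
print(percentages)
```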

Reliability Target for Use of Measures with Individuals
Reliability ranges from 0-1
0.90 or above is goal
SE (SEM) = SD * (1 - reliability)^(1/2)
Reliability (T-score) = 1 - (SE/10)^2
Reliability = 0.90 when SE = 3.2
95% CI = true score +/- 1.96 x SE
T = z*10 + 50
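The conversions between standard error and reliability on the T-score metric can be written as a short sketch. The function names are illustrative, and the default SD of 10 reflects the T-score metric.

```python
import math

def se_from_reliability(reliability, sd=10.0):
    """SE = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)

def reliability_from_se(se, sd=10.0):
    """Reliability = 1 - (SE / SD)^2."""
    return 1.0 - (se / sd) ** 2

def confidence_interval(score, se, z=1.96):
    """95% interval of +/- 1.96 SE around a score."""
    return (score - z * se, score + z * se)

print(round(se_from_reliability(0.90), 1))   # SE at the 0.90 goal
print(round(reliability_from_se(3.2), 2))    # reliability when SE = 3.2
low, high = confidence_interval(50.0, 3.2)
print(round(low, 1), round(high, 1))
```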

In the past 7 days … I was grouchy [1st question] Never [39] Rarely [48] Sometimes [56] Often [64] Always [72] Estimated Anger = 56.1 SE = 5.7 (rel. = 0.68)

In the past 7 days … I felt like I was ready to explode [2nd question] Never Rarely Sometimes Often Always Estimated Anger = 51.9 SE = 4.8 (rel. = 0.77)

In the past 7 days … I felt angry [3rd question] Never Rarely Sometimes Often Always Estimated Anger = 50.5 SE = 3.9 (rel. = 0.85)

In the past 7 days … I felt angrier than I thought I should [4th question] Never Rarely Sometimes Often Always Estimated Anger = 48.8 SE = 3.6 (rel. = 0.87)

In the past 7 days … I felt annoyed [5th question] Never Rarely Sometimes Often Always Estimated Anger = 50.1 SE = 3.2 (rel. = 0.90)

In the past 7 days … I made myself angry about something just by thinking about it. [6th question] Never Rarely Sometimes Often Always Estimated Anger = 50.2 SE = 2.8 (rel. = 0.92)
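The six adaptive-testing steps above can be traced in code to show how each added item shrinks the standard error. The estimates and SEs are taken from the slides. The reliability formula is the T-score one given earlier. Checking against a 0.90 target is an assumption made for illustration, since the slides do not state the stopping rule the CAT used.

```python
# (question number, estimated anger T-score, SE) as reported on the slides.
cat_steps = [
    (1, 56.1, 5.7),
    (2, 51.9, 4.8),
    (3, 50.5, 3.9),
    (4, 48.8, 3.6),
    (5, 50.1, 3.2),
    (6, 50.2, 2.8),
]

def reliability_from_se(se, sd=10.0):
    """Reliability on the T-score metric: 1 - (SE / SD)^2."""
    return 1.0 - (se / sd) ** 2

rows = []
for question, estimate, se in cat_steps:
    rel = round(reliability_from_se(se), 2)
    ci = (round(estimate - 1.96 * se, 1), round(estimate + 1.96 * se, 1))
    rows.append((question, estimate, se, rel, ci))
    print(question, estimate, se, rel, ci)

# Hypothetical check: first question at which rounded reliability reaches 0.90.
first_reaching_goal = next(q for q, _, _, rel, _ in rows if rel >= 0.90)
print(first_reaching_goal)
```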

Thank you
Finkelman, M. D. et al. (2010). Item selection and hypothesis testing for the adaptive measurement of change. Applied Psychological Measurement, 34, 238-254.
Jabrayilov, R. et al. (2016). Comparison of classical test theory and item response theory in individual change assessment. Applied Psychological Measurement, 40, 559-572.
Wang, C., & Weiss, D. J. (2018). Multivariate hypothesis testing methods for evaluating significant individual change. Applied Psychological Measurement, 42, 221-239.
drhays@ucla.edu (310-794-2294)
Powerpoint file available for downloading at: http://gim.med.ucla.edu/FacultyPages/Hays/