Primer on Evaluating Reliability and Validity of Multi-Item Scales Questionnaire Design and Testing Workshop October 25, 2013, 3:30-5:00pm 10940 Wilshire.

Slides:



Advertisements
Similar presentations
Estimating Reliability RCMAR/EXPORT Methods Seminar Series Drew (Cobb Room 131)/ UCLA (2 nd Floor at Broxton) December 12, 2011.
Advertisements

How does PROMIS compare to other HRQOL measures? Ron D. Hays, Ph.D. UCLA, Los Angeles, CA Symposium 1077: Measurement Tools to Enhance Health- Related.
Remaining Challenges and What to Do Next: Undiscovered Areas Ron D. Hays UCLA Division of General Internal Medicine & Health Services Research
15-minute Introduction to PROMIS Ron D. Hays, Ph.D UCLA Division of General Internal Medicine & Health Services Research Roundtable Meeting on Measuring.
Why Patient-Reported Outcomes Are Important: Growing Implications and Applications for Rheumatologists Ron D. Hays, Ph.D. UCLA Department of Medicine RAND.
Methods for Assessing Safety Culture: A View from the Outside October 2, 2014 (9:30 – 10:30 EDT) Safety Culture Conference (AHRQ Watts Branch Conference.
1 Evaluating Multi-Item Scales Using Item-Scale Correlations and Confirmatory Factor Analysis Ron D. Hays, Ph.D. RCMAR.
Evaluating Multi-Item Scales Health Services Research Design (HS 225A) November 13, 2012, 2-4pm (Public Health)
Psychometric Evaluation of Survey Data Questionnaire Design and Testing Workshop October 24, 2014, 3:30-5:00pm Wilshire Blvd. Suite 710 Los Angeles,
Item Response Theory in Patient-Reported Outcomes Research Ron D. Hays, Ph.D., UCLA Society of General Internal Medicine California-Hawaii Regional Meeting.
Cross-Cultural Use of Measurements: Development of the Chinese SF-36 Health Survey Xinhua S. Ren, Ph.D. Boston University School of Public Health, Boston,
Performance of PROMIS® CATs Versus Short forms in Detecting Change Ron D. Hays October 3, 2011 International Association for Computer.
1 Health-Related Quality of Life as an Indicator of Quality of Care Ron D. Hays, Ph.D. HS216—Quality Assessment: Making the Business Case.
Use of Health-Related Quality of Life Measures to Assess Individual Patients July 24, 2014 (1:00 – 2:00 PDT) Kaiser Permanente Methods Webinar Series Ron.
Patient-Centered Outcomes of Health Care Comparative Effectiveness Research February 3, :00am – 12:00pm CHS 1 Ron D.Hays, Ph.D.
Health-Related Quality of Life as an Indicator of Quality of Care May 4, 2014 (8:30 – 11:30 PDT) HPM216: Quality Assessment/ Making the Business Case for.
SAS PROC IRT July 20, 2015 RCMAR/EXPORT Methods Seminar 3-4pm Acknowledgements: - Karen L. Spritzer - NCI (1U2-CCA )
1 Assessing the Minimally Important Difference in Health-Related Quality of Life Scores Ron D. Hays, Ph.D. UCLA Department of Medicine October 25, 2006,
Evaluating Health-Related Quality of Life Measures June 12, 2014 (1:00 – 2:00 PDT) Kaiser Methods Webinar Series Ron D.Hays, Ph.D.
Measuring Health-Related Quality of Life Ron D. Hays, Ph.D. UCLA Department of Medicine RAND Health Program UCLA Fielding School of Public Health
Development of Physical and Mental Health Summary Scores from PROMIS Global Items Ron D. Hays ( ) UCLA Department of Medicine
Measuring Health-Related Quality of Life
Overview of Health-Related Quality of Life Measures May 22, 2014 (1:00 – 2:00 PDT) Kaiser Methods Webinar Series 1 Ron D.Hays, Ph.D.
Introductory Statistics Options, Spring 2008 Stat 100: MWF, 11:00 Science Center C. Stat 100: MWF, 11:00 Science Center C. –General intro to statistical.
Measurement Issues by Ron D. Hays UCLA Division of General Internal Medicine & Health Services Research (June 30, 2008, 4:55-5:35pm) Diabetes & Obesity.
1 Health-Related Quality of Life Assessment as an Indicator of Quality of Care (HPM 216) Ron D. Hays April 11, 2013(8:30-11:30 am) Wilshire Blvd.
Impact of using a next button in a web-based health survey on time to complete and reliability of measurement Ron D. Hays (Rita Bode, Nan Rothrock, William.
1 Session 6 Minimally Important Differences Dave Cella Dennis Revicki Jeff Sloan David Feeny Ron Hays.
Presented By Ron D. Hays, Ph.D. April 8, 2010 (MNRS Pre-Conference Workshop) Domains of PROMIS and how they were developed Dynamic Tools to Measure Health.
Item Response Theory (IRT) Models for Questionnaire Evaluation: Response to Reeve Ron D. Hays October 22, 2009, ~3:45-4:05pm
Evaluating Self-Report Data Using Psychometric Methods Ron D. Hays, PhD February 6, 2008 (3:30-6:30pm) HS 249F.
Multi-item Scale Evaluation Ron D. Hays, Ph.D. UCLA Division of General Internal Medicine/Health Services Research
Multitrait Scaling and IRT: Part I Ron D. Hays, Ph.D. Questionnaire Design and Testing.
Multitrait Scaling and IRT: Part I Ron D. Hays, Ph.D. Questionnaire Design and Testing.
Psychometric Evaluation of Questionnaire Design and Testing Workshop December , 10:00-11:30 am Wilshire Suite 710 DATA.
Evaluating Self-Report Data Using Psychometric Methods Ron D. Hays, PhD February 8, 2006 (3:00-6:00pm) HS 249F.
Overlap between Subjective Well-being and Health-related Quality of Life. 3 Ron D. Hays, Ph.D. (Alina Palimaru) November 18, 2015 (11:30-12:00 noon) Geriatric.
Measurement of Outcomes Ron D. Hays Accelerating eXcellence In translational Science (AXIS) January 17, 2013 (2:00-3:00 pm) 1720 E. 120 th Street, L.A.,
Considerations in Comparing Groups of People with PROs Ron D. Hays, Ph.D. UCLA Department of Medicine May 6, 2008, 3:45-5:00pm ISPOR, Toronto, Canada.
Item Response Theory Dan Mungas, Ph.D. Department of Neurology University of California, Davis.
Patient-Reported Physical Functioning Ron D. Hays November 27, 2012 (11:15-11:30) UCLA Department of Medicine MCID for Orthopaedic Devices Silver Springs,
Multitrait Scaling and IRT: Part I Ron D. Hays, Ph.D. Questionnaire.
Impact of using a next button in a web-based health survey on time to complete and reliability of measurement Ron D. Hays (Rita Bode, Nan Rothrock, William.
Overview of Item Response Theory Ron D. Hays November 14, 2012 (8:10-8:30am) Geriatrics Society of America (GSA) Pre-Conference Workshop on Patient- Reported.
Estimating Minimally Important Differences on PROMIS Domain Scores Ron D. Hays UCLA Department of Medicine/Division of General Internal Medicine & Health.
Evaluating Multi-item Scales Ron D. Hays, Ph.D. UCLA Division of General Internal Medicine/Health Services Research HS237A 11/10/09, 2-4pm,
Evaluating Multi-Item Scales Health Services Research Design (HS 225B) January 26, 2015, 1:00-3:00pm CHS.
Health-Related Quality of Life in Outcome Studies Ron D. Hays, Ph.D UCLA Division of General Internal Medicine & Health Services Research GCRC Summer Session.
Health-Related Quality of Life (HRQOL) Assessment in Outcome Studies Ron D. Hays, Ph.D. UCLA/RAND GCRC Summer Course “The.
Reducing Burden on Patient- Reported Outcomes Using Multidimensional Computer Adaptive Testing Scott B. MorrisMichael Bass Mirinae LeeRichard E. Neapolitan.
Evaluating Patient-Reports about Health
Psychometric Evaluation of Items Ron D. Hays
UCLA Department of Medicine
Evaluating Patient-Reports about Health
UCLA Department of Medicine
Evaluating Multi-Item Scales
Ron D. Hays GIM-HSR Friday Noon Seminar Series November 4, 2016
Evaluating IRT Assumptions
Health-Related Quality of Life Assessment in Outcome Studies
May 14, :00-12:00 noon (Geffen Hall 207)
Evaluating the Psychometric Properties of Multi-Item Scales
Evaluating Multi-Item Scales
Evaluating Multi-item Scales
Multitrait Scaling and IRT: Part I
Evaluating Multi-item Scales
Estimating Minimally Important Differences (MIDs)
Evaluating Multi-Item Scales
Psychometric testing and validation (Multi-trait scaling and IRT)
Evaluating the Significance of Individual Change
UCLA Department of Medicine
Presentation transcript:

Primer on Evaluating Reliability and Validity of Multi-Item Scales Questionnaire Design and Testing Workshop October 25, 2013, 3:30-5:00pm Wilshire Blvd. Suite 710 Los Angeles, CA

Patient Reported Outcomes Measurement Information System (PROMIS®) Funded by the National Institutes of Health One domain captured is “anger” –Mood (irritability, frustration) –Negative social cognitions (interpersonal sensitivity, envy, disagreeableness) –Needing to control anger

Item Responses and Trait Levels Item 1 Item 2 Item 3 Person 1Person 2Person 3 Trait Continuum

Standard Error of Measurement (SEM) SEM = SD (1- reliability) 1/2 95% CI = true score +/ x SEM –If z-score = 0 and reliability = 0.90 –CI: -.62 to +.62 (width is 1.24 z-score units)

Reliability (0-1) or above for group comparisons or above for individual assessment z-scores (mean = 0 and SD = 1): –Reliability = 1 – SE 2 –So reliability = 0.90 when SE = 0.32 T-scores (mean = 50 and SD = 10): –Reliability = 1 – (SE/10) 2 –So reliability = 0.90 when SE = 3.2 T = 50 + (z * 10)

In the past 7 days … I was grouchy [1 st question] –Never [39] –Rarely [48] –Sometimes [56] –Often [64] –Always [72] Theta = 56.1 SE = 5.7 (rel. = 0.68)

In the past 7 days … I felt like I was ready to explode [2 nd question] –Never –Rarely –Sometimes –Often –Always Theta = 51.9 SE = 4.8 (rel. = 0.77)

In the past 7 days … I felt angry [3 rd question] –Never –Rarely –Sometimes –Often –Always Theta = 50.5 SE = 3.9 (rel. = 0.85)

In the past 7 days … I felt angrier than I thought I should [4 th question] - Never –Rarely –Sometimes –Often –Always Theta = 48.8 SE = 3.6 (rel. = 0.87)

In the past 7 days … I felt annoyed [5 th question] –Never –Rarely –Sometimes –Often –Always Theta = 50.1 SE = 3.2 (rel. = 0.90)

In the past 7 days … I made myself angry about something just by thinking about it. [6 th question] –Never –Rarely –Sometimes –Often –Always Theta = 50.2 SE = 2.8 (rel = 0.92)

Theta, SEM, and 95% CI  56 and 6 (reliability =.68) W = 22  52 and 5 (reliability =.77) W = 19  50 and 4 (reliability =.85) W = 15  49 and 4 (reliability =.87) W = 14  50 and 3 (reliability =.90) W = 12  50 and <3 (reliability =.92) W = 11

The following items are activities you might do during a typical day. Does your health limit you in these activities? 1.Vigorous activities, such as running, lifting heavy objects, participating in strenuous sports 2.Moderate activities, such as moving a table, pushing a vacuum cleaner, bowling, or playing golf 3.Lifting or carrying groceries 4.Climbing several flights of stairs 5.Climbing 1 flight of stairs 6.Bending, kneeling, or stooping 7.Walking more than a mile. 8.Walking several blocks. 9.Walking 1 block 10.Bathing or dressing yourself. No, not limited at all Yes, limited a little Yes, limited a lot

11. In the past 4 weeks, to what extent did health problems limit you in your everyday physical activities (e.g., walking and climbing stairs) –Not at all –Slightly –Moderately –Quite a bit –Extremely

12. How satisfied are you with your physical ability to do what you want to do? –Completely satisfied –Very satisfied –Somewhat satisfied –Somewhat dissatisfied –Very dissatisfied –Completely dissatisfied

13. When you travel around your community, does someone have to assist you because of your health? –Yes, all of the time –Yes, most of the time –Yes, some of the time –Yes, a little of the time –No, none of the time

14. Are you in bed or in a chair most or all of the day because of your health? –Yes, every day –Yes, most days –Yes, some days –Yes, occasionally –No, never

15. Are you able to use public transportation? –No, because of my health –No, for some other reason –Yes, able to use public transportation

Comparative Fit Index = 0.95; Root Mean Square Error of Approximation = 0.12 (Cronbach’s coefficent alpha = 0.94)

In the past 4 weeks, did health problems limit you in your everyday physical activities?

37 Item-scale correlation matrix

38 Item-scale correlation matrix

Evaluating Validity ScaleAgeObesityESRDNursing Home Resident Physical Functioning Medium (-) Small (-) Large (-) Depressive Symptoms ? Small (+) ? Cohen effect size rules of thumb (d = 0.2, 0.5, and 0.8): Small correlation = Medium correlation = Large correlation = r = d / [(d 2 + 4).5 ] = 0.8 / [( ).5 ] = 0.8 / [( ).5 ] = 0.8 / [( 4.64).5 ] = 0.8 / = Note: Often r’s of 0.10, 0.30 and 0.50 are cited as small, medium, and large.

Change on SF-36 Physical Functioning Scale by Self-reported Retrospective Rating of Change IntervalLot Better (n = 21) Little Better (n = 35) Same (n = 252) Little Worse (n = 113) Lot Worse (n = 30) 12 months4.99 a 0.32,b 0.46 b c c 6 months4.08 a b,c 0.89 b c c Note: Cell entries in the same row that share a letter do not differ significantly (p > 0.05) from one another (Duncan’s multiple range tests). SD of change was 7.74 for 12 months and 7.08 for 6 months.

Questions? Powerpoint file posted at URL below (freely available for you to use, copy or burn): Contact information: For a good time call or go to:

Appendix 1: For more information Hays, R. D., Morales, L. S., & Reise, S. P. (2000). Item Response Theory and Health Outcomes Measurement in the 21 st Century. Medical Care, 38 (Suppl.), II-28-II-42. Hays, R. D., Liu, H., Spritzer, K., & Cella, D. (2007). Item response theory analyses of physical functioning items in the Medical Outcomes Study. Medical Care, 45, S Cella, D., Riley, W., Stone, A., Rothrock, N., Reeve, B., Young, S., Amtmann, D., Bode, R., Buysse, D., Choi, S., Cook, K., DeVellis, R., DeWalt, D., Fries, J. F., Gershon, R., Hahn, E. A., Pilkonis, P., Revicki, D., Rose, M., Weinfurt, K., Lai, J., & Hays, R. D. (2010). Initial item banks and first wave testing of the Patient- Reported Outcomes Measurement Information System (PROMIS) network: Journal of Clinical Epidemiology, 63 (11),

Appendix 2: Item Response Theory (IRT) IRT models the relationship between a person’s response Y i to the question (i) and his or her level of the latent construct  being measured by positing b ik estimates how difficult it is for the item (i) to have a score of k or more and the discrimination parameter a i estimates the discriminatory power of the item.

44 Appendix 3: Intraclass Correlation and Reliability ModelIntraclass CorrelationReliability One- way Two- way fixed Two- way random BMS = Between Ratee Mean Square N = n of ratees WMS = Within Mean Square k = n of items or raters JMS = Item or Rater Mean Square EMS = Ratee x Item (Rater) Mean Square

Appendix 4: Confirmatory Factor Analysis Fit Indices Normed fit index: Non-normed fit index: Comparative fit index:  -  2 null model 2  2 null  2 model 2 - df nullmodel 2 null  df -1  - df 2 model  - 2 null df null 1 - RMSEA = SQRT (λ 2 – df)/SQRT (df (N – 1))