Ron D. Hays GIM-HSR Friday Noon Seminar Series November 4, 2016

Slides:



Advertisements
Similar presentations
A Model to Evaluate Recreational Management Measures Objective I – Stock Assessment Analysis Create a model to distribute estimated landings (A + B1 fish)
Advertisements

How does PROMIS compare to other HRQOL measures? Ron D. Hays, Ph.D. UCLA, Los Angeles, CA Symposium 1077: Measurement Tools to Enhance Health- Related.
15-minute Introduction to PROMIS Ron D. Hays, Ph.D UCLA Division of General Internal Medicine & Health Services Research Roundtable Meeting on Measuring.
Primer on Evaluating Reliability and Validity of Multi-Item Scales Questionnaire Design and Testing Workshop October 25, 2013, 3:30-5:00pm Wilshire.
1 G Lect 11W Logistic Regression Review Maximum Likelihood Estimates Probit Regression and Example Model Fit G Multiple Regression Week 11.
1 Health-Related Quality of Life as an Indicator of Quality of Care Ron D. Hays, Ph.D. HS216—Quality Assessment: Making the Business Case.
The emotional distress of children with cancer in China: An Item Response Analysis of C-Ped-PROMIS Anxiety and Depression Short Forms Yanyan Liu 1, Changrong.
Introduction Neuropsychological Symptoms Scale The Neuropsychological Symptoms Scale (NSS; Dean, 2010) was designed for use in the clinical interview to.
SAS PROC IRT July 20, 2015 RCMAR/EXPORT Methods Seminar 3-4pm Acknowledgements: - Karen L. Spritzer - NCI (1U2-CCA )
Use of CAHPS® Database by Researchers: Findings Related to Differences by Race and Ethnicity Ron D. Hays, Ph.D. RAND.
Development of Physical and Mental Health Summary Scores from PROMIS Global Items Ron D. Hays ( ) UCLA Department of Medicine
Measurement Issues by Ron D. Hays UCLA Division of General Internal Medicine & Health Services Research (June 30, 2008, 4:55-5:35pm) Diabetes & Obesity.
1 Differential Item Functioning in Mplus Summer School Week 2.
Measurement Models: Identification and Estimation James G. Anderson, Ph.D. Purdue University.
Item Response Theory (IRT) Models for Questionnaire Evaluation: Response to Reeve Ron D. Hays October 22, 2009, ~3:45-4:05pm
Multi-item Scale Evaluation Ron D. Hays, Ph.D. UCLA Division of General Internal Medicine/Health Services Research
Multitrait Scaling and IRT: Part I Ron D. Hays, Ph.D. Questionnaire Design and Testing.
Individual differences in response to intervention: An application of Integrative Data Analysis in Project KIDS Sara A. Hart Florida State University
Item Factor Analysis Item Response Theory Beaujean Chapter 6.
Overlap between Subjective Well-being and Health-related Quality of Life. 3 Ron D. Hays, Ph.D. (Alina Palimaru) November 18, 2015 (11:30-12:00 noon) Geriatric.
Item Response Theory Dan Mungas, Ph.D. Department of Neurology
Considerations in Comparing Groups of People with PROs Ron D. Hays, Ph.D. UCLA Department of Medicine May 6, 2008, 3:45-5:00pm ISPOR, Toronto, Canada.
Item Response Theory Dan Mungas, Ph.D. Department of Neurology University of California, Davis.
Overview of Item Response Theory Ron D. Hays November 14, 2012 (8:10-8:30am) Geriatrics Society of America (GSA) Pre-Conference Workshop on Patient- Reported.
Evaluating Multi-item Scales Ron D. Hays, Ph.D. UCLA Division of General Internal Medicine/Health Services Research HS237A 11/10/09, 2-4pm,
Latent Curve Modeling to Understand Achievement Emotions Gavin Brown Quant-DARE Methods Showcase Feb 22, 2016.
IRT Equating Kolen & Brennan, 2004 & 2014 EPSY
Instrument Development and Psychometric Evaluation: Scientific Standards May 2012 Dynamic Tools to Measure Health Outcomes from the Patient Perspective.
Study Population In , 1,154 Mexican origin youth (aged years) indicated which of 50 movies (randomly selected from a pool of 250 popular movies.
Stats Methods at IC Lecture 3: Regression.
Bootstrap and Model Validation
Evaluating Patient-Reports about Health
ScWk 298 Quantitative Review Session
Nonparametric Statistics
Chapter 4 Basic Estimation Techniques
Ron D. Hays September 12, 2006 Network Testing ~ 5 or 6pm
Psychometric Evaluation of Items Ron D. Hays
Further Validation of the Personal Growth Initiative Scale – II: Gender Measurement Invariance Harmon, K. A., Shigemoto, Y., Borowa, D., Robitschek, C.,
QDET2, Miami, FL, Hibiscus A
NIH: Patient-Reported Outcomes Measurement Information System (PROMIS®) Ron D. Hays Functional Vision and Visual Function November 10, 2016, 8:55-9:15am.
Vertical Scaling in Value-Added Models for Student Learning
Intelligence Andrea Mejia Spring 2017.
Running models and Communicating Statistics
A Different Way to Think About Measurement Development:
UCLA Department of Medicine
Evaluating Patient-Reports about Health
UCLA Department of Medicine
Evaluating Multi-Item Scales
Study Limitations and Future Directions See Handout for References
CJT 765: Structural Equation Modeling
Evaluating IRT Assumptions
Nonparametric Statistics
His Name Shall Be Revered …
EPSY 5245 EPSY 5245 Michael C. Rodriguez
Logistic Regression.
Sean C. Wright Illinois Institute of Technology, APTMetrics
Spanish and English Neuropsychological Assessment Scales - Guiding Principles and Evolution Friday Harbor Psychometrics Workshop 2005.
Pediatric HITSS (PedHITSS)
Evaluating the Psychometric Properties of Multi-Item Scales
Evaluating Multi-item Scales
Multitrait Scaling and IRT: Part I
Evaluating Multi-item Scales
Item Response Theory Applications in Health Ron D. Hays, Discussant
Psychometric testing and validation (Multi-trait scaling and IRT)
Context and Methodology
GIM & HSR Research Seminar: October 5, 2018
UCLA Department of Medicine
Patient-reported Outcome Measures
Detecting Treatment by Biomarker Interaction with Binary Endpoints
MATH 2311 Review for Exam 2 10 Multiple Choice (70 points): Test 2
Presentation transcript:

Ron D. Hays GIM-HSR Friday Noon Seminar Series November 4, 2016 Differential Item Functioning Between English- and Spanish-Language on the Patient-Reported Outcomes Measurement Information System (PROMIS®) Physical Functioning Items for Children and Adolescents Ron D. Hays GIM-HSR Friday Noon Seminar Series November 4, 2016

Collaborators José Luis Calderón Karen L. Spritzer Steve P. Reise Sylvia H. Paz To be presented November 10, 2016 at International Conference on Questionnaire Design, Development, Evaluation, and Testing, Miami: https://ww2.amstat.org/meetings/qdet2/

In the past 7 days … 0 = Not able to do; 1 = With a lot of trouble; 2 =With some trouble; 3 = With a little trouble; 4 = With no trouble 29 upper extremity items I could button my shirt or pants. I could open a jar by myself. 23 mobility items I could run a mile. I could ride a bike. DeWitt et al., J Clin Epid, 2011, 64 (7), 794-804 21% of those 6-17 years old in U.S. speak language other than English at home. American Community Survey (2007)

Differential Item Functioning (DIF) Controlling for underlying upper extremity (mobility), Probability of picking a particular response (e.g.,”not able to do”) Differs when item is administered in English versus Spanish.

Samples English-language Spanish-language 5,091 children and adolescents from medical clinics in North Carolina and Texas and community schools in North Carolina. Spanish-language 605 children and adolescents of adults who are members of an internet panel. Hays, R. D., Liu, H., & Kapteyn, A. (2015). Use of internet panels to conduct surveys. Behavior Research Methods, 47 (3), 685-690. Adults completed demographic questions about child and then passed computer to child.

Spanish-Language Sample Hispanic Spanish-speaking adults Members of the Greenfield/Toluna internet panel (http://us.toluna.com) Average score on 4-item Short Acculturation Scale for Hispanics (SASH) was 2.6 for children. What language(s) do you read and speak? What language(s) do you usually speak at home? What language(s) do you usually speak with your friends? In which language(s) do you usually think? 1 = Only Spanish; 2 = Spanish better than English; 3 = Both equally; 4 = English better than Spanish; 5 = Only English

Spanish Translation Iterative process of forward translations, reconciliation, back translation, multiple reviews, and pre-test with cognitive debriefing (FACIT Translation Methodology). Each translated item pre-tested and debriefed in the US with 5 Spanish-speaking subjects from the general population to try to make sure translation is well understood and conceptually equivalent to the source.

Back-translation to Source Forward 1 Forward 2 Reconciliation Back-translation Comparison of Back-translation to Source Review 1 Review 2 Review 3 Assessment of Reviewers’ Recommendations Finalization Harmonization and Quality Review Formatting and Proofreading Pilot-Testing / Cognitive Debriefing Data Analysis Translation Finalized

Sample Demographics English (n = 5,091) Spanish (n = 605) % Female 52% 45% % Hispanic 18% 96% Age 8-12 years old 53% 50% 13-17 years old 47%

Confirmatory Factor Analysis Fit Indices 2  -  2 Normed fit index: Non-normed fit index: Comparative fit index: null model  2 2 2   null null - model df df null model  2 null - 1 df null 2  - df 1 - model model  - 2 df null null RMSEA = SQRT (λ2 – df)/SQRT (df (N – 1)) 10

Confirmatory Factor Analysis One-factor model fit the data well in the Spanish (n = 605) sample for upper extremity and mobility, respectively CFIs = 0.998 and 0.996 RMSEAs=0.036 and 0.054 Range of standardized factor loadings 0.824--0.962 (29 upper extremity items) 0.815—0.967 (23 mobility items) CFI>0.95; RMSEA = 0.05 (good)

Ordinal Logistic Regression http://CRAN.R-project.org/package=lordif Model 1 : logit P(ui >= k) = αk + β1 * mobility Model 2 : logit P(ui >= k) = αk + β1 * mobility + β2 * language Model 3 : logit P(ui >= k) = αk + β1 * mobility + β2 * language + β3 * mobility * language DIF assessment (log likelihood values and McFadden’s pseudo R2 >=0.02): - Overall: Model 3 versus Model 1 Non-uniform: Model 3 versus Model 2 Uniform: Model 2 versus Model 1 Note: Purified IRT score used for mobility (conditioning variable).

Upper extremity (4/29 dif) Mobility (7/23 dif) I could hold an empty cup. (harder for Spanish) I could pull open heavy doors. I could pour a drink from a full pitcher. I could open a jar by myself. (non-uniform) Mobility I could ride a bike. I could do sports and exercise that other kids my age could do. I could run a mile. I could walk upstairs without holding on to anything. I could keep up when I played with other kids. I used a walker, cane or crutches to get around. (non-uniform) I could turn my head all the way to the side. (non-uniform)

Impact of DIF (4 items) on Test Characteristic Curves: Upper Extremity

Impact of DIF at Individual Level: Upper Extremity

CAT-based Theta Estimates Using English (x-axis) and Spanish (y-axis) Parameters for 29 Upper Extremity Items (n = 605)

Impact of DIF (7 items) on Test Characteristic Curves: Mobility

Impact of DIF at Individual Level: Mobility

CAT-based Theta Estimates Using English (x-axis) and Spanish (y-axis) Parameters for 23 Mobility Items (n = 605)

Questions?

Stocking-Lord Linking Constants Spanish calibrations transformed linearly so that their TCC most closely matches English TCC. a* = a/A and b* = A * b + B Optimal values of A (slope) and B (intercept) transformation constants found through multivariate search to minimize weighted sum of squared distances between TCCs of English and Spanish transformed parameters Stocking, M.L., & Lord, F.M. (1983). Developing a common metric in item response theory. Applied Psychological Measurement, 7, 201-210.