Measurement Issues in Health Disparities Research. Anita L. Stewart, Ph.D., University of California, San Francisco. Health Disparities Research Methods.

Presentation transcript:

1 Measurement Issues in Health Disparities Research
Anita L. Stewart, Ph.D.
University of California, San Francisco
Health Disparities Research Methods
EPI 222, Spring
April 14, 2011

2 Overview of Class
• Background: culture-specific versus generic measures
• Conceptual and psychometric adequacy and equivalence
  – Adequacy in one group
  – Equivalence across groups
• Modifying measures

3 Background
• U.S. population becoming more diverse
• Minority groups are being included in research due to:
  – NIH mandate (1993 – women and minorities)
  – Health disparities initiatives

4 Types of Diverse Groups
• Health disparities research focuses on differences in health between …
  – Minority vs. non-minority
  – Lower income vs. others
  – Lower education vs. others
  – Limited English Proficiency (LEP) vs. others
  – … and many others

5 Measurement Implications of Research in Diverse Groups
• Most self-reported measures were developed and tested in mainstream, well-educated groups
• Little information is available on appropriateness, reliability, validity, and responsiveness in diverse groups
  – Although this is changing rapidly

6 Measurement Adequacy vs. Measurement Equivalence
• Adequacy - within a “diverse” group
  – concepts are appropriate and relevant
  – psychometric properties meet minimal criteria
    » Good variability
    » Reliable and valid
    » Sensitive to change over time
• Equivalence - between “diverse” groups
  – conceptual and psychometric properties are comparable

7 Why Not Use Culture-Specific Measures?
• Measurement goal is to identify measures that can be used across all groups in one study, yet maintain sensitivity to diversity and have minimal bias
• Most health disparities studies compare mean scores across diverse groups

8 Generic/Universal vs Group-Specific (Etic versus Emic)
• Concepts unlikely to be defined exactly the same way across diverse ethnic groups
• Generic/universal (etic)
  – features of a concept that are appropriate across groups
• Group-specific (emic)
  – idiosyncratic or culture-specific portions of a concept

9 Etic versus Emic (cont.)
• Goal in health disparities research with more than one group:
  – identify the generic/universal portion of a concept that is applicable across all groups
• For within-group studies:
  – the culture-specific portion is also relevant

10 Overview of Class
• Background: culture-specific versus generic measures
• Conceptual and psychometric adequacy and equivalence
  – Adequacy in one group
  – Equivalence across groups

11 Conceptual and Psychometric Adequacy and Equivalence
             | Adequacy in 1 Group | Equivalence Across Groups
Conceptual   | Concept meaningful within one group | Concept equivalent across groups
Psychometric | Psychometric properties meet minimal standards within one group | Psychometric properties invariant (equivalent) across groups

12 Left Side of Matrix: Adequacy in a Single Group
             | Adequacy in 1 Group | Equivalence Across Groups
Conceptual   | Concept meaningful within one group | Concept equivalent across groups
Psychometric | Psychometric properties meet minimal standards within one group | Psychometric properties invariant (equivalent) across groups

13 Right Side of Matrix: Equivalence in More Than One Group
             | Adequacy in 1 Group | Equivalence Across Groups
Conceptual   | Concept meaningful within one group | Concept equivalent across groups
Psychometric | Psychometric properties meet minimal standards within one group | Psychometric properties invariant (equivalent) across groups

14 Overview of Class
• Background: culture-specific versus generic measures
• Conceptual and psychometric adequacy and equivalence
  – Adequacy in one group
  – Equivalence across groups
• Modifying measures

15 Approaches to Explore Conceptual Adequacy in Diverse Groups
• Literature reviews of concepts and measures
• In-depth interviews and focus groups
  – discuss concepts, obtain their views
• Expert consultation from diverse groups
  – review concept definitions
  – rate relevance of items

16 Example: Review of Measures of Dietary Intake in Minority Populations
• Reviewed food frequency questionnaires for use in minority populations
• Performed well in some groups and poorly in others
• Group differences that could affect scores:
  – Portion sizes differ
  – Missing ethnic foods
• Could underestimate total intake and nutrients
RJ Coates et al. Am J Clin Nutr; 1997;65(suppl):1108S-15S.

17 A Structured Method for Examining Conceptual Relevance
• Compiled set of 33 typical HRQL items
• Administered to older African Americans
• After each question, asked “how relevant is this question to the way you think about your health?”
  – 0-10 scale with 0=not at all relevant, 10=extremely relevant
Cunningham WE et al., Qual Life Res, 1999;8:

18 HRQL Relevance Results
• Most relevant items:
  – Spirituality, weight-related health, hopefulness
• Least relevant items:
  – Physical functioning, role limitations due to emotional problems

19 Qualitative Research: Expert Panel Reviewed Spanish FACT-G
• Functional Assessment of Cancer Therapy – General (FACT-G)
• Bilingual/bicultural panel reviewed items for conceptual relevance to Hispanics
  – One item had low relevance ("I worry about dying")
    » Added new item "I worry my condition will get worse"
  – One domain missing – spirituality
    » Developed new spirituality scale (FACIT-Sp) with input from cancer patients, psychotherapists, and religious experts
D Cella et al. Med Care 1998;36:1407

20 Example of Inadequate Concept
• Patient satisfaction typically conceptualized in terms of, e.g.,
  – access, technical care, communication, continuity, coordination, interpersonal style
• In minority and low income groups, additional relevant domains:
  – discrimination by health professionals
  – sensitivity to language barriers
MN Fongwa et al., Ethnicity Dis, 2006;16(3):

21 Measuring Park/Recreation Environments in Low-Income Communities
• New focus on how environments promote physical activity
  – Many good new measures of environments
• Reviewed adequacy for lower-income, minority communities

22 Measuring Park/Recreation Environments in Low-Income Communities (cont)
• Recommendations: In low-income communities of color:
  – Identify and address most salient environmental needs
  – Incorporate research on preferred recreational activities
  – Ensure representation of perceptions of residents
MF Floyd et al. Am J Prev Med, 2009;36:S156-S160.

23 Psychometric Adequacy in any Group
• Minimal standards:
  – Sufficient variability
  – Minimal missing data
  – Adequate reliability/reproducibility
  – Evidence of construct validity
  – Evidence of sensitivity to change

24 Example: Adequacy of Reliability of Spanish SF-36 in Argentinean Sample
SF-36 scale                  | Coefficient alpha
Physical functioning         | .85
Role limitations - physical  | .84
Bodily pain                  | .80
General health perceptions   | .69
Vitality                     | .82
Social functioning           | .76
Role limitations - emotional | .75
Mental health                | .84
F Augustovski et al, J Clin Epid, 2008, 61:
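
For readers who want to see how a coefficient alpha like those in the table is computed, here is a minimal Python sketch. It is not taken from the slides or the cited study: the function name `cronbach_alpha` and the response matrix are hypothetical, and the data are made up purely for illustration.

```python
import numpy as np

def cronbach_alpha(items):
    """Coefficient alpha for a respondents-by-items score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                          # number of items in the scale
    item_vars = items.var(axis=0, ddof=1)       # sample variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the summed scale score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical responses: 6 respondents x 4 items (illustrative only).
scores = np.array([
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 4, 3, 4],
    [4, 5, 5, 4],
    [5, 5, 5, 5],
])
alpha = cronbach_alpha(scores)
print(round(alpha, 2))
```

In a study with diverse groups, the same function would be applied separately to each group's responses so that reliability can be judged within each group.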

25 Overview of Class
• Background: culture-specific versus generic measures
• Conceptual and psychometric adequacy and equivalence
  – Adequacy in one group
  – Equivalence across groups
• Modifying measures

26 Conceptual Equivalence Across Groups
             | Adequacy in 1 Group | Equivalence Across Groups
Conceptual   | Concept meaningful within one group | Concept equivalent across groups
Psychometric | Psychometric properties meet minimal standards within one group | Psychometric properties invariant (equivalent) across groups

27 Conceptual Equivalence
• Is the concept relevant, familiar, acceptable to all diverse groups being studied?
• Is the concept defined the same way in all groups?
  – all relevant “domains” included (none missing)
  – interpreted similarly

28 Example: Developing Concept of Interpersonal Processes of Care
Inputs to the IPC II conceptual framework:
• IPC Version I framework in Milbank Quarterly
• 19 focus groups - African American, Spanish- and English-speaking Latino, and White adults
• Literature review of quality of care in diverse groups

29 IPC-II Conceptual Framework: Reflects Concerns of All 4 Groups
I. COMMUNICATION
  – General clarity
  – Elicitation/responsiveness
  – Explanations of processes, condition, self-care, meds
II. DECISION MAKING
  – Responsive to patient preferences
  – Consider ability to comply
III. INTERPERSONAL STYLE
  – Respectfulness
  – Courteousness
  – Perceived discrimination
  – Emotional support
  – Cultural sensitivity

30 IPC-II Conceptual Framework (cont)
IV. OFFICE STAFF
  – Respectfulness
  – Discrimination
V. FOR LIMITED ENGLISH PROFICIENCY PATIENTS
  – MD’s and office staff’s sensitivity to language

31 Conceptual Equivalence: Spanish- and English-speaking Inpatients
• Administered Hospital Quality of Care Survey (H-CAHPS®), asked 2 open-ended questions to detect experiences missed by survey
  » What they liked most about care
  » What aspects of care they would change
• Analyzed responses in relation to existing survey items or new topics
MP Hurtado et al. Health Serv Res, 2005;40-6, Part II:

32 Psychometric Equivalence
             | Adequacy in 1 Group | Equivalence Across Groups
Conceptual   | Concept meaningful within one group | Concept equivalent across groups
Psychometric | Psychometric properties meet minimal standards within one group | Psychometric properties invariant (equivalent) across groups

33 Psychometric or Measurement Equivalence
• When comparing groups (as in health disparities research):
  – Measures should have similar or equivalent measurement properties in all diverse groups of interest in your study
    » e.g., English and Spanish, African Americans and Caucasians

34 Psychometric Equivalence Across Groups
• Psychometric characteristics should be “equivalent” across all groups:
  – Sufficient variability
  – Minimal missing data
  – Reliability/reproducibility
  – Construct validity
  – Sensitivity to change

35 Bias (Systematic Error) - A Special Concern
• Observed group mean differences in a measure can be due to:
  – Culturally- or group-mediated differences in true score (true differences)
  -- OR --
  – Bias - systematic differences between observed scores not attributable to true scores

36 Random versus Systematic Error
Observed item score = true score + error
Error has two components:
• Random error – relevant to reliability
• Systematic error (“bias”) – relevant to validity
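
The decomposition above can be illustrated with a small simulation. This is a sketch under stated assumptions, not an analysis from the lecture: the score distribution, the size of the random error (`random_sd`), and the constant shift (`bias`) are all hypothetical. Both simulated groups share the same true scores, so any mean difference is produced by systematic error alone.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
true_score = rng.normal(50, 10, size=n)   # identical true scores for both groups

# Hypothetical error settings: group A has random error only,
# group B has the same random error plus a constant shift (bias).
random_sd, bias = 5.0, 3.0
observed_a = true_score + rng.normal(0, random_sd, size=n)
observed_b = true_score + rng.normal(0, random_sd, size=n) + bias

# Systematic error shifts the group mean even though true scores are equal.
mean_gap = observed_b.mean() - observed_a.mean()

# Squared correlation of observed with true score: share of observed
# variance that is true-score variance (a reliability-type index).
reliability_a = np.corrcoef(true_score, observed_a)[0, 1] ** 2
reliability_b = np.corrcoef(true_score, observed_b)[0, 1] ** 2

print(round(mean_gap, 2), round(reliability_a, 2), round(reliability_b, 2))
```

In this setup the random error lowers the reliability index equally in both groups, while the constant shift leaves reliability untouched and changes only the mean, which is why bias is a validity problem rather than a reliability problem.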

37 Bias (Systematic Error)
• Systematic measurement error may make group comparisons invalid
• Systematic differences in scores can be due to group differences in:
  – the meaning of concepts or items
  – the extent to which measures represent a concept
  – cognitive processes of responding
  – use of response scales

38 Bias or “Systematic Difference”?
• Bias = “deviation from true score”
• Cannot speak of a “bias” in one group compared to another w/o knowing true score
• Preferred term: differential “item” functioning (DIF)
  – Item (or measure) that has a different meaning in one group than another

39 Item Equivalence
• No Differential Item Functioning (DIF)
  – Items are similarly related to the underlying trait
• Meaning of response categories is similar across groups
• Distance between response categories is similar across groups

40 Methods for Identifying Differential Item Functioning (DIF)
• Item Response Theory (IRT)
• Examines each item in relation to underlying latent trait
• Tests if responses to one item predict the underlying latent “score” similarly in two groups
  – if not, items have “differential item functioning”

41 Example of Effect of DIF
• 5 CES-D items administered to Black and White men
  – 1 item subject to differential item functioning (bias)
• 5-item scale including item suggested that Black men had more somatic symptoms than White men (p <.01)
• 4-item scale excluding biased item showed no differences
S Gregorich, Med Care, 2006;44:S78-S94.
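
The pattern in this example (a group difference that appears with the biased item and disappears without it) can be reproduced with simulated data. The sketch below does not use the CES-D data or the method of the cited paper; the continuous item scores, the sample size, and the 0.8-point shift on item 5 are hypothetical choices made only to show the mechanism.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Assumption: both groups have the same latent trait distribution.
theta_a = rng.normal(0, 1, n)
theta_b = rng.normal(0, 1, n)

def item_scores(theta, shift):
    """Five continuous item scores = latent trait + noise + per-item shift."""
    noise = rng.normal(0, 1, size=(theta.size, 5))
    return theta[:, None] + noise + shift

no_dif = np.zeros(5)
dif = np.array([0.0, 0.0, 0.0, 0.0, 0.8])   # item 5 runs higher in group B at equal trait level

items_a = item_scores(theta_a, no_dif)
items_b = item_scores(theta_b, dif)

# Group difference in mean scale score, with and without the DIF item.
gap_5 = items_b.mean(axis=1).mean() - items_a.mean(axis=1).mean()
gap_4 = items_b[:, :4].mean(axis=1).mean() - items_a[:, :4].mean(axis=1).mean()

print(round(gap_5, 2), round(gap_4, 2))
```

Because the latent trait is identical in both groups by construction, the gap on the 5-item scale comes entirely from the one shifted item.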

42 Equivalence of Reliability?? No!
• Difficult to compare reliability because it depends on the distribution of the construct in a sample
  – Thus lower reliability in one group may simply reflect poorer variability
• More important is the adequacy of the reliability in both groups
  – Reliability meets minimal criteria within each group
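
One way to see why reliability depends on the sample's variability is the classical test theory definition of reliability. The symbols below are introduced here for illustration and do not appear in the slides: X is the observed score, T the true score, and E the random error.

```latex
\text{reliability} \;=\; \frac{\sigma^2_{T}}{\sigma^2_{X}}
\;=\; \frac{\sigma^2_{T}}{\sigma^2_{T} + \sigma^2_{E}}
```

With the same error variance, say 25, a group whose true-score variance is 100 has reliability 100/125 = .80, while a more homogeneous group with true-score variance 25 has reliability 25/50 = .50, even though the measure works identically in both.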

43 Equivalence of Criterion Validity
• Determine if hypothesized patterns of associations with specified criteria are confirmed in both groups, e.g.
  – a measure predicts utilization in both groups
  – a cutpoint on a screening measure has the same specificity and sensitivity in identifying a condition in both groups
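
The cutpoint check in the last bullet can be sketched as follows. The helper `sens_spec`, the cutpoint of 10, and the screening scores are hypothetical and chosen only for illustration; a real comparison would use each group's actual scores and a criterion diagnosis.

```python
import numpy as np

def sens_spec(scores, has_condition, cutpoint):
    """Sensitivity and specificity of `score >= cutpoint` as a screen."""
    scores = np.asarray(scores)
    has_condition = np.asarray(has_condition, dtype=bool)
    positive = scores >= cutpoint
    sensitivity = (positive & has_condition).sum() / has_condition.sum()
    specificity = (~positive & ~has_condition).sum() / (~has_condition).sum()
    return sensitivity, specificity

# Hypothetical screening scores and true condition status in two groups.
group_a = sens_spec([3, 8, 12, 15, 5, 11], [0, 0, 1, 1, 0, 1], cutpoint=10)
group_b = sens_spec([4, 11, 9, 14, 6, 12], [0, 0, 1, 1, 0, 1], cutpoint=10)

print(group_a, group_b)
```

If the two pairs of values differ noticeably, the same cutpoint is not performing equivalently in the two groups.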

44 Equivalence of Construct Validity
• Are hypothesized patterns of associations confirmed in both groups?
  – Example: Scores on the Spanish version of the FACT-G had similar relationships with other health measures as scores on the English version
• Primarily tested through subjectively examining pattern of correlations
• Can also test using confirmatory factor analysis (CFA)

45 Equivalence of Construct Validity of Spanish SF-36 in Argentinean Sample
• Compared Spanish SF-36 construct validity test results to U.S. English SF-36 results
• Tested several previously tested hypotheses (which were confirmed):
  – PCS decreases with age and # of diseases
  – Relationship of PCS and MCS with utilization
  – Known groups validity (scores lower for those with various diseases)
F Augustovski et al, J Clin Epid, 2008, 61:

46 Equivalence of Factor Structure
• Factor structure similar in new group to structure in original study
  – measurement model is the same across groups
• Methods
  – Specify number of factors
  – Determine if hypothesized model fits the data

47 Factor Structure of CES-D
• Original study found 4 factors
  – Somatic symptoms
  – Depressive affect
  – Interpersonal behavior
  – Positive affect
• In a new population group: do you find 4 factors?
LS Radloff, Applied Psychol Measurement, 1977;1:

48 How Evidence for Equivalence of Factor Structure is Obtained
• Subjectively
  – visually compare factor loadings across group-specific exploratory factor analysis
• Empirically
  – confirmatory factor analysis of data that includes multiple groups
  – studies of psychometric invariance
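
As a small step beyond eyeballing two columns of loadings, one commonly used descriptive index is Tucker's congruence coefficient. It is not mentioned in the slides and is not a substitute for multi-group confirmatory factor analysis; it is shown here only as a sketch, and the loadings are hypothetical.

```python
import numpy as np

def congruence(loadings_a, loadings_b):
    """Tucker's congruence coefficient between two loading vectors."""
    a = np.asarray(loadings_a, dtype=float)
    b = np.asarray(loadings_b, dtype=float)
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

# Hypothetical loadings of the same four items on one factor,
# taken from separate exploratory factor analyses in two groups.
group_1 = [0.72, 0.65, 0.70, 0.58]
group_2 = [0.70, 0.61, 0.74, 0.55]

phi = congruence(group_1, group_2)
print(round(phi, 3))
```

Values near 1 indicate that the two groups' loading patterns are nearly proportional; the index is descriptive and does not test whether the loadings are statistically equal.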

49 Empirical Examination of Equivalence of Factor Structure
• Psychometric invariance (equivalence)
• Important properties of theoretically-based factor structure (measurement model) do not vary across groups (are invariant)
  – measurement model is the same across groups
• Empirical comparison across groups using confirmatory factor analysis
  – Not simply by examination

50 Hierarchical Tests of Psychometric Equivalence
Across all groups – a sequential process:
• Same number of factors or dimensions
• Same items on same factors
• Same factor loadings
• No bias on any item across groups
• Same residuals on items
• No item or scale bias AND same residuals

51 Criteria for Evaluating Invariance Across Groups: Technical Terms
• Dimensional Invariance: Same number of factors
• Configural Invariance: Same items load on same factors
• Metric or Factor Pattern Invariance: Items have same loadings on same factors
• Scalar or Strong Factorial Invariance: Observed scores are unbiased
• Residual Invariance: Observed item and factor variances are unbiased
• Strict Factorial Invariance: Both scalar and residual criteria are met
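
The sequential logic of these criteria can be sketched as a small helper. It assumes the individual invariance tests have already been run elsewhere (for example with multi-group confirmatory factor analysis) and only encodes the ordering: each level is considered only if every earlier level held. The function name and the dictionary format are hypothetical.

```python
# Levels tested in order; each one assumes the previous levels were met.
SEQUENCE = ["dimensional", "configural", "metric", "scalar"]

def highest_invariance(passed):
    """Return the highest invariance level supported by the test results.

    `passed` maps a level name to True/False; missing levels count as not met.
    """
    reached = "none"
    for level in SEQUENCE:
        if not passed.get(level, False):
            return reached
        reached = level
    # Strict invariance: scalar and residual criteria are both met.
    return "strict" if passed.get("residual", False) else reached

# Hypothetical results: loadings are equal, but the scalar test fails.
results = {"dimensional": True, "configural": True, "metric": True, "scalar": False}
print(highest_invariance(results))
```

Stopping at the first failed level mirrors the hierarchy: a later criterion is not interpreted when an earlier one has not been met.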

52 Factor Structure of CES-D
• Original study found 4 factors
  – Somatic symptoms
  – Depressive affect
  – Interpersonal behavior
  – Positive affect
• In a new population group: do you find 4 factors?
LS Radloff, Applied Psychol Measurement, 1977;1:

53 Test for Evidence of Dimensional Invariance
• Two studies of Latinos:
  – 2 factors in both studies
    » Depression and well-being
• American Indian adolescents
  – 3 factors
    » Depressed affect
    » Somatic symptoms and reduced activity
    » Positive affect
TQ Miller et al., J Gerontol: Soc Sci 1997;520:S259
SM Manson et al., Psychol Assessment 1990;2:

54 Configural Invariance
• Dimensional Invariance: Same number of factors
• Configural Invariance: Same items load on same factors
• Metric or Factor Pattern Invariance: Items have same loadings on same factors
• Strong Factorial or Scalar Invariance: Observed scores are unbiased
• Residual Invariance: Observed item and factor variances can be compared across groups
• Strict Factorial Invariance: Both scalar invariance and residual invariance criteria are met

55 Configural Invariance
• Assumes: dimensional invariance is found (same number of factors)
• Definition: Item-factor patterns are the same, same items load on same factors in both groups
• CES-D example
  – 4 factors found in Anglos, Blacks, and Chicanos
  – Same items loaded on each factor in all groups
RE Roberts et al., Psychiatry Research, 1980;2:

56 Metric Invariance
• Dimensional Invariance: Same number of factors
• Configural Invariance: Same items load on same factors
• Metric or Factor Pattern Invariance: Items have same loadings on same factors
• Strong Factorial or Scalar Invariance: Observed scores are unbiased
• Residual Invariance: Observed item and factor variances can be compared across groups
• Strict Factorial Invariance: Both scalar invariance and residual invariance criteria are met

57 Metric Invariance or Factor Pattern Invariance
• Assumes: dimensional and configural invariance are found
• Definition: Item loadings are the same across groups
  – i.e., the correlation of each item with its factor is the same in all groups

58 Metric Invariance Example from Interpersonal Processes of Care
• Out of 91 items – factor structure of 29 items met criteria of invariance across 4 groups
  – Spanish-speaking Latinos, English-speaking Latinos, African Americans, Whites
• Dimensional
  – Similar factor structure across all 4 groups
• Configural
  – Same items loaded on each factor in all 4 groups
• Metric
  – Same item loadings in all 4 groups
Stewart et al., Health Services Research, 2007; 42 (3, Part I):

59 Seven “Metric Invariant” Scales: Same Item Loadings Across Groups
I. COMMUNICATION
  – Hurried communication
  – Elicited concerns, responded
  – Explained results, medications
II. DECISION MAKING
  – Patient-centered decision-making
III. INTERPERSONAL STYLE
  – Compassionate, respectful
  – Discriminated
  – Disrespectful office staff

60 Strong Factorial Invariance
• Dimensional Invariance: Same number of factors
• Configural Invariance: Same items load on same factors
• Metric or Factor Pattern Invariance: Items have same loadings on same factors
• Strong Factorial or Scalar Invariance: Observed scores are unbiased
• Residual Invariance: Observed item and factor variances can be compared across groups
• Strict Factorial Invariance: Both scalar invariance and residual invariance criteria are met

61 Strong Factorial Invariance or Scalar Invariance
• Assumes: dimensional, configural, and metric invariance are found
• Definition: Observed scores are unbiased, i.e., means can be compared across groups
• Requires test of equivalence of mean scores across groups using confirmatory factor analysis

62 Seven “Scalar Invariant” (Unbiased) IPC Scales (18 items)
I. COMMUNICATION
  – Hurried communication – lack of clarity
  – Elicited concerns, responded
  – Explained results, medications – explained results
II. DECISION MAKING
  – Patient-centered decision-making – decided together
III. INTERPERSONAL STYLE
  – Compassionate, respectful – (subset) compassionate, respectful
  – Discriminated – discriminated due to race/ethnicity
  – Disrespectful office staff

63 Equivalence of Spanish and English Hospital Quality of Care Survey (H-CAHPS®)
• Tested 7 subscales (e.g., nurse communication, pain control, discharge information)
• Compared Spanish and English groups:
  – Item-scale correlations, internal consistency reliability, factor structure, and construct validity
• Concluded these were equivalent
MP Hurtado et al. Health Serv Res, 2005;40-6, Part II:

64 Overview of Class
• Background: culture-specific versus generic measures
• Conceptual and psychometric adequacy and equivalence
  – Adequacy in one group
  – Equivalence across groups
• Modifying measures

65 What if Measures Need Modifying or Adapting?
• Why would we modify a measure?
• What information is used to modify?
• What are the types of modifications?
• How should we test modified measures?

66 When Problems are Found Through Pretesting… Investigators Face a Choice
Use the existing measure “as is” to preserve integrity of measure
OR
Try to modify the measure to address problems in diverse group

67 Argument in Favor of Using Measure “As Is”
• Modifications can change the measure’s validity and reliability
• Allows comparison of findings to other research using the measure

68 Argument Against Using Measure “As Is” … when problems are found
• If reliability and validity are poor…
• Results pertaining to the measure could be erroneous
  – Limited internal validity

69 Reasons for Considering Modifying an Existing Measure
• In health disparities research
  – Sample/population differs from that in which original measure was developed
• More broadly
  – Measure developed a while ago
  – Poor format/presentation
  – Study context issues

70 Key Reason: Population Group Differences from Original
• Research in diverse population groups
  – Different culture, race/ethnic group
  – Lower level of socioeconomic status (SES)
  – Limited English proficiency, lower literacy
• Mainstream research
  – Different disease, health problem, patient group, age group

71 Why Might a Measure Not be Suitable for New Population Group?
• Concept or dimension is missing
• Meaning of concepts differs from mainstream
• New group may not interpret items as intended
• Process of answering questions may differ

72 Poor Format/Presentation = High Respondent Burden
• Instructions unnecessarily wordy, unclear
• Way of responding is complicated
• Difficult to navigate the questionnaire
  – Crowded on the page
  – Hard to track across the page
• Hard to read
  – Poor contrast, small font

73 Example: Complex Instructions
Instructions: There are 12 statements on this form. They are statements about families. You are to decide which of these statements are true of your family and which are false. If you think the statement is TRUE or MOSTLY TRUE of your family, please mark the box in the T (TRUE) column. If you think the statement is FALSE or MOSTLY FALSE of your family, please mark the box in the F (FALSE) column. You may feel that some of the statements are true for some family members and false for others. Mark the box in the T column if the statement is TRUE for most members. Mark the box in the F column if the statement is FALSE for most members. If the members are evenly divided, decide what is the stronger overall impression and answer accordingly. Remember, we would like to know what your family seems like to you. So do not try to figure out how other members see your family, but do give us your general impression of your family for each statement. Do not skip any item. Please begin with the first item.

74 Example: Burdensome Way of Responding
For each question, choose from the following alternatives:
0 = Never
1 = Almost Never
2 = Sometimes
3 = Fairly Often
4 = Very Often
1. In the last month, how often have you felt nervous and “stressed”?
…
In the last month, how often have you felt that things were going your way?
S Cohen et al. J Health Soc Beh, 1983;24(4):

75 What Information is Used to Decide How to Modify a Measure?
• Same data identifying conceptual differences in diverse population…
  – often includes information for making revisions

76 Published Review - Physical Activity Measures for Minority Women
• WHI convened experts to identify issues in measuring PA in minority and older women
• Some conclusions:
  – Assess culturally sensitive activities (e.g., walking for transportation and errands)
  – Measure intermittent activities
  – Phrases “leisure time, free time, spare time” (used to denote non-occupational activities) not understood
• Review can help select appropriate measures and adapt as needed
LC Masse et al., J Women’s Health, 1998;7:57-67.

77 Types of Modifications
• Format or presentation
• Content
  – Dimensions
  – Item stems
  – Response options

78 Format/Presentation Modifications
• Goal: reduce respondent burden
• Improve appearance or way of responding
  – Simplify instructions
  – Modify format for responding
  – Create more space, reduce crowded items
  – Improve contrast, increase font size

79 Types of Modifications
• Format or presentation
• Content
  – Dimensions
  – Item stems
  – Response options
Actions: Add, Drop, Replace, Modify

80 Content Modification Example: Add Dimension
• Study of older Korean/Chinese immigrants
• Added language support to existing social support measure
• Based on focus group data:
  – Help with translation at medical appointments
  – Help to ask questions in English when on the phone
  – Help to learn English
S Wong et al. Int J Health Human Dev, 2005;61:

81 Content Modification Example: Add Dimension (cont)
• New items were embedded in existing social support measure using same format

82 Minor to Major Modifications?
• Each type of modification can hypothetically be rated on a continuum from having minor to major impact on reliability and validity of original measure
  – Minor – slight changes in format/presentation
  – …
  – Major – numerous changes in dimensions, items, and response choices

83 Need to Test Psychometric Properties of Modified Measures
• All modifications, no matter how small, can affect reliability and validity of original measure
• Burden is on investigator to test modified measure

84 Recommendations for Testing Modified Measures
• Pretest modified measure extensively before fielding in new study
• Build in ability to do psychometric testing when measure is fielded
  – Add validity variables (e.g., similar to original measure to test comparability)
  – Add follow-up to assess test-retest reliability

85 Analyze Psychometric Adequacy of Modified Measure in New Study
• Modified measure should meet minimal criteria
  – Item-scale correlations
  – Internal-consistency reliability
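
A minimal sketch of the first check, item-scale correlations, is shown below. It uses the corrected form, in which each item is correlated with the sum of the remaining items so that the item does not inflate its own correlation; the slides do not specify which form to use, so treat this as one reasonable choice. The function name and the responses are hypothetical.

```python
import numpy as np

def corrected_item_scale_correlations(items):
    """Correlate each item with the sum of the other items in the scale."""
    items = np.asarray(items, dtype=float)
    total = items.sum(axis=1)
    return np.array([
        np.corrcoef(items[:, j], total - items[:, j])[0, 1]
        for j in range(items.shape[1])
    ])

# Hypothetical responses to a modified 4-item scale (6 respondents).
responses = np.array([
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 4, 3, 4],
    [4, 5, 5, 4],
    [5, 5, 5, 5],
])
print(np.round(corrected_item_scale_correlations(responses), 2))
```

An item whose correlation with the rest of the scale is much lower than the others is a candidate for closer review in the modified measure.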

86 Analyzing Modified Measure: Comparability to Original Measure
• Compare measurement results of modified measure to original measure
  – Reliability (sample dependent)
  – Factor structure
  – Construct validity
  – Sensitivity to change

87 Overall Conclusions
• Measurement in health disparities research is a relatively new field
• We encourage reporting on adequacy and equivalence of measures tested in any diverse population
• As evidence grows, it becomes easier to find measures that work better across diverse groups

88 Resource: Reviews of Measures for Diverse Populations
• Multicultural measurement in older populations, JH Skinner et al (eds), Springer Publishing Co: NY, 2002
  – ALSO published as: Measurement in older ethnically diverse populations, J Mental Health Aging, Vol 7, Spring 2001
Reviews measures that have been used cross-culturally in: acculturation, socioeconomic status, social support, cognition, health, depression, and religiosity.

89 Resource: Special Journal Issue
• Measurement in a multi-ethnic society
  – Med Care, Vol 44, November 2006
  – Qualitative and quantitative methods in addressing measurement in diverse populations

90 Guidelines for Translating Measures
• Handout: annotated bibliography of articles in which optimal methods of translation are used
• Compiled by CADC Measurement and Methods Core

91 Homework for Class 3
• Complete rows in matrix
  – Use form posted on the website
• Include your name in the filename
  – Smith_HW_epi222_class3
• by Monday April 18 to