1 Class 8 Measurement Issues in Diverse Populations Including Health Disparities Research November 17, 2005 Anita L. Stewart Institute for Health & Aging.

Presentation transcript:

1 Class 8 Measurement Issues in Diverse Populations Including Health Disparities Research November 17, 2005 Anita L. Stewart Institute for Health & Aging University of California, San Francisco

2 Background
• U.S. population becoming more diverse
• More minority groups are being included in research due to:
  – NIH mandate
  – Recent health disparities initiatives

3 Types of Diverse Groups
• Health disparities research focuses on differences in health between the following groups:
  – Minority vs. non-minority
  – Low income vs. others
  – Low education vs. others
  – Limited English skills vs. others
  – ... and others

4 Health Disparities Research
• Increasing research to:
  – Describe health disparities
    » Differences in health across various diverse groups
  – Identify determinants of health disparities
    » Individual level
    » Environmental level
  – Intervene to reduce health disparities

5 The Measurement Problem
• Measurement goal: identify measures that can be used across all groups, and
  – are sensitive to diversity
  – have minimal bias between groups
• Most self-reported measures were developed and tested in mainstream, well-educated groups
  – little research on measurement characteristics in diverse groups

6 Issues Concerning Group Comparisons
• Observed mean differences in a measure can be due to
  – culturally- or group-mediated differences in true score (true differences)
  -- OR --
  – bias: systematic differences between group observed scores not attributable to true scores
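
The two explanations on this slide can be written as a simple decomposition. This is an illustrative sketch, not notation from the slides: the symbols $X$ (observed score), $T$ (true score), $b_g$ (systematic bias in group $g$), and $e$ (random error) are assumed here.

```latex
X_{ig} = T_{ig} + b_g + e_{ig}, \qquad \mathbb{E}[e_{ig}] = 0

\mathbb{E}[\bar{X}_1 - \bar{X}_2] = (\mu_{T,1} - \mu_{T,2}) + (b_1 - b_2)
```

Under this assumed model, an observed mean difference reflects a true difference only when $b_1 = b_2$; otherwise part (or all) of the difference is bias.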

7 Bias - A Special Concern
• Measurement bias in any one group may make group comparisons invalid
• Bias can be due to group differences in:
  – the meaning of concepts or items
  – the extent to which measures represent a concept
  – cognitive processes of responding
  – use of response scales
  – appropriateness of data collection methods

8 Effects of Bias on Depression: Chinese and White Respondents
• In Chinese respondents, 3 sources of bias that lower observed score:
  – tendency to not express negative feelings
  – exacerbated by face-to-face interview
  – meaning of word “depression” is more severe than for Whites, so it is less likely to be endorsed
• Comparing groups, assuming the true level of depression is the same in both groups:
  – Observed scores would be lower in Chinese group
  – But lower level is due to these biases
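
A small numeric sketch may make this slide's point concrete. The scores and the size of the bias below are invented for illustration, and the additive-bias model is an assumption, not something the slides specify.

```python
from statistics import mean


def observed_scores(true_scores, bias):
    """Observed score = true score + a constant group-level bias (assumed model)."""
    return [t + bias for t in true_scores]


# Hypothetical true depression scores, identical in both groups by construction.
true_scores = [10, 12, 14, 16, 18]

white_observed = observed_scores(true_scores, bias=0.0)
chinese_observed = observed_scores(true_scores, bias=-3.0)  # assumed downward bias

observed_gap = mean(white_observed) - mean(chinese_observed)
true_gap = mean(true_scores) - mean(true_scores)
print("observed gap:", observed_gap, "true gap:", true_gap)
```

The observed gap equals the assumed bias even though the true gap is zero, which is the situation the slide warns about.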

9 Typical Sequence of Developing New Self-Report Measures
Develop concept → Create item pool → Pretest/revise → Field survey → Psychometric analyses → Final measures

10 Extra Steps in Sequence of Developing New Self-Report Measures for Diverse Groups
Sequence: Develop concept → Create item pool → Pretest/revise → Field survey → Psychometric analyses → Final measures
Extra steps:
• Obtain perspectives of diverse groups

11 Extra Steps in Sequence of Developing New Self-Report Measures for Diverse Groups
Sequence: Develop concept → Create item pool → Pretest/revise → Field survey → Psychometric analyses → Final measures
Extra steps:
• Obtain perspectives of diverse groups
• .. to reflect these perspectives

12 Extra Steps in Sequence of Developing New Self-Report Measures for Diverse Groups
Sequence: Develop concept → Create item pool → Pretest/revise → Field survey → Psychometric analyses → Final measures
Extra steps:
• Obtain perspectives of diverse groups
• .. to reflect these perspectives
• .. in all diverse groups

14 Extra Steps in Sequence of Developing New Self-Report Measures for Diverse Groups
Sequence: Develop concept → Create item pool → Pretest/revise → Field survey → Psychometric analyses → Final measures
Extra steps:
• Obtain perspectives of diverse groups
• .. to reflect these perspectives
• .. in all diverse groups
• Measurement studies across groups

15 Extra Steps in Sequence of Developing New Self-Report Measures for Diverse Groups
Sequence: Develop concept → Create item pool → Pretest/revise → Field survey → Psychometric analyses → Final measures
Extra steps:
• Obtain perspectives of diverse groups
• .. to reflect these perspectives
• .. in all diverse groups
• If results are non-equivalent

16 Measurement Adequacy vs. Measurement Equivalence
• Making group comparisons requires conceptual and psychometric adequacy and equivalence
• Adequacy - within a group
  – concepts are appropriate
  – psychometric properties meet minimal criteria
• Equivalence - between groups
  – conceptual and psychometric properties are comparable

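17 Conceptual and Psychometric Adequacy and Equivalence
Matrix (rows: Conceptual, Psychometric; columns: Adequacy in 1 Group, Equivalence Across Groups)
• Conceptual, Adequacy in 1 Group: Concept meaningful within one group
• Conceptual, Equivalence Across Groups: Concept equivalent across groups
• Psychometric, Adequacy in 1 Group: Psychometric properties meet minimal standards within one group
• Psychometric, Equivalence Across Groups: Psychometric properties invariant (equivalent) across groups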

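18 Left Side of Matrix: Issues in a Single Group
Matrix (rows: Conceptual, Psychometric; columns: Adequacy in 1 Group, Equivalence Across Groups)
• Conceptual, Adequacy in 1 Group: Concept meaningful within one group
• Conceptual, Equivalence Across Groups: Concept equivalent across groups
• Psychometric, Adequacy in 1 Group: Psychometric properties meet minimal standards within one group
• Psychometric, Equivalence Across Groups: Psychometric properties invariant (equivalent) across groups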

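19 Right Side of Matrix: Issues in More Than One Group
Matrix (rows: Conceptual, Psychometric; columns: Adequacy in 1 Group, Equivalence Across Groups)
• Conceptual, Adequacy in 1 Group: Concept meaningful within one group
• Conceptual, Equivalence Across Groups: Concept equivalent across groups
• Psychometric, Adequacy in 1 Group: Psychometric properties meet minimal standards within one group
• Psychometric, Equivalence Across Groups: Psychometric properties invariant (equivalent) across groups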

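20 Conceptual Adequacy in One Group
Matrix (rows: Conceptual, Psychometric; columns: Adequacy in 1 Group, Equivalence Across Groups)
• Conceptual, Adequacy in 1 Group: Concept meaningful within one group
• Conceptual, Equivalence Across Groups: Concept equivalent across groups
• Psychometric, Adequacy in 1 Group: Psychometric properties meet minimal standards within one group
• Psychometric, Equivalence Across Groups: Psychometric properties invariant (equivalent) across groups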

21 Conceptual Adequacy in One Group
• Is concept relevant, meaningful, and acceptable in that group?
• Traditional research
  – Conceptual adequacy = simply defining a concept
  – Mainstream population “assumed”
• Minority and cross cultural research
  – Mainstream concepts may be inadequate
  – Concept should correspond to how a particular group thinks about it

22 (Old) Example of Inadequate Concept
• Patient satisfaction typically conceptualized in mainstream populations in terms of, e.g.,
  – access, technical care, communication, continuity, interpersonal style
• In minority and low income groups, additional relevant domains include, e.g.,
  – discrimination by health professionals
  – sensitivity to language barriers

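23 Psychometric Adequacy in One Group
Matrix (rows: Conceptual, Psychometric; columns: Adequacy in 1 Group, Equivalence Across Groups)
• Conceptual, Adequacy in 1 Group: Concept meaningful within one group
• Conceptual, Equivalence Across Groups: Concept equivalent across groups
• Psychometric, Adequacy in 1 Group: Psychometric properties meet minimal standards within one group
• Psychometric, Equivalence Across Groups: Psychometric properties invariant (equivalent) across groups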

24 Psychometric Adequacy in any Group
• Minimal standards:
  – Sufficient variability
  – Minimal missing data
  – Adequate reliability/reproducibility
  – Evidence of construct validity
  – Evidence of responsiveness to change
• Basic classical test theory approach

25 Evidence of Psychometric Inadequacy of SF-36 Scale in Three Diverse Groups
• SF-36 social functioning scale: internal consistency reliability <.70 in three different samples:
  – Chinese language, adults aged years
  – Japanese language, Japanese elders
  – English, Pima Indians
Stewart AL & Nápoles-Springer A, 2000 (see readings)
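
Internal consistency reliability is commonly summarized with Cronbach's alpha. The sketch below computes alpha from raw item responses and checks it against the .70 criterion mentioned on the slide. The response data are invented, and the function name `cronbach_alpha` is a hypothetical helper, not part of any instrument's scoring software.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Assumes complete data (no missing responses) and k >= 2.
    """
    k = len(item_scores[0])
    item_variances = [pvariance([row[j] for row in item_scores]) for j in range(k)]
    total_variance = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses to a 2-item scale from three respondents.
responses = [[1, 2], [2, 1], [3, 3]]

alpha = cronbach_alpha(responses)
meets_minimum = alpha >= 0.70  # minimal criterion cited on the slide
print(round(alpha, 2), meets_minimum)
```

With only three invented respondents this is purely a demonstration of the arithmetic; real assessments would use the full sample in each group.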

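26 Conceptual Equivalence Across Groups
Matrix (rows: Conceptual, Psychometric; columns: Adequacy in 1 Group, Equivalence Across Groups)
• Conceptual, Adequacy in 1 Group: Concept meaningful within one group
• Conceptual, Equivalence Across Groups: Concept equivalent across groups
• Psychometric, Adequacy in 1 Group: Psychometric properties meet minimal standards within one group
• Psychometric, Equivalence Across Groups: Psychometric properties invariant (equivalent) across groups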

27 Conceptual Equivalence
• Is the concept relevant, familiar, acceptable to all diverse groups being studied?
• Is the concept defined the same way in all groups?
  – all relevant “domains” included (none missing)
  – interpreted similarly
• Is the concept appropriate for all diverse groups?

28 Example: Subjective Test of Conceptual Equivalence of Spanish FACT-G
• Bilingual/bicultural expert panel reviewed all 28 items
  – One item had low cultural relevance to quality of life
  – One concept was missing: spirituality
• Developed new spirituality scale (FACIT-Sp) with input from cancer patients, psychotherapists, and religious experts
  – Sample item “I worry about dying”
Cella D et al. Med Care 1998: 36;1407

29 Examples of Nonequivalent Concept: Depression
• Mainstream concept expressed and reported via affect, somatic symptoms, behavior, thought patterns
• In Asian Americans
  – public expression of self-reflection is discouraged
  – saving face and self-sacrifice are powerful forces in molding behavior and expression

30 Generic/Universal vs Group-Specific (Etic versus Emic)
• Concepts unlikely to be defined exactly the same way across diverse ethnic groups
• Generic/universal (etic)
  – features of a concept that are appropriate across groups
• Group-Specific (emic)
  – idiosyncratic portions of a concept

31 Etic versus Emic (cont.)
• Goal in health disparities research
  – identify generic/universal portion of a concept (could be entire concept) that can be applied across all groups
• For within-group analyses or studies
  – the culture-specific portion is also relevant

32 Qualitative Approaches to Explore Conceptual Equivalence in Diverse Groups
• Literature reviews
  – ethnographic and anthropological
• In-depth interviews and focus groups
  – discuss concepts, obtain their views
• Expert consultation from diverse groups
  – review concept definitions
  – rate relevance of items

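33 Psychometric Equivalence
Matrix (rows: Conceptual, Psychometric; columns: Adequacy in 1 Group, Equivalence Across Groups)
• Conceptual, Adequacy in 1 Group: Concept meaningful within one group
• Conceptual, Equivalence Across Groups: Concept equivalent across groups
• Psychometric, Adequacy in 1 Group: Psychometric properties meet minimal standards within one group
• Psychometric, Equivalence Across Groups: Psychometric properties invariant (equivalent) across groups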

34 Equivalence of Reliability?? No!
• Difficult to compare reliability because it depends on the distribution of the construct in a sample
  – Thus lower reliability in one group may simply reflect lower variability
• More important is the adequacy of the reliability in both groups
  – Reliability meets minimal criteria within each group
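
The dependence of reliability on sample variability follows from the classical test theory definition of reliability. The worked numbers below are invented for illustration; only the formula itself is standard.

```latex
\rho_{XX'} = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2}

\text{Group A: } \sigma_T^2 = 16,\ \sigma_E^2 = 4 \;\Rightarrow\; \rho_{XX'} = \tfrac{16}{20} = .80
\qquad
\text{Group B: } \sigma_T^2 = 4,\ \sigma_E^2 = 4 \;\Rightarrow\; \rho_{XX'} = \tfrac{4}{8} = .50
```

In this hypothetical case the measure has the same error variance in both groups, yet reliability is lower in Group B solely because true scores vary less there.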

35 Equivalence of Criterion Validity
• Determine if hypothesized patterns of associations with specified criteria are confirmed in both groups, e.g.
  – a measure predicts utilization in both groups
  – a cutpoint on a screening measure has the same specificity and sensitivity in both groups
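
The cutpoint check in the second example can be sketched as follows. The scores, criterion diagnoses, and the cutpoint of 16 are all invented for illustration, and `sensitivity_specificity` is a hypothetical helper name.

```python
def sensitivity_specificity(scores, has_condition, cutpoint):
    """Screen positive when score >= cutpoint; compare with a criterion diagnosis."""
    pairs = list(zip(scores, has_condition))
    true_pos = sum(1 for s, d in pairs if s >= cutpoint and d)
    false_neg = sum(1 for s, d in pairs if s < cutpoint and d)
    true_neg = sum(1 for s, d in pairs if s < cutpoint and not d)
    false_pos = sum(1 for s, d in pairs if s >= cutpoint and not d)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical screening scores and criterion diagnoses in two groups.
diagnosis = [True, True, True, False, False, False]
group_a_scores = [20, 18, 17, 10, 8, 5]
group_b_scores = [20, 14, 12, 10, 8, 17]

sens_a, spec_a = sensitivity_specificity(group_a_scores, diagnosis, cutpoint=16)
sens_b, spec_b = sensitivity_specificity(group_b_scores, diagnosis, cutpoint=16)
print("Group A:", sens_a, spec_a)
print("Group B:", sens_b, spec_b)
```

If the same cutpoint yields clearly different sensitivity or specificity in the two groups, as in these made-up data, the screening measure would not show equivalent criterion validity.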

36 Equivalence of Construct Validity
• Are hypothesized patterns of associations confirmed in both groups?
  – Example: Scores on the Spanish version of the FACT had similar relationships with other health measures as scores on the English version
• Primarily tested through subjectively examining pattern of correlations
  – Can test differences using confirmatory factor analysis (e.g., through Structural Equation Modeling)

37 Equivalence of Factor Structure
• Factor structure is similar in new group to structure in original groups in which measure was tested
  – In other words, the measurement model is the same across groups
• Methods
  – Specify the number of factors you are looking for
  – Determine if the hypothesized model fits the data

38 Exploratory Factor Analysis (EFA)
• Factor analysis methods that do not constrain the number of factors or the magnitude of the loadings
• Identifies an underlying structure of a set of items with no particular hypotheses
• Goal: identify as few explanatory variables (i.e., factors) as possible that account for covariation among the items

39 Confirmatory Factor Analysis (CFA)
• Methods that specify a hypothesized structure a priori (before looking at the results)
• Can test mean and covariance structures
  – to estimate bias

40 Equivalence of Factor Structure: Assuring Psychometric Invariance
• Psychometric invariance (technical term for psychometric equivalence)
• Invariance means that important properties of a theoretically-based factor structure (measurement model) do not differ or vary across groups (are invariant)
  – In other words, the measurement model is the same across groups
• Empirical comparison of factor structure

41 Criteria for Psychometric Invariance: Non-technical Language
Across two or more groups, determine whether each criterion is true (a sequential process):
1. Same number of factors (dimensions)
2. Same items load on (correlate with) same factors
3. Each item has same factor loadings
4. No bias on any item or scale across groups
5. Same residuals on items
6. No item or scale bias AND same residuals

42 Criteria for Evaluating Invariance Across Groups: Technical Terms
• Dimensional Invariance: Same number of dimensions
• Configural Invariance: Same items load on same dimensions
• Metric Invariance, Factor Pattern Invariance: Items have same loadings on same dimensions
• Strong Factorial Invariance, Scalar Invariance: Observed scores are unbiased
• Residual Invariance: Observed item and factor variances can be compared across groups
• Strict Factorial Invariance: Both scalar invariance and residual invariance criteria are met

43 Dimensional Invariance
• Definition: Factor structure is the same, i.e., the same number of factors is observed in both groups
• CES-D Example:
  – Four factors found in men and 3 factors in women (n=1000), years of age
  – Failed the dimensional invariance criterion
    » a different number of factors was found in each group
JM Golding et al., J Clin Psychol 1991:47;61-75

44 Example: Dimensional Invariance of CES-D in Hispanic EPESE
• Original 4 factors
  – Somatic symptoms
  – Depressive affect
  – Interpersonal behavior
  – Positive affect
• Hispanic EPESE - only 2 factors
  – Depression (included somatic symptoms, depressive affect, and interpersonal behavior)
  – Well-being
Miller TQ et al., The factor structure of the CES-D in two surveys of elderly Mexican Americans, J Gerontol: Soc Sci, 1997;520:S

45 Configural Invariance
• Assumes: dimensional invariance is found
  – that there were the same number of factors
• Definition: Item-factor patterns are the same, i.e., the same items load on the same factors in both groups
• CES-D Example
  – 4 factors found in Anglos, Blacks, and Chicanos
  – Same items loaded on each factor in all groups
RE Roberts et al., Psychiatry Research, 1980;2:

46 Metric Invariance or Factor Pattern Invariance
• Assumes: dimensional and configural invariance are found
• Definition: Item loadings are the same across groups, i.e., the correlation of each item with its factor is the same in both groups
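
As a rough descriptive check on whether loadings look similar across groups, one option is Tucker's congruence coefficient together with the largest loading difference. This is a swapped-in descriptive technique, not the formal test the slides refer to (which would constrain loadings to be equal in a multi-group confirmatory factor analysis), and the loadings below are invented.

```python
from math import sqrt


def congruence(loadings_a, loadings_b):
    """Tucker's congruence coefficient between two loading vectors for one factor."""
    cross = sum(a * b for a, b in zip(loadings_a, loadings_b))
    norm = sqrt(sum(a * a for a in loadings_a) * sum(b * b for b in loadings_b))
    return cross / norm


# Hypothetical standardized loadings of four items on one factor in two groups.
group_1 = [0.70, 0.65, 0.80, 0.60]
group_2 = [0.72, 0.60, 0.78, 0.25]

phi = congruence(group_1, group_2)
largest_gap = max(abs(a - b) for a, b in zip(group_1, group_2))
print(round(phi, 3), round(largest_gap, 2))
```

The congruence coefficient is insensitive to proportional rescaling of the loadings, so a high value does not establish metric invariance on its own; the per-item gap helps locate which item (here the fourth) behaves differently.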

47 Strong Factorial Invariance or Scalar Invariance
• Assumes: dimensional, configural, and metric (factor pattern) invariance are found
• Definition: Observed scores are unbiased, i.e., means can be compared across groups
• Requires test of equivalence of mean scores across groups using confirmatory factor analysis

48 Residual Invariance
• Assumes: dimensional, configural, and metric (factor pattern) invariance are found
• Definition: Observed item and factor residual variances can be compared across groups, i.e., similar error associated with each item across groups

49 Strict Factorial Invariance
• Assumes: dimensional, configural, metric (factor pattern), and strong factorial (scalar) invariance are found
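
The invariance levels in slides 41 to 49 can be summarized in the usual multi-group factor model notation. The symbols below are the conventional ones from the structural equation modeling literature and are an assumption here, since the slides themselves use no formulas: for group $g$, $\tau_g$ are item intercepts, $\Lambda_g$ the loadings, $\xi_g$ the factors with mean $\kappa_g$ and covariance $\Phi_g$, and $\Theta_g$ the residual covariance matrix.

```latex
x_g = \tau_g + \Lambda_g \xi_g + \varepsilon_g, \qquad
\mathbb{E}[x_g] = \tau_g + \Lambda_g \kappa_g, \qquad
\Sigma_g = \Lambda_g \Phi_g \Lambda_g^{\top} + \Theta_g

\begin{aligned}
\text{Configural:}\quad & \Lambda_1, \Lambda_2 \text{ have the same pattern of zero and nonzero loadings} \\
\text{Metric:}\quad & \Lambda_1 = \Lambda_2 \\
\text{Scalar (strong):}\quad & \Lambda_1 = \Lambda_2,\ \tau_1 = \tau_2 \\
\text{Residual:}\quad & \Lambda_1 = \Lambda_2,\ \Theta_1 = \Theta_2 \\
\text{Strict:}\quad & \Lambda_1 = \Lambda_2,\ \tau_1 = \tau_2,\ \Theta_1 = \Theta_2
\end{aligned}
```

Each level adds equality constraints to the previous one, which is why the slides describe the evaluation as a sequential process.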

50 Integrating Qualitative and Quantitative Methods in Assessing Cultural Equivalence
• Optimal approach: use qualitative and quantitative methods in tandem to address issues of cultural equivalence
• International studies do this routinely
  – typically develop a translated version of existing measure for a new country and language
    » Swedish version of SF-36
    » Japanese version of Mental Health Inventory

51 Distinction Between International and U.S. Studies
• International studies assume conceptual non-equivalence to begin with
  – Different nations, languages
• Usually dealing with translated measures
• During translation, items can be added or modified to improve conceptual and semantic equivalence
  – Product is an “adapted” instrument

52 International Versus U.S. Approach Assess Conceptual Equivalence (Qualitative) Assess Psychometric Equivalence (Quantitative)

53 Typical International Approach
Steps: Assess Conceptual Equivalence (Qualitative); Assess Psychometric Equivalence (Quantitative)
• Begin here (assumes conceptual differences across countries)
• If new domains or definitions are found, can revise and add items
• Translated “adapted” version is the goal
• May achieve conceptual equivalence before testing psychometric adequacy

54 Typical U.S. Approach in Studies of English Speaking Diverse Groups
• Select existing well-tested measures (developed in mainstream) and assume they will work (universality)
• Assumes perspectives of diverse group are similar to mainstream
  – “Cultural hegemony” (Guyatt)
  – “Middle-class ethnocentrism” (Rogler)

55 Typical U.S. Approach When No Translation is Done
Steps: Assess Conceptual Equivalence (Qualitative); Assess Psychometric Equivalence (Quantitative)
• Most studies begin here (assumes universality of constructs)

56 Typical U.S. Approach When No Translation is Done
Steps: Assess Conceptual Equivalence (Qualitative); Assess Psychometric Equivalence (Quantitative)
• Most studies begin here (assumes universality of constructs)
• If equiv.: Proceed with analysis. May miss important domains and definitions

57 Typical U.S. Approach When No Translation is Done
Steps: Assess Conceptual Equivalence (Qualitative); Assess Psychometric Equivalence (Quantitative)
• Most studies begin here (assumes universality of constructs)
• If equiv.: Proceed with analysis. May miss important domains and definitions
• If problems: No Guidelines! If refine items based on qualitative studies, no longer have comparable instrument

58 Typical U.S. Subgroup Approach When No Translation is Done
Steps: Assess Conceptual Equivalence (Qualitative); Assess Psychometric Equivalence (Quantitative)
• Most studies begin here (assumes universality of constructs)
• If equiv.: Proceed with analysis. May miss important domains and definitions
• If problems: No Guidelines! If refine items based on qualitative studies, no longer have comparable instrument

59 What to do if Measures Are Not Equivalent in a Specific Study Comparing Groups
• Need guidelines for how to handle data when substantial non-comparability is found in a study
  – Drop bad or “biased” items from scores
    » Compare results with and without biased items
  – Analyze study by stratifying diverse groups
• The current challenge for measurement in minority health studies
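
The "with and without biased items" comparison can be sketched as a small sensitivity analysis. The responses are invented, the choice of which item is biased is assumed to come from a prior measurement study, and `scale_scores` is a hypothetical helper.

```python
from statistics import mean


def scale_scores(responses, keep_items):
    """Mean of the kept items for each respondent (assumes no missing data)."""
    return [mean(row[j] for j in keep_items) for row in responses]


# Hypothetical 4-item responses; item index 3 is assumed to be biased in group B.
group_a = [[3, 3, 3, 3], [4, 4, 4, 4]]
group_b = [[3, 3, 3, 1], [4, 4, 4, 2]]

all_items = [0, 1, 2, 3]
unbiased_items = [0, 1, 2]

diff_all = mean(scale_scores(group_a, all_items)) - mean(scale_scores(group_b, all_items))
diff_trimmed = mean(scale_scores(group_a, unbiased_items)) - mean(
    scale_scores(group_b, unbiased_items)
)
print("all items:", diff_all, "without flagged item:", diff_trimmed)
```

If the group difference changes materially once the flagged item is dropped, as it does in these made-up data, that suggests the original difference was driven at least partly by item bias.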

60 Example: 20-item Spanish CES-D in Older Latinos
• 2 items had very low item-scale correlations, high rates of missing data in two studies
  – I felt hopeful about the future
  – I felt I was just as good as other people
• 20-item version Study 1 Study 2
  – Item-scale correlations -.20 to to.78
  – Cronbach’s alpha
• 18-item version
  – Item-scale correlations.45 to to.79
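
Item-scale correlations of the kind reported here are often computed as corrected item-rest correlations (each item against the sum of the other items). The sketch below uses invented data and an assumed rule-of-thumb threshold of .30 for flagging items; neither the data nor the threshold comes from the studies cited on the slide.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def item_rest_correlations(responses):
    """Correlate each item with the sum of the remaining items."""
    k = len(responses[0])
    result = []
    for j in range(k):
        item = [row[j] for row in responses]
        rest = [sum(row) - row[j] for row in responses]
        result.append(pearson(item, rest))
    return result


# Hypothetical responses from five people to a 4-item scale.
responses = [[1, 1, 1, 3], [2, 2, 2, 1], [3, 3, 3, 2], [4, 4, 4, 3], [5, 5, 5, 1]]

correlations = item_rest_correlations(responses)
flagged = [j for j, r in enumerate(correlations) if r < 0.30]  # assumed threshold
print([round(r, 2) for r in correlations], flagged)
```

Items flagged this way are candidates for the kind of removal shown on the slide, where dropping two poorly performing items produced an 18-item version.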

61 Example: Measure Can be Modified
• GHAA Consumer Satisfaction Survey
• Adapted to be appropriate for African American patients
  – Focus groups conducted to obtain perspectives of African Americans
  – New domains added (e.g., discrimination/stereotyping)
  – New items added to existing domains
Fongwa M et al. Characteristics of a patient satisfaction instrument tailored to concerns of African Americans. Ethnicity and Disease, in press.

62 Approaches to Conducting Studies When You Are Not Sure
• Use a combination of “universal” and group-specific items
  – use universal items to compare across groups
  – use specific items (added onto universal items) when conducting analyses within one group
    » To find a variable that correlates with a health measure within one group

63 Conclusions
• Measurement in health disparities research is a relatively new field
  – Few guidelines
• Encourage first steps
  – Test and report adequacy and equivalence
• As evidence grows, concepts and measures that work better across diverse groups will be identified

64 Homework: Optional
• IF you are interested in conducting research in diverse populations,
  – Complete rows in the matrix (equivalence across diverse groups)