1 Measurement Issues in Health Disparities Research Anita L. Stewart, Ph.D. University of California, San Francisco Clinical Research with Diverse Communities.

Slides:



Advertisements
Similar presentations
Cross Cultural Research
Advertisements

Research Methodology Lecture No : 11 (Goodness Of Measures)
Issues of Technical Adequacy in Measuring Student Growth for Educator Effectiveness Stanley Rabinowitz, Ph.D. Director, Assessment & Standards Development.
1 Measurement Issues in Health Disparities Research Anita L. Stewart, Ph.D. University of California, San Francisco Health Disparities Research Methods.
DEVELOPMENT OF A PREFERENCE-BASED, CONDITION SPECIFIC PATIENT REPORTED OUTCOME MEASURE FOR USE WITH VENOUS ULCERATION Simon Palfreyman 1, John E Brazier.
CH. 9 MEASUREMENT: SCALING, RELIABILITY, VALIDITY
Part II Knowing How to Assess Chapter 5 Minimizing Error p115 Review of Appl 644 – Measurement Theory – Reliability – Validity Assessment is broader term.
Culture and psychological knowledge: A Recap
Chapter 4 Validity.
Health-related quality of life in diabetic patients and controls without diabetes in refugee camps in Gaza strip: a cross-sectional study By: Ashraf Eljedi:
Beginning the Research Design
Health Disparities/ Cultural Competence Curriculum Clinical Addiction Research and Education Unit Section of General Internal Medicine Boston University.
1 Measurement Measurement Rules. 2 Measurement Components CONCEPTUALIZATION CONCEPTUALIZATION NOMINAL DEFINITION NOMINAL DEFINITION OPERATIONAL DEFINITION.
Chapter 7 Correlational Research Gay, Mills, and Airasian
Validity and Validation: An introduction Note: I have included explanatory notes for each slide. To access these, you will probably have to save the file.
How to Write a Successful Aging Grant Application to NIH: Hints and Tips Fredda Blanchard-Fields Georgia Institute of Technology.
Study announcement if you are interested!. Questions  Is there one type of mixed design that is more common than the other types?  Even though there.
Formulating objectives, general and specific
Fig Theory construction. A good theory will generate a host of testable hypotheses. In a typical study, only one or a few of these hypotheses can.
Reliability and factorial structure of a Portuguese version of the Children’s Hope Scale José Tomás da Silva Maria Paula Paixão Catarina Carvalho dos Santos.
Psychometric Properties of the Consumer Assessment of Healthcare Providers and Systems (CAHPS) Clinician and Group Adult Visit Survey September 11, 2012.
Implication of Gender and Perception of Self- Competence on Educational Aspiration among Graduates in Taiwan Wan-Chen Hsu and Chia- Hsun Chiang Presenter.
Nursing Care Makes A Difference The Application of Omaha Documentation System on Clients with Mental Illness.
1 EPI 225 Measurement in Clinical Research Fall 2008 Anita L. Stewart, Ph.D. Institute for Health & Aging University of California, San Francisco.
Translation and Cross-Cultural Equivalence of Health Measures.
PERSONALITY ASSESSMENT ACROSS CULTURES Imported and Indigenous Instruments.
Construction and Evaluation of Multi-item Scales Ron D. Hays, Ph.D. RCMAR/EXPORT September 15, 2008, 3-4pm
Epidemiology The Basics Only… Adapted with permission from a class presentation developed by Dr. Charles Lynch – University of Iowa, Iowa City.
1 Class 1 Concept Development and Concept Definitions September 20, 2007 Anita L. Stewart, Ph.D. Institute for Health & Aging University of California,
Instrumentation.
LECTURE 06B BEGINS HERE THIS IS WHERE MATERIAL FOR EXAM 3 BEGINS.
Analyzing Reliability and Validity in Outcomes Assessment (Part 1) Robert W. Lingard and Deborah K. van Alphen California State University, Northridge.
Standardization and Test Development Nisrin Alqatarneh MSc. Occupational therapy.
Public and Private Families Chapter 1. Increasing ambivalence Women in workforce vs. children in day care Divorce vs. unhappy marriage.
1 Copyright © 2011 by Saunders, an imprint of Elsevier Inc. Chapter 9 Examining Populations and Samples in Research.
MEASUREMENT: VALIDITY Lu Ann Aday, Ph.D. The University of Texas School of Public Health.
1 Measurement Issues in Health Disparities Research Anita L. Stewart, Ph.D. University of California, San Francisco Health Disparities Research Methods.
CJT 765: Structural Equation Modeling Class 7: fitting a model, fit indices, comparingmodels, statistical power.
Evaluating a Research Report
1 The Patient Perspective: Satisfaction Survey Presented at: Disease Management Colloquium June 22, 2005 Shulamit Bernard, RN, PhD.
1 Class 7 Measurement Issues in Research with Diverse Populations Including Health Disparities Research October 30, 2008 Anita L. Stewart Institute for.
Quantitative and Qualitative Approaches
1 Measurement Issues in Health Disparities Research Anita L. Stewart, Ph.D. University of California, San Francisco Health Disparities Research Methods.
6. Evaluation of measuring tools: validity Psychometrics. 2012/13. Group A (English)
Confirmatory Factor Analysis Psych 818 DeShon. Construct Validity: MTMM ● Assessed via convergent and divergent evidence ● Convergent – Measures of the.
Measurement Models: Exploratory and Confirmatory Factor Analysis James G. Anderson, Ph.D. Purdue University.
Carol C. Korenbrot, Ph.D., Sabrina T. Wong, R.N., Ph.D., Anita L. Stewart, Ph.D., University of California San Francisco Collaborators Analytical Team.
VALIDITY AND VALIDATION: AN INTRODUCTION Note: I have included explanatory notes for each slide. To access these, you will probably have to save the file.
Measurement Issues by Ron D. Hays UCLA Division of General Internal Medicine & Health Services Research (June 30, 2008, 4:55-5:35pm) Diabetes & Obesity.
1 Measurement Issues in Health Disparities Research Anita L. Stewart, Ph.D. University of California, San Francisco Clinical Research with Diverse Communities.
1 Class 7 Measurement Issues in Research with Diverse Populations Including Health Disparities Research November 5, 2009 Anita L. Stewart Institute for.
1 Locating and Assessing the Usefulness of Health Measures for Health Disparities Research Anita L. Stewart, Ph.D. University of California, San Francisco.
1 Class 8 Measurement Issues in Diverse Populations Including Health Disparities Research November 17, 2005 Anita L. Stewart Institute for Health & Aging.
Copyright © 2012 Wolters Kluwer Health | Lippincott Williams & Wilkins Chapter 15 Developing and Testing Self-Report Scales.
Qualitative and Quantitative Research Methods
Multitrait Scaling and IRT: Part I Ron D. Hays, Ph.D. Questionnaire Design and Testing.
Criteria for selection of a data collection instrument. 1.Practicality of the instrument: -Concerns its cost and appropriateness for the study population.
Translation and Cross-Cultural Equivalence of Health Measures
Reliability performance on language tests is also affected by factors other than communicative language ability. (1) test method facets They are systematic.
Chapter 6 - Standardized Measurement and Assessment
Impact of Perceived Discrimination on Use of Preventive Health Services Amal Trivedi, M.D., M.P.H. John Z. Ayanian, M.D., M.P.P. Harvard Medical School/Brigham.
PERSONALITY ASSESSMENT ACROSS CULTURES Imported and Indigenous Instruments.
Introduction to research
The measurement and comparison of health system responsiveness Nigel Rice, Silvana Robone, Peter C. Smith Centre for Health Economics, University of York.
Principles of Assessment and Outcome Measurement for Physical Therapists ksu. edu. sa Dr. taher _ yahoo. com Mohammed TA, Omar,
Evaluating Multi-item Scales Ron D. Hays, Ph.D. UCLA Division of General Internal Medicine/Health Services Research HS237A 11/10/09, 2-4pm,
Methods of multivariate analysis Ing. Jozef Palkovič, PhD.
Instrument Development and Psychometric Evaluation: Scientific Standards May 2012 Dynamic Tools to Measure Health Outcomes from the Patient Perspective.
Concept of Test Validity
Spanish and English Neuropsychological Assessment Scales - Guiding Principles and Evolution Friday Harbor Psychometrics Workshop 2005.
Presentation transcript:

1 Measurement Issues in Health Disparities Research Anita L. Stewart, Ph.D. University of California, San Francisco Clinical Research with Diverse Communities EPI 222, Spring April 17, 2008

2 Background u U.S. population becoming more diverse u More minority groups are being included in research due to: –NIH mandate –Recent health disparities initiatives

3 Types of Diverse Groups u Health disparities research focuses on differences in health between the following groups: –Minority vs. non-minority –Low income vs. others –Low education vs. others –Limited English skills vs. others –…. and others

4 Health Disparities Research u Describe health disparities –Health differences across various diverse groups u Identify mechanisms by which health disparities occur –Individual level –Environmental level u Intervene to reduce health disparities

5 Health Care Disparities u Differential access to and quality of health care is well known –Thus, health care disparities become a plausible mechanism for health disparities u Understanding determinants of health care disparities is also of interest

6 Types of Self-Report Measures Needed u Measures of health, and of various mechanisms for disparities –Class 4 will present numerous mechanisms u Examples from this class: sense of control, self-efficacy for managing disease, health- related quality of life for various health conditions

7 Measurement Implications of Research in Diverse Groups u Most self-reported measures were developed and tested in mainstream, well-educated groups –Subgroup analysis of measures has been rare u Thus, little information is available on appropriateness, reliability, validity, and responsiveness in minority and other diverse groups

8 The Measurement Goal: Identify Measures That Can Be Used… u To compare diverse groups u To study mechanisms within any particular “diverse” group

9 Group Comparisons are the Most Problematic u Disparities research involves comparing mean levels of health or its determinants u Requires “equivalent” concepts and measures –Potential true differences may be obscured –Observed group differences may be inaccurate

10 Alternative Explanations for Observed Group Differences u Observed group mean differences in a measure can be due to –culturally- or group-mediated differences in true score (true differences) -- OR -- –bias - systematic differences between group observed scores not attributable to true scores

11 Bias - A Special Concern u Measurement bias in any one group may make group comparisons invalid u Bias can be due to group differences in: –the meaning of concepts or items –the extent to which a measure represents a concept –cognitive processes of responding –use of response scales –appropriateness of data collection methods

12 Example of Effect of Biased Items u 5 CES-D items administered to black and white men –1 item subject to differential item functioning (bias) u 5-item scale including item suggested that black men had higher levels of somatic symptoms than white men (p <.01) u 4-item scale excluding biased item showed no differences between black and white men S Gregorich, Med Care, 2006;44:S78-S94.

13 Bias or “Systematic Difference”? u Bias refers to “deviation from true score” u Cannot speak of a measure being “biased” in one group compared to another w/o knowing true score u Preferred term: differential “item” functioning –Item (or measure) that has a different meaning in one group than another

14 Typical Sequence of Developing New Self-Report Measures Develop concept Create item pool Pretest/revise Field survey Psychometric analyses Final measures

15 Typical Sequence of Developing New Self-Report Measures Develop concept Create item pool Pretest/revise Field survey Psychometric analyses Final measures

16 Extra Steps in Sequence of Developing New Self-Report Measures for Diverse Groups Develop concept Create item pool Pretest/revise Field survey Psychometric analyses Final measures Obtain perspectives of diverse groups

17 Extra Steps in Sequence of Developing New Self-Report Measures for Diverse Groups Develop concept Create item pool Pretest/revise Field survey Psychometric analyses Final measures Obtain perspectives of diverse groups.. to reflect these perspectives

18 Extra Steps in Sequence of Developing New Self-Report Measures for Diverse Groups Develop concept Create item pool Pretest/revise Field survey Psychometric analyses Final measures Obtain perspectives of diverse groups.. to reflect these perspectives.. in all diverse groups

19 Extra Steps in Sequence of Developing New Self-Report Measures for Diverse Groups Develop concept Create item pool Pretest/revise Field survey Psychometric analyses Final measures Obtain perspectives of diverse groups.. to reflect these perspectives.. in all diverse groups

20 Extra Steps in Sequence of Developing New Self-Report Measures for Diverse Groups Develop concept Create item pool Pretest/revise Field survey Psychometric analyses Final measures Obtain perspectives of diverse groups.. to reflect these perspectives.. in all diverse groups Measurement studies across groups

21 Extra Steps in Sequence of Developing New Self-Report Measures for Diverse Groups Develop concept Create item pool Pretest/revise Field survey Psychometric analyses Final measures Obtain perspectives of diverse groups.. to reflect these perspectives.. in all diverse groups If results are non-equivalent

22 Measurement Adequacy vs. Measurement Equivalence u Making group comparisons requires conceptual and psychometric adequacy and equivalence u Adequacy - within a “diverse” group –concepts are appropriate –psychometric properties meet minimal criteria u Equivalence - between “diverse” groups –conceptual and psychometric properties are comparable

23 Why Not Use Culture-Specific Measures? u Measurement goal - identify measures that can be used across all groups, yet maintain sensitivity to diversity and have minimal bias u Most health disparities studies require comparing mean scores across diverse groups –need comparable measures

24 Conceptual and Psychometric Adequacy and Equivalence Conceptual Psychometric Adequacy in 1 Group Equivalence Across Groups Concept equivalent across groups Psychometric properties meet minimal standards within one group Psychometric properties invariant (equivalent) across groups Concept meaningful within one group

25 Left Side of Matrix: Issues in a Single Group Conceptual Psychometric Adequacy in 1 Group Equivalence Across Groups Concept equivalent across groups Psychometric properties meet minimal standards within one group Psychometric properties invariant (equivalent) across groups Concept meaningful within one group

26 Ride Side of Matrix: Issues in More Than One Group Conceptual Psychometric Adequacy in 1 Group Equivalence Across Groups Concept equivalent across groups Psychometric properties meet minimal standards within one group Psychometric properties invariant (equivalent) across groups Concept meaningful within one group

27 Conceptual Adequacy in One Group Conceptual Psychometric Adequacy in 1 Group Equivalence Across Groups Concept equivalent across groups Psychometric properties meet minimal standards within one group Psychometric properties invariant (equivalent) across groups Concept meaningful within one group

28 Conceptual Adequacy in One Group u Is concept relevant, meaningful, and acceptable to a “diverse” group? u Traditional research –Conceptual adequacy = simply defining a concept –Mainstream population “assumed” u Minority and health disparities research –Mainstream concepts may be inadequate –Concept should correspond to how a particular group thinks about it

29 Qualitative Approaches to Explore Conceptual Adequacy in Diverse Groups u Literature reviews –ethnographic and anthropological u In-depth interviews and focus groups –discuss concepts, obtain their views u Expert consultation from diverse groups –review concept definitions –rate relevance of items

30 Example of Inadequate Concept u Patient satisfaction typically conceptualized in mainstream populations in terms of, e.g., –access, technical care, communication, continuity, interpersonal style u In minority and low income groups, additional relevant domains include, e.g., –discrimination by health professionals –sensitivity to language barriers

31 Method for Examining Conceptual Relevance u Compiled set of 33 HRQL items spanning many concepts u Assessed relevance to older African Americans u After answering each question, asked “how relevant is this question to the way you think about your health?” –Response scale: 0-10 scale with endpoints labeled –Labels: 0=not at all relevant, 10=extremely relevant Cunningham WE et al., Qual Life Res, 1999;8:

32 Results: Conceptual Relevance u Most relevant items: –Spirituality (3 items) –Weight-related health (2 items) –Hopefulness (1 item) u Spirituality items –importance of spirituality to well-being, level of spirituality, being sick affected spirituality

33 Results: Conceptual Relevance u Least relevant items: –Physical functioning –Role limitations due to emotional problems u All standard MOS measures ranked in the lower 2/3, including all SF12 items

34 Conceptual Relevance of Spanish FACT-G u Bilingual/bicultural expert panel reviewed all 28 items for relevance –One item had low cultural relevance to quality of life –One concept was missing – spirituality u Developed new spirituality scale (FACIT-Sp) with input from cancer patients, psychotherapists, and religious experts –Sample item “I worry about dying” Cella D et al. Med Care 1998: 36;1407

35 Psychometric Adequacy in One Group Conceptual Psychometric Adequacy in 1 Group Equivalence Across Groups Concept equivalent across groups Psychometric properties meet minimal standards within one group Psychometric properties invariant (equivalent) across groups Concept meaningful within one group

36 Psychometric Adequacy in any Group u Minimal standards: – Sufficient variability – Minimal missing data – Adequate reliability/reproducibility – Evidence of construct validity – Evidence of responsiveness to change u Basic classical test theory approach

37 Evidence of Psychometric Inadequacy of Measure in Various Diverse Groups u SF-36 social functioning scale - internal consistency reliability <.70 in three different samples: –Chinese language, adults aged years –Japanese language, Japanese elders –English, Pima Indians Stewart AL & Nápoles-Springer A, Med Care, 2000;38(9 Suppl):II-102

38 Conceptual Equivalence Across Groups Conceptual Psychometric Adequacy in 1 Group Equivalence Across Groups Concept equivalent across groups Psychometric properties meet minimal standards within one group Psychometric properties invariant (equivalent) across groups Concept meaningful within one group

39 Conceptual Equivalence u Is the concept relevant, familiar, acceptable to all diverse groups being studied? u Is the concept defined the same way in all groups? –all relevant “domains” included (none missing) –interpreted similarly u Is the concept appropriate for all diverse groups?

40 Generic/Universal vs Group-Specific (Etic versus Emic) u Concepts unlikely to be defined exactly the same way across diverse ethnic groups u Generic/universal (etic) –features of a concept that are appropriate across groups u Group-Specific (emic) –idiosyncratic or culture-specific portions of a concept

41 Etic versus Emic (cont.) u Goal in health disparities research on more than one group: –identify generic/universal portion of a concept (could be entire concept) that can be applied across all groups u For within-group analyses or studies –the culture-specific portion is also relevant –Same as examining “conceptual adequacy” within one group

42 Approaches Similar to Those for Conceptual Adequacy u Main difference: Need to assure concept is equivalent across groups –Additional criterion u What do we mean by equivalent conceptually? u Methods are poorly developed

43 Obtain Perspective of All Diverse Groups on Concept Develop concept Create item pool Pretest/revise Field survey Psychometric analyses Final measures Obtain perspectives of diverse groups

44 Example: Develop Concept of Interpersonal Processes of Care u Began with conceptual framework from literature and psychometric studies of preliminary survey u IPC Version I Conceptual Framework u Three major multi-dimensional categories : –Communication –Decision-making –Interpersonal Style

45 IPC Version I: Subdomains u Communication Elicitation of concerns, explanations, general clarity u Decision-making Involving patients in decisions u Interpersonal Style Respectfulness, emotional support, non-discrimination, cultural sensitivity Stewart et al., Milbank Quarterly, 1999: 77:305

46 Limitations of First IPC Framework u Tested on small sample of 600 patients from San Francisco General Hospital u Several hypothesized concepts were not confirmed –e.g., cultural sensitivity u Needed further development and validation on a larger sample

47 Developed Revised IPC Concept Draft IPC II conceptual framework IPC Version I framework in Milbank Quarterly 19 focus groups - African American, Latino, and White adults Literature review of quality of care in diverse groups

48 IPC-II Conceptual Framework I. COMMUNICATION III. INTERPERSONAL STYLE General clarity Respectfulness Elicitation/responsiveness Courteousness Explanations of Perceived discrimination --processes, condition, Emotional support self-care, meds Cultural sensitivity Empowerment II. DECISION MAKING Responsive to patient preferences Consider ability to comply

49 IPC-II Conceptual Framework (cont) IV. OFFICE STAFF Respectfulness Discrimination V. FOR LIMITED ENGLISH PROFICIENCY PATIENTS MD’s and office staff’s sensitivity to language

50 Psychometric Equivalence Conceptual Psychometric Adequacy in 1 Group Equivalence Across Groups Concept equivalent across groups Psychometric properties meet minimal standards within one group Psychometric properties invariant (equivalent) across groups Concept meaningful within one group

51 Psychometric Equivalence u Measures have similar measurement properties in all diverse group of interest in your study –e.g., English and Spanish language, African Americans and Caucasians u Measures have similar measurement properties in one diverse group as in original (mainstream) groups on which the measures were developed

52 Equivalence of Reliability?? No! u Difficult to compare reliability because it depends on the distribution of the construct in a sample –Thus lower reliability in one group may simply reflect poorer variability u More important is the adequacy of the reliability in both groups –Reliability meets minimal criteria within each group

53 Equivalence of Criterion Validity u Determine if hypothesized patterns of associations with specified criteria are confirmed in both groups, e.g. –a measure predicts utilization in both groups –a cutpoint on a screening measure has the same specificity and sensitivity in both groups

54 Equivalence of Construct Validity u Are hypothesized patterns of associations confirmed in both groups? –Example: Scores on the Spanish version of the FACT had similar relationships with other health measures as scores on the English version u Primarily tested through subjectively examining pattern of correlations –Can test differences using confirmatory factor analysis (e.g., through Structural Equation Modeling)

55 Item Equivalence u Differential Item Functioning (DIF) –Items are non-equivalent if they are differentially related to the underlying trait –Equivalence indicated by no DIF u Meaning of response categories is similar across groups u Distance between response categories is similar across groups

56 Equivalence of Response Choices: Spanish and English Self-rated Health u Excellent u Very good u Good u Fair u Poor u Excelente u Muy buena u Buena u Regular u Mala “Regular” in Spanish may be closer to “good” in English, thus is not comparable to the meaning of “fair”

57 Spanish and English Self-rated Health Responses u Excellent u Very good u Good u Fair u Poor u Excelente u Muy buena u Buena u Regular (Pasable?) u Mala Another choice, “pasable,” may be closer in meaning to “fair”

58 Methods for Identifying Differential Item Functioning (DIF) u Item Response Theory (IRT) u Examines each item in relation to underlying latent trait u Tests if responses to one item predict the underlying latent “score” similarly in two groups –if not, items have “differential item functioning”

59 Equivalence of Factor Structure u Factor structure is similar in new group to structure in original groups in which measure was tested –In other words, the measurement model is the same across groups u Methods –Specify the number of factors you are looking for –Determine if the hypothesized model fits the data

60 Confirmatory Factor Analysis (CFA) u Can specify a hypothesized structure a priori u Can test mean and covariance structures –to estimate bias

61 Equivalence of Factor Structure: Testing Psychometric Invariance u Psychometric invariance (equivalence) u Important properties of theoretically-based factor structure (measurement model) do not vary across groups (are invariant) –measurement model is the same across groups u Empirical comparison across groups –Not simply by examination

62 Criteria for Psychometric Invariance Across all groups – a sequential process: u Same number of factors or dimensions u Same items on same factors u Same factor loadings u No bias on any item across groups u Same residuals on items u No item or scale bias AND same residuals

63 Dimensional Invariance: Same number of factors Configural Invariance: Same items load on same factors Metric or Factor Pattern Invariance: Items have same loadings on same factors Scalar or Strong Factorial Invariance: Observed scores are unbiased Residual Invariance: Observed item and factor variances are unbiased Strict Factorial Invariance Both scalar and residual criteria are met Criteria for Evaluating Invariance Across Groups: Technical Terms

64 Interpersonal Processes of Care (IPC) u Social-psychological aspects of the patient-physician interaction –communication, respectfulness, patient- centered decision-making, and being sensitive to patients’ needs u Developed survey of 92 items based on principles outlined above

65 Conducted Survey u From over 16,000 primary care patients, randomly sampled those who: –Made at least one visit in prior 12 months –Records indicated they were African American, Latino, or White (Caucasian) u Sampled within race/ethnic group

66 Sample Size (N=1,664) 383 Spanish speaking Latino 435 African American 428 English speaking Latino 418 White

67 Results u Of the 92 items, 29 had similar factor structure across all 4 groups –achieved “metric invariance”

68 Dimensional Invariance: Same number of factors Configural Invariance: Same items load on same factors Metric or Factor Pattern Invariance: Items have same loadings on same factors Strong Factorial or Scalar Invariance: Observed scores are unbiased Residual Invariance: Observed item and factor variances can be compared across groups Strict Factorial Invariance Both scalar invariance and residual invariance criteria are met Results: Metric Invariance Across 4 Groups for 29 Items

69 Seven “Metric Invariant” Scales (29 items) I. COMMUNICATION Hurried communication Elicited concerns, responded Explained results, medications II. DECISION MAKING Patient-centered decision-making III. INTERPERSONAL STYLE Compassionate, respectful Discriminated Disrespectful office staff

70 Continued Exploration of Invariance: Item Bias u Tested invariance of model parameter estimates across groups for “scalar invariance” –Bias in items

71 Dimensional Invariance: Same number of factors Configural Invariance: Same items load on same factors Metric or Factor Pattern Invariance: Items have same loadings on same factors Strong Factorial or Scalar Invariance: Observed scores are unbiased Residual Invariance: Observed item and factor variances can be compared across groups Strict Factorial Invariance Both scalar invariance and residual invariance criteria are met Obtained Partial Scalar Invariance Across 4 Groups for 18 Items

72 Seven “Scalar Invariant” (Unbiased) Scales (18 items) I. COMMUNICATION Hurried communication – lack of clarity Elicited concerns, responded Explained results, medications – explained results II. DECISION MAKING Patient-centered decision-making – decided together III. INTERPERSONAL STYLE Compassionate, respectful–(subset) compassionate, respectful Discriminated – discriminated due to race/ethnicity Disrespectful office staff

73 What to do if Measures Are Not Equivalent in a Specific Study Comparing Groups u Need guidelines for how to handle data when substantial non-comparability is found in a study –Drop bad or “biased” items from scores »Compare results with and without biased items –Analyze study by stratifying diverse groups u The current challenge for measurement in minority health studies

74 Example: 20-item Spanish CES-D in Older Latinos u 2 items had very low item-scale correlations, high rates of missing data in two studies –I felt hopeful about the future –I felt I was just as good as other people u 20-item version Study 1 Study 2 –Item-scale correlations -.20 to to.78 –Cronbach’s alpha u 18-item version –Item-scale correlations.45 to to.79

75 Example: Measure Can be Modified u GHAA Consumer Satisfaction Survey u Adapted to be appropriate for African American patients –Focus groups conducted to obtain perspectives of African Americans –New domains added (e.g., discrimination/ stereotyping) –New items added to existing domains Fongwa M et al. Ethnicity and Disease, 2006:16;

76 Approaches to Conducting Studies When You Are Not Sure u Use a combination of “universal” and group- specific items –use universal items to compare across groups –use specific items (added onto universal items) when conducting analyses within one group »To find a variable that correlates with a health measure within one group

77 Conclusions u Measurement in health disparities and minority health research is a relatively new field - few guidelines u Encourage first steps - test and report adequacy and equivalence u As evidence grows, concepts and measures that work better across diverse groups will be identified

78 Two Special Journal Issues on Measurement in Diverse Populations u Measurement in older ethnically diverse populations –J Mental Health Aging, Vol 7, Spring 2001 u Measurement in a multi-ethnic society –Med Care, Vol 44, November 2006

79 Homework for Class 3 u Using the same template and measure you reviewed for class 2, complete sections –Use same file and submit entire document by to