Task Force on the Development of a Common Instrument to Measure Health States: Conceptual and Logistic Issues in Item Construction (Cameron N. McIntosh and co-authors)

Presentation transcript:

1 Task Force on the Development of a Common Instrument to Measure Health States: Conceptual and Logistic Issues in Item Construction
Cameron N. McIntosh; Julie Bernier; Jean-Marie Berthelot; Sarah Connor Gorber; Michael C. Wolfson
Statistics Canada, Ottawa, Ontario, Canada
Working Paper No.3, 22 November 2005
STATISTICAL COMMISSION and UN ECONOMIC COMMISSION FOR EUROPE, CONFERENCE OF EUROPEAN STATISTICIANS
STATISTICAL OFFICE OF THE EUROPEAN COMMUNITIES (EUROSTAT)
WORLD HEALTH ORGANIZATION (WHO)
Joint UNECE/WHO/Eurostat Meeting on the Measurement of Health Status (Budapest, Hungary, November 2005)
Session 3 - Invited paper

2 Selected Domains
For inclusion on the common instrument, the task force selected the following 10 domains, for which specific items were to be constructed:
1. Physical Functioning: Mobility
2. Physical Functioning: Dexterity
3. Vitality/Fatigue
4. Affect (happiness, depression)
5. Anxiety (worry, fear, nervousness)
6. Vision (visual acuity)
7. Hearing (auditory acuity)
8. Pain and Discomfort
9. Social Relationships (including aspects of communication)
10. Cognition
  (a) memory and concentration
  (b) problem solving and thinking

3 Developing Questions for the Health Domains
A number of conceptual and logistic issues needed to be considered in the item construction process for all domains; these can be grouped under the following five major headings:
(1) Number of Questions per Domain
(2) Questions Should be Uni-dimensional
(3) Duration of the Recall Period for the Questions
(4) Dealing with Technical and Medicinal Prosthetics
(5) Item Wording and Response Categories

4 (1) Number of Questions per Domain
There is a trade-off between adequate domain coverage and operational feasibility of the survey instrument; ideally, each domain should be assessed using only one or two items.
Multi-faceted domains (e.g., Cognition) may necessitate multiple items to enhance measurement precision.
Filter questions should be considered for screening out respondents with “no limitations” on a given domain.
  Advantage: might conserve interview time, since not all response categories would need to be read in all cases.
  Disadvantage: might result in a bias toward “no” responses, since answering “no” relieves the respondent of the mental effort needed to generate an estimate of functioning.
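The filter-question trade-off above can be sketched as simple skip logic. This is only an illustration under stated assumptions: the severity labels, the `DomainResponse` record and the `ask_domain` function are hypothetical and are not taken from the task force's instrument.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical response categories; the slides do not fix any wording.
SEVERITY_LEVELS = ["mild", "moderate", "severe", "unable to do"]


@dataclass
class DomainResponse:
    domain: str
    has_limitation: bool
    severity: Optional[str]  # None when the filter screened the respondent out
    prompts_read: int        # number of questions the interviewer read aloud


def ask_domain(domain: str, has_limitation: bool,
               severity: Optional[str] = None) -> DomainResponse:
    """Apply a yes/no filter first; ask for severity only after a 'yes'."""
    if not has_limitation:
        # Skip path: the full set of response categories is never read.
        return DomainResponse(domain, False, None, prompts_read=1)
    if severity not in SEVERITY_LEVELS:
        raise ValueError(f"severity must be one of {SEVERITY_LEVELS}")
    return DomainResponse(domain, True, severity, prompts_read=2)


screened_out = ask_domain("Mobility", has_limitation=False)
followed_up = ask_domain("Mobility", has_limitation=True, severity="moderate")
print(screened_out.prompts_read, followed_up.prompts_read)
```

The shorter skip path is the time saving named as the advantage; it is also the reason a respondent might be tempted to answer “no”, which is the bias named as the disadvantage.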

5 (2) Questions Should be Uni-Dimensional
To maximize measurement precision, each item should assess only one domain (or domain aspect); “double-barreled” response categories should be avoided, for example (EQ-5D):
  1. I am not anxious or depressed
  2. I am moderately anxious or depressed
  3. I am extremely anxious or depressed
Responses to items that mix different concepts are difficult to interpret, because it is not known which part of the question was being answered.
Multiple concepts within a single question might also confuse respondents and prompt natural questions to interviewers, for example:
  “If I am not anxious but am moderately depressed, should I pick 1 or 2?”

6 (3) Duration of the Recall Period for the Questions
Respondents need to base their functional status reports on some time period.
Simply asking about “general” or “usual” functioning might provide the least biased estimates, as it helps to avoid picking up the impact of time-limited health conditions (e.g., flu).
A problem is that “usual” and “general” are vague terms that might not have a consistent meaning across countries and cultures, and they may pose translation difficulties.
A specific recall period (e.g., the previous 30 days) would help standardize measurement, as well as facilitate translation.

7 (3) Duration of the Recall Period for the Questions
The choice of a specific recall period must take several factors into account:
  The shorter the recall period, the greater the tendency to consider only frequent, highly patterned events of lower intensity.
  The longer the recall period, the greater the tendency to consider infrequent, more intense events (e.g., intense episodes of anger).
  An optimum recall period would lead to a balanced consideration of domain-related events (i.e., events of varying intensity).

8 (3) Duration of the Recall Period for the Questions
Telescoping: events are improperly included in or excluded from the recall period.
  Forward telescoping: an event that is better represented in memory (highly vivid and intense) is included incorrectly in the recall period.
  Backward telescoping: an event that is more poorly represented in memory (less vivid and intense) is excluded incorrectly from the recall period.
Questions may need to reinforce that the focus is on respondents’ lives during the specified recall period only.
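The two telescoping errors defined above can be illustrated with a small date check. This is a sketch under stated assumptions: the 30-day window is the example value from the slides, the window is treated as inclusive at both ends, and the function names `in_recall_period` and `telescoping_error` are hypothetical.

```python
from datetime import date, timedelta
from typing import Optional

RECALL_DAYS = 30  # example value from the slides, not a recommendation


def in_recall_period(event_date: date, interview_date: date,
                     recall_days: int = RECALL_DAYS) -> bool:
    """True if the event truly falls inside the recall window (inclusive)."""
    start = interview_date - timedelta(days=recall_days)
    return start <= event_date <= interview_date


def telescoping_error(true_date: date, reported_in_period: bool,
                      interview_date: date,
                      recall_days: int = RECALL_DAYS) -> Optional[str]:
    """Classify a report as 'forward', 'backward', or None (no error)."""
    actually_in = in_recall_period(true_date, interview_date, recall_days)
    if reported_in_period and not actually_in:
        return "forward"   # event included incorrectly
    if not reported_in_period and actually_in:
        return "backward"  # event excluded incorrectly
    return None


interview = date(2005, 11, 22)
# A vivid event from early September reported as falling in the last 30 days.
print(telescoping_error(date(2005, 9, 1), True, interview))
# A mild event from mid-November that the respondent leaves out.
print(telescoping_error(date(2005, 11, 10), False, interview))
```

In practice the true date is rarely known, so a check like this is only usable in validation studies where reports can be compared against records.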

9 (4) Dealing With Technical and Medicinal Prosthetics
To accurately measure capacity and feelings, the questions may need to incorporate information on the use of aids (e.g., walking equipment, glasses and contact lenses, hearing aids, medication for controlling pain and regulating mood).
If certain items do not specify the use of aids, respondents who use aids might pose natural questions to interviewers, for example:
  “Do you mean how much difficulty I have getting around the neighbourhood with or without my walker/wheelchair?”
  “Are you referring to the intensity of my pain when I am on or off my medication?”
Questions for domains where aids are most relevant (e.g., mobility, vision, hearing, pain and discomfort) should probably mention the use of aids in the preamble and/or the response categories.

10 (5) Item Wording and Response Categories
Terminology will have to be chosen carefully in order to facilitate translation and international comparability of concepts.
Language that is either overly colloquial or overly scientific should be avoided.
It might be best to assess capacity in terms of “difficulty in doing __”; questions that directly use the terms “capacity” (or “ability”) might be ambiguous for respondents.
It needs to be determined whether problems in functioning will be assessed in terms of frequency (how often), intensity (how bad), or both.

11 (5) Item Wording and Response Categories
Response category cut-point shift problem: the same underlying level of capacity or feeling may not receive the same rating across countries, cultures, or individuals (e.g., limitations seen as “mild” in one culture may be seen as “severe” in another; the frequency of a given problem might be rated as “some of the time” in one culture and “all of the time” in another).
An alternative to full sets of quantifiers and qualifiers would be to use scales with qualifiers or quantifiers on the endpoints only (e.g., a Visual Analogue Scale, or a ladder).
However, measurement precision is lessened when descriptors are not attached to all scale values; also, it may be optimal to define every domain level for future preference measurement.
Both types of items (i.e., a fully defined system of levels versus endpoint labels only) should be subjected to cognitive testing.
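The cut-point shift problem can be made concrete with a toy model in which respondents map the same underlying (latent) level of limitation onto labels using different thresholds. Everything numeric here is an assumption made for illustration: the 0-100 latent scale, the two sets of cut-points and the group names are hypothetical and do not come from the working paper.

```python
from bisect import bisect_right

LABELS = ["none", "mild", "moderate", "severe"]

# Hypothetical cut-points on a 0-100 latent limitation scale.
# Each list holds the thresholds between adjacent labels.
CUT_POINTS = {
    "culture_a": [10, 40, 70],
    "culture_b": [5, 15, 25],
}


def rate(latent: float, culture: str) -> str:
    """Return the label a respondent from `culture` gives to a latent level."""
    # bisect_right counts how many thresholds the latent value has reached.
    return LABELS[bisect_right(CUT_POINTS[culture], latent)]


same_latent_level = 30
print(rate(same_latent_level, "culture_a"))  # mild
print(rate(same_latent_level, "culture_b"))  # severe
```

The identical latent level is reported as “mild” in one group and “severe” in the other, so a difference in observed ratings between groups need not reflect any difference in functioning.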

12 Issues Requiring Input
What should be the upper limit on questions for each domain?
How do we arrive at an optimal balance between precision in measurement (i.e., maintaining item uni-dimensionality) and operational feasibility (i.e., having a reasonably brief survey module)?
What is the best recall period for the items?
What is the best way to build information on technical and medicinal prosthetics into the items?
Should there be response category labels for every level of a domain, or should there be scales with labels on the endpoints only? How do we derive a set of internationally comparable descriptors?