Comments on WP # 3. Discussant: Ian McDowell, University of Ottawa, Canada. Working Paper No.13, 21 November 2005.

Comments on WP # 3
Discussant: Ian McDowell, University of Ottawa, Canada
Working Paper No.13, 21 November 2005
STATISTICAL COMMISSION and UN ECONOMIC COMMISSION FOR EUROPE, CONFERENCE OF EUROPEAN STATISTICIANS
STATISTICAL OFFICE OF THE EUROPEAN COMMUNITIES (EUROSTAT)
WORLD HEALTH ORGANIZATION (WHO)
Joint UNECE/WHO/Eurostat Meeting on the Measurement of Health Status (Budapest, Hungary, November 2005)
Session 3

Clarify Purpose: Description or evaluation?
Design implications of each …
Descriptive:
  Broad ranging. Goal = to classify groups
  Themes of interest to people in general (“quality of life”, etc.); issues of public concern
  To debate: emphasize modifiable themes?
  To debate: profile rather than index?
Evaluative:
  Content tailored to intervention; usually not comprehensive
  Needs to be sensitive to change produced by the particular intervention
  Focused & fine-grained: select indicators that sample densely from the relevant level of severity; unidimensional; emphasis on summary score
Discussion point: does the proposed instrument need to serve as an evaluative measure?

Purpose, Performance and Capacity
[Diagram; labels: Descriptive purposes, Analytic purposes, Performance, Capacity (with any aids), Capacity (without aids), Potential, Unmet needs, Current picture, Needs that have been met, Environment]

Parsimony, Sensitivity & Specificity
These are in tension! Need for brevity implies:
  If the goal is broad coverage of domains (descriptive measure), there can only be a few items in each
  To achieve breadth within a domain in a few items, we need to use generic items (e.g., the infamous “can you cut your toenails?”)
  This can achieve sensitivity as a screen, but at the cost of low specificity: cannot classify type of condition
  Will also lose interpretability and unidimensionality
Point #38: the WP discussion of physical function illustrates the choice between measuring overall vs. specific functions. Do we care whether it’s knee pain, or muscle weakness, or balance that limits walking ability?
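
As a rough numerical illustration of the sensitivity/specificity tension, the sketch below scores one generic item against two different criteria: "any mobility limitation" and one specific cause. All counts are invented for illustration and do not come from the working paper.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Return (sensitivity, specificity) from the four cells of a 2x2 table."""
    sensitivity = tp / (tp + fn)  # share of true cases the item flags
    specificity = tn / (tn + fp)  # share of non-cases the item leaves unflagged
    return sensitivity, specificity

# Hypothetical sample of 1000 older adults (assumed numbers):
#   400 have some mobility limitation, 100 of those because of knee pain.
#   The generic item flags 360 of the 400 limited respondents
#   (including 90 of the 100 with knee pain) and 60 of the 600 others.

# Criterion 1: any mobility limitation.
sens_any, spec_any = sensitivity_specificity(tp=360, fn=40, fp=60, tn=540)

# Criterion 2: knee pain specifically. Everyone flagged who does not have
# knee pain (330 people) now counts as a false positive.
sens_knee, spec_knee = sensitivity_specificity(tp=90, fn=10, fp=330, tn=570)

print(f"any limitation: sensitivity={sens_any:.2f}, specificity={spec_any:.2f}")
print(f"knee pain:      sensitivity={sens_knee:.2f}, specificity={spec_knee:.2f}")
```

With these assumed counts the item screens equally well for both criteria, but its specificity drops once it is asked to identify the type of condition.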

Unidimensionality (point #11)
The IRT goal of unidimensionality is hard to apply in many areas of health measurement. Some topics are hierarchical; symptoms of depression (e.g.) are not, so in IRT analyses, depression or anxiety scales often do not meet the unidimensionality criterion
Unidimensionality is chiefly important for clinical interpretation & maybe evaluation; not the issue here. Surveys focus on how bad it is, not what it is
If the instrument will be scored as an index, the issue of unidimensionality becomes irrelevant, as all the items are combined and it’s impossible to visualize the person’s disability anyway
There is an inherent tension between using generic, screening-type items (e.g., IADLs) and unidimensionality
Many functions involve more than one body system (e.g., recognizing a face across the street), so are not unidimensional

The Time Frame Debate
WP 1 says “present”; WP 3 much broader (& varied)
If the sample is large, could use “yesterday” to get prevalence, but this will not tell incidence, or duration of condition
Duration requires additional questions, as does change
Width of time window not very important: the average is just calculated over a shorter or longer time
Suggest one week (to capture weekends, etc.) or else “yesterday” (as today is incomplete)
Problem! Change is only captured if additional questions are asked, so can’t distinguish A from B
[Figure: sampling window, with cases labelled A, B and C]
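
The point that a window average hides change can be sketched with a toy calculation. The three trajectories and the 0 to 6 scoring below are assumptions made up for the example; they are not necessarily the cases A, B and C from the slide's figure.

```python
from statistics import mean

# Hypothetical daily limitation scores (0 = none, 6 = severe), oldest day first.
trajectories = {
    "stable":    [3, 3, 3, 3, 3, 3, 3],
    "improving": [6, 5, 4, 3, 2, 1, 0],
    "worsening": [0, 1, 2, 3, 4, 5, 6],
}

def window_average(scores, days):
    """Average score over the most recent `days` days."""
    return mean(scores[-days:])

for name, scores in trajectories.items():
    print(
        name,
        "one-week average:", window_average(scores, 7),
        "yesterday:", window_average(scores, 1),
    )
```

All three respondents report the same one-week average, and a "yesterday" question gives only a snapshot; neither says whether the person is improving or worsening unless a separate question about change is asked.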

Time Window & Response Shift (Point #13)
Larger time windows, and phrasing in terms of “usual”, can face the issue of response shift (recalibration of the person’s view of what is “normal”)
“Usual” phrasing seems most problematic: may miss chronic disabilities (cf. criticism of GHQ); cannot record incidence, maybe not even prevalence
[Figure, Response Shift: actual trajectory vs. perception of “usual” function; the typical delay varies according to a range of factors]

Continuous States vs. Episodic Events
Mobility limitations often endure. By contrast, pain, anxiety or marital disputes are commonly episodic
Averaging over a broad time window can be an issue for episodic events (point #15), because averaging episodes raises the issue of frequency vs. intensity of events (see next slide)
In general, time & averaging is less of an issue for capacity than for performance, because capacity is enduring while performance may fluctuate
However, the notion of capacity is hard to apply to pain, anxiety and depression (for which wording a question in capacity terms tends to approximate performance)

Combining Severity & Frequency (e.g., anxiety questions: point 76; pain, point 97)
Risk of trying to do too much. The problem of summarizing frequency & severity grows with increasing length of retrospection. If “yesterday” is used, you need only ask about severity
The term “level” (“How would you describe your level of anxiety?”) is unclear: presumably some combination of severity & frequency of episodes, but how does the respondent combine these?
Options:
  PhD level: “We want you to judge the overall amount of pain, considering both intensity and frequency, you have experienced …”
  Simpler: “How bad was your pain?” Mild, moderate, severe…
[Figure: two episode patterns compared (“versus?”); horizontal axis: time]
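
To make the "how does the respondent combine these?" question concrete, the sketch below summarizes two invented one-week pain diaries in three ways. The diaries, the 0 to 10 scale and the three summary rules are all assumptions for illustration, not taken from the working paper.

```python
from statistics import mean

# Hypothetical daily pain ratings (0 = no pain, 10 = worst imaginable).
one_severe_day = [0, 0, 0, 8, 0, 0, 0]   # a single intense episode
mild_every_day = [2, 2, 2, 2, 2, 2, 2]   # constant low-level pain

def summarize(daily_ratings):
    """Three candidate ways to turn a week of ratings into one 'level'."""
    return {
        "days_with_pain": sum(1 for r in daily_ratings if r > 0),  # frequency
        "peak": max(daily_ratings),                                # severity
        "mean": mean(daily_ratings),                               # average over window
    }

print("one severe day:", summarize(one_severe_day))
print("mild every day:", summarize(mild_every_day))
```

Which respondent has the higher "level" depends on the rule: the first is worse by peak severity, the second by frequency and by the weekly mean. A one-day recall period sidesteps the choice, since only severity remains to be reported.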

Response options: Frequency vs. Difficulty (point # 30)
For chronic conditions, evidently intensity responses are more appropriate
For fluctuating conditions (insomnia, depression), frequency seems most appropriate
For brief recall periods, use intensity responses
For longer-term recall, use frequency
Also, need to decide on relative vs. absolute responses. E.g., “do you have difficulty keeping up with people your own age?”
Likewise, do we specify “level ground” for walking, or “where you live”? The first is close to disability and may not be relevant to respondents; the second (handicap) will be relevant but may make direct comparisons difficult

Discuss Structure of Overall Instrument
Can it be made dynamic? Item banking; tailored responses; computer administration or using skip patterns.
Some examples:
  Cella: Ware JE et al. Item banking and the improvement of health status measures. Quality of Life Newsletter 2004; Fall (Special Issue):2-5.
  Bjorner JB et al. Using item response theory to calibrate the Headache Impact Test (HIT) to the metric of traditional headache scales. Qual Life Res 2003; 12:
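
One way a "dynamic" instrument can work is adaptive item selection from a calibrated item bank. The sketch below is a minimal illustration that assumes a Rasch (one-parameter IRT) model; the slides do not specify a model, and the item names and difficulty values are hypothetical.

```python
import math

def rasch_probability(theta, difficulty):
    """Probability of endorsing an item under a Rasch model (assumed here)."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def item_information(theta, difficulty):
    """Fisher information of a Rasch item at ability level theta: p * (1 - p)."""
    p = rasch_probability(theta, difficulty)
    return p * (1.0 - p)

def next_item(theta, item_bank, asked):
    """Pick the not-yet-asked item that is most informative at theta."""
    candidates = {name: b for name, b in item_bank.items() if name not in asked}
    return max(candidates, key=lambda name: item_information(theta, candidates[name]))

# Hypothetical mobility item bank: item name -> difficulty on the latent scale.
item_bank = {"walk_100m": -1.0, "climb_stairs": 0.0, "run_1km": 1.5}

current_estimate = 0.2  # provisional estimate of the respondent's function
first = next_item(current_estimate, item_bank, asked=set())
second = next_item(current_estimate, item_bank, asked={first})
print(first, second)
```

Under this model the most informative item is the one whose difficulty lies closest to the current estimate, so each respondent answers only items near their own level. A full implementation would also re-estimate the respondent's level after every answer, which is omitted here.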

Reference for upper level of function
Best possible function
Compared to your potential
Compared to average person of your age
Without difficulty
To adjust for age or not?

Prosthetics, Analgesics, etc. (points 20-25)
Rocks & hard places…
Without aids approximates impairment; with aids = disability
But this distinction is hard to make in ICF: ‘activity’ and ‘participation’ both sound like performance rather than capacity
Not quite clear why eyeglasses are singled out for inclusion, while walking sticks apparently are not
Asking an amputee about mobility without his prosthesis seems artificial (point #21)
Likewise, if respondents are taking effective analgesics, it’s hard for them to report pain without them (points #24 & 25)
If the purpose is to indicate health states in this nation, suggest the approach of “using any aids you normally use.”
Suggest not relying on use of analgesics as a way to indicate severity (point #22), because availability will vary greatly

Visual Analogue Scales
In clinical settings, VAS and NRS pain ratings intercorrelate highly. Verbal scales correlate with both, but less closely
VAS is visual, so implies use of paper & pencil
If used in telephone format, VAS reduces to an NRS, so why not just use NRS?
Less educated and older patients appear to find NRS easier than VAS, so NRS have been endorsed for use in cancer trials (Moinpour et al., J Natl Cancer Inst 1989; 81: )
The FLIC began with VAS, but changed to a 6-pt NRS
However, the VAS can be very responsive (e.g., Hagen et al, J Rheumatol 1999; 26: ). But do we need responsiveness?
Many alternative formats, including graphic rating scale (Dalton et al, Cancer Nurs 1998; 21:46-49) or box scale (Jensen et al, Clin J Pain 1998; 14: ). See also Cella & Perry, Psychol Rep 1986; 59: , and Scott & Huskisson, Pain 1976; 2:

Anxiety & Depression
Trying to discriminate between these may focus attention on the trees rather than the forest
Unitary theory sees A & D as expressions of the same pathology; the opposing perspective sees them as fundamentally different, while the compromise is to view them as having common roots but different expressions (Brown et al, J Abnorm Psychol 1998; 107: ).
Anxiety suggests arousal and an attempt to cope with a situation; depression suggests lack of arousal and withdrawal: the NE and SE quadrants of the diagram (next slide)
An anxious person might say “That terrible event is not my fault but it may happen again, and I may not be able to cope with it but I’ve got to be ready to try.”
A depressed person might say “That terrible event may happen again and I won’t be able to cope with it, and it’s probably my fault anyway so there’s really nothing I can do.”
(Barlow DH. The nature of anxiety: anxiety, depression, and emotional disorders. In: Rapee RM, Barlow DH, eds. Chronic anxiety: generalized anxiety disorder and mixed anxiety-depression. New York: Guilford, 1991: 1-28)

[Figure: A circumplex model of affect.
Pole labels: High positive affect; Low positive affect; High negative affect; Low negative affect; Pleasantness; Unpleasantness; Strong engagement; Disengagement.
Adjective clusters: content, happy, satisfied; active, elated, excited; aroused, astonished, concerned; relaxed, calm, placid; distressed, fearful, hostile; sad, lonely, withdrawn; sluggish, dull, drowsy; inactive, still, quiet.
Depression and Anxiety are marked on the diagram.]

Emotions & Affect: scattered thoughts
How to fit affect within the capacity / performance distinction? Many anxiety questions use either state or performance wordings (“How severe was your anxiety?” or “Did anxiety limit your daily activities?”)
Why try to distinguish anxiety and depression?
Not completely clear why we need both positive and negative affect (point #68): if the time frame is correctly chosen, they should not be orthogonal
A phrase such as “upset or distressed” may capture general affect quite well
Stress may also be pertinent: cf. DASS of Lovibond (Manual for the Depression Anxiety Stress Scales. Sydney: Psychology Foundation, 1995)