Some Considerations on Question Design

These slides offer some comments on more challenging aspects of questionnaire design. They are mainly questions for further consideration when you actually get involved in developing a measurement tool: more questions than answers!

Different purposes for measurements imply different ways to write the questions. For example:
Evaluative measures:
- Content should be tailored to the intervention; usually not comprehensive
- Must be sensitive to change produced by the intervention
- Focused & fine-grained: select indicators that sample densely from the relevant level of severity; should be unidimensional
- Consider whether you need to have a summary score

versus

Descriptive measures:
- Content often broad-ranging
- Goal = to classify groups
- Choose themes of interest to people in general (“quality of life”, etc.), or issues of public concern
- Should it emphasize modifiable themes?
- Do you want a profile or an index?

Sensitivity & Specificity vs. Brevity
These are in tension! The need for a brief instrument implies:
- If the goal is broad coverage of domains (as in a descriptive measure), there can only be a few items in each domain, so the instrument lacks detail
- To achieve breadth within a domain in few items, we need to use generic items (e.g., “can you cut your toenails?”)
- This can achieve sensitivity as a screen, but cannot classify the type of condition, and specificity may be low
- Generic items also lose interpretability and unidimensionality. For example, is it knee pain, muscle weakness, or balance that limits walking ability?
Do you want a descriptive or a diagnostic instrument?
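
To make the two screening properties concrete, sensitivity and specificity can be computed from a 2×2 table of screening response against true status. The following is a minimal Python sketch; the function name and the counts are invented for illustration and do not come from any real instrument.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Return (sensitivity, specificity) from the counts of a 2x2 table.

    tp: have the condition, item positive   fn: have the condition, item negative
    fp: no condition, item positive         tn: no condition, item negative
    """
    sensitivity = tp / (tp + fn)  # share of true cases the item picks up
    specificity = tn / (tn + fp)  # share of non-cases the item correctly clears
    return sensitivity, specificity


# Hypothetical counts for a generic item used as a screen
sens, spec = sensitivity_specificity(tp=45, fn=5, fp=60, tn=90)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
# prints: sensitivity=0.90, specificity=0.60
```

With these made-up counts the generic item catches most cases but also flags many people without the condition, which is the pattern the slide describes.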

Unidimensionality
The IRT goal of unidimensionality is hard to apply in health measurement.
- A few health topics are hierarchical; others (e.g. symptoms of depression) are not. Depression or anxiety scales generally do not meet the IRT unidimensionality criterion
- Many functions involve more than one body system (e.g., recognizing a face across the street), so are not unidimensional
- Unidimensionality is chiefly important for clinical interpretation & perhaps evaluation; it is not an issue in a survey. Surveys focus on how bad it is, not what it is
- If an instrument (e.g. SF-36, physical score) will be scored as an index, unidimensionality becomes less relevant as all the items are combined. The index gives the appearance of a single dimension, but you cannot visualize the person’s disability from the score
- There is an inherent tension between screening items (which emphasize sensitivity, hence breadth) and unidimensionality

What time frame to choose?
Some questions refer to the “present”.
- These can indicate prevalence, but not incidence or duration of a condition
- Duration requires additional questions, as does change
- Width of the time window is not very important: the average is just calculated over a shorter or longer time
- Suggest one week (to capture weekends, etc.) or else “yesterday” (as today is incomplete)
[Figure: trajectories A, B and C crossing a sampling window.]
Problem! Change is only captured if additional questions are asked, so a single sampling window can’t distinguish A from B
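
As a hedged illustration of the sampling-window problem, the Python sketch below gives two hypothetical people the same score inside the window but different histories. The trajectories, their shapes and the function names are assumptions made up for this example; the shapes drawn in the original figure are not recoverable.

```python
# Hypothetical weekly function scores (0-100). The final value is the one
# that falls inside the sampling window of a "present state" question.
trajectories = {
    "A": [60, 60, 60, 60],  # assumed stable
    "B": [90, 80, 70, 60],  # assumed declining
}


def present_state(scores):
    """What a single question about the present would record."""
    return scores[-1]


def change(scores):
    """Change over the period; needs an extra question or an earlier measurement."""
    return scores[-1] - scores[0]


snapshot = {name: present_state(s) for name, s in trajectories.items()}
changes = {name: change(s) for name, s in trajectories.items()}

print(snapshot)  # A and B are identical inside the window
print(changes)   # only the change score separates them
```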

Time Window & Response Shift
Longer time windows, and phrasing in terms of “usual activity”, can cause a response shift. Here people recalibrate what they see as “normal” because they still see their usual function in terms of the way it was before they got sick.
- “Usual” phrasing can be problematic: it may miss chronic disabilities that people have adapted to (cf. criticism of GHQ); it cannot record incidence, maybe not even prevalence
[Figure: Response shift. Perception of “usual” function plotted against the actual trajectory (person still not accepting disability).]
- The delay between functional change & perception varies according to a range of factors

Continuous States vs. Episodic Events
Mobility limitations often endure. By contrast, pain, anxiety or marital disputes are commonly episodic.
- Averaging over a broad time window can be an issue for episodic events, because averaging episodes raises the issue of frequency vs. intensity of events (see next slide)
- In general, time & averaging are less of an issue for capacity than for performance, because capacity is enduring while performance may fluctuate
- However, the notion of capacity is hard to apply to pain, anxiety and depression (for which wording a question in capacity terms tends to approximate performance)

Combining Severity & Frequency (1)
[Figure: one pattern of episodes versus another, plotted against time.]
Health measures often try to cover both the frequency of the problem and its severity when it occurs. Beware of trying to do too much!
- The challenge grows with increasing length of retrospection. If “yesterday” is used, you can only ask about severity
- Think carefully and specify which you wish to ask about
- One option is to use “level” (“How would you describe your level of anxiety?”) but this is unclear. It does not tell respondents how they are to combine severity & frequency of episodes
Options:
- “Please judge the overall amount of pain, considering both intensity and frequency, you have experienced …”
- Simpler: “How bad was your pain? Mild, moderate, severe…”
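
The difficulty with an undefined “level” can be illustrated with a small Python sketch. The two pain diaries are invented data, and the `summarize` helper and its `threshold` parameter are hypothetical; the point is only that a single average hides the difference between frequency and intensity.

```python
from statistics import mean


def summarize(daily_scores, threshold=1):
    """Summarize one week of daily 0-10 pain scores.

    A day counts as an episode if its score is at least `threshold`
    (an arbitrary cut-off chosen for this illustration).
    """
    episode_days = [s for s in daily_scores if s >= threshold]
    return {
        "mean": mean(daily_scores),                # what "overall level" averages to
        "frequency": len(episode_days),            # number of days with pain
        "mean_intensity": mean(episode_days) if episode_days else 0,
    }


constant_mild = [2, 2, 2, 2, 2, 2, 2]  # mild pain every day
rare_severe = [0, 7, 0, 0, 7, 0, 0]    # two severe days, otherwise none

a = summarize(constant_mild)
b = summarize(rare_severe)
print(a)
print(b)
```

Both diaries average 2 out of 10, yet one describes constant mild pain and the other two severe days, which is why the question should state which aspect the respondent is meant to report.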

Frequency vs. Intensity (2)
- For chronic conditions, intensity responses are obviously more appropriate
- For fluctuating conditions (insomnia, depression), frequency seems most appropriate
- For brief recall periods, use intensity responses
- For longer-term recall, use frequency phrasing
Also, need to decide on relative vs. absolute responses.
- Expectations for health vary: e.g., “do you have difficulty keeping up with people your own age?”
- Likewise, do we specify “level ground” for walking, or “where you live”? The first is close to disability and may not be relevant to the respondent; the second (handicap) will be relevant but may make direct comparisons difficult.

Prosthetics, Analgesics, etc.
Decide how to handle use of aids and medications…
- Without aids approximates impairment; with aids = disability
- But this distinction is hard to make in ICF: ‘activity’ and ‘participation’ both sound like performance rather than capacity
- Not clear why eyeglasses are often included, while walking sticks may not be (in some scales, at least)
- Suggest “using any aids you normally use.” Asking an amputee about mobility without his prosthesis seems artificial
- Likewise, if respondents are taking effective analgesics, consider how you will record pain levels
- Probably do not rely on use of analgesics as a way to indicate severity, because availability will vary greatly

Visual Analogue Scales
- In clinical settings, VAS and NRS pain ratings intercorrelate highly. Verbal scales correlate with both, but less closely
- VAS is visual, so implies use of paper & pencil (or computer mouse)
- If used in telephone format, VAS reduces to an NRS, so just use NRS
- Less educated and older patients appear to find NRS easier than VAS, so NRS formats have been endorsed for use in cancer trials (Moinpour et al., J Natl Cancer Inst 1989; 81: )
- The FLIC began with VAS, but changed to 6-pt NRS
- However, the VAS can be very responsive (e.g., Hagen et al, J Rheumatol 1999; 26: ). But do you need responsiveness?
- Many alternative VAS formats, including graphic rating scale (Dalton et al, Cancer Nurs 1998; 21:46-49) or box scale (Jensen et al, Clin J Pain 1998; 14: ). See also Cella & Perry, Psychol Rep 1986; 59: , and Scott & Huskisson, Pain 1976; 2:

Anxiety & Depression
Trying to discriminate between these may focus attention on the trees rather than the forest.
- Unitary theory views A & D as expressions of the same pathology; the opposing perspective sees them as fundamentally different, while the compromise is to view them as having common roots but different expressions (Brown et al, J Abnorm Psychol 1998; 107: ).
- Anxiety suggests arousal and an attempt to cope with a situation; depression suggests lack of arousal and withdrawal: the NE and SE quadrants of a circumplex diagram (next slide)
- An anxious person might say “That terrible event is not my fault but it may happen again, and I may not be able to cope with it but I’ve got to be ready to try.”
- A depressed person might say “That terrible event may happen again and I won’t be able to cope with it, and it’s probably my fault anyway so there’s really nothing I can do.”
(Barlow DH. The nature of anxiety: anxiety, depression, and emotional disorders. In: Rapee RM, Barlow DH, eds. Chronic anxiety: generalized anxiety disorder and mixed anxiety-depression. New York: Guilford, 1991: 1-28)

A circumplex model of affect
[Figure: circumplex diagram with eight poles and their mood descriptors. The labels “Anxiety” and “Depression” are also marked on the diagram.]
- High positive affect: active, elated, excited
- Strong engagement: aroused, astonished, concerned
- High negative affect: distressed, fearful, hostile
- Unpleasantness: sad, lonely, withdrawn
- Low positive affect: sluggish, dull, drowsy
- Disengagement: inactive, still, quiet
- Low negative affect: relaxed, calm, placid
- Pleasantness: content, happy, satisfied
Note (April 8, 2004): This is intended to go with the discussion on Bradburn’s 2-factor model.
Source: Adapted from Tellegen A. Structures of mood and personality and their relevance to assessing anxiety, with an emphasis on self-report. In Tuma AH, Maser JD (eds.) Anxiety and the anxiety disorders. Hillsdale, NJ: Erlbaum, 1985.

Emotions & Affect: scattered thoughts
- How to fit affect within the capacity / performance distinction? Many anxiety questions use either state or performance wordings (“How severe was your anxiety?” or “Did anxiety limit your daily activities?”)
- Why try to distinguish anxiety and depression?
- Not completely clear why we need both positive and negative affect: if the time frame is correctly chosen, they should not be orthogonal
- A phrase such as “upset or distressed” may capture general affect quite well
- Stress may also be pertinent: cf. Lovibond’s DASS (Manual for the Depression Anxiety Stress Scales. Sydney: Psychology Foundation, 1995)
