Discussion
Alan Zaslavsky, Harvard Medical School
Fabrication as a Statistical Procedure
Fabrication is like imputation
– Duplication is like hot deck
– Duplication with random modifications is like multiple imputation (see the sketch below)
– Duplication is like weight modification
Fabrication is a multilevel process
– Interview, interviewer, area, … project level
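To make the analogy concrete, here is a minimal sketch of "duplication with random modifications": copy a donor interview and perturb a few item responses. The record, item names, and perturbation rule are all hypothetical illustrations, not anything specified in the talk.

```python
# Hypothetical sketch: duplicating a donor record with small random modifications,
# the fabrication analogue of hot-deck copying / multiple imputation noted above.
import random

def duplicate_with_modifications(donor, n_copies=3, perturb_rate=0.1):
    """Copy a donor interview, randomly nudging a few categorical responses."""
    copies = []
    for _ in range(n_copies):
        copy = dict(donor)
        for item, value in donor.items():
            if isinstance(value, int) and random.random() < perturb_rate:
                # shift the response category by one, staying in the 1-4 range
                copy[item] = max(1, min(4, value + random.choice([-1, 1])))
        copies.append(copy)
    return copies

donor = {"q1": 3, "q2": 1, "q3": 4, "q4": 2}
print(duplicate_with_modifications(donor))
```

Detection methods exploit exactly this structure: the copies agree with the donor on nearly every item.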
Fabrication as a Game
Payoffs/risks to fabricator
– Reduce effort while receiving payment
– Risks greater for higher-level organization/person
Detection/deterrence
Costs/risks to data purchaser
– Paying more for less information
– Wrong decisions
– Loss of credibility (cliff loss function)
Risks may change with greater expertise on either side
Assumptions about Fabricators
Fabricators are not very sophisticated
– No fancy synthesis models
Fabricators are not trying to work hard
– Falsifying must be easier than data collection
– Will not know how to “beat” moderately sophisticated detection techniques
If fabricators try harder …
– Good standard synthesis methods could be hard to detect
– Learning on both sides
Fabrication on the Continuum of Survey Management
Related to other survey errors at scale
– Inadequately designed survey questions and tools
  Not adapted to conditions under which survey fielded
– Interviewer errors
  Misinterpretation of questions, procedures
  Interpersonal interview technique
  Training and motivation
  Monitoring of “honesty”, accuracy, technique
Detection techniques
Good survey management
– Timely, at all levels
– Recruitment, observation
– Metadata and paradata
Post-survey analysis
– Replication of survey: interpenetrating samples
– Subject-matter expertise
– Statistical outliers (single and patterns; see the sketch below)
Earlier is better
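As a rough illustration of the "statistical outliers" bullet, the sketch below flags interviewers whose paradata summary sits far from the rest of the field staff. The paradata values, interviewer IDs, and z-score cutoff are all invented for illustration; they are not from the presentation.

```python
# Hypothetical sketch: single-statistic outlier screen on interviewer paradata.
from statistics import mean, stdev

def flag_outliers(stat_by_interviewer, z_cut=2.0):
    """Flag interviewers whose summary statistic is far from the group mean."""
    values = list(stat_by_interviewer.values())
    mu, sd = mean(values), stdev(values)
    return {iw: round((v - mu) / sd, 2)
            for iw, v in stat_by_interviewer.items()
            if abs(v - mu) > z_cut * sd}

# Made-up mean interview length (minutes) per interviewer.
mean_minutes = {"A01": 24, "A02": 25, "A03": 23, "A04": 26,
                "A05": 24, "A06": 25, "A07": 7}
print(flag_outliers(mean_minutes))  # very short interviews stand out
```

The same screen can be run on item nonresponse or "don't know" rates; patterns across several statistics are usually more informative than any single one.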
Regina Faranda
Extensive checking
– Subject-matter and survey expertise
– Checklist: QC
Statistical assumptions?
– Can be stated and tested
Rita Thissen
Detailed specifics of monitoring and detection systems
– Technology: CARI, CAPI, …
(Anecdotes rarely heard)
Mike Robbins
Duplicate detection is like record linkage
– Likelihood ratio
Duplicate detection also important in other settings
– US Census (2000?): match 330M × 330M possible record pairs
Robbins – Duplicate detection
Duplicate detection is like record linkage
– Likelihood ratio (see the sketch below)
Duplicate detection also important in other settings
– US Census (2000?): match 330M × 330M possible record pairs
Would models be different for fabricated data, processing errors, repeated real interviews?
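One way to read the record-linkage analogy is a Fellegi-Sunter-style score: each compared field contributes a log likelihood-ratio weight, and the pair's total is compared to a threshold. The field names and the m/u probabilities below are made up for illustration; the slide does not give Robbins's actual model.

```python
# Sketch of a Fellegi-Sunter-style likelihood-ratio score for one candidate pair.
# Field names and m/u probabilities are illustrative assumptions.
from math import log

# m = P(field agrees | true duplicate), u = P(field agrees | different respondents)
FIELDS = {
    "last_name":  {"m": 0.95, "u": 0.01},
    "birth_year": {"m": 0.98, "u": 0.05},
    "zip_code":   {"m": 0.90, "u": 0.02},
}

def match_score(rec_a, rec_b):
    """Sum log likelihood-ratio weights over fields; higher means more likely a duplicate."""
    score = 0.0
    for field, p in FIELDS.items():
        if rec_a.get(field) == rec_b.get(field):
            score += log(p["m"] / p["u"])
        else:
            score += log((1 - p["m"]) / (1 - p["u"]))
    return score

a = {"last_name": "Lee", "birth_year": 1950, "zip_code": "02115"}
b = {"last_name": "Lee", "birth_year": 1950, "zip_code": "02116"}
print(match_score(a, b))  # compare against a threshold tuned to the error tradeoff
```

At Census scale, blocking would be needed first, since scoring all 330M × 330M pairs directly is infeasible.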
Example: Medicare CAHPS survey
Pulled ~5000 responses (out of ~400K/year)
Examined 27 substantive items (see the agreement sketch below)
Complex features
– Substantial amount of screening/skipped items
– Multiple-choice items
– Blocks of closely related items
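A minimal sketch of the all-pairs agreement calculation the next two slides summarize, assuming hypothetical item names and toy responses: for each pair of respondents, count agreement only on items both answered, then rank pairs by their agreement rate.

```python
# Hypothetical sketch: pairwise agreement on substantive items, skips excluded.
from itertools import combinations

def agreement_rate(resp_a, resp_b, items):
    """Share of jointly answered items with identical responses (None = skipped)."""
    answered = [it for it in items
                if resp_a.get(it) is not None and resp_b.get(it) is not None]
    if not answered:
        return 0.0
    return sum(resp_a[it] == resp_b[it] for it in answered) / len(answered)

ITEMS = [f"q{i}" for i in range(1, 28)]          # e.g., 27 substantive items
r1 = {it: (i % 4) + 1 for i, it in enumerate(ITEMS)}
r2 = dict(r1); r2["q27"] = None                  # near-duplicate of r1, one item skipped
r3 = {it: ((i * 3) % 4) + 1 for i, it in enumerate(ITEMS)}
responses = {"r1": r1, "r2": r2, "r3": r3}

pairs = sorted(((agreement_rate(responses[a], responses[b], ITEMS), a, b)
                for a, b in combinations(responses, 2)), reverse=True)
print(pairs[:3])  # the highest-agreement pairs are candidate duplicates
```

With ~5000 responses there are roughly 12.5 million pairs, so in practice the comparison would be vectorized or restricted, but the logic is the same.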
Agreement – all pairs
Best agreement: duplicates?
Conclusions
Know your data and survey methodology
Thanks to speakers for sharing their experience and methods