1
CESSDA Expert-Seminar 2004
Data Processing and Publishing
Enhancing Dataset Quality - Plausibility: strategies of data checking
Meinhard Moschner, ZA, Cologne
2
CESSDA Expert-Seminar 2004
Plausibility - Strategies of Data Checking

“The quality of the survey is further raised by plausibility checks of the data which test the logical consistency of the answers. However, due to the rigidity of German data protection laws there is no check of the actual content of the data, for example by comparison with former surveys or other data sources.” (Mikrozensus)

“The data measured in the NFI need to be plausible; that is all measurement values had to be within the defined value range and no inadmissible codes could be used. The attribute combinations had to be meaningful and admissible.” (Swiss National Forest Inventory)

“Basically, data and metadata should be consistent and plausible.” (Dataset Processing at SIDOS)
3
(Internal) consistency
  case / datum level
  admitted value range / undocumented codes
  follow-up questions / filter conditions (explicit)
  marginal frequencies
  cross-tabulations
  questionnaire / codebook / report as reference
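A minimal sketch of the explicit checks at case level, using only the Python standard library. The variable names, codes, and the filter rule (`work_hours` is only asked when `employed == 1`) are hypothetical and stand in for whatever the questionnaire and codebook actually document.

```python
from collections import Counter

# Hypothetical codebook: variable name -> set of documented codes.
CODEBOOK = {
    "employed": {1, 2, 9},                          # 1 = yes, 2 = no, 9 = no answer
    "work_hours": set(range(0, 100)) | {997, 999},  # 997 = not applicable, 999 = no answer
}

def undocumented_codes(records, codebook):
    """Return (row index, variable, value) for every value not in the codebook."""
    problems = []
    for i, rec in enumerate(records):
        for var, allowed in codebook.items():
            if rec.get(var) not in allowed:
                problems.append((i, var, rec.get(var)))
    return problems

def filter_violations(records):
    """Return row indices that break the assumed filter:
    work_hours must be answered if and only if employed == 1."""
    bad = []
    for i, rec in enumerate(records):
        asked = rec["employed"] == 1
        answered = rec["work_hours"] != 997
        if asked != answered:
            bad.append(i)
    return bad

def marginal_frequencies(records, var):
    """Count how often each code of one variable occurs."""
    return Counter(rec[var] for rec in records)

records = [
    {"employed": 1, "work_hours": 40},
    {"employed": 2, "work_hours": 997},
    {"employed": 2, "work_hours": 38},   # answered although filtered out
    {"employed": 3, "work_hours": 997},  # code 3 is not documented
]

print(undocumented_codes(records, CODEBOOK))
print(filter_violations(records))
print(marginal_frequencies(records, "employed"))
```

The marginal frequencies are the quickest first look: an undocumented code shows up there as an unexpected category before any case-level listing is needed.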
4
Implicit consistency
  non-meaningful attribute combinations
    logical
    empirical
  cross-tabulations
  selected independent variables (demographics, facts)
  general ideas about empirical reality
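Implicit consistency has no filter instruction to check against, so the rules have to come from the analyst. The sketch below cross-tabulates two variables and flags attribute combinations that are logically impossible or empirically unlikely. The variables and both rules are hypothetical examples, not rules taken from any particular study.

```python
from collections import Counter

def crosstab(records, row_var, col_var):
    """Count cases for every (row value, column value) combination."""
    return Counter((rec[row_var], rec[col_var]) for rec in records)

# Hypothetical rules: (label, predicate that is True for a suspicious case).
RULES = [
    # logical: nobody can hold a job longer than they have been alive
    ("job_years exceeds age", lambda r: r["job_years"] > r["age"]),
    # empirical: possible in principle, but worth a second look
    ("under 18 with university degree",
     lambda r: r["age"] < 18 and r["education"] == "university"),
]

def flag_implausible(records, rules):
    """Return (row index, rule label) for every case matching a rule."""
    return [
        (i, label)
        for i, rec in enumerate(records)
        for label, rule in rules
        if rule(rec)
    ]

records = [
    {"age": 45, "education": "secondary", "job_years": 20},
    {"age": 16, "education": "university", "job_years": 0},
    {"age": 30, "education": "university", "job_years": 35},
]

print(crosstab(records, "education", "age"))
print(flag_implausible(records, RULES))
```

Cases flagged by an empirical rule are candidates for review rather than errors; only the logical rule identifies values that cannot both be right.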
5
Plausibility on figures level
  aggregate data level
  conformance to independent reference data
    statistical records (facts)
    comparable space or time instances (+ attitudes)
  limited scope of error discovery (suspicious aggregates)
    method defects
    differences / changes in social reality
  domain-specific knowledge required
  useful strategy in data integration or cumulation
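At aggregate level the check compares survey shares with an independent reference, for example official statistics. The following sketch assumes hypothetical categories, shares, and a tolerance of five percentage points; a suitable tolerance depends on sample size and on the domain.

```python
def suspicious_aggregates(sample_shares, reference_shares, tolerance=0.05):
    """Return categories whose sample share deviates from the reference
    share by more than `tolerance` (both given as proportions)."""
    flagged = {}
    for category, ref in reference_shares.items():
        obs = sample_shares.get(category, 0.0)
        if abs(obs - ref) > tolerance:
            flagged[category] = (obs, ref)
    return flagged

# Hypothetical regional shares: survey sample vs. reference statistics.
sample = {"north": 0.30, "south": 0.45, "west": 0.25}
reference = {"north": 0.31, "south": 0.24, "west": 0.45}

print(suspicious_aggregates(sample, reference))
```

In this made-up example two categories deviate by roughly the same amount in opposite directions, a pattern that could point to exchanged codes. It could equally reflect a method defect or a real change, which is why the result is only a prompt for closer inspection.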
6
Plausibility on model level
  analysis level
  conformance to other investigators’ results
  exploratory analysis
  …
  user feedback
7
Example 1: Eurobarometer 55 to 58
  error in the original data detected by plausibility check after time series integration
  changed original category order not considered during cumulation?
  regions (ex-)changed in the British field instrument? (show cards not available)
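One way to catch a changed category order before cumulation is to compare the code-to-label mappings of all waves against a reference wave. This is a sketch only: the wave names and region labels are invented, and it assumes that value labels are available for each wave in machine-readable form.

```python
def label_mismatches(wave_labels):
    """Compare every wave's code -> label mapping with the first wave.
    Returns {wave: {code: (reference label, wave label)}} for differences."""
    waves = iter(wave_labels.items())
    _, reference = next(waves)  # the first wave serves as the reference
    mismatches = {}
    for wave, labels in waves:
        diff = {
            code: (reference.get(code), labels.get(code))
            for code in sorted(set(reference) | set(labels))
            if reference.get(code) != labels.get(code)
        }
        if diff:
            mismatches[wave] = diff
    return mismatches

# Hypothetical value labels of a region variable in three waves.
wave_labels = {
    "wave_1": {1: "Region A", 2: "Region B", 3: "Region C"},
    "wave_2": {1: "Region A", 2: "Region B", 3: "Region C"},
    "wave_3": {1: "Region A", 2: "Region C", 3: "Region B"},
}

print(label_mismatches(wave_labels))
```

A label comparison only helps when the labels themselves were changed. If the field instrument changed while the labels stayed the same, the error surfaces only in the distributions.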
8
Example 2: ISSP 1985
  error detected by plausibility check after integration of country data sets
  country-specific differences in attitudes?
  scale asked the other way around?
  labels were adapted to the standard but data not recoded by the national data producer?
  deviating translations in German and Austrian field questionnaires: ‘Organising protest marches which prevent the traffic’ (back-translated)

Q.3 There are many ways people or organisations can protest against a government action they strongly oppose. Please show which you think should be allowed and which should not be allowed by ticking a box on each line.
Q.3c Organising protest marches and demonstrations
  1. Definitely allowed
  2. Probably allowed
  3. Probably not allowed
  4. Definitely not allowed
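A reversed scale can be screened for after integration by asking whether a country's mean fits the other countries better once the scale is mirrored. The sketch below assumes a 1 to 4 scale as in Q.3c; the country labels and responses are invented, and comparing against the median of the other countries' means is just one simple heuristic.

```python
from statistics import mean, median

def reversed_scale_candidates(responses_by_country, low=1, high=4):
    """Flag countries whose mean is closer to the median of the other
    countries' means after mirroring the scale (low + high - value)."""
    means = {c: mean(v) for c, v in responses_by_country.items()}
    flagged = []
    for country, own in means.items():
        ref = median(m for c, m in means.items() if c != country)
        mirrored = (low + high) - own
        if abs(mirrored - ref) < abs(own - ref):
            flagged.append(country)
    return flagged

# Hypothetical responses (1 = definitely allowed ... 4 = definitely not allowed).
responses = {
    "A": [1, 1, 2, 2, 1],
    "B": [1, 2, 2, 1, 2],
    "C": [4, 3, 4, 4, 3],  # looks like the mirror image of the others
    "D": [2, 1, 2, 1, 1],
}

print(reversed_scale_candidates(responses))
```

A flag from this heuristic cannot distinguish a reversed scale from a real country-specific difference in attitudes or from a deviating translation. It only marks where the field questionnaire and the processing history need to be checked.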
9
Example 3: Eurobarometer 44.1 and 51.0
  error detected by plausibility check after time series integration (trend file)
  clear pattern over time
  confusion in standard re-coding which was not always well documented
10
What to do?
  contact principal investigator
  review original documentation and back dated data processing
  correct data if evidence is obtained
  keep as much valid information as possible
  produce as much comparability as possible
  in case of conflict or little evidence: leave decision to the user
  documentation is fundamental!