Data validation at DESTATIS
ESTP course on Data Validation, Item 12 – German validation system
Federal Production: Validation; Data Collection; Data Processing and Analysis; Publication
© Federal Statistical Office of Germany (Destatis) | Department C3 19.02.2019
Common data formats (XML)
DatML/RAW: Micro data
DatML/SDF: Data model; Range checks
DatML/ASK: Web form
DatML/EDT: Hierarchical data structures; Validation/editing rules; Validation procedures
PL-Spezifikationssprache (Validation specification language)
Common software portfolio (II + III, IV, V, VI, VII; legend: Run exclusively at DESTATIS)
Components: Survey editor; IDEV Web forms; .CORE Web services; Data Validation and Editing Runtime; Statspez; Validation rules editor; Genesis Open data portal; Input database; Domain applications; SAS; Web forms editor; Survey database Metadata DB
Data: Micro data; Validated micro data; Metadata; Macro data
Overview of metadata system (II + III)
Survey editor: Define data model for survey; Validation on data items (e.g. range checks)
Validation editor: Define data model for validation; Write validation and automatic editing rules; All kinds of data validation; Arrange data validation procedures
Web forms editor: Design web forms; Determine when data validation procedures are executed
Survey database Metadata DB: Store and provide resources where needed
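To make "validation on data items (e.g. range checks)" concrete, here is a minimal sketch of a data model that carries an allowed range per item and a function that checks one record against it. It is written in Python purely for illustration. The class `ItemSpec`, the function `check_ranges` and the item names `employees` and `turnover` are hypothetical and do not reflect DatML/SDF or the Survey editor.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ItemSpec:
    """Hypothetical data-model entry: one data item with an allowed range."""
    name: str
    minimum: Optional[float] = None
    maximum: Optional[float] = None


def check_ranges(record: dict, model: list) -> list:
    """Return one message per item that is missing or outside its range."""
    errors = []
    for spec in model:
        value = record.get(spec.name)
        if value is None:
            errors.append(f"{spec.name}: missing")
        elif spec.minimum is not None and value < spec.minimum:
            errors.append(f"{spec.name}: {value} below minimum {spec.minimum}")
        elif spec.maximum is not None and value > spec.maximum:
            errors.append(f"{spec.name}: {value} above maximum {spec.maximum}")
    return errors


# Illustrative model and record (made-up items and values).
MODEL = [
    ItemSpec("employees", minimum=0),
    ItemSpec("turnover", minimum=0, maximum=1_000_000_000),
]
print(check_ranges({"employees": -3, "turnover": 5000}, MODEL))
```

Keeping the ranges in the data model rather than in the checking code mirrors the idea on the slide that item-level checks are defined together with the survey's data model.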
Overview of the data collection system (II + III, IV)
Web form based data collection (IDEV Web forms): Validation on data items; Validation within data set; Some validation against reference material
Web service based data collection (.CORE Web services): Otherwise just structural validation
Interpret rules in PL-Spezifikationssprache
Other components shown: Survey editor; Validation rules editor; Web forms editor; Input database; Survey database Metadata DB
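The difference between validation within a data set and validation against reference material can be sketched as follows. This is an illustrative Python sketch only: the rule (parts must sum to the total), the item names and the set of known unit identifiers are assumptions, and nothing here represents PL-Spezifikationssprache or the IDEV implementation.

```python
def validate_within_record(record: dict) -> list:
    """Cross-item rule inside one data set (hypothetical): parts must sum to the total."""
    errors = []
    parts = record["employees_full_time"] + record["employees_part_time"]
    if parts != record["employees_total"]:
        errors.append("employees_total does not equal full-time plus part-time")
    return errors


def validate_against_reference(record: dict, known_units: set) -> list:
    """Check against reference material (hypothetical): the unit must be known."""
    if record["unit_id"] not in known_units:
        return [f"unit_id {record['unit_id']!r} not found in reference material"]
    return []


# Made-up record and reference list for illustration.
RECORD = {
    "unit_id": "U-100",
    "employees_full_time": 10,
    "employees_part_time": 4,
    "employees_total": 15,
}
KNOWN_UNITS = {"U-100", "U-200"}

print(validate_within_record(RECORD))
print(validate_against_reference(RECORD, KNOWN_UNITS))
```

The first check needs only the submitted record, while the second needs data held elsewhere, which is one plausible reason why only "some" reference checks are run at collection time.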
Overview of the data collection system (II + III, IV)
Input database: Stores incoming DatML/RAW files; Transports files to …
… the data owner
… the central production office
Other components shown: Survey editor; IDEV Web forms; .CORE Web services; Validation rules editor; Web forms editor; Survey database Metadata DB
Overview of the processing system (II + III, IV, V)
Notes on the slide: Translate rules to Java at runtime (OVIS); All kinds of data validation; Use validation procedures as Java classes
Components shown: Survey editor; IDEV Web forms; .CORE Web services; Data Validation and Editing Runtime; Validation rules editor; Input database; Domain applications; Web forms editor; Survey database Metadata DB
Output of the validation procedure: Micro data; Validated micro data; Validation DB; Editing; One tool; Error list; Long validation report; Short validation report
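The slide names an error list plus a long and a short validation report but does not describe their contents. The sketch below shows one plausible relationship, assuming the short report aggregates failures per rule and the long report lists every failed check. It is Python for illustration only; `ValidationError`, `short_report`, `long_report` and all identifiers and messages are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationError:
    """Hypothetical error-list entry: which record failed which rule, and why."""
    record_id: str
    rule_id: str
    message: str


def short_report(errors: list) -> list:
    """Assumed short form: one line per rule with its failure count, sorted by rule id."""
    counts = Counter(e.rule_id for e in errors)
    return [f"{rule}: {n} failure(s)" for rule, n in sorted(counts.items())]


def long_report(errors: list) -> list:
    """Assumed long form: one line per failed check, in error-list order."""
    return [f"{e.record_id} | {e.rule_id} | {e.message}" for e in errors]


# Made-up error list for illustration.
ERRORS = [
    ValidationError("rec-1", "R01", "turnover below minimum"),
    ValidationError("rec-2", "R01", "turnover below minimum"),
    ValidationError("rec-2", "R02", "total does not match parts"),
]

print("\n".join(short_report(ERRORS)))
print("\n".join(long_report(ERRORS)))
```

Deriving both reports from the same error list keeps them consistent with each other, whatever their real layout is.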
Micro, macro, …? (II + III, IV, V, VI, VII)
Components shown: .CORE; Validation rules editor; Web forms editor; Survey editor; Survey database Metadata DB; Statspez; SAS; Data Validation and Editing Runtime; Domain applications; EDIT; Other software; Other; IDEV Web forms; .CORE Web services; Input database; Genesis Open data portal
Outputs: Print or digital; Data recipients; Parliament, etc.
A complete system?
Common data formats; Common software portfolio covering the complete production process; Common data validation language covering the complete production process; Standardized validation reports; Standardized review process
Annotations on the slide: We’re working on it! (twice); (up to 5.4); (mostly)
General process: Decide; Produce; Evaluate process, decide on changes; Design rules; Production process; Methodology dept. advises; Review and accept rules
Centralization in a federal system?
Axes: Länder, DESTATIS; Not centralized, Centralized
Entries, in slide order: Approval of validation rules; Structural validation; Content validation (4.3); Content validation (5.3); Data editing (5.4); Content validation (6.2); Data editing (6.2); Review of production process; Validation rule design; Content validation (5.3); Data editing (5.4); Content validation (6.2); Data editing (6.2); Review of production process; Validation methodology (in an advisory role); Metadata distribution; Structural validation; Content validation (4.3); Content validation (5.3); Data editing (5.4); Quality reports
Questions?