Published by Priscilla Heath. Modified over 9 years ago.
1
USING THE METADATA IN STATISTICAL PROCESSING CYCLE – THE PRODUCTION TOOLS PERSPECTIVE
Matjaž Jug, Pavle Kozjek, Tomaž Špeh
Statistical Office of the Republic of Slovenia
2
Overview
- Current statistical production cycle in SORS
- Using the metadata in Blaise applications
- The role of metadata in automatic editing system in SAS
- Metadata connected with the data in Oracle data warehouse
- Lessons learnt
- Questions
3
Current statistical production cycle
- Entry and micro editing (Blaise)
- Macro and statistical editing (SAS)
- Storing and analysis (Oracle)
- Dissemination (PC-Axis)
- Central metadata stores (Klasje & Metis)
4
Using the metadata in Blaise applications
- Generation of (high-speed) data-entry applications using Gentry (usable by non-IT personnel)
- Metadata-based transformations between different data structures (EXTRA-FAT, FAT, THIN)
5
Gentry – tool for generation of the Blaise data-entry application
- Questionnaire structure and layout (name, blocks, tables, routing etc.)
- Field characteristics (length, data type, constants, other parameters)
[Screenshot callouts: Field characteristics; Data type]
6
Gentry – example of generated application
[Screenshot callouts: section header; data entry for table 12]
7
Transformations
- All data for one unit (provider) in one row (EXTRA-FAT): suitable for micro editing
- Classification and continuous variables in the columns (FAT): suitable for analysis
- Classification variables in the columns and continuous variables in the rows (THIN)
[Diagram labels: Metadata-based transformation in Blaise; Metadata-based transformation in SAS]
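The FAT and THIN layouts can be illustrated with a small sketch. It is written in Python rather than Blaise or SAS, and the metadata dictionary, the column names and the sample values are all made up for illustration; it only shows how a description of variable roles can drive the reshaping in both directions.

```python
# Hypothetical variable metadata: the role of each column in a FAT row.
METADATA = {
    "unit_id": "id",
    "activity": "classification",
    "region": "classification",
    "turnover": "continuous",
    "employees": "continuous",
}


def split_roles(metadata):
    """Return (key columns, continuous columns) as described by the metadata."""
    keys = [n for n, role in metadata.items() if role in ("id", "classification")]
    measures = [n for n, role in metadata.items() if role == "continuous"]
    return keys, measures


def fat_to_thin(fat_rows, metadata):
    """FAT -> THIN: every continuous column of a row becomes its own row."""
    keys, measures = split_roles(metadata)
    thin = []
    for row in fat_rows:
        for name in measures:
            record = {k: row[k] for k in keys}
            record["variable"] = name
            record["value"] = row[name]
            thin.append(record)
    return thin


def thin_to_fat(thin_rows, metadata):
    """THIN -> FAT: collect the rows that share the same key columns."""
    keys, _ = split_roles(metadata)
    fat = {}
    for rec in thin_rows:
        key = tuple(rec[k] for k in keys)
        row = fat.setdefault(key, {k: rec[k] for k in keys})
        row[rec["variable"]] = rec["value"]
    return list(fat.values())


# Made-up sample data in FAT layout.
fat = [
    {"unit_id": 1, "activity": "C10", "region": "SI01", "turnover": 1200.0, "employees": 15},
    {"unit_id": 2, "activity": "G47", "region": "SI02", "turnover": 300.0, "employees": 4},
]
thin = fat_to_thin(fat, METADATA)
```

Because the column roles live in the metadata and not in the functions, adding a variable to a survey would only require a metadata change in this sketch.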
8
The role of metadata in automatic editing system in SAS
- General system for automated editing
- Process metadata
9
The role of metadata in automatic editing system in SAS
- In order to be general, the tool must be able to:
  - recognize the data which are due to be subjected to editing and/or imputation;
  - recognize which editing method should be applied;
  - and with what parameters
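The three requirements above can be sketched as a small metadata-driven dispatcher. This is a Python illustration, not the SAS system described in the slides: the rule format, the method names ("zero", "mean"), the parameter "decimals" and the sample data are all assumptions.

```python
from statistics import mean


def impute_zero(values, **params):
    """Replace missing values with zero."""
    return [0 if v is None else v for v in values]


def impute_mean(values, **params):
    """Replace missing values with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = round(mean(observed), params.get("decimals", 2))
    return [fill if v is None else v for v in values]


# Registry of available methods, looked up by name from the metadata.
METHODS = {"zero": impute_zero, "mean": impute_mean}

# Hypothetical process metadata: which variable, which method, which parameters.
PROCESS_METADATA = [
    {"variable": "turnover", "method": "mean", "params": {"decimals": 1}},
    {"variable": "exports", "method": "zero", "params": {}},
]


def run_editing(data, process_metadata):
    """Apply to each listed variable the method and parameters named in the metadata."""
    edited = dict(data)
    for rule in process_metadata:
        method = METHODS[rule["method"]]
        edited[rule["variable"]] = method(data[rule["variable"]], **rule["params"])
    return edited


# Made-up sample data; None marks a missing value.
data = {
    "turnover": [100.0, None, 250.0],
    "exports": [None, 40.0, None],
    "employees": [5, None, 12],
}
edited = run_editing(data, PROCESS_METADATA)
```

In this sketch "employees" is left untouched because no rule in the metadata names it, which is the sense in which the metadata decides what is subjected to editing.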
10
Process indicators – level 1
- Mode of data collection
  - 1 data provided directly by reporting unit
  - 2 data from administrative source
  - 3 data computed from original values
  - 4 imputed data – imputation of non-response
  - 5 imputed data – imputation due to invalid values detected through the editing process
  - 6 data missing because the unit is not eligible for the item (logical skip)
11
Process indicators – level 2
- Data status
  - 1 original value
  - 2 corrected value
12
Process indicators – level 3
- Method of data correction
  - 11 correction after telephone contact
  - 12 data reported at a later stage
13
Process indicators – level 3
- Reporting methods
  - 11 reporting by mail questionnaire
  - 12 computer assisted telephone interview (CATI)
  - 13 telephone interview without computer assistance
  - 14 paper assisted personal interview (PAPI)
  - 15 computer assisted personal interview (CAPI)
  - 16 paper assisted self interviewing
  - 17 computer assisted self interviewing
  - 18 web reporting
14
Process indicators – level 3
- Imputation methods
  - 10 method of zero values
  - 11 logical imputation
  - 12 historical data imputation
  - 13 mean values imputation
  - 14 nearest neighbour imputation
  - 15 hot-deck imputation
  - 16 cold-deck imputation
  - 17 regression imputation
  - 18 method of the most frequent value
  - 19 estimation of annual value based on infra-annual data
  - 21 stochastic hot-deck (random donor)
  - 22 regression imputation with random residuals
  - 23 multiple imputation
15
Process indicators examples - xy.zz
- 11.15 means:
  - 1 - data provided directly by reporting unit
  - 11 - original value
  - 11.15 - computer assisted personal interview (CAPI)
- 42.19 means:
  - 4 - imputed data – imputation of non-response
  - 42 - corrected value
  - 42.19 - estimation of annual value based on infra-annual data
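The xy.zz code can be decoded mechanically from the three level lists. The sketch below is in Python, with an assumption about which level 3 list applies: imputation methods when level 1 is 4 or 5, correction methods for other corrected values, and reporting methods otherwise. The slides confirm this only for the two examples shown; the function name decode is also hypothetical.

```python
MODE = {
    1: "data provided directly by reporting unit",
    2: "data from administrative source",
    3: "data computed from original values",
    4: "imputed data – imputation of non-response",
    5: "imputed data – imputation due to invalid values detected through the editing process",
    6: "data missing because the unit is not eligible for the item (logical skip)",
}
STATUS = {1: "original value", 2: "corrected value"}
CORRECTION = {
    11: "correction after telephone contact",
    12: "data reported at a later stage",
}
REPORTING = {
    11: "reporting by mail questionnaire",
    12: "computer assisted telephone interview (CATI)",
    13: "telephone interview without computer assistance",
    14: "paper assisted personal interview (PAPI)",
    15: "computer assisted personal interview (CAPI)",
    16: "paper assisted self interviewing",
    17: "computer assisted self interviewing",
    18: "web reporting",
}
IMPUTATION = {
    10: "method of zero values",
    11: "logical imputation",
    12: "historical data imputation",
    13: "mean values imputation",
    14: "nearest neighbour imputation",
    15: "hot-deck imputation",
    16: "cold-deck imputation",
    17: "regression imputation",
    18: "method of the most frequent value",
    19: "estimation of annual value based on infra-annual data",
    21: "stochastic hot-deck (random donor)",
    22: "regression imputation with random residuals",
    23: "multiple imputation",
}


def decode(indicator):
    """Split an 'xy.zz' indicator into (mode, status, method) descriptions."""
    head, tail = indicator.split(".")
    mode, status, method = int(head[0]), int(head[1]), int(tail)
    # Assumption: the level 3 list is selected by levels 1 and 2 as described above.
    if mode in (4, 5):
        table = IMPUTATION
    elif status == 2:
        table = CORRECTION
    else:
        table = REPORTING
    return MODE[mode], STATUS[status], table.get(method, "unknown method code")
```

Calling decode("11.15") and decode("42.19") reproduces the two readings given on the slide.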
16
Statistical process
[Diagram labels: Key responders; Other units; SAS; Blaise; Oracle; SAS; Blaise]
17
Metadata connected with the data in Oracle data warehouse
- On-line access to:
  - Historical data
  - Data from different phases (not only final data)
  - Data for multiple surveys (not only data marts)
  - Statistical (variables & classifications) and process (time stamps, status indicators...) metadata connected with the data
- ...accessible for third-party tools
18
Conceptual star schema for SBS THIN table design
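The original diagram is not recoverable from the transcript, so the following is only a guess at the general shape of such a design: a THIN fact table holding one value per row, with the process indicator stored next to the value and dimensions joined by key. It uses Python's sqlite3 as a stand-in for Oracle, and every table name, column name and data value is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical star layout: two dimension tables and one THIN fact table.
conn.executescript("""
CREATE TABLE dim_unit (unit_id INTEGER PRIMARY KEY, activity TEXT);
CREATE TABLE dim_variable (variable_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact_sbs (
    unit_id INTEGER REFERENCES dim_unit(unit_id),
    variable_id INTEGER REFERENCES dim_variable(variable_id),
    period TEXT,
    value REAL,
    process_indicator TEXT  -- xy.zz code kept with the value it describes
);
""")

# Made-up sample rows.
conn.executemany("INSERT INTO dim_unit VALUES (?, ?)", [(1, "C10"), (2, "G47")])
conn.executemany("INSERT INTO dim_variable VALUES (?, ?)", [(1, "turnover")])
conn.executemany(
    "INSERT INTO fact_sbs VALUES (?, ?, ?, ?, ?)",
    [(1, 1, "2004", 1200.0, "11.15"), (2, 1, "2004", 300.0, "42.13")],
)

# Select only imputed values (level 1 code 4 or 5) with their classification.
rows = conn.execute("""
SELECT u.activity, v.name, f.value
FROM fact_sbs AS f
JOIN dim_unit AS u ON u.unit_id = f.unit_id
JOIN dim_variable AS v ON v.variable_id = f.variable_id
WHERE f.process_indicator LIKE '4%' OR f.process_indicator LIKE '5%'
ORDER BY u.activity
""").fetchall()
```

Keeping the indicator in the fact row is one way to make process metadata queryable by any SQL-capable tool, which matches the "connected with the data" point on the previous slide.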
20
Lessons learnt
- The role of central repositories for metadata
  - Natural source of conceptual metadata
  - Metadata have to be exact, complete and consistent
  - Process metadata should be connected with the data
- Harmonisation of metadata concepts
  - Local metadata vs. global metadata
  - A cultural change is needed
- Technical considerations
  - The possibilities for metadata exchange and system integration are good (XML, SQL)
21
Questions