Published by Oswin Pearson. Modified over 9 years ago.
1
Best practice case: Comparing the implementations of the Irish CDM and the Dutch DSC
ESSnet on microdata linking and data warehousing in statistical production
Harry Goossens – Statistics Netherlands
Head Data Service Centre / ESSnet Coordinator
hct.goossens@cbs.nl
2
ESSnet Data Warehousing
The CSO Corporate Data Model (CDM)
Underlying principle: 4 data stores
1. INPUT - raw data
2. CLEAN UNIT - cleaned data
3. AGGREGATE - aggregated data
4. DISSEMINATION - published data
The CDM was seen as ≈ an active data warehouse (DWH)
3
The CSO Corporate Data Model (CDM)
Main characteristics:
- All (statistical) processes must use the 4 data stores
- Processing systems interact on the data stores
- At certain moments snapshots are taken, which build the next data store
- It is possible to continue working on the same (snapshotted) data store
- Simultaneous updating of data is mainly an organisational issue
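The snapshot mechanism described on this slide can be sketched roughly as follows. This is a minimal illustration, not the CSO implementation: the store names come from the slides, while the `DataStores` class, the `snapshot` method, the `clean` function and the sample records are all hypothetical.

```python
from copy import deepcopy

# Store names follow the slides; everything else is a hypothetical illustration.
STORES = ["INPUT", "CLEAN UNIT", "AGGREGATE", "DISSEMINATION"]


class DataStores:
    """Four data stores; a snapshot of one store builds the next one."""

    def __init__(self):
        self.stores = {name: [] for name in STORES}

    def snapshot(self, source, transform):
        """Freeze a copy of `source`, transform it, and fill the next store."""
        position = STORES.index(source)
        if position == len(STORES) - 1:
            raise ValueError(f"{source} is the last store")
        frozen = deepcopy(self.stores[source])  # later edits to source do not leak
        target = STORES[position + 1]
        self.stores[target] = transform(frozen)
        return target


def clean(records):
    """Hypothetical cleaning step: drop records without a turnover value."""
    return [r for r in records if r["turnover"] is not None]


cdm = DataStores()
cdm.stores["INPUT"] = [
    {"unit": "A", "turnover": 10},
    {"unit": "B", "turnover": None},
]
cdm.snapshot("INPUT", clean)

# Work can continue on the snapshotted store without touching the next one.
cdm.stores["INPUT"].append({"unit": "C", "turnover": 7})
```

Copying at snapshot time is one possible design choice here: it keeps the next store stable while processing systems keep working on the source store.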
4
The CSO Corporate Data Model
[Diagram: the four stores (INPUT, CLEANED DATASETS, AGGREGATE DATASETS, DISSEMINATION) with 2 operational implementations: DATA MANAGEMENT STORE (surveys) and ADMINISTRATIVE DATA CENTRE (admin data)]
5
Data Management Store (DMS)
- First implementation of the CDM
- Only survey data
- Data tables are created and populated through the DMS applications
- Metadata must be entered as the data tables are created
- Metadata capturing = minimal bottleneck
- BR outside DMS (stand-alone)
6
CDM – Data Management Store
[Diagram, mainly surveys. Labels: DATA COLLECTION ACTIVITIES; INPUT, CLEANED DATASETS, AGGREGATE DATASETS, DISSEMINATION; DMS; APP layer, incl. I/O interfaces; DMS meta layer (basic descriptions); SHARED INPUT, SHARED CLEANED UNIT, AGGREGATE STORE; SNAPSHOTS; BI; SYS 1, SYS 2, ... SYS n]
7
Administrative Data Centre (ADC)
- Developed for organisational reasons
- Only admin data
- A catalyst to exploit administrative data for statistical purposes
- Interface with public authorities on admin data flows to CSO
- Clearing house inside CSO for admin data
- Data governance with respect to admin data
8
Administrative Data Centre (ADC)
Has an analysis layer:
- R&D on available data
- To develop new datasets
- Without specific needs / demands from statistics
9
CDM – Administrative Data Centre
[Diagram, only admin data. Labels: DATA COLLECTION ACTIVITIES; SOURCES; ADC Front Door; LEAN INTERFACE; ETL; ADC; ADC meta layer; INPUT, CLEANED DATASETS, AGGREGATE DATASETS, DISSEMINATION; Data Products; BI; SYS 1, SYS 2, ... SYS n]
10
Corporate Data Model CSO - Ireland
[Diagram combining DMS and ADC. Labels: DATA COLLECTION ACTIVITIES; SOURCES; ADC Front Door; LEAN INTERFACE; ETL; DMS; ADC; APP layer, incl. I/O interfaces; DMS meta layer (basic descriptions); ADC meta layer; INPUT, CLEANED DATASETS, AGGREGATE DATASETS, DISSEMINATION; SHARED INPUT, SHARED CLEANED UNIT, AGGREGATE STORE; SNAPSHOTS; Data Products; BI; SYS 1, SYS 2, ... SYS n]
11
The CBS Data Service Centre (DSC)
The concept:
- No data without metadata
- Dedicated metadata model as basis
- Strict distinction between:
  - Statistical data (facts & figures)
  - Conceptual metadata (definitions, description of quality, process activities etc.)
- Steady states explicitly designed for re-use
- All metadata (of steady states) are generally accessible and are standardised as much as possible
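One way to read "no data without metadata" is as a rule enforced when data is stored. The sketch below assumes that reading; `MetadataCatalogue`, `Vault`, the variable names and the definition text are hypothetical and do not describe the actual DSC software.

```python
class MetadataCatalogue:
    """Holds conceptual metadata: one definition per variable name (hypothetical)."""

    def __init__(self):
        self.definitions = {}

    def define(self, variable, definition):
        self.definitions[variable] = definition


class Vault:
    """Stores datasets, but only if every variable already has a definition."""

    def __init__(self, catalogue):
        self.catalogue = catalogue
        self.datasets = {}

    def store(self, name, rows):
        variables = {variable for row in rows for variable in row}
        missing = sorted(variables - set(self.catalogue.definitions))
        if missing:
            raise ValueError(f"no metadata for: {', '.join(missing)}")
        self.datasets[name] = rows


catalogue = MetadataCatalogue()
catalogue.define("turnover", "Net turnover in euros, excluding VAT")
vault = Vault(catalogue)
vault.store("retail_2012", [{"turnover": 120}])  # accepted: turnover is defined

try:
    vault.store("retail_2013", [{"turnover": 130, "staff": 4}])
    rejected = False
except ValueError:
    rejected = True  # "staff" has no definition, so the dataset is refused
```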
12
The CBS Data Service Centre (DSC)
What is it?
- Fundamental cornerstone of the CBS Business Architecture
- Central 'vault' with Steady States, linking:
  - statistical data (facts & figures)
  - conceptual metadata (description)
  - technical metadata (user's guide) / documentation
- Implementation of the Dutch metadata model
13
The CBS Data Service Centre (DSC)
What does it offer?
Generic services:
- Metadata coordination
- Centralised data distribution
- Authorisation management
- Automatic process interfacing (in development)
- Archiving of statistical datasets
14
The CBS Data Service Centre (DSC)
Why do we do it?
- Data sharing / re-using data
  - Intermediary, archive and distribution, CBS data vault
  - Maximally efficient use of data and metadata
- Process guarantee / security
  - Safety net in case of calamity, static 'frozen' data
- Process standardisation
  - Transparency & efficiency
- Coordination of metadata & classifications
  - One single source with elements for the statistical process
- Process chain support
  - Steady States as data hubs
- Generic process for data linking
  - DSC structure enables linking datasets with equal object type
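The last point, linking datasets with equal object type, might look like the sketch below. It assumes each dataset records its object type and that units share an identifier; the `link` function, the `id` key and the sample data are hypothetical, not the DSC's generic linking process.

```python
def link(left, right, key="id"):
    """Join two datasets on `key`, but only if they describe the same object type."""
    if left["object_type"] != right["object_type"]:
        raise ValueError(
            f"cannot link {left['object_type']} data to {right['object_type']} data"
        )
    right_by_key = {row[key]: row for row in right["rows"]}
    linked = []
    for row in left["rows"]:
        match = right_by_key.get(row[key])
        if match is not None:  # keep only units present in both datasets
            linked.append({**row, **match})
    return {"object_type": left["object_type"], "rows": linked}


persons_survey = {
    "object_type": "person",
    "rows": [{"id": 1, "age": 34}, {"id": 2, "age": 51}],
}
persons_admin = {
    "object_type": "person",
    "rows": [{"id": 1, "income": 30000}],
}
enterprises = {
    "object_type": "enterprise",
    "rows": [{"id": 1, "turnover": 5}],
}

linked = link(persons_survey, persons_admin)
```

Calling `link(persons_survey, enterprises)` raises a `ValueError`, since the two datasets describe different object types.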
15
CBS Business Architecture: Layers
[Diagram. Labels: Strategy; Design; Chain management; Statistics Production; Steady States; DSC – Data Storage; DSC – Metadata Catalogue]
16
CBS Business Architecture: Steady States
17
DSC: What are Steady States?
A steady state is a dataset together with the information needed for its correct interpretation.
- Rectangular
  - Rows represent units (micro) or classes of units (macro)
  - Columns represent variables
  - Heading: population, time
- A dataset design is like a template of a table: only borders and heading
- 1 dataset design, n datasets
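The "1 dataset design, n datasets" relation can be sketched with two small classes. This is an illustration under assumptions: `DatasetDesign`, `Dataset`, their fields and the sample values are hypothetical names chosen to mirror the slide, not the Dutch metadata model itself.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatasetDesign:
    """The empty table template: population in the heading, variables as columns."""
    population: str
    variables: tuple


@dataclass
class Dataset:
    """One filled-in table for a given period; rows are units or classes of units."""
    design: DatasetDesign
    period: str
    rows: list = field(default_factory=list)

    def add_row(self, *values):
        """Add one row; it must supply exactly one value per design variable."""
        if len(values) != len(self.design.variables):
            raise ValueError("row does not match the dataset design")
        self.rows.append(dict(zip(self.design.variables, values)))


# One design, reused by several datasets that differ only in time.
design = DatasetDesign(
    population="Enterprises in retail trade",
    variables=("unit", "turnover"),
)
retail_2011 = Dataset(design, "2011")
retail_2012 = Dataset(design, "2012")
retail_2011.add_row("A", 100)
```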
18
DSC: Why Steady States?
- Reduce storage: store once, re-use many times
- Secure the statistical process: each steady state is a guaranteed fall-back point
- Improve consistency: every following process uses the same dataset
- Improve flexibility: enables independent, generic process design
19
Conclusions
Both CSO & CBS:
- Use the same basic principle of 4 (static) stages/bases
- Had the same 'drivers' to start a DWH:
  - re-use of data
  - decoupling input and output (= getting rid of stovepipes)
CSO:
- Strong focus on practical results, (successful) quick wins
- 2 different implementations of the CDM
- Organisational driver for ADC
CBS:
- Strong focus on metadata model
- DSC = essential element of the business architecture
- 1 implementation supporting all processes
20
Conclusions
Regarding the DWH ESSnet:
- S-DWH architecture covers both best practices
- ESSnet indicated the right issues to focus on:
  - metadata
  - role/position of the BR
- Strong desire for knowledge exchange, learning from other NSIs
- CSO = very helpful best practice case
- CSO acknowledges the importance of the ESSnet and wants to stay closely involved