Harry Goossens Centre of Competence on Data Warehousing
Centre of Competence on DWH
Active support of ESS member states, putting the results of the ESSnet DWH into practice
ESSnet DWH: Short Recap
A central ‘statistical data store’ for managing all available data of interest, regardless of its source, enabling the NSI to:
- produce necessary information (= statistics!)
- (re)use available data to create new data / new outputs
- execute analysis and perform reporting
A warehouse approach to statistics: provide an architectural model of the statistical data flow, from data collection to statistical output.
Workshop CoC on DWH, Helsinki, 24 – 25 September 2014
Introduction: Why S-DWH? The challenges for statistical production
- Rapidly changing information demand
- Decrease costs & administrative burden
- Integration & re-use of available data sources
- Shorter life cycle, quicker delivery
- Increase efficiency & flexibility
Introduction: Why S-DWH? The goal
Make optimal use of all available data sources (existing & new)
[Diagram labels: Data Collection, External data sources, Reusing Statistical Data]
Output: Huge set of deliverables
The ESSnet DWH produced:
- Architectural framework for the S-DWH:
  - Business architecture (3.1 and 3.3)
  - Information systems architecture (3.5 and 1.6)
  - Technical architecture (3.4)
- Metadata framework (1.1)
- Metadata guidelines & recommendations on:
  - Use of metadata models (1.3) and functionalities (1.4)
  - Metadata quality (1.2) and governance (1.5)
- Methodological recommendations
ISO 11179
The layered architecture of the S-DWH
Distinguishes S-DWH
GSIM: General Statistical Information Model
The business architecture of an S-DWH
[Diagram labels: Business, Production, Structures, Concepts]
Information Systems Architecture
Information Systems Architecture
Layers: Source Layer, Integration Layer, Interpretation Layer, Access Layer
Staging data (ICT survey, SBS survey, ET survey, ..., ADMIN): usually of a temporary nature; contents can be erased or archived after the S-DWH has been loaded successfully.
The layered architecture
Reflects two different IT environments of the S-DWH:
1. Operational, to support semi-automatic computer interaction systems.
2. Analytical, the actual data warehouse, to maximize free human interaction.
The source layer
- Gathering point for all data to be used in the S-DWH
  - Internal sources: surveys, data from processing programs
  - External sources: admin data, collected for other purposes
- No specific, predefined data model
  - Depends on the design of the data collection process
  - Well-structured data and/or simple flat files
- Important role as ‘gatekeeper’
  - Ensuring that data entering the S-DWH always has metadata that meets minimum requirements and quality
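The gatekeeper role can be illustrated with a minimal Python sketch. The required metadata fields and the names `REQUIRED_METADATA`, `missing_metadata` and `admit_to_sdwh` are assumptions made for illustration; they are not taken from the ESSnet DWH deliverables.

```python
# Hypothetical minimum metadata requirements (illustrative only).
REQUIRED_METADATA = {"source", "reference_period", "unit_type", "variables"}


def missing_metadata(metadata: dict) -> set:
    """Return the required metadata fields that are absent or empty."""
    return {key for key in REQUIRED_METADATA if not metadata.get(key)}


def admit_to_sdwh(dataset: dict) -> bool:
    """Gatekeeper check: admit a dataset only if its metadata is complete."""
    return not missing_metadata(dataset.get("metadata", {}))


# A survey delivery with complete metadata (hypothetical example data).
survey = {
    "metadata": {
        "source": "SBS survey",
        "reference_period": "2013",
        "unit_type": "enterprise",
        "variables": ["turnover", "employees"],
    },
    "records": [{"unit_id": 1, "turnover": 120.0, "employees": 4}],
}

# An administrative file that arrives with almost no metadata.
admin_file = {"metadata": {"source": "ADMIN"}, "records": []}
```

In this sketch the survey delivery would be admitted, while the administrative file would be held back until its missing fields are supplied.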
The integration layer
Operational system(s) used to process day-to-day operations, translating source data into useful content in the S-DWH; this process is commonly called ETL: Extract, Transform, Load.
[Diagram: Source Layer -> Integration Layer (Extract, Transform, Load) -> Interpretation & Analysis Layer]
The integration layer
- As the focus is on processing, data should be stored in a generalized and normalized data model, optimized for OLTP
- ETL needs active metadata
- The integration layer produces reference, process and statistical metadata
- The efficiency of processing in the integration layer strongly depends on the quality of the metadata coming from the source layer
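A minimal sketch of an ETL step that loads into a normalized model and produces process metadata, using Python's standard `sqlite3` module. The table names (`unit`, `observation`), the sample rows and the metadata keys are assumptions for illustration, not a prescribed S-DWH design.

```python
import sqlite3

# Hypothetical flat-file extract from the source layer (values arrive as text).
RAW_ROWS = [
    {"unit_id": "1", "name": " Alpha Ltd ", "variable": "turnover", "value": "120.5"},
    {"unit_id": "1", "name": " Alpha Ltd ", "variable": "employees", "value": "4"},
    {"unit_id": "2", "name": "Beta BV", "variable": "turnover", "value": "n/a"},
]


def transform(row):
    """Clean one raw row; return None when the value is not numeric."""
    try:
        value = float(row["value"])
    except ValueError:
        return None
    return int(row["unit_id"]), row["name"].strip(), row["variable"], value


def run_etl(rows):
    """Load cleaned rows into a normalized model and return process metadata."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE unit (unit_id INTEGER PRIMARY KEY, name TEXT)")
    con.execute(
        "CREATE TABLE observation ("
        "unit_id INTEGER REFERENCES unit(unit_id), variable TEXT, value REAL)"
    )
    loaded = rejected = 0
    for row in rows:
        clean = transform(row)
        if clean is None:
            rejected += 1
            continue
        unit_id, name, variable, value = clean
        # Each unit is stored once; observations refer to it by key.
        con.execute("INSERT OR IGNORE INTO unit VALUES (?, ?)", (unit_id, name))
        con.execute(
            "INSERT INTO observation VALUES (?, ?, ?)", (unit_id, variable, value)
        )
        loaded += 1
    process_metadata = {
        "rows_read": len(rows),
        "rows_loaded": loaded,
        "rows_rejected": rejected,
    }
    return con, process_metadata


con, process_metadata = run_etl(RAW_ROWS)
```

The unit name is stored once in `unit` instead of on every observation, which is the normalization the slide refers to; the returned counts are a simple example of process metadata.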
The interpretation and analysis layer
- Contains relevant (micro) data, processed and structured to be optimized for analysis, as the basis for the planned output of the NSI
- Specially designed for statistical experts
- Built to support data manipulation of large, complex search operations
The interpretation and analysis layer
- Data modelling based on analysis & real-time output
  - dimensional data models
  - highly denormalized, with redundancy
  - sometimes cubes
- Metadata normally added, with few changes
  - variable definitions, derivation rules
  - estimation rules, confidentiality rules
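The idea of a denormalized, dimensional model that feeds cubes can be sketched in a few lines of Python. The fact rows, the dimension names and the function `build_cube` are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical denormalized fact rows: dimension values are repeated on every row.
FACTS = [
    {"year": 2013, "region": "North", "sector": "Retail", "turnover": 100.0},
    {"year": 2013, "region": "North", "sector": "Industry", "turnover": 250.0},
    {"year": 2013, "region": "South", "sector": "Retail", "turnover": 80.0},
    {"year": 2014, "region": "North", "sector": "Retail", "turnover": 110.0},
]


def build_cube(facts, dimensions, measure):
    """Sum the measure for every combination of the chosen dimensions."""
    cube = defaultdict(float)
    for row in facts:
        key = tuple(row[d] for d in dimensions)
        cube[key] += row[measure]
    return dict(cube)


# Two different views on the same facts, without any joins.
by_year_region = build_cube(FACTS, ("year", "region"), "turnover")
by_sector = build_cube(FACTS, ("sector",), "turnover")
```

Because every row already carries its dimension values, each new aggregation is a single pass over the facts; the price is the redundancy mentioned on the slide.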
The access layer
- Designed for the final presentation, dissemination and delivery of statistical information.
- Used by a wide range of users and computer instruments.
- Data is optimized to present and compile data effectively.
- Data may be presented in data cubes with different formats, specialized to support different tools and software: ‘data marts’
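As a rough illustration of serving the same aggregated data in different formats for different tools, the following Python sketch writes one small cube as CSV and as JSON. The sample data and the function names `to_csv` and `to_json` are hypothetical.

```python
import csv
import io
import json

# Hypothetical aggregated output from the interpretation and analysis layer.
CUBE = [
    {"year": 2013, "region": "North", "turnover": 350.0},
    {"year": 2013, "region": "South", "turnover": 80.0},
]


def to_csv(rows):
    """Serialize rows as CSV text, e.g. for spreadsheet users."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialize rows as JSON text, e.g. for a web application."""
    return json.dumps(rows)


# One 'data mart' per target format, all derived from the same cube.
data_marts = {"csv": to_csv(CUBE), "json": to_json(CUBE)}
```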
CoC on DWH: Desired services
What support, in your opinion, is most important for the Centre to provide?
- Review ***
- Advice & Consultancy ****
- Implementing deliverables *
- Knowledge repository ****
- Information broker **
- Supporting Business Case **
Main route
Focus on middle phases
Main goals
- Contacting ESS members to identify and prioritize relevant projects and support requests
- Providing ad hoc support and consultancy to ESS members on their specific subjects, as requested
- Active dissemination and implementation in daily practice of the deliverables of the ESSnet
Main goals
- Setting up the knowledge repository in the ESSnet DWH domain of the CROS portal, incl. S-DWH best practice cases in ESS member states
- Further elaboration of specific deliverables from the ESSnet DWH that are characterized as continuous activities (‘living documents’)
- Keeping the Handbook up to date
Status
- 1st line help desk function established via CROS Portal
- Inventory of needs & tools
- At final workshop
- Face-to-face
- Questionnaire, wider group
- Various support provided
- Roadmap for Design
Support actions
CSO Ireland:
- Architecture and Metadata
- Review Corporate Data Vault
Destatis Germany:
- Metadata support
CSO Poland:
- General advice on S-DWH topic
- Support session planned Dec/Jan
INE Portugal:
- Contacts renewed, partner in COE
Lessons learned
- Work in progress, but slowly
- Most NSIs in an early stage of redesign, only few practical experiences to share
- Availability of resources
- Different format & more flexibility needed
- Need for exchanging expertise and experiences is still current