ESTP course on Statistical Metadata – Introductory course


1 ESTP course on Statistical Metadata – Introductory course
Statistics Netherlands, The Hague, 18-19 February 2013

2 Implementation of the Data Service Centre at Statistics Netherlands
Harry Goossens, Programme Manager DSC

3 Agenda
- Why, What, How?
- The CBS Metadata model
- The CBS Business Architecture
- Steady States
- Implementation
- Daily practice
- Demo

4 Data Service Centre: What is it?
- Fundamental cornerstone of the CBS Business Architecture
- Central ‘vault’ with Steady States, linking:
  - statistical data (facts & figures)
  - conceptual metadata (description)
  - technical metadata (user’s guide)
  - documentation
- Implementation of the Dutch metadata model

5 DSC: The concept
- No data without metadata
- Based upon a dedicated metadata model
- Strict distinction between the data that are actually processed and the metadata that describe the definitions, the quality and the process activities
- Steady states are explicitly designed for re-use
- The metadata (of steady states) are generally accessible and are standardised as much as possible

6 DSC: What does it offer?
Generic services:
- Catalogue: searching & finding
- Metadata management
- Centralised data distribution
- Authorisation management
- Automatic process interfacing
- Archiving of statistical datasets
- Version management

7 DSC: Conceptual metadata
Metadata that describe the data in a generic, non-specific way, in all the various phases of the statistical process:
- Input
  - description of received data
  - terminology of the supplier (internal and external)
- Processing
  - description of data produced in various statistical processes
  - internal (and international) standards / guidelines
- Output
  - description of publishable output data
  - definition of (sub)populations, output variables, object types
  - content description

8 DSC Metamodel (simplified)
[Diagram. Entities: Variable, Context Variable, Data Design, Technical Metadata file (XML), Codelists (XML), Datasets (ASCII CSV, Fixed), Documentation (Word, PDF, …). The entities are linked by 1 : 1 and 1 : n relationships.]
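The entities named on this slide can be sketched as plain data classes. This is only an illustration: the diagram does not survive in the transcript, so the direction and cardinality of each link below are assumptions (apart from "1 Dataset design, n Datasets", which a later slide states), and every attribute name and example value is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Variable:
    """Conceptual variable: a generic definition that can be re-used."""
    name: str
    definition: str


@dataclass
class ContextVariable:
    """A variable as used in one data design (assumed n : 1 to Variable)."""
    name: str
    variable: Variable
    codelist_file: Optional[str] = None  # hypothetical path to a codelist (XML)


@dataclass
class Dataset:
    """One stored file (ASCII CSV or fixed) belonging to a data design."""
    path: str


@dataclass
class DataDesign:
    """Design that ties context variables, technical metadata and datasets together."""
    name: str
    context_variables: List[ContextVariable] = field(default_factory=list)
    technical_metadata_file: Optional[str] = None  # assumed 1 : 1 (XML)
    datasets: List[Dataset] = field(default_factory=list)  # 1 : n
    documentation: List[str] = field(default_factory=list)  # Word, PDF, ...


# Hypothetical example values, for illustration only.
turnover = Variable("Turnover", "Net turnover of an enterprise")
design = DataDesign("Retail trade, monthly")
design.context_variables.append(ContextVariable("Turnover_Retail", turnover))
design.datasets.append(Dataset("retail_2013_01.csv"))
design.datasets.append(Dataset("retail_2013_02.csv"))
```

One design carries several datasets, while each context variable points back to a single conceptual variable, so the definition is written once and re-used.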

9 CBS Business Architecture (simplified)
[Diagram. Elements: Mission, Policy and Strategic Objectives; Design Process; Design Data; External Suppliers of data; Collect Data; Process Data; Disseminate Data; External users with information needs; DSC Metadata Catalogue; Metadata Management; DSC Data storage; Data (steady states).]

10 What are steady states?
A steady state is a data set together with information for its correct interpretation.
- Rectangular
  - Rows represent units (micro) or classes of units (macro)
  - Columns represent variables
- Heading: population, time
- Dataset design (vary time): in design phase
- Dataset design is like a template of a table: only borders and heading
- 1 Dataset design, n Datasets
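The "template of a table" idea can be illustrated with a short sketch: a design fixes the population and the columns, and each dataset fills that template for one period. The class names, field names and example values are hypothetical and do not come from the DSC system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetDesign:
    """Template of a table: heading and columns, no rows."""
    population: str
    columns: tuple


@dataclass
class Dataset:
    """One filled-in table for a given period, based on a design."""
    design: DatasetDesign
    period: str  # the "time" part of the heading
    rows: list   # each row is a dict: column name -> value


def is_rectangular(dataset: Dataset) -> bool:
    """True if every row has exactly the columns named in the design."""
    expected = set(dataset.design.columns)
    return all(set(row) == expected for row in dataset.rows)


# One design, n datasets: the period varies, the template does not.
design = DatasetDesign(population="Enterprises", columns=("id", "turnover"))
january = Dataset(design, "2013-01", [{"id": 1, "turnover": 100}])
february = Dataset(design, "2013-02", [{"id": 1}])  # missing a column
```

Here `is_rectangular(january)` holds, while `february` fails the check because one row lacks the `turnover` column.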

11 Why steady states?
- Reduce storage:
  - Store once
  - Re-use many times
- Secure the statistical process:
  - Each steady state is a guaranteed fall-back point
- Improve consistency:
  - Every following process uses the same dataset
  - ‘Single point of truth’ principle
- Improve flexibility:
  - Enables independent, generic process design

12 Implementation
- Micro model simplified for practical use
  - Skip / Combine objects
  - Reduce attributes
- Why Documentum?
  - Completely object-oriented, which makes it possible to implement the DSC metamodel
  - Largely configurable (user interface, authorisation, etc.)
  - Large flexibility
  - Proven technology
  - TaskSpace

13 DSC system - Functionalities
- Tailor-made user interface (TaskSpace)
  - Maintenance of meta objects
  - Constraints mostly in interface, not in DB/repository
- Specific flexible search engine
  - Various entrances, easy to extend
- Import & Export of datasets
  - Modified according to model
- Interface for bulk-import of metadata
  - Based on standard XML schema
  - Conceptual and technical meta

14 Daily practice - Challenges
- Available metadata quality often poor
  - Great variety, each statistic has its own way of describing
  - Often tool based (SPSS), more technical than logical
  - Definition = question from survey
  - Minimum mapping with DSC model
- No real urgency
  - Although rated IMPORTANT, low priority
  - No clear ownership, responsibility not felt
  - Extra work without direct gain (burden)

15 Daily practice - Road map
- Explaining the concept & metadata model
  - Requirements, guidelines
- Stocktaking
  - What meta is available?
  - How extensive or poor?
  - What quality, actuality?
  - Re-usability
- Mapping on the model
  - (Re)Design Datadesigns
  - Matching attributes

16 Daily practice - Chances
- Look for added value
  - Problems?
  - Wishes?
  - (Long) Wanted improvements?
  - Re-usability
- Define pilot
  - Quick hands-on experience, short cycle
  - Good estimation time & resources
  - ‘Proof of the design pudding’

17 Daily practice – At work (1)
- Excel template, Nesstar Publisher
- Porch / Ballot
  - Visual check on guidelines
  - Automated check on completeness, inconsistencies, relations VARIABLE – CONTEXT VARIABLE etc.
  - Advice for corrections/improvements: by owner (statistics)!
- Define and set authorisations
  - Groups for import, export, metadata maintenance
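An automated check of the kind mentioned on this slide might look roughly like the sketch below. It is not the DSC implementation: the required attributes, the dictionary layout and the example records are assumptions made for illustration. It tests two things, completeness of each variable and whether every context variable refers to a known variable.

```python
REQUIRED_ATTRIBUTES = ("name", "definition")  # assumed minimum per variable


def check_metadata(variables, context_variables):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    known_names = set()

    # Completeness: every required attribute must be present and non-empty.
    for var in variables:
        label = var.get("name") or "?"
        for attr in REQUIRED_ATTRIBUTES:
            if not str(var.get(attr) or "").strip():
                problems.append(f"variable {label!r}: missing {attr}")
        if var.get("name"):
            known_names.add(var["name"])

    # Relations: each context variable must point to an existing variable.
    for cv in context_variables:
        target = cv.get("variable")
        if target not in known_names:
            label = cv.get("name") or "?"
            problems.append(
                f"context variable {label!r}: unknown variable {target!r}"
            )
    return problems


# Hypothetical input, e.g. as read from a filled-in template.
variables = [
    {"name": "Turnover", "definition": "Net turnover"},
    {"name": "Employees", "definition": ""},
]
context_variables = [
    {"name": "Turnover_Retail", "variable": "Turnover"},
    {"name": "Profit_Retail", "variable": "Profit"},
]
problems = check_metadata(variables, context_variables)
```

The function only reports problems and never repairs them, which matches the slide's point that corrections are made by the owner of the statistic.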

18 Daily practice – At work (2)
- Metadata in DSC-system
  - Define datadesign
  - Import TMF (xml)
  - Bulkimport Variables (xml)
- Import dataset(s)
- Search
- Export data & metadata
