1
Implementation of the Data Service Centre (DSC, Dutch: Data Service Centrum) at Statistics Netherlands
Harry Goossens, Programme Manager DSC
2
Agenda
  Why, What, How?
  The CBS Metadata model
  The CBS Business Architecture
  Steady States
  Implementation
  Daily practice
  Demo
3
Data Service Centre: What is it?
Fundamental cornerstone of the CBS Business Architecture
Central ‘vault’ with Steady States, linking:
  statistical data (facts & figures)
  conceptual metadata (description)
  technical metadata (user’s guide)
  documentation
Implementation of the Dutch metadata model
4
DSC: The concept
No data without metadata
Based upon a dedicated metadata model
Strict distinction between the data that are actually processed and the metadata that describe the definitions, the quality and the process activities
Steady states are explicitly designed for re-use
The metadata (of steady states) are generally accessible and are standardised as much as possible
5
DSC: What does it offer?
Generic services:
  Catalogue: searching & finding
  Metadata management
  Centralised data distribution
  Authorisation management
  Automatic process interfacing
  Archiving of statistical datasets
  Version management
6
DSC: Conceptual metadata
Metadata that describes the data in a generic, non-specific way, in all the various phases of the statistical process:
  Input: description of received data; terminology of the supplier (internal and external)
  Processing: description of data produced from various statistical processes; internal (and international) standards/guidelines
  Output: description of publishable output data; definition of (sub)populations, output variables, object types, content description
7
CBS Metadata model - micro
8
CBS Metadata model - macro
9
DSC Metamodel - simplified
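[Diagram. Entities: Variable (Variabele); Context Variable (Context Variabele); Data design (Dataontwerp); Technical metafile (Technische Metafile, XML); Documentation (Word, PDF, …); Datasets (ASCII: CSV, fixed). Relationship labels shown: 1 : n, 1 : 1, 1 : n]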
10
CBS Business Architecture
[Diagram labels: Strategy; Design; DSC – Metadata Catalogue; Chain management; Statistics Production; Steady States; DSC – Data Storage]
11
Steady States
12
What are steady states?
A steady state is a data set together with information for its correct interpretation.
Rectangular
  Rows represent units (micro) or classes of units (macro)
  Columns represent variables
  Heading: population, time
Dataset design (time varies): in design phase
  Dataset design is like a template of a table: only borders and heading
  1 Dataset design, n Datasets
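As an illustration only, the relation between a dataset design and its datasets can be sketched in Python. The class and field names (`DatasetDesign`, `Dataset`, `population`, `period`) and the sample values are assumptions made for this sketch, not names or data from the DSC system. The sketch mirrors the slide: a design carries the heading and the columns, and each of its n datasets must be rectangular.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatasetDesign:
    """Template of a table: heading and column names only (hypothetical names)."""
    name: str
    population: str
    variables: tuple[str, ...]


@dataclass
class Dataset:
    """One filled-in table for a given period, based on a design."""
    design: DatasetDesign
    period: str
    rows: list[tuple] = field(default_factory=list)

    def add_row(self, row: tuple) -> None:
        # Rectangular: every row has exactly one value per variable.
        if len(row) != len(self.design.variables):
            raise ValueError(
                f"expected {len(self.design.variables)} values, got {len(row)}"
            )
        self.rows.append(row)


# 1 dataset design, n datasets: the same design is reused for each period.
# All values below are made up for the example.
design = DatasetDesign("turnover", "enterprises", ("enterprise_id", "turnover"))
ds_2008 = Dataset(design, "2008")
ds_2009 = Dataset(design, "2009")
ds_2008.add_row(("E001", 1250))
```

Keeping the design immutable (`frozen=True`) reflects the idea that the template is fixed in the design phase, while datasets are filled in later; that is a design choice of this sketch, not a documented DSC rule.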
13
Why steady states?
Reduce storage:
  Store once
  Re-use many times
Secure the statistical process:
  Each steady state is a guaranteed fall-back point
Improve consistency:
  Every following process uses the same dataset
Improve flexibility:
  Enables independent, generic process design
14
Implementation
Micro model simplified for practical use
  Skip / combine objects
  Reduce attributes
Why Documentum?
  Completely object-oriented, which makes it possible to implement the DSC metamodel
  Largely configurable (user interface, authorisation, etc.)
  Great flexibility
  Proven technology
  TaskSpace
15
DSC system - Functionalities
Tailor-made user interface (TaskSpace)
  maintenance of meta objects
  constraints mostly in interface, not in DB/repository
Specific flexible search engine
  various entrances, easy to extend
Import & export of datasets
  modified according to model
Interface for bulk import of metadata
  based on standard XML schema
  conceptual and technical meta
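The slides do not show the XML schema used for bulk import, so the following is only a rough sketch of what reading such a file could look like. The element and attribute names (`metadata`, `variable`, `definition`, `id`, `name`) and the sample content are invented for illustration; only the standard-library `xml.etree.ElementTree` calls are real.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata file; the element and attribute names are
# invented for illustration and are not the DSC schema.
SAMPLE = """
<metadata>
  <variable id="V1" name="turnover">
    <definition>Net turnover excluding VAT</definition>
  </variable>
  <variable id="V2" name="employees">
    <definition>Number of employed persons</definition>
  </variable>
</metadata>
"""


def parse_variables(xml_text: str) -> list[dict]:
    """Return one dict per <variable> element directly under the root."""
    root = ET.fromstring(xml_text.strip())
    variables = []
    for element in root.findall("variable"):
        variables.append(
            {
                "id": element.get("id"),
                "name": element.get("name"),
                # A missing <definition> becomes an empty string.
                "definition": (element.findtext("definition") or "").strip(),
            }
        )
    return variables


records = parse_variables(SAMPLE)
```

A real import would presumably validate the file against the schema before loading it; this sketch skips that step.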
16
Daily practice - Challenges
Available metadata quality often poor
  Great variety, each statistic has its own way of describing
  Often tool-based (SPSS), more technical than logical
  Definition = question from survey
  Minimum mapping with DSC model
No real urge
  Although rated IMPORTANT, low priority
  No clear ownership, responsibility not felt
  Extra work without direct gain (burden)
17
Daily practice - Road map
Explaining the concept & metadata model
  Requirements, guidelines
Stocktaking
  What meta is available?
  How extensive or poor?
  What quality, how up to date?
  Re-usability
Mapping on the model
  (Re)design data designs
  Matching attributes
18
Daily practice - Opportunities
Look for added value
  Problems?
  Wishes?
  (Long-)wanted improvements?
  Re-usability
Define pilot
  Quick hands-on experience, short cycle
  Good estimation of time & resources
  ‘Proof of the design pudding’
19
Daily practice – At work
Excel template
Porch / Ballot
Visual check on guidelines
Automated check on completeness, inconsistencies, relations VARIABLE – CONTEXT VARIABLE
Corrections
Bulk import
TMF
Meta XML
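The automated check named in this workflow could look roughly like the sketch below. The slides do not say how the check works, so everything here is an assumption: the required attributes, the record layout, and in particular the idea that each context variable refers to exactly one variable through a `variable_id` field.

```python
REQUIRED = ("name", "definition")  # hypothetical required attributes


def check_metadata(variables: list[dict], context_variables: list[dict]) -> list[str]:
    """Return human-readable problems; an empty list means all checks passed."""
    problems = []

    # Completeness: every variable has a non-empty value for each required attribute.
    for var in variables:
        for attr in REQUIRED:
            if not str(var.get(attr) or "").strip():
                problems.append(f"variable {var.get('id')}: missing {attr}")

    # Inconsistencies: variable ids must be unique.
    seen = set()
    for var in variables:
        if var.get("id") in seen:
            problems.append(f"variable {var.get('id')}: duplicate id")
        seen.add(var.get("id"))

    # Relations: every context variable must refer to an existing variable
    # (assumed direction of the VARIABLE - CONTEXT VARIABLE relation).
    for ctx in context_variables:
        if ctx.get("variable_id") not in seen:
            problems.append(
                f"context variable {ctx.get('id')}: unknown variable {ctx.get('variable_id')}"
            )
    return problems


# Made-up records, as they might be read from the Excel template.
variables = [
    {"id": "V1", "name": "turnover", "definition": "Net turnover"},
    {"id": "V2", "name": "employees", "definition": ""},
]
context_variables = [
    {"id": "C1", "variable_id": "V1"},
    {"id": "C2", "variable_id": "V9"},
]
problems = check_metadata(variables, context_variables)
```

Returning a list of messages rather than raising on the first problem fits the workflow on the slide, where all findings go back for corrections before the bulk import.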
20
Screenshot
21
Screenshot: Metadata attributes of a Data design
22
Screenshot: and this is how we store the statistical datasets