StatLine 4 metadata implementation Edwin de Jonge Statistics Netherlands.


What is StatLine?
StatLine is the online output database of Statistics Netherlands.
– Primary output channel
– Contains all published data
– Current size: 1500 data cubes, 2 billion data cells, over 150 million facts
– Rich functionality, including a very good search engine

StatLine in Business Architecture
StatLine in the statistical process

What is StatLine 4?
A redesign of the current StatLine 3 dissemination software.
Reasons for the redesign:
– Improve coherence
– Changing publication policy
– Handle time dependence
– Archiving
– Many new features

StatLine coherence
Ideally: StatLine coherent & consistent
Currently (StatLine 3):
– 1500 independent data cubes
StatLine 4:
– Data cubes share metadata:
  – Centrally moderated, quality improvement
– Data cubes share data:
  – Each fact stored once
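
To make "each fact stored once" concrete, here is a minimal sketch of a shared fact store. It is not the StatLine 4 design: the key layout (variable, categories, period), the `FactStore` class, and the sample value are all assumptions made for illustration.

```python
# Hypothetical sketch: cubes reference shared facts instead of copying them.
FactKey = tuple[str, frozenset[str], str]  # (variable, categories, period)


class FactStore:
    def __init__(self) -> None:
        self._facts: dict[FactKey, float] = {}

    def put(self, variable: str, categories: list[str],
            period: str, value: float) -> FactKey:
        """Store a fact once; a second put with a different value is an error."""
        key = (variable, frozenset(categories), period)
        existing = self._facts.setdefault(key, value)
        if existing != value:
            raise ValueError(f"conflicting value for fact {key}")
        return key

    def get(self, key: FactKey) -> float:
        return self._facts[key]

    def __len__(self) -> int:
        return len(self._facts)


store = FactStore()
# Two cubes publish the same (made-up) figure; both end up holding the same key.
cube_a = [store.put("Population", ["Sex: male", "Region: Utrecht"], "2005", 100.0)]
cube_b = [store.put("Population", ["Region: Utrecht", "Sex: male"], "2005", 100.0)]
```

Because the categories are held in a `frozenset`, the order in which a cube lists them does not create a second copy of the fact.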

StatLine 4 metadata management
Metadata management is centralized:
– What? Conceptual metadata:
  – Classifications
  – Variables
– By whom? Two organizational units:
  1. Coordination: maintaining the structure and meaning of classifications
  2. Dissemination: textual editing and translations
– Data producers own the data, but not the metadata.
– Result: every fact in StatLine 4 uses central classifications.

StatLine in Business Architecture
StatLine in the statistical process

Classification status
In StatLine 4 each classification has a status:
– (Inter)national standard
– Coordinated
  – Within Statistics Netherlands
– Shared
  – Shared but not coordinated
– Private
  – Can only be used by 1 data cube
  – Only during conversion
This status is used for coordination purposes.
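
The four statuses and the "private means one data cube" rule can be sketched as follows. This is an illustrative sketch only: the `ClassificationRecord` class, its `attach` method, and the cube identifiers are hypothetical names, not part of StatLine 4.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    STANDARD = "(inter)national standard"
    COORDINATED = "coordinated within Statistics Netherlands"
    SHARED = "shared but not coordinated"
    PRIVATE = "private to one data cube"


@dataclass
class ClassificationRecord:
    name: str
    status: Status
    used_by: set[str] = field(default_factory=set)  # data cube ids (hypothetical)

    def attach(self, cube_id: str) -> None:
        """Register a data cube as a user of this classification."""
        taken = bool(self.used_by) and cube_id not in self.used_by
        if self.status is Status.PRIVATE and taken:
            # A private classification can only be used by 1 data cube.
            raise ValueError(f"{self.name!r} is private and already in use")
        self.used_by.add(cube_id)


region = ClassificationRecord("Region", Status.STANDARD)
region.attach("cube-1")
region.attach("cube-2")  # fine: standards are meant to be reused
```

Attaching a second cube to a `PRIVATE` record raises `ValueError`, which is one way the status could support coordination during conversion.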

Cristal model
StatLine 4 uses the Cristal model:
– Model for classifications and variables (Van Bracht et al.)
– Focus on the conceptual and value domain (ISO/IEC 11179)
Model elements:
– Category (value):
  – Value of a variable; creates a subpopulation, e.g. male (gender: male)
  – Can be part of another category (partial order)
– Level:
  – Set of disjoint categories
  – Equals a “flat” classification

Cristal model (2)
– Hierarchy:
  – Sequence of levels (total order) with contained categories
  – Every category in a hierarchy has 1 parent in the higher level
  – Equals a “hierarchical” classification
– Classification:
  – Set of hierarchies with contained levels and categories
  – Equals a family of hierarchical classifications
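
The model elements listed so far (category, level, hierarchy, classification) map naturally onto a few small types. The sketch below is one possible reading, not the Cristal implementation: the class and field names are assumptions, and "1 parent in the higher level" is interpreted as a parent in the level directly above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Category:
    code: str
    label: str


@dataclass
class Level:
    """A set of disjoint categories, i.e. a flat classification."""
    name: str
    categories: list[Category]


@dataclass
class Hierarchy:
    """Levels ordered from highest to lowest, plus child -> parent links."""
    levels: list[Level]
    parent: dict[str, str] = field(default_factory=dict)  # child code -> parent code

    def validate(self) -> None:
        """Every category below the top level needs a parent one level up."""
        for upper, lower in zip(self.levels, self.levels[1:]):
            upper_codes = {c.code for c in upper.categories}
            for cat in lower.categories:
                if self.parent.get(cat.code) not in upper_codes:
                    raise ValueError(
                        f"{cat.code} has no parent in level {upper.name!r}")


@dataclass
class Classification:
    """A family of hierarchies that share levels and categories."""
    name: str
    hierarchies: list[Hierarchy]


country = Level("Country", [Category("NL", "Netherlands")])
province = Level("Province", [Category("UT", "Utrecht"),
                              Category("GR", "Groningen")])
by_province = Hierarchy([country, province], {"UT": "NL", "GR": "NL"})
by_province.validate()
region = Classification("Region", [by_province])
```

Storing the parent links in a dictionary keyed by child code guarantees that a category has at most one parent; `validate` checks that it has exactly one and that it sits in the level above.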

Cristal model (3)
– Classification versioning
  – Each metadata object has a lifetime (begin and end date)
  – Each metadata object can have a predecessor and a successor
  – Models versions of categories, levels and hierarchies
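
Lifetimes plus predecessor and successor links are enough to answer "which version was valid on a given date". A minimal sketch, assuming an open-ended lifetime is represented by a missing end date; the `Version` class, the helper functions, and the codes and dates in the example are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(eq=False)  # versions reference each other, so compare by identity
class Version:
    code: str
    begin: date
    end: Optional[date] = None          # None: still valid (assumption)
    predecessor: Optional[Version] = None
    successor: Optional[Version] = None

    def valid_on(self, day: date) -> bool:
        return self.begin <= day and (self.end is None or day <= self.end)


def link(old: Version, new: Version) -> None:
    """Record that `new` succeeds `old`."""
    old.successor = new
    new.predecessor = old


def version_on(first: Version, day: date) -> Optional[Version]:
    """Walk the successor chain and return the version valid on `day`."""
    current: Optional[Version] = first
    while current is not None:
        if current.valid_on(day):
            return current
        current = current.successor
    return None


v2005 = Version("REGION-2005", date(2005, 1, 1), date(2005, 12, 31))
v2006 = Version("REGION-2006", date(2006, 1, 1))
link(v2005, v2006)
```

The same structure would apply to categories, levels and hierarchies alike, since each of them is a metadata object with a lifetime.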

Cristal model (4)
Multilingual:
– All textual properties are multilingual
  – E.g. Mannelijk (Dutch) -> Male
– All metadata and tables can be shown in each defined language
– All textual properties have popular versions
  – E.g. Consumer Price Index -> Inflation
– All metadata and tables can be shown in “popular” or “expert” mode
Object class: is stored, but not coordinated (yet)
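
A textual property that varies by language and by "popular" or "expert" mode can be sketched as a lookup keyed on both. The `Text` class and the fallback to the expert wording when no popular version exists are assumptions for illustration, not documented StatLine 4 behaviour.

```python
from dataclasses import dataclass, field


@dataclass
class Text:
    # (language, mode) -> wording; mode is "expert" or "popular"
    variants: dict[tuple[str, str], str] = field(default_factory=dict)

    def get(self, language: str, mode: str = "expert") -> str:
        """Return the wording for a language and mode.

        Falls back to the expert wording of the same language (assumption).
        """
        key = (language, mode)
        if key in self.variants:
            return self.variants[key]
        return self.variants[(language, "expert")]


sex_male = Text({("nl", "expert"): "Mannelijk", ("en", "expert"): "Male"})
cpi = Text({("en", "expert"): "Consumer Price Index",
            ("en", "popular"): "Inflation"})
```

With this shape, rendering a table in another language or mode only changes the arguments passed to `get`; the underlying metadata object stays the same.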

StatLine 4 conversion
All content of the current StatLine must be converted:
– From 1500 independent cubes
– To 1500 coordinated cubes
Conversion means coordination!
– Total coordination -> very long conversion
– No coordination -> no added value
Ergo: partial classification coordination

Conversion strategy (1)
Strategy:
– Coordinate standardized metadata
– Allow non-standard metadata for a 2-year period
– Phased conversion
  – Preparation, conversion, coordination

Conversion strategy (2)
Preparation phase: until June 2006
– Collect and store standard classifications
  – E.g. Time, Region (50 versions), Age, Marital status, Sex, NACE
  – Including variations (disclosure control)
– For each data cube
  – Check usage of standard classifications
  – Non-standard classifications are marked “private”
– Define StatLine 4 structure
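
The per-cube check in the preparation phase amounts to comparing the classifications a cube uses against the collected standards. A minimal sketch, assuming classifications are matched by name; the function name and the sample cube are hypothetical, and a real check would also have to handle versions and variations.

```python
# Standard classifications named on the slide; matching by name is an assumption.
STANDARD = {"Time", "Region", "Age", "Marital status", "Sex", "NACE"}


def mark_classifications(used: list[str]) -> dict[str, str]:
    """Label each classification used by a cube as 'standard' or 'private'."""
    return {
        name: "standard" if name in STANDARD else "private"
        for name in used
    }


# Hypothetical cube that uses two standards and one home-grown grouping.
labels = mark_classifications(["Time", "Region", "Own income bands"])
private = [name for name, status in labels.items() if status == "private"]
```

The resulting list of private classifications is the backlog that the later coordination phase has to replace with coordinated metadata.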

Conversion strategy (3)
Conversion phase (June 2006):
– Convert data cube
– Add missing metadata to the metadata server
– Check conversion
Coordination phase (November 2006):
– After conversion, StatLine 4 contains coordinated and private metadata
– Within two years all private metadata must be replaced with coordinated metadata

Benefits of metadata in StatLine 4
– Coordinated classifications and variables
– Uniform naming and description
– Standard/coordinated metadata can be downloaded
– Better comparability of data
– Better search results

Future improvements
StatLine 4.1:
– Centralize population (object class) management
  – E.g. person, enterprise
  – Model populations and subpopulations
Statistical process:
– Centralize:
  – Process metadata
  – Quality metadata