1
Metadata use in the Statistical Value Chain
UNECE-Eurostat-OECD Meeting on Management of Statistical Information Systems (MSIS 2008)
Luxembourg, 7-9 April 2008
Georges Pongas, Adam Wroński
07-Apr-08
2
Content
- Introduction
- Operational Characteristics of Metadata
- Technical Characteristics of the Metadata
- Metadata types needed in the various steps of the SVC (statistical value chain)
- Conclusion
7-Apr-08 Metadata use in the Statistical Value Chain
3
Seven SVC steps
1. Expression of the need
2. Data collection design
3. Specification and development of the tools needed for the data collection
4. Data collection
5. Data editing and imputation
6. Data processing
7. Data dissemination
4
Basics
- Keep statistical notions out of the technical (implementation-oriented) characteristics of the metadata.
- Design the technical characteristics of metadata so that the same metadata structures can cover both statistical and non-statistical requirements.
5
Operational Characteristics of Metadata
- Static nature
- Long production process
- Located in various places (resources)
- Critical link with statistical data (depends on statistical data changes)
- Strong coupling of structural metadata with the statistical data
- Large number of metadata entities needed in SVC
6
Technical Characteristics of Metadata
- Terminology often complex
- Technical characteristics and statistical notions frequently mixed
7
Statistical Notions and Metadata
Examples
- Classification, keyword list and set of information related to the SDDS standard
- Correspondence table between two classifications, and table containing the links (access rights) between the user names and the statistical datasets of a database
The only difference is the context, i.e., the user interface.
Thus develop separately:
- a common set of functionalities
- the interface layer for an application
8
Metadata Technical Structure Categories
Three categories proposed:
- Simple Metadata Entities (SME)
- Binary Relationships (BR)
- Clustered Metadata Entities (CME)
9
Simple Metadata Entities (SME)
- simple key
- variable number of attributes
- appropriate for vertical type storage

                  Example 1        Example 2
Entity            NACE             user name
Entity element    (not shown)      gpongas
Attribute name    English label    phone no
Attribute value   “Mining”         (not shown)
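A minimal Python sketch of vertical type storage, assuming each attribute is held as one (entity, entity element, attribute name, attribute value) row as in the slide's example. The element key "B", the French label and the phone number are placeholders invented for illustration, and the helper name attributes_of is hypothetical.

```python
# Vertical storage: one row per attribute, so each entity element can carry
# a different number of attributes without changing the structure.
# Values marked "hypothetical" are placeholders, not taken from the slides.
rows = [
    ("NACE", "B", "English label", "Mining"),                 # element key "B" is hypothetical
    ("NACE", "B", "French label", "Industries extractives"),  # hypothetical extra attribute
    ("user name", "gpongas", "phone no", "000-0000"),         # phone value is hypothetical
]


def attributes_of(rows, entity, element):
    """Collect all attributes of one entity element into a dictionary."""
    return {
        name: value
        for ent, elem, name, value in rows
        if ent == entity and elem == element
    }


nace_item = attributes_of(rows, "NACE", "B")
user_item = attributes_of(rows, "user name", "gpongas")
print(nace_item["English label"])
print(sorted(user_item))
```

The same rows and the same helper serve both a classification and a user list; only the entity name differs.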
10
Examples of SMEs
- SDDS documents
- Dublin Core
- Classifications
- Keywords
- Administrative entities
- Programs
- Publications
11
Binary Relationships (BR)
Two types:
- Between two different entities: correspondence tables, access rights definitions
- Inside the same entity: thesauri, classification hierarchies, links between regulations, statistical documents

Example
Relationship id      UN thesaurus
First entity id      EUROPE
First entity role    Parent
Second entity id     FR
Second entity role   Child
Reason of link       Broader term
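The relationship record above could be sketched in Python as follows. The field names mirror the slide's example; the class name, the children_of helper, and the second record (an access-right link with an invented dataset id) are assumptions added for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryRelationship:
    """One link between two entity elements, with a role on each side."""
    relationship_id: str
    first_entity_id: str
    first_entity_role: str
    second_entity_id: str
    second_entity_role: str
    reason_of_link: str


relationships = [
    # Record from the slide: a link inside the same entity (a thesaurus).
    BinaryRelationship("UN thesaurus", "EUROPE", "Parent", "FR", "Child", "Broader term"),
    # Hypothetical link between two different entities (user and dataset).
    BinaryRelationship("access rights", "gpongas", "User", "DATASET_X", "Dataset", "Read access"),
]


def children_of(relationships, relationship_id, parent_id):
    """Return the ids linked as children of parent_id within one relationship."""
    return [
        r.second_entity_id
        for r in relationships
        if r.relationship_id == relationship_id
        and r.first_entity_id == parent_id
        and r.first_entity_role == "Parent"
    ]


print(children_of(relationships, "UN thesaurus", "EUROPE"))
```

Both kinds of link share one structure, so a thesaurus hierarchy and an access-rights table differ only in the relationship id and the roles.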
12
Clustered Metadata Entities (CME)
- Complex entities characterised by variable key cardinality and by references to other entities of type CME, SME and BR
- Description techniques: XML schema is appropriate
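As a rough illustration of a clustered entity, the Python sketch below parses a small XML description of a dataset definition whose key has several dimensions and which refers to other SME, BR and CME entities. All element names, attribute names and ids are invented for this example; they are not SDMX or Gesmes syntax, and the describe helper is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML for one clustered metadata entity (a dataset definition).
# The number of <dimension> elements can vary from one dataset to another.
DATASET_XML = """
<datasetDefinition id="DS_EXAMPLE">
  <key>
    <dimension ref="NACE" type="SME"/>
    <dimension ref="GEO" type="SME"/>
    <dimension ref="TIME" type="SME"/>
  </key>
  <reference ref="UN thesaurus" type="BR"/>
  <reference ref="DS_EXAMPLE_CONFIDENTIALITY" type="CME"/>
</datasetDefinition>
"""


def describe(xml_text):
    """Return the entity id, its key dimensions and its references by type."""
    root = ET.fromstring(xml_text)
    key = [d.get("ref") for d in root.findall("./key/dimension")]
    references = {}
    for node in root.iter():
        if node.get("type"):
            references.setdefault(node.get("type"), []).append(node.get("ref"))
    return {"id": root.get("id"), "key": key, "references": references}


info = describe(DATASET_XML)
print(info["key"])
print(info["references"])
```

In practice an XML schema would constrain such documents; the parsing step here only shows how the key and the cross-references can be read back.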
13
Examples
- SDMX, Gesmes definitions
- Dataset definitions
- Annotations to dataset cells
- Confidentiality definitions linked to datasets
14
Metadata in the various steps of the SVC
15
Collection Metadata
Mostly of type BR and SME.
Among others they contain:
- source agencies
- data files descriptions
- codelists
- validation rules linked to initial data checks
16
Editing, Imputation and Processing Metadata
More complex than the collection metadata (more CME entities needed).
Among others they contain:
- Dataset definitions
- Formulas, programs, scripts
- Conditional and ordinary annotations
- Dissemination feeding information
17
Dissemination Metadata
The most complex metadata types are located here. They contain almost all the previously described metadata plus their own.
Reasons for this complexity:
- Dissemination contains all the statistical domains
- It must cover all user types
- It has tight delivery deadlines
- It must offer very user-friendly navigation, presentation and extraction facilities
18
Among others dissemination metadata contain
- Sitemap description
- Release calendars
- Dataset links to publication tables
- Questionnaire definitions linked to datasets
- Units of measurement
- Ready-made queries
19
Conclusion
Separating the statistical notions (context) from the structure (functionality) of metadata
- minimises the number of structural metadata types
- consequently makes it easier to build and implement a complex statistical (metadata and data) system