REFERENCE METADATA FOR DATA TEMPLATE
Ales Capek, Eurostat
WHY ARE METADATA NEEDED?
- One of the aims of the WG is to propose what kind of metadata should be attached to the data.
- Metadata enable the assessment of the data and their quality aspects.
- Metadata are an instrument for further harmonization of the compilation of statistics globally (example: EU statistics based on EU legislation).
TYPES OF REFERENCE METADATA
- There is a variety of metadata; harmonized metadata cover only a small percentage of statistics.
- Short description: one sentence or paragraph. It can be universally valid (no need for country-specific metadata) but does not provide sufficient information.
- Structured detailed metadata (SDDS, SDMX, ESMS): provide sufficient detail but are country specific, and sometimes of a generic nature (they cover a statistical area rather than a given indicator).
- Quality profile: focus on quality dimensions, rating.
SUGGESTIONS
- Short descriptions: Eurostat short metadata can be used as a starting point.
- Detailed metadata: agree on structure.
  - SDDS: already available for the PGI but not enough detail on quality aspects.
  - SDMX: more ambitious, provides complete information on the data.
EXAMPLE – SHORT DESCRIPTION
The unemployment rate represents unemployed persons as a percentage of the labour force based on International Labour Office (ILO) definition. The labour force is the total number of people employed and unemployed. Unemployed persons comprise persons aged 15 to 74 who:
- are without work during the reference week;
- are available to start work within the next two weeks;
- and have been actively seeking work in the past four weeks or had already found a job to start within the next three months.
Data are presented in seasonally adjusted form.
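The definition in this short description reduces to a simple ratio. A minimal sketch in Python follows; the function name and the headcounts are hypothetical and chosen only for illustration (they are not Eurostat figures), and seasonal adjustment is not modelled.

```python
def unemployment_rate(employed: float, unemployed: float) -> float:
    """Return unemployed persons as a percentage of the labour force.

    The labour force is the sum of employed and unemployed persons,
    as stated in the short description.
    """
    labour_force = employed + unemployed
    if labour_force <= 0:
        raise ValueError("labour force must be positive")
    return 100.0 * unemployed / labour_force


# Hypothetical headcounts in thousands, for illustration only.
rate = unemployment_rate(employed=9_000, unemployed=1_000)
print(f"{rate:.1f} %")  # 10.0 %
```

Whether a person counts as unemployed (age 15 to 74, without work, available, actively seeking) is decided before this step; the sketch only covers the final ratio.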
EXAMPLE – EURO SDMX
Indicator Reference Metadata – full size
Compiling agency:
For any question on data and metadata, please contact:
1. Contact
   1.1. Contact organisation
   1.2. Contact organisation unit
   1.5. Contact mail address
2. Metadata update
   2.1. Metadata last certified
   2.2. Metadata last posted
   2.3. Metadata last update
3. Statistical presentation
   3.1. Data description
   3.2. Classification system
   3.3. Sector coverage
   3.4. Statistical concepts and definitions
   3.5. Statistical unit
   3.6. Statistical population
   3.7. Reference area
   3.8. Time coverage
   3.9. Base period
4. Unit of measure
5. Reference period
6. Institutional mandate
   6.1. Legal acts and other agreements
   6.2. Data sharing
7. Confidentiality
   7.1. Confidentiality - policy
   7.2. Confidentiality - data treatment
8. Release policy
   8.1. Release calendar
   8.2. Release calendar access
   8.3. User access
9. Frequency of dissemination
10. Dissemination format
   - News release
   - Publications
   - On-line database
   - Micro-data access
   - Other
11. Accessibility of documentation
   - Documentation on methodology
   - Quality documentation
12. Quality management
   - Quality assurance
   - Quality assessment
13. Relevance
   - User needs
   - User satisfaction
   - Completeness
14. Accuracy and reliability
   - Overall accuracy
   - Sampling error
   - Non-sampling error
15. Timeliness and punctuality
   - Timeliness
   - Punctuality
16. Comparability
   - Comparability - geographical
   - Comparability - over time
17. Coherence
   - Coherence - cross domain
   - Coherence - internal
18. Cost and burden
19. Data revision
   - Data revision - policy
   - Data revision - practice
20. Statistical processing
   - Source data
   - Frequency of data collection
   - Data collection
   - Data validation
   - Data compilation
   - Adjustment
21. Comment
   - Notes
   - Related Metadata
   21.3. Annex
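If a structure like this outline is agreed, a compiler could check how much of the template a given metadata record actually fills. The sketch below is plain Python, not an SDMX file format or library API; the function name `missing_concepts` and the sample record are hypothetical, and only the 21 top-level concept names are taken from the outline.

```python
# Top-level concepts of the reference metadata outline, in order.
TOP_LEVEL_CONCEPTS = [
    "Contact",
    "Metadata update",
    "Statistical presentation",
    "Unit of measure",
    "Reference period",
    "Institutional mandate",
    "Confidentiality",
    "Release policy",
    "Frequency of dissemination",
    "Dissemination format",
    "Accessibility of documentation",
    "Quality management",
    "Relevance",
    "Accuracy and reliability",
    "Timeliness and punctuality",
    "Comparability",
    "Coherence",
    "Cost and burden",
    "Data revision",
    "Statistical processing",
    "Comment",
]


def missing_concepts(metadata: dict[str, str]) -> list[str]:
    """Return the concepts that are absent or blank in a metadata record."""
    return [c for c in TOP_LEVEL_CONCEPTS if not metadata.get(c, "").strip()]


# Hypothetical, partially filled record for one indicator.
record = {
    "Contact": "Example statistical office",
    "Unit of measure": "Percentage of the labour force",
    "Frequency of dissemination": "Monthly",
}

gaps = missing_concepts(record)
print(f"{len(gaps)} of {len(TOP_LEVEL_CONCEPTS)} concepts still to be filled")
```

A flat list of top-level concepts keeps the check simple; sub-concepts such as "Sampling error" could be added as nested entries if the agreed template requires them.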