IMF Approach to Storing Metadata with Macroeconomic Statistics
UNECE Workshop on the Common Metadata Framework (Vienna, Austria, 4-6 July 2007)
Dissemination Standards Bulletin Board (DSBB)
Data standards initiative (SDDS/GDDS)
Countries’ dissemination practices
  – Information that SDDS countries provide the IMF on their dissemination practices
  – Direct links to the economic and financial data that countries disseminate under the SDDS
  – Information that GDDS countries make available to the IMF on their statistical practices
Collaboration with OECD
Dec – Agreement to use Dotstat and MetaStore to form the basis of the IMF data warehouse
Jan 07 – Software available on joint Team Foundation Server (TFS)
Feb 07 – IMF.Stat installed with the assistance of OECD
May 07 – Loaded International Financial Statistics (IFS), World Economic Outlook (WEO), and Sub-Saharan Africa Regional Economic Outlook (REO)
June 07 – Signed an MOU that supports a collaborative approach to future enhancements for the mutual benefit of both organizations
IMF.Stat Data Model

Data fact table: CouGrpID, ConceptID, DataSourceID, UnitOfMeasID, TimeFreqID, StatusID, Observation, Flag (e.g. E)
Metadata fact table: CouGrpID, ConceptID, DataSourceID, UnitOfMeasID, TimeFreqID, StatusID, MetadataID

The same dimension tables appear for both fact tables (example rows):
Dimension          Key         ID    ParentID  Code   Label
Country Group      CouGrpID    100   Null      156    Canada
Concept            ConceptID                   NGDP   Gross...
DataSource         DatSrceID   3000  Null      WEO    World...
Unit Of Measure    UofMID      10    Null      N      Nat Curr
Time & Frequency   TimeFreqID  25    Null             2004 Q1
Status             StatusID    2     Null      SHARE  Shareable

Referential metadata: MetadataID 5487, Metadata Text "Chain-linked GDP volume measures are expressed in..."
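The data model above can be sketched as a small star schema. This is only an illustration using Python's built-in sqlite3 module, not the actual IMF.Stat implementation. It assumes three of the six dimensions (the others would follow the same pattern), and the table names, the ConceptID value, and the observation value are hypothetical. The point it shows is that the metadata fact table uses the same dimension keys as the data fact table, so referential metadata can be found for an observation by joining on those keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: each has an ID, an optional ParentID, a Code and a Label.
CREATE TABLE CountryGroup  (CouGrpID   INTEGER PRIMARY KEY, ParentID INTEGER, Code TEXT, Label TEXT);
CREATE TABLE Concept       (ConceptID  INTEGER PRIMARY KEY, ParentID INTEGER, Code TEXT, Label TEXT);
CREATE TABLE TimeFrequency (TimeFreqID INTEGER PRIMARY KEY, ParentID INTEGER, Code TEXT, Label TEXT);

-- Fact tables: both are keyed by the same dimension IDs.
CREATE TABLE DataFact     (CouGrpID INTEGER, ConceptID INTEGER, TimeFreqID INTEGER,
                           Observation REAL, Flag TEXT);
CREATE TABLE MetadataFact (CouGrpID INTEGER, ConceptID INTEGER, TimeFreqID INTEGER,
                           MetadataID INTEGER);
CREATE TABLE Metadata     (MetadataID INTEGER PRIMARY KEY, MetadataText TEXT);

INSERT INTO CountryGroup  VALUES (100, NULL, '156', 'Canada');
INSERT INTO Concept       VALUES (1,   NULL, 'NGDP', 'Gross...');   -- ConceptID 1 is hypothetical
INSERT INTO TimeFrequency VALUES (25,  NULL, NULL, '2004 Q1');
INSERT INTO DataFact      VALUES (100, 1, 25, 1234.5, 'E');         -- observation value is made up
INSERT INTO MetadataFact  VALUES (100, 1, 25, 5487);
INSERT INTO Metadata      VALUES (5487, 'Chain-linked GDP volume measures are expressed in...');
""")


def lookup(conn, country_label, concept_code, period_label):
    """Return (Observation, Flag, MetadataText) rows for one dimension combination."""
    return conn.execute(
        """
        SELECT d.Observation, d.Flag, m.MetadataText
        FROM DataFact d
        JOIN CountryGroup  cg ON cg.CouGrpID   = d.CouGrpID
        JOIN Concept       c  ON c.ConceptID   = d.ConceptID
        JOIN TimeFrequency tf ON tf.TimeFreqID = d.TimeFreqID
        LEFT JOIN MetadataFact mf
               ON mf.CouGrpID = d.CouGrpID
              AND mf.ConceptID = d.ConceptID
              AND mf.TimeFreqID = d.TimeFreqID
        LEFT JOIN Metadata m ON m.MetadataID = mf.MetadataID
        WHERE cg.Label = ? AND c.Code = ? AND tf.Label = ?
        """,
        (country_label, concept_code, period_label),
    ).fetchall()


rows = lookup(conn, "Canada", "NGDP", "2004 Q1")
```

The LEFT JOIN keeps observations that have no referential metadata attached; in that case the metadata text comes back as None.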
Structural metadata
Economic concepts – mapped as many time series as possible to the Catalogue of Time Series and loaded them to IMF.Stat
Countries and groups – used the IFS version of country names and codes as the authoritative source for codes and labels
Unit – chose to combine unit and scale, e.g. Millions of US dollars
  – Storing data in native units, i.e. not trying to convert observations to a common unit
Status, Source, and Time and Frequency – reasonably straightforward so far; will become more problematic when we introduce versioning
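The unit choice described above (one combined unit-and-scale entry, with observations kept in native units) can be illustrated with a minimal sketch. The class and field names are hypothetical and are not taken from IMF.Stat; the observation value and the "Percent" unit are made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UnitOfMeasure:
    """A single dimension member that combines unit and scale."""
    unit: str                    # e.g. "US dollars"
    scale: Optional[str] = None  # e.g. "Millions"; None when there is no scale

    @property
    def label(self) -> str:
        # One combined label, as in "Millions of US dollars".
        return f"{self.scale} of {self.unit}" if self.scale else self.unit


@dataclass(frozen=True)
class Observation:
    """An observation stored as reported; no conversion to a common unit."""
    value: float
    unit: UnitOfMeasure


usd_millions = UnitOfMeasure("US dollars", "Millions")
percent = UnitOfMeasure("Percent")        # hypothetical unscaled unit
obs = Observation(12.5, usd_millions)     # value is made up
```

Here `usd_millions.label` is "Millions of US dollars", and the observation carries its unit alongside the value, so a reader interprets 12.5 through the label and the stored number is never rescaled.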
Referential Metadata
Working through existing metadata from IFS publications and production system
  – Where necessary/possible, cleaning it up, standardizing it and loading it to MetaStore
WEO – metadata sourced from the external web site, reformatted and stored in MetaStore, then exported to IMF.Stat
All referential metadata loaded to MetaStore and then exported to IMF.Stat
Data – IFS
All time series that could be mapped to the Catalogue of Time Series (CTS)
Includes
  – Exchange rates
  – Balance of Payments
  – International Investment Position
  – Real Sector Statistics
  – International Liquidity
  – Money and Banking non-SRF data
Excludes
  – Government Finance
  – Money and Banking SRF data
  – Fund Accounts
    » 191 concepts
    » 233 countries
    » 39 groups
    » 7.6 million observations
Data – WEO
Two most recent editions
Includes series published externally as well as other series available internally
Concept – generally consistent with the CTS
Country and group – some differences in codes used, so mapped where possible; some groups added
Unit – limited number of units used and mainly consistent across countries
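Mapping source codes to the authoritative list "where possible" can be sketched as a lookup that sets aside anything it cannot map for later review. The source-side codes ("CAN", "USA", "XYZ") and the function name are hypothetical; only the target codes 156 (Canada) and 111 (USA) appear elsewhere in these slides.

```python
def map_codes(source_codes, mapping):
    """Split source codes into those with an authoritative code and those without."""
    mapped, unmapped = {}, []
    for code in source_codes:
        if code in mapping:
            mapped[code] = mapping[code]
        else:
            unmapped.append(code)  # left for case-by-case review
    return mapped, unmapped


# Hypothetical source codes mapped to authoritative country codes.
source_to_authoritative = {"CAN": "156", "USA": "111"}

mapped, unmapped = map_codes(["CAN", "USA", "XYZ"], source_to_authoritative)
```

Keeping the unmapped codes in a separate list, instead of dropping them or guessing, matches the approach of loading what maps cleanly and resolving the rest later.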
Data – REO
Sub-Saharan Africa REO – Structural Metadata
Concepts – virtually no codes or labels in common with the CTS
  – Able to map the series published in the REO, but the supporting series proved too difficult
  – Now working through them case by case to determine which, if any, map to the CTS
Country and group – country codes and labels mainly consistent with WEO; groups all differ, even though they sometimes have the same label
Units – mainly ratios, which were added to the authoritative list
Sub-Saharan Africa REO – Referential Metadata
Have sourced top-level referential metadata only
Will work with the Africa Department after the data are loaded to identify any usable referential metadata
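Because REO and WEO groups can share a label yet differ in content, a label match alone is not a safe basis for mapping groups. One possible check, sketched here under the assumption that group membership lists are available, compares the member sets behind each shared label. The group labels and member codes below are placeholders, not real REO or WEO data.

```python
def compare_groups(reo_groups, weo_groups):
    """For labels present in both sources, report whether membership also matches."""
    same, different = [], []
    for label, members in reo_groups.items():
        if label not in weo_groups:
            continue  # no label clash, so nothing to check here
        if frozenset(members) == frozenset(weo_groups[label]):
            same.append(label)
        else:
            different.append(label)
    return same, different


# Placeholder groups and member codes, for illustration only.
reo = {"Group A": ["c1", "c2", "c3"], "Group B": ["c4", "c5"], "Group C": ["c6"]}
weo = {"Group A": ["c1", "c2"], "Group B": ["c5", "c4"]}

same, different = compare_groups(reo, weo)
```

In this example "Group B" has identical membership in both sources, while "Group A" shares only its label and would need its own entry in the authoritative list.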
MetaStore
Some modifications with assistance from OECD
  – Now includes
    » structural metadata mappings to authoritative lists
    » referential metadata
SchemaLogic
  – In future may integrate structural metadata in MetaStore or replace
Alignment with SDMX
  – Have used 42 ‘types’ to categorize our referential metadata
  – Added one to the OECD set, which are consistent with SDMX
Managing metadata within the IMF
Locate relevant sources of metadata
Locate potential warehouse content
Central repositories for data and metadata
Harmonizing and mapping to a preferred term
Authoritative lists
Working with Information Services Division (ISD) to ensure information management best practice
Assigning data stewards to manage metadata
Governance
Establishing groups and individuals with certain roles and responsibilities for management of metadata
  – Economic Data Advisory Group
    » Representation from departments across the Fund
    » Includes several working groups with specific focus
  – Information Services Division
    » Responsible for provision of metadata
  – Metadata and Standards team
    » New group in the Statistics Department currently focusing on metadata used in the data warehouse
Next Steps
Changes to work practices across the Fund
Identify a data steward for each dimension in IMF.Stat
Standardization, authoritative sources
Reuse of metadata across systems
Raise awareness of the value of quality metadata
Tie together basic schemas
Top Level Diagram
[Diagram. Elements shown: internal data sources (IFS, WEO) and external data sources (DataStream, Haver); ETL; EDW containing IMF.Stat and MetaStore, each with a user interface for end-users. Flows labelled: data flow, referential metadata, structural metadata. Dimensions shown: Concept, Country Group, Data Source, Time & Freq. Sample codes: 111 USA, 112 UK, 273 MEX; sample series key: ;USA,GDP,]