The FRB and XML: National data and International standards San Cannon Federal Reserve Board IASSIST 2005.

Presentation transcript:

The FRB and XML: National data and International standards San Cannon Federal Reserve Board IASSIST 2005

2 Background: The Fed is a statistical agency as well as a central bank and regulatory agency.
- Lots of data and information are available on the public website.
- Statistical data is varied: monthly industrial production indexes (non-financial), daily interest and exchange rates (financial), quarterly financial flows for various sectors of the economy, surveys of small businesses and consumers, etc.

3 The different roles are often competing interests... Sometimes it seems that the statistical agency role is secondary.
- Data are not always easy to find.
- Downloads are not customizable.
- Example: extracting one industrial production series requires two text files, cutting and pasting, reformatting….
- All-or-nothing approach.
- Complete: yes. User friendly: no.

4 Other agencies making great strides:
- Bureau of Economic Analysis has wonderful tabling capabilities.
- Bureau of Labor Statistics has query screens, series select screens and frequently requested statistics.

5 Taking an extra step: We wanted to build something forward looking; XML was identified early on.
- Most flexible, and seems to be the trend for the future.
- Financial data already heading that way: FinXML, FpML (Financial products Markup Language), MDDL (Market Data Definition Language), XBRL (eXtensible Business Reporting Language).

6 How do we do it?
Build our own XML definitions:
- Pro: would fit our data perfectly
- Con: we’d be the only ones
Use financial definitions:
- Pro: lots of others use them
- Con: we have nonfinancial data
Try SDMX (Statistical Data and Metadata eXchange):
- Pro: designed for time series data
- Con: new kid on the block

7 But nothing goes smoothly at first: SDMX is based on ‘key families’ and codelists, where every concept can be represented by a code with a corresponding definition in a list:
- HBBA: Int. Rate, Official, Discount rate/Base rate
- HBCA: Int. Rate, Official, Intra-day loans
- SCBA: Indust. Production, Motor vehicles, NSA
- SCBB: Indust. Production, Motor vehicles, SA
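
A codelist of this kind can be pictured as a plain lookup table from code to definition. The Python sketch below uses only the four codes shown on the slide; the names `TOPIC_CODELIST` and `describe` are hypothetical and are not part of SDMX or of the Board's system.

```python
# Hypothetical in-memory codelist holding the four codes shown on the slide.
TOPIC_CODELIST = {
    "HBBA": "Int. Rate, Official, Discount rate/Base rate",
    "HBCA": "Int. Rate, Official, Intra-day loans",
    "SCBA": "Indust. Production, Motor vehicles, NSA",
    "SCBB": "Indust. Production, Motor vehicles, SA",
}


def describe(code: str) -> str:
    """Return the definition for a code, or raise ValueError if it is not listed."""
    try:
        return TOPIC_CODELIST[code]
    except KeyError:
        raise ValueError(f"unknown code: {code}") from None
```

Each code stands alone here: nothing in `SCBA` says it is related to `SCBB` except the definitions themselves.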

8 We think about data differently: The Fed uses mnemonic series names where each character in our series name has meaning and names are hierarchical.
RIFSPFF_N.B:
- R.*: Rate
- R.I.*: Rate of interest in money and capital markets
- R.I.F.*: Federal Reserve System
- R.I.F.S.*: Short-term or money market
- R.I.F.S.P.*: Private securities
- R.I.F.S.P.FF.: Federal funds
- _N.: Not seasonally adjusted
- .B: Business (Five days, Monday-Friday)
JQI_I02Y3361T3_N.M:
- J.*: Indices except of prices
- J.Q.*: Production
- J.Q.I.: Industrial
- _I.*: NAICS-based industry classification
- 02Y: codes from year
- 3361: Motor Vehicle Manufacturing
- T: thru
- 3363: Motor Vehicle Parts Manufacturing
- _N.: Not seasonally adjusted
- .M: Monthly
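
To make the hierarchy concrete, here is a minimal Python sketch that reads a mnemonic from left to right and collects the meaning of every prefix it recognizes. The definitions are copied from the slide; the table names and the `decode` function are hypothetical, and the sketch assumes names of the form `STEM_ADJUSTMENT.FREQUENCY`.

```python
# Prefix meanings copied from the slide; all names below are hypothetical.
PREFIX_MEANINGS = {
    "R": "Rate",
    "RI": "Rate of interest in money and capital markets",
    "RIF": "Federal Reserve System",
    "RIFS": "Short-term or money market",
    "RIFSP": "Private securities",
    "RIFSPFF": "Federal funds",
}
ADJUSTMENT = {"N": "Not seasonally adjusted"}
FREQUENCY = {"B": "Business (Five days, Monday-Friday)", "M": "Monthly"}


def decode(series: str) -> list[str]:
    """Expand a mnemonic such as 'RIFSPFF_N.B' into its chain of meanings."""
    body, frequency = series.rsplit(".", 1)   # split off the frequency suffix
    stem, adjustment = body.rsplit("_", 1)    # split off the adjustment suffix
    # Every leading substring of the stem that has a definition adds one level.
    meanings = [
        PREFIX_MEANINGS[stem[:i]]
        for i in range(1, len(stem) + 1)
        if stem[:i] in PREFIX_MEANINGS
    ]
    meanings.append(ADJUSTMENT[adjustment])
    meanings.append(FREQUENCY[frequency])
    return meanings
```

Calling `decode("RIFSPFF_N.B")` walks from "Rate" down to "Federal funds" and then appends the adjustment and frequency, so the number of levels depends on the series itself.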

9 Fitting a square peg in a round hole….
Data described by a fixed number of concepts are much easier to represent with key family dimensions and attributes:
- Q.SCBA.GB.92 → Freq.Topic.Country.BIS code
- M.HBBA.US.01 → Freq.Topic.Country.BIS code
Hierarchical relationships and a varying number of concepts make life more difficult; a single key family isn’t possible:
- JQI_I02YMF_N.M → Topic_Industry_SA.Freq
- RIFSPPNA2P2D30_N.B → Topic?_SA.Freq
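
The contrast can be sketched in a few lines of Python. A key with a fixed number of dot-separated positions maps directly onto named dimensions, while a Fed mnemonic does not fit the same pattern. The dimension order follows the slide; `DIMENSIONS` and `split_key` are hypothetical names used only for illustration.

```python
# Dimension order taken from the slide's "Freq.Topic.Country.BIS code" pattern.
DIMENSIONS = ("Freq", "Topic", "Country", "BIS code")


def split_key(key: str) -> dict[str, str]:
    """Map a dot-separated key onto the fixed dimensions of one key family."""
    parts = key.split(".")
    if len(parts) != len(DIMENSIONS):
        raise ValueError(
            f"{key!r} has {len(parts)} parts, expected {len(DIMENSIONS)}"
        )
    return dict(zip(DIMENSIONS, parts))
```

`split_key("Q.SCBA.GB.92")` yields one value per dimension, whereas `split_key("JQI_I02YMF_N.M")` raises `ValueError` because the mnemonic has only two dot-separated parts.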

10 SDMX only provides a framework:
- We still needed to build the actual schemas to describe our data within the SDMX metaschema framework.
- Each data release uses its own schema or set of schemas. Each schema is based on a key family used to describe the data.
- Currently, our schemas are tailored to meet our data needs.

11 Storage adds further complications:
- We need to store data and metadata in a database to be retrieved with queries.
- Native XML databases are in their infancy.
- We couldn’t find many people storing XML-tagged data in relational databases.

12 So what did we end up with?
- Data model is hybrid: tree structure flattened to fit codelist setup.
- We store the XML as carefully sliced text in a relational database and we can build an index structure that allows us to respond to ad-hoc queries very efficiently, even for large volumes of data.
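
The general idea of sliced XML text plus an index can be illustrated with Python's built-in `sqlite3` module. This is only a sketch under stated assumptions, not the Board's actual schema: the table layout, the `<Obs>` element, the function names and the sample values are all hypothetical placeholders.

```python
import sqlite3


def build_store() -> sqlite3.Connection:
    """Create an in-memory table of XML slices, indexed by series and period."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fragment ("
        "series TEXT NOT NULL, period TEXT NOT NULL, xml TEXT NOT NULL)"
    )
    # The index lets a query jump straight to one series and a range of periods.
    conn.execute("CREATE INDEX fragment_series_period ON fragment (series, period)")
    return conn


def add_fragment(conn: sqlite3.Connection, series: str, period: str, xml: str) -> None:
    """Store one pre-sliced piece of XML text for a series and period."""
    conn.execute("INSERT INTO fragment VALUES (?, ?, ?)", (series, period, xml))


def extract(conn: sqlite3.Connection, series: str, start: str, end: str) -> str:
    """Reassemble the stored slices for one series over a period range."""
    rows = conn.execute(
        "SELECT xml FROM fragment "
        "WHERE series = ? AND period BETWEEN ? AND ? ORDER BY period",
        (series, start, end),
    )
    return "".join(row[0] for row in rows)


# Placeholder observations; the values are made up for illustration only.
store = build_store()
add_fragment(store, "RIFSPFF_N.B", "2005-01", '<Obs period="2005-01" value="0.0"/>')
add_fragment(store, "RIFSPFF_N.B", "2005-02", '<Obs period="2005-02" value="0.0"/>')
add_fragment(store, "RIFSPFF_N.B", "2005-03", '<Obs period="2005-03" value="0.0"/>')
subset = extract(store, "RIFSPFF_N.B", "2005-02", "2005-03")
```

Because each slice is already valid XML text, a customized extract is a matter of selecting rows and concatenating them; nothing has to be parsed at query time.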

13 This kind of structure:

14 Looks like this in SDMX-ML: Commercial Paper Outstandings

15 Which gets stored like this:

16 And the end result?
- The Data Download Project (DDP) is the largest, most complex application on the Board’s public website.
- It’s also the first production application to deliver customized data extracts in SDMX format.
- And now……. Version 1.0!

36 Next steps…
- Test performance and verify server load capabilities.
- Polish the interface, do usability testing and verify compliance with Section 508 regulations.
- Long run: work with other central banks on a common schema framework.
- Release on the unsuspecting public! Target: third quarter 2005.

37 The last slide… Questions? Comments? Thank you for your attention! San Cannon (202)