1 The FRB and XML: National data and International standards
San Cannon, Federal Reserve Board
IASSIST 2005
2 Background: The Fed is a statistical agency as well as a central bank and regulatory agency.
- Lots of data and information are available on the public website.
- Statistical data is varied: monthly industrial production indexes (non-financial), daily interest and exchange rates (financial), quarterly financial flows for various sectors of the economy, surveys of small businesses and consumers, etc.
3 The different roles are often competing interests... Sometimes it seems that the statistical agency role is secondary.
- Data are not always easy to find.
- Downloads are not customizable.
  - Example: Trying to extract one industrial production series requires two text files, cutting and pasting, reformatting...
- All-or-nothing approach.
- Complete: yes. User friendly: no.
4 Other agencies making great strides:
- Bureau of Economic Analysis has wonderful tabling capabilities: www.bea.gov
- Bureau of Labor Statistics has query screens, series select screens and frequently requested statistics: www.bls.gov
5 Taking an extra step: We wanted to build something forward looking; XML was identified early on.
- Most flexible and seems to be the trend for the future.
- Financial data already heading that way: FinXML, FpML (Financial products Markup Language), MDDL (Market Data Definition Language), XBRL (eXtensible Business Reporting Language)
6 How do we do it?
- Build our own XML definitions:
  - Pro: would fit our data perfectly
  - Con: we’d be the only ones
- Use financial definitions:
  - Pro: lots of others use them
  - Con: we have nonfinancial data
- Try SDMX (Statistical Data and Metadata eXchange):
  - Pro: designed for time series data
  - Con: new kid on the block
7 But nothing goes smoothly at first: SDMX is based on ‘key families’ and codelists where every concept can be represented by a code with a corresponding definition in a list:
HBBA   Int. Rate, Official, Discount rate/Base rate
HBCA   Int. Rate, Official, Intra-day loans
SCBA   Indust. Production, Motor vehicles, NSA
SCBB   Indust. Production, Motor vehicles, SA
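The codelist idea on this slide can be sketched as a plain lookup table. This is only an illustration, not the SDMX data structure itself: the four codes and definitions come from the slide, while the names `TOPIC_CODELIST` and `describe` are hypothetical.

```python
# Hypothetical codelist: each code maps to its definition.
# The four entries are the ones listed on the slide.
TOPIC_CODELIST = {
    "HBBA": "Int. Rate, Official, Discount rate/Base rate",
    "HBCA": "Int. Rate, Official, Intra-day loans",
    "SCBA": "Indust. Production, Motor vehicles, NSA",
    "SCBB": "Indust. Production, Motor vehicles, SA",
}


def describe(code: str) -> str:
    """Return the definition for a code, or raise if the code is not listed."""
    try:
        return TOPIC_CODELIST[code]
    except KeyError:
        raise ValueError(f"unknown topic code: {code!r}") from None


if __name__ == "__main__":
    print(describe("SCBB"))
```

A code carries no meaning by itself; everything a reader learns about a series comes from looking the code up in the list.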
8 We think about data differently
- The Fed uses mnemonic series names where each character in our series name has meaning and names are hierarchical.

RIFSPFF_N.B:
R.*: Rate
R.I.*: Rate of interest in money and capital markets
R.I.F.*: Federal Reserve System
R.I.F.S.*: Short-term or money market
R.I.F.S.P.*: Private securities
R.I.F.S.P.FF.: Federal funds
_N.: Not seasonally adjusted
.B: Business (Five days, Monday-Friday)

JQI_I02Y3361T3_N.M:
J.*: Indices except of prices
J.Q.*: Production
J.Q.I.: Industrial
_I.*: NAICS-based industry classification
02Y: codes from year 2002
3361.: Motor Vehicle Manufacturing
T: thru
3363: Motor Vehicle Parts Manufacturing
_N.: Not seasonally adjusted
.M: Monthly
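One way to picture the hierarchy in a mnemonic name is to walk its prefixes and look each one up. The sketch below covers only the interest-rate example from the slide. The lookup tables hold the meanings given there, and the splitting rule (stem, then `_` adjustment, then `.` frequency) is an assumption inferred from that one example, not a documented naming rule.

```python
# Meanings taken from the slide's RIFSPFF_N.B example; the tables are hypothetical.
PREFIX_MEANINGS = {
    "R": "Rate",
    "RI": "Rate of interest in money and capital markets",
    "RIF": "Federal Reserve System",
    "RIFS": "Short-term or money market",
    "RIFSP": "Private securities",
    "RIFSPFF": "Federal funds",
}
ADJUSTMENT = {"N": "Not seasonally adjusted"}
FREQUENCY = {"B": "Business (Five days, Monday-Friday)", "M": "Monthly"}


def explain(series_name: str) -> list[str]:
    """List the meaning of each hierarchy level, then adjustment and frequency.

    Assumes the form STEM_ADJ.FREQ, which is a guess based on the slide.
    """
    body, freq = series_name.rsplit(".", 1)
    stem, adj = body.rsplit("_", 1)
    # Every leading substring of the stem that has a known meaning is one level.
    levels = [
        PREFIX_MEANINGS[stem[:i]]
        for i in range(1, len(stem) + 1)
        if stem[:i] in PREFIX_MEANINGS
    ]
    return levels + [ADJUSTMENT[adj], FREQUENCY[freq]]


if __name__ == "__main__":
    for line in explain("RIFSPFF_N.B"):
        print(line)
```

Unlike a codelist lookup, the number of levels depends on the name itself, which is what makes these names awkward to map onto a fixed set of dimensions.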
9 Fitting a square peg in a round hole...
- Data represented by a fixed number of concepts are much easier to represent with key family dimensions and attributes:
  Q.SCBA.GB.92 → Freq.Topic.Country.BIS code
  M.HBBA.US.01 → Freq.Topic.Country.BIS code
- Hierarchical relationships and a varying number of concepts make life more difficult; a single key family isn’t possible:
  JQI_I02YMF_N.M → Topic_Industry_SA.Freq
  RIFSPPNA2P2D30_N.B → Topic?_SA.Freq
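The contrast on this slide can be made concrete with a small sketch. A key with a fixed number of dot-separated parts maps directly onto named dimensions, while a mnemonic series name does not fit the same template. The dimension names come from the slide; the function `split_key` is hypothetical.

```python
# Dimension order as shown on the slide: Freq.Topic.Country.BIS code
DIMENSIONS = ("Freq", "Topic", "Country", "BIS code")


def split_key(key: str) -> dict[str, str]:
    """Map a dot-separated key onto the fixed dimension list.

    Raises ValueError when the key does not have one part per dimension.
    """
    parts = key.split(".")
    if len(parts) != len(DIMENSIONS):
        raise ValueError(
            f"{key!r} has {len(parts)} parts, expected {len(DIMENSIONS)}"
        )
    return dict(zip(DIMENSIONS, parts))


if __name__ == "__main__":
    print(split_key("Q.SCBA.GB.92"))
    try:
        split_key("JQI_I02YMF_N.M")  # mnemonic name, not a four-part key
    except ValueError as err:
        print("does not fit:", err)
```

The second call fails because the mnemonic name packs a variable number of concepts into its first segment, so one fixed dimension list cannot describe every series.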
10 SDMX only provides a framework:
- We still needed to build the actual schemas to describe our data within the SDMX metaschema framework.
- Each data release uses its own schema or set of schemas. Each schema is based on a key family used to describe the data.
- Currently, our schemas are tailored to meet our data needs.
11 Storage adds further complications:
- We need to store data and metadata in a database to be retrieved with queries.
- Native XML databases are in their infancy.
- We couldn’t find many people storing XML-tagged data in relational databases.
12 So what did we end up with?
- Data model is hybrid: tree structure flattened to fit codelist setup.
- We store the XML as carefully sliced text in a relational database and we can build an index structure that allows us to respond to ad-hoc queries very efficiently, even for large volumes of data.
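The slides do not show the actual tables, so the following is only a rough sketch of the general idea of keeping XML fragments as text in a relational table with an index over the fields used for lookup. The table name, column names, and the `<Obs>` fragments with placeholder values are all hypothetical, and SQLite stands in for whatever database was really used.

```python
import sqlite3


def build_store() -> sqlite3.Connection:
    """Create an in-memory table of XML text slices plus a lookup index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE series_fragment ("
        " series_name TEXT NOT NULL,"
        " period TEXT NOT NULL,"
        " xml_text TEXT NOT NULL)"
    )
    # The index lets a query find the slices for one series and date range
    # without scanning every stored fragment.
    conn.execute(
        "CREATE INDEX idx_series_period"
        " ON series_fragment (series_name, period)"
    )
    return conn


def add_fragment(conn, series_name, period, xml_text):
    """Store one pre-tagged slice of XML as plain text."""
    conn.execute(
        "INSERT INTO series_fragment VALUES (?, ?, ?)",
        (series_name, period, xml_text),
    )


def extract(conn, series_name, start, end):
    """Concatenate the stored slices for one series within a period range."""
    rows = conn.execute(
        "SELECT xml_text FROM series_fragment"
        " WHERE series_name = ? AND period BETWEEN ? AND ?"
        " ORDER BY period",
        (series_name, start, end),
    )
    return "".join(row[0] for row in rows)


if __name__ == "__main__":
    conn = build_store()
    # Placeholder fragments; element name and values are made up for the demo.
    for period in ("2005-01", "2005-02", "2005-03"):
        add_fragment(conn, "RIFSPFF_N.B", period, f'<Obs period="{period}"/>')
    print(extract(conn, "RIFSPFF_N.B", "2005-02", "2005-03"))
```

Because the fragments are already tagged, a custom extract in this sketch is just a matter of selecting the right slices and concatenating them, with no XML parsing at query time.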
13 This kind of structure:
14 Looks like this in SDMX-ML: Commercial Paper Outstandings
15 Which gets stored like this:
16 And the end result?
- The Data Download Project (DDP) is the largest, most complex application on the Board’s public website.
- It’s also the first production application to deliver customized data extracts in SDMX format.
And now... Version 1.0!
36 Next steps...
- Test performance and verify server load capabilities.
- Polish interface, do usability testing and verify compliance with Section 508 regulations.
- Long run: work with other central banks on common schema framework.
- Release on the unsuspecting public! Target: Third quarter 2005
37 The last slide...
Questions? Comments? Thank you for your attention!
San Cannon
scannon@frb.gov
(202) 452-3710