Statistical Data Exchange Platform

Presentation on theme: "Statistical Data Exchange Platform"— Presentation transcript:

1 Statistical Data Exchange Platform
MarketMap Analytic Platform

2 Agenda
Introduction to MarketMap Analytic Platform; Sample Statistical Data Exchange Platform; Economic Data Manager; SDMX Driven Roadmap

3 MarketMap Analytic Platform
[Architecture diagram of the MarketMap Analytic Platform. Labels: USERS; APPLICATIONS; MarketMap Analytic Language; C; Available 3rd Party Interfaces; Out-of-the-box development and user interfaces; Web Services; Data Loaders; Forecasting, Analysis & Modeling Environment; Fast DB for time series data storage; Built-in analytical platform; Time intelligence; Web Reports; Onsite Server; SQL Access; Managed Data Services; Pathfinder; Cross Symbology]

4 Key, Value Pair Time Series Data Storage
Vector object database with coupled analytical engine. Store, retrieve, and manipulate large numbers of rapidly accessible facts: ibm.close, sp500.TotalReturn, PCT(s sales). Apply a structured programming language geared towards manipulation of vector objects. [Diagram labels: Btree; Data]
The MarketMap Analytic Database is specifically designed for storage of time series data and treats each time series as a separate object. These time series objects have a set of metadata attributes that describe the objects and provide information used in the manipulation of time series. The database is designed for optimal performance when working with time series data – whether correlating and regressing one issue's pricing series against another or slicing through multiple issues' prices and results as of a point in time.
The database consists of two parts, the database header and the database body. The database header is a B*-tree that stores the unique identifier for each object, the object name, and the metadata. The body holds the observations themselves along with the attributes associated with the object, and it is always stored contiguously on disk.
A coupled analytic engine supports a built-in scripting language that lets you perform powerful analytical operations with very little code, which allows rapid development as well as advanced ad hoc analysis and prototyping.
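The header/body split described above can be pictured with a toy model. This is only a sketch under stated assumptions: a Python dict stands in for the B*-tree header, a flat list stands in for the contiguous body, and the class and method names (`ToySeriesStore`, `write`, `read`) are hypothetical, not part of the MarketMap or FAME API.

```python
from dataclasses import dataclass


@dataclass
class HeaderEntry:
    """Header record: object name, metadata, and where its observations live."""
    name: str
    meta: dict
    offset: int  # index of the first observation in the body
    length: int  # number of observations


class ToySeriesStore:
    """Toy stand-in: a dict plays the header index, a list plays the body."""

    def __init__(self):
        self.header = {}
        self.body = []

    def write(self, name, values, **meta):
        key = name.lower()  # assumption: object names are case-insensitive
        self.header[key] = HeaderEntry(key, meta, len(self.body), len(values))
        self.body.extend(values)  # each object's observations sit next to each other

    def read(self, name):
        entry = self.header[name.lower()]
        return self.body[entry.offset:entry.offset + entry.length]


store = ToySeriesStore()
store.write("ibm.close", [100.0, 110.0, 99.0], frequency="daily")
store.write("sp500.TotalReturn", [1.0, 1.01], frequency="daily")
print(store.read("IBM.CLOSE"))
print(store.header["ibm.close"].meta)
```

The point of the layout is that a lookup touches the small header first and then reads one contiguous slice of the body.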

5 Analytic layer with an embedded concept of time
Regular frequencies: events recorded every millisecond, daily, monthly, etc. Pattern frequencies: specialized, but regular (e.g. market hours). Aperiodic: for event-driven data capture (trades, corporate actions). Able to automatically convert frequencies.
Not all data must be physically stored on disk. Some objects can be virtual formulas that are evaluated at run time based on an expression. Whether an object is stored on disk as a hard physical object or is a virtual soft object, it can take advantage of the built-in time series analytics in the database.
Embedded in the system is an inherent notion of time that makes it easy to take data of disparate periodicities and frequencies and line them up on a consistent basis. For example, evaluating the global formula PE could take prices from a business-day database and divide them by EPS numbers from a quarterly or annual database. This ratio, combining daily and annual data, could then be iterated over time on a monthly basis. This automatic frequency conversion comes "for free" with our ADBMS.
[Diagram labels: Global Formulas; Business; Monthly; Quarterly; Return; Automatic Frequency Conversions]
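The PE example above can be sketched in a few lines. This is an illustration only: the conversion rule (take the last daily price in each month, and hold the annual EPS constant through its year) is an assumption made here, and the function names and sample values are hypothetical.

```python
from datetime import date


def to_monthly_last(daily):
    """Collapse daily observations to monthly, keeping the last value of each month."""
    monthly = {}
    for day in sorted(daily):
        monthly[(day.year, day.month)] = daily[day]  # later days overwrite earlier ones
    return monthly


def pe_monthly(daily_prices, annual_eps):
    """Evaluate price / EPS on a monthly basis from daily prices and annual EPS."""
    monthly = to_monthly_last(daily_prices)
    return {
        (year, month): price / annual_eps[year]
        for (year, month), price in monthly.items()
        if year in annual_eps
    }


prices = {
    date(2005, 1, 3): 90.0,
    date(2005, 1, 31): 100.0,
    date(2005, 2, 28): 110.0,
}
eps = {2005: 5.0}
pe = pe_monthly(prices, eps)
print(pe)
```

In the platform this alignment is described as automatic; the sketch just makes the two hidden steps (convert each input to the target frequency, then apply the formula) explicit.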

6 Multiple Analytic Databases house all required data
[Diagram: one MarketMap Analytic database persists historical market data objects; another persists historical macroeconomic objects.]
The MarketMap Analytic Database can be thought of as a pure container of independent objects. As discussed previously, each object is in turn a b-tree vector. Some objects are very simple – a scalar object representing the name of an issue. Others are larger, for example storing all prices for an issue since it went public. It's easy to combine, multiply, divide, correlate and regress one object against another. One can also open multiple containers on the server and let MAP locate the appropriate object based on its name – whether the US consumer price index values in an object named USA.CPI or IBM's daily trading volume values in an object named IBM.VOLUME.
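Name-based lookup across several open containers can be sketched as follows. The `Workspace` class, its methods, the search order (first opened, first searched) and the sample values are all assumptions for illustration; the source does not describe how MAP resolves names internally.

```python
class Workspace:
    """Holds several open containers and resolves an object by name."""

    def __init__(self):
        self._databases = []  # (database name, {object name: object}) in open order

    def open(self, db_name, objects):
        # Assumption: object names are matched case-insensitively.
        self._databases.append((db_name, {k.upper(): v for k, v in objects.items()}))

    def locate(self, object_name):
        key = object_name.upper()
        for db_name, objects in self._databases:
            if key in objects:
                return db_name, objects[key]
        raise KeyError(object_name)


ws = Workspace()
ws.open("market", {"IBM.VOLUME": [5_000_000, 4_200_000]})
ws.open("macro", {"USA.CPI": [198.3, 198.7]})
print(ws.locate("usa.cpi"))
print(ws.locate("IBM.VOLUME"))
```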

7 MarketMap Economic Data Manager
Extract, transform and load data stored in Excel workbooks. Dual database system: time series data stored in FAME databases; metadata about these series stored in a SQL container. Web Access layer used to report and graph time series data and to display descriptive data about the series.

8 Economic Data Collected in Excel Workbooks
SunGard recently worked with a central bank where economic data was stored and analyzed by regional offices throughout the country in workbooks very similar to this one, which tracks consumer price index related data for the months of December and January. Data tended to be “locked” and “isolated” rather than centralized. In addition, it was challenging to distribute this valuable data outside of Excel and via a Web portal. This customer also asked SunGard to design an approach for storing the entity, item, time and other dimensions of economic queries with an SDMX-inspired metadata approach.

9 Common "access point" to Time Series & Metadata
Many of the reporting features seen in this demo and in other economic reporting portals come "out of the box" with a subscription to the MarketMap Analytic Platform. For example, the MarketMap accessPoint layer – a servlet-based middleware component – can accept incoming queries from a web browser and then assemble return results consisting of both relational data and time series data.
[Diagram labels: Relational Database; Client Applications; MAP Web Access Server; MAP Database; HTTP Request; Web Service (WSDL); Service Request; Remote MAP Web Access Server; Web Browser; Business Process; Downstream Output Providers; HTML / CSV / XML; Time IQ / Result Set; Proprietary Database]

10 Statistical Data Warehouse Use Case

11 Sample statistical production process
[Process diagram. Labels: Statistics Production Environment; Collect, Compile, Disseminate; Production system (FAME); SDW; ESCB-Net/EXDI; ESCB-net; publications; web; SDMX Data Model]

12 Data structures
Statistical data can be grouped together at the observation level (the measurement of some phenomenon); the series level (the measurement of some phenomenon over time, usually at regular intervals); the group level (a group of series – a well-known example being the sibling group, a set of series which are identical, except for the fact that they are measured with different frequencies); and the dataset level (made up of several groups, to cover a specific statistical domain for instance). Dimensions are grouped into keys, which allow the identification of a particular set of data (a series, for example). Key values are attached at the series level and given in a fixed sequence. Conventionally, frequency is the first descriptor concept and the other concepts are assigned an order for that particular dataset. Partial keys can be attached to groups.
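The sibling-group idea can be made concrete with a small sketch: if frequency is the first dimension of every series key, then the remaining dimensions form a partial key that identifies the group. The function name and the three-dimension sample keys below are hypothetical, chosen only to keep the example short.

```python
from collections import defaultdict


def sibling_groups(series_keys):
    """Group series keys that are identical except for the frequency.

    Assumes frequency is the first dimension of every key, as is conventional.
    The remaining dimensions act as the partial key of the group.
    """
    groups = defaultdict(list)
    for key in series_keys:
        frequency, *rest = key
        groups[tuple(rest)].append(frequency)
    return dict(groups)


keys = [
    ("M", "U2", "CPI"),  # monthly
    ("A", "U2", "CPI"),  # annual sibling of the series above
    ("M", "DE", "CPI"),
]
groups = sibling_groups(keys)
print(groups)
```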

13 Example: Monetary aggregate M3
BSI.M.U2.N.V.M30.X.I.U2.2300.Z01.A
BSI = Key Family, M = Monthly series, U2 = euro area aggregate, N = Non-adjusted, V = MFIs + Govt, M30 = M3, X = Unspecified maturity, I = Index, U2 = residence of counterpart is euro area, 2300 = other residents sector, Z01 = denominated in all currencies, A = Annual growth/change
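Because key values come in a fixed sequence, decoding a series key is a positional split. The sketch below reassembles the key from the twelve components listed in the slide's legend. The dimension names in `DIMENSIONS` are hypothetical labels chosen for this example, not identifiers taken from the key family definition.

```python
# Hypothetical dimension labels, in the order the legend lists them.
DIMENSIONS = [
    "frequency",
    "reference_area",
    "adjustment",
    "reporting_sector",
    "item",
    "maturity",
    "data_type",
    "counterpart_area",
    "counterpart_sector",
    "currency",
    "series_suffix",
]


def decode_key(key):
    """Split a dotted series key into its key family and named dimension values."""
    family, *values = key.split(".")
    if len(values) != len(DIMENSIONS):
        raise ValueError(f"expected {len(DIMENSIONS)} dimension values, got {len(values)}")
    return family, dict(zip(DIMENSIONS, values))


family, decoded = decode_key("BSI.M.U2.N.V.M30.X.I.U2.2300.Z01.A")
print(family)
for name, value in decoded.items():
    print(f"{name:20s} {value}")
```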

14 Use of an (SDMX-based) structured data format
In the exchange, storage and dissemination of all statistical data and associated metadata. In the internal system and in communication with partner institutions and the general public (web services, SDMX-ML based extractions from the web site). Covering most domains of economic statistics (e.g. monetary and financial statistics, balance of payments, price indices, short-term statistics, real sector, government finance statistics, securities, etc.).

15 Browsing for data series
Organize object names and categories based on the SDMX standard

16 Demo: Key family search
In addition to loading Excel workbooks and creating Web-friendly reports, this custom solution allows economists to SEARCH for objects and build ad hoc queries that also incorporate analytical functions from FAME. Let’s switch over to the search option to find some data of interest. This particular customer asked SunGard to organize object names based on a specific standard called SDMX, which we will review later in the presentation. Essentially, it is a standard for organizing economic content to ease the sharing of data among central banks.

17 Demo: Context sensitive selection boxes
Because the interface is SDMX-inspired, the first task in finding data is to identify its KEY FAMILY and CHAPTER – both SDMX terms. Note that the dropdown boxes dynamically change as you change your selections. The SDMX-inspired categories also expand based on the choices made. Let’s click the SEARCH button to find some series.

18 Demo: Report on the January 2006 data
Let’s switch to the PRICES menu and click on the REVIEW navigation bar on the left. Given our working example, let’s select the CPI prices category. Of course, we should work with an appropriate date range. Let’s cover a timeframe of January 2004 to January 2006. When I click REFRESH, the Fame objects are queried and the data sets are returned as XML. The XML is then styled into a table report that is consistent with this institution’s look and feel. Note the use of SDMX-inspired dimension names to name the objects. Also note that this table can be quickly exported to Excel.

19 Demo: Detailed CPI search results
As you can see, this search found 483 CPI items. Let’s go back to our working date range and then view some details about particular consumer price index related data by clicking on the edit link.

20 Demo: Select objects of interest
Now let’s add a few items to a basket so that we can begin analyzing a group of indicators. Once again, let’s establish an appropriate working date range. The basket feature is similar to the shopping cart found in many online shopping sites, like Amazon.com. It allows the user to combine results from different searches for analysis at a later time. We can click on the VIEW button to work with these individual series.

21 View objects in monthly frequency
As you can see, the data for the three series are displayed based on our date range of January 2004 to January 2006.

22 Demo: View selected objects in annual frequency
This data is monthly. However, using Fame’s built-in time intelligence, we can easily change to quarterly and annual views of data.

23 Demo: Multiple ways of aggregating data
We can also change the observed attributes – for example, whether data should be summed or averaged when analyzed over time. Using the SUMMED aggregation method yields large values, as the data within the specified frequency is summed. Using the AVERAGED method yields smaller values, as the data is averaged within the specified frequency. Fame’s built-in time intelligence provides fast and easy-to-use manipulation of economic data – no matter what the natural frequency or format of the data.
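The difference between the two aggregation methods is easy to see in a small sketch. This is plain Python, not Fame code: the function name, the `(year, month)` keying and the sample values are assumptions for illustration, and incomplete years are aggregated as-is rather than flagged.

```python
from statistics import fmean


def convert_to_annual(monthly, observed):
    """Aggregate {(year, month): value} to {year: value} by summing or averaging."""
    by_year = {}
    for (year, _month), value in sorted(monthly.items()):
        by_year.setdefault(year, []).append(value)
    if observed == "SUMMED":
        reducer = sum
    elif observed == "AVERAGED":
        reducer = fmean
    else:
        raise ValueError(f"unsupported observed attribute: {observed}")
    return {year: reducer(values) for year, values in by_year.items()}


monthly = {(2005, month): 10.0 for month in range(1, 13)}
summed = convert_to_annual(monthly, "SUMMED")
averaged = convert_to_annual(monthly, "AVERAGED")
print(summed, averaged)
```

A flow such as monthly sales is naturally summed, while a level such as a price index is naturally averaged, which is why the choice is stored as an attribute of the series.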

24 Demo: PCT
Let’s reset the values back to monthly with the original observed attributes. The Fame database contains an entire library of analytical functions. This particular page displays a subset of those functions, including moving average and annual percent change. Notice that when we click on PCT or percentage change, another column is added to the report that displays the calculation. In addition to downloading data to Excel, the Fame Web Access Layer can also load the Fame 4GL graphing engine to help visualize data. Let’s close out of this report VIEW and GRAPH the data.
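For readers who want to check the numbers in such a report by hand, the two calculations mentioned above can be sketched in plain Python. These are generic textbook definitions, not the Fame library functions themselves; the function names, the trailing-window convention and the use of `None` for periods without enough history are assumptions made here.

```python
def moving_average(values, window):
    """Trailing moving average; None until a full window of data is available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out


def pct_change(values, lag=1):
    """Percent change versus the observation `lag` periods earlier.

    With monthly data, lag=12 gives an annual (year-over-year) percent change.
    """
    return [
        None if i < lag else 100.0 * (values[i] - values[i - lag]) / values[i - lag]
        for i in range(len(values))
    ]


cpi = [100.0] * 12 + [105.0]  # thirteen monthly observations, made up for the example
print(moving_average([1.0, 2.0, 3.0, 4.0], window=2))
print(pct_change([100.0, 110.0, 121.0]))
print(pct_change(cpi, lag=12)[-1])
```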

25 Demo: Graph selected objects
The Fame 4GL graphing engine is loaded on the web server, making it possible to generate quick graphical images like this one analyzing three series over time. This was a brief overview of a solution leveraging the Fame Solution Stack and the expertise of Fame Professional Services. The Fame Solution Stack provides a means to: parse and load data stored in Excel into a central warehouse; perform in-database analytics on the data within the server; and distribute and report the data via the Web.

26 Demo: Administer user entitlements
In a centralized warehouse, it is important that supervisors be able to approve data before it is published. Click on the Administration tab to view maintenance of users with the Fame Web Access entitlement layer. After a workbook is parsed, it is loaded into a holding area where appropriate supervisors can approve the publishing of the Fame-based data to the entire enterprise. With the administration page, managers can grant access to individual users as well as individual data series. It also provides a facility to approve series that have been loaded into the system. With data now loaded and approved for distribution, we can query the Fame database.

27 SDMX Oriented Roadmap MarketMap Analytic Platform

28 Central Bank Use Case
Using FAME since 1993 not only to store time series but also to: produce statistics, from primary to derived information; disseminate statistical information, internally and on the internet, in different ways (statically and dynamically) and formats; and exchange statistical information with national and international organizations using the SDMX standard. Core tools: hundreds of functions and procedures using Fame/4GL, TimeIQ and the C-HLI. Working with FAME 9.3 and migrating to FAME 10.1.

29 High priority database improvements
Integration of FAME vectors, formulas, functions and procedures with the SDMX information model and SDMX-ML format: specific objects to manage data structure definition (DSD) information; referential integrity between series names and the DSD. Creation of a matrix object to deal with observation-level attributes: a new second sub-index referencing the attribute, so that series values and their observation-level attributes can be stored together. Concurrent database updating: the current update mechanism locks at the database level, which prevents users from updating data in the same database concurrently.
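Referential integrity between series names and the DSD could be enforced along these lines. This is a sketch of the requested feature, not an existing FAME facility: the three-dimension `DSD`, its code lists and the `check_series_name` function are hypothetical.

```python
# Hypothetical DSD: ordered dimensions, each with its list of allowed codes.
DSD = {
    "FREQ": {"M", "Q", "A"},
    "REF_AREA": {"U2", "DE"},
    "ITEM": {"M30", "M20"},
}


def check_series_name(name, dsd):
    """Return a list of problems found when checking a dotted series name against a DSD."""
    codes = name.split(".")
    if len(codes) != len(dsd):
        return [f"expected {len(dsd)} dimensions, got {len(codes)}"]
    return [
        f"{dimension}: unknown code {code!r}"
        for (dimension, allowed), code in zip(dsd.items(), codes)
        if code not in allowed
    ]


print(check_series_name("M.U2.M30", DSD))
print(check_series_name("W.U2.M30", DSD))
```

Running such a check whenever a series is created or renamed is one way to keep names and structure definitions from drifting apart.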

30 High priority end user improvements
Friendly menu-driven environment for end users and developers: take a page from SAS, EViews and others, which provide a menu-driven experience for building statistical studies; take a page from Eclipse or Visual Studio .NET, which provide an integrated development environment complete with syntax wizards, auto-completion facilities and debuggers. Attribute searching facilities: wildcard searching on the information stored in attributes that performs with the same level of efficiency as the searching facilities for series name and series alias.
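The behaviour asked for in the attribute-search item can be illustrated with shell-style wildcards from the Python standard library. The catalogue contents and the `search_attributes` function are hypothetical, and this version scans every object linearly, so it shows the semantics only; matching the efficiency of name search would require an index over attribute values.

```python
from fnmatch import fnmatchcase


def search_attributes(objects, attribute, pattern):
    """Return names of objects whose attribute value matches a wildcard pattern.

    Matching is case-insensitive; '*' matches any run of characters, '?' one character.
    """
    wanted = pattern.upper()
    return sorted(
        name
        for name, attributes in objects.items()
        if fnmatchcase(str(attributes.get(attribute, "")).upper(), wanted)
    )


catalogue = {
    "USA.CPI": {"description": "Consumer price index, all items"},
    "USA.PPI": {"description": "Producer price index"},
    "IBM.VOLUME": {"description": "Daily trading volume"},
}
print(search_attributes(catalogue, "description", "*price index*"))
print(search_attributes(catalogue, "description", "daily*"))
```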

31 Statistical Data & Metadata eXchange
SDMX is not just tagging of data with XML. Its intent is to be a standard for data interrogation (SDMX Registry) and a common format for publishing (SDMX XML output). The SDMX Data Query/Response provider in Web Access included a proof-of-concept (POC) SDMX registry. The metadata for a few Forex-based time series using SDMX structures was placed in this registry by hand. The registry was implemented in MySQL. The SDMX Custom Service Provider (CSP) implemented the SDMX V1.0 standard, including SDMX-supported query response. The idea is that you first query the MySQL database holding the SDMX registry to discover what data is available at that site, then make a second call carrying the SDMX data request. That second call was in SDMX XML form; the CSP received the request and looked it up in the SDMX registry, which had a hard mapping to the Forex time series. The response was sent back in SDMX XML V1.0 format.
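The two-call flow described above (discover through the registry, then request data through the hard mapping) can be sketched as follows. Everything here is a stand-in: `sqlite3` replaces MySQL so the example runs without a server, the table layout, SDMX-style keys, series names and rates are invented for illustration, and the response is a plain Python list rather than SDMX XML.

```python
import sqlite3


def build_registry():
    """Create an in-memory registry mapping SDMX-style keys to stored series names."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE registry (sdmx_key TEXT PRIMARY KEY, description TEXT, series_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO registry VALUES (?, ?, ?)",
        [
            ("FX.D.USD.EUR", "US dollar per euro, daily", "USDEUR.RATE"),
            ("FX.D.GBP.EUR", "Pound sterling per euro, daily", "GBPEUR.RATE"),
        ],
    )
    return conn


def discover(conn, pattern):
    """First call: ask the registry which keys are available."""
    rows = conn.execute(
        "SELECT sdmx_key FROM registry WHERE sdmx_key LIKE ? ORDER BY sdmx_key", (pattern,)
    )
    return [row[0] for row in rows]


def fetch(conn, series_store, sdmx_key):
    """Second call: resolve the key through the registry mapping and return the series."""
    row = conn.execute(
        "SELECT series_name FROM registry WHERE sdmx_key = ?", (sdmx_key,)
    ).fetchone()
    if row is None:
        raise KeyError(sdmx_key)
    return series_store[row[0]]


registry = build_registry()
series_store = {"USDEUR.RATE": [1.21, 1.22], "GBPEUR.RATE": [0.68, 0.69]}
available = discover(registry, "FX.D.%")
print(available)
print(fetch(registry, series_store, available[1]))
```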

