GSBPM, GSIM, and CSPA
Overview
  Acronyms
  Background
Acronyms
  It’s all about acronyms:
    UNECE – United Nations Economic Commission for Europe
    CES – Conference of European Statisticians
    HLG – The High Level Group…
    GSBPM – Generic Statistical Business Process Model
    GSIM – Generic Statistical Information Model
    CSPA – Common Statistical Production Architecture
    DDI – Data Documentation Initiative
    SDMX – Statistical Data and Metadata Exchange
Organizational Acronyms
  UN/ECE – The United Nations Economic Commission for Europe
    Based in Geneva (Palais des Nations)
    Not strictly European in scope
  CES – The Conference of European Statisticians
    Coordinates statistical activity among national and supra-national statistical agencies
  HLG – The High Level Group for the Modernization of Statistical Production and Services
    What do you think?
Acronyms for Standards and Models
  GSBPM – Generic Statistical Business Process Model
    Conceptual process model
    Originally produced by the METIS Workshop
    Now a product of the HLG (version 5.0)
  GSIM – Generic Statistical Information Model
    Conceptual information model
    Produced by HLG
    Now in version 1.1
  CSPA – Common Statistical Production Architecture
    Shared architecture for interoperable services
  DDI – The Data Documentation Initiative
    Implementation model in XML
    Produced by the DDI Alliance
    Now in version 3.3
  SDMX – The Statistical Data and Metadata Exchange
    Implementation model in UML, XML, and other formats
    Produced by the SDMX Initiative
    Now in version 2.1
Background
The Perceived Problem
  In the age of Google, “Big Data”, and Open Data, statistical agencies fear they may become irrelevant
  The demand for data is huge, but traditional statistical production is relatively slow and very expensive
  Budgets are shrinking, but the legal requirements for official data are not changing
Addressing the Problem
  The HLG was formed to address this problem
  New sources of data
    Increased use of registers
    Possible use of “big data”
  Modernization of statistical production
    Based on standard models
    Get away from hand-crafted statistics
    Emphasis on metadata-driven processes
  Shared software
    “Plug-and-play” services
    Shared architecture
GSBPM
A Reference Model for Process
  Over the past few years, the CES community produced a “reference model” describing the statistical production process
    GSBPM – the Generic Statistical Business Process Model
  Very effective in allowing statistical agencies to describe their processes, and to communicate among themselves
  Defines a non-linear process model, with clear terms and definitions
GSBPM
GSBPM Detail GSBPM 5_0.docx
GSBPM Implementation
  More than 50 statistical agencies have implemented GSBPM
  Used as the basis for analyzing the processes used locally for statistical production
  Additions, edits, and deletions are made to the standard GSBPM to produce a national version
    For example, the “Norwegian Statistical Business Process Model”
  Sequence of steps may be ordered to produce a linear description of processes
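The idea of deriving a national version by additions, edits, and deletions can be sketched in a few lines of Python. This is only an illustration: the phase names are the eight top-level phases of GSBPM 5.0, but the function `national_variant` and the example adaptation are hypothetical and do not reproduce the Norwegian model or any other agency's actual version.

```python
# The eight top-level phases of GSBPM 5.0, in their usual presentation order.
GSBPM_PHASES = [
    "Specify Needs", "Design", "Build", "Collect",
    "Process", "Analyse", "Disseminate", "Evaluate",
]


def national_variant(standard, rename=None, remove=(), add_after=None):
    """Derive a local, linearly ordered process list from the standard one.

    rename:    {standard name: local name}          (edits)
    remove:    standard names to drop               (deletions)
    add_after: {standard name: [local steps, ...]}  (additions)
    """
    rename = rename or {}
    add_after = add_after or {}
    result = []
    for phase in standard:
        if phase not in remove:
            result.append(rename.get(phase, phase))
        # Local additions are placed right after the phase they are tied to,
        # even if that phase itself was removed.
        result.extend(add_after.get(phase, []))
    return result


# Hypothetical adaptation, for illustration only.
local_model = national_variant(
    GSBPM_PHASES,
    rename={"Analyse": "Analyse and Validate"},
    remove={"Build"},
    add_after={"Disseminate": ["Archive"]},
)
print(local_model)
```

Keeping the standard list untouched and expressing the local model as a set of differences makes it easy to see exactly where an agency departs from the reference model.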
Using GSBPM with Other Standards
  GSBPM can be implemented directly using DDI XML
    Lifecycle Events
    These link to an external process model
  The DDI Alliance has even taken GSBPM and used it to describe longitudinal research: LGBPM
  GSBPM can be implemented using the SDMX Process Model
    Process Steps link to specific GSBPM steps
Success Leads to Success…
  Following the GSBPM model, the HLG decided to create an information model…
    GSIM – the Generic Statistical Information Model
  One year to produce version 1.0 in a series of “sprints”
  Version 1.1 delivered December 2013
  A reference model for data and metadata
GSIM
GSIM Version 1.0 was organized into four areas:
GSIM History
  GSIM first developed in 2012 through a series of 2-week “Sprints”
    Development timeline was a year
    3 sprints: Slovenia, South Korea, Netherlands
    Lots of conference calls
    Produced GSIM 1.0 in December 2012
  GSIM was revised in 2013
    1-year timeline
    Conference calls and sprint in Geneva at UN/ECE
    Produced GSIM 1.1 in December 2013
Version 1.1
  Version 1.1 re-organized and simplified the model:
    Task-based models
      Identifying statistical needs
      Managing statistical programs
      Designing and running statistical processes
      Collecting, processing, and disseminating statistical information
    “Foundational” information
      Concepts, populations, codelists and classifications, variables, data sets, quality metadata
  Neuchatel Classification model is now part of GSIM
  DDI 4 is aligning with the GSIM model
  Some parts of DDI 3.2 and 3.3 were influenced by GSIM
Aligning with GSBPM
  GSIM models all of the information needed for the statistical production process
  It therefore provides full support for GSBPM
  It does not require the use of GSBPM
  It can function as a stand-alone model or with any business process model
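To make the “foundational” information more concrete, here is a minimal Python sketch of how concepts, populations, codelists, variables, and data sets relate to one another. The class and field names are simplified assumptions chosen for this example; they are not the GSIM object definitions, which are considerably richer.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    name: str
    definition: str


@dataclass
class CodeList:
    name: str
    codes: dict  # maps code -> label


@dataclass
class Variable:
    name: str
    concept: Concept          # what is being measured
    population: str           # who or what it is measured for
    code_list: Optional[CodeList] = None  # None means free-form values

    def accepts(self, value):
        """A value is acceptable if there is no code list or it is a listed code."""
        return self.code_list is None or value in self.code_list.codes


@dataclass
class DataSet:
    name: str
    variables: list
    records: list = field(default_factory=list)

    def invalid_cells(self):
        """Return (row index, variable name, value) for values outside the code list."""
        problems = []
        for row, record in enumerate(self.records):
            for variable in self.variables:
                value = record.get(variable.name)
                if value is not None and not variable.accepts(value):
                    problems.append((row, variable.name, value))
        return problems


# Hypothetical example data, for illustration only.
sex_codes = CodeList("Sex", {"1": "Male", "2": "Female"})
sex = Variable("sex", Concept("Sex", "Sex of the person"), "Residents", sex_codes)
survey = DataSet("Example survey", [sex], [{"sex": "1"}, {"sex": "9"}])
problems = survey.invalid_cells()
print(problems)
```

Because the data set carries its own variable and codelist descriptions, a generic routine can validate it without knowing anything about the survey in advance, which is the sense in which processes become metadata-driven.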
“Full” GSIM Model
Resources
  There is an easy-to-use “Clickable GSIM” available on the UN/ECE site:
    http://www1.unece.org/stat/platform/display/GSIMclick/Clickable+GSIM
    Users can add “views”
  Specification is available
    Includes some profiles of DDI for implementing GSIM (more to come)
RAIRD: An Example of a GSIM Implementation
CSPA
Statistical Architecture
  To increase efficiency, statistical agencies want to share IT development resources by sharing services
  CSPA – Common Statistical Production Architecture
    “Plug and play” services using standard interfaces
    Standard architecture model was developed
    CSPA Services Catalog has been developed
Services today are hard to share and reuse!
  [Diagram: Canada and Sweden, with Collect, Process, Analyse, and Disseminate components; reuse marked “?”]
…but if statistical organisations work together?
  HLG sponsor project: Desired Project Outcomes
  Increased:
    interoperability in Official Statistics through the sharing of processes and components
    ability to find real/genuine collaboration opportunities
    ability to make international decisions and investments
    sharing of architectural/design knowledge and practices
This makes it easier to share and reuse!
  [Diagram: Collect, Process, Analyse, Disseminate, and “?” components shared between Sweden and Canada]
CSPA: Three Major Parts
  CSPA Architecture
    Now in version 1.1
  CSPA Services
    Produced by many different countries
    Same service shared/implemented by multiple stats agencies (“plug and play”)
  CSPA Services Catalog
    So agencies can see what services are being developed using the CSPA architecture
Three Levels of Abstraction
CSPA Architecture
  Very much a “service-oriented architecture” (SOA)
    Individual pieces of application functionality implemented as services
    Services are loosely coupled
  Functionality uses GSBPM to describe what functions a service performs
  Interfaces (inputs, outputs) are described using GSIM
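The division of labour described above (GSBPM says what a service does, GSIM says what goes in and out) can be sketched as a small Python data structure. Everything here is an assumption made for illustration: `ServiceDescriptor`, `missing_inputs`, the service names, and the object names are hypothetical and are not taken from the CSPA specification or catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceDescriptor:
    name: str
    gsbpm_phase: str      # what the service does, in GSBPM terms
    inputs: frozenset     # kinds of information objects consumed (GSIM-style names)
    outputs: frozenset    # kinds of information objects produced


def missing_inputs(upstream, downstream):
    """Inputs the downstream service needs that the upstream one does not produce."""
    return downstream.inputs - upstream.outputs


# Hypothetical services, for illustration only.
collect = ServiceDescriptor(
    "example-collection", "Collect",
    inputs=frozenset({"Questionnaire"}),
    outputs=frozenset({"Data Set"}),
)
edit = ServiceDescriptor(
    "example-editing", "Process",
    inputs=frozenset({"Data Set"}),
    outputs=frozenset({"Data Set"}),
)
tabulate = ServiceDescriptor(
    "example-tabulation", "Analyse",
    inputs=frozenset({"Data Set", "Code List"}),
    outputs=frozenset({"Data Set"}),
)

print(missing_inputs(collect, edit))      # nothing missing: the two can be chained
print(missing_inputs(edit, tabulate))     # the code list must come from elsewhere
```

Since each service is described only by its function and its interfaces, two services from different agencies can be checked for compatibility without looking at either implementation, which is what loose coupling is meant to allow.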
CSPA is Platform-Independent
  Like many SOA architectures, CSPA is intended to work for any technology platform
  Standard interfaces implemented in platform-independent syntaxes (XML, JSON)
  Communications between services are handled by the implementing agency on whatever platform they choose
    Windows, Linux, Mac, etc.
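As a rough illustration of why a platform-independent syntax matters, the following Python sketch encodes a service request as JSON text and decodes it again. The message layout (`service`, `gsbpmPhase`, `payload`) and the function names are invented for this example and are not part of CSPA; the point is only that the text on the wire can be produced and read on any platform.

```python
import json

# Hypothetical message fields; not defined by the CSPA specification.
REQUIRED_FIELDS = {"service", "gsbpmPhase", "payload"}


def encode_request(service, gsbpm_phase, payload):
    """Serialize a request to JSON text that any platform can parse."""
    message = {"service": service, "gsbpmPhase": gsbpm_phase, "payload": payload}
    return json.dumps(message, sort_keys=True)


def decode_request(text):
    """Parse a request and reject it if a required field is absent."""
    message = json.loads(text)
    missing = REQUIRED_FIELDS - set(message)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return message


wire_text = encode_request(
    "example-editing-service", "Process", {"dataSet": "survey-2014"}
)
received = decode_request(wire_text)
print(received["service"], received["gsbpmPhase"])
```

How the text travels between services (message queue, HTTP, file drop) is left to the implementing agency, consistent with the slide above.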
Services and Communication Platform
CSPA Services Development
  2013 – Initial year-long proof of concept
  2014 – Year-long production prototyping
  2015 – Initial full production services development
  CSPA is developed using the same “sprint” mechanism as GSIM
    1- or 2-week meetings, with delivery at the end of the year
Roles for Developing CSPA Services
CSPA Proof of Concept
  A number of shared services were developed in 2013
    Based on DDI interfaces
    “Wrapping” existing production processes
  The POC was a success!
    Proved the idea is feasible
    Feedback to DDI is being incorporated into the DDI 4 development
Prototyping
  More complete services were developed
  Feedback from 2013 incorporated into a revised CSPA architecture
  Provided additional confidence in the idea
  Initial services catalog was developed
Looking Forward
  HLG has now focused on two areas:
    Big Data (from a statistical office perspective)
    More CSPA services
      Several are now being developed, based not only on DDI but also on SDMX
      Emphasis is on production
  DDI-related work is now being coordinated by the Modernization Committee on Standards
    Emphasis is placed on having standard profiles and mappings from GSIM to DDI
    Work now being organized for 2014
    DDI Alliance is represented on this committee
  CSPA is also working on a GSIM-based implementation model
CSPA Services Catalog - Layers
The Catalogue
  http://www1.unece.org/stat/platform/display/CSPA/CSPA+Global+Artefacts+Catalogue
Questions?