United Nations Economic Commission for Europe, Statistical Division: International Collaboration to Modernise Official Statistics. Steven Vale, UNECE
Introducing UNECE Statistics
UNECE Statistics: Priorities. Population censuses, migration, Millennium Development Goals; globalisation, National Accounts, employment, business registers; sustainable development, environmental accounts, climate change; modernisation.
Introducing the HLG: High-level Group for the Modernisation of Statistical Production and Services. Created by the Conference of European Statisticians (CES) in 2010. Vision and strategy endorsed by CES in 2011/2012.
Who are the HLG members? Pádraig Dalton (Ireland), Chairman; Trevor Sutton (Australia); Wayne Smith (Canada); Emanuele Baldacci (Italy); Bert Kroese (Netherlands); Park, Hyungsoo (Republic of Korea); Genovefa Ružić (Slovenia); Walter Radermacher (Eurostat); Martine Durand (OECD); Lidia Bratanova (UNECE)
What does the HLG do? Oversees activities that support modernisation of statistical organisations; stimulates development of global standards and international collaboration activities. “Within the official statistics community... take a leadership and coordination role”
Why is the HLG needed? Before the HLG / Now: Many expert groups / Clear vision; Little coordination / Agreed priorities; No overall strategy / Strategic leadership; Limited impact / Real progress
The Challenges
These challenges are too big for statistical organisations to tackle on their own. We need to work together.
Using common standards, statistics can be produced more efficiently. No domain is special! Do new methods and tools support this vision, or do they reinforce a stove-pipe mentality?
HLG 30+ Expert Groups Projects
2012
What is GSIM? A reference framework of information objects. It sets out definitions, attributes and relationships of information objects. It aligns with relevant standards such as DDI and SDMX.
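As a rough illustration of "definitions, attributes and relationships of information objects", the idea can be sketched as a small data structure. This is a minimal sketch, not the GSIM specification: the class `InformationObject`, its fields, the `relate` method and the two example objects are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class InformationObject:
    """A named object with a definition, attributes and relationships."""
    name: str
    definition: str
    attributes: Dict[str, str] = field(default_factory=dict)
    relationships: List[Tuple[str, str]] = field(default_factory=list)

    def relate(self, kind: str, target: "InformationObject") -> None:
        """Record a typed link from this object to another one."""
        self.relationships.append((kind, target.name))


# Hypothetical example objects, for illustration only.
population = InformationObject(
    name="Population",
    definition="The set of units the statistics describe.",
)
variable = InformationObject(
    name="Variable",
    definition="A characteristic measured for each unit.",
    attributes={"unit_of_measure": "persons"},
)
variable.relate("describes", population)
```

Keeping relationships as (kind, target name) pairs is only one possible design; it keeps the sketch short while still showing that each object carries its own definition, attributes and links.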
A step back: The GSBPM
GSIM and GSBPM GSIM describes the information objects and flows within the statistical business process.
Clickable GSIM
2013
Standards-based Modernisation [slide figures, labels not recoverable: 43%; 34,600]
Outcomes: Simplified information objects; incorporates revised Neuchâtel Model of classification terminology; better aligned with other standards, particularly DDI
GSBPM v5.0
Main Changes: Phase 8 (Archive) removed (archiving can happen at any stage in the statistical production process); new sub-process "Build or enhance dissemination components"; clearer distinction between detection and treatment of errors; sub-processes renamed to improve clarity; descriptions of sub-processes improved; terminology is less survey-centric
Mappings Fundamental Principles of Official Statistics
Frameworks and Standards Project
Problem statement: Specialised business processes, methods and IT systems for each survey / output
Applying Enterprise Architecture Disseminate
... but if each statistical organisation works by themselves...
... we get this...
... which makes it hard to share and reuse!
... but if statistical organisations work together to define a common statistical production architecture...
... sharing is easier!
CSPA development project: Architecture; Proof of Concept
The Proof of Concept: 5 countries built CSPA services; 3 countries implemented them
Project Outcomes: The CSPA approach works. It promises increased sharing, interoperability and collaboration opportunities. Some licensing issues!
United Kingdom; 5 Build teams; 42 individuals; 2 Sprints; 3 Assemble teams; 1 Working Group; FAO
2014
Implementation of the CSPA
Services built: 1. Seasonal adjustment – France, Australia, New Zealand; 2. Confidentiality on the fly – Canada, Australia; 3. SVG generator – OECD; 4. SDMX transform – OECD; 5. Sample selection – Netherlands; 6. Linear error localisation – Netherlands; 7. Linear rule checking – Netherlands; 8. Error correction – Italy
Architecture Working Group: Australia, Austria, Canada, France, Italy, Mexico, Netherlands, New Zealand, Turkey, Eurostat. Catalogue team: Australia, Canada, Italy, Hungary, New Zealand, Romania, Turkey, Eurostat.
Sharing Infrastructure across Statistical Organisations [diagram labels: Specify Needs, Design, Build, Collect, Process, Analyse, Disseminate, Evaluate]. By sharing infrastructure development, we can: reduce costs of development; adopt new methods quickly; increase comparability of statistics
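The list of services above could be held as a simple lookup, for example to see which services a given organisation helped build. This is only a sketch of the idea: the dictionary layout and the function name `services_built_by` are assumptions made for this example and do not reflect the actual format of the CSPA catalogue.

```python
from typing import Dict, List

# Service name -> organisations that built it, as listed on the slide.
SERVICES_BUILT: Dict[str, List[str]] = {
    "Seasonal adjustment": ["France", "Australia", "New Zealand"],
    "Confidentiality on the fly": ["Canada", "Australia"],
    "SVG generator": ["OECD"],
    "SDMX transform": ["OECD"],
    "Sample selection": ["Netherlands"],
    "Linear error localisation": ["Netherlands"],
    "Linear rule checking": ["Netherlands"],
    "Error correction": ["Italy"],
}


def services_built_by(builder: str) -> List[str]:
    """Return the services a builder contributed to, in slide order."""
    return [
        service
        for service, builders in SERVICES_BUILT.items()
        if builder in builders
    ]


if __name__ == "__main__":
    for name in ("Australia", "Netherlands", "OECD"):
        print(name, services_built_by(name))
```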
Big Data
In the last 2 years more information was created than in the whole of the rest of human history!
Physical sprint; Consultation; Virtual sprint; Task teams; Guidelines
Priority Areas: Partnerships – Guidelines; Privacy – Guidelines; Quality – Guidelines; Skills – Survey; IT / methodological issues – Sandbox
Sandbox: The Irish Centre for High-end Computing is hosting a Big Data ‘sandbox’ containing data and tools for international experiments. “Play is the highest form of research” – attributed to Einstein
Sandbox: Aims. Test feasibility of remote access and processing (could this approach be used in practice?); test whether existing statistical standards / models / methods can be applied to Big Data; determine which Big Data software tools are most useful for statistical organisations; learn about the potential uses, advantages and disadvantages of Big Data (“learning by doing”); build an international collaboration community on the use of Big Data
2015
Key priorities agreed last month. CSPA: develop many more services; support implementation through effective governance and appropriate technical standards. Big Data: continue the sandbox; produce joint international outputs; develop skills and capacity to work with new sources to support the “Data Revolution”
How to work together for minimum cost and maximum benefit?
The “Sprint” method
Virtual meetings: We use Webex (others are available); flexibility; free for participants; join meetings from office, home, airport etc.; screen sharing; virtual whiteboard
Wikis: Central repository of information; latest versions and comments in one place; good for joint drafting of papers; access anywhere with a web connection; can be public or restricted
Governance
HLG Activities – Engagement Map
Get involved! Anyone is welcome to contribute! More information: HLG Wiki: www1.unece.org/stat/platform/display/hlgbas; LinkedIn group: “Business architecture in statistics”