The systems approach to official statistics Bo Sundgren 2010


My presentation
- Why a systems approach – and what does it mean?
- Some history
- Some different views of statistical systems
- Quality and efficiency of statistical systems
- Different perspectives on statistical systems: customers, stakeholders, organisation, information system (contents/data + processes), technical
- Modelling the contents of a statistical system: helicopter, zoom, contents by example, everything clickable
- Conclusion: Statistical systems are unperceivable systems (Langefors), and the systems approach is a strategy for coping with statistical systems without losing either overview or precision in details

Why a systems approach?
- Modern societies are complex and interdependent
- Governments demand more, better, and more coherent official statistics for decision-making, planning, and evaluation
- Businesses also demand good official statistics, national and international, to discover and exploit business opportunities
- Other stakeholders: analysts, researchers, students, journalists, and interested citizens
- But there are budget and time constraints forcing all producers of official statistics to economise on resources, especially in data collection
- Respondents expect producers to reduce the response burden, and to harmonise data collection for official statistics with the natural business processes of the data providers
- Hence a holistic approach is needed: the systems approach

Paradigms in the history of statistics
- Data about the state: bookkeeping of citizens and resources (from the Pharaohs onwards)
  – statisticus (Latin) = statesman, politician
  – statistics = a comprehensive description of the social, political, and economic features of a state, Gottfried Achenwall in Staatsverfassung der heutigen vornehmsten europäischen Reiche und Völker im Grundrisse (1749)
- Statistics as a mathematically founded theory: probability theory (17th century and onwards)
- Probability-based sample surveys (20th century)
- Reuse of data from administrative registers and data archives
- Reuse of data generated by business processes and other processes on the Internet
- The systems approach: a holistic and combined approach

What is a system, and what is the systems approach?
- The systems approach is not a bad idea, according to Charles West Churchman in the summary statement of his 1968 book on the topic
- von Bertalanffy focused on open systems (in biology) as opposed to closed systems (studied in physics)
- Open systems interact with other systems outside themselves through inputs and outputs
- A system and its environment are in general separated by a boundary, an interface
- The output of a system is a direct or indirect result of the input
- But the system is not just a passive tube; it is an active processor, a transformer
- The transformation of input into output by the system is usually called throughput

Figure 1. A system in interaction with its environment. From Heylighen (1998).
Figure 2. A system as a transparent box, containing a collection of interacting subsystems, and as a black box, without observable components. From Heylighen (1998).

A system, its environment, and its parts
- The environment of a system consists of systems interacting with their environments
- A collection of such systems which interact with each other could again be seen as a system
- The mutual interactions of the component systems, or parts, make the system as a whole something more than the sum of its parts
- With respect to the whole, the parts are seen as subsystems
- With respect to the parts, the whole is seen as a supersystem
- If we look at a system as a whole, we need not be aware of all its parts; we can just look at its total input and total output: the black box view of a system
- When necessary, we may adopt the transparent box view, where we can see the system's internal processes
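The black box and transparent box views can be illustrated with a minimal Python sketch. The class names `System` and `Supersystem`, and the assumption that subsystems are simply chained one after another, are illustrative choices and do not come from the presentation; real subsystems may interact in far richer ways.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class System:
    """A system seen as a black box: only its input-to-output transformation is visible."""
    name: str
    transform: Callable[[float], float]

    def run(self, x: float) -> float:
        return self.transform(x)


@dataclass
class Supersystem:
    """A whole made of subsystems; here (an assumption) each output feeds the next input."""
    name: str
    parts: List[System] = field(default_factory=list)

    def run(self, x: float) -> float:
        # Black box view: the caller sees only total input and total output.
        for part in self.parts:
            x = part.run(x)
        return x

    def trace(self, x: float) -> List[Tuple[str, float]]:
        # Transparent box view: expose the output of every subsystem along the way.
        steps = []
        for part in self.parts:
            x = part.run(x)
            steps.append((part.name, x))
        return steps


pipeline = Supersystem(
    "whole",
    [System("double", lambda x: x * 2), System("increment", lambda x: x + 1)],
)
print(pipeline.run(3))    # 7
print(pipeline.trace(3))  # [('double', 6), ('increment', 7)]
```

The same object answers both questions: `run` hides the parts, `trace` opens the box when that is necessary.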

Different types of systems and views of systems
- open systems vs closed systems
- black box views vs transparent views
- hard systems vs soft systems
  – hard systems, e.g. technical systems, are typically associated with quantifiable variables
  – soft systems typically involve people, and both quantifiable variables and variables that are not easy to quantify; see, for example, Checkland (1981) and later works by the same author
- man-designed systems vs given systems (by ?)
- man-independent systems, e.g. buildings, vs man-dependent systems, e.g. enterprises – what Berger & Luckmann (1966) call social constructions of reality

Statistical systems: open, soft, man-designed; reality as a statistical construction
Figure 3. A statistical system in its environment. From Sundgren (2004a).

Statistical systems: one view
Figure 4. Control and execution of a statistical production system. Metadata and paradata (process data). Feedback. From Sundgren (2004a).

Statistical systems: another view
Figure 5. Basic operations in a database-oriented statistical production system. From Sundgren (2004a).

Statistical systems: yet another view
Figure 6. Statistics production with integrated data and metadata management. From Sundgren (2004a).

Quality and efficiency of statistical systems
- quality as fitness for purpose: the totality of features and characteristics of a product or service that bear on its ability to satisfy stated or implied needs (ISO 8402, 1994)
- quality components (Eurostat, 2003):
  – relevance
  – accuracy
  – timeliness and punctuality
  – accessibility and clarity
  – comparability
  – coherence
- quality of individual surveys vs quality of statistical systems
  – special purpose vs general purpose
  – known needs vs unknown needs
  – present needs vs future needs
- efficiency
- needs for new methods and methodologies?

Statistical systems from different perspectives
- the customers' perspective: the statistical system as seen by the customers, or users, of statistics
- the stakeholders' perspective: also including sponsors, respondents, etc
- the organisational perspective: how a statistical organisation is set up for a statistical system, e.g. the statistical system of a country or an international organisation
- the information systems perspective: statistical systems in general, and systems for production of official statistics in particular, are examples of information systems, i.e. systems for obtaining, storing, processing, and communicating information
- the contents perspective: how the statistical system relates to some kind of theoretical framework in which the data from the statistical system are used, e.g. the System of National Accounts (SNA)
- the data/process perspective: a dualistic pair of views, where the data view focuses on the logical organisation of the data and metadata obtained, stored, transformed, and produced by the statistical system, and the process view focuses on the processes for obtaining, storing, transforming, and producing statistical data and metadata
- the technical perspective: the hardware/software environment in which the statistical system is implemented and running

Customers - Users
- The customer in focus
- The customer is always right
- How to reconcile conflicting interests between many customers, some of whom are even unknown, in a general-purpose system?
Figure 7. How to satisfy partially conflicting needs of different users of official statistics? From Sundgren (2004a).

Stakeholders
- Customers – Users
- Respondents
- Sponsors (the taxpayers and their representatives)
- Producing staff: designers, operators, managers, business partners and professional colleagues
- Researchers on official statistics
- Others concerned, victims – cf group privacy

Organisation
Figure 8. Different ways of organising a statistical organisation.
- Why have a statistical agency at all?
- Better use of the potentials of the organisation
- Better use of data and production systems
- Emerging: Network organisations

Organisation Figure 9. Four ways of organising tasks and responsibilities in a statistical organisation.

Emerging network organisations
- Coordination (in one network node) of loosely coupled services in other nodes of the network
- Cf the Internet
- Cf Service-Oriented Architecture (SOA)
- Services should be well-defined and easy to use through standardised messages
- There may be many coordination nodes using overlapping sets of service nodes
- Some of the coordination nodes may be competing with each other (cf different Internet shops using partially the same warehouses, transportation firms, etc)

Information systems: data/process dualism Figure 16. Fundamental data/metadata interfaces and metadata objects in a statistical production system.

Process view Figure 17. A process view of a system for production of official statistics.

Technical perspective: Service-Oriented Architecture (SOA)
Design principles:
- Loose coupling – Services maintain a relationship that minimises dependencies and only requires that they retain an awareness of each other.
- Service contract – Services adhere to a communications agreement, as defined collectively by one or more service descriptions and related documents.
- Autonomy – Services have control over the logic they encapsulate.
- Abstraction – Beyond what is described in the service contract, services hide logic from the outside world.
- Reusability – Logic is divided into services with the intention of promoting reuse.
- Composability – Collections of services can be coordinated and assembled to form composite services.
- Statelessness – Services minimise retaining information specific to an activity.
- Discoverability – Services are designed to be outwardly descriptive so that they can be found and assessed via available mechanisms.
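A few of these principles (service contract, statelessness, composability, discoverability) can be sketched in a handful of lines of Python. Everything named here is hypothetical: the `Registry` class, the `compose` helper, and the two toy services `edit` and `weight` are illustrations only, and a real SOA would use network services and standardised messages rather than in-process functions.

```python
from typing import Callable, Dict, List

# Service contract (assumed for this sketch): a list of numbers in, a list of numbers out.
Service = Callable[[List[float]], List[float]]


class Registry:
    """Discoverability: services are published with a description and found by keyword."""

    def __init__(self) -> None:
        self._services: Dict[str, Service] = {}
        self._descriptions: Dict[str, str] = {}

    def register(self, name: str, description: str, service: Service) -> None:
        self._services[name] = service
        self._descriptions[name] = description

    def find(self, keyword: str) -> List[str]:
        # Return the names of all services whose description mentions the keyword.
        return sorted(n for n, d in self._descriptions.items() if keyword in d)

    def get(self, name: str) -> Service:
        return self._services[name]


def compose(registry: Registry, names: List[str]) -> Service:
    """Composability: chain registered services into one composite service."""
    def composite(data: List[float]) -> List[float]:
        for name in names:
            data = registry.get(name)(data)
        return data
    return composite


# Stateless toy services: each result depends only on the input message.
def edit(data: List[float]) -> List[float]:
    return [x for x in data if x >= 0]      # drop implausible negative values


def weight(data: List[float]) -> List[float]:
    return [2.0 * x for x in data]          # apply a fixed weight of 2


registry = Registry()
registry.register("edit", "remove negative values", edit)
registry.register("weight", "apply a fixed weight", weight)

production = compose(registry, ["edit", "weight"])
print(registry.find("weight"))          # ['weight']
print(production([1.0, -2.0, 3.0]))     # [2.0, 6.0]
```

Because the composite depends only on the shared contract and on names looked up in the registry, either service can be swapped out without touching the other, which is the loose coupling the slide asks for.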

Modelling the contents of a statistical system
- A statistical system must be able to provide relevant information contents to its customers
- But how can customers and designers of statistical systems find out, and specify in a precise way, what the relevant contents are, given information needs that are often rather broadly and vaguely defined? And how can they reconcile conflicting information needs within the same model?
- There are different approaches to this problem. One may distinguish between theory-driven and data-driven approaches; see for example Hox (1997)
- In practice different approaches are combined and iterated

Theory-driven approaches
- derive the desirable information contents from theories and analytical models based on these theories, for example the analytical models and macroeconomic theories behind the System of National Accounts (SNA)
- top-down

Data-driven approaches
- may take their starting-point from existing data within fields and sectors of society that are relevant to the customers of the statistical system to be designed
- these data are carefully defined and described in descriptive data models
- these models usually need to be further harmonised between themselves in order to become useful as subdomains of an integrated data model
- bottom-up

A practical approach
- brainstorming to create a tentative conceptual (data) model based on existing theories and existing or foreseen data
- Sundgren (1973), Rosén & Sundgren (1991), Sundgren (2005), Sundgren (2006), and Sundgren (2007)
- useful both as specification and documentation
- the conceptual model may very easily (even automatically) be transformed into a physical data model, e.g. a relational data model to be implemented by relational database software
- microdata views and macrodata views
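As a rough illustration of how a conceptual model might be transformed automatically into a relational one, the sketch below generates `CREATE TABLE` statements from a small dictionary and runs them in an in-memory SQLite database. The dictionary layout, the `to_ddl` function, and the `<Name>Id` key convention are assumptions made for this example; they are not the notation used in the cited works.

```python
import sqlite3

# Hypothetical conceptual model: object types, their variables, and references to other objects.
model = {
    "Dwelling": {"variables": {"Size": "REAL"}, "references": {}},
    "Person": {
        "variables": {"Sex": "TEXT", "Income": "REAL"},
        "references": {"Residence": "Dwelling"},
    },
}


def to_ddl(model):
    """Turn each object type into one table; each reference becomes a foreign key column."""
    statements = []
    for obj, spec in model.items():
        cols = [f"{obj}Id INTEGER PRIMARY KEY"]
        cols += [f"{var} {sqltype}" for var, sqltype in spec["variables"].items()]
        cols += [
            f"{rel}Id INTEGER REFERENCES {target}({target}Id)"
            for rel, target in spec["references"].items()
        ]
        statements.append(f"CREATE TABLE {obj} ({', '.join(cols)})")
    return statements


conn = sqlite3.connect(":memory:")
for statement in to_ddl(model):
    conn.execute(statement)

tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(tables)  # ['Dwelling', 'Person']
```

The same dictionary can double as documentation of the contents, which is the "specification and documentation" role mentioned above.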

Macrodata view
On a general level all official statistics are
– estimated values of parameters of populations of objects (statistical units)
where the parameters are
– summarised (aggregated) values of variables of the individual objects in the populations; aggregation function: sum, average, correlation, etc
Figure 10. A statistical cube based on observations of trade transactions.

Macrodata view
Very often the population (e.g. a population of Persons) is broken down into subpopulations, or domains of interest, by cross-classifying the objects in the population by means of a number of classification variables (e.g. Sex, Region, AgeGroup).
Data about the population and its subpopulations may be thought of as belonging to cells in a multidimensional cube, a hypercube. The dimensions of the hypercube are spanned by the classification variables defining the subdomains of the population, and the cells contain estimated values of the parameters for the respective subdomains obtained by cross-classifying the population; the population itself corresponds to the whole cube.
Sundgren (2001) describes and explains this so-called αβγτ-model of multidimensional statistical data.
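A small Python sketch may make the hypercube idea concrete. The `cube` function, the variable names, and the four made-up person records are hypothetical; the sketch also aggregates complete toy data, whereas official statistics would normally contain estimates, for example from sample surveys.

```python
from collections import defaultdict
from statistics import mean

# Made-up microdata: one record per object (Person) in the population.
persons = [
    {"Sex": "F", "Region": "North", "Income": 300},
    {"Sex": "F", "Region": "South", "Income": 500},
    {"Sex": "M", "Region": "North", "Income": 200},
    {"Sex": "M", "Region": "North", "Income": 400},
]


def cube(units, dims, variable, aggregate):
    """Cross-classify units by the classification variables in dims and aggregate each cell."""
    cells = defaultdict(list)
    for unit in units:
        key = tuple(unit[d] for d in dims)   # the cell's coordinates in the hypercube
        cells[key].append(unit[variable])
    return {key: aggregate(values) for key, values in cells.items()}


by_sex_region = cube(persons, ["Sex", "Region"], "Income", mean)
print(by_sex_region[("M", "North")])   # 300

# With no classification variables, the single cell is the whole population.
whole = cube(persons, [], "Income", mean)
print(whole[()])                       # 350
```

Adding a classification variable to `dims` adds a dimension to the cube, and swapping `mean` for `sum` changes the parameter without changing the structure.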

Microdata view
A user of official statistics may not be able to state exactly which parameters of which populations she is interested in, but faced with a short list of object types (or types of populations), and/or topics, and/or parameters/variables, she may be able to select a subset of official statistics of potential interest by selecting, step by step, a subset of object types (populations), variables, and parameters.
In order to provide a user with short lists of object types and variables as a starting-point for the user's drill-down operations, we must be able to give an overview of the contents of statistics in terms of a small number of concepts.
Figure 11. Population statistics: a basic model.

Microdata view
The populations occurring in official statistics are based upon a number of basic object types. Some of these object types may be described as (conscious) actors, objects that are capable of purposeful acting, e.g. persons and organisations. Other objects are acted upon by the actors but are not capable of purposeful acting themselves, e.g. natural resources, products, assets; a common label for basic objects of this kind is things (in a broad sense) or utilities.
In addition to the basic object types there are different kinds of complex object types, relating one or more basic objects. Examples: events/transactions (instantaneous), and relationships/processes/activities, lasting for some time.
Figure 11. Population statistics: a basic model.

Generic model of the contents of official statistics All objects are described by means of variables. Some variables are classification variables, e.g. Sex of Person. Other variables are summation variables, e.g. Income of Person. By using the relations between the objects, one may define derived variables, adjoined variables, such as DwellingSize of Person = Size of Dwelling of Residence of Person. By using summation operators on summation variables, one may derive values of parameters of populations or domains of objects of a certain type, e.g. average(Income of Persons). The two types of derivations may also be combined, even repeatedly and recursively. Claim: The contents of all branches of official statistics can be expressed as specialisations of this generic model. This has been verified in a large number of practical examples, and no counter-examples have been found.
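The two kinds of derivation can be sketched in Python using the slide's own example. The class and attribute names and the three made-up persons are assumptions for illustration; only the variables Sex, Income, Size, and the Residence relation come from the text above.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Dwelling:
    size: float            # summation variable: Size of Dwelling


@dataclass
class Person:
    sex: str               # classification variable: Sex of Person
    income: float          # summation variable: Income of Person
    residence: Dwelling    # relation: Residence of Person


def dwelling_size(person: Person) -> float:
    """Adjoined variable: DwellingSize of Person = Size of Dwelling of Residence of Person."""
    return person.residence.size


house = Dwelling(size=80.0)
flat = Dwelling(size=40.0)
persons = [
    Person("F", 300.0, house),
    Person("M", 100.0, house),
    Person("F", 500.0, flat),
]

# Parameter of a population: average(Income of Persons).
average_income = mean(p.income for p in persons)
print(average_income)        # 300.0

# Both derivations combined: average DwellingSize over the domain Sex = F.
average_size_women = mean(dwelling_size(p) for p in persons if p.sex == "F")
print(average_size_women)    # 60.0
```

The adjoined variable is derived by following a relation between objects, while the parameter is derived by summarising a variable over a population or domain; the last line simply applies one after the other.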

UNESCO example, version 1

What can a statistical agency do, in order to help a user
- find potentially relevant statistical data?
- judge the relevance of data retrieved?
Provide overviews of available data
Provide search tools
Provide informative metadata

Helicopter overview + zoom (cf Google Maps, Google Earth, etc)

Contents By Example (based on a simple generic model) Actors Utilities Complex objects

Everything clickable (OBJECT, VARIABLE)
Left-hand click: Select
- object
- variable
Right-hand click: Retrieve metadata
- definition
- value set, classification
- questionnaire
- quality declaration
- survey documentation

Unperceivable systems (Langefors) (imperceivable, imperceptible)
- In summary, the systems approach is a paradigm for managing the design, operation, and evaluation of unperceivable systems, i.e. systems which are too big and complex for the human brain to grasp in one go
- Statistical systems are such systems
- Combination of top-down and bottom-up, overview and detail, comprehensiveness and precision, soft and hard