Published by Milo Collins. Modified over 9 years ago.
1
Business needs and context for DDI and SDMX
ESS DDI/SDMX Workshop, 2013.06.05
2
Overview
I. Why are common standards important?
II. DDI and SDMX: The relationship so far….
III. Better to have one standard than two?
IV. Will everyone need to learn two standards?
V. ABS approach, so far….
VI. Discussion
VII. Optional bonus material
    I. The challenge
    II. Application
    III. Use cases
3
I. Why are common standards important?
4
Standard – ISO definition
… provide rules, guidelines or characteristics for common and repeated use for activities [eg production of official statistics] or their results [eg statistical products and services] aimed at the achievement of the optimum degree of order in a given context
5
Standards promote interoperability
“Interoperability” – ability of diverse systems and organizations to work together (inter-operate).
– different subject matter domains and functional teams within an agency
– specific collaborations between agencies, and/or as an industry (ESS vision, HLG vision)
Levels of interoperability
– Technical
– Semantic
– Organisational (eg business process alignment)
– Legal
6
Efficiencies
Interoperability leads to economies of scale
– supports shared development, deployment and evolution of the processes, methods, IT components and information which represent the “means of production”
– possible vendor interest
In addition, standards reduce
– non-productive decision-making processes, which add cost and time to projects
– unnecessary diversity, which adds cost (eg training, maintenance, lost opportunities) in the longer term
7
ESS.VIP programme
Transformation programme for the modernisation of the production systems in the European Statistical System (ESS) through:
– moving towards more common solutions and shared services and environment
– economies of scale and efficiency gains, sharing costs
8
ESS.VIP business and information principles
Maximum reuse of existing process components and segments
Metadata driven processes allowing adaptation and extension to other contexts
New business process built as a sequence of modular process steps / services
Information objects structured according to available information models and stored in corporate registries/repositories in view of reuse
Adherence to industry and open standards as available (e.g. Plug & Play)
9
Metadata Driven Business Processes
Systematic and consistent use of metadata to determine the inputs, outputs and behaviour of a statistical business process
Characteristics
1. Metadata is used systematically
– Metadata is used in a planned and managed way across the organisation.
2. Metadata is used consistently
– Authoritative ‘single source of truth’ metadata is used throughout the end-to-end lifecycle of an activity and/or across activities.
3. Metadata is used actively
– Metadata is used to guide definition and automate execution of statistical processes
– Metadata is structured so as to be machine-consumable
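To make "metadata used actively" concrete, here is a minimal Python sketch in which machine-consumable metadata, rather than hard-coded logic, determines the input, output and behaviour of each process step. The `StepMetadata` shape, the `OPERATIONS` table and the variable names are illustrative assumptions only; they are not taken from DDI, SDMX or any ESS system.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical catalogue of reusable operations, keyed by the name that the
# metadata refers to. Each takes a list of values and a parameter dictionary.
OPERATIONS: dict[str, Callable[[list[float], dict], list[float]]] = {
    "divide": lambda values, p: [v / p["divisor"] for v in values],
    "clip": lambda values, p: [min(max(v, p["low"]), p["high"]) for v in values],
}


@dataclass(frozen=True)
class StepMetadata:
    """Assumed machine-consumable description of one process step."""
    input_name: str
    output_name: str
    operation: str
    parameters: dict


def run_step(meta: StepMetadata, datasets: dict[str, list[float]]) -> dict[str, list[float]]:
    """Look up the operation named in the metadata and apply it."""
    operation = OPERATIONS[meta.operation]
    result = operation(datasets[meta.input_name], meta.parameters)
    return {**datasets, meta.output_name: result}


def run_process(steps: list[StepMetadata], datasets: dict[str, list[float]]) -> dict[str, list[float]]:
    """Execute the steps in order; the code never names a specific dataset."""
    for meta in steps:
        datasets = run_step(meta, datasets)
    return datasets


# Changing the process means changing this metadata, not the code above.
process = [
    StepMetadata("raw_income", "income_thousands", "divide", {"divisor": 1000.0}),
    StepMetadata("income_thousands", "income_edited", "clip", {"low": 0.0, "high": 500.0}),
]
result = run_process(process, {"raw_income": [52000.0, -100.0, 900000.0]})
```

In this sketch a new editing rule or a new data source is introduced by editing the `process` list, which is the kind of agility the slide attributes to metadata-driven processes.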
10
Metadata Driven Business Processes
Operational benefits include
– Reduced time and cost of statistics production
– Improved quality of statistical products
– Increased agility in meeting new demands for statistical products and services.
– Increased agility in harnessing new sources of statistical data.
Strategic benefits
– Provides a basis for designing and sharing components which can be configured flexibly, using agreed business objects, to meet diverse needs and operate in diverse environments
» This is particularly relevant when defining the information interfaces and business behaviours of such components
– Supports standards based industrialisation / modernisation
11
IATA: International Air Transport Association
Founded 1945
2004: Simplifying the business
– 5 initiatives to save $6.5 billion per year
– Includes Bar Coded Boarding Pass
12
Information Models Standards
Objectives:
– To ensure that ESS.VIP have access to a set of agreed-upon standards supporting the modernisation of statistical production processes.
– To increase coherence between standards, at the same time ensuring that these are consistent with best practices and recommendations from the international community.
– To define information models that can be used across the ESS to model structural metadata for micro-data and aggregated data.
– To set up guidelines for designing and documenting business processes.
– To provide support mechanisms (e.g., capacity-building and training).
13
II. DDI and SDMX: The relationship so far….
14
The journey begins….
The importance and value of agreeing on common standards for metadata to be used structurally in the statistical production process have been recognised since at least 2009.
Many existing standards were identified which could provide useful support for specific purposes
Two existing standards were identified as providing the broadest (not necessarily comprehensive) support
– DDI and SDMX
15
Characterizing the Standards: DDI
DDI Lifecycle can provide a very detailed set of metadata, covering:
– The study or series of studies
– Many aspects of data collection, including surveys and processing of microdata
– The structure of data files, including hierarchical files and those with complex relationships
– The lifecycle events and archiving of data files and their metadata
– The tabulation and processing of data into tables (Ncubes)
It allows for a link between microdata variables and the resulting aggregates
16
Characterizing the Standards: SDMX
Describes the structure of aggregate/dimensional data (“structural metadata”)
Provides formats for the dimensional data
Provides a model of data reporting and dissemination
Provides a way of describing and formatting stand-alone metadata sets (“reference metadata”)
Provides standard registry interfaces, providing a catalogue of resources
Provides guidelines for deploying standard web services for SDMX resources
Provides a way of describing statistical processes
17
The SDMX-DDI approach
Informal meetings (2010-2013) between members of SDMX and DDI communities
Initiative of the SDMX Secretariat through its Technical Working Group
Approach to using SDMX and DDI interchangeably
Now, we are at the stage where implementations are being investigated and prototyped
– Not “if”, but “how”
18
An initial broad overlay on GSBPM (2010)
[Diagram labels: DDI, SDMX]
19
GSBPM, DDI and SDMX: towards a complete system?
[Diagram labels: DDI, SDMX]
20
DDI and SDMX
DDI offers a very rich model for the documentation of micro-data
SDMX offers a very integrated exchange platform for statistical outputs (IT architectures, tools, web services)
The combined use of both standards could allow a higher level of integration of the complete production process
But: The devil is in the detail!
21
Dealing with the devil….
Need to consider other context for a business process
– eg are you collecting, processing and analysing macrodata or microdata?
Both standards have the capacity to be “stretched” to support many things.
– eg SDMX Reference Metadata can carry any information
How to decide what is appropriate?
– Neither standard was originally designed to support all needs for structured metadata associated with all phases of the statistical production process
– It would be useful to have common agreement on business definitions and purposes for this metadata so “business fit” (and integration) can be considered, not just technical feasibility.
23
GSIM is complementary to GSBPM
A model is needed to describe information objects and flows within the statistical business process
24
What is GSIM?
A reference framework of information objects setting out definitions and (commonly agreed) attributes and relationships
Provides:
– Information model for “business objects” at the conceptual (and, to some degree, logical) level
– Common (reference) semantics
Does not provide
– Physical representation for information objects
Relationship of GSIM with SDMX/DDI
– Alignment was “designed in” where relevant
– In adoption/implementation, complementary (with synergies) but no formal dependency.
25
[Diagram labels: Business, Production, Concepts, Structures]
26
[Diagram: GSIM information objects grouped under CONCEPTS, PRODUCTION, BUSINESS and STRUCTURES.
Objects shown: Statistical Need, Business Case, Population, Concept, Statistical Program Design, Statistical Program, Statistical Activity, Data Channel, Data Resource, Process Step, Data Set, Unit, Classification, Variable, Data Structure, Process Input, Process Output, Acquisition Activity, Production Activity, Dissemination Activity.
Relationship labels shown: changes design of, has, includes, uses, specifies, describes, identifies, defines, measures, is associated with, comprises, may include, may initiate, initiates.]
27
GSIM Timeline
GSIM V1.0 was released in December 2012.
The most detailed documentation of GSIM is UML in Enterprise Architect
– More than 100 information objects
Higher level views and a glossary are also provided.
The next level of detail regarding correspondences between GSIM Information Objects and constructs in DDI & SDMX was completed in May.
– Included general identification of “gaps”, “overlaps”, strong alignments and partial alignments.
Important not to see standards as completely “fixed” in current form
– A further level of detail will be required to arrive at concrete recommendations for representing GSIM objects using SDMX and DDI
28
Analysis of use cases
The SDMX TWG has been defining a set of relevant use cases where the two standards could be compared and, if possible, used together:
1. Survey data collection
2. Administrative and register data
3. Combined use of DDI and SDMX
4. Micro-data access and on-demand tabulation of micro-data
5. Metadata and quality reporting
29
III. Better to have one standard than two?
30
Better to have one standard rather than two?
Theoretically, under ideal circumstances, perhaps, but in practice…
– How many standards are there associated with the components to build a car or a house?
– The press chose to have standards for Images, News, Events, Sports
What is often more important in practice is to define a “standard” means to harness multiple standards to support the interoperability needs of a particular industry.
– More on “how” later
– Is it better to have a suite of standards, each of which supports particular needs well, rather than a single standard which provides mediocre support for some needs?
– Commonly, only some of the relevant underlying standards are “indigenous” to the industry
» This is partly because many interoperability needs are broader than a single industry.
31
Considerations
Both DDI and SDMX have constituencies beyond official statistics, eg
– Research Institutes, Data Archives for DDI
– Central banks for SDMX
This is positive in several regards
– Broader interoperability, broader economies of scale
– Shared cost of maintaining and supporting the standard
If proposed further evolutions of DDI and SDMX would add complexity without value for other constituencies, or would contradict their business model, these may be resisted.
– (DDI/SDMX interoperability interests some other constituencies)
32
How about….
The official statistics industry develops and maintains an independent representation standard based on SDMX, DDI and on good models from statistical agencies?
33
Downsides include….
Standards are
– slow and costly to design and agree
– costly to document and to support through tools and expertise
– costly to maintain in terms of improving fitness for purpose and evolving as business needs change
If official statistics adopts standards that are similar to SDMX and DDI but increasingly divergent from them, external interoperability and economies of scale will decrease over time
– implications include decrease in possible vendor/market interest
34
How about….
….we use SDMX for everything
Given that one aim is structured metadata to drive business processes, a lot of the required metadata is not structurally defined in the SDMX Information Model
– Extending SDMX to model this structurally would make the standard much larger
» Central Banks (and others) do not seem enthusiastic about this
– Reference metadata, however, underpinned by metadata structure definitions, can carry any information
» Reference metadata, however, is not necessarily intended (and sometimes not readily modelled) for structural use
35
Considerations
Structural use of reference metadata in a standard manner requires common semantics
– eg, one quality declaration held as reference metadata is not necessarily comparable with another unless both are structured according to common semantics (eg ESMS)
– There would need to be a large exercise of defining common semantics for SDMX reference metadata to be used for structural purposes
Where semantics are already defined in DDI, would we choose something different?
– If yes, why invest in defining and maintaining different semantics?
– If no, we are, in effect, representing DDI in SDMX.
» Why not also support DDI syntax, with an option to use DDI tools?
» The option of using SDMX reference metadata for this purpose may, however, be useful in other circumstances.
36
Concept of profiles on standards
Many industries and developments have faced similar challenges.
“Application profiles” refer to a way of applying one or more standards in a particular context, eg
– an industry and/or environment, or
– an initiative, or
– IT applications
Examples
– The W3C standard for representing dates and times is a profile on ISO 8601
– INSPIRE, FGDC, ANZLIC etc use profiles on common ISO standards for the semantics and representation of geospatial metadata
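The W3C date-time example can be sketched in a few lines of Python. The pattern below is one possible rendering of the six granularities listed in the W3C "Date and Time Formats" note; the function name `conforms_to_w3c_profile` is made up for this illustration. It checks syntax only (it would accept 2013-02-31), so treat it as a sketch of how a profile narrows a broader standard, not as a complete validator.

```python
import re

# The W3C note allows: YYYY, YYYY-MM, YYYY-MM-DD, and date plus time with a
# mandatory time zone designator (Z or +hh:mm / -hh:mm), with optional seconds
# and fractional seconds. ISO 8601 itself permits many more forms.
W3C_DTF = re.compile(
    r"\d{4}"                                  # year
    r"(?:-(?:0[1-9]|1[0-2])"                  # optional month
    r"(?:-(?:0[1-9]|[12]\d|3[01])"            # optional day
    r"(?:T(?:[01]\d|2[0-3]):[0-5]\d"          # optional time: hours and minutes
    r"(?::[0-5]\d(?:\.\d+)?)?"                # optional seconds and fraction
    r"(?:Z|[+-](?:[01]\d|2[0-3]):[0-5]\d)"    # time zone is required with a time
    r")?)?)?"
)


def conforms_to_w3c_profile(text: str) -> bool:
    """Return True if the whole string matches the profile's syntax."""
    return W3C_DTF.fullmatch(text) is not None


# Accepted by the profile.
accepted = ["2013", "2013-06", "2013-06-05", "2013-06-05T09:30Z",
            "2013-06-05T09:30:15.25+02:00"]

# Valid ISO 8601 forms (basic format, week date, ordinal date) that the
# profile deliberately leaves out, plus a time without a zone designator.
rejected = ["20130605", "2013-W23-3", "2013-156", "2013-06-05T09:30"]
```

The design point is the one the slide makes: the profile adds no new notation, it only selects a subset of what the underlying standard already allows, so anything that conforms to the profile still conforms to the standard.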
37
Possible application to official statistics
An overall profile setting out how the official statistics industry proposes DDI and SDMX be used to support structured metadata needs of statistical business processes could be a target.
– The overall profile could be built up progressively as practical business needs related to particular business functions / sub-processes within the GSBPM are agreed
Helps minimise the risk of unnecessary differences in the way semantically equivalent metadata (eg classifications) are represented for different business operations
Specific IT applications can define which (usually small) subset of the overall profile they support
38
IV. Will everyone need to learn two standards?
39
Business Perspective
Most business staff should not need a detailed knowledge of SDMX and DDI.
– They should understand aspects of a common “business level” information model for the statistical information objects which are relevant to their work
– Somewhat similarly most users of the web don’t have a detailed knowledge of HTML
– They do, however, experience impacts if developers of browsers and web pages get it wrong (or right)
Most of those putting the INSPIRE directive into effect don’t need a detailed knowledge of ISO 19115.
40
Developer Perspective
Ideally, most developers of new IT components to support business processes, and most staff responsible for selecting & configuring IT components to support specific statistical business processes, won’t need a detailed knowledge of SDMX and DDI.
They may need to understand aspects of any agreed “industry profile” that are relevant to their work.
41
This would seem to require
A core team who, collectively, understand
– the standards and what they support
– “as is” business operations and “to be” target of transformation
– the perspective and needs of business staff and developers.
Effective ongoing engagement with business and developer communities to ensure proposed approaches will meet their needs
– requires an appropriate level of common language underpinned by common understanding
– GSBPM and GSIM as common points of reference can support some, but not all, of this communication
An iterative, agile approach – coherent but not monolithic – across the range of metadata requirements
42
Possible Dynamics
Different aspects of exploring and proposing “industry application” (eg regarding different types of metadata and regarding support for different business functions) could be led by teams in different agencies
– Is this more efficient and effective than seeking to determine everything from first principles in a single committee?
There would, however, need to be active, practically oriented review by many agencies to ensure all business considerations, including local considerations, were supported to the extent which was practical.
There would also need to be checking for coherence and consistency across different “packages” of the work.
43
V. ABS approach, so far….
44
[Diagram: MRR (Metadata Registry/Repository)]
Conceptual Model: “Business Objects”
“Canonical” Information Standards: “Profiles” on DDI, SDMX etc
– Information standards are used to represent (“instantiate”) business objects
Repository (Logical): “Objects live here”
– Objects are defined and retrieved based on standards
Registry: Registry Information Model (RIM)
– Allows discovery of objects and their address
– Registry lets you find/reference specific (instances of) business objects
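The split between a registry (find and reference objects) and a repository (where objects live) can be illustrated with a small Python sketch. All class names, the `repo://` address scheme and the example classification are hypothetical; this is not the ABS MRR design or any SDMX registry interface, only an illustration of the division of responsibilities described on the slide.

```python
from dataclasses import dataclass


class Repository:
    """Where objects 'live': maps an address to a stored representation."""

    def __init__(self) -> None:
        self._store: dict[str, dict] = {}

    def put(self, address: str, representation: dict) -> None:
        self._store[address] = representation

    def get(self, address: str) -> dict:
        return self._store[address]


@dataclass(frozen=True)
class RegistryEntry:
    """What the registry knows about an object: identity, type, standard, address."""
    object_id: str
    object_type: str   # a business-object type, e.g. "Classification"
    standard: str      # the standard used to represent it, e.g. "DDI" or "SDMX"
    address: str       # where the repository holds the representation


class Registry:
    """Supports discovery of objects and resolution of their address."""

    def __init__(self) -> None:
        self._entries: dict[str, RegistryEntry] = {}

    def register(self, entry: RegistryEntry) -> None:
        self._entries[entry.object_id] = entry

    def find(self, object_type: str) -> list[RegistryEntry]:
        return [e for e in self._entries.values() if e.object_type == object_type]

    def resolve(self, object_id: str) -> str:
        return self._entries[object_id].address


# Store a (placeholder) representation, then register where it can be found.
repository = Repository()
registry = Registry()
sex_codes = {"codes": {"1": "Male", "2": "Female"}}
repository.put("repo://classifications/sex/1.0", sex_codes)
registry.register(
    RegistryEntry("CL_SEX", "Classification", "SDMX", "repo://classifications/sex/1.0")
)

# A consumer discovers the object through the registry, then fetches it.
found = registry.find("Classification")
fetched = repository.get(registry.resolve(found[0].object_id))
```

Keeping the two roles separate means a consumer only needs the registry to discover an object, while the representation itself can be held wherever the repository decides.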
45
[Same MRR diagram as on the previous slide, with the following added]
2011 – Defining the ABS Transitional Metadata Model (ATMM)
– ATMM “Technical”
– ATMM “Conceptual”
– ATMM “Alignment with standards”
46
[Same MRR diagram as on the previous slides, with the following added]
2012
– ATMM “Technical”
– ATMM “Conceptual”
– ATMM “Alignment with standards”
– GSIM (Under development)
47
[Same MRR diagram as on the previous slides, with the following added]
2013
– ATMM
– GSIM@ABS
– InfoStandards @ ABS
– Only a small core currently
– Growing daily, prioritised by need.
– GSIM V1.0
– GSIM/SDMX GSIM/DDI relationships (UNECE)
– Early work on model based, GSIM aligned, DDI (DDI Alliance)
48
[Same MRR diagram as on the previous slides, with the following added]
Future?
– ATMM
– GSIM@ABS
– InfoStandards @ ABS
– Only a small core currently
– Growing daily, prioritised by need.
– GSIM V1.0
– GSIM/SDMX GSIM/DDI relationships (UNECE)
– Early work on model based, GSIM aligned, DDI (DDI Alliance)
– BEANS!
49
VI. Discussion
50
VII. Optional bonus material
    I. The challenge
    II. Application
    III. Use cases
51
The challenge
Is not about which flavor of XML we use (XML doesn’t really matter)
It’s about data and metadata!
– If I want to use DDI to describe my data, and you want to use SDMX, how can we ensure that we are getting the same data and metadata?
52
The challenge (2)
If I am using SDMX, but I am sent DDI, a simple transformation must give me the same payload of data and metadata
– And vice versa for DDI users who are sent SDMX
Conventions will need to be established regarding identifiers and the way the unit record files are structured
There will need to be agreed models for each business case
53
Combined DDI-SDMX approaches
Mixing the two standards within an implementation, allowing for the expression of the same metadata in both standards, so that the information could be transformed from one format to the other.
– This way, it would become possible to select either DDI or SDMX for a particular operation, similar to what we discussed above regarding application functionality.
Metadata stored and indexed in such a fashion that it can be expressed either as SDMX or DDI on an as-needed basis.
– Metadata Repository and Registry project at ABS.
– The actual format used for metadata storage may be neither SDMX nor DDI, so long as it can be expressed using both standards.
GSIM to be implemented through a combination of SDMX and DDI?
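The idea of storing metadata in a neutral form and expressing it in either standard on demand can be sketched as follows. The dictionaries are deliberately simplified stand-ins whose key names only loosely echo DDI and SDMX vocabulary; they are assumptions for illustration and do not follow either standard's actual schema. The point being shown is the round-trip property: whichever expression is received, the same payload is recovered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NeutralVariable:
    """Hypothetical storage form that is neither DDI nor SDMX."""
    identifier: str
    label: str
    categories: tuple  # ((code, label), ...)


def to_ddi_like(variable: NeutralVariable) -> dict:
    """Express the variable in a simplified, DDI-flavoured shape."""
    return {"Variable": {
        "id": variable.identifier,
        "Label": variable.label,
        "Codes": [{"Value": code, "CategoryLabel": label}
                  for code, label in variable.categories],
    }}


def from_ddi_like(doc: dict) -> NeutralVariable:
    v = doc["Variable"]
    return NeutralVariable(
        v["id"], v["Label"],
        tuple((c["Value"], c["CategoryLabel"]) for c in v["Codes"]),
    )


def to_sdmx_like(variable: NeutralVariable) -> dict:
    """Express the same variable in a simplified, SDMX-flavoured shape."""
    return {
        "Concept": {"id": variable.identifier, "Name": variable.label},
        "Codelist": {"Codes": [{"id": code, "Name": label}
                               for code, label in variable.categories]},
    }


def from_sdmx_like(doc: dict) -> NeutralVariable:
    return NeutralVariable(
        doc["Concept"]["id"], doc["Concept"]["Name"],
        tuple((c["id"], c["Name"]) for c in doc["Codelist"]["Codes"]),
    )


def ddi_to_sdmx(doc: dict) -> dict:
    """A 'simple transformation': go through the neutral form."""
    return to_sdmx_like(from_ddi_like(doc))


sex = NeutralVariable("SEX", "Sex of respondent", (("1", "Male"), ("2", "Female")))
sent_as_ddi = to_ddi_like(sex)
received_as_sdmx = ddi_to_sdmx(sent_as_ddi)
```

The round trip only works here because both expressions agree on the identifier and on how categories are listed, which is the kind of convention the preceding slide says would need to be established.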
54
[Diagram: Generic Statistical Information Model (GSIM) shown together with SDMX, DDI, ISO 11179, etc.]
55
Expanding on the diagram
[Diagram: Generalised Statistical Production System, spanning Conceptual to Practical levels.
Labels shown: GSIM, GSBPM, Methods, Technology, Service, Service Inputs, Service Outputs, business process.
Annotations shown: informs; enables; Service defined by methods and business need; Standards Based, e.g. DDI, SDMX.]
56
Survey
A Survey is targeted at a specific Population and comprises Questions
Questions may be linked to a Variable which specifies
– conceptual meaning (Concept)
– valid set of responses that are allowed (Category Scheme and contained Category)
Output from the Survey is a Unit Record Data Set
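The relationships on this slide can be written down as a small Python data model. The class names follow the terms used on the slide, but the fields, the `invalid_variables` helper and the example survey are assumptions made for illustration; this is not the DDI or GSIM object model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Category:
    code: str
    label: str


@dataclass(frozen=True)
class CategoryScheme:
    name: str
    categories: tuple  # tuple of Category

    def allows(self, code: str) -> bool:
        """True if the code is one of the valid responses."""
        return any(c.code == code for c in self.categories)


@dataclass(frozen=True)
class Concept:
    name: str
    definition: str


@dataclass(frozen=True)
class Variable:
    name: str
    concept: Concept                 # conceptual meaning
    category_scheme: CategoryScheme  # valid set of responses


@dataclass(frozen=True)
class Question:
    text: str
    variable: Optional[Variable] = None  # a Question *may* be linked to a Variable


@dataclass(frozen=True)
class Survey:
    name: str
    population: str
    questions: tuple  # tuple of Question

    def invalid_variables(self, record: dict) -> list:
        """Names of variables whose response in a unit record is missing or not allowed."""
        problems = []
        for question in self.questions:
            variable = question.variable
            if variable is None:
                continue
            code = record.get(variable.name)
            if code is None or not variable.category_scheme.allows(code):
                problems.append(variable.name)
        return problems


yes_no = CategoryScheme("YesNo", (Category("1", "Yes"), Category("2", "No")))
employed = Variable("EMPLOYED", Concept("Employment", "Has paid work"), yes_no)
survey = Survey(
    "Example Labour Survey", "Residents aged 15 and over",
    (Question("Do you currently have paid work?", employed),
     Question("Any other comments?")),
)

# The survey output: a unit record data set, one dictionary per responding unit.
unit_records = [{"EMPLOYED": "1"}, {"EMPLOYED": "9"}, {}]
checks = [survey.invalid_variables(r) for r in unit_records]
```

Because the valid responses are held in the Category Scheme rather than in the checking code, the same structural metadata can drive both collection and validation of the unit records.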
57
The Proposed Approach
The full set of information includes:
– The unit record data
– Structural information about the variables and representations
– Additional information about how the data has been generated/collected/processed
In DDI, this set of information can be expressed as a DDI instance and a data file
– Both the structural and processing metadata can be expressed as a single DDI instance
58
Output Tables
59
Concepts
60
[Diagram labels: Metadata Set, Unit Record Data, DDI Instance, ASCII Data File, SDMX Data Set, SDMX Structural Metadata, SDMX Metadata Report]