1
Logical Information Model (LIM)
Geneva, 22-24 June
2
Why LIM
The gap between the conceptual nature of GSIM and the practical implementation focus of CSPA was too wide. To bridge this gap, a new layer, a Logical Information Model (LIM), was needed.
3
How to make best use of GSIM in the context of CSPA
[Figure: CSPA Service and Stats Agency Environment]
4
GLUE
5
CSPA Logical Information Model
6
LIM
The aim of the LIM is to translate the conceptual GSIM information objects into physical specifications of the information that flows in and out of statistical services.
LIM describes the information objects and logical relationships required to support a CSPA service, in a manner which is consistent with GSIM.
LIM is independent of the terminology used in existing standards such as SDMX and DDI.
7
Scope for LIM
Not all GSIM information objects will make it to LIM.
The LIM is concerned only with the service, so it will not take every GSIM information object down to LIM level.
8
The way forward: Build the LIM as we need it!
9
LIM development criteria
CSPA services development roadmap
Statistical agencies' internal service development roadmap
Reusability factor
Coverage provided by existing standards
10
LIM possible physical representations
The picture shows the CSPA LIM and its possible physical representations. LIM will make use of a number of standard logical models, e.g. DDI, SDMX, and others. In some cases standard logical models overlap, i.e. a particular LIM object could be represented by different standards.
Some areas of the standards lie outside LIM. These consist of objects whose usage practitioners have not clearly agreed on: people in different domains represent the same information by means of different logical objects. These ambiguous areas will shrink over time as more objects are included in LIM and their usage for data exchange among CSPA services is disambiguated.
We will identify which existing standards (if any) are relevant to the GSIM (conceptual) information objects that are in scope of a particular service. Logical modelling for CSPA will align to the maximum practical extent with the logical models associated with the candidate standards. Where complete alignment with existing standards is not practical, the usual decision will be for the LIM to align with one or other of the choices on a "best fit" basis.
Depending on what information is being represented in practice, DDI and SDMX are currently expected to provide the primary basis for the physical representation of statistical information (e.g. data and metadata) in CSPA. In future, VTL may also be used for validation, and perhaps Data Cube for Linked Data.
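The overlap and "best fit" ideas above can be sketched as a small lookup. This is only an illustration: the table of candidate standards, the object names chosen, and the functions `overlapping` and `best_fit` are assumptions made for this example, not part of LIM or CSPA.

```python
from typing import Dict, List, Optional

# Hypothetical table: which standards could physically represent a LIM object.
# The entries are illustrative assumptions, not an agreed CSPA mapping.
CANDIDATES: Dict[str, List[str]] = {
    "Code List": ["SDMX", "DDI"],   # overlap: more than one standard fits
    "Data Set": ["SDMX", "DDI"],    # overlap
    "Questionnaire": ["DDI"],
}


def overlapping(candidates: Dict[str, List[str]]) -> List[str]:
    """Return the objects that more than one standard could represent."""
    return sorted(obj for obj, stds in candidates.items() if len(stds) > 1)


def best_fit(obj: str, candidates: Dict[str, List[str]],
             preference: List[str]) -> Optional[str]:
    """Pick the first preferred standard that covers the object, if any."""
    options = candidates.get(obj, [])
    for standard in preference:
        if standard in options:
            return standard
    return None  # object not (yet) in the table


overlaps = overlapping(CANDIDATES)
choice = best_fit("Code List", CANDIDATES, preference=["SDMX", "DDI"])
```

In this sketch the preference order stands in for the "best fit" decision; in practice that decision would depend on the service and the information being exchanged.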
11
Adding attributes to existing GSIM classes: Base classes and Code List
All attributes in Identifiable Artefact, except Local ID, are mandatory. This required changing the cardinality of some attributes relative to GSIM and moving some attributes to other information objects. After mapping to DDI, three new attributes were added: Local ID, Version Date and Version Rationale.
A Code List is a type of Node Set that contains Code Items, i.e. combinations of the meaning of a Category with a Code representation. Specific Code Items can be created to support imputation of missing values. The Code List inherits all identification and administrative information from Identifiable Artefact.
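A minimal sketch of these classes, assuming Python dataclasses as the notation. Only Local ID, Version Date and Version Rationale are named in the slide; the other field names (`id`, `name`, `version`), the `lookup` helper and the example codes are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class IdentifiableArtefact:
    # Mandatory attributes. id, name and version are assumed names for this sketch.
    id: str
    name: str
    version: str
    version_date: date        # added after the mapping to DDI
    version_rationale: str    # added after the mapping to DDI
    local_id: Optional[str] = None  # the only non-mandatory attribute


@dataclass
class Category:
    name: str  # the meaning, e.g. "Male"


@dataclass
class CodeItem:
    code: str           # the Code representation, e.g. "M"
    category: Category  # the meaning attached to that code


@dataclass
class NodeSet(IdentifiableArtefact):
    pass


@dataclass
class CodeList(NodeSet):
    # Inherits identification and administrative attributes from IdentifiableArtefact.
    items: List[CodeItem] = field(default_factory=list)

    def lookup(self, code: str) -> Optional[Category]:
        """Return the Category for a code, or None if the code is not in the list."""
        for item in self.items:
            if item.code == code:
                return item.category
        return None


sex = CodeList(
    id="CL_SEX", name="Sex", version="1.0",
    version_date=date(2016, 1, 1), version_rationale="Initial version",
    items=[
        CodeItem("M", Category("Male")),
        CodeItem("F", Category("Female")),
        CodeItem("9", Category("Not stated")),  # hypothetical item a service could target for imputation
    ],
)
```

Leaving `local_id` as the only field with a default mirrors the rule that every other Identifiable Artefact attribute must be supplied.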
12
Adding class extensions to GSIM: Process Execution
A Process Control defines the flow within Process Steps and between different Process Steps. In particular, it can describe the flow within a service and also between different services. Process Input and Process Output identify the information objects passed across the service boundary. LIM has three types of Process Input and three types of Process Output; these six classes are additions to GSIM 1.1.
The three types of Process Input in LIM:
Transformable Input: any object passed into a service that will be manipulated by the execution of the service, e.g. Data Sets and structural metadata.
Process Support Input: a resource that is referenced or used to guide the service in completing its execution, e.g. code lists, background data, auxiliary data sets, or classifiers used as part of the service.
Parameter Input: an input used to specify which configuration should be used for a specific Process Step that has been designed to be configurable.
The three types of Process Output in LIM:
Transformed Output: the product of the actions executed by the Service, e.g. the updated ("value added") version of one or more Transformable Inputs supplied to the Service (such as a Data Set in which previously missing values have been imputed).
Process Execution Log: captures the output of a Process Step that is not directly related to the Transformed Output it produced, e.g. data recorded during the real-time execution of the Process Step.
Process Metric: records quality or methodological measurements about the execution of a service. One purpose of a Process Metric may be to provide a quality measure related to the Transformed Output, e.g. a measure of how many records were imputed, and a measure of how much difference, statistically, the imputed values make to the Data Set overall.
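To make the six classes concrete, here is a toy sketch of an imputation service that receives one input of each type and returns one output of each type. The class layout, the `impute_service` function, the "mean" and "auxiliary" methods and the sample values are all assumptions made for this example; LIM does not prescribe them.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class ProcessInput:
    name: str
    value: Any


class TransformableInput(ProcessInput):
    """Object the service will manipulate, e.g. a Data Set."""


class ProcessSupportInput(ProcessInput):
    """Resource that guides execution, e.g. an auxiliary data set."""


class ParameterInput(ProcessInput):
    """Configuration choice for a configurable Process Step."""


@dataclass
class ProcessOutput:
    name: str
    value: Any


class TransformedOutput(ProcessOutput):
    """The "value added" version of a Transformable Input."""


class ProcessExecutionLog(ProcessOutput):
    """Information recorded while the Process Step ran."""


class ProcessMetric(ProcessOutput):
    """Quality or methodological measurement about the execution."""


def impute_service(inputs: List[ProcessInput]) -> List[ProcessOutput]:
    """Toy service: fill missing values (None) in a list of numbers."""
    by_type = {type(i): i.value for i in inputs}
    data = by_type[TransformableInput]
    auxiliary = by_type[ProcessSupportInput]
    method = by_type[ParameterInput]  # "mean" or "auxiliary" (hypothetical options)

    observed = [v for v in data if v is not None]
    mean = sum(observed) / len(observed) if observed else None
    imputed = [
        (mean if method == "mean" else auxiliary[i]) if v is None else v
        for i, v in enumerate(data)
    ]
    return [
        TransformedOutput("imputed data set", imputed),
        ProcessExecutionLog("log", [f"method={method}", f"records={len(data)}"]),
        ProcessMetric("records imputed", sum(v is None for v in data)),
    ]


outputs = impute_service([
    TransformableInput("turnover", [10.0, None, 14.0]),
    ProcessSupportInput("auxiliary turnover", [9.0, 11.0, 15.0]),
    ParameterInput("method", "mean"),
])
```

Here the Process Metric is simply the count of imputed records; a fuller service might also report how much the imputed values change the Data Set statistically.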
13
Developing LIM for each service:
Stage 1:
Determine the information inputs and outputs (GSIM objects)
Check if the inputs/outputs are covered by LIM
If yes, the service builder uses the relevant part of the LIM, as well as a recommended physical representation
If no, contact the CSPA implementation group for guidance and possible LIM development
Look at which standards are in scope of the request
Stage 2:
LIM development team consults more broadly on needs
On acceptance of development, make it CSPA mandated
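The Stage 1 coverage check can be sketched as a simple partition of a service's inputs and outputs. The coverage table, its recommended representations and the `stage_one` function are hypothetical and exist only to illustrate the yes/no branch.

```python
from typing import Dict, List

# Hypothetical LIM coverage: object name -> recommended physical representation.
LIM_COVERAGE: Dict[str, str] = {
    "Data Set": "SDMX",
    "Code List": "SDMX",
}


def stage_one(service_io: List[str], lim: Dict[str, str]) -> Dict[str, object]:
    """Split a service's GSIM inputs/outputs into covered and uncovered objects."""
    covered = {obj: lim[obj] for obj in service_io if obj in lim}
    missing = [obj for obj in service_io if obj not in lim]
    return {
        "use_from_lim": covered,                   # "if yes" branch
        "refer_to_implementation_group": missing,  # "if no" branch
    }


result = stage_one(["Data Set", "Code List", "Variable"], LIM_COVERAGE)
```

Objects in the second group would then go to the CSPA implementation group, which may trigger the Stage 2 consultation.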
14
Status for LIM
15
Candidate Services
16
Further development of LIM
CSPA Implementation group
Subgroup LIM
Variables identified as high priority
Helps avoid the risk of implementers using different Variables incorrectly, making it easier to share services
Neuchâtel terminology model definitions
17
Where to find information
18
Thank you!
Eva Holm - Statistics Sweden
David Barraclough - OECD
Flavio Rizzolo - Statistics Canada
Based on original slides by Therese Lalor