Published by Ginger Holmes. Modified over 6 years ago.
1
Italian National Institute of Statistics - Istat
The Italian Integrated System of Statistical Registers: On the Design of an Ontology-based Data Integration Architecture
R. Radini, M. Scannapieco, G. Garofalo
Monica Scannapieco – Brussels, NTTS, March 2017
2
Outline
Introduction to ISSR
OBDM and examples
Data architecture
Correspondence with EARF
DV vs DW
Conclusions
3
ISSR – Italian Integrated System of Statistical Registers
Istat has undertaken a modernization programme aimed at a significant revision of statistical production.
One of the main pillars of this revision is the design of production processes based on an Integrated System of Statistical Registers.
The ISSR is a single logical environment supporting the consistency of statistical production processes in Istat, in particular consistency in “identification” and “estimation” for the whole integrated system of units and variables.
4
ISSR: Types of Registers
RSE (Extended registers): extend the information of a specific RSB for a specific population of that RSB.
RST (Thematic registers): support multiple statistical processes through a consistent and shared treatment of some topics.
RSB (Base registers): contain several statistical populations and the minimum set of variables needed to characterize statistical units.
5
OBDM: Ontology-Based Data Management System
Ontology (or computational ontology): a conceptual representation of data expressed through «computational» languages.
In mathematical logic: an axiomatic first-order theory expressible in description logic.
OBDM is an integration system in which the usual ER global schema is replaced by the conceptual model of the application domain, formulated as an ontology.
6
OBDM Architecture: Main features
Three-level architecture: Ontology, Sources, Mapping
Data source transparency property (called data virtualization by IT platforms)
Global view
Consistency
[Diagram: Ontology, Mapping, Data source 1, Data source 2, Data source 3]
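The three levels can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the source contents, the field names and the `MAPPING` table are invented for illustration and do not reflect actual Istat sources or mappings. A real OBDM system would typically rewrite queries rather than copy rows; the sketch only shows how a mapping hides the sources behind an ontology concept.

```python
# Level "Sources": each source has its own (hypothetical) physical schema.
SOURCE_1 = [{"cod_fis": "A1", "comune": "Roma"}]
SOURCE_2 = [{"person_id": "B2", "city": "Milano"}]


def individuals_from_source_1():
    """Translate rows of SOURCE_1 into the vocabulary of the ontology."""
    return [{"id": r["cod_fis"], "residence": r["comune"]} for r in SOURCE_1]


def individuals_from_source_2():
    """Translate rows of SOURCE_2 into the vocabulary of the ontology."""
    return [{"id": r["person_id"], "residence": r["city"]} for r in SOURCE_2]


# Level "Mapping": for each ontology concept, the functions that populate it.
MAPPING = {
    "Individual": [individuals_from_source_1, individuals_from_source_2],
}


# Level "Ontology": the only vocabulary the user needs to know.
def instances_of(concept):
    """Return all instances of a concept, hiding where they are stored."""
    rows = []
    for fetch in MAPPING[concept]:
        rows.extend(fetch())
    return rows


individuals = instances_of("Individual")
```

The caller asks for `Individual` and never mentions `SOURCE_1` or `SOURCE_2`, which is the data source transparency property in miniature.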
7
Excerpt of the Ontology of the Working Relationships
[Diagram: concepts Worker, Employee, Self-employee]
8
Excerpt of the Population Ontology
[Diagram: concepts Individual, Family, Family registry, Common law family]
9
Data Integration: same concept
Individual (Population Ontology) and Individual (Working Relationships Ontology) are the same concept.
10
Querying over the ontology
Query: people who reside in a certain region, classified by age, educational degree and employment condition.
We do not need to know how the information is stored in the sources!
11
[Diagram: the query "people that have residence in a certain region, classified by age and educational degree, by employment condition" is posed over the Ontology and, through the Mappings, rewritten over the sources: RS of Individuals and RS of Labour]
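The rewriting step can be sketched in Python under strong assumptions. The register extracts, their field names, the ten-year age classes and the split of attributes between the two registers (age and degree in RS of Individuals, employment condition in RS of Labour) are all hypothetical and chosen only to make the example run.

```python
from collections import Counter

# Hypothetical extracts of the two registers; field names are invented.
RS_INDIVIDUALS = [
    {"id": 1, "region": "Lazio", "age": 34, "degree": "tertiary"},
    {"id": 2, "region": "Lazio", "age": 51, "degree": "secondary"},
    {"id": 3, "region": "Veneto", "age": 29, "degree": "tertiary"},
]
RS_LABOUR = [
    {"person": 1, "condition": "Employee"},
    {"person": 2, "condition": "Self-employee"},
    {"person": 3, "condition": "Employee"},
]


def age_class(age):
    """Ten-year age class, e.g. 34 -> '30-39' (an arbitrary choice here)."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


def residents_by_profile(region):
    """Ontology-level query: residents of a region by age, degree, condition.

    The body plays the role of the rewritten query: it knows which register
    holds which attribute and joins the two on the person identifier.
    """
    condition_of = {row["person"]: row["condition"] for row in RS_LABOUR}
    counts = Counter()
    for ind in RS_INDIVIDUALS:
        if ind["region"] == region:
            key = (
                age_class(ind["age"]),
                ind["degree"],
                condition_of.get(ind["id"], "unknown"),
            )
            counts[key] += 1
    return dict(counts)


result = residents_by_profile("Lazio")
```

The caller only names the region; the join between the two registers stays inside the function, mirroring the idea that the user does not need to know how the information is stored.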
12
High expressive power
It is possible to give different definitions of a concept depending on the instance.
It is possible to express different constraints related to each definition.
CorporationManager has a different semantics according to the domain.
[Diagram: CorporationManager (Labour Force), Employee, Self-employee, CorporationManager (National Accounts)]
13
Data architecture: Compliance with EARF (Enterprise Architecture Reference Framework)
Metadata Management
Primary Data Storage
Quality Assessment
Unitary Metadata System
Logical centralization of ISSR
Data consistency
OBDM
14
Data architecture: IT View
Features | DV | DW
Storage of Historical Data | NO | YES
Capture Every Change in Production Data (requires integration with CDC) | |
Multi-Dimensional Data Structures | |
Data Pre-Aggregation | |
Query performance on large amounts of data | SLOW (relative to DW) | FAST (relative to DV)
Data Integration on Demand | |
Operational Cost | LOW | HIGH
Time-To-Market | |
Easy to Make Changes | |
Dependence on IT | |
15
Conclusions
EA approach for ISSR design and implementation
ISSR Data Architecture: hybrid solution with DV and DW, e.g. DV-based data architecture with DW for historical data and dissemination
Next steps:
Prototypes of RSB Individual, Families and Cohabitations and RST Working Relationships
Guidelines for the Management of the Integrated System of Statistical Registers
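One way to picture the hybrid solution is a small routing rule that sends a request either to the data virtualization layer or to the data warehouse. The sketch below is purely illustrative: the function name `choose_backend`, the `purpose` parameter and the rule that data are "historical" when their reference year precedes the current year are assumptions, not part of the presented architecture.

```python
from datetime import date


def choose_backend(reference_date, today, purpose="production"):
    """Pick 'DW' for historical data or dissemination, 'DV' otherwise.

    Assumption of this sketch: data count as historical when their
    reference year precedes the current year.
    """
    if purpose == "dissemination":
        return "DW"
    if reference_date.year < today.year:
        return "DW"
    return "DV"


today = date(2017, 3, 14)
current = choose_backend(date(2017, 1, 1), today)
historical = choose_backend(date(2011, 10, 9), today)
published = choose_backend(date(2017, 1, 1), today, purpose="dissemination")
```

Current production data go through DV, while historical or disseminated data are served from DW, matching the division of roles named in the conclusions.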