“The role of S-DWH in the ESS 2020 modernization process”


1 “The role of S-DWH in the ESS 2020 modernization process”
ESS - Collaboration in Research and Methodology for Official Statistics “Centre of Excellence on Data Warehousing” National Institute of Statistics – Italy “The role of S-DWH in the ESS 2020 modernization process” Antonio Laureti Palma - ISTAT CoE Workshop on Data warehousing Wiesbaden 23 – 24 November 2016

2 Summary
2020 ESS vision; New data source: Big Data; Critical issues of traditional S-DWH; Register based production process; Corporate Statistical Data Warehouse; Evolution of the S-DWH

3 Introduction
In the communication society, there is a growing need for more and better official statistics, both for straightforward official communication and for decision-making, planning, and evaluation. At the same time, budget and time constraints are forcing all producers of official statistics to improve their infrastructure and economize on resources. To meet the needs of the communication society and to reap any new potential benefits, an agreement on the ESS Vision 2020 has been reached as the guiding framework for development.

4 The innovation process - strategy
The 2020 ESS vision is based on: KEY AREA 1: Identifying user needs and cooperation with stakeholders; KEY AREA 2: Quality of European statistics; KEY AREA 3: New data sources; KEY AREA 4: Efficient and robust statistical processes.

5 New data source: Big Data
“Big Data” is a term for the collection of data sets so large and complex that they cannot easily be managed by traditional data warehouse technologies. Big Data is generated by devices; blogs and social feeds; mobile applications; clickstreams; ATM, RFID, and sensors; feeds for eGov, weather, traffic, and market sites; and so much more. Big Data is unstructured, unsanitized, non-relational and is not generated or owned by the business.

6 Big Data in the modern business environment:

7 The innovation process - strategy
2020 ESS vision, KEY AREA 3: New data sources: establish alliances and partnerships with data owners; invest in new IT tools and methodological development; consider organizational challenges in harnessing new data sources; continue to improve existing data collection methods.

8 Critical issues of traditional S-DWH
The traditional data warehouse is under pressure from two trends: increasing data volumes and new sources of non-relational, unstructured data. A traditional data warehouse based on an RDBMS is not architected to support unstructured Big Data, which could result in decreased performance and slower time-to-value. The traditional data warehouse needs to evolve, in terms of both model and technologies, towards a flexible infrastructure.

9 Statistical production system
A flexible production infrastructure should: offer opportunities to create new, more advanced products; make all data sources usable; support statistical methodology and appropriate survey designs; possess the competence and technologies to utilize any newly available external data source; take into account the implications for the whole production system.

10 Requirements for usability of external sources:
information on the source from the data supplier; stability of the delivery of the source; analysis and data editing of the source; integrability with surveys with similar variables; integrability of the source with the statistical registers.

11 External sources can be used in 4 different ways:
completely alone (register surveys); alone but combined with a base register; in combination with base registers and other administrative data; alternatively, an external source can also be used simply to improve other surveys.

12 Register based production process
A system of base registers, and the links between them, could constitute the basis of production systems that utilize external data to their full extent: external sources can be used to improve the coverage, currency and classifications (such as Sector, Economic Activity or Region) of the base registers; register surveys use the base registers as population registers and use the links to integrate data from other sources; sample surveys continue to use them as high-quality sampling frames.
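
The register-based approach above can be illustrated with a minimal Python sketch. It assumes a hypothetical base register keyed by a unit identifier and a hypothetical external source that shares that identifier; the field names (`unit_id`, `activity`, `region`, `turnover`) and all values are illustrative and do not come from the presentation.

```python
# Hypothetical base register: one record per statistical unit.
base_register = {
    "U1": {"activity": "A01", "region": "North"},
    "U2": {"activity": None, "region": "South"},
}

# Hypothetical external (administrative) source sharing the same key.
external_source = [
    {"unit_id": "U1", "activity": "A01", "turnover": 120},
    {"unit_id": "U2", "activity": "C10", "turnover": 80},
    {"unit_id": "U3", "activity": "G47", "turnover": 45},
]


def improve_register(register, source):
    """Return an updated copy of the register and the ids newly added.

    Missing classifications are filled in from the source (classification
    improvement) and units unknown to the register are added (coverage
    improvement). Existing non-missing values are never overwritten.
    """
    updated = {uid: dict(rec) for uid, rec in register.items()}
    added = []
    for row in source:
        uid = row["unit_id"]
        if uid not in updated:
            updated[uid] = {"activity": row["activity"], "region": None}
            added.append(uid)
        elif updated[uid]["activity"] is None:
            updated[uid]["activity"] = row["activity"]
    return updated, added


def link_to_register(register, source):
    """Link source rows to register records through the shared unit id."""
    return [
        {**register[row["unit_id"]], **row}
        for row in source
        if row["unit_id"] in register
    ]


updated_register, new_units = improve_register(base_register, external_source)
linked = link_to_register(updated_register, external_source)
```

In this sketch the register is copied rather than modified in place, so the original state remains available for comparison; whether a real system would do the same is a design choice the slides do not address.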

13 An example from Anders and Britt Wallgren (2007)

14 Corporate Data Warehouse: organizational views
Based on Sundgren '97, The Systems Approach to Official Statistics. [Figure: organizational integration (low to high) plotted against process integration (low to high), with the data warehouse (DW) positioned among four organizational forms: autonomous stovepipes, co-ordinated stovepipes, matrix organization, functional organization.]

15 Generic Enterprise Model for Statistics
Core aspects of a statistical enterprise and their interrelations, from ESS “Building the future of statistical system”

16 Corporate Statistical-Data Warehouse
The corporate S-DWH is the central repository of a register-based information system. The S-DWH will act as a clearinghouse, enabling input data from different sources to be combined in a “many-to-many” correspondence that serves a multitude of user needs. A high level of coordination is necessary within the different topics, within the different operational phase activities, and between topics and activities. Defining and enabling its evolution requires creating, communicating and improving the key requirements, principles and models.

17 Corporate Statistical-Data Warehouse
Organizational view. Functional organization: the stovepipes are no longer visible; the number of input processes is potentially reduced and the number of outputs from the system will grow; the organization has functional units; the production systems are fully integrated in a common system of linked registers; production processes are standardized and interact with the common S-DWH.

18 Corporate Statistical-Data Warehouse
S-DWH layered functional architecture: source layer: the level at which we locate all the activities related to storing and managing internal (survey) or external (archive) raw data sources; integration layer: performs all typical Extraction, Transformation and Loading (ETL) functions in order to make data linkable; interpretation and data analysis layer: specialized to support all evaluation and analysis phases on raw or cleaned linkable data; access layer: addressed to a range of users for the presentation of the information sought.
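
As a rough illustration of how the four layers hand data to one another, the following Python sketch models each layer as a function. The data, field names and function names are hypothetical, and the sketch is only one possible reading of the layered architecture, not the S-DWH implementation.

```python
def source_layer():
    """Store raw data as delivered (hypothetical survey and archive rows)."""
    return {
        "survey": [{"id": " u1 ", "value": "10"}, {"id": "U2", "value": "30"}],
        "archive": [{"id": "u1", "region": "north"}, {"id": "u2", "region": "south"}],
    }


def integration_layer(raw):
    """ETL step: clean keys and types so that the two sources become linkable."""
    regions = {r["id"].strip().upper(): r["region"] for r in raw["archive"]}
    linked = []
    for row in raw["survey"]:
        uid = row["id"].strip().upper()
        linked.append({"id": uid, "value": int(row["value"]), "region": regions.get(uid)})
    return linked


def interpretation_layer(linked):
    """Analysis step: derive regional totals from the linked micro data."""
    totals = {}
    for row in linked:
        totals[row["region"]] = totals.get(row["region"], 0) + row["value"]
    return totals


def access_layer(totals):
    """Presentation step: format the results for end users."""
    return [f"{region}: {value}" for region, value in sorted(totals.items())]


output = access_layer(interpretation_layer(integration_layer(source_layer())))
```

Each function only consumes the output of the layer below it, which mirrors the separation of concerns the slide describes.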

19 STATISTICAL DATA WAREHOUSE
Corporate Statistical-Data Warehouse: layered functional architecture view. [Figure: the Statistical Data Warehouse as a stack of layers, labelled in this order: Data Warehouse: access layer, interpretation and analysis layer; Operational Data: integration layer, sources layer.]

20 Two basic business processes
Depending on the source, we can distinguish two extreme and different basic production processes. Stable source: a typical iterative, linear statistical process, where the data sources are controlled by the NSI: design (2), build (3), collect (4), process (5), analyze (6), disseminate (7). Not-stable source: a non-linear statistical process, where the data sources are not controlled by the NSI; the same phases appear, collect (4), design (2), build (3), process (5), analyze (6), disseminate (7), but they are not arranged in a linear sequence in the original diagram.

21 STATISTICAL DATA WAREHOUSE
Corporate Statistical-Data Warehouse: the GSBPM phases in the functional layers view. Case of a stable business process: [Figure: access layer: disseminate; interpretation and analysis layer: analyze; integration layer: process; sources layer: collect.]

22 STATISTICAL DATA WAREHOUSE
Corporate Statistical-Data Warehouse: the GSBPM phases in the functional layers view. Case of a not-stable business process: [Figure: access layer: disseminate; interpretation and analysis layer: analyze, build; integration layer: process; sources layer: collect.]
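
The two layer views can be written down as a small lookup table. The Python sketch below is an assumption-laden reading of the two figures: in particular, placing the build phase in the interpretation and analysis layer for the not-stable case is inferred from the order of the labels on the slide, and the names `LAYER_PHASES` and `layer_of` are hypothetical.

```python
# Phase-to-layer assignment as read from the two slides (an interpretation,
# not an official mapping).
LAYER_PHASES = {
    "stable": {
        "access": ["disseminate"],
        "interpretation and analysis": ["analyze"],
        "integration": ["process"],
        "sources": ["collect"],
    },
    "not-stable": {
        "access": ["disseminate"],
        "interpretation and analysis": ["analyze", "build"],
        "integration": ["process"],
        "sources": ["collect"],
    },
}


def layer_of(phase, case):
    """Return the layer hosting a GSBPM phase, or None if the figure omits it."""
    for layer, phases in LAYER_PHASES[case].items():
        if phase in phases:
            return layer
    return None


build_layer_not_stable = layer_of("build", "not-stable")
build_layer_stable = layer_of("build", "stable")
```

The only difference between the two cases in this reading is that the build phase appears inside the warehouse layers when the source is not controlled by the NSI.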

23 Evolution of the S-DWH
The new sources move the architecture for data warehousing towards a Logical Data Warehouse, i.e. one more focused on the logic of information. Basically, this means adding semantic data abstraction based on: virtual (any data) management; a high quality level of metadata; active system self-monitoring; distributed processes (parallel processing); service level tracking.
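
One way to picture the semantic abstraction of a Logical Data Warehouse is a catalog that resolves a logical dataset name to whatever physical stores hold the data. The Python sketch below is only an illustration under that assumption; the stores, the `CATALOG` structure and the `query` function are hypothetical and not part of the presentation.

```python
import csv
import io
import json

# Two hypothetical physical stores holding data in different formats.
CSV_STORE = "unit_id,turnover\nU1,120\nU2,80\n"
JSON_STORE = '[{"unit_id": "U3", "turnover": 45}]'


def read_csv_store():
    """Read the CSV store and convert the measure to an integer."""
    return [
        {"unit_id": row["unit_id"], "turnover": int(row["turnover"])}
        for row in csv.DictReader(io.StringIO(CSV_STORE))
    ]


def read_json_store():
    """Read the JSON store, which already carries typed values."""
    return json.loads(JSON_STORE)


# Logical catalog: a dataset name maps to metadata plus the readers that
# resolve it; users never address the physical stores directly.
CATALOG = {
    "turnover": {
        "description": "Turnover by unit (illustrative)",
        "readers": [read_csv_store, read_json_store],
    }
}


def query(dataset):
    """Resolve a logical dataset name into rows from all underlying stores."""
    rows = []
    for reader in CATALOG[dataset]["readers"]:
        rows.extend(reader())
    return rows


rows = query("turnover")
```

The caller only knows the logical name `turnover`; adding or replacing a physical store changes the catalog entry, not the calling code.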

24 Statistical Data Warehouse – functional layers
Source layer: is responsible for the physical storage of data from internal sources (controlled by the NSI) or external sources for statistical purposes; is where the integrity of the external sources, data and metadata is checked; in the context of Big Data, for example, the data store could be based on the Hadoop Distributed File System (HDFS).
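
The slide does not say how integrity is checked. As one possible sketch in Python, a delivery could be validated against a declared checksum and a list of required metadata fields; the field names in `REQUIRED_METADATA` and the function `check_delivery` are assumptions made for illustration.

```python
import hashlib

# Assumed metadata fields that a supplier must provide with each delivery.
REQUIRED_METADATA = ("supplier", "reference_period", "sha256")


def check_delivery(payload: bytes, metadata: dict) -> list:
    """Return a list of integrity problems; an empty list means the delivery passes."""
    problems = [
        f"missing metadata: {key}" for key in REQUIRED_METADATA if key not in metadata
    ]
    declared = metadata.get("sha256")
    if declared is not None and hashlib.sha256(payload).hexdigest() != declared:
        problems.append("checksum mismatch")
    return problems


payload = b"unit_id,turnover\nU1,120\n"
good_metadata = {
    "supplier": "tax agency",
    "reference_period": "2016",
    "sha256": hashlib.sha256(payload).hexdigest(),
}
bad_metadata = {"supplier": "tax agency", "sha256": "0" * 64}

good_result = check_delivery(payload, good_metadata)
bad_result = check_delivery(payload, bad_metadata)
```

Returning a list of problems rather than a single boolean lets the source layer report every issue with a delivery at once.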

25 Statistical Data Warehouse – functional layers
Integration layer: is where recurring (cyclic) operational activities involved in running the whole, or part, of a statistical production process are deployed; manages the logic for raw and reconciled micro data, integration activities, and data linking and editing on versioned data; is a physical area where data and metadata are organized in an RDBMS, or in combination with NoSQL technologies (MapReduce, MPP), in order to retrieve big data (structured or not) directly from the source layer.
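
A minimal Python sketch of editing and loading versioned micro data into an RDBMS follows. It uses an in-memory SQLite database as a stand-in for the warehouse database; the table name `micro_data`, its columns and the editing rule are assumptions, not details from the presentation.

```python
import sqlite3

# Hypothetical raw rows as they might arrive from the source layer.
raw_rows = [
    {"unit_id": " u1 ", "turnover": "120"},
    {"unit_id": "U2", "turnover": "n/a"},  # fails editing, kept as missing
]


def transform(row):
    """Normalize the key and convert the measure; unparseable values become NULL."""
    uid = row["unit_id"].strip().upper()
    try:
        turnover = int(row["turnover"])
    except ValueError:
        turnover = None
    return uid, turnover


def load(conn, rows, version):
    """Load transformed rows under a version number instead of overwriting."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS micro_data ("
        "unit_id TEXT, turnover INTEGER, version INTEGER, "
        "PRIMARY KEY (unit_id, version))"
    )
    conn.executemany(
        "INSERT INTO micro_data VALUES (?, ?, ?)",
        [(*transform(row), version) for row in rows],
    )


conn = sqlite3.connect(":memory:")
load(conn, raw_rows, version=1)
# A later delivery corrects U2; it is stored as a new version.
load(conn, [{"unit_id": "u2", "turnover": "80"}], version=2)

# Latest version of each unit.
latest = conn.execute(
    "SELECT unit_id, turnover FROM micro_data AS m "
    "WHERE version = (SELECT MAX(version) FROM micro_data WHERE unit_id = m.unit_id) "
    "ORDER BY unit_id"
).fetchall()
```

Keeping every version makes the edits traceable, at the cost of a slightly more involved query for the current state.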

26 Statistical Data Warehouse – functional layers
Interpretation and data analysis layer: the basic functions performed at this level are advanced analysis and interpretation of data elaborations; here “statistical expert users” operate to create information of strategic value; activities on this layer improve the S-DWH capabilities; it can be a physical or virtual area in which to explore primary linked data based on dimensional schemas; it provides a single query mechanism (for example, HiveQL) across different types of data, through the integration layer.
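
To make the idea of exploring linked data through a dimensional schema concrete, here is a small Python sketch with one fact table and one dimension table. SQLite SQL is used as a runnable stand-in for a query language such as HiveQL; the table names, columns and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE fact_turnover (unit_id TEXT, region_id INTEGER, turnover INTEGER);
    INSERT INTO dim_region VALUES (1, 'North'), (2, 'South');
    INSERT INTO fact_turnover VALUES ('U1', 1, 120), ('U2', 2, 80), ('U3', 1, 45);
    """
)

# Join the fact table to its dimension and aggregate the measure by region.
totals = conn.execute(
    "SELECT d.name, SUM(f.turnover) "
    "FROM fact_turnover AS f JOIN dim_region AS d ON d.region_id = f.region_id "
    "GROUP BY d.name ORDER BY d.name"
).fetchall()
```

The analyst writes one query against the schema and does not need to know where or how the underlying micro data were collected.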

27 Statistical Data Warehouse – functional layers
Access layer: supports the breadth of tools that organizations can use to get actionable results from the data: self-service tools that make it easy to extract micro or macro data for analysis, OLAP web output systems, BI (Business Intelligence) tools, and integration services (SDMX interface) that support cross-border service interoperability; is a physical layer where macro data are reorganized and designed for a wide range of users and software tools (MOLAP) for the delivery of the information sought.
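
The reorganization of micro data into macro data for OLAP-style access can be sketched as a roll-up over chosen dimensions. The Python example below is a simplified illustration with made-up records; `roll_up` is a hypothetical helper, not a function of any OLAP product.

```python
from collections import defaultdict

# Hypothetical micro data with two dimensions and one measure.
micro = [
    {"region": "North", "activity": "A01", "turnover": 120},
    {"region": "North", "activity": "G47", "turnover": 45},
    {"region": "South", "activity": "A01", "turnover": 80},
]


def roll_up(rows, dimensions, measure):
    """Aggregate a measure over the given dimensions (an OLAP-style roll-up)."""
    cube = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        cube[key] += row[measure]
    return dict(cube)


by_region_activity = roll_up(micro, ("region", "activity"), "turnover")
by_region = roll_up(micro, ("region",), "turnover")
grand_total = roll_up(micro, (), "turnover")
```

Dropping a dimension from the tuple moves up one level of aggregation, which is the operation a MOLAP store typically precomputes.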

28 Thanks

