Revision Project of the Business Register (BR) and Business Statistics in 2009-2014 - General overview of the project (Sami) - Data Linking and Data Warehousing,


1 Revision Project of the Business Register (BR) and Business Statistics in 2009-2014 - General overview of the project (Sami) - Data Linking and Data Warehousing, statistical point of view (Sami) - Technical point of view (Riitta) 6 April 2011 Sami Saarikivi Riitta Piela

2 Background / Results from the present status description (1)
- Overlapping work is done across different business statistical systems/units
- Data copies are made and data are processed differently
- Changes are made to several different databases or, in the worst case, errors are corrected in only one place!
- Different definitions in different databases/production systems
  - Unit structures (establishment, enterprise, etc.)
  - Industry (NACE), sector, turnover, employees...
→ Inconsistent statistics

3 Background / Results from the present status description (2)
Change pressures for comprehensive development of the data system are also produced by:
- Statistics Finland's operational strategy and strategy for economic statistics
- International regulations and decrees
- The previous data system revision was made in 1995-1998 → The technical service life is coming to an end

4 Information Services, 21.10.2015

5 Statistical systems in the integrated system (defining phase 2009)
1. "Dependent" statistics of the integrated production database
   A. Business Register (Business Trends department)
   B. Some SBS and other statistics (Business Structures department)
      * Financial statement
      * Regional and industrial statistics
      * International trade in services
      * FATS statistics (inward and outward)
      * Industrial output / Commodity (PRODCOM)
   C. STS statistics (Business Trends department)
2. "Independent and small" statistics to the warehouse
   - Business services (the first system)
   - Others in near future

6 BR into the core of data on enterprises – timetable for 2009-2014

7 YR45U project organisation in 2011-2014
- Temporary unit
- Main project (YR45U)
- Top authority and project marketing
- Steering group
- Other experts
- Help when needed
- Supporting group of statistics
- Supporting group of IT
- Sub-project 1 ... Sub-project n
  - Project Manager 1 + members
  - Project Manager 2 + members
  - Project Manager n + members
Tasks in 2011
- YR45U (main project): Databases
- Sub-projects
  - YR45L: Applications 1
  - YR45M: Applications 2
  - YR45N: Direct data collections
  - YR45P: Conversion
  - YR45X: Introduction

8 Collected results of the desired state (Business Trends / Business Register)
Common matters of business statistics; definition of the warehouse of data on enterprises
Columns: System part; Key development targets; Benefits to be gained
- Reception of administrative data
  - Key development targets: Utilisation of metadata; Process management; Variable editor and editing
  - Benefits to be gained: Intensification / Introduction of uniform practices to pre-checking of data
- Direct data collections
  - Key development targets: Uniform tools; Facilitation of responding
  - Benefits to be gained: Improvement of quality and intensification of activity
- Databases
  - Key development targets: Adoption of the enterprise concept; Uniform production database
  - Benefits to be gained: Improvement of quality; International comparability; Non-response and response burden get smaller
- Application programs and processing of data
  - Key development targets: Intensification of processing (decrease of applications); Introduction of a process management application
  - Benefits to be gained: Improvement of quality and intensification of activity
- Warehouse of data on enterprises
  - Key development targets: Uniform location for data on enterprises; Basic data search; Comparison of data
  - Benefits to be gained: Coherence of statistics
- Statistics and products
  - Key development targets: Introduction of a unit information service (Internet); Adoption of a chargeable target group definition service (Internet)
  - Benefits to be gained: Improvement of quality and intensification of activity; Improvement of customer service

9 Data Linking and Data Warehousing (DW) Statistical point of view - Timetable - Basic features - Benefits - Challenges - Other requirements 6 April 2011 Sami Saarikivi

10 Timetable of the DW
- Years 2009-2010
  - Basic ideas and requirements for the structure and process
  - Analysis of stakeholders and exploration of the needs
- Year 2011 (spring)
  - Basic structure of the database (architecture)
  - Choice of technology
- Years 2011 (autumn) - 2012
  - Designing and implementation of application programs
- Year 2013
  - Ready for use

11 Basic features of the DW
- Covers all business data at Statistics Finland (in the long run)
- Includes all needed unit structures (enterprise groups, legal units, enterprises, establishments (LKAU), local unit, KAU) and other relevant variables for the statistical process
- Passive / read-only (changes are made to the production/operational database)
- Up-to-date information from the Business Register (updated once a day)
- Ready/validated unit data from other business statistics (permanent / compatible with releases/publications)

12 Benefits concerning the DW
- Uniform location for data on enterprises
- Basic data search and data comparisons
- The same unit structures are used
- The same classification of variables
- Unification of technical solutions and modes of action
→ easy to use, coherence of statistics
- Makes further processing of business data easier
  - National accounts
  - Research and information services
  - Publishing and archiving of data

13 Benefits concerning the integrated production/operational databases
- Comparison of data at the beginning of the statistical process
  → Intensification of activity
  → Coherence of statistics
- No different kinds of production systems and databases; error checking and changes are made in one place
  → Intensification of activity
  → Quality of data
  → Reduction in overlapping work
  → Unification of the production process
- Benefits from other systems are easier to implement

14 Overall challenges
- Diverse use and requirements
  - Single unit base (validating, information services/legal unit)
  - Samples and research
  - For statistical purposes
- "Transition from several old systems to one new system"
  - Consensus between different kinds of business statistics
  - Variable definitions (naming and values of variables)
  - Uniform processes and modes of action
  - Employees' work tasks are going to change
- Lack of examples → MEETS/ESSnet (3.1) results?
- Timetable conflict

15 More challenges, solutions and benefits
- Management of a large system and process?
- A uniform process-controlling application will be developed
  - Improving the transparency of the data process
  - Reducing person-dependency
  - New possibilities to divide work
- Use is made of the XML-based statistical metadata system
  - Data concerning the process management
  - Data concerning the data content
    - E.g. names and descriptions of variables, validation rules, ...
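As an illustration of how XML metadata could carry variable descriptions and validation rules, here is a minimal Python sketch. The element and attribute names (`variables`, `variable`, `rule`, `min`, `max`) and the helpers `load_rules` and `validate` are hypothetical; the slides do not show the schema of the actual metadata system.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata document; the real schema is not given in the slides.
METADATA_XML = """
<variables>
  <variable name="turnover" description="Annual turnover, EUR">
    <rule min="0"/>
  </variable>
  <variable name="employees" description="Number of employees">
    <rule min="0" max="500000"/>
  </variable>
</variables>
"""


def load_rules(xml_text):
    """Read (min, max) bounds per variable name from the metadata XML."""
    rules = {}
    for var in ET.fromstring(xml_text).findall("variable"):
        rule = var.find("rule")
        low = rule.get("min")
        high = rule.get("max")
        rules[var.get("name")] = (
            float(low) if low is not None else None,
            float(high) if high is not None else None,
        )
    return rules


def validate(record, rules):
    """Return a list of rule violations for one unit record."""
    errors = []
    for name, (low, high) in rules.items():
        value = record.get(name)
        if value is None:
            errors.append(f"{name}: missing")
        elif low is not None and value < low:
            errors.append(f"{name}: below {low}")
        elif high is not None and value > high:
            errors.append(f"{name}: above {high}")
    return errors


rules = load_rules(METADATA_XML)
problems = validate({"turnover": -5, "employees": 10}, rules)
```

Keeping the rules in metadata rather than in each application means that every statistic validating the same variable reads the same definition.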

16 Requirements of the process-controlling application
- Covers the whole process (from data collection to publishing)
- The progression of the process
  - Tasks needed for the process, their order, whether performed successfully (traffic lights?) and when
  - Responsible persons
- Reading more precise descriptions (e.g. process descriptions, processing rules)
- Starting tasks and batch runs (e.g. data updates, limited rights)
- Automatic email messages (e.g. when a certain task is done or behind schedule)
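The requirements above can be sketched as a small data structure. This is only an illustration in Python, not the planned .NET + SAS application: the names `Task`, `Light`, `next_task` and `notifications`, and the example tasks and dates, are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import List, Optional


class Light(Enum):
    GREEN = "done"
    YELLOW = "pending"
    RED = "late or failed"


@dataclass
class Task:
    name: str
    responsible: str              # responsible person
    due: date
    finished: Optional[date] = None
    failed: bool = False

    def light(self, today: date) -> Light:
        """Traffic-light status of the task as of the given day."""
        if self.failed:
            return Light.RED
        if self.finished is not None:
            return Light.GREEN
        return Light.RED if today > self.due else Light.YELLOW


def next_task(tasks: List[Task]) -> Optional[Task]:
    """Return the first task in process order that is not yet finished."""
    for task in tasks:
        if task.finished is None:
            return task
    return None


def notifications(tasks: List[Task], today: date) -> List[str]:
    """Build message texts for tasks that are behind schedule or failed."""
    return [
        f"To {t.responsible}: task '{t.name}' needs attention (due {t.due.isoformat()})"
        for t in tasks
        if t.light(today) is Light.RED
    ]


# Hypothetical process, listed in execution order.
tasks = [
    Task("Receive administrative data", "person A", date(2011, 4, 1),
         finished=date(2011, 3, 30)),
    Task("Technical validation", "person B", date(2011, 4, 4)),
    Task("Load to operational database", "person C", date(2011, 4, 8)),
]
today = date(2011, 4, 6)
lights = [t.light(today) for t in tasks]
```

The message texts returned by `notifications` stand in for the automatic emails; sending them is left out of the sketch.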

17 Basic requirements for the structure of the DW
- Clarity, ease of use and understandability
- Flexibility and expandability
- Convenience of use: speed of inquiries
- Compatibility with existing data systems
- Security: access will be subject to granted user rights

18 Thank you!
Further information about the revision: Mr Sami Saarikivi, tel. +358 9 1734 3345, email: yritys.rekisteri@stat.fi

19 Data Warehouse Architecture in the Revision Project of the Business Register and Business Statistics Technical point of view 6 April 2011 Riitta Piela

20 Two definitions
A data warehouse is
- "a copy of transaction data specifically structured for query and analysis" (Ralph Kimball)
- "a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision making process" (Bill Inmon)

21 Data Warehouse Architecture
- Statistical process: Data Collection → Editing and Analyzing → Publication
- Layers: Data Source Layer, Data Extraction Layer, Staging Area, ETL Layer, Data Storage Layer, Data Logic Layer, Data Presentation Layer, Metadata Layer

22
- Data Source Layer: represents the different data sources that feed data into the system
- Data Extraction Layer: where data gets pulled from the data sources into the system
- Staging Area: where data sits prior to being scrubbed and transformed into a data warehouse
- ETL Layer: where data gains its intelligence, as logic is applied to transform the data from a transactional nature to an analytical nature

23
- Data Storage Layer: where the transformed and cleansed data sit
- Data Logic Layer: where business rules are stored
- Data Presentation Layer: refers to the information that reaches the users
- Metadata Layer: where information about the data stored in the data warehouse system is stored, along with information on how the data warehouse system operates, such as ETL job status or process phase (technical metadata)
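The layer descriptions can be sketched as a toy pipeline. The Python below is a simplified illustration under stated assumptions: the function names `extract`, `transform` and `run_pipeline`, the field names `unit_id` and `turnover`, and the sample rows are hypothetical and do not describe the actual SAS and SQL Server implementation.

```python
from typing import Dict, List, Tuple

Record = Dict[str, object]


def extract(sources: Dict[str, List[Record]]) -> List[Record]:
    """Data Extraction Layer: pull rows from every source into the staging area."""
    staging = []
    for source_name, rows in sources.items():
        for row in rows:
            staging.append({**row, "source": source_name})
    return staging


def transform(staging: List[Record]) -> List[Record]:
    """ETL Layer: scrub staged rows and reshape them for analysis."""
    cleaned = []
    for row in staging:
        if row.get("turnover") is None:  # drop rows that lack the measure
            continue
        cleaned.append({
            "unit_id": str(row["unit_id"]).strip(),
            "turnover": float(row["turnover"]),
            "source": row["source"],
        })
    return cleaned


def run_pipeline(sources: Dict[str, List[Record]]) -> Tuple[List[Record], Record]:
    """Run extraction and ETL, recording technical metadata along the way."""
    metadata: Record = {}
    staging = extract(sources)
    metadata["extracted_rows"] = len(staging)
    storage = transform(staging)       # Data Storage Layer: cleansed rows
    metadata["loaded_rows"] = len(storage)
    metadata["etl_status"] = "ok"      # Metadata Layer: ETL job status
    return storage, metadata


# Data Source Layer: hypothetical admin and survey inputs.
sources = {
    "admin": [{"unit_id": " 001 ", "turnover": "1200.5"}],
    "survey": [{"unit_id": "002", "turnover": None},
               {"unit_id": "003", "turnover": 300}],
}
storage, metadata = run_pipeline(sources)
```

The metadata dictionary plays the role of the technical metadata mentioned on the slide: it records what the ETL job did, separately from the data itself.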

24 Data Collection
- Layers: Data Source Layer, Data Extraction Layer, Metadata Layer
- Components: Admin data; Survey data (XML data); Technical validation process (SAS); SAS files; Admin Data + Survey Data Database (Relational Database)

25 Statistics production begins. Workflow of receiving administrative data

26 Editing and analyzing
- Layers: Staging Area, ETL Layer, Metadata Layer
- Components: Admin Data Database (original data); Operational Database; Data Warehouse; a start-up application for process management and batch runs (.NET + SAS)

27 Publication
- Layers: Data Storage Layer, Data Logic Layer, Data Presentation Layer, Metadata Layer
- Data Warehouse (unit data)
- Cubes (aggregated data): EnterpriseCube, EstablishmentCube
- Information Service Application
- Browsers (ProClarity, SAS EG, Excel)
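The step from unit data to cubes can be illustrated with a small aggregation sketch. It is written in Python for readability and does not use Analysis Services; `build_cube`, the field names and the sample units are hypothetical.

```python
from collections import defaultdict


def build_cube(units, dimensions, measure):
    """Aggregate unit-level rows into cells keyed by the chosen classifications."""
    cube = defaultdict(float)
    for unit in units:
        key = tuple(unit[d] for d in dimensions)
        cube[key] += unit[measure]
    return dict(cube)


# Hypothetical unit data as it might sit in the warehouse.
units = [
    {"enterprise_id": "E1", "nace": "C10", "region": "Uusimaa", "turnover": 100.0},
    {"enterprise_id": "E2", "nace": "C10", "region": "Uusimaa", "turnover": 50.0},
    {"enterprise_id": "E3", "nace": "G47", "region": "Lappi", "turnover": 25.0},
]

enterprise_cube = build_cube(units, ("nace", "region"), "turnover")
by_industry = build_cube(units, ("nace",), "turnover")
```

Each cell holds a measure (here turnover) for one combination of classification values, which is what a browser such as Excel would then slice by industry or region.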

28
- ROLAP: data is read from the Data Warehouse (relational database); real-time data
- MOLAP: data in Analysis Services' own data structure; low response time
- HOLAP: "formats there in between"
- Years shown: 2009, 2008, 2007-2000
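One possible reading of this slide is that the storage mode is chosen by data year, with the newest year kept real-time and older years precomputed for speed. The slide does not state this mapping explicitly, so the Python sketch below is an assumption, and `storage_mode` is a hypothetical helper rather than an Analysis Services setting.

```python
def storage_mode(year, current_year=2009):
    """Pick a storage mode per data year (assumed mapping, see lead-in)."""
    if year == current_year:
        return "ROLAP"   # read from the relational warehouse: real-time data
    if year == current_year - 1:
        return "HOLAP"   # in-between format
    return "MOLAP"       # precomputed multidimensional store: low response time


modes = {year: storage_mode(year) for year in range(2000, 2010)}
```

Under this reading, only the year still being edited pays the cost of querying the relational database directly.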

29 Browsing the Data
- Measures
- Classifications
- Classification hierarchies

30 Goal
- Low response time
- Harmonise all statistical processes at Statistics Finland
- Universal interface for all purposes
- Speed the statistical process through good technical solutions
- Optimal database structures for different purposes
- Extensive data content

31 Technical challenges:
- Components: SAS, .NET, SQL Server, eXist, Analysis Services
1. To combine all technical components into one functional entity
2. To hide the technological diversity from the end-user

