Statistical data editing near the source using cloud computing concepts
George Pongas, Christine Wirtz (Eurostat)

MSIS 2011, Luxembourg, 24 May 2011

Editing near the source
- Speeds up final delivery to users and institutions
- Checks and imputations are carried out near the respondent
- Data knowledge is frequently more profound in the primary collecting institutions
- Logical proximity is better than physical proximity: data and application sharing

Cloud and SOA in a few lines
- Cloud: separates ownership from usage of data storage, computing power, and application development and execution
- Cloud variants: IaaS, PaaS, SaaS
- Cloud architectures: public, private, mixed, community
- SOA: based on web technologies and independent software components that interlink on demand

Data editing in Eurostat
- High volume of arrivals (more than 60,000 per year)
- Format heterogeneity
- Data checking absorbs a substantial volume of human resources
- Erroneous data imply communication with the Member States (MS)
- Eurostat as a rule does not impute…
- Interest in a common distributed solution

Eurostat’s web-enabled system for editing: Editing Building Block (EBB)
- Completely metadata driven
- Exists in 2 versions: PC version and web-based version
- Technologies used: ANTLR, Java, Tomcat or WebLogic, Hibernate, PostgreSQL or Oracle
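To illustrate what "metadata driven" can mean in practice, here is a minimal Python sketch in which the checks are derived entirely from a description of the variables rather than being hard-coded. It is not EBB's actual design or rule syntax (EBB is built on Java and ANTLR, and the slides do not show its metadata format). The `VariableMeta` structure, the variable names and the error messages are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VariableMeta:
    """Hypothetical metadata entry describing one variable of a dataset."""
    name: str
    kind: str                          # "categorical", "numeric" or "text"
    codes: Optional[frozenset] = None  # allowed codes (categorical only)
    minimum: Optional[float] = None    # lower bound (numeric only)
    maximum: Optional[float] = None    # upper bound (numeric only)


def check_record(record: dict, metadata: list) -> list:
    """Return error messages for one record, driven only by the metadata."""
    errors = []
    for meta in metadata:
        value = record.get(meta.name)
        if value is None:
            errors.append(f"{meta.name}: missing value")
        elif meta.kind == "categorical":
            if meta.codes is not None and value not in meta.codes:
                errors.append(f"{meta.name}: code {value!r} not allowed")
        elif meta.kind == "numeric":
            # bool is a subclass of int in Python, so exclude it explicitly
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                errors.append(f"{meta.name}: not numeric")
            elif meta.minimum is not None and value < meta.minimum:
                errors.append(f"{meta.name}: below minimum {meta.minimum}")
            elif meta.maximum is not None and value > meta.maximum:
                errors.append(f"{meta.name}: above maximum {meta.maximum}")
        elif meta.kind == "text":
            if not isinstance(value, str):
                errors.append(f"{meta.name}: not text")
    return errors


# Illustrative metadata; changing the checks means editing this list only.
METADATA = [
    VariableMeta("country", "categorical", codes=frozenset({"LU", "SI", "EE"})),
    VariableMeta("turnover", "numeric", minimum=0),
    VariableMeta("comment", "text"),
]
```

Calling `check_record({"country": "XX", "turnover": -1, "comment": "ok"}, METADATA)` would report one error for the country code and one for the turnover. The point of the design is that a new dataset needs new metadata, not new code.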

EBB Information Flow

Implementation details
- EBB is written using a set of web services of the following types: Administration, Program, Job

EBB functionalities
- Support of categorical, text and numeric variables
- Separation of programmer and user interfaces
- Conditional and unconditional rules
- Multi-record rules
- Deterministic imputation
- Use of auxiliary data
- File operations
- Special functions (unicity, duplication checks ...)
- Outliers (HB, Sigma Gap, Terror)
- Input/output of data/metadata
- Reporting
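Three of the functionalities listed above can be sketched in a few lines of Python: a conditional rule with deterministic imputation, a duplication check, and a sigma-gap outlier test. None of this is EBB code. The variable names (`employees`, `wages`), the rule itself and the threshold `k` are assumptions for illustration. The sigma-gap function follows one common formulation of the method (the first gap between sorted values above the median that exceeds `k` standard deviations) and may differ from what EBB implements.

```python
from statistics import pstdev


def apply_conditional_rule(record):
    """Hypothetical rule: IF employees == 0 THEN wages must be 0.

    When the rule fails, wages is deterministically imputed to 0.
    Returns (record_after_rule, was_imputed).
    """
    if record["employees"] == 0 and record["wages"] != 0:
        corrected = dict(record)   # copy so the input is left untouched
        corrected["wages"] = 0
        return corrected, True
    return record, False


def find_duplicates(records, key):
    """Multi-record check: return the key values that occur more than once."""
    seen, duplicates = set(), set()
    for rec in records:
        value = rec[key]
        if value in seen:
            duplicates.add(value)
        seen.add(value)
    return duplicates


def sigma_gap_outliers(values, k=1.0):
    """Flag upper-tail outliers with a sigma-gap test (one common variant).

    Sort the values, then scan upwards from the median for the first gap
    between neighbours larger than k * sigma. Everything above that gap
    is returned as outliers.
    """
    if len(values) < 2:
        return []
    ordered = sorted(values)
    sigma = pstdev(ordered)
    if sigma == 0:
        return []
    for i in range(len(ordered) // 2, len(ordered) - 1):
        if ordered[i + 1] - ordered[i] > k * sigma:
            return ordered[i + 1:]
    return []
```

For example, `sigma_gap_outliers([10, 11, 12, 13, 14, 100])` flags only the value 100, because the gap between 14 and 100 is far larger than one standard deviation of the data.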

Usage until now
- Embedded in SAS (for microdata editing)
- For distribution to data providers as a standalone version
- FDI (foreign direct investments)
- ITS (international trade in services)
- SBS (structural business statistics)
- CVTS (continuous vocational training survey)
- AES (adult education survey)