WebDataNet Conference 2015 Salamanca, 26th – 28th May 2015

Presentation transcript:

WebDataNet Conference 2015 Salamanca, 26th – 28th May 2015 Big data and official statistics - progress at European level Fernando Reis, Big Data Task-Force European Commission (Eurostat)

The data deluge © Copyright Brett Ryder 2010

The data deluge The origins Decreasing storage price Source: JC McCallum http://www.jcmit.com/disk2014.htm Some methodology and references can be found in: JC McCallum, Price-Performance of Computer Technology (1st edition), pages 4-1 to 4-18 in The Computer Engineering Handbook, Vojin G. Oklobdzija, editor, CRC Press, Boca Raton, Florida, USA, 2002

The data deluge The origins Decreasing storage price Developments in distributed computing paradigm Copyright © 2014 The Apache Software Foundation.

The data deluge The 3 V’s of big data Volume (large N and large P) Velocity (data streaming and real-time statistics) Variety

The data deluge Organic data / exhaust data / digital footprint Data ubiquity: text, sound, images, video Emergent use of unstructured data Reality mining

The data deluge
Communication: mobile phone data, social media
WWW: web searches, businesses' websites, e-commerce websites, job advertisements, real estate websites
Sensors: traffic loops, smart meters, vessel identification, satellite images
Process-generated data: flight booking transactions, supermarket cashier data, financial transactions
Crowdsourcing: VGI websites (OpenStreetMap), community pictures collection
The critical success factor in the action plan is an agreement at ESS level to embark on a number of concrete pilot projects. A real understanding of the implications of big data for official statistics can only be gained through hands-on experience, 'learning by doing'. Different actors have already gained experience conducting pilots in their respective organisations at global, European and national level. The basic principle for identifying the pilots should be their potential for producing official statistics. In addition, the pilots should cover a variety of characteristics related to big data and the production of statistics. On the basis of the principles described in the action plan and roadmap, the ESS Big Data Task Force should make proposals for pilots, which would have to be agreed by the ESSC. Participation in the pilots will be guided by the collaboration principles of the ESS Vision on a case-by-case approach.

Analytics This photo, “Cartoon: Big Data” is copyright (c) 2014 Thierry Gregorius and made available under an Attribution 2.0 Generic license.

Analytics
How to deal with exhaust data?
Massive datasets: dealt with by machine learning / predictive analytics
Massive datasets foster machine learning
Data science: a new discipline?
Multiple inference
Overfitting
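A small illustration of the "multiple inference" point may help here. The sketch below is not from the presentation; it assumes a purely simulated setting in which an outcome is tested against many predictors that are, by construction, unrelated to it. The function names, sample sizes and the choice of a Bonferroni correction are all illustrative assumptions.

```python
import math
import random
from statistics import NormalDist


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def count_spurious(n_obs=50, n_predictors=1000, alpha=0.05, seed=0):
    """Count 'significant' correlations between pure-noise predictors and a
    pure-noise outcome, without and with a Bonferroni correction."""
    rng = random.Random(seed)
    y = [rng.gauss(0, 1) for _ in range(n_obs)]
    naive = corrected = 0
    for _ in range(n_predictors):
        x = [rng.gauss(0, 1) for _ in range(n_obs)]
        r = pearson(x, y)
        # Fisher z-transform: approximately standard normal under "no correlation".
        z = math.atanh(r) * math.sqrt(n_obs - 3)
        p = 2 * (1 - NormalDist().cdf(abs(z)))
        naive += p < alpha                     # each test taken on its own
        corrected += p < alpha / n_predictors  # Bonferroni-adjusted threshold
    return naive, corrected


naive, corrected = count_spurious()
print(f"uncorrected 'discoveries': {naive}, after Bonferroni: {corrected}")
```

With a 5% threshold, roughly one in twenty unrelated predictors is expected to look significant by chance, which is why large-P datasets call for some form of multiplicity control; Bonferroni is only the simplest option.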

Analytics Visualisation Data-driven applications Predictive analytics The end of theory Predictive analytics Ex: Google translate, voice recognition, suggestions systems, health applications New types of data products Official stat: chances of getting a new job Ex: Inflation Calculator: Bureau of Labor Statistics

Predicting Personality Using Mobile Phone-Based Metrics de Montjoye, Yves-Alexandre, et al. "Predicting personality using novel mobile phone-based metrics." Social Computing, Behavioral-Cultural Modeling and Prediction. Springer Berlin Heidelberg, 2013. 48-55.

Multidimensional Poverty Index (Lighter colour indicates higher poverty) Poverty map estimated based on mobile phone data Poverty map in finer granularity estimated based on mobile phone data Smith, Christopher, Afra Mashhadi, and Licia Capra. "Ubiquitous sensing for mapping poverty in developing countries." Paper submitted to the Orange D4D Challenge (2013).

Big data industry sector

An emergent market Monetisation of data: Data is the new oil Data as a new factor of production (competitive differentiating factor for businesses) A threat to official statistics? (ex: Argentina) Data ecosystem The cases of Google and Facebook

What does big data mean for official statistics?
Change of paradigm
From: finite population sampling methodology
To: additional statistical modelling and machine learning
From designers of data collection processes to designers of statistical products
Privacy
Use of digital footprint
Data subject lack of control of data
High data detail and insight from analytics
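One way to picture the move from sampling design towards modelling is a coverage adjustment. The sketch below is an illustration only, not a method described in the presentation: it assumes a hypothetical big data source that over-represents one group, known population shares for each group (for example from a census), and uses simple post-stratification as a stand-in for the "additional statistical modelling". All names and numbers are invented.

```python
def poststratified_mean(records, population_shares):
    """Reweight group means by known population shares.

    records: list of (stratum, value) pairs from the non-probability source.
    population_shares: dict mapping stratum -> share of the target population.
    Raises KeyError if a stratum in population_shares has no records.
    """
    totals, counts = {}, {}
    for stratum, value in records:
        totals[stratum] = totals.get(stratum, 0.0) + value
        counts[stratum] = counts.get(stratum, 0) + 1
    return sum(
        share * totals[stratum] / counts[stratum]
        for stratum, share in population_shares.items()
    )


# Hypothetical source: 8 of 10 records come from "young" users,
# although "young" people make up only 40% of the target population.
records = [("young", 10.0)] * 8 + [("old", 4.0)] * 2
shares = {"young": 0.4, "old": 0.6}

naive_mean = sum(value for _, value in records) / len(records)
adjusted_mean = poststratified_mean(records, shares)
print(f"naive mean: {naive_mean:.2f}, post-stratified mean: {adjusted_mean:.2f}")
```

The naive mean is pulled towards the over-represented group, while the adjusted figure depends on the assumption that records within a group are representative of that group, which is exactly the kind of modelling assumption a probability sample does not need.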

Big data strategy of the European Statistical system Scheveningen Memorandum ESS big data action plan

Scheveningen Memorandum September 2013 Big data strategy Roadmap Heads of the National Statistical Institutes of the EU

Scheveningen Memorandum on Big Data
Examine the potential of Big Data sources for official statistics
Official Statistics Big Data strategy as part of wider government strategy
Address privacy and data protection
Collaboration at European and global level
Address need for skills
Partnerships between different stakeholders (government, academics, private sector)
Developments in methodology, quality assessment and IT
Adopt action plan and roadmap for the European Statistical System
At their meeting in September 2013, the directors general of the European Statistical System agreed on a memorandum on Big Data. The so-called Scheveningen Memorandum called for the submission of an action plan and roadmap on Big Data. It also sets out the main points of such an action plan: the integration of a Big Data strategy for statistics into an overall government strategy, the need for skills, collaboration at European as well as global level, the need for methodological developments, and the protection of personal data. The action plan and roadmap was adopted by the ESS at its meeting on 25 Sep 2014.

Big data strategy Start with concrete pilots 3 time-frames Short-term: pilots, where we are, what do we need Medium-term: Long-term: Review the roadmap

Roadmap: "As is" versus "To be" (timeline markers on the slide: By 2016, By 2020, > 2020)
By 2020, we want to prepare the ground for the successful integration of Big Data into official statistics. The issue should be tackled through the topics which have already been identified in the Scheveningen Memorandum and further developed in the action plan.
Long-term vision (beyond 2020): Big data sources integrated in ESS official statistics production
  Big data sources are available to the ESS (in such a way that business continuity is guaranteed)
  The European and national legislations enable Big Data usage
  Statistics graduates with data science skills available across the ESS
  Methods, tools, IT infrastructures and quality frameworks are reviewed and adjusted to new requirements
Medium-term aims: Topics developed for integration of Big Data into official statistics (government strategy; pilots finalised; IT infrastructure, methods, quality framework; skills; partnerships; ethics)
  Official statistics integrated in governmental big data strategies at national levels across the ESS
  Pilots are finalised and provide input as regards implementation of statistics based on big data
  IT infrastructures are designed for processing big data sources
  Methodological and quality frameworks are developed (and/or reviewed)
  Data science skills are an integral part of official statistics education
  Public-private partnerships are in place on big data and official statistics
  Ethical guidelines for big data usage are in place
  Implementation of the communication strategy ensures a positive attitude of the general public towards the use of big data sources in official statistics
Short-term goals (Commission big data strategy; identification and analysis of output portfolio; pilot projects launched; skills and training requirements identified; Horizon 2020 Research Framework Programme; communication strategy; exchange with stakeholders)
  Official statistics integrated in the Commission big data strategy
  Identification and analysis of output portfolio of big data sources
  Pilot projects on big data applications in official statistics launched and delivering first results
  The requirements in terms of skills and ESS professional training programmes are established
  Research lines on big data relevant to official statistics are included in the 2016-2017 Work Programme of the Horizon 2020 Research Framework Programme
  Communication to the general public on planned and ongoing big data activities of the ESS, with users/policy makers on their needs, and with big data owners on data aspirations
  Exchange of information with stakeholders within the statistical system and the research community, e.g. a European conference on big data in official statistics

Topics: Policy, Quality, Skills, Experience sharing, Legislation, IT infrastructures, Methods, Ethics / Communication, Pilots
The slide shows the horizontal topics that have been developed in the action plan and roadmap. In its opinion, the ESSC emphasised the legal and ethical issues as well as the development of appropriate skills. Some of the topics, such as policy, communication or skills, should be developed further directly, while other topics should be elaborated as part of the pilots that are planned to be run at EU level.

Blending of sources and multipurpose statistics
One source, several domains: mobile phone data feeding tourism statistics, population statistics, migration statistics, traffic statistics and commuting statistics
One domain, several sources: population statistics drawing on mobile phone data, smart meters, VGI websites and satellite images
A Big Data source could provide data for different statistical areas. For example, mobile phone data could be used in the context of tourism statistics or population statistics. The pilots should explore the potential of different data sources for different statistical domains. On the other hand, different statistical domains could be enhanced from information that is derived from different data sources.

Thank you for your attention
Fernando Reis, Eurostat Task Force on Big Data
fernando.reis@ec.europa.eu
https://github.com/reisfe/
https://twitter.com/reisfe/
https://linkedin.com/in/reisfe/