Data Science in Official Statistics: The Big Data Team

Presentation transcript:

Data Science in Official Statistics: The Big Data Team
Owen Abbott
Office for National Statistics

ONS Big Data team
- Launched in January 2014.
- Focuses on high-priority ONS challenges that may be solved through new forms of data or by applying data science techniques, to help deliver better statistics.
- Undertakes research (including data access, technological, methodological and ethical questions).
- Supports ONS business areas in implementation.
- The approach has combined collaborative working and partnerships with practical projects.

Who we work with
Commercial sector, international, government, privacy groups, academia.
Over two and a half years, we have worked with a number of organisations and partners:
- working on cross-European projects with other national statistical institutes
- working with the commercial sector to explore sources of data
- working with academia on data science research projects
- working with privacy groups to explore perceptions and acceptability of work in this area
- working across government to share research and best practice

Projects
- Demographics, migration
- Job vacancy statistics
- Smart meters
- SIC: web scraping enterprise websites
- Food diary coding
- Inflation: web scraping online retail prices
- Property: Zoopla data

Cross-cutting research
- Privacy and ethical issues: these need attention on an ongoing basis.
- Investigating new big data tools and technologies to support the transformation of ONS' data infrastructure, for example Spark for fast processing and graph databases for data linkage.
- Web scraping.
- Adjusting for bias in big data sources.
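The slide mentions graph databases for data linkage without giving details. As a rough illustration only, and not the ONS method, the sketch below treats records as nodes and pairwise matches as edges, then groups records into linked entities by finding connected components. The function name `link_records` and the record identifiers are hypothetical, and a plain in-memory union-find stands in for a graph database.

```python
def link_records(record_ids, match_pairs):
    """Group records into entities: records joined by any chain of
    pairwise matches end up in the same cluster (connected component)."""
    parent = {r: r for r in record_ids}

    def find(x):
        # Walk up to the cluster representative, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in match_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = {}
    for r in record_ids:
        clusters.setdefault(find(r), set()).add(r)
    # Sort clusters by their members so the result is deterministic.
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Hypothetical identifiers: two sources ("a", "c") each hold two records
# that a matching step has paired; "b1" matched nothing.
ids = ["a1", "a2", "b1", "c1", "c2"]
pairs = [("a1", "a2"), ("c2", "c1")]
entities = link_records(ids, pairs)
```

Matches are transitive in this sketch: if record x matches y and y matches z, all three land in one entity even when x and z were never compared directly. That property is what makes a graph view attractive for linkage, and it is also why a single false match can wrongly merge two entities.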

Some specific projects

Job Vacancy Statistics
Useful findings: the box and whisker plot shows how the different sources compare. All the online sources underestimate the survey counts. This may be due to a number of factors: jobs not being advertised online, jobs being advertised via agencies, and a mismatch between advertising employers and survey reporting units. Once outliers (i.e. very large differences between counts) are removed, the counts from enterprise websites are closer to the survey than those from any of the job portals. Among the job portals, Indeed is then closest to the enterprise website counts, followed by Careerjet.
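The comparison described on this slide (differences between online counts and survey counts, summarised after removing outliers) can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis behind the plot: the function name `trimmed_abs_gap` is hypothetical, the outlier rule is the usual box-plot whisker rule (1.5 times the interquartile range), and every number is a made-up toy value rather than ONS data.

```python
from statistics import median, quantiles


def trimmed_abs_gap(survey, source, k=1.5):
    """Return (median absolute gap, number of outliers dropped).

    The gap is source count minus survey count per reporting unit.
    Gaps outside the whiskers [Q1 - k*IQR, Q3 + k*IQR] are treated
    as outliers and excluded before taking the median.
    """
    diffs = [s - v for s, v in zip(source, survey)]
    q1, _, q3 = quantiles(diffs, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    kept = [d for d in diffs if lower <= d <= upper]
    return median(abs(d) for d in kept), len(diffs) - len(kept)


# Toy vacancy counts for eight hypothetical employers (not real data).
survey = [10, 4, 7, 20, 3, 15, 8, 200]
job_portal = [6, 2, 5, 12, 1, 9, 5, 20]
enterprise_site = [9, 4, 6, 18, 3, 13, 7, 40]

portal_gap, portal_dropped = trimmed_abs_gap(survey, job_portal)
site_gap, site_dropped = trimmed_abs_gap(survey, enterprise_site)
```

In this toy data both online sources undercount every employer, and the single very large employer is dropped as an outlier for each source. The remaining enterprise website gaps are smaller than the job portal gaps, which mirrors the pattern the slide reports without reproducing its figures.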

Census address register
Using big data sources to provide intelligence on addresses, to improve the address register and hence field operations, for example:
- caravans
- vacant properties
- gated communities
- private rented properties
The 2011 Census spent £6 million enumerating vacant properties.

Population mobility and migration
- Population estimates and travel-to-work statistics using mobile phone data
- Milan: visualisation using open data
- Working on samples of mobile data for travel-to-work flows

Challenges
- Obtaining data
- Understanding data quality
- Assessing (and correcting for) bias
- Sampling issues
- Large datasets
- Training datasets for ML
- Integrating sources
- Linkage
- Estimation
- Tools and environments
- Ethics

Further information
Big Data Team: www.ons.gov.uk/aboutus/whatwedo/programmesandprojects/theonsbigdataproject
Email: ons.big.data.project@ons.gov.uk

Projects I
- Housing data
- Living Costs and Food Survey coding
- British Crime Survey coding
- Household type from smart meter data
- Identifying caravan parks from aerial imagery
- Using graph databases in record linkage
- Commuting patterns from mobile phone data

Projects II
- Web scraping
- Using Twitter to examine internal migration
- Job vacancy statistics
- Enterprise statistics
- Price indices
- SDGs (corporate sustainability reporting)
- Ethics/policy/service
- Address and Business Index (both matching services)
- Methodology: adjusting for biases in big data
- Methodology: using big data in small area estimation

Projects III
Exploring data sources:
- OpenCelliD
- Oyster Card
- Facebook
- Twitter
- LinkedIn
- Land Registry Price Paid
- TED (EU procurement)
- Zoopla
- Moneysupermarket
- Google Trends
- Satellite imagery
- Probably some more