WP7 MULTI DOMAINS.


WP7 Multi domains

1. Population

2. Tourism/border crossing

3. Agriculture

Country leaders of each domain
WP7 TEAM
  Janusz Dygaszewicz: Project Manager of Polish work
  Jacek Maślankowski: Coordinator of methodology
  Anna Nowicka: Leader, cooperation
PARTNERS: Piet Daas, John Sheridan, Nigel Swier (roles: coordinator of domain area (SGA-1); cooperation on domain area; cooperation on domain area)
Population: Regional statistical office in Poznań; Regional statistical office in Bydgoszcz
Tourism/border crossing: Regional statistical office in Rzeszów; Department of Social Research
Agriculture: Department of Agriculture; Regional statistical office in Olsztyn

The aim of WP7 is to find out how a combination of Big Data sources, administrative data and statistical data may enrich statistical output in three domains: Population, Tourism/border crossing and Agriculture.

WP7 - Future perspectives: suggest pilots and domains with successful implementation potential for further elaboration in the second wave of pilots in 2018.

WP 7 – General tasks
1. Data access (SGA-1)
2. Data feasibility (SGA-1)
3. Data combination (SGA-2)
4. Summary plus future perspectives (SGA-2)

Milestones and deliverables (SGA-1)
Milestone 1. Progress and technical report of internal WP-meeting; by M4
Milestone 2. List of available Big Data sources in the domain(s); by M8
Milestone 3. Recommendation for using two or three Big Data sources in the domain(s); by M12
DELIVERABLE: the partial report for each domain, containing basic information on the data access (with legal and privacy aspects), the data quality issues, the methodology (with a focus also on combining data) and the technical aspects; by M13
We are here now

TASK 1 & TASK 2
BRAINSTORMING RESULTS
QUESTIONNAIRE RESULTS
INTERNAL MEETING
MILESTONE 7.4: PROGRESS AND TECHNICAL REPORT OF INTERNAL WP-MEETING
MILESTONE 7.5: „LIST OF AVAILABLE BIG DATA SOURCES IN THE DOMAIN(S)”

Why did we do the brainstorm?
1. To create the widest possible range of Big Data sources (a cafeteria): possible sources of data that public statistics could use for new developments or to supplement existing ones, so that in later stages these sources can be verified from different points of view and the least useful ones gradually eliminated.
2. To analyze as many use cases of Big Data sources as possible.
3. To take into account the most popular sources.
Big Data is a new phenomenon, so we should take into account that the potential of each source may still change.

From BRAINSTORMING to the QUESTIONNAIRE

Why did WP7 carry out the questionnaire?
1. To find out more about the technical, methodological, quality and access possibilities in different countries, with a view to recommending sources for the pilots after 2018.
2. To learn about the Big Data plans of different countries. The questionnaire was also sent to countries outside the FPA (but within the EU), because the recommendations reach beyond the period of its duration.
3. To recognize the obstacles to using Big Data sources.

The questionnaire results


Results
Respondents were asked, among other things, to indicate the domain, assuming that the data source is accessible. For each of the three domains (Population, Agriculture and Tourism/border crossing), respondents indicated the most promising BD sources:
Population: Mobile sensors (tracking) – Mobile phone location; Social Networks; Data produced by Public Agencies; Internet searches; Websites
Agriculture: Mobile sensors (tracking) – Satellite images
Tourism: Data produced by business – Credit cards; Traffic sensors

A joint WP6 & WP7 face-to-face meeting took place on 28-30 June in Warsaw
1. Exchange of information and experience in using BD sources, and arrangements for the future work of WP7
2. Building the list of potential sources for each domain
3. Preparing and establishing a framework for cooperation in SGA-2

Results [figure; labels: Access, Legal, Quality, Organization, IT, Methodology; domains: Agriculture, Tourism/Border crossing, Population]

Results: the results were used to elaborate the next milestone (Milestone 2): „List of available Big Data sources in the domain(s)”; by M8

Use cases for SGA-2: list of available Big Data sources in the domain(s)
Domain: Population
  Name of the use case: Everyday citizen satisfaction
  Big Data source: Social media/blogs/Internet portals
  Responsibility: UK – coordinator (SGA-1); RSO Poznań/Bydgoszcz
  Brief overview of the methodology: Webscraping; Data/Text/Web mining; Machine learning
Domain: Agriculture
  Name of the use case: Estimation of Agricultural statistics – pilot case study on crop types based on satellite data
  Big Data source: Satellite images
  Responsibility: Department of Agriculture, RSO Olsztyn + IE
  Brief overview of the methodology: combining data – data fusion on radar and optical remote sensing data; data comparison with traditional surveys, e.g. FSS; combining data – administrative data sources with satellite data
Domain: Tourism/Border Crossing
  Name of the use case: Border movement
  Big Data source: Traffic sensors
  Responsibility: RSO Rzeszów, Department of Social Survey + NL
  Brief overview of the methodology: Intertemporal disaggregation and interpolation; Latent variable models; Cross entropy econometrics

Use case for POPULATION „Everyday citizen satisfaction”
Responsibility: UK – coordinator, supported by PL, PT
Data sources: Social media/Blogs/Internet portals
Methodology: Webscraping, Data/Text/Web mining, Machine learning
The goal of the case study:
1. to examine the level of daily satisfaction by analyzing the content of messages for the presence of defined expressions describing emotional states, e.g. happiness, joy, sadness, fear, anger;
2. to present the moods of people associated with various public events;
3. to observe areas of morbidity, e.g. flu.
Plan of Combining Datasets:
1. combine the selected data from all Big Data sources in one repository;
2. compare with the results of social studies to add more detailed information;
3. supplement the information gained in social studies.
Main benefits and value added for official statistics: support for the traditional European Social Survey; a supplement to the research methodology for phenomena that are difficult to measure through traditional polls.

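The first goal of the population use case (checking messages for defined expressions that describe emotional states) can be illustrated with a minimal lexicon-based sketch. This is not the WP7 implementation: the word lists, the function names `count_emotions` and `emotion_shares`, and the sample messages are all assumptions made for illustration, and a real pilot would need a validated, language-specific lexicon and the text mining and machine learning steps named in the methodology.

```python
import re
from collections import Counter, defaultdict

# Hypothetical emotion lexicon (illustrative only, not from the WP7 slides).
EMOTION_LEXICON = {
    "happiness": {"happy", "glad", "delighted"},
    "sadness": {"sad", "unhappy", "depressed"},
    "fear": {"afraid", "scared", "worried"},
    "anger": {"angry", "furious", "annoyed"},
}


def count_emotions(messages):
    """Count, per day, the messages containing at least one term of each emotion.

    `messages` is an iterable of (day, text) pairs.
    """
    daily = defaultdict(Counter)
    for day, text in messages:
        # Lower-case the text and split it into simple word tokens.
        tokens = set(re.findall(r"[a-z']+", text.lower()))
        for emotion, terms in EMOTION_LEXICON.items():
            if tokens & terms:
                daily[day][emotion] += 1
    return {day: dict(counts) for day, counts in daily.items()}


def emotion_shares(messages):
    """Share of each day's messages that mention each emotion."""
    messages = list(messages)
    totals = Counter(day for day, _ in messages)
    counts = count_emotions(messages)
    return {
        day: {emotion: n / totals[day] for emotion, n in per_day.items()}
        for day, per_day in counts.items()
    }


# Made-up sample messages: (ISO date, text).
sample = [
    ("2017-06-12", "So happy about the concert tonight!"),
    ("2017-06-12", "Worried and angry about the traffic."),
    ("2017-06-13", "Feeling sad, I think I have the flu."),
]

if __name__ == "__main__":
    print(emotion_shares(sample))
```

Daily shares of this kind are one possible series to compare with survey-based satisfaction results, as the plan for combining datasets suggests.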

Use case for TOURISM/BORDER CROSSING „Border movement”

Responsibility: PL – coordinator, supported by NL and PT.
Data sources: Traffic sensors.
Methodology: intertemporal disaggregation and interpolation; latent variable models; cross entropy econometrics.
The goal of the case study: to estimate traffic across the internal borders of the EU (the Polish-German, Polish-Slovak, Polish-Czech and Polish-Lithuanian borders), also with reference to mirror statistics. A partial estimate of domestic traffic may be an additional result.
Plan of Combining Datasets:
1. Intertemporal disaggregation of data where needed (data frequency issue);
2. Latent variable model for data imputation for roads without traffic sensors;
3. Data smoothing if needed;
4. Preparing comparable data sets (common set of variables);
5. Combining traffic data from different sources with the cross-entropy econometrics method.
Main benefits and value added for official statistics: reduced burden on interviewers; more detailed results than from the survey alone; data consistent with mirror statistics.
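The intertemporal disaggregation step can be sketched with the simplest possible approach: splitting a low-frequency total across sub-periods in proportion to a high-frequency indicator. This pro-rata rule is a stand-in chosen for illustration, not the method used by WP7; the function name `disaggregate_pro_rata` and all figures below are assumptions.

```python
def disaggregate_pro_rata(low_freq_total, indicator):
    """Split one low-frequency total across sub-periods.

    Each sub-period receives a share of `low_freq_total` proportional to
    its value in `indicator` (for example, daily traffic sensor counts),
    so the disaggregated values always add up to the original total.
    """
    indicator = list(indicator)
    indicator_sum = sum(indicator)
    if indicator_sum <= 0:
        raise ValueError("indicator must have a positive sum")
    return [low_freq_total * value / indicator_sum for value in indicator]


# Made-up example: a weekly survey estimate of 7000 border crossings,
# distributed over seven days using hypothetical daily sensor counts.
weekly_total = 7000
daily_sensor_counts = [100, 120, 110, 130, 150, 200, 190]

if __name__ == "__main__":
    daily_estimates = disaggregate_pro_rata(weekly_total, daily_sensor_counts)
    for day, value in enumerate(daily_estimates, start=1):
        print(f"day {day}: {value:.0f}")
```

The design choice here is that the low-frequency source fixes the level and the sensor data only supplies the within-period pattern; more refined temporal disaggregation methods also smooth the transitions between periods.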

Use case for AGRICULTURE Estimation of Agricultural statistics – pilot case study on crop types based on satellite data

Responsibility: PL – coordinator, supported by IE.
Data sources: Satellite images, administrative data, in situ surveys.
Methodology: combining data – data fusion on radar and optical remote sensing data; data comparison with traditional surveys, e.g. FSS; combining data – administrative data sources with satellite data.
The goal of the case study: crop type: look at the types of crops being grown and see whether they can be identified accurately from the imagery; analysis of the possibilities of using satellite images.
Plan of Combining Datasets: data fusion – combining data sources by spatial reference.
Main benefits and value added for official statistics: increased quality of the agricultural surveys; reduced respondent burden; more detailed data published by official statistics; potential decrease in the cost of conducting surveys.
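Combining data sources by spatial reference can be sketched as a join on a shared spatial key, followed by a comparison of the crop type from each source. The sketch below assumes that both the satellite classification and the administrative declarations are already available per parcel identifier; the parcel ids, crop labels and function names are hypothetical, and real data fusion would work on geometries or raster cells rather than ready-made keys.

```python
from collections import Counter


def fuse_by_parcel(satellite, administrative):
    """Join two sources on a shared spatial key (here a parcel id).

    Returns (parcel_id, satellite_crop, declared_crop) for parcels
    present in both sources.
    """
    shared = sorted(satellite.keys() & administrative.keys())
    return [(pid, satellite[pid], administrative[pid]) for pid in shared]


def agreement_rate(fused):
    """Share of fused parcels where both sources report the same crop."""
    if not fused:
        return None
    matches = sum(1 for _, sat_crop, adm_crop in fused if sat_crop == adm_crop)
    return matches / len(fused)


def confusion_counts(fused):
    """Count (declared crop, satellite crop) pairs across fused parcels."""
    return Counter((adm_crop, sat_crop) for _, sat_crop, adm_crop in fused)


# Made-up inputs: crop type per parcel from each source.
satellite_crops = {"P1": "wheat", "P2": "maize", "P3": "rapeseed", "P4": "wheat"}
declared_crops = {"P1": "wheat", "P2": "wheat", "P3": "rapeseed", "P5": "maize"}

if __name__ == "__main__":
    fused = fuse_by_parcel(satellite_crops, declared_crops)
    print(fused)
    print(agreement_rate(fused))
    print(confusion_counts(fused))
```

Parcels present in only one source (P4 and P5 in the made-up data) drop out of the join; in practice they would need separate treatment, since they may point to coverage gaps in one of the sources.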

a.nowicka@stat.gov.pl