WP1: Web Scraping Job Vacancies - ELSTAT


Hellenic Statistical Authority (ELSTAT)
Christina Pierrakou, Eleni Bisioti
ESSnet Big Data, WP1: Web Scraping / Job Vacancies

Overview
- Introduction
- Web Scraping Tools
- Deduplication
- Web Scraping Experiment
- Matching Results
- Next Steps

Introduction
- Scrape ads directly from two job portals using a web scraping tool
- Collect key variables: job title, job description, location, company name, posted date, salary and job type (full time/temporary)
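As an illustration only, the key variables listed above could be held in a record like the one below. The class name, field names and the `portal` field are assumptions made for this sketch; the slides do not describe the actual data layout.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class JobAd:
    """One scraped advertisement; all field names are illustrative."""
    portal: str                          # assumed: which of the two portals the ad came from
    job_title: str
    job_description: str
    location: Optional[str] = None
    company_name: Optional[str] = None   # may be absent on some ads
    posted_date: Optional[date] = None
    salary: Optional[str] = None         # kept as free text
    job_type: Optional[str] = None       # e.g. "full time" or "temporary"


# Hypothetical example record, not real data.
ad = JobAd(
    portal="portal_a",
    job_title="Accountant",
    job_description="Bookkeeping and payroll.",
    location="Athens",
    posted_date=date(2016, 6, 20),
    job_type="full time",
)
print(ad.company_name is None)
```

Optional fields default to `None` so that ads without, for example, a company name can still be stored and analysed later.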

Web Scraping Tools
- Import.io: "point and click" tool for general scraping purposes
- Content Grabber: setting scraping agents, waiting for selectors, handling error cases

Deduplication
- Each job portal has a specific "point" in its structure from which data can be scraped to produce data sets without duplicates.
- This approach worked well for each portal taken separately.
- More work is needed on removing duplicates in the joint data set.
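For the joint data set, one possible starting point (not the method used in the pilot, which the slides leave open) is to build a normalised key from a few variables and keep the first ad per key. The dictionary keys and the sample ads below are hypothetical.

```python
import re
import unicodedata


def normalise(text):
    """Lower-case, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text or "")
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def deduplicate(ads):
    """Keep the first ad seen for each (title, company, location) key."""
    seen = set()
    unique = []
    for ad in ads:
        key = (
            normalise(ad.get("job_title")),
            normalise(ad.get("company_name")),
            normalise(ad.get("location")),
        )
        if key not in seen:
            seen.add(key)
            unique.append(ad)
    return unique


# Hypothetical ads from two portals; the first two describe the same vacancy.
ads = [
    {"portal": "a", "job_title": "Sales Assistant", "company_name": "Example Ltd.", "location": "Athens"},
    {"portal": "b", "job_title": "sales  assistant", "company_name": "example ltd", "location": "Athens"},
    {"portal": "b", "job_title": "Driver", "company_name": None, "location": "Patras"},
]
print(len(deduplicate(ads)))
```

An exact key of this kind is deliberately conservative. It would also merge two distinct ads that share a title and location but have no company name, so ads without a company name may need a stricter key (for example one that includes the description).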

Web Scraping Experiment
Matching advertising businesses on job portals with enterprises contained in:
- the sample of the Job Vacancy Survey
- the Statistical Business Register
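The slides do not say how names were matched. A minimal sketch of one common approach is shown below: clean the names, try an exact match, then fall back to a string-similarity score. The list of legal forms, the 0.9 threshold, the register identifiers and all company names are assumptions for illustration.

```python
import re
from difflib import SequenceMatcher

# Illustrative legal-form tokens to ignore when comparing names.
LEGAL_FORMS = {"sa", "ltd", "ae", "epe", "ike", "oe"}


def clean_name(name):
    """Lower-case, drop dots and punctuation, remove legal-form tokens."""
    name = name.lower().replace(".", "")
    tokens = re.sub(r"[^\w\s]", " ", name).split()
    return " ".join(t for t in tokens if t not in LEGAL_FORMS)


def match_to_register(portal_names, register, threshold=0.9):
    """Map each portal name to a register id, or None if no good match."""
    index = {clean_name(n): rid for rid, n in register.items()}
    result = {}
    for raw in portal_names:
        cleaned = clean_name(raw)
        if cleaned in index:                      # exact match on cleaned name
            result[raw] = index[cleaned]
            continue
        best_id, best_score = None, 0.0
        for reg_name, rid in index.items():       # fuzzy fallback
            score = SequenceMatcher(None, cleaned, reg_name).ratio()
            if score > best_score:
                best_id, best_score = rid, score
        result[raw] = best_id if best_score >= threshold else None
    return result


# Hypothetical register extract and portal names.
register = {"BR001": "Alpha Foods S.A.", "BR002": "Beta Shipping Ltd"}
portal_names = ["ALPHA FOODS SA", "Beta Shiping Ltd.", "Leading Company"]
matches = match_to_register(portal_names, register)
print(matches)
```

The same function could be run once against the Job Vacancy Survey sample and once against the Statistical Business Register. A high threshold keeps false matches rare at the cost of leaving more names unmatched for manual review.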

Matching Results (1/3)
- Sample: 3060 deduplicated ads
- Period: 15.6.2016 - 15.8.2016
- For 55% of ads, the employing enterprise was identified
- For 45% of ads, the employing enterprise was not identified

Matching Results (2/3)
- No company name available for 77% of ads
- These ads open in a systematic way, such as "Leading Company…" or "Well Known Firm…"
- Probably "ghost vacancies"
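Ads of this kind could be flagged automatically. The sketch below checks for a missing company name together with one of the two opening phrases quoted on the slide; the function name and the rule itself are assumptions, and a real list of phrases would need to be built from the scraped data.

```python
import re

# Only the two opening phrases quoted on the slide are included here.
ANONYMOUS_OPENERS = re.compile(
    r"^\s*(leading company|well[- ]known firm)\b", re.IGNORECASE
)


def is_anonymous(ad_text, company_name=None):
    """True if the ad has no company name and opens with a generic phrase."""
    if company_name and company_name.strip():
        return False
    return bool(ANONYMOUS_OPENERS.match(ad_text))


# Hypothetical ad texts.
print(is_anonymous("Leading Company in retail seeks a cashier"))
print(is_anonymous("Cashier wanted", company_name="Alpha Foods"))
```

Flagged ads can then be counted separately or set aside before name matching, since they cannot be linked to an enterprise.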

Matching Results (3/3)
- 256 enterprises were identified
- 9% of enterprises were matched to the sample of the Job Vacancy Survey
- 30% of enterprises were matched to the Statistical Business Register
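The percentages on the Matching Results slides can be turned into approximate counts. The sketch below uses only figures stated on the slides (3060 ads, 256 enterprises, and the reported shares); because the shares are rounded, the resulting counts are estimates rather than reported values.

```python
ADS_TOTAL = 3060               # deduplicated ads in the sample
ENTERPRISES_IDENTIFIED = 256   # enterprises identified from the ads


def approx_count(total, percent):
    """Approximate count implied by a rounded percentage."""
    return round(total * percent / 100)


summary = {
    "ads_with_enterprise_identified": approx_count(ADS_TOTAL, 55),
    "ads_without_enterprise_identified": approx_count(ADS_TOTAL, 45),
    "enterprises_in_jvs_sample": approx_count(ENTERPRISES_IDENTIFIED, 9),
    "enterprises_in_business_register": approx_count(ENTERPRISES_IDENTIFIED, 30),
}

for label, value in summary.items():
    print(f"{label}: about {value}")
```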

Classification of Companies by Economic Activities (NACE Rev. 2)

Classification of Ads by Economic Activities (NACE Rev. 2)

Next Steps
The main focus is to continue matching enterprise names from the job portals with Job Vacancy Survey data and the Statistical Business Register, in order to understand the coverage of the job portals.

Thank you very much for your attention
c.pierrakou@statistics.gr
e.bisioti@statistics.gr