ESSNet Pilot: Web Scraping for Job Vacancy Statistics


Rationale
Comparison of Current Official Estimates (Survey) with Web data:
- Frequency: Monthly (survey); Real-time? (web data)
- Breakdowns compared: Industry sector, Enterprise size, Job type / skills, Sub-national, National totals
Potential benefits of web data:
- More frequent
- More timely
- More granular
- Cheaper?

Participants (SGA-1)
- United Kingdom (lead)
- Germany
- Sweden
- Slovenia
- Italy
- Greece

Broad Approach
- Understand the landscape of web-based job vacancy data in each country
- Focus first on job portals, later explore enterprise websites
- Try to replicate existing outputs, then investigate opportunities to produce new types of output
- Develop specific approaches that are appropriate to the circumstances in each country
- Develop common approaches where possible

Data Access
1. Web scraping job portals
2. Job portal APIs
3. Web scraping enterprise websites
4. Public sector agencies
5. Commercial suppliers

Job Portals: Evaluation Criteria
What
1. Position
2. Occupation
3. Education
4. Type of job (temporary or permanent, full-time or part-time)
When
5. Date of advertised vacancy
6. Date of application deadline
7. Date to fill a vacancy
Where
8. Location of job
Who
9. Direct employer or agency
10. Economic activity of employer (NACE)
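One way to use the ten evaluation criteria above is to store each scraped ad as a record with one field per criterion, then measure how often a portal actually populates each field. The following is a minimal sketch under that assumption. The class name `JobAdRecord`, the field names, the function `criteria_coverage`, and the sample values are all hypothetical and are not taken from the pilot.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class JobAdRecord:
    """One scraped job ad; field names are illustrative, one per criterion."""
    # What
    position: Optional[str] = None
    occupation: Optional[str] = None
    education: Optional[str] = None
    job_type: Optional[str] = None
    # When
    date_advertised: Optional[str] = None
    application_deadline: Optional[str] = None
    date_to_fill: Optional[str] = None
    # Where
    location: Optional[str] = None
    # Who
    employer_or_agency: Optional[str] = None
    nace_activity: Optional[str] = None


def criteria_coverage(ads):
    """Return, for each criterion, the share of ads where it is populated."""
    names = [f.name for f in fields(JobAdRecord)]
    if not ads:
        return {name: 0.0 for name in names}
    return {
        name: sum(getattr(ad, name) is not None for ad in ads) / len(ads)
        for name in names
    }


# Made-up sample records, for illustration only.
sample = [
    JobAdRecord(position=".NET Developer", location="Stoke-On-Trent",
                date_advertised="2017-02-01"),
    JobAdRecord(position="Staff Nurse", location="Leeds"),
]
coverage = criteria_coverage(sample)
```

Comparing the resulting shares across portals gives a rough, comparable view of which criteria each portal can support.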

Classification of Job Portals
1. Job boards
2. Job search engines
3. Hybrid

Conceptual Definitions
- Job ad
- Job vacancy
- "Ghost" vacancy

Coverage Issues
Target population: all job vacancies
- Employing business identifiable
- Advertised through agency
- Advertised on a job portal
- Advertised on enterprise website
- 'Ghost' vacancies

Assessing Coverage
Enterprise matching: Job Portals, Business Register, Job Vacancy Survey
Matching issues:
- Advertising employer differs from reporting unit
- Trading name differs from legal name
- Duplicate names on business register
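The matching problems listed on this slide can be illustrated with a small name-matching sketch. It assumes a toy business register held as tuples of (unit id, legal name, trading names). The suffix list, the function names, and the register entries are hypothetical, and exact matching on normalised names stands in for whatever matching method the pilot actually used.

```python
import re
from collections import defaultdict

# Illustrative legal-form suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"ltd", "limited", "plc", "llp", "gmbh", "ag", "ab", "spa", "srl"}


def normalise_name(name):
    """Lowercase, replace punctuation with spaces, drop legal-form suffixes."""
    tokens = re.sub(r"[^\w\s]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def build_index(register):
    """Map each normalised legal or trading name to the set of unit ids using it."""
    index = defaultdict(set)
    for unit_id, legal_name, trading_names in register:
        for name in [legal_name, *trading_names]:
            index[normalise_name(name)].add(unit_id)
    return index


def match_employer(advertised_name, index):
    """Classify an advertised employer name as matched, ambiguous or unmatched."""
    ids = index.get(normalise_name(advertised_name), set())
    if not ids:
        return ("unmatched", [])
    if len(ids) == 1:
        return ("matched", sorted(ids))
    return ("ambiguous", sorted(ids))  # duplicate names on the register


# Made-up register entries, for illustration only.
register = [
    ("U1", "Acme Holdings Ltd", ["Acme Jobs"]),   # trading name differs from legal name
    ("U2", "Smith & Sons Limited", []),
    ("U3", "Smith & Sons PLC", []),               # same name after normalisation as U2
]
index = build_index(register)
```

Here `match_employer("Acme Jobs Ltd.", index)` resolves through the trading name, while `match_employer("SMITH & SONS", index)` is flagged as ambiguous because two register units share the normalised name.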

Removing Duplicates
Workflow: Job portals -> Concatenated list -> Deduplicate -> Final deduplicated list
1. Create common variable list: Job_title, Job_description, Location_city, Location_region, Date_posted, Enterprise name
2. Clean data, e.g. " .NET Developer - Stoke-On-Trent - £35-£40K "
3. Run dedup to produce candidate matches
4. Active learning step (manual coding of > 100 records)
5. Rerun to automatically remove "duplicate" job ads
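Steps 1 to 3 and 5 above can be sketched roughly as follows. This is not the tool used in the pilot: it swaps in a fixed similarity threshold from Python's standard `difflib` module in place of the active learning step, and the function names, dictionary keys, threshold value, and sample ads are assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def clean_title(raw):
    """Keep the text before the first ' - ' separator, trimmed and lowercased."""
    return raw.strip().split(" - ")[0].strip().lower()


def ad_key(ad):
    """Build a comparison string from a few of the common variables."""
    return " | ".join([
        clean_title(ad["job_title"]),
        ad["location_city"].strip().lower(),
        ad["enterprise_name"].strip().lower(),
    ])


def candidate_pairs(ads, threshold=0.9):
    """Return (i, j, score) for pairs of ads whose keys are at least threshold-similar."""
    pairs = []
    for i, j in combinations(range(len(ads)), 2):
        score = SequenceMatcher(None, ad_key(ads[i]), ad_key(ads[j])).ratio()
        if score >= threshold:
            pairs.append((i, j, score))
    return pairs


def deduplicate(ads, threshold=0.9):
    """Drop the later ad of every candidate pair, keeping first occurrences."""
    drop = {j for _, j, _ in candidate_pairs(ads, threshold)}
    return [ad for k, ad in enumerate(ads) if k not in drop]


# Made-up concatenated list from two portals, for illustration only.
ads = [
    {"job_title": " .NET Developer - Stoke-On-Trent - £35-£40K ",
     "location_city": "Stoke-On-Trent", "enterprise_name": "Example Ltd"},
    {"job_title": ".NET Developer",
     "location_city": "Stoke-on-Trent", "enterprise_name": "Example Ltd"},
    {"job_title": "Staff Nurse",
     "location_city": "Leeds", "enterprise_name": "Example Hospital"},
]
unique_ads = deduplicate(ads)
```

Comparing all pairs is quadratic in the number of ads, so a real run over a large concatenated list would need some form of blocking (for example, comparing only ads within the same city); that is left out here for brevity.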

Conclusion
- Job portal data is very rich, but complex and messy
- Difficult to align to established statistical concepts
- Need to understand coverage issues and how to tackle them
- Making progress but a long way to go

Future Steps
- Produce measures of job portal coverage
- Explore approaches for enhancing coverage (including web scraping enterprise websites)
- Develop methods for combining vacancy survey and job ads from the web
- Develop methods for feature extraction and coding/classifying textual data (to enrich existing survey data)
- Explore other uses of online job vacancy data

Future Steps
Additional ESS partners joining from July 2017:
- Portugal
- Belgium
- France
- Denmark
… the beginnings of a longer-term network?

Thank you for your attention!