Published by Piers McKenzie. Modified over 6 years ago.
1
Big Data ESSNet: Web Scraping for Job Vacancy Statistics Nigel Swier UK Office for National Statistics
2
Potential of On-line Job Vacancy Data
Current Official Estimates (Survey) vs. Online data
Frequency: Monthly (Rolling Qtr) vs. Real-time?
Industry Sector
Enterprise Size
Job type / skills
Geography: National Totals
Online data: More frequent; More timely; More granular; Less burden; Cheaper???
3
Six challenges with using On-line Job Vacancy (OJV) data for statistical purposes
Not all jobs are advertised on-line. Coverage is therefore incomplete and not fully representative.
There is no definitive data source.
Much OJV data is unstructured. Text processing and analysis is required to extract useful information.
Some job advertisements are not within the scope of official statistics definitions of a job vacancy (EU).
The official definition of a job advertisement does not correspond directly to the concept of a live job advertisement.
The specific job vacancy data landscape varies between countries.
4
On-line Job Vacancy Data Landscape
[Diagram labels: Job Boards; Private Employment Agencies; Employers; Job Search Engines; National Employment Agency; Enterprise Websites; Data Aggregators; Public Policy; Cedefop; Official Job Vacancy Statistics]
5
Approaches to Data Access
Direct web scraping: Point and click; Programmatic (e.g. Python Scrapy); Web-scraping enterprise websites
Agreed Access: National employment agency; Private job portals; Commercial providers; CEDEFOP
Images: Creative Commons
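As a rough illustration of the programmatic route, the sketch below extracts job titles and locations from a results page. It uses Python's standard-library HTML parser rather than Scrapy so that it runs without extra dependencies. The markup (`li class="job"`, `span class="title"`, `span class="location"`), the sample adverts, and the names `JobAdParser` and `parse_job_ads` are all assumptions made for this example; real job portals use their own structure and terms of use.

```python
from html.parser import HTMLParser

# Hypothetical markup for a job board results page; real sites differ.
SAMPLE_PAGE = """
<ul>
  <li class="job"><span class="title">Data Analyst</span><span class="location">Newport</span></li>
  <li class="job"><span class="title">Care Assistant</span><span class="location">Leeds</span></li>
</ul>
"""


class JobAdParser(HTMLParser):
    """Collects the title and location of each <li class="job"> element."""

    def __init__(self):
        super().__init__()
        self.ads = []       # one dict per advertisement
        self._field = None  # name of the field currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "li" and css_class == "job":
            self.ads.append({})  # start a new advertisement
        elif tag == "span" and css_class in ("title", "location") and self.ads:
            self._field = css_class

    def handle_data(self, data):
        # Text is stored only while inside a recognised <span>.
        if self._field is not None:
            self.ads[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def parse_job_ads(html):
    """Return a list of {'title': ..., 'location': ...} dicts."""
    parser = JobAdParser()
    parser.feed(html)
    return parser.ads


ads = parse_job_ads(SAMPLE_PAGE)
```

A framework such as Scrapy adds crawling, scheduling and politeness controls on top of this kind of extraction step.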
6
Data Handling: Text analysis and classification [e.g. classifying textual data with machine learning]
Can industry and occupation be classified from a job ad?
Occupation is fairly straightforward in this case.
Industry is more difficult. This company is an employment agency, not the employer. But there are clues…
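One way to picture classifying advert text with machine learning is a small bag-of-words Naive Bayes classifier. The sketch below is not the method used in the project; the training adverts, the sector labels (`"health"`, `"it"`) and the class name `NaiveBayes` are invented for illustration, and a real system would need far more labelled data and an official classification scheme.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.label_counts = Counter()            # label -> number of adverts
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.label_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)  # smoothed likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labelled adverts, invented for this example.
TRAIN = [
    ("Registered nurse for busy hospital ward", "health"),
    ("Care assistant needed in residential care home", "health"),
    ("Software developer with Python experience", "it"),
    ("Network engineer to support IT systems", "it"),
]

model = NaiveBayes().fit([t for t, _ in TRAIN], [l for _, l in TRAIN])
prediction_a = model.predict("Python developer wanted")
prediction_b = model.predict("Nurse for care home")
```

The employment-agency problem noted above shows why the advertiser's name alone is a weak feature: the clues to the hiring industry usually sit in the advert text itself.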
7
Data Handling: Flow to stock transformation
Job Vacancy Lifecycle
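The flow-to-stock idea can be sketched as follows, assuming each advert is observed with a first-seen and a last-seen date: the flow is the number of adverts first appearing in a period, while the stock is the number live on a reference date. The dates, the `ADS` list and the function names are hypothetical, and a real pipeline would also need rules for reposted or expired adverts.

```python
from datetime import date

# Hypothetical adverts as (first_seen, last_seen) pairs.
ADS = [
    (date(2017, 1, 3), date(2017, 1, 20)),
    (date(2017, 1, 10), date(2017, 2, 15)),
    (date(2017, 2, 1), date(2017, 2, 5)),
]


def stock_on(ads, reference_date):
    """Count adverts that are live on the reference date (stock)."""
    return sum(
        1 for first_seen, last_seen in ads
        if first_seen <= reference_date <= last_seen
    )


def inflow_between(ads, start, end):
    """Count adverts first seen within [start, end] (flow)."""
    return sum(1 for first_seen, _ in ads if start <= first_seen <= end)


stock_mid_january = stock_on(ADS, date(2017, 1, 15))
january_inflow = inflow_between(ADS, date(2017, 1, 1), date(2017, 1, 31))
```

A stock measured on a fixed reference date is closer to how a vacancy survey counts open vacancies than a raw count of newly posted adverts.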
8
Assessment against survey aggregates: by industry sector
9
Conclusions/Questions
Agreed access arrangements are generally better than direct web scraping.
OJV data cannot replace the Job Vacancy Survey (in EU).
OJV data does not correspond to target concepts and only measures part of the labour market. How useful are these measures? If useful, how should these measures be presented alongside the official estimates?
(EU) Collaboration with CEDEFOP is essential.
How do we get the best possible quality data for official statistics purposes?
10
Other possibilities
Time series analysis, leading to flash estimates
Data driven analysis: new insights into existing statistics, e.g. timing of advertisements being placed
International/Overseas jobs as indicators of labour market “tightness”
Identification of new job titles – assisting development of standard statistical classifications
11
Drivers of Cedefop RLMI work
Better labour market information for better policies
Lack of comparable data and systematic analysis
RLMI – Real time labour market information
Complement skills intelligence toolkit
The main aim of Cedefop’s department for Skills and Labour Market is to bring the world of education and the world of work closer together. For more than a decade it has therefore developed, applied and promoted various skills intelligence tools:
- Skills forecasting
- Sectoral skills studies
- European Skills and Jobs Survey
However, it is still rather difficult to gather information on employers' skills requirements that is sufficiently robust and at the same time detailed and comparable across countries. Moreover, in today's rapidly changing world, this information is needed much faster than traditional sources such as surveys can produce it. Cedefop therefore believes that the information contained in online vacancies can fill this gap and complement the skills intelligence toolkit.