Crawling the Web for Job Knowledge


1 Crawling the Web for Job Knowledge
Lévai András, Széchenyi István University, RGDI and Center of Job Knowledge Research

2 Speaker’s Bio
3rd year PhD student
Research topic: regional science, creative regions
Database administrator
Web developer
Date:
Speaker: Lévai András

3 Development Roadmap
Crawling data – URL Fetching
Processing data – HTML Parsing
Creating User Interface for the Database
Adding DataTable as Datagrid
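The first two roadmap steps (URL fetching, then HTML parsing) can be sketched as a small breadth-first crawl loop. This is a minimal illustration using only the Python standard library, not the project's actual crawler: the names `LinkParser`, `extract_links`, `crawl`, and the `FAKE_SITE` pages are all hypothetical, and the fetch step is passed in as a function so the sketch runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(base_url, html):
    """Return absolute URLs (fragments removed) found in the HTML."""
    parser = LinkParser()
    parser.feed(html)
    return [urldefrag(urljoin(base_url, href)).url for href in parser.links]


def crawl(start_url, fetch, max_pages=10):
    """Breadth-first crawl; `fetch` maps a URL to HTML text or None."""
    frontier = deque([start_url])
    seen = {start_url}
    pages = {}
    while frontier and len(pages) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue  # page could not be fetched; skip it
        pages[url] = html
        for link in extract_links(url, html):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return pages


# Hypothetical in-memory "site" standing in for real HTTP responses.
FAKE_SITE = {
    "http://jobs.example/": '<a href="/page2">next</a> <a href="#top">top</a>',
    "http://jobs.example/page2": '<a href="/">home</a>',
}

pages = crawl("http://jobs.example/", FAKE_SITE.get)
print(sorted(pages))
```

In a real run, `fetch` would wrap an HTTP client (or the whole loop would be replaced by Scrapy, which the deck uses); injecting it keeps the parsing logic testable on its own.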

4 Specs
Ubuntu servers
Cloud technology/VPS
Python
Scrapy
Flask framework
SQLite3/MySQL/MongoDB database engines
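As an illustration of the storage layer, the sketch below uses Python's built-in `sqlite3` module, one of the engines listed above. The table name `job_postings`, its columns, and the sample rows are assumptions made for the example; the slides do not show the real schema.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE job_postings (
        url      TEXT PRIMARY KEY,  -- one row per crawled posting
        title    TEXT NOT NULL,
        location TEXT
    )
    """
)


def save_posting(conn, url, title, location):
    """Insert a posting, replacing any earlier row with the same URL."""
    conn.execute(
        "INSERT OR REPLACE INTO job_postings (url, title, location) "
        "VALUES (?, ?, ?)",
        (url, title, location),
    )
    conn.commit()


# Hypothetical sample data; the third call simulates a re-crawl of posting 1.
save_posting(conn, "http://jobs.example/1", "Web developer", "City A")
save_posting(conn, "http://jobs.example/2", "Database administrator", "City B")
save_posting(conn, "http://jobs.example/1", "Senior web developer", "City A")

row_count = conn.execute("SELECT COUNT(*) FROM job_postings").fetchone()[0]
by_location = conn.execute(
    "SELECT location, COUNT(*) FROM job_postings "
    "GROUP BY location ORDER BY location"
).fetchall()
print(row_count, by_location)
```

Using the URL as the primary key means a re-crawled posting updates its row instead of creating a duplicate, which matters when the same sites are crawled repeatedly.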

5 Scrapy

6 Scrapy
An open source web scraping framework for Python
Simple
Productive
Fast
Extensible
Well documented

7 Scrapy - Workflow
Pick a website
Define the data you want to scrape
Write a spider to extract the data
Run the spider to extract the data
Review scraped data
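The workflow steps above can be mirrored in a few lines of plain Python. To keep the example runnable without installing anything, this sketch swaps Scrapy for the standard library's `html.parser`; in Scrapy itself the data definition would typically be a `scrapy.Item` and the extractor a `scrapy.Spider` subclass with a `parse` method. The `JobItem` fields, the CSS class names, and the sample HTML are hypothetical.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


# Step 2: define the data you want to scrape.
@dataclass
class JobItem:
    title: str = ""
    company: str = ""


# Step 3: write a "spider" that extracts the data.
# Assumes each posting is a flat <div class="job"> with no nested <div>.
class JobSpider(HTMLParser):
    FIELD_CLASSES = {"job-title": "title", "job-company": "company"}

    def __init__(self):
        super().__init__()
        self.items = []
        self._current = None  # item being filled, if inside a posting
        self._field = None    # field the next text chunk belongs to

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class") or ""
        if css_class == "job":
            self._current = JobItem()
        elif self._current is not None and css_class in self.FIELD_CLASSES:
            self._field = self.FIELD_CLASSES[css_class]

    def handle_data(self, data):
        if self._current is not None and self._field:
            setattr(self._current, self._field, data.strip())
            self._field = None

    def handle_endtag(self, tag):
        if tag == "div" and self._current is not None:
            self.items.append(self._current)
            self._current = None


# Step 1: pick a website (here: a hypothetical page held in a string).
SAMPLE_HTML = """
<div class="job"><h2 class="job-title">Web developer</h2>
<span class="job-company">Example Ltd</span></div>
<div class="job"><h2 class="job-title">Database administrator</h2>
<span class="job-company">Sample Kft</span></div>
"""

# Step 4: run the spider.
spider = JobSpider()
spider.feed(SAMPLE_HTML)

# Step 5: review the scraped data.
for item in spider.items:
    print(item)
```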

8 Crawling Issues
Speed vs DoS
Different crawler for different sites
Sites are always under development
API
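The "Speed vs DoS" trade-off is usually handled through crawler configuration. The fragment below is a sketch of a Scrapy `settings.py` that favours politeness over speed; the setting names are standard Scrapy settings, but every value and the user agent string are illustrative assumptions, not the project's actual configuration.

```python
# settings.py (sketch): all values below are illustrative assumptions.

# Respect each site's robots.txt rules.
ROBOTSTXT_OBEY = True

# Wait between requests to the same site (seconds).
DOWNLOAD_DELAY = 2.0

# Never hit one domain with parallel requests.
CONCURRENT_REQUESTS_PER_DOMAIN = 1

# Let Scrapy adapt the delay to the server's response times.
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 2.0
AUTOTHROTTLE_MAX_DELAY = 30.0
AUTOTHROTTLE_TARGET_CONCURRENCY = 1.0

# Identify the crawler so site owners can get in touch (hypothetical contact URL).
USER_AGENT = "job-knowledge-crawler (+http://research.example/contact)"
```

Lower delays speed up data collection but increase the load on the target site; with one crawler per site, these values could also be overridden per spider.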

9 Framework for supporting research activities

10 Generated map

11 Job-Knowledge-Analytics-UI

12 Date: Speaker: Dr. Minta Katalin, associate professor

