MIS 324 -- Professor Sandvig MIS 424 Professor Sandvig
Screen Scraping
MIS 424, Professor Sandvig
12/31/2018
Today
  What is screen scraping (also called web scraping)
  When to use it
  How
  Legal issues
What is Screen Scraping
  Programmatically “scraping” information from a web page
  Two steps:
    Retrieve page
    Scrape desired information
      Regular expressions
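The two steps above can be sketched in Python. This is only an illustration, not the course handout: the function names and the choice of the page title as the "desired information" are assumptions made for the example.

```python
import re
from typing import Optional
from urllib.request import urlopen


def retrieve_page(url: str) -> str:
    """Step 1: download the page and return its HTML as text."""
    with urlopen(url, timeout=10) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Step 2 uses a regular expression. Here the target is the <title> element,
# chosen only because nearly every page has one (an assumption for the demo).
TITLE_PATTERN = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def scrape_title(html: str) -> Optional[str]:
    """Step 2: pull the desired information out of the HTML text."""
    match = TITLE_PATTERN.search(html)
    return match.group(1).strip() if match else None
```

Chaining the calls, as in scrape_title(retrieve_page(url)), performs both steps for a page you are permitted to fetch. Keeping them separate lets the scraping step be tested on saved HTML without touching the network.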
When to Use
  Data not available via more direct methods:
    APIs
      Designed to expose data
    Structured web services
    RSS
    Database
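For contrast, here is roughly what a more direct method looks like when one exists. The sketch assumes an RSS 2.0 feed; the sample feed and the read_feed_items helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed used only for this example.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Sample news</title>
    <item><title>First story</title><link>https://example.com/1</link></item>
    <item><title>Second story</title><link>https://example.com/2</link></item>
  </channel>
</rss>"""


def read_feed_items(feed_xml: str) -> list:
    """Return (title, link) pairs for every <item> in the feed."""
    root = ET.fromstring(feed_xml)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]
```

Because the feed is already structured, an XML parser is enough and no regular expressions are required. Scraping is the fallback when nothing like this is offered.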
When to Use: Examples
  Search engines
    Google, Bing, Yahoo, …
  News sites
    Google News, Yahoo News, …
  PadMapper, MapCraigs
    Scrape Craigslist
  Interface with legacy systems
    No support for web services, RSS, etc.
How
  Handout: ScreenScrape
  Example: scrape CBE Faculty/Staff Directory
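The ScreenScrape handout is not reproduced here. As a stand-in, the sketch below scrapes names and email addresses from a directory-style table. The HTML markup, the class names, and the scrape_directory function are all hypothetical; a real directory page would need its own pattern.

```python
import re
from html import unescape

# Hypothetical directory markup; the real page layout is not known.
SAMPLE_DIRECTORY_HTML = """
<table>
  <tr><td class="name">Pat Example</td><td class="email">pat@example.edu</td></tr>
  <tr><td class="name">Lee O&#39;Neil</td><td class="email">lee@example.edu</td></tr>
</table>
"""

# One match per table row: a name cell followed by an email cell.
ROW_PATTERN = re.compile(
    r'<td class="name">(?P<name>.*?)</td>\s*<td class="email">(?P<email>.*?)</td>',
    re.IGNORECASE | re.DOTALL,
)


def scrape_directory(html: str) -> list:
    """Return one {'name', 'email'} dict per directory row found."""
    return [
        {
            # unescape() turns HTML entities such as &#39; back into characters.
            "name": unescape(m.group("name").strip()),
            "email": unescape(m.group("email").strip()),
        }
        for m in ROW_PATTERN.finditer(html)
    ]
```

A pattern like this is tied to the page's exact markup, so it breaks when the site changes its layout. That fragility is one reason to prefer an API when one is available.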
Legal Issues
  Potential to violate copyright laws
  Many lawsuits:
    LinkedIn sues 100 individuals for scraping user data (Oct. 2016)
    Europe battles Google News over 'snippet tax' proposal
    Belgian Newspapers Claim Retaliation By Google After Copyright Victory
Legal Issues
  MapCraigs.com
    Scraped Craigslist real estate
    Displayed on Google Maps
    Blocked IP
  PadMapper vs. Craigslist lawsuit
    Paid Craigslist $1,000,000
  History: Is Web Scraping Legal?
  Use cautiously
Summary
  Screen scraping
    Useful tool for collecting data from web pages
    When API not available
  Many legal uses:
    Search engines
    Legacy systems
  Can violate copyrights