Published by Penelope Harrison. Modified over 8 years ago.
1
IS250 Project 2
Jean-Anne Fitzpatrick, Jennifer English, Ian Liu
Sections: Background, Technology, Future, Design, Demo
2
Tired of these?
Error 404 - The page you requested was not found!
3
LINK-INATOR: ID bad links for termination!
- Original concept: Dhea
- Client application for site admins
- Checks outbound links and reports status
- Reduces errors via a "best effort" service
- Two Java modules: CheckLink and PageParser
4
LINK-INATOR module diagram (labels as extracted from the figure)
CheckLink: connect, closeSocket, goodPage, wholePage
- Connect, send HTTP GET request, read first line
- If first line is OK, read rest of page
- Output: text of page
PageParser: parseFile, checkHrefType, getHostPart, getPathPart, getPagePart, getPort, removePort
- Output: 2D array of host, path, page, port
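The CheckLink flow on this slide (connect, send an HTTP GET request, read the first line, close the socket) might look roughly like the sketch below. It is not the project's code: the class name CheckLinkSketch, the helpers buildRequest, parseStatusCode, and readStatusLine, the 5-second timeout, and the rule that only 2xx counts as a good page are all assumptions made for illustration. It speaks plain HTTP only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class CheckLinkSketch {

    // Builds a minimal HTTP/1.0 GET request for the given host and path.
    static String buildRequest(String host, String path) {
        return "GET " + path + " HTTP/1.0\r\n"
             + "Host: " + host + "\r\n"
             + "\r\n";
    }

    // Extracts the numeric code from a status line such as
    // "HTTP/1.1 404 Not Found"; returns -1 if the line is malformed.
    static int parseStatusCode(String statusLine) {
        if (statusLine == null) {
            return -1;
        }
        String[] parts = statusLine.trim().split("\\s+");
        if (parts.length < 2 || !parts[0].startsWith("HTTP/")) {
            return -1;
        }
        try {
            return Integer.parseInt(parts[1]);
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    // Connects, sends the GET request, and reads only the first response line.
    // try-with-resources closes the socket on every exit path.
    static String readStatusLine(String host, int port, String path) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            socket.setSoTimeout(5000); // assumed timeout, in milliseconds
            OutputStream out = socket.getOutputStream();
            out.write(buildRequest(host, path).getBytes(StandardCharsets.US_ASCII));
            out.flush();
            BufferedReader in = new BufferedReader(
                new InputStreamReader(socket.getInputStream(), StandardCharsets.ISO_8859_1));
            return in.readLine();
        }
    }

    // Assumption: a link is "good" when the status code is in the 2xx range.
    static boolean goodPage(String host, int port, String path) {
        try {
            int code = parseStatusCode(readStatusLine(host, port, path));
            return code >= 200 && code < 300;
        } catch (IOException e) {
            return false; // unreachable host, timeout, or reset connection
        }
    }
}
```

A call such as CheckLinkSketch.goodPage("example.com", 80, "/") would exercise the whole flow. A wholePage variant would keep reading from the same reader after a successful first line instead of returning immediately.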
6
Design Considerations
- Evaluate links on one page only
- HTTP message formats and status codes per RFC 2616
- Many link variations to consider
- Python vs. Java for parsing
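To make "many link variations" concrete, here is a small sketch of how an href might be classified and how an absolute URL might be split into host, path, page, and port. The names checkHrefType and the host/path/page/port split echo labels from the deck, but the class LinkPartsSketch, the HrefType categories, the splitUrl helper, and the default port of 80 are assumptions for illustration, not the project's implementation. It relies on java.net.URI rather than hand-written parsing.

```java
import java.net.URI;
import java.net.URISyntaxException;

class LinkPartsSketch {

    // Hypothetical categories; https, mailto, ftp, etc. all land in OTHER_SCHEME.
    enum HrefType { HTTP, OTHER_SCHEME, FRAGMENT_ONLY, RELATIVE, INVALID }

    static HrefType checkHrefType(String href) {
        if (href == null || href.trim().isEmpty()) {
            return HrefType.INVALID;
        }
        String h = href.trim();
        if (h.startsWith("#")) {
            return HrefType.FRAGMENT_ONLY; // same-page anchor, nothing to fetch
        }
        try {
            String scheme = new URI(h).getScheme();
            if (scheme == null) {
                return HrefType.RELATIVE;
            }
            return scheme.equalsIgnoreCase("http") ? HrefType.HTTP : HrefType.OTHER_SCHEME;
        } catch (URISyntaxException e) {
            return HrefType.INVALID; // e.g. an unescaped space in the href
        }
    }

    // Splits an absolute http URL into {host, path, page, port}.
    // URI.create throws IllegalArgumentException on malformed input.
    static String[] splitUrl(String url) {
        URI uri = URI.create(url);
        String host = uri.getHost();
        int port = uri.getPort() == -1 ? 80 : uri.getPort(); // assumed default
        String fullPath = uri.getPath();
        if (fullPath == null || fullPath.isEmpty()) {
            fullPath = "/";
        }
        int lastSlash = fullPath.lastIndexOf('/');
        String path = fullPath.substring(0, lastSlash + 1); // directory part
        String page = fullPath.substring(lastSlash + 1);    // may be empty
        return new String[] { host, path, page, String.valueOf(port) };
    }
}
```

Relative links would still need to be resolved against the URL of the page being checked (URI.resolve can do this) before they can be split and fetched.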
7
Future Considerations
- E-mail notification of bad links
- Checking more than one page
- Support for frames
- Recursive crawling
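As a rough idea of what checking more than one page could involve, the sketch below follows links outward from a start page with a depth limit and a visited set, so that pages linking back to each other are not fetched twice. It uses an iterative breadth-first traversal rather than literal recursion. The class CrawlSketch, the crawl method, and the linksOf function (a stand-in for fetching a page and parsing its links) are hypothetical and not part of the project.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

class CrawlSketch {

    // Visits pages reachable from start within maxDepth link hops.
    // linksOf is a placeholder for "fetch the page and extract its links".
    static Set<String> crawl(String start, int maxDepth,
                             Function<String, List<String>> linksOf) {
        Set<String> visited = new LinkedHashSet<>(); // keeps discovery order
        Deque<String> frontier = new ArrayDeque<>();
        Deque<Integer> depths = new ArrayDeque<>();  // depth of each queued page
        visited.add(start);
        frontier.add(start);
        depths.add(0);
        while (!frontier.isEmpty()) {
            String page = frontier.poll();
            int depth = depths.poll();
            if (depth == maxDepth) {
                continue; // do not expand pages at the depth limit
            }
            for (String next : linksOf.apply(page)) {
                if (visited.add(next)) { // add returns false for pages seen before
                    frontier.add(next);
                    depths.add(depth + 1);
                }
            }
        }
        return visited;
    }
}
```

With maxDepth set to 0 this reduces to the current design of evaluating a single page.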
8
Questions?