iCrawl – Hiwi Jobs and Master Thesis

Context
  - iCrawl Project – A novel approach for the creation of high-quality Web archives
      - Easy-to-use and extensible Web archive crawler framework
      - Usable also by non-technicians
  - User Interface
      - Key component for interacting with the crawler
      - Setting up crawls
      - Maintaining and monitoring crawls
      - Quality assurance of crawls

Thomas Risse 08/12/18

Hiwi Job in the context of Web Archiving

Topic
  - User interface development for setting up, maintaining and monitoring crawls
      - Easy to use (also for non-computer scientists)
      - Near-real-time information

Requirements
  - Interest in doing cool things in the context of a research project
  - A “feeling” for good design and user friendliness
  - Programming skills in Java

Contact: Thomas Risse (L3S), risse@L3S.de

Master Thesis: Crawl Specification Wizard

Problem Statement
  - The quality of a Web archive depends on the quality of the crawl specification
  - Crawl specifications for focused crawls are complex and hard to define (initial starting points, good descriptions of terms, entities, etc.)
  - Crawl specifications are similar to search engine queries but more complex

Aim of the Master Thesis
  - Development of a semi-automatic tool that learns the intention of a crawl
      - Based on a set of reference pages or on search engine results
      - Iterative and interactive process
      - Requires analysis and extraction of information from Web pages

Requirements
  - Interest in doing cool things in the context of a research project
  - A “feeling” for good design and user friendliness
  - Programming skills in Java

Contact: Thomas Risse (L3S), risse@L3S.de