Bc. Anton Balucha
Assignment for the Information Retrieval course
Anton Balucha - Identification of people
– search engine results
– a lot of information about many people
– scattered, not integrated
– create an application that identifies occurrences of a person on various web sites
– easy to use, clear list of results
– search for people only in the USA
– search for people on social networks
– search for people by entered parameters (name, surname, town, state)
– plugin for the Firefox browser
– search for people only in the USA, in an entered state
– search for people on various portals (Google+, Wikipedia, LinkedIn, Flickr, Twitter)
– search for people only in the USA, in an entered state, with the option to hire a person to do the search
– programmed in Java
– web application available from z
– no static data
– actively uses results from search engines
– Google
– results
– web pages
– remove HTML
– remove stop words
– remove diacritics
– stemming
– TF-IDF
– identify keywords
– show results
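The processing steps above can be sketched in Java, the language the application is written in. This is only a minimal sketch under stated assumptions, not the application's actual code: the class name `KeywordPipeline`, the tiny stop-word list, the regex-based tag stripping, and the identity `stem` placeholder (where a real stemmer such as a Porter stemmer would be plugged in) are all hypothetical. TF is taken here as term count divided by document length and IDF as ln(N / df), which is one common variant.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class KeywordPipeline {
    // Tiny illustrative stop-word list (assumption; a real list is much longer).
    static final Set<String> STOP_WORDS = Set.of("a", "an", "and", "the", "of", "in", "is", "at");

    // Crude tag stripping with a regular expression; a real HTML parser is more robust.
    public static String removeHtml(String html) {
        return html.replaceAll("<[^>]*>", " ");
    }

    // Decompose accented letters (NFD) and drop the combining marks.
    public static String removeDiacritics(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFD).replaceAll("\\p{M}", "");
    }

    // Placeholder (assumption): returns the token unchanged; plug in a real stemmer here.
    public static String stem(String token) {
        return token;
    }

    // remove HTML -> lower-case and split -> remove stop words -> remove diacritics -> stem
    public static List<String> preprocess(String html) {
        List<String> tokens = new ArrayList<>();
        for (String raw : removeHtml(html).toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (raw.isEmpty() || STOP_WORDS.contains(raw)) {
                continue;
            }
            tokens.add(stem(removeDiacritics(raw)));
        }
        return tokens;
    }

    // One map of term -> TF-IDF weight per document.
    public static List<Map<String, Double>> tfIdf(List<List<String>> docs) {
        // Document frequency: in how many documents each term occurs.
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            for (String term : new HashSet<>(doc)) {
                df.merge(term, 1, Integer::sum);
            }
        }
        List<Map<String, Double>> result = new ArrayList<>();
        for (List<String> doc : docs) {
            Map<String, Double> weights = new HashMap<>();
            for (String term : doc) {
                weights.merge(term, 1.0, Double::sum); // raw term counts first
            }
            for (Map.Entry<String, Double> e : weights.entrySet()) {
                double tf = e.getValue() / doc.size();
                double idf = Math.log((double) docs.size() / df.get(e.getKey()));
                e.setValue(tf * idf);
            }
            result.add(weights);
        }
        return result;
    }

    // The k terms with the highest weight (order among equal weights is unspecified).
    public static List<String> topKeywords(Map<String, Double> weights, int k) {
        List<String> terms = new ArrayList<>(weights.keySet());
        terms.sort((x, y) -> Double.compare(weights.get(y), weights.get(x)));
        return terms.subList(0, Math.min(k, terms.size()));
    }

    public static void main(String[] args) {
        List<List<String>> docs = List.of(
                preprocess("<p>The professor of informatics</p>"),
                preprocess("<p>The student of informatics</p>"));
        // "informatics" occurs in both documents, so its IDF is 0.
        System.out.println(topKeywords(tfIdf(docs).get(0), 1)); // prints [professor]
    }
}
```

With only two toy documents, a term shared by both gets weight 0, which is why "professor" is the keyword that distinguishes the first page. In practice the documents would be the web pages returned by the search engine for one person.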
Anton Balucha, Mária Bieliková, Pavol Návrat, Peter Borga, Petra Majzúnová, Miloš Blaško
#   Name and surname   |D|   |R|   |I|   |RI|   Precision   Recall
1.  Anton Balucha
2.  Mária Bieliková
3.  Pavol Návrat
4.  Peter Borga
5.  Petra Majzúnová
6.  Miloš Blaško
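As a reminder of how the last two columns are usually computed, here is a small Java sketch. It rests on an assumption about the column symbols, which the slide does not define: |R| is taken as the number of relevant pages, |I| as the number of pages the application identified, and |RI| as the number of identified pages that are relevant. The class name `PrecisionRecall` and the counts in `main` are hypothetical and are not results from the evaluation.

```java
final class PrecisionRecall {
    // Precision = |RI| / |I|: share of identified pages that are relevant.
    public static double precision(int relevantIdentified, int identified) {
        return identified == 0 ? 0.0 : (double) relevantIdentified / identified;
    }

    // Recall = |RI| / |R|: share of relevant pages that were identified.
    public static double recall(int relevantIdentified, int relevant) {
        return relevant == 0 ? 0.0 : (double) relevantIdentified / relevant;
    }

    public static void main(String[] args) {
        // Hypothetical counts for illustration only: |R| = 8, |I| = 5, |RI| = 4.
        System.out.println(precision(4, 5)); // prints 0.8
        System.out.println(recall(4, 8));    // prints 0.5
    }
}
```

Returning 0.0 when the denominator is zero is a design choice of this sketch; some evaluations leave the value undefined in that case.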
Anton Balucha, Mária Bieliková, Peter Borga
– better text processor
– better stemming
– better keyword identification
– just the right number of keywords
I learned:
– what stemming and lemmatization are
– what TF-IDF is
– what precision and recall are
– how interesting text research is
– installation of Java
– installation of Apache Tomcat
– deployment of external applications
– access to the Internet
– access to the application