Intelligent Database Systems Lab
Geographical classification of documents using evidence from Wikipedia
Authors: Rafael Odon de Alencar, Clodoveu Augusto Davis Jr., Marcos André Gonçalves
2010, ACM
Presenter: Chuang, Kai-Ting
Outline
Motivation, Objectives, Methodology, Experiments, Conclusions, Comments
Motivation
Geography-related terms are often used in Web search queries.
Objectives
To respond adequately to such queries, it is important to recognize which places a document is associated with.
Methodology
The paper presents a technique for classifying documents by the places they are associated with, based on the occurrence of terms that coincide with Wikipedia entry titles.
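The idea of using Wikipedia entry titles as evidence can be illustrated with a minimal sketch. It is a simplified illustration, not the paper's actual classifier: the `TITLE_TO_PLACE` table, the place labels, and the majority-vote rule are all hypothetical assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical toy data: Wikipedia entry titles mapped to an associated place.
TITLE_TO_PLACE = {
    "eiffel tower": "France",
    "louvre": "France",
    "copacabana": "Brazil",
    "sugarloaf mountain": "Brazil",
}

def title_matches(text, titles):
    """Count how often each known title occurs in the text as a whole phrase."""
    lowered = text.lower()
    counts = Counter()
    for title in titles:
        pattern = r"\b" + re.escape(title) + r"\b"
        hits = len(re.findall(pattern, lowered))
        if hits:
            counts[title] = hits
    return counts

def classify(text, title_to_place=TITLE_TO_PLACE):
    """Return the place with the most title evidence, or None if nothing matches."""
    votes = Counter()
    for title, hits in title_matches(text, title_to_place).items():
        votes[title_to_place[title]] += hits
    if not votes:
        return None
    return votes.most_common(1)[0][0]

print(classify("We visited the Louvre and then the Eiffel Tower."))
```

Here a document is assigned to the place whose associated titles occur most often. A real system would presumably feed such title evidence into a trained classifier rather than a simple vote.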
Experiments
We defined 100 place names to be removed from the documents. 10-fold cross-validation was used.
Impact on precision:
– Wikipedia model: more than 30% loss.
– TF-IDF bag-of-words model: about 6% loss.
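The evaluation steps named on this slide (place-name removal, 10-fold cross-validation, precision) can be sketched as below. This is only an illustration under assumptions: the helper names `remove_place_names`, `k_fold_indices`, and `precision` are hypothetical, and the slides do not say how the folds were drawn or how precision was aggregated.

```python
import re

def remove_place_names(text, place_names):
    """Delete every whole-word occurrence of the given names (case-insensitive)."""
    for name in place_names:
        pattern = r"\b" + re.escape(name) + r"\b"
        text = re.sub(pattern, "", text, flags=re.IGNORECASE)
    return " ".join(text.split())  # collapse the whitespace left behind

def k_fold_indices(n_items, k=10):
    """Yield (train, test) index lists; each item lands in exactly one test fold."""
    folds = [list(range(i, n_items, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

def precision(predicted, actual, label):
    """Fraction of items predicted as `label` whose true class is `label`."""
    truths = [a for p, a in zip(predicted, actual) if p == label]
    if not truths:
        return 0.0
    return sum(a == label for a in truths) / len(truths)
```

Running a classifier once on the original documents and once on the output of `remove_place_names`, over the same folds, would give the two precision values whose difference the slide reports as loss.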
Conclusions
Experiments showed that a high level of precision can be achieved with this approach.
Comments
Advantages
– The approach is helpful.
Applications
– Geographic information retrieval.