Published by Marjory Adams. Modified over 9 years ago.
1
Similarity Measures for Query Expansion in TopX
Caroline Gherbaoui
Universität des Saarlandes, Faculty of Natural Sciences and Technology I, Department 6.2 - Computer Science
Max-Planck-Institut für Informatik, AG 5 - Databases and Information Systems
Prof. Dr. Gerhard Weikum
2
Overview
- background knowledge
- similarity measures for the query expansion
- evaluation of the computed similarity values
- changes in TopX
- conclusion
3
Background
- top-k query processing: provides the k most relevant results
- query expansion: extends the source query terms
- word sense disambiguation: extracts the correct meaning of a term
- ontology: a set of terms with their meanings and semantic relations
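To make the top-k definition above concrete, here is a minimal Python sketch. It only illustrates the result contract (the k best-scored items, best first). The document IDs and scores are invented, and `top_k` is a hypothetical helper, not part of TopX; a real engine would typically try to avoid scoring every candidate.

```python
import heapq

def top_k(scores, k):
    """Return the k (doc, score) pairs with the highest scores, best first."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])

# Invented relevance scores for four documents.
scores = {"d1": 0.42, "d2": 0.91, "d3": 0.17, "d4": 0.66}
print(top_k(scores, 2))  # [('d2', 0.91), ('d4', 0.66)]
```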
4
Word Sense Disambiguation
Query: "java, coffee"
Candidate senses of "java": "island", "coffee", "programming language", …
5
Query Expansion
"COFFEE" is expanded with "drink, espresso"
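The expansion step on this slide can be sketched as a lookup in an ontology followed by a similarity cut-off. Everything here is an assumption made for illustration: the toy `ONTOLOGY`, its similarity values, the `threshold` parameter, and the `expand_query` helper are hypothetical and do not reflect how TopX implements the step.

```python
# Toy ontology: term -> list of (related term, similarity). All values are made up.
ONTOLOGY = {
    "coffee": [("drink", 0.8), ("espresso", 0.7), ("bean", 0.3)],
}

def expand_query(terms, ontology, threshold=0.5):
    """Add related terms whose similarity to a query term reaches the threshold."""
    expanded = []
    for term in terms:
        expanded.append((term, 1.0))  # the original term keeps full weight
        for related, sim in ontology.get(term, []):
            if sim >= threshold:
                expanded.append((related, sim))
    return expanded

print(expand_query(["coffee"], ONTOLOGY))
# [('coffee', 1.0), ('drink', 0.8), ('espresso', 0.7)]
```

Keeping the similarity next to each added term lets a later scoring step down-weight expanded terms relative to the original ones.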
6
TopX
- top-k retrieval engine
- text and XML data
- word sense disambiguation
- query expansion
- ontology
7
TopX – WordNet Ontology
- lexicon for the English language
- hierarchical relations
- one direction per relation
- ~160,000 words
- ~120,000 synsets
- ~210,000 relations
8
TopX – YAGO Ontology
- built from Wikipedia and WordNet
- hierarchical and non-hierarchical relations
- two directions per relation
- ~2,100,000 words
- ~2,200,000 concepts
- ~6,000,000 relations
9
Similarity Measures
- Dice similarity: the measure already used in TopX
- NAGA similarity: the measure applied for YAGO
- Best WordNet similarity: the measure with the best result among the WordNet measures
10
Dice Similarity Measure
- measures the intersection of two regions
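The formula on this slide did not survive extraction. As a reminder of what the Dice coefficient computes, here is a minimal sketch using its standard set form, 2|A ∩ B| / (|A| + |B|). Treating the two "regions" as the sets of documents that contain each term is an assumption, and the document IDs are invented.

```python
def dice(a, b):
    """Standard Dice coefficient of two collections, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return 2 * len(a & b) / (len(a) + len(b))

# Invented example: IDs of documents containing each term.
docs_coffee = {1, 2, 3, 4}
docs_espresso = {3, 4, 5}
print(dice(docs_coffee, docs_espresso))  # 2 * 2 / (4 + 3)
```

The value is 0 for disjoint sets and 1 for identical non-empty sets.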
11
NAGA Similarity Measure
- combination of the confidence of a relation and the informativeness of a relation
12
Best WordNet Similarity Measure
- product of the transfer function of the path length and the transfer function of the concept depth
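The slide's formula was lost in extraction, so the following is only a sketch of a measure with the stated shape: one transfer function of the path length multiplied by one of the concept depth. The concrete transfer functions (exponential decay for the path length, hyperbolic tangent for the depth) are a form commonly seen in the WordNet similarity literature, and the parameters `alpha` and `beta` are illustrative. Whether the slide used exactly these functions or values is an assumption.

```python
import math

def path_transfer(length, alpha=0.2):
    """Assumed transfer function: similarity decays as the path gets longer."""
    return math.exp(-alpha * length)

def depth_transfer(depth, beta=0.6):
    """Assumed transfer function: deeper (more specific) concepts score higher."""
    return math.tanh(beta * depth)

def wordnet_similarity(length, depth, alpha=0.2, beta=0.6):
    """Product of the two transfer functions, giving a value in [0, 1)."""
    return path_transfer(length, alpha) * depth_transfer(depth, beta)

# A short path between specific concepts scores higher than a long path
# between general ones.
print(wordnet_similarity(length=1, depth=6))
print(wordnet_similarity(length=5, depth=1))
```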
13
Evaluation
14
- Dice measure: also applicable to the YAGO ontology
- NAGA measure: applicable if the forward direction is omitted
- Best WordNet measure: not applicable due to the density of YAGO
15
Changes for TopX
- tuning of some procedures
- Dijkstra algorithm
- word sense disambiguation
- query expansion
- extension of configuration file
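The slide names the Dijkstra algorithm without saying how it is used. One plausible reading, offered here purely as an assumption, is a best-path search over the ontology graph in which edge similarities in (0, 1] are multiplied along a path and the highest product per concept is kept. The graph, its weights, and the `best_path_similarity` helper are hypothetical.

```python
import heapq

def best_path_similarity(graph, source):
    """Dijkstra variant: maximize the product of edge similarities from source.

    Assumes every edge weight lies in (0, 1], so extending a path never
    increases its similarity.
    """
    best = {source: 1.0}
    heap = [(-1.0, source)]  # max-heap via negated similarity
    while heap:
        neg_sim, node = heapq.heappop(heap)
        sim = -neg_sim
        if sim < best.get(node, 0.0):
            continue  # stale entry: a better path was already found
        for neighbor, weight in graph.get(node, []):
            candidate = sim * weight
            if candidate > best.get(neighbor, 0.0):
                best[neighbor] = candidate
                heapq.heappush(heap, (-candidate, neighbor))
    return best

# Invented graph: concept -> list of (related concept, edge similarity).
GRAPH = {
    "coffee": [("drink", 0.8), ("espresso", 0.5)],
    "drink": [("espresso", 0.9)],
}
print(best_path_similarity(GRAPH, "coffee"))
```

In this example the indirect path coffee, drink, espresso (0.8 × 0.9) beats the direct edge (0.5), which is the kind of case where a path search matters.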
16
Conclusion
- larger knowledge base
- more flexibility
- increased complexity
- further measure for the similarity computation: NAGA similarity
17
Questions?