Published by Marilynn Eaton. Modified over 8 years ago.
1
QUERY-PERFORMANCE PREDICTION: SETTING THE EXPECTATIONS STRAIGHT
Date: 2014/08/18
Authors: Fiana Raiber, Oren Kurland
Source: SIGIR’14
Advisor: Jia-ling Koh
Speaker: Shao-Chun Peng
2
Outline
Introduction
Related Work
Approach
Experiments
Conclusion
3
Introduction
What is a “query”?
[Diagram labels: query, browser, method, corpus, retrieved list]
4
Introduction
What does the user want when issuing a query?
Documents that are truly relevant to the query.
5
Motivation
Why do we need to “predict” query performance?
Improved prediction methods have not led to improved retrieval methods.
[Diagram labels: bad query, good query, browser, method, corpus, retrieved list; note: "Don’t change method"]
6
Purpose
How to estimate retrieval effectiveness in the absence of relevance judgments.
8
Prediction task
Prediction over corpora
Prediction over retrieved lists
Prediction over queries (pre-retrieval and post-retrieval)
9
Prediction task notations
Q: queries
C: document corpora
M: retrieval methods
L: retrieved lists
R: 1 if the retrieval was effective, 0 otherwise
[Diagram labels: query, corpus, method, list]
10
Prediction over corpora
Application: federated search
Fix Q = q; estimate effectiveness for each corpus c, under any assignment m of the retrieval method.
[Diagram labels: query, corpus, relevant?]
11
Prediction over retrieved lists
Application: fusion
The lists differ because of the retrieval method used.
Fix Q = q and C = c; estimate effectiveness for each list l.
[Diagram labels: query, list, relevant?]
12
Prediction over queries
Pre-retrieval: fix C = c; estimate effectiveness for each query q.
Post-retrieval: fix C = c; estimate effectiveness for each pair of q and m.
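The three prediction tasks share one pattern: some variables are fixed, and the remaining candidates are ordered by an estimate of R = 1. The following is a minimal sketch of that pattern only, not the method from the talk. The function name `rank_candidates` and the toy prediction values are assumptions made for illustration.

```python
from typing import Callable, Hashable, Iterable, List


def rank_candidates(candidates: Iterable[Hashable],
                    predict: Callable[[Hashable], float]) -> List[Hashable]:
    """Order candidates by decreasing predicted effectiveness."""
    return sorted(candidates, key=predict, reverse=True)


# Made-up prediction values standing in for an estimate that R = 1.
toy_scores = {("q1", "c1"): 0.2, ("q1", "c2"): 0.7, ("q1", "c3"): 0.5}

# Prediction over corpora: Q is fixed to q1 and only the corpus varies.
corpora = ["c1", "c2", "c3"]
ranked = rank_candidates(corpora, lambda c: toy_scores[("q1", c)])
print(ranked)  # ['c2', 'c3', 'c1']
```

Prediction over retrieved lists and over queries would reuse the same function with lists or queries as the candidates.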
13
Related Work
Why was the expectation that previously proposed query-performance predictors would help improve retrieval effectiveness not realized?
How can query-performance predictors be used to improve retrieval effectiveness?
15
Approach
Prediction over corpora: cluster ranking
Prediction over retrieved lists: learning to rank queries using Markov Random Fields
Prediction over queries: learning to rank queries using Markov Random Fields
16
Markov Random Fields
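As a reminder, the general textbook definition of a Markov Random Field over a graph G is given below. This is the generic form, not the specific graph or potentials used in the talk. The symbols \psi_c (clique potentials), \lambda_c (weights), and f_c (feature functions) are standard notation assumed here.

```latex
p(x_1,\dots,x_n) = \frac{1}{Z}\prod_{c \in \mathrm{Cl}(G)} \psi_c(x_c),
\qquad
Z = \sum_{x}\prod_{c \in \mathrm{Cl}(G)} \psi_c(x_c)

% With log-linear potentials \psi_c(x_c) = \exp(\lambda_c f_c(x_c)):
\log p(x) = \sum_{c \in \mathrm{Cl}(G)} \lambda_c f_c(x_c) - \log Z
```

Because \log Z does not depend on x, ranking assignments by probability is equivalent to ranking them by the weighted sum of feature functions.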
17
Feature selection
SCQ: similarity between a term and the corpus (tf.idf based)
VAR: variance of the tf.idf values of a term over the documents in the corpus in which it appears
IDF: inverse document frequency
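The three term-level features above can be sketched on a toy corpus as follows. The exact weighting formulas are not given on the slide. The ones used here are one common formulation and should be read as an assumption, as should the toy documents and function names.

```python
import math
from statistics import pvariance

# Toy tokenized corpus (an assumption, for illustration only).
docs = [
    ["query", "performance", "prediction", "for", "query", "retrieval"],
    ["retrieval", "methods", "rank", "documents", "for", "a", "query"],
    ["markov", "random", "fields", "for", "ranking"],
]
N = len(docs)


def df(term):
    """Number of documents that contain the term."""
    return sum(1 for d in docs if term in d)


def cf(term):
    """Number of occurrences of the term in the whole corpus."""
    return sum(d.count(term) for d in docs)


def idf(term):
    """Inverse document frequency; 0.0 for unseen terms."""
    return math.log(N / df(term)) if df(term) else 0.0


def scq(term):
    """Assumed form: (1 + ln cf) * ln(1 + N / df)."""
    if not df(term):
        return 0.0
    return (1 + math.log(cf(term))) * math.log(1 + N / df(term))


def var(term):
    """Variance of assumed tf.idf weights over documents containing the term."""
    weights = [(1 + math.log(d.count(term))) * math.log(1 + N / df(term))
               for d in docs if term in d]
    return pvariance(weights) if weights else 0.0


for t in ["query", "markov", "for"]:
    print(t, round(idf(t), 3), round(scq(t), 3), round(var(t), 3))
```

A query-level feature would typically aggregate these per-term values, for example by taking the average or the maximum over the query terms.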
18
Feature selection
Entropy: high entropy of the term distribution in a document potentially indicates content breadth
Cohesion: for each document d in L, compute its average similarity with all documents in L
Sw1: the ratio between the number of stopwords and non-stopwords in the document
Sw2: the fraction of stopwords in a stopword list that appear in the document
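The entropy and stopword features above can be sketched per document as follows. This is a minimal illustration, assuming a tiny made-up stopword list and whitespace tokenization. The function names are hypothetical, and cohesion is omitted because the slide does not name the similarity measure.

```python
import math
from collections import Counter

# Toy stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "a", "of", "for", "and", "in"}


def entropy(tokens):
    """Shannon entropy (bits) of the term distribution in a document."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def sw1(tokens):
    """Ratio between the number of stopwords and non-stopwords."""
    stop = sum(1 for t in tokens if t in STOPWORDS)
    non_stop = len(tokens) - stop
    return stop / non_stop if non_stop else float("inf")


def sw2(tokens):
    """Fraction of the stopword list that appears in the document."""
    return len(STOPWORDS & set(tokens)) / len(STOPWORDS)


doc = "the cat and the dog".split()
print(entropy(doc), sw1(doc), sw2(doc))
```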
19
Feature selection
Clarity: KL divergence between a relevance language model induced from the list and the language model induced from the corpus
ImpClarity: a variant of Clarity proposed for Web corpora
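The KL-divergence idea behind Clarity can be sketched as follows. This is a simplified illustration, not the predictor as published: it uses an unweighted maximum-likelihood language model for the list instead of a relevance model, and it assumes the retrieved documents are part of the corpus so that every list term has nonzero corpus probability. The function names are hypothetical.

```python
import math
from collections import Counter


def language_model(docs):
    """Unsmoothed maximum-likelihood unigram model over tokenized documents."""
    counts = Counter(t for d in docs for t in d)
    total = sum(counts.values())
    return {t: n / total for t, n in counts.items()}


def clarity(list_docs, corpus_docs):
    """KL divergence (bits) between the list model and the corpus model.

    Assumes every term in list_docs also occurs in corpus_docs.
    """
    p_list = language_model(list_docs)
    p_corpus = language_model(corpus_docs)
    return sum(p * math.log2(p / p_corpus[t]) for t, p in p_list.items())


corpus = [["a", "b"], ["c", "d"]]
focused_list = [["a", "b"]]          # retrieved list drawn from the corpus
print(clarity(focused_list, corpus))  # 1.0
print(clarity(corpus, corpus))        # 0.0
```

A list whose term distribution matches the corpus gets a value of zero, while a list focused on a few terms gets a higher value.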
20
Feature selection
WIG: the difference between the mean retrieval score in the list and that of the corpus, which represents a pseudo non-relevant document
NQC: the standard deviation of retrieval scores in the list
UEF(Clarity), UEF(ImpClarity), UEF(WIG), UEF(NQC)
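The two score-based predictors above (the standard-deviation predictor is usually written NQC in the literature) can be sketched from a list of retrieval scores as follows. The slide gives only the core idea of each. The normalization by the square root of the query length in WIG and by the corpus score in NQC follows commonly cited definitions and is an assumption here, as are the toy scores and function names.

```python
import math
from statistics import mean, pstdev


def wig(scores, corpus_score, query_length):
    """Mean list score minus the corpus score, normalized by sqrt(|q|)."""
    return (mean(scores) - corpus_score) / math.sqrt(query_length)


def nqc(scores, corpus_score):
    """Standard deviation of list scores, normalized by |corpus score|."""
    return pstdev(scores) / abs(corpus_score)


# Made-up log-likelihood style scores for the top-3 documents.
scores = [-2.0, -3.0, -4.0]
corpus_score = -5.0   # score of the corpus treated as one pseudo document
print(wig(scores, corpus_score, query_length=4))  # 1.0
print(nqc(scores, corpus_score))
```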
22
Data Set
23
Experiments
[Figure labels: X_QC, X_L, X_LC, X_QLC]
24
Experiments
26
Conclusion
Explained why using previously proposed query-performance predictors was not shown to improve retrieval effectiveness.
Devised a learning-to-rank approach for predicting performance over queries.