1
Information Retrieval: Quality of a Search Engine
2
Is it good?
- How fast does it index: number of documents/hour (average document size)
- How fast does it search: latency as a function of index size
- Expressiveness of the query language
3
Measures for a search engine
- All of the preceding criteria are measurable
- The key measure: user happiness
- …useless answers won't make a user happy
4
Happiness: elusive to measure
The commonest approach is to measure the relevance of the search results. How do we measure it? It requires 3 elements:
1. A benchmark document collection
2. A benchmark suite of queries
3. A binary assessment of either Relevant or Irrelevant for each query-doc pair
5
Evaluating an IR system
Standard benchmarks:
- TREC: the National Institute of Standards and Technology (NIST) has run a large IR testbed for many years
- Other doc collections: marked by human experts, for each query and for each doc, as Relevant or Irrelevant
On the Web everything is more complicated, since we cannot mark the entire corpus!
6
General scenario (Venn diagram): the Retrieved set and the Relevant set within the whole collection.
7
Precision vs. Recall
- Precision: % of retrieved docs that are relevant [issue: how much "junk" is found]
- Recall: % of relevant docs that are retrieved [issue: how much of the "info" is found]
(Venn diagram: the Retrieved and Relevant sets within the collection)
8
How to compute them
- Precision: fraction of retrieved docs that are relevant, P = tp / (tp + fp)
- Recall: fraction of relevant docs that are retrieved, R = tp / (tp + fn)

                 Relevant               Not Relevant
  Retrieved      tp (true positive)     fp (false positive)
  Not Retrieved  fn (false negative)    tn (true negative)
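The two fractions follow directly from the contingency table. A minimal Python sketch, not from the slides; the doc-id sets are purely illustrative:

```python
# Precision and Recall from a set of retrieved doc ids and a set of
# relevant doc ids (illustrative example, not from the slides).

def precision_recall(retrieved: set, relevant: set) -> tuple[float, float]:
    tp = len(retrieved & relevant)   # retrieved and relevant
    fp = len(retrieved - relevant)   # retrieved but not relevant
    fn = len(relevant - retrieved)   # relevant but not retrieved
    precision = tp / (tp + fp) if retrieved else 0.0
    recall = tp / (tp + fn) if relevant else 0.0
    return precision, recall

# 3 of the 4 retrieved docs are relevant, out of 6 relevant docs overall.
p, r = precision_recall({"d1", "d2", "d3", "d9"},
                        {"d1", "d2", "d3", "d4", "d5", "d6"})
print(p, r)   # 0.75 0.5
```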
9
Some considerations
- You can get high recall (but low precision) by retrieving all docs for all queries!
- Recall is a non-decreasing function of the number of docs retrieved
- Precision usually decreases as more docs are retrieved
10
Precision vs. Recall (Venn diagram): highest precision, very low recall; a tiny Retrieved set lying entirely inside the Relevant set.
11
Precision vs. Recall (Venn diagram): lowest precision and recall; the Retrieved and Relevant sets are disjoint.
12
Precision vs. Recall (Venn diagram): low precision and very high recall; a large Retrieved set that covers almost all of the Relevant set but also contains many non-relevant docs.
13
Precision vs. Recall (Venn diagram): very high precision and recall; the Retrieved set almost coincides with the Relevant set.
14
Precision-Recall curve
We measure Precision at various levels of Recall. Note: the curve is an AVERAGE over many queries.
(Plot: precision on the y-axis against recall on the x-axis, one point per recall level)
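For a single query, the points of the curve can be obtained by scanning the ranked result list from the top and recording a (recall, precision) pair after each result; the slide's curve is then the average of such per-query curves. A hedged sketch with made-up doc ids:

```python
# (recall, precision) points for ONE query, obtained by scanning the
# ranked result list top-down. Doc ids are made up for illustration.

def pr_points(ranked: list[str], relevant: set[str]) -> list[tuple[float, float]]:
    points, hits = [], 0
    for k, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
        points.append((hits / len(relevant), hits / k))   # (recall, precision)
    return points

print(pr_points(["d3", "d7", "d1", "d9"], {"d3", "d1", "d9"}))
# roughly [(0.33, 1.0), (0.33, 0.5), (0.67, 0.67), (1.0, 0.75)]
```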
15
A common picture (plot of the precision-recall curve: precision on the y-axis, recall on the x-axis).
16
Interpolated precision
If you can increase precision by increasing recall, then you should get to count that: the interpolated precision at recall level r is the maximum precision measured at any recall level ≥ r.
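A sketch of that rule, assuming the (recall, precision) points of a single query's curve are available in the same representation as above:

```python
# Interpolated precision at recall level r: the maximum precision observed
# at any recall level >= r, so precision never drops just because recall grew.

def interpolated_precision(points: list[tuple[float, float]], r: float) -> float:
    return max((p for rec, p in points if rec >= r), default=0.0)
```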
17
Other measures
- Precision at a fixed retrieval level (e.g., the top 10 results): most appropriate for web search
- 11-point interpolated average precision: the standard measure for TREC; take the interpolated precision at 11 recall levels, from 0% to 100% in steps of 10%, and average them
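A sketch of the 11-point measure for one query, using the same (recall, precision) representation as above; in a TREC-style evaluation the per-query values are then averaged over the whole query set:

```python
# 11-point interpolated average precision for ONE query: average the
# interpolated precision at recall levels 0.0, 0.1, ..., 1.0.

def eleven_point_avg_precision(points: list[tuple[float, float]]) -> float:
    levels = [i / 10 for i in range(11)]                        # 0.0 .. 1.0
    interp = [max((p for r, p in points if r >= level), default=0.0)
              for level in levels]                              # interpolated precision
    return sum(interp) / len(interp)
```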
18
F measure
Combined measure (weighted harmonic mean):
F = 1 / (α·(1/P) + (1−α)·(1/R)) = (β²+1)·P·R / (β²·P + R), with β² = (1−α)/α
People usually use the balanced F1 measure, i.e., with β = 1 (equivalently α = ½), thus 1/F = ½·(1/P + 1/R), i.e., F1 = 2·P·R / (P + R).
Use this if you need to optimize a single measure that balances precision and recall.
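A sketch of the weighted F measure in the same style; β = 1 gives the balanced F1:

```python
# Weighted harmonic mean of precision and recall:
# F = (beta^2 + 1) * P * R / (beta^2 * P + R); beta = 1 gives F1 = 2PR/(P+R).

def f_measure(p: float, r: float, beta: float = 1.0) -> float:
    if p == 0.0 and r == 0.0:
        return 0.0
    b2 = beta * beta
    return (b2 + 1) * p * r / (b2 * p + r)

print(f_measure(0.75, 0.5))   # 0.6
```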