Assessing The Retrieval
A.I. Lab, 2007.01.20
박동훈
Contents
4.1 Personal Assessment of Relevance
4.2 Extending the Dialog with RelFbk
4.3 Aggregated Assessment: Search Engine Performance
4.4 RAVE: A Relevance Assessment Vehicle
4.5 Summary
4.1 Personal Assessment of Relevance
4.1.1 Cognitive Assumptions
– Users trying to do 'object recognition'
– Comparison with respect to a prototypic document
– Reliability of user opinions?
– Relevance scale
– RelFbk is nonmetric
Relevance Scale
Users naturally provide only preference information, not a (metric) measurement of how relevant a retrieved document is. RelFbk is nonmetric.
4.2 Extending the Dialog with RelFbk
RelFbk labeling of the Retr set
Query Session, Linked by RelFbk
4.2.1 Using RelFbk for Query Refinement
4.2.2 Document Modifications due to RelFbk (Fig 4.7)
Change the documents!? Move each document toward the queries that successfully match it, and away from those that do not.
4.3 Aggregated Assessment: Search Engine Performance
4.3.1 Underlying Assumptions
– RelFbk(q, d_i) assessments are independent
– Users' opinions will all agree with a single 'omniscient' expert's
4.3.2 Consensual Relevance
Consensually relevant
4.3.4 Basic Measures
Relevant versus Retrieved Sets
Contingency table
– NRel: the number of relevant documents
– NNRel: the number of irrelevant documents
– NDoc: the total number of documents
– NRet: the number of retrieved documents
– NNRet: the number of documents not retrieved
4.3.4 Basic Measures (cont)
4.3.4 Basic Measures (cont)
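The formulas on these two slides were figures and are not preserved in the transcript. Assuming they present the standard precision and recall definitions in terms of the contingency-table counts above (a hedged reconstruction, writing |Rel ∩ Ret| for the number of documents that are both relevant and retrieved):

\mathrm{Precision} = \frac{|Rel \cap Ret|}{N_{Ret}}, \qquad \mathrm{Recall} = \frac{|Rel \cap Ret|}{N_{Rel}}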
4.3.5 Ordering the Retr Set
– Each document is assigned a hitlist rank Rank(d_i), in descending order of Match(q, d_i): Rank(d_i) < Rank(d_j) implies Match(q, d_i) >= Match(q, d_j)
– Ideally, Rank(d_i) < Rank(d_j) should also imply Pr(Rel(d_i)) >= Pr(Rel(d_j))
– Coordination level: a document's rank in Retr is based on the number of keywords shared by the document and the query (see the sketch below)
– Goal: the Probability Ranking Principle
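A minimal sketch of coordination-level ranking as described above; the function names and the toy documents are illustrative assumptions, not from the original slides.

# Coordination-level ranking: order documents by the number of
# query keywords they share with the query (illustrative sketch).

def coordination_level(query_terms, doc_terms):
    """Number of query keywords that also appear in the document."""
    return len(set(query_terms) & set(doc_terms))

def rank_retrieved(query_terms, docs):
    """Return (doc_id, score) pairs in descending Match(q, d_i) order,
    so Rank(d_i) < Rank(d_j) implies Match(q, d_i) >= Match(q, d_j)."""
    scored = [(doc_id, coordination_level(query_terms, terms))
              for doc_id, terms in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

docs = {"d1": ["retrieval", "relevance", "feedback"],
        "d2": ["indexing", "retrieval"],
        "d3": ["user", "interface"]}
print(rank_retrieved(["relevance", "retrieval"], docs))
# [('d1', 2), ('d2', 1), ('d3', 0)]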
A tale of two retrievals: Query 1 vs. Query 2
Recall/precision curve: Query 1
Recall/precision curve: Query 1
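The curve itself is a figure, but the points behind it can be computed by walking down the ranked hitlist. A hedged sketch, assuming a set of consensually relevant documents is available (variable names are illustrative):

def recall_precision_points(hitlist, relevant):
    """Record (recall, precision) after each document in the ranked hitlist."""
    relevant = set(relevant)
    points, hits = [], 0
    for i, doc_id in enumerate(hitlist, start=1):
        if doc_id in relevant:
            hits += 1
        points.append((hits / len(relevant), hits / i))
    return points

# Toy example: 3 relevant documents, a 5-document hitlist.
print(recall_precision_points(["d2", "d7", "d1", "d9", "d5"],
                              relevant={"d1", "d2", "d5"}))
# [(0.33, 1.0), (0.33, 0.5), (0.67, 0.67), (0.67, 0.5), (1.0, 0.6)]  (rounded)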
Retrieval envelope
4.3.6 Normalized recall
– r_i: the hitlist rank of the i-th relevant document
– Compared against the worst-case and best-case rankings
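The formula was shown as a figure; the standard definition of normalized recall, consistent with the worst/best comparison above (a reconstruction, with n relevant documents in a hitlist of N documents), is:

R_{norm} = 1 - \frac{\sum_{i=1}^{n} r_i - \sum_{i=1}^{n} i}{n\,(N - n)}

R_norm = 1 when all relevant documents appear first (best case) and 0 when they all appear last (worst case).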
4.3.8 One-Parameter Criteria
– Combining recall and precision
– Classification accuracy
– Sliding ratio
– Point alienation
Combining recall and precision
F-measure
– [Jardine & van Rijsbergen, 1971]
– [Lewis & Gale, 1994]
Effectiveness
– [van Rijsbergen, 1979]
E = 1 - F, α = 1/(β² + 1)
α = 0.5 => harmonic mean of precision & recall
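Spelling out the relationship quoted above, using the standard form of van Rijsbergen's effectiveness measure (P = precision, R = recall):

E = 1 - \frac{1}{\alpha / P + (1 - \alpha) / R}, \qquad F = 1 - E, \qquad \alpha = \frac{1}{\beta^{2} + 1}

\alpha = 0.5 \;\Rightarrow\; F = \frac{2PR}{P + R}, \text{ the harmonic mean of precision and recall.}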
Classification accuracy
Correct identification of both relevant and irrelevant documents
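In terms of the contingency-table counts introduced earlier, classification accuracy is presumably the fraction of documents whose relevant/irrelevant status the system gets right (a hedged reconstruction):

\mathrm{Accuracy} = \frac{|Rel \cap Ret| + |NRel \cap NRet|}{N_{Doc}}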
Sliding ratio
– Imagine a nonbinary, metric Rel(d_i) measure
– Rank_1 and Rank_2 are rankings computed by two separate systems
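The slide's formula is not preserved; one common formulation of the sliding ratio (an assumption here) compares the cumulated relevance of the two rankings at each cutoff i:

SR(i) = \frac{\sum_{j=1}^{i} Rel\big(d_{Rank_1(j)}\big)}{\sum_{j=1}^{i} Rel\big(d_{Rank_2(j)}\big)}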
Point alienation
– Developed to measure human preference data
– Captures the fundamentally nonmetric nature of RelFbk
4.3.9 Test corpora
– More data required for a "test corpus"
– Standard test corpora
– TREC: Text REtrieval Conference
– TREC's refined queries
– TREC constantly expanding and refining its tasks
More data required for a "test corpus"
– Documents
– Queries
– Relevance assessments Rel(q, d)
– Perhaps other data too
– Classification data (Reuters)
– Hypertext graph structure (EB5)
Standard test corpora
TREC constantly expanding and refining its tasks
– Ad hoc query task
– Routing/filtering task
– Interactive task
Other measures
Expected search length (ESL)
– Length of the "path" as the user walks down the HitList
– ESL = number of irrelevant documents seen before each relevant document (see the sketch below)
– ESL for random retrieval
– ESL reduction factor
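A minimal sketch of the search-length computation described above for a single ranking: count the irrelevant documents a user passes before reaching the desired number of relevant ones. The function name, the `needed` parameter, and the toy data are illustrative assumptions; the expectation over tied ranks in the full ESL definition is omitted.

def expected_search_length(hitlist, relevant, needed=1):
    """Irrelevant documents seen while walking down the hitlist
    until `needed` relevant documents have been found."""
    relevant = set(relevant)
    irrelevant_seen, found = 0, 0
    for doc_id in hitlist:
        if doc_id in relevant:
            found += 1
            if found == needed:
                return irrelevant_seen
        else:
            irrelevant_seen += 1
    return irrelevant_seen  # fewer than `needed` relevant docs in the hitlist

# Toy example: two irrelevant documents precede the second relevant one.
print(expected_search_length(["d4", "d1", "d6", "d2", "d3"],
                             relevant={"d1", "d2"}, needed=2))  # -> 2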
4.5 Summary
– Discussed both metric and nonmetric relevance feedback
– The difficulties in getting users to provide relevance judgments for documents in the retrieved set
– Quantified several measures of system performance