
1 Active Feedback: UIUC TREC 2003 HARD Track Experiments Xuehua Shen, ChengXiang Zhai Department of Computer Science University of Illinois, Urbana-Champaign

2 Goal of Participation
Our general goal is to test and extend language modeling retrieval methods for a variety of different tasks:
– HARD: active feedback (this talk)
– Robust: robust feedback (notebook paper)
– Genomics: semi-structured query model (notebook paper)
– Web: relevance propagation model (notebook paper)

3 Outline
– Active feedback
– Three methods
– HARD Track experiment design
– Results
– Conclusions & future work

4 What is Active Feedback?
– An IR system actively selects documents for obtaining relevance judgments.
– If a user is willing to judge k documents, which k documents should we present in order to maximize learning effectiveness?
– The aim is to minimize the user's effort.

5 Normal Relevance Feedback
[Diagram: the user issues a query, the retrieval engine runs it against the document collection, and the top-k results (d1: 3.5, d2: 2.4, …, dk: 0.5) are shown to the user; the user's judgments (d1 +, d2 -, …, dk -) are fed back to the engine.]

6 Active Feedback
[Diagram: the same feedback loop, except that the retrieval engine now decides which k documents from the collection to present to the user for judgment.]
Can we do better than just presenting the top k? (Consider redundancy…)

7 Active Feedback Methods
– Top-K (normal feedback): present the k top-ranked documents.
– Gapped Top-K: present every g-th document from the top of the ranking (e.g., ranks 1, 3, 5, … of the top results).
– K-cluster centroid: cluster the top results into k clusters and present the centroid document of each cluster.
Gapped Top-K and K-cluster centroid aim at high diversity… (see the sketch below)
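As a concrete illustration, here is a minimal Python sketch of the three selection strategies, assuming the engine returns a scored ranking and that term vectors are available for the top results; the gap size, the candidate pool size, and the use of k-means for the clustering step are illustrative assumptions, not details given on the slide.

```python
# Illustrative sketch of the three active feedback selection strategies.
# `ranking` is a list of doc ids ordered by retrieval score; `doc_vectors`
# maps doc ids to term vectors. k-means is one possible clustering choice.
import numpy as np
from sklearn.cluster import KMeans

def top_k(ranking, k):
    """Normal feedback: present the k top-ranked documents."""
    return ranking[:k]

def gapped_top_k(ranking, k, gap=2):
    """Present every `gap`-th document from the top (ranks 1, 1+gap, ...)."""
    return ranking[::gap][:k]

def k_cluster_centroid(ranking, k, doc_vectors, pool_size=100):
    """Cluster the top `pool_size` results into k clusters and present the
    document closest to each cluster centroid (high diversity)."""
    pool = ranking[:pool_size]
    X = np.array([doc_vectors[d] for d in pool])
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    selected = []
    for c in range(k):
        members = [i for i, lab in enumerate(km.labels_) if lab == c]
        # pick the member document nearest to its cluster centroid
        best = min(members,
                   key=lambda i: np.linalg.norm(X[i] - km.cluster_centers_[c]))
        selected.append(pool[best])
    return selected
```

The latter two strategies deliberately trade some ranking quality for diversity in the judged set.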

8 Evaluating Active Feedback in the HARD Track
[Diagram: for each query, the initial results (no feedback) are used to select 6 passages with one of the methods (top-k, gapped, clustering); the passages are placed on a clarification form; the user returns the completed form with positive and negative judgments, which are used to produce the feedback results (doc-based or passage-based query updating).]

9 Retrieval Methods (Lemur toolkit)
[Diagram: query Q and document D are scored with Kullback-Leibler divergence; active feedback selects the feedback docs F = {d1, …, dn}; mixture-model feedback updates the query model, learning only from the relevant docs; default parameter settings are used throughout.]
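For reference, KL-divergence scoring and mixture-model feedback take the following standard form in the language-modeling framework; the mixing weights λ and α are left symbolic below because the slide only says that default parameter settings were used.

```latex
% KL-divergence retrieval: rank D by the (negated) divergence between the
% query model \theta_Q and the smoothed document model \theta_D
\mathrm{score}(Q,D) = -D_{\mathrm{KL}}\!\left(\theta_Q \,\Vert\, \theta_D\right)
  = \sum_{w} p(w \mid \theta_Q)\,\log p(w \mid \theta_D) + \mathrm{const}(Q)

% Mixture-model feedback: the feedback docs F are modeled as a mixture of a
% topic model \theta_F and the collection model p(w \mid C); \theta_F is
% estimated by EM and interpolated into the query model with weight \alpha
\log p(F \mid \theta_F) = \sum_{d \in F}\sum_{w} c(w,d)\,
  \log\!\big[(1-\lambda)\,p(w \mid \theta_F) + \lambda\,p(w \mid C)\big]

\theta_Q' = (1-\alpha)\,\theta_Q + \alpha\,\theta_F
```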

10 Results
– Top-k is always worse than gapped top-k and the clustering method.
– Clustering generates fewer, but higher quality feedback examples.
– Passage-based query model updating performs better than document-based updating.

11 Comparison of Three Active Feedback Methods

TREC 2003 (official):
  Method      #Rel   Include judged docs      Exclude judged docs
                     MAP      Pr@20doc        MAP      Pr@20doc
  Top-K       146    0.325    0.498           0.302    0.470
  Gapped      150    0.328    0.504           0.311    0.477
  Clustering  105    0.330*   0.514*          0.326*   0.503*

AP88-89:
  Method      #Rel   Include judged docs      Exclude judged docs
                     MAP      Pr@20doc        MAP      Pr@20doc
  Top-K       198    0.230    0.327           0.193    0.300
  Gapped      180    0.234*   0.342*          0.214    0.322
  Clustering  118    0.232    0.341           0.221*   0.328*

(* = best per column; bold in the original slide marked the worst.)
Top-K is the worst! Clustering uses the fewest relevant docs.

12 Appropriate Evaluation of Active Feedback
– Original DB with judged docs: can't tell if the ranking of un-judged documents is improved.
– Original DB without judged docs: different methods end up with different test documents.
– New DB: shows the learning effect more explicitly, but the docs must be similar to the original docs.
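A small Python sketch of the "exclude judged docs" option, assuming per-query rankings, relevance sets, and the set of documents shown to the user are available; this residual-MAP formulation is one reasonable reading of that setting, not code from the experiments.

```python
# Residual-collection style evaluation: remove the documents shown to the
# user (judged docs) from both the ranking and the relevance set before
# computing MAP, so each method is scored only on how well it ranks
# documents it did NOT already obtain judgments for.
def residual_average_precision(ranking, relevant, judged):
    residual_ranking = [d for d in ranking if d not in judged]
    residual_relevant = relevant - judged
    if not residual_relevant:
        return 0.0
    hits, precision_sum = 0, 0.0
    for rank, doc in enumerate(residual_ranking, start=1):
        if doc in residual_relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(residual_relevant)

def residual_map(runs):
    """`runs` maps query id -> (ranking list, relevant set, judged set)."""
    aps = [residual_average_precision(r, rel, jud)
           for r, rel, jud in runs.values()]
    return sum(aps) / len(aps)
```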

13 Comparison of Different Test Data (Learning on AP88-89)

  Test Data                        Method      #Rel   MAP      Pr@20doc
  AP88-89, including judged docs   Top-K       198    0.230    0.327
                                   Gapped      180    0.234*   0.342*
                                   Clustering  118    0.232    0.341
  AP88-89, excluding judged docs   Top-K       198    0.193    0.300
                                   Gapped      180    0.214    0.322
                                   Clustering  118    0.221*   0.328*
  AP90                             Top-K       198    0.220    0.278
                                   Gapped      180    0.222    0.283*
                                   Clustering  118    0.223*   0.282

(* = best per column)
Top-K is consistently the worst! Clustering generates fewer, but higher quality examples.

14 Effectiveness of Query Model Updating: Doc-based vs. Passage-based

  Judgments    Updating Method          MAP      Pr@20doc
  None         Baseline (no updating)   0.308    0.485
  Gapped       Doc-based                0.332    0.503
               Passage-based            0.351    0.517
               Improvement              +5.7%    +2.7%
  Clustering   Doc-based                0.329    0.502
               Passage-based            0.347    0.522
               Improvement              +5.4%    +4.0%

– Mixture-model query updating is effective.
– Passage-based updating is consistently better than doc-based.
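To make the doc-based vs. passage-based distinction concrete, here is a hedged Python sketch in which the positively judged text is turned into a feedback language model and interpolated with the query model; the maximum-likelihood feedback model and the weight `alpha` are stand-ins for Lemur's EM-estimated mixture model and its default parameters, which the slides do not spell out.

```python
from collections import Counter

def feedback_model(texts):
    """Maximum-likelihood unigram model over the positively judged text
    (a simple stand-in for the EM-estimated mixture-model topic)."""
    counts = Counter()
    for t in texts:
        counts.update(t.split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def update_query_model(query_model, positive_texts, alpha=0.5):
    """Interpolate the original query model with the feedback model.
    Doc-based updating passes full documents as `positive_texts`;
    passage-based updating passes only the judged passages."""
    fb = feedback_model(positive_texts)
    vocab = set(query_model) | set(fb)
    return {w: (1 - alpha) * query_model.get(w, 0.0) + alpha * fb.get(w, 0.0)
            for w in vocab}
```

The only difference between the two settings is which text supplies the counts, which is why passage-based updating can yield a sharper feedback model.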

15 Conclusions
– Introduced the active feedback problem
– Proposed and tested three methods for active feedback (top-k, gapped top-k, clustering)
– Studied the issue of evaluating active feedback methods
– Results show that:
  – presenting the top k is not the best strategy
  – clustering can generate fewer, but higher quality feedback examples

16 Future Work
– Explore other methods for active feedback (e.g., negative feedback, the MMR method)
– Develop a general framework that:
  – combines all the utility factors (e.g., being informative and best for learning)
  – can model different questions (e.g., both term selection and relevance judgments)
– Further study how to evaluate active feedback methods

