Interactively Optimizing Information Retrieval Systems as a Dueling Bandits Problem. Tingdan Luo (tl3xd@virginia.edu), 05/02/2016.



Offline Learning to Rank
Goal: find an optimal combination of features.
Features: BM25, LM, PageRank, ...
Offline evaluation metrics: P@k, MAP, NDCG, MRR, ...
Downsides:
- Requires a train/test split with separate training and testing data.
- Requires human-annotated relevance judgments; such datasets can be small and expensive to build.
- The relevance of documents to queries can change over time (e.g., news search).
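As a concrete illustration of one of these offline metrics, here is a minimal NDCG@k sketch using the linear-gain form of DCG; the relevance labels in the example are made up:

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k relevance labels (linear gain)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    """NDCG@k: DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal_dcg if ideal_dcg > 0 else 0.0

# Hypothetical relevance labels of the returned documents, in ranked order.
print(ndcg_at_k([3, 2, 3, 0, 1, 2], k=5))
```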

Online Learning to Rank
Goal: find an optimal combination of features.
Features: BM25, LM, PageRank, ...
Learn from interactive user feedback in real time: training and testing happen at the same time.
No need to compute P@k, MAP, NDCG, MRR, ...
Adapts the ranking as user preferences change.
Feedback signals: mouse clicks, mouse movement.
Behavior-based metrics: clicks per query, time to first click, time to last click, abandonment rate, reformulation rate.

Ranker: a weight vector over the features, e.g. BM25: 1.3, LM: 0.8, PageRank: 2.5, ... Each document is scored by the weighted sum of its feature values, and documents are ranked by score.
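A minimal sketch of such a linear ranker; the weights match the slide, while the document feature values are made up for illustration:

```python
import numpy as np

# Hypothetical weights for the features [BM25, LM, PageRank], matching the slide.
weights = np.array([1.3, 0.8, 2.5])

def rank(doc_features, w):
    """Score each document by the weighted sum of its feature values and
    return document indices from highest to lowest score."""
    scores = doc_features @ w
    return np.argsort(-scores)

# Each row holds one candidate document's [BM25, LM, PageRank] values (illustrative).
docs = np.array([
    [2.1, 0.4, 0.9],
    [1.5, 0.7, 1.8],
    [3.0, 0.2, 0.1],
])
print(rank(docs, weights))  # document indices in ranked order
```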

Starting from the current weight vector (BM25: 1.3, LM: 0.8, PageRank: 2.5, ...), how do we generate a new ranker without calculating offline metrics such as P@k, MAP, NDCG or MRR?

How to generate a new ranker without calculation? Randomly! Perturb the current weights (BM25: 1.3, LM: 0.8, PageRank: 2.5, ...) in a random direction to obtain a candidate ranker.
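A minimal sketch of this random perturbation step, under the assumption that the candidate is w' = w + δu for a uniformly random unit vector u; the step size δ = 1.0 is an illustrative choice:

```python
import numpy as np

def random_unit_vector(d, rng):
    """Sample a direction uniformly at random from the unit sphere in R^d."""
    u = rng.standard_normal(d)
    return u / np.linalg.norm(u)

rng = np.random.default_rng(0)
w = np.array([1.3, 0.8, 2.5])   # current ranker weights (from the slide)
delta = 1.0                      # exploration step size (hypothetical setting)
w_candidate = w + delta * random_unit_vector(len(w), rng)
print(w_candidate)
```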

Dueling Bandit Gradient Descent

Which Ranker is better?

Team-Draft Interleaving decides which ranker is better: the two rankings are merged into a single interleaved result list shown to the user, each click is credited to the ranker that contributed the clicked document, and the ranker with more credited clicks wins the duel.
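A minimal sketch of this interleave-and-credit procedure; the document IDs and the click set are made up, and a live system would use real query results and logged user clicks:

```python
import random

def team_draft_interleave(ranking_a, ranking_b, rng=random):
    """Merge two rankings into one list, remembering which ranker ('team')
    contributed each document."""
    interleaved, team_a, team_b = [], set(), set()
    while True:
        rem_a = [d for d in ranking_a if d not in interleaved]
        rem_b = [d for d in ranking_b if d not in interleaved]
        if not rem_a and not rem_b:
            break
        # The team with fewer picks so far goes next; ties are broken by a coin flip.
        pick_a = len(team_a) < len(team_b) or (len(team_a) == len(team_b) and rng.random() < 0.5)
        if pick_a and rem_a:
            doc = rem_a[0]; team_a.add(doc)
        elif rem_b:
            doc = rem_b[0]; team_b.add(doc)
        else:
            doc = rem_a[0]; team_a.add(doc)
        interleaved.append(doc)
    return interleaved, team_a, team_b

def duel_winner(clicked, team_a, team_b):
    """Credit each click to the team that contributed the clicked document."""
    a, b = len(clicked & team_a), len(clicked & team_b)
    return "A" if a > b else "B" if b > a else "tie"

# Illustrative use with made-up document IDs and clicks.
merged, team_a, team_b = team_draft_interleave(["d1", "d2", "d3", "d4"],
                                               ["d3", "d1", "d5", "d2"])
print(duel_winner({"d3", "d5"}, team_a, team_b))
```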

Dueling Bandit Gradient Descent
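Putting the previous pieces together, here is a minimal sketch of the Dueling Bandit Gradient Descent update loop. The exploration step delta, the learning rate gamma, and the `duel` callback are illustrative assumptions; in a live system `duel` would run an interleaving comparison such as the team-draft sketch shown earlier rather than the placeholder coin flip used here so the snippet runs on its own, and the projection back onto a bounded feasible region is omitted for brevity:

```python
import numpy as np

def dbgd_step(w, delta, gamma, duel, rng):
    """One DBGD update: propose w' = w + delta * u for a random unit direction u,
    duel w' against w, and move a step gamma in direction u only if w' wins."""
    u = rng.standard_normal(len(w))
    u /= np.linalg.norm(u)              # uniformly random unit direction
    w_candidate = w + delta * u
    if duel(w, w_candidate):            # True when the candidate ranker wins the duel
        w = w + gamma * u
    return w

# Illustrative loop; the coin-flip duel is only a stand-in for a real
# interleaving experiment over live queries and clicks.
rng = np.random.default_rng(0)
w = np.zeros(3)                          # weights over [BM25, LM, PageRank]
for _ in range(1000):
    w = dbgd_step(w, delta=1.0, gamma=0.1,
                  duel=lambda a, b: rng.random() < 0.5, rng=rng)
print(w)
```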

Experiment Results
Making multiple comparisons per update has no impact on performance. Sampling multiple queries per update is nonetheless realistic, since a search system might be constrained to, e.g., making daily updates to its ranking function.
Performance on the validation and test sets closely follows training-set performance (so those results are omitted), which implies the method is not overfitting.

How to choose 𝛿 and 𝛾?

How to choose 𝛿 and 𝛾? Compare the average (across all iterations) and the final training NDCG@10 across different settings of 𝛿 and 𝛾.

Theoretical Analysis: Regret Formulation
Regret: the cumulative performance gap between the proposed algorithm and the optimal choice (see the formulation sketched below).
A good algorithm should achieve regret that is sublinear in T, which implies that the average regret per step decreases toward zero.
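A sketch of the dueling-bandits regret formulation in the notation commonly used for this setting; the paper's exact notation may differ slightly:

```latex
% Probability that ranker w beats ranker w' in a duel, shifted so that
% 0 means the two rankers are indistinguishable:
%   \epsilon(w, w') = P(w \succ w') - \tfrac{1}{2}
%
% Cumulative regret after T duels, where w_t and w_t' are the two rankers
% compared at step t and w^* is the best ranker:
\Delta_T = \sum_{t=1}^{T} \left[ \epsilon(w^*, w_t) + \epsilon(w^*, w_t') \right]
%
% Sublinear regret means \Delta_T / T \to 0 as T \to \infty.
```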

Regret Bound
The analysis rests on a Lipschitz condition on the comparison probabilities (sketched below).
Choosing 𝛿 and 𝛾 to achieve the stated regret bound requires knowledge of the Lipschitz constant L, which is typically not known in practical settings. Even so, sublinear regret is achievable for many choices of 𝛿 and 𝛾.
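One common way to write such a Lipschitz condition, and the shape of the resulting bound, is sketched below; the constants and the dependence on the radius of the parameter space are omitted and differ from the paper's exact statement:

```latex
% Lipschitz condition on the shifted comparison probabilities:
%   | \epsilon(w_1, w_2) - \epsilon(w_1', w_2') |
%       \le L \left( \| w_1 - w_1' \| + \| w_2 - w_2' \| \right)
%
% Under this kind of smoothness assumption, DBGD's expected regret is
% sublinear in T, on the order of
%   \mathbb{E}[\Delta_T] = O\!\left( T^{3/4} \sqrt{dL} \right),
% so the average regret \mathbb{E}[\Delta_T] / T \to 0 as T \to \infty.
```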

Limitations
- Not efficient enough: using a single random exploration vector per update leads to large variance.
- Does not take historical exploration into account.

Questions?