Linear Submodular Bandits and their Application to Diversified Retrieval
Yisong Yue (CMU) & Carlos Guestrin (CMU)

Optimizing Recommender Systems

- Every day, users come to a news portal.
- For each user, the news portal recommends L articles to cover the user's interests.
- Users provide feedback (clicks, ratings, "likes").
- The system integrates this feedback for future use.

Challenge 1: Making Diversified Recommendations

The system should recommend optimally diversified sets of articles.

Redundant recommendations:
- "Israel implements unilateral Gaza cease-fire :: WRAL.com"
- "Israel unilaterally halts fire, rockets persist"
- "Gaza truce, Israeli pullout begin | Latest News"
- "Hamas announces ceasefire after Israel declares truce - ..."
- "Hamas fighters seek to restore order in Gaza Strip - World - Wire ..."

Diversified recommendations:
- "Israel implements unilateral Gaza cease-fire :: WRAL.com"
- "Obama vows to fight for middle class"
- "Citigroup plans to cut 4500 jobs"
- "Google Android market tops 10 billion downloads"
- "UC astronomers discover two largest black holes ever found"

Challenge 2: Personalization

Modeling Diversity via Submodular Utility Functions

- We assume a set of D concepts or topics.
- Users are modeled by how interested they are in each topic.
- Let F_i(A) denote how well the set of articles A covers topic i (the "topic coverage function").
- We model user utility as F(A|w) = w^T [F_1(A), ..., F_D(A)].
- Goal: recommend a set of articles that optimally covers the topics that interest the user.
- Each topic coverage function F_i(A) is monotone submodular. A function F is submodular if, for all sets A ⊆ B and any article a,
  F(A ∪ {a}) − F(A) ≥ F(B ∪ {a}) − F(B),
  i.e., the benefit of recommending a second (redundant) article is smaller than that of adding the first.

Linear Submodular Bandits Problem

At each iteration t:
- There is a set of available articles, A_t, each represented using D submodular basis functions.
- The algorithm selects a set of L articles from A_t, recommends it to the user, and receives feedback.

Assumption: Pr(like | a, A) = w^T Δ(a|A) (conditional submodular independence), where Δ(a|A) collects the marginal gains of the D basis functions from adding article a to the set A already shown.

Regret: (1 − 1/e)·OPT minus the sum of rewards.
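The utility model above can be made concrete with a small sketch. The Python snippet below uses hypothetical topic probabilities and user weights, and instantiates F_i with probabilistic coverage (one concrete choice, discussed later in the talk); it checks the diminishing-returns property that makes F(A|w) submodular: a near-duplicate article contributes much less than the first copy did.

```python
# Sketch of the linear submodular utility model.
# Assumptions: articles are vectors of coverage probabilities P(i|a) over D
# topics, and F_i is the probabilistic coverage function
#   F_i(A) = 1 - prod_{a in A} (1 - P(i|a)).
def topic_coverage(articles, topic):
    p_miss = 1.0
    for a in articles:
        p_miss *= 1.0 - a[topic]
    return 1.0 - p_miss

def utility(articles, w):
    # F(A|w) = w^T [F_1(A), ..., F_D(A)]
    return sum(w_i * topic_coverage(articles, i) for i, w_i in enumerate(w))

w  = [0.7, 0.2, 0.1]   # hypothetical user interests over D = 3 topics
a1 = [0.9, 0.1, 0.0]   # article mostly about topic 0
a2 = [0.8, 0.2, 0.0]   # near-duplicate of a1
a3 = [0.0, 0.1, 0.9]   # article about topic 2

gain_first  = utility([a1], w) - utility([], w)
gain_second = utility([a1, a2], w) - utility([a1], w)
# Diminishing returns: the redundant second article helps far less.
assert gain_second < gain_first
```

Adding the diverse article a3 instead of the duplicate a2 would yield a much larger marginal gain, which is exactly what the greedy selection in the next section exploits.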
Properties of Submodular Functions

- Sums of submodular functions are submodular, so F(A|w) is submodular.
- Exact inference is NP-hard! The greedy algorithm yields a (1 − 1/e) approximation bound.
- Incremental gains are locally linear.
- Both properties will be exploited by our online learning algorithm.

We address two challenges: diversified recommendations, and exploration for personalization.

Example: Probabilistic Coverage [El-Arini et al., '09]

- Each article a has probability P(i|a) of covering topic i.
- Define the topic coverage function for a set A as F_i(A) = 1 − Π_{a ∈ A} (1 − P(i|a)).
- It is straightforward to show that F_i is monotone submodular.

Exploration vs. Exploitation

- Different users have different interests.
- We can only learn a user's interests by recommending articles and receiving feedback.
- This is the exploration-versus-exploitation dilemma: we model it as a bandit problem!

LSBGreedy

- Maintain a mean estimate and confidence interval for the user's topic interests.
- Greedily recommend the articles with the highest upper-confidence utility. (In the slide's example, which plots the mean estimate and its uncertainty by topic, the algorithm chooses an article about the economy.)
- Theorem: with probability 1 − δ, the average regret shrinks to zero as the number of rounds grows.

News Recommender User Study

- 10 days, 10 articles per day.
- Compared against:
  - Multiplicative Weighting (no exploration) [El-Arini et al., '09]
  - Ranked Bandits + LinUCB (a reduction approach that does not directly model diversity) [Radlinski et al., '08; Li et al., '10]

Comparison                  Win / Tie / Loss    Gain per Day (% Likes)
LSBGreedy vs. Static        24 / 0 / –          –
LSBGreedy vs. MW            24 / 1 / –          –
LSBGreedy vs. RankLinUCB    21 / 2 / –          –

Comparing the learned weights for two sessions (LSBGreedy vs. MW):
- In the 1st session, MW overfits to the "world" topic.
- In the 2nd session, the user liked few articles, and MW did not learn anything.
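The LSBGreedy selection step can be sketched as follows. This is a deliberately simplified illustration, not the paper's algorithm: it keeps an independent mean and observation count per topic with an alpha/sqrt(count) confidence width (the actual method maintains a joint regression estimate of w with a confidence ellipsoid), uses probabilistic coverage as the basis functions, and all numbers, including the exploration constant alpha, are hypothetical.

```python
import math

# Simplified LSBGreedy-style sketch: greedily pick L articles by
# upper-confidence marginal gain under per-topic optimism.
def coverage(articles, topic):
    p_miss = 1.0
    for a in articles:
        p_miss *= 1.0 - a[topic]
    return 1.0 - p_miss

def lsb_greedy(candidates, mu, counts, L, alpha=0.5):
    """mu[i]: mean estimate of the user's interest in topic i.
    counts[i]: how often topic i has been observed (drives the confidence width)."""
    chosen = []
    for _ in range(L):
        best, best_score = None, -1.0
        for a in candidates:
            if a in chosen:
                continue
            score = 0.0
            for i in range(len(mu)):
                ucb = mu[i] + alpha / math.sqrt(counts[i])  # optimistic weight
                gain = coverage(chosen + [a], i) - coverage(chosen, i)
                score += ucb * gain
            if score > best_score:
                best, best_score = a, score
        chosen.append(best)
    return chosen

# Three hypothetical articles over 3 topics: two near-duplicates and one diverse.
articles = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.0], [0.0, 0.1, 0.9]]
picked = lsb_greedy(articles, mu=[0.5, 0.5, 0.5], counts=[1, 1, 1], L=2)
```

With uniform uncertainty, the second greedy pick is the diverse article rather than the near-duplicate: its upper-confidence marginal gain is far larger once the first article's topics are already covered.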