Structured Learning of Two-Level Dynamic Rankings. Karthik Raman, Thorsten Joachims & Pannaga Shivaswamy, Cornell University
Handling Query Ambiguity Non-Diversified Diversified
Ranking Metrics: Two Extremes. Intent distribution: P(t1) = 1/2, P(t2) = 1/3, P(t3) = 1/6; each document d has a per-intent utility U(d|t), and over all intents we need to take the expectation. In the example, the non-diversified ranking accumulates per-intent utilities of 8, 6 and 3, giving expected utility 8/2 + 6/3 + 3/6; the diversified ranking accumulates 4, 3 and 3, giving 4/2 + 3/3 + 3/6.
KEY: Diminishing Returns. An original concept from economics, resembling a concave function: for a given query and user intent, the marginal benefit of seeing an additional relevant document decreases.
General Ranking Metric. Given a ranking θ = (d1, d2, …, dk) and a concave function g, take the expectation over all intents: U(θ) = Σ_t P(t) · g(Σ_i U(d_i|t)). Example with g(x) = √x and per-intent utilities 8, 6, 3: U(θ) = √8/2 + √6/3 + √3/6. A position discount can also be incorporated.
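A minimal sketch of this metric in Python, using the slide's example numbers (intent probabilities 1/2, 1/3, 1/6 and per-intent utilities 8, 6, 3 are taken from the example; everything else is illustrative):

```python
import math

# Intent distribution and per-intent utilities of a fixed ranking,
# as in the slide's example.
P = {"t1": 1/2, "t2": 1/3, "t3": 1/6}
U = {"t1": 8, "t2": 6, "t3": 3}

def expected_utility(P, U, g=math.sqrt):
    """Expectation over intents of the concave function g applied
    to the ranking's accumulated per-intent utility."""
    return sum(P[t] * g(U[t]) for t in P)

# sqrt(8)/2 + sqrt(6)/3 + sqrt(3)/6
print(expected_utility(P, U))
```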
Computing Optimal Rankings. These utility functions are submodular. Given the utility function, a ranking that optimizes it can be found with a greedy algorithm: at each iteration, look at the marginal benefits and choose the document that maximizes the marginal benefit. The algorithm has a (1 − 1/e) approximation bound.
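The greedy step above can be sketched as follows; the document-intent utilities are made-up illustrative numbers, not data from the paper:

```python
import math

# Illustrative per-intent document utilities (assumed numbers).
P = {"t1": 1/2, "t2": 1/3, "t3": 1/6}
doc_utils = {
    "d1": {"t1": 4, "t2": 0, "t3": 0},
    "d2": {"t1": 0, "t2": 3, "t3": 0},
    "d3": {"t1": 0, "t2": 0, "t3": 3},
    "d4": {"t1": 4, "t2": 3, "t3": 0},
}

def utility(ranking, g=math.sqrt):
    """U(ranking) = sum_t P(t) * g(accumulated utility for intent t)."""
    return sum(P[t] * g(sum(doc_utils[d][t] for d in ranking)) for t in P)

def greedy_ranking(docs, k, g=math.sqrt):
    """At each iteration, add the document with the highest marginal
    benefit; (1 - 1/e) approximation for submodular utilities."""
    ranking, remaining = [], set(docs)
    for _ in range(k):
        best = max(remaining,
                   key=lambda d: utility(ranking + [d], g) - utility(ranking, g))
        ranking.append(best)
        remaining.remove(best)
    return ranking

print(greedy_ranking(doc_utils, 2))
```

With these numbers, d4 is picked first (it covers two intents), after which d1's marginal gain on t1 beats the remaining documents.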
Experimental Settings. Experiments run on TREC and WEB datasets; both have document-intent annotations, with important differences: avg. intents per doc 20 vs. 4.5, and Prob(dominant intent) 0.28 vs. 0.52. Different utility functions compared (this is the function that will be maximized to produce the dynamic ranking): PREC: g(x) = x; SQRT: g(x) = √x; LOG: g(x) = log(1+x); SAT-2: g(x) = min(x,2); SAT-1: g(x) = min(x,1).
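The five utility-shaping functions can be written down directly; a small reference sketch:

```python
import math

# The five concave utility functions compared in the experiments.
G = {
    "PREC":  lambda x: x,             # no diminishing returns
    "SQRT":  math.sqrt,               # mild diminishing returns
    "LOG":   lambda x: math.log(1 + x),
    "SAT-2": lambda x: min(x, 2),     # saturates after 2 units of utility
    "SAT-1": lambda x: min(x, 1),     # saturates after 1 unit
}

for name, g in G.items():
    print(name, g(4))
```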
Turning the Knob. Results: the choice of utility function spans a spectrum from no diversification (PREC) through LOG, SQRT and SAT-2 to maximum diversity (SAT-1).
How Do We Break This Trade-Off? KEY IDEA: Use user interaction as feedback to disambiguate the query on the fly, so as to present more results relevant to the user (Brandt et al., WSDM '11).
Two-Level Dynamic Ranking: Example. First level: start with a diverse ranking. A user interested in the ML method expands to see more such results.
Two-Level Dynamic Ranking: Example. Second-level ranking: a new ranking related to the ML method.
Two-Level Dynamic Ranking: Example. A two-level ranking Θ consists of head documents d1, …, d4; expanding a head d_i reveals its tail documents d_{i,1}, d_{i,2}, d_{i,3}, while skipping moves on to the next head.
Utility of Two-Level Rankings. The definition of utility extends to two-level rankings: the concave function is applied to a (weighted) sum over all rows, and the resulting measure is a nested submodular function. The greedy algorithm similarly extends to a nested greedy algorithm for two-level rankings: (1) given a head document, greedily build a row; (2) at each iteration, among all rows from (1), choose the one with the highest marginal benefit. An approximation bound can still be proven: ≈ 0.47.
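A highly simplified sketch of the nested greedy idea. The documents, intents, and the head/tail weights `HEAD_W`/`TAIL_W` are assumptions for illustration, not the paper's exact formulation:

```python
import math

# Assumed toy data: two intents, four documents.
P = {"t1": 0.5, "t2": 0.5}
doc_utils = {
    "d1": {"t1": 4, "t2": 0},
    "d2": {"t1": 0, "t2": 4},
    "d3": {"t1": 2, "t2": 1},
    "d4": {"t1": 1, "t2": 2},
}
HEAD_W, TAIL_W = 1.0, 0.5  # assumed row weights for head vs. tail docs

def two_level_utility(rows, g=math.sqrt):
    """Concave g applied to the weighted sum over all rows, per intent,
    then expectation over intents. rows = [(head, [tail docs]), ...]."""
    total = 0.0
    for t, p in P.items():
        x = sum(HEAD_W * doc_utils[h][t]
                + sum(TAIL_W * doc_utils[d][t] for d in tail)
                for h, tail in rows)
        total += p * g(x)
    return total

def greedy_row(head, pool, row_len, rows):
    """(1) Given a head document, greedily build its tail."""
    tail = []
    for _ in range(row_len):
        cands = [d for d in pool if d not in tail]
        if not cands:
            break
        tail.append(max(cands,
                        key=lambda d: two_level_utility(rows + [(head, tail + [d])])))
    return (head, tail)

def nested_greedy(docs, n_rows=2, row_len=1):
    """(2) Each iteration, among all candidate rows, add the one
    with the highest marginal benefit."""
    rows, used = [], set()
    for _ in range(n_rows):
        avail = [d for d in docs if d not in used]
        cand_rows = [greedy_row(h, [d for d in avail if d != h], row_len, rows)
                     for h in avail]
        best = max(cand_rows,
                   key=lambda r: two_level_utility(rows + [r]) - two_level_utility(rows))
        rows.append(best)
        used.update([best[0], *best[1]])
    return rows

print(nested_greedy(list(doc_utils)))
```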
Static vs. Two-Level Dynamic Rankings (results on the TREC and WEB datasets).
Structural-SVM Based Learning Framework. What if we do not have the document-intent labels? Solution: use TERMS as a substitute for intents, i.e., the utility measure with words replacing intents, together with the similarity between head and tail documents. Structural SVMs are used to predict such complex outputs.
Learning Experiments Compare with different static-learning baselines:
Main Take-Aways. Using submodular performance measures to smoothly bridge the gap between depth and diversity. Breaking the trade-off between depth and diversity via user interaction using two-level dynamic rankings. Given training data, predicting dynamic rankings using the structural SVM-based learning algorithm. Code available at: http://www.cs.cornell.edu/~karthik/code/svm-dyn.html