Modelling Relevance and User Behaviour in Sponsored Search using Click-Data
Adarsh Prasad, IIT Delhi
Advisors: Dinesh Govindaraj, S.V.N. Vishwanathan*
Group: Revenue and Relevance
* Visiting Researcher from Purdue

Overview
Click-data seems to be the perfect source of information when deciding which ads to show in answer to a query: it can be thought of as the result of users voting in favour of the documents they find interesting. This information can be fed into the ranker, used to tune search parameters, or even used as training points for the ranker. The aim of the project is to develop a model that takes in click-data and generates output, in the form of constraints or updated ranking scores, as input to the ranker.

Motivation
The quality of training points is of critical importance for learning a ranking function. Currently, labeled data is collected using human judges, but human labeling is time-consuming and labor-intensive. We also need to ensure the "temporal relevance" of ads: something relevant today might not be relevant six months later, so labeling must be repeated and the labeling process needs to be automated.
Main difficulty – presentation bias:
- Results at lower positions are less likely to be clicked even if they are relevant (position bias).
- Clicks depend on the other ads being shown (externalities).
Example [1]: Query: myspace, Market: U.K.
- Ranking 1: Pos 1: uk.myspace.com, CTR = 0.97; Pos 2: www.myspace.com, CTR = 0.11
- Ranking 2: Pos 1: www.myspace.com, CTR = 0.97
The same URL, www.myspace.com, gets a CTR of 0.11 at position 2 but 0.97 at position 1, so raw click counts cannot be read directly as relevance.
[1] Olivier Chapelle et al. A Dynamic Bayesian Network Click Model for Web Search Ranking

Procedure
Use click data as a target: useful for markets with few editorial judgments. Train on pairwise preferences, with two sets of preferences: P_E from editorial judgments and P_C coming from click modeling. Minimize a combined objective over both sets, e.g. a weighted sum of the pairwise losses over P_E and P_C.
For learning a web search function, clicks can be used as a target [2] or as a feature [3].
As a target (see the sketch after this list):
1. Derive preference relations on the basis of click patterns and feed them as constraints to the ranker (Rocky-Road):
   - Position and order-of-click based constraints [4]
   - Aggregate constraints
As a feature:
1. Sample clicked ads and label them as relevant.
2. Types of sampling: random, or position-based weighted (a user clicking the ml-4 ad is a stronger signal of relevance than a user clicking the ml-1 ad, where ml-k denotes the k-th mainline ad slot).
3. Feed the sampled ads to a binary classifier.
[2] Joachims et al. Optimizing Search Engines using Clickthrough Data
[3] Agichtein et al. Improving Web Search Ranking by Incorporating User Behavior Information
[4] Joachims et al. Accurately Interpreting Clickthrough Data as Implicit Feedback
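To make the position and order-of-click constraints concrete, below is a minimal sketch of the classic "Click > Skip Above" rule from [4]. The log format and function name are illustrative assumptions, not the project's actual implementation.

```python
from typing import List, Tuple

def skip_above_preferences(ranking: List[str],
                           clicked: List[int]) -> List[Tuple[str, str]]:
    """'Click > Skip Above' [4]: a clicked result is preferred over every
    unclicked result ranked above it.

    ranking: ad ids from top to bottom.
    clicked: 0-based positions the user clicked.
    Returns (preferred, less_preferred) pairs to feed to the ranker
    as constraints.
    """
    clicked_set = set(clicked)
    prefs = []
    for pos in clicked:
        for above in range(pos):
            if above not in clicked_set:  # user skipped this higher-ranked ad
                prefs.append((ranking[pos], ranking[above]))
    return prefs

# Ads A-D shown top to bottom; the user clicks only C (position 2),
# so C is inferred to be preferred over the skipped A and B.
print(skip_above_preferences(["A", "B", "C", "D"], [2]))
# [('C', 'A'), ('C', 'B')]
```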

Results

Improvement by match type:

Method                           EXACTMATCH   BROADMATCH   PHRASEMATCH   SMARTMATCH
Sampling                         +0.39%       +1.02%       -0.06%        -0.5%
Position and Order Constraints   +1.22%       +5.93%       +4.15%        +0.38%
Aggregate Constraints            +0.2%        +5.17%       +0.77%        +0.5%

Improvement by segment:

Method                           SAME     SUPERSET   DISJOINT
Sampling                         +5.72%   +4.22%     -6.28%
Position and Order Constraints   +3.1%    +2.28%     -3.9%
Aggregate Constraints            +7.4%    +5.28%     -11.3%

Log loss:

Method                           Log Loss (Label Based)   Weighted LL
Sampling                         +0.001%
Position and Order Constraints   +3.07%
Aggregate Constraints            +1.75%

Background on Click Models
Click models use CTR (click-through rate) data. The basic factorization is
Pr(click) = Pr(examination) × Pr(click | examination),
where Pr(click | examination) is interpreted as the relevance of the result. We need user browsing models to estimate Pr(examination). For example, if Pr(examination) = 0.5 and relevance = 0.4, the observed CTR is 0.2.

Notation
Φ(i): result at position i
E_i: examination event at position i
C_i: click event at position i

Examination Hypothesis
Richardson et al, WWW 2007:
Pr(C_i = 1) = Pr(E_i = 1) Pr(C_i = 1 | E_i = 1), with Pr(E_i = 1) = α_i.
α_i is the position bias: it depends solely on position, and can be estimated by looking at the CTR of the same result in different positions.
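As an illustration of that estimation idea, here is a minimal sketch assuming a simple (result_id, position, clicked) impression log; normalizing α_1 = 1 and averaging per-result CTR ratios is just one simple estimator.

```python
from collections import defaultdict

def estimate_position_bias(impressions):
    """Estimate relative position bias alpha_i under the examination
    hypothesis CTR(result, pos) ~ alpha_pos * relevance(result).

    impressions: iterable of (result_id, position, clicked) with
    positions starting at 1. For each result shown at both position 1
    and position i, CTR(result, i) / CTR(result, 1) estimates
    alpha_i / alpha_1; we average these ratios and set alpha_1 = 1.
    """
    shows, clicks = defaultdict(int), defaultdict(int)
    for result, pos, clicked in impressions:
        shows[(result, pos)] += 1
        clicks[(result, pos)] += int(clicked)

    ctr = {key: clicks[key] / shows[key] for key in shows}
    ratios = defaultdict(list)
    for (result, pos), value in ctr.items():
        if pos != 1 and ctr.get((result, 1), 0) > 0:
            ratios[pos].append(value / ctr[(result, 1)])

    alpha = {1: 1.0}
    for pos, rs in ratios.items():
        alpha[pos] = sum(rs) / len(rs)
    return alpha
```

In practice one would also require a minimum impression count per (result, position) pair before trusting a CTR ratio.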

Using Prior Clicks
[Figure: two result lists R1–R5 with observed clicks. With clicks on R1 and R3, Pr(E_5 | C_1, C_3) = 0.5; with a click on R1 only, Pr(E_5 | C_1) = 0.3. Examination of position 5 is more likely when the closest prior click is nearby.]

Examination Depends on Prior Clicks
- Cascade model
- Dependent click model (DCM)
- User browsing model (UBM) [Dupret & Piwowarski, SIGIR 2008]: more general and more accurate than Cascade and DCM; conditions Pr(examination) on the closest prior click.
- Bayesian browsing model (BBM) [Liu et al, KDD 2009]: same user behavior model as UBM, but uses the Bayesian paradigm for relevance.

User Browsing Model (UBM)
Use the position of the closest prior click to predict Pr(examination):
Pr(E_i = 1 | C_1:i−1) = α_i β_i,p(i)
Pr(C_i = 1 | C_1:i−1) = Pr(E_i = 1 | C_1:i−1) Pr(C_i = 1 | E_i = 1)
where α_i is the position bias and p(i) is the position of the closest prior click. Prior clicks affect examination but not relevance.
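For concreteness, a minimal sketch of how UBM scores an observed session under these two equations; the array layout and the convention used for "no prior click yet" are assumptions.

```python
import numpy as np

def ubm_session_probs(alpha, beta, relevance, clicks):
    """Per-position probability of the observed clicks/skips under UBM.

    alpha[i]     : position bias at position i (0-based).
    beta[i][d]   : examination term at position i when the closest prior
                   click was at position d - 1 (d = 0 means no prior click).
    relevance[i] : Pr(C_i = 1 | E_i = 1) for the ad at position i.
    clicks[i]    : observed 0/1 click at position i.
    """
    n = len(clicks)
    probs = np.empty(n)
    last_click = -1
    for i in range(n):
        p_exam = alpha[i] * beta[i][last_click + 1]   # Pr(E_i = 1 | C_1:i-1)
        p_click = p_exam * relevance[i]               # Pr(C_i = 1 | C_1:i-1)
        probs[i] = p_click if clicks[i] else 1.0 - p_click
        if clicks[i]:
            last_click = i
    return probs

# Session log-likelihood: np.log(ubm_session_probs(...)).sum()
```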

Other Related Work
Examination depends on prior clicks and prior relevance:
- Click chain model (CCM)
- General click model (GCM)
Post-click models:
- Dynamic Bayesian network model
- Session utility model

User Browsing in Sponsored Search
Is user browsing in sponsored search similar to browsing in web search? Generally, the assumption in organic search is that users examine and click in a linear, top-to-bottom fashion. We observed that in sponsored search, where the number of returned results is small, a fair share (~30%) of users click out of order. Users behaving in a non-linear fashion is a strong signal which may contain important information.
To combine position and temporal behavior, the statistic x we count is the difference between the positions of temporally consecutive clicks. Example: if the user clicks on ml-1 and then ml-2, x = -1; if ml-2 and then ml-1, x = 1; and so on.
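A small sketch of how this statistic can be tallied from click logs; the session format (clicked positions in time order) is an assumption.

```python
from collections import Counter

def click_order_histogram(sessions):
    """Histogram of x = pos(click t) - pos(click t+1) over temporally
    consecutive click pairs. E.g. a session [1, 2] (ml-1 then ml-2)
    contributes x = -1; [2, 1] contributes x = +1. Negative x means the
    user moved down the page, positive x means a jump back up.
    """
    hist = Counter()
    for clicks in sessions:
        for a, b in zip(clicks, clicks[1:]):
            hist[a - b] += 1
    return hist

print(click_order_histogram([[1, 2], [1, 3], [2, 1]]))
# Counter({-1: 1, -2: 1, 1: 1}) -> one out-of-order pair (x = 1)
```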

A New Model
Allow users to move in a non-linear fashion, and incorporate the notion of externalities, i.e. perceived relevance changes with other clicks. For learning the parameters, we can use the EM algorithm:
1. In the E-step, we estimate the hidden variables with a forward–backward algorithm.
2. In the M-step, we have closed-form solutions that maximize the expected log-likelihood.
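The slide does not spell out the model's transition structure, so the following is only a generic illustration of the E-step/M-step split it describes: if the user's examination path is treated as a hidden Markov chain over ad positions (so jumps back up the page are allowed) with clicks as Bernoulli emissions, plain Baum–Welch gives exactly a forward–backward E-step and a closed-form M-step. All distributions and names here are illustrative assumptions, not the talk's actual model.

```python
import numpy as np

def em_step(A, pi, rel, sessions):
    """One EM iteration for a toy non-linear browsing model: the hidden
    state is the position currently examined (Markov chain with transition
    matrix A) and the observation at each step is a 0/1 click with
    probability rel[state]. Standard Baum-Welch.
    """
    n = len(pi)
    A_num = np.zeros((n, n))
    pi_new = np.zeros(n)
    visits = np.zeros(n)          # expected examinations per position
    click_mass = np.zeros(n)      # expected clicks per position
    for clicks in sessions:
        T = len(clicks)
        emis = np.array([[rel[s] if c else 1.0 - rel[s] for s in range(n)]
                         for c in clicks])                      # (T, n)
        # E-step: scaled forward-backward.
        alpha = np.zeros((T, n)); beta = np.ones((T, n)); scale = np.zeros(T)
        alpha[0] = pi * emis[0]
        scale[0] = alpha[0].sum(); alpha[0] /= scale[0]
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ A) * emis[t]
            scale[t] = alpha[t].sum(); alpha[t] /= scale[t]
        for t in range(T - 2, -1, -1):
            beta[t] = (A @ (emis[t + 1] * beta[t + 1])) / scale[t + 1]
        gamma = alpha * beta               # posterior over hidden positions
        # Accumulate expected counts for the M-step.
        pi_new += gamma[0]
        visits += gamma.sum(axis=0)
        click_mass += (gamma * np.array(clicks)[:, None]).sum(axis=0)
        for t in range(T - 1):
            A_num += np.outer(alpha[t],
                              emis[t + 1] * beta[t + 1] / scale[t + 1]) * A
    # M-step: closed-form maximizers of the expected log-likelihood.
    return (A_num / A_num.sum(axis=1, keepdims=True),
            pi_new / pi_new.sum(),
            click_mass / visits)

# Calling em_step repeatedly until the parameters stabilize fits the model.
```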