
1 Learning to Rank Shubhra Kanti Karmaker (Santu)
Credit: some materials are from James Allan, HongNing Wang, and Jiepu Jiang

2 The ranking problem
Core problem in Information Retrieval: given a query, rank the documents in the collection according to their relevance
Each <query, document> pair is associated with a relevance label
Traditionally binary labels: 0 (non-relevant) or 1 (relevant)
Can have multiple levels too, e.g., 0-9 or 0-4

3 Classic retrieval models
You have already learned a number of classic retrieval models:
Vector Space Model
BM25
Language Models
Given a query q and a document d, they return a relevance score r(q, d)

4 Classic retrieval models
You have already learned a number of classic retrieval models:
Vector Space Model
BM25
Language Models
Given a query q and a document d, they return a relevance score r(q, d)
What features are they using?

5 Limitations of classic retrieval models
Limited number of features: TF, IDF, document length normalization, etc.
No learned weighting between features; sometimes weighted manually
Hand-tuned parameters are not optimal

6 Why learning to rank? What if we have a lot of non-text features?
Think about product search at amazon.com

7 Why learning to rank? Search for camping chairs
Weight of the chair
Color of the chair
Materials used for construction
Includes an arm rest?
Folding or not
Manufacturer reputation
How do we use them as features?

8 Learning to rank
Apply machine learning to solve the ranking problem: a supervised learning task
Given a query, learn a function automatically to rank documents
Requires a training set and a test set
Each <query, document> pair is represented as a feature vector
The target variable is the relevance label of the <query, document> pair:
<query, document> -> relevance label

9 Features in learning to rank
Query-specific features:
Words in the query
Query length
Asks for a special attribute? (e.g., red color chair)
Document-specific features:
PageRank
Product description
Product sales
Customer reviews
Number of images in the page
Query-document pair features:
BM25 score
Department match between query and product specification
Image caption match
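As a concrete illustration, below is a minimal Python sketch of turning one <query, document> pair into a numeric feature vector. The document fields and the toy bm25_score helper are assumptions made for the example, not features prescribed by the deck.

    # Sketch: one <query, document> pair -> numeric feature vector (toy helpers)
    def bm25_score(query, doc):
        # stand-in for a real BM25 implementation
        return float(sum(doc["text"].split().count(t) for t in query.split()))

    def make_feature_vector(query, doc):
        return [
            len(query.split()),        # query feature: query length
            doc.get("pagerank", 0.0),  # document feature
            doc.get("num_images", 0),  # document feature
            bm25_score(query, doc),    # query-document feature
        ]

    query = "red camping chair"
    doc = {"text": "folding red camping chair with arm rest",
           "pagerank": 0.42, "num_images": 5}
    x = make_feature_vector(query, doc)  # feature vector for this pair
    y = 1                                # its relevance label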

10 Features in learning to rank

11 Learning to rank
Point-wise approach: the function is based on features of a single object, e.g., regress the relevance score, or classify docs into R and NR. Classic retrieval models are also point-wise: score(q, D).
Pair-wise approach: the function is based on a pair of items, e.g., given two documents, predict their partial ranking.
List-wise approach: the function is based on a ranked list of items, e.g., given two ranked lists of the same items, decide which is better.

12 Point-wise approach – Binary Judgements
Classification methods: Logistic Regression, Support Vector Machine, Decision Trees
Example training data:
Query   Document   Relevance
q1      d1         1
q1      d2
q1      d3
...
q100
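A minimal sketch of this point-wise setup with scikit-learn's LogisticRegression; the feature matrix, its column meanings, and the labels are made-up numbers for illustration.

    # Point-wise L2R with binary labels: one row per <query, document> pair
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # columns (assumed): [BM25, PageRank, query length]
    X = np.array([[12.1, 0.40, 3],
                  [ 3.2, 0.10, 3],
                  [ 8.7, 0.90, 2],
                  [ 1.0, 0.05, 2]])
    y = np.array([1, 0, 1, 0])              # 0/1 relevance labels

    model = LogisticRegression().fit(X, y)
    scores = model.predict_proba(X)[:, 1]   # P(relevant) as ranking score
    ranking = np.argsort(-scores)           # document indices, best to worst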

13 Point-wise approach – Binary Judgements
[Figure: documents plotted in a two-dimensional feature space, term proximity ω versus cosine score α, with relevant (R) and non-relevant (N) documents separated by a learned decision surface]

14 Point-wise approach – Multi-level Judgements
Classification methods: Logistic Regression, Support Vector Machine, Decision Trees
Regression methods: Linear Regression, SVM Regression
Example training data:
Query   Document   Relevance
q1      d1         4
q1      d2         1
q1      d3
...
q100               2
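With graded labels, the same data can be treated as a regression problem; a sketch using scikit-learn's LinearRegression on made-up numbers.

    # Point-wise L2R with graded labels: regress the relevance level directly
    import numpy as np
    from sklearn.linear_model import LinearRegression

    X = np.array([[12.1, 0.40, 3],
                  [ 3.2, 0.10, 3],
                  [ 8.7, 0.90, 2]])
    y = np.array([4, 1, 2])                 # graded relevance (e.g., 0-4)

    reg = LinearRegression().fit(X, y)
    ranking = np.argsort(-reg.predict(X))   # sort by predicted grade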

16 Problems with point-wise approaches
Ignores the order / rank: pairwise order plays no role in the loss
Class imbalance affects learning
Classification error is biased toward "big" queries (those with many judged documents)
Does not optimize the target metric (MAP, NDCG, etc.) directly

17 Pair-wise Approach
Input: a pair of documents Di and Dj, characterized by feature vectors ψi and ψj
Classify: rank i before j iff w · (ψi − ψj) > 0
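A sketch of this pairwise reduction: build difference vectors ψi − ψj for document pairs with a known preference and fit a linear classifier on them. LinearSVC here is a stand-in for the actual SVM-Rank optimizer, and all numbers are made up.

    # Pair-wise reduction: train on difference vectors psi_i - psi_j,
    # labeled +1 if doc i should rank above doc j, else -1.
    import numpy as np
    from sklearn.svm import LinearSVC

    psi = np.array([[12.1, 0.40],   # feature vectors of docs for one query
                    [ 3.2, 0.10],
                    [ 8.7, 0.90]])
    rel = np.array([2, 0, 1])       # graded relevance labels

    pairs, labels = [], []
    for i in range(len(psi)):
        for j in range(len(psi)):
            if rel[i] > rel[j]:     # i preferred over j
                pairs.append(psi[i] - psi[j]); labels.append(+1)
                pairs.append(psi[j] - psi[i]); labels.append(-1)

    w = LinearSVC().fit(np.array(pairs), np.array(labels)).coef_[0]
    scores = psi @ w                # rank i before j iff w.(psi_i - psi_j) > 0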

18 Pair-wise Approach

19 Pair-wise Approach [Herbrich et al. 1999, 2000; Joachims et al. 2002]
Given a query q and two documents D1 and D2, learn which one should be ranked higher

20 Pair-wise approaches
More popular than list-wise and point-wise ones
SVM-Rank is only one of them (and not the best one)
Many others, e.g., RankBoost, RankNet, LambdaRank, LambdaMART
The best so far (in terms of effectiveness): LambdaMART

21 Problems with SVM-Rank?

22 NDCG: rank matters
The discounting coefficient makes an error at the top position cost more than an error at lower positions
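A small sketch of that discount at work, assuming the common gain (2^rel − 1) and discount log2(rank + 1); the sample label lists are made up.

    # NDCG sketch: log2(rank + 1) discounting penalizes top-position errors most
    import math

    def dcg(rels):
        return sum((2**r - 1) / math.log2(i + 2) for i, r in enumerate(rels))

    def ndcg(rels):
        ideal = dcg(sorted(rels, reverse=True))
        return dcg(rels) / ideal if ideal > 0 else 0.0

    print(ndcg([3, 2, 1, 0]))  # perfect order: 1.0
    print(ndcg([0, 3, 2, 1]))  # best doc pushed off the top: much lower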

25 Pair-wise approaches [Yisong Yue et al., SIGIR'07]
SVM-Rank: minimizes the pairwise loss
SVM-MAP: minimizes the structural loss, i.e., the MAP difference
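For reference, a sketch of the quantity SVM-MAP targets: average precision of one ranked list of binary labels (MAP is the mean of this over queries).

    # Average precision of one ranked list (1 = relevant, 0 = non-relevant)
    def average_precision(labels):
        hits, total = 0, 0.0
        for i, rel in enumerate(labels, start=1):
            if rel:
                hits += 1
                total += hits / i    # precision at each relevant position
        return total / hits if hits else 0.0

    print(average_precision([1, 0, 1, 0]))  # (1/1 + 2/3) / 2 = 0.833...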

26 Pair-wise Approach

27 Pair-wise approaches [Christopher J.C. Burges, 2010]
LambdaMART: some pairs are more important than others!
Incorporates the change in NDCG into the gradient
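A sketch of that weighting: scale each pair's gradient by |ΔNDCG|, the change in NDCG if the two documents swapped positions; the label list is made up.

    # LambdaMART-style pair weighting: |delta NDCG| of swapping docs i and j
    import math

    def dcg(rels):
        return sum((2**r - 1) / math.log2(k + 2) for k, r in enumerate(rels))

    def delta_ndcg(rels, i, j):
        ideal = dcg(sorted(rels, reverse=True))
        swapped = rels[:]
        swapped[i], swapped[j] = swapped[j], swapped[i]
        return abs(dcg(swapped) - dcg(rels)) / ideal

    rels = [3, 0, 2, 1]            # relevance labels in the current ranking
    print(delta_ndcg(rels, 0, 1))  # swapping the top pair: large weight
    print(delta_ndcg(rels, 2, 3))  # swapping a lower pair: small weight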

28 List-wise approaches
Directly optimize the value of one evaluation measure: NDCG, MAP, etc.
Challenges: computationally expensive; most evaluation measures are not continuous
Examples: GBRank, AdaRank, ListNet, etc.
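Because measures like NDCG and MAP are not continuous in the model scores, list-wise methods optimize smooth surrogates instead; below is a sketch of a ListNet-style top-one loss, the cross entropy between softmax distributions of scores and labels (all numbers made up).

    # ListNet-style surrogate: cross entropy between softmaxed scores and labels
    import numpy as np

    def softmax(x):
        e = np.exp(x - np.max(x))
        return e / e.sum()

    def listnet_loss(scores, rels):
        p_true = softmax(np.asarray(rels, dtype=float))
        p_pred = softmax(np.asarray(scores, dtype=float))
        return -np.sum(p_true * np.log(p_pred))

    print(listnet_loss([2.0, 1.0, 0.1], [3, 1, 0]))  # good order: small loss
    print(listnet_loss([0.1, 1.0, 2.0], [3, 1, 0]))  # reversed: larger loss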

29 Tools
RankLib
Jforest
SVM-Rank

30 Public Datasets
Web search: MSR LETOR dataset
E-com search dataset
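These benchmarks commonly ship in the SVM-light/LETOR text format, one <query, document> pair per line; a sketch of parsing such a line (the sample line itself is made up).

    # Parse one LETOR / SVM-light ranking line:
    # "<label> qid:<query id> <feature>:<value> ... # optional comment"
    def parse_letor_line(line):
        body = line.split("#", 1)[0].split()
        label = int(body[0])
        qid = body[1].split(":", 1)[1]
        feats = {int(k): float(v)
                 for k, v in (tok.split(":") for tok in body[2:])}
        return label, qid, feats

    sample = "2 qid:10 1:0.03 2:0.50 3:0.12 # doc=GX001"
    print(parse_letor_line(sample))  # (2, '10', {1: 0.03, 2: 0.5, 3: 0.12})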

31 References
General:
Li, Hang. "A Short Introduction to Learning to Rank." IEICE Transactions on Information and Systems 94.10 (2011).
Liu, Tie-Yan. "Learning to Rank for Information Retrieval." Foundations and Trends® in Information Retrieval 3.3 (2009).
Tax, Niek, Sander Bockting, and Djoerd Hiemstra. "A Cross-Benchmark Comparison of 87 Learning to Rank Methods." Information Processing & Management 51.6 (2015).
E-com search:
Karmaker Santu, Shubhra Kanti, Parikshit Sondhi, and ChengXiang Zhai. "On Application of Learning to Rank for E-Commerce Search." In Proceedings of ACM SIGIR, 2017.

