
1 Learning to Rank -- A Brief Review. Yunpeng Xu.

2 Ranking and sorting. Ranking: there are only K ordered categories. Sorting: each sample has a distinct rank. Generally, there is no need to differentiate between the two.

3 Overview. Rank aggregation; label ranking; query and rank by example; preference learning; remaining problems and what we can do.

4 Rank aggregation. The need to combine different ranking results arises in voting systems, welfare economics, and decision making. Example: 1. Hillary Clinton > John Edwards > Barack Obama; 2. Barack Obama > John Edwards > Hillary Clinton => ?

5 Rank aggregation (cont.) Arrow's impossibility theorem (Kenneth Arrow, 1951): if the decision-making body has at least two members and at least three options to decide among, then it is impossible to design a social welfare function that satisfies all of these conditions at once.

6 Rank aggregation (cont.) Arrow's impossibility theorem rests on five fairness conditions: non-dictatorship; unrestricted domain (universality); independence of irrelevant alternatives; positive association of social and individual values (monotonicity); and non-imposition (citizen sovereignty). They cannot all be satisfied simultaneously.

7 Rank aggregation (cont.) Borda's method (Borda, 1770). Given k ranked lists τ_1, …, τ_k over the same n items, define B_i(j) as the number of items ranked below item j in list τ_i, and rank all items by the total Borda score B(j) = Σ_i B_i(j). For the two lists above: Hillary Clinton: 2, John Edwards: 2, Barack Obama: 2 (a three-way tie).
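
A minimal sketch of Borda's method on the three-candidate example above, assuming every list ranks the same n items from best to worst:

```python
# Borda count: B_i(j) = number of items ranked below j in list i;
# items are ranked by the sum of these counts over all lists.
from collections import defaultdict

def borda_scores(ranked_lists):
    """ranked_lists: each list orders all items from best to worst."""
    scores = defaultdict(int)
    for tau in ranked_lists:
        n = len(tau)
        for position, item in enumerate(tau):
            scores[item] += n - 1 - position  # items ranked below `item` in tau
    return dict(scores)

lists = [
    ["Hillary Clinton", "John Edwards", "Barack Obama"],
    ["Barack Obama", "John Edwards", "Hillary Clinton"],
]
print(borda_scores(lists))
# {'Hillary Clinton': 2, 'John Edwards': 2, 'Barack Obama': 2} -- a three-way tie
```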

8 Rank aggregation (cont.) -- Borda. Condorcet criterion: if a majority prefers x to y, then x must be ranked above y. Borda's method does not satisfy the Condorcet criterion, and neither does any method that assigns fixed weights to rank positions.

9 Rank aggregation (cont.) Relaxing the assumptions: maximize a consensus criterion, which is equivalent to minimizing disagreement (Kemeny; social choice theory). This is NP-hard, so sub-optimal solutions are obtained with heuristics.

10 Rank aggregation (cont.) Basic idea: assign different weights to different experts. Supervised aggregation: weights are chosen according to a final judge (the ground truth). Unsupervised aggregation: aims to minimize disagreement as measured by certain distances.

11 Rank aggregation (cont.) Distance measures: Spearman footrule distance; Kendall tau distance; Kendall tau distance for multiple lists; scaled footrule distance.
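
A small sketch of two of the distances named above, assuming both inputs are full permutations of the same item set:

```python
from itertools import combinations

def spearman_footrule(sigma, tau):
    """Sum over items of |rank in sigma - rank in tau|."""
    pos_s = {x: i for i, x in enumerate(sigma)}
    pos_t = {x: i for i, x in enumerate(tau)}
    return sum(abs(pos_s[x] - pos_t[x]) for x in sigma)

def kendall_tau(sigma, tau):
    """Number of item pairs ordered differently by sigma and tau."""
    pos_s = {x: i for i, x in enumerate(sigma)}
    pos_t = {x: i for i, x in enumerate(tau)}
    return sum(
        1
        for x, y in combinations(sigma, 2)
        if (pos_s[x] - pos_s[y]) * (pos_t[x] - pos_t[y]) < 0
    )

a = ["A", "B", "C", "D"]
b = ["B", "A", "D", "C"]
print(spearman_footrule(a, b))  # 4
print(kendall_tau(a, b))        # 2
```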

12 Rank aggregation (cont.) -- Distance measures. Kemeny optimal ranking: minimizes the total Kendall tau distance to the input lists, but is still NP-hard to compute. Local Kemenization (a locally optimal aggregation) can be computed in O(kn log n).
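
A hedged sketch of local Kemenization, assuming full lists over the same items and taking one input list as the initial aggregation; the bookkeeping of the original O(kn log n) procedure may differ, but the output is locally Kemeny optimal (no adjacent swap reduces the total Kendall tau distance):

```python
# Insert items one by one and bubble each one upward while a strict majority
# of the input lists prefers it to its current predecessor.

def majority_prefers(x, y, input_lists):
    """True if more input lists rank x above y than y above x."""
    votes = 0
    for tau in input_lists:
        pos = {item: i for i, item in enumerate(tau)}
        votes += 1 if pos[x] < pos[y] else -1
    return votes > 0

def local_kemenize(initial, input_lists):
    out = []
    for item in initial:
        out.append(item)
        i = len(out) - 1
        while i > 0 and majority_prefers(out[i], out[i - 1], input_lists):
            out[i], out[i - 1] = out[i - 1], out[i]
            i -= 1
    return out

lists = [list("BCAD"), list("ABCD"), list("ACBD")]
print(local_kemenize(lists[0], lists))  # ['A', 'B', 'C', 'D']
```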

13 Rank aggregation (cont.) Supervised Rank Aggregation (SRA, WWW 2007). Ground truth: a preference matrix H. Goal: rank items by an aggregated score that is consistent with H, either exactly or with relaxation.

14 Rank aggregation (cont.) -- SRA. Method: use Borda's scores and optimize the resulting aggregation objective.

15 Rank aggregation (cont.) Markov Chain Rank Aggregation (MCRA, WWW05). Map the ranked lists to a Markov chain M, compute the stationary distribution π of M, and rank items based on π. Example lists: B > C > D; A > D > E; A > B > E.

16 Rank aggregation (cont.) -- MCRA. Different transition strategies: MC1: all outgoing edges have uniform probabilities; MC2: first choose a list, then choose the next item from that list; … For a disconnected graph, define transition probabilities based on item similarity.
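
A hedged sketch of Markov-chain rank aggregation in the MC1 spirit described above, on the slide's example lists. From the current item, the chain moves with uniform probability to any item that some list ranks above it; the small uniform "teleport" term is an assumption added here to keep the chain connected (one simple way to handle the disconnected-graph case), and the exact MC1/MC2 definitions in the underlying paper may differ:

```python
import numpy as np

lists = [["B", "C", "D"], ["A", "D", "E"], ["A", "B", "E"]]
items = sorted({x for lst in lists for x in lst})
idx = {x: i for i, x in enumerate(items)}
n = len(items)

# better[i] = set of items ranked above item i in at least one list
better = {i: set() for i in range(n)}
for lst in lists:
    for rank, x in enumerate(lst):
        for y in lst[:rank]:
            better[idx[x]].add(idx[y])

P = np.zeros((n, n))
for i in range(n):
    if better[i]:
        for j in better[i]:
            P[i, j] = 1.0 / len(better[i])   # uniform out-going probabilities
    else:
        P[i, i] = 1.0                        # no item is ranked above i anywhere

alpha = 0.15                                  # teleport probability (assumed)
P = (1 - alpha) * P + alpha / n

pi = np.full(n, 1.0 / n)
for _ in range(1000):                         # power iteration
    pi = pi @ P
pi /= pi.sum()

for x in sorted(items, key=lambda x: -pi[idx[x]]):   # higher mass = better item
    print(x, round(pi[idx[x]], 3))
```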

17 Rank aggregation (cont.) Unsupervised Learning Algorithm for Rank Aggregation (ULARA; Dan Roth, ECML 2007). Goal: learn a weighted combination of the input rankings. Method: maximize agreement among the lists.

18 Rank aggregation (cont.) -- ULARA. Method: iterative gradient descent. Initially w is uniform; it is then updated iteratively.
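
The transcript does not reproduce the ULARA objective, so the following is only an illustrative reconstruction of the idea (start from uniform weights and use gradient steps to shrink the weight of lists that deviate most from the weighted consensus), not the paper's precise update rule:

```python
import numpy as np

# ranks[i, x] = rank that list i assigns to item x (0 = best); toy data
ranks = np.array([
    [0, 1, 2, 3],
    [0, 2, 1, 3],
    [3, 0, 1, 2],   # an "outlier" list that disagrees with the others
], dtype=float)

k, n = ranks.shape
w = np.full(k, 1.0 / k)   # initially uniform, as on the slide
lr = 0.01

for _ in range(200):
    mu = w @ ranks / w.sum()                        # weighted consensus rank per item
    disagreement = ((ranks - mu) ** 2).sum(axis=1)  # per-list squared deviation
    w -= lr * disagreement                          # gradient step on sum_i w_i * d_i
    w = np.clip(w, 1e-6, None)
    w /= w.sum()

print("weights:", np.round(w, 3))                   # the outlier list gets a small weight
print("aggregate order:", np.argsort(w @ ranks))
```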

19 Overview. Rank aggregation; label ranking; query and rank by example; preference learning; remaining problems and what we can do.

20 Label Ranking. Goal: map from the input space to the set of total orders over a finite set of labels. Related to multi-label and multi-class problems. Example input: customer information. Example outputs: Porsche > Toyota > Ford; Mountain > Sea > Beach.

21 Label Ranking (cont.) Pairwise ranking (ECML 2003). Train a classifier for each pair of labels. When judging an example, if the classifier for a pair predicts the first label, count it as a vote for that label; then rank all labels by their vote counts. In total, K(K-1)/2 classifiers are needed.
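
A minimal sketch of pairwise label ranking, assuming scikit-learn is available; the data, the hidden scoring model, and the use of logistic regression as the pairwise classifier are illustrative assumptions:

```python
import numpy as np
from itertools import combinations
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
K, d, m = 4, 5, 200
X = rng.normal(size=(m, d))
W = rng.normal(size=(K, d))                 # hidden label-scoring model (toy data)
rankings = np.argsort(-(X @ W.T), axis=1)   # ground-truth label order per example

# Train one classifier per label pair (i, j): predict 1 if i is preferred to j.
classifiers = {}
pos = np.argsort(rankings, axis=1)          # pos[example, label] = position of label
for i, j in combinations(range(K), 2):      # K(K-1)/2 classifiers in total
    y = (pos[:, i] < pos[:, j]).astype(int)
    classifiers[(i, j)] = LogisticRegression(max_iter=1000).fit(X, y)

def rank_labels(x):
    votes = np.zeros(K)
    for (i, j), clf in classifiers.items():
        if clf.predict(x.reshape(1, -1))[0] == 1:
            votes[i] += 1
        else:
            votes[j] += 1
    return np.argsort(-votes)               # labels from most to least preferred

x_new = rng.normal(size=d)
print("predicted label order:", rank_labels(x_new))
```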

22 Label Ranking (cont.) Constraint Classification (NIPS 2002). Consider a linear sorting function that scores each label i by w_i · x. Goal: learn the weight vectors w_i and rank all labels by their scores.

23 Label Ranking (cont.) -- CC. Expand the feature vector into a K·d-dimensional space (one block per label) and generate positive/negative samples in that expanded space from each preference constraint.

24 Label Ranking (cont.) -- CC. Learn a single separating hyperplane in the expanded space; this can be solved with an SVM.
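
A hedged sketch of the constraint-classification construction: each preference "label i > label j on x" becomes a point in the K·d-dimensional space with +x in block i and -x in block j (label +1), plus its negation (label -1); a single linear separator in that space is then learned. The synthetic data and the use of scikit-learn's LinearSVC are assumptions of this sketch:

```python
import numpy as np
from sklearn.svm import LinearSVC

def expand(x, i, j, K):
    """Embed the constraint 'label i > label j on x' into R^{K*d}."""
    d = x.shape[0]
    z = np.zeros(K * d)
    z[i * d:(i + 1) * d] = x
    z[j * d:(j + 1) * d] = -x
    return z

rng = np.random.default_rng(0)
K, d, m = 3, 4, 300
X = rng.normal(size=(m, d))
W_true = rng.normal(size=(K, d))
order = np.argsort(-(X @ W_true.T), axis=1)    # true label order per example

Z, y = [], []
for x, o in zip(X, order):
    for a in range(K):                          # every implied pairwise constraint
        for b in range(a + 1, K):
            Z.append(expand(x, o[a], o[b], K)); y.append(+1)
            Z.append(expand(x, o[b], o[a], K)); y.append(-1)

clf = LinearSVC(C=1.0, max_iter=10000).fit(np.array(Z), np.array(y))
W_learned = clf.coef_.reshape(K, d)             # block i plays the role of w_i

x_new = rng.normal(size=d)
print("predicted label order:", np.argsort(-(W_learned @ x_new)))
```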

25 Overview. Rank aggregation; label ranking; query and rank by example; preference learning; remaining problems and what we can do.

26 Query and rank by example. Given a query, rank the retrieved items according to their relevance with respect to the query.

27 Query and rank by example (cont.) Ranking on a data manifold. The iterative score propagation has a closed convergence form. Essentially, this is a one-class semi-supervised method.
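
The convergence form itself is not reproduced in the transcript; the sketch below assumes the standard manifold-ranking closed form F = (I - alpha·S)^(-1) Y with a symmetrically normalized affinity matrix S and Y marking the query. The RBF affinity and the values of sigma and alpha are illustrative choices:

```python
import numpy as np

def manifold_rank(X, query_index, sigma=1.0, alpha=0.9):
    # pairwise RBF affinities with zero self-affinity
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    Wm = np.exp(-sq / (2 * sigma ** 2))
    np.fill_diagonal(Wm, 0.0)

    # symmetric normalization S = D^{-1/2} W D^{-1/2}
    d_inv_sqrt = 1.0 / np.sqrt(Wm.sum(axis=1))
    S = Wm * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    Y = np.zeros(len(X))
    Y[query_index] = 1.0                       # one "positive" item: the query
    F = np.linalg.solve(np.eye(len(X)) - alpha * S, Y)
    return np.argsort(-F)                      # items ordered by relevance to the query

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (10, 2)), rng.normal(3, 0.3, (10, 2))])
print(manifold_rank(X, query_index=0)[:10])    # mostly items 0..9 (the query's cluster)
```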

28 Preference learning. Given a set of items and a set of user preferences over these items, rank all items according to those preferences. Motivated by the needs of personalized search.

29 Preference learning. Input: preferences, i.e., a set of partial orders on X. Output: a total order on X, or a mapping of X onto a structured label space Y, obtained through a preference function.

30 Existing methods: Learning to Order Things [W. Cohen 98]; Large Margin Ordinal Regression [R. Herbrich 98]; Pranking with Ranking [K. Crammer 01]; Optimizing Search Engines using Clickthrough Data [T. Joachims 02]; Efficient Boosting Algorithm for Combining Preferences [Y. Freund 03]; Classification Approach towards Ranking and Sorting Problems [S. Rajaram 03].

31 Existing methods: Learning to Rank using Gradient Descent [C. Burges 05]; Stability and Generalization of Bipartite Ranking [S. Agarwal 05]; Generalization Bounds for k-Partite Ranking [S. Rajaram 05]; Ranking with a P-Norm Push [C. Rudin 05]; Magnitude-Preserving Ranking Algorithms [C. Cortes 07]; From Pairwise Approach to Listwise Approach [Z. Cao 07].

32 Large Margin Ordinal Regression. Map each sample onto an axis using the inner product w · x.

33 Large Margin Ordinal Regression. Consider a preference pair in which x_i is ranked above x_j; then require w · x_i - w · x_j ≥ 1. Introduce soft-margin slack variables, and solve the resulting problem with an SVM on the difference vectors x_i - x_j.
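
A minimal sketch of that pairwise soft-margin construction: every preference pair yields a difference vector with label +1 (and its negation with label -1), and a linear SVM learns w. The synthetic data and the use of scikit-learn are assumptions of this sketch:

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
d, m = 5, 200
w_true = rng.normal(size=d)
X = rng.normal(size=(m, d))
scores = X @ w_true                        # hidden utility that defines the ranks

pairs, labels = [], []
for _ in range(2000):                      # sample preference pairs
    i, j = rng.integers(m, size=2)
    if scores[i] == scores[j]:
        continue
    hi, lo = (i, j) if scores[i] > scores[j] else (j, i)
    pairs.append(X[hi] - X[lo]); labels.append(+1)
    pairs.append(X[lo] - X[hi]); labels.append(-1)

svm = LinearSVC(C=1.0, fit_intercept=False, max_iter=10000)
svm.fit(np.array(pairs), np.array(labels))
w = svm.coef_.ravel()

# w . x now orders new items; compare with the order given by the hidden utility
test = rng.normal(size=(5, d))
print(np.argsort(-(test @ w)), np.argsort(-(test @ w_true)))
```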

34 Learning to order things. A greedy ordering algorithm: calculate a score for each item from the preference function and output items in decreasing order of score.

35 Learning to order things (cont.) Combine different ranking functions; the combination weights are learned iteratively.

36 Learning to order things. Combine the preference functions, perform rank aggregation, and update the weights based on feedback.

37 Learning to order things (cont.) Initially, w is uniform. At each step: compute a combined ranking function; produce a rank aggregation; measure the loss; update w accordingly.
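
The transcript omits the actual update rule; the sketch below uses a Hedge-style multiplicative update as one plausible choice (shown only as an assumption): each expert's weight is shrunk according to the loss its ranking suffered on the feedback.

```python
import numpy as np

def multiplicative_update(w, losses, beta=0.8):
    """losses[i] in [0, 1]: disagreement of expert i with the feedback."""
    w = w * beta ** np.asarray(losses)
    return w / w.sum()

w = np.full(3, 1 / 3)                               # initially w is uniform
for losses in [[0.1, 0.4, 0.9], [0.0, 0.5, 0.8]]:   # toy per-round losses
    w = multiplicative_update(w, losses)
print(np.round(w, 3))                               # the low-loss expert dominates
```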

38 RankBoost. For bipartite ranking problems: combine weak rankers into a final ranker H(x) and sort items by the values of H(x).

39 RankBoost (cont.) For the bipartite ranking problem: initialize a sampling distribution over (negative, positive) pairs; at each round, learn a weak ranker and then update and normalize the sampling distribution; finally, combine the weak rankers into H(x).
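
A hedged sketch of bipartite RankBoost with decision-stump weak rankers on a one-dimensional toy problem; the weighting r, the choice alpha = 0.5·ln((1+r)/(1-r)), and the pair reweighting follow the standard RankBoost description, with toy data and stump thresholds as assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
pos = rng.normal(2.0, 1.0, 20)      # positive (relevant) examples
neg = rng.normal(0.0, 1.0, 20)      # negative (irrelevant) examples

D = np.full((len(neg), len(pos)), 1.0 / (len(neg) * len(pos)))  # pair weights
thresholds = np.linspace(-3, 5, 40)
alphas, chosen = [], []

for _ in range(20):                 # boosting rounds
    best_r, best_t = 0.0, None
    for t in thresholds:            # weak ranker h(x) = 1[x > t]
        h_pos, h_neg = (pos > t).astype(float), (neg > t).astype(float)
        r = (D * (h_pos[None, :] - h_neg[:, None])).sum()
        if abs(r) > abs(best_r):
            best_r, best_t = r, t
    if best_t is None or abs(best_r) >= 1.0:
        break                        # no useful stump, or a perfect one
    alpha = 0.5 * np.log((1 + best_r) / (1 - best_r))
    alphas.append(alpha); chosen.append(best_t)
    h_pos, h_neg = (pos > best_t).astype(float), (neg > best_t).astype(float)
    D *= np.exp(alpha * (h_neg[:, None] - h_pos[None, :]))
    D /= D.sum()                     # normalization step

def H(x):                            # combined ranker; sort items by H(x)
    return sum(a * (x > t) for a, t in zip(alphas, chosen))

scores = [H(x) for x in np.concatenate([pos, neg])]
print("mean H(pos) =", np.mean(scores[:20]), " mean H(neg) =", np.mean(scores[20:]))
```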

40 Stability and Generalization. For bipartite ranking problems, compare the expected rank error (the probability that a random positive-negative pair is misordered) with the empirical rank error (the fraction of misordered pairs in the training sample).
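
A small sketch of the empirical bipartite rank error under these definitions, counting ties as half an error (the tie convention is an assumption of this sketch):

```python
import numpy as np

def empirical_rank_error(scores_pos, scores_neg):
    """Fraction of positive-negative pairs misordered by the scores."""
    diff = scores_pos[:, None] - scores_neg[None, :]
    return ((diff < 0) + 0.5 * (diff == 0)).mean()

rng = np.random.default_rng(0)
print(empirical_rank_error(rng.normal(1, 1, 30), rng.normal(0, 1, 30)))
```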

41 Stability and Generalization (cont.) Stability: if one training sample is removed, how much does the learned ranking function change? Generalization bounds follow from this stability, and the analysis generalizes to the k-partite ranking problem…

42 Ranking on graph data. Objective.

43 P-norm push. Focus on the topmost ranked items: the top-left region of the ROC curve is the most important.

44 P-norm push (cont.) Height of a negative sample k: the number of positive samples ranked below k. Cost of sample k: g(Height(k)), where g is convex and monotonically increasing.
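
A small sketch of the quantity described above for a given scoring function; using g(r) = r^p is an assumption in line with the p-norm name, and larger p concentrates the penalty on the worst (highest-ranked) negatives:

```python
import numpy as np

def p_norm_push_objective(scores_pos, scores_neg, p=4):
    # height of a negative = number of positives it is scored above
    heights = np.array([(scores_pos < s).sum() for s in scores_neg])
    return (heights.astype(float) ** p).sum() ** (1.0 / p)

rng = np.random.default_rng(0)
scores_pos = rng.normal(1.0, 1.0, 50)
scores_neg = rng.normal(0.0, 1.0, 50)
print(p_norm_push_objective(scores_pos, scores_neg, p=1))   # average-like penalty
print(p_norm_push_objective(scores_pos, scores_neg, p=8))   # dominated by the worst negatives
```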

45 P-norm push (cont.) Run RankBoost to solve the resulting problem.

46 Thanks!

