1
Ranking with High-Order and Missing Information
M. Pawan Kumar, Ecole Centrale Paris
Aseem Behl, Puneet Kumar, Pritish Mohapatra, C. V. Jawahar
2
PASCAL VOC “Jumping” Classification: Features → Processing → Training → Classifier
3
PASCAL VOC: Features → Processing → Training → Classifier. Think of a classifier! “Jumping” Classification ✗
4
PASCAL VOC: Features → Processing → Training → Classifier. Think of a classifier! “Jumping” Classification ✗, “Jumping” Ranking instead
5
Ranking vs. Classification
[Six example images, Rank 1 through Rank 6]
Average Precision = 1
6
Ranking vs. Classification
[Three example rankings of six images, Rank 1 through Rank 6]
Average Precision = 1 (Accuracy = 1); the remaining rankings have Accuracy = 0.67 but Average Precision = 0.92 and 0.81 respectively
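The AP and accuracy values quoted on these slides can be checked with a short sketch. The label sequences below are an assumed reconstruction of the example rankings (1 = relevant image, 0 = irrelevant), and `accuracy_at_half` assumes the classifier labels the top half of the ranking positive:

```python
def average_precision(labels):
    """AP of a ranking, given binary relevance labels in rank order:
    the mean, over relevant items, of the precision at that item's rank."""
    hits, ap = 0, 0.0
    for rank, y in enumerate(labels, start=1):
        if y == 1:
            hits += 1
            ap += hits / rank
    return ap / sum(labels)

def accuracy_at_half(labels):
    """Accuracy when the top half of the ranking is predicted positive."""
    n = len(labels)
    correct = sum(labels[: n // 2]) + (n - n // 2 - sum(labels[n // 2 :]))
    return correct / n
```

For six images with three relevant ones, the perfect ranking gives AP = 1 and accuracy = 1, while two imperfect rankings with the same accuracy (0.67) give different APs (0.92 vs. 0.81) — which is the point the slide is making.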
7
Ranking vs. Classification
Ranking is not the same as classification; average precision is not the same as accuracy. Should we use 0-1-loss-based classifiers? No: train for the loss you will be evaluated on (a basic machine-learning principle).
8
Outline
– Structured Output SVM
– Optimizing Average Precision
– High-Order Information
– Missing Information
– Related Work
(Taskar, Guestrin and Koller, NIPS 2003; Tsochantaridis, Hofmann, Joachims and Altun, ICML 2004)
9
Structured Output SVM
Input x, output y, joint feature Ψ(x,y)
Scoring function: s(x,y;w) = w^T Ψ(x,y)
Prediction: y(w) = argmax_y s(x,y;w)
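A minimal sketch of the scoring function and prediction rule, under the simplifying assumption that the output space is a small finite candidate set so the argmax can be enumerated (`joint_feature` and the candidate set are illustrative placeholders):

```python
import numpy as np

def score(w, psi):
    """s(x, y; w) = w^T Psi(x, y)."""
    return float(w @ psi)

def predict(w, x, candidates, joint_feature):
    """y(w) = argmax_y s(x, y; w), by exhaustive search over the candidates."""
    return max(candidates, key=lambda y: score(w, joint_feature(x, y)))
```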
10
Parameter Estimation
Training data {(x_i, y_i), i = 1, 2, …, m}
Δ(y_i, y_i(w)): loss function for the i-th sample
Minimize the regularized sum of losses over the training data
Highly non-convex in w; regularization plays no role (overfitting may occur)
11
Parameter Estimation
Training data {(x_i, y_i), i = 1, 2, …, m}
Δ(y_i, y_i(w)) ≤ w^T Ψ(x_i, y_i(w)) + Δ(y_i, y_i(w)) - w^T Ψ(x_i, y_i)
             ≤ max_y { w^T Ψ(x_i, y) + Δ(y_i, y) } - w^T Ψ(x_i, y_i)
Convex in w; sensitive to the regularization of w
12
Parameter Estimation
Training data {(x_i, y_i), i = 1, 2, …, m}
min_w ||w||^2 + C Σ_i ξ_i
s.t. w^T Ψ(x_i, y) + Δ(y_i, y) - w^T Ψ(x_i, y_i) ≤ ξ_i for all y
Quadratic program, which only requires the cutting planes max_y { w^T Ψ(x_i, y) + Δ(y_i, y) }
13
Parameter Estimation
Training data {(x_i, y_i), i = 1, 2, …, m}
min_w ||w||^2 + C Σ_i ξ_i
s.t. s(x_i, y; w) + Δ(y_i, y) - s(x_i, y_i; w) ≤ ξ_i for all y
Quadratic program, which only requires the cutting planes max_y { s(x_i, y; w) + Δ(y_i, y) }
14
Recap
– Problem Formulation: input, output, joint feature vector or scoring function
– Learning Formulation: loss function (the ‘test’ evaluation criterion)
– Optimization for Learning: cutting plane (loss-augmented inference)
– Prediction: inference
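The pieces of the recap fit together in one training sketch. For simplicity this uses plain subgradient descent on the hinged upper bound instead of the cutting-plane QP from the previous slides, and brute-forces the loss-augmented inference over a finite candidate set; all names are illustrative:

```python
import numpy as np

def train(data, candidates, joint_feature, loss, dim, C=1.0, lr=0.01, epochs=50):
    """Minimise ||w||^2 + C * sum_i max_y [w^T Psi(x_i,y) + Delta(y_i,y) - w^T Psi(x_i,y_i)]
    by stochastic subgradient descent (a stand-in for the cutting-plane QP)."""
    w = np.zeros(dim)
    for _ in range(epochs):
        for x, y_true in data:
            # loss-augmented inference: the most violated output
            y_hat = max(candidates,
                        key=lambda y: w @ joint_feature(x, y) + loss(y_true, y))
            # subgradient of the (rescaled) regulariser plus the hinge term
            g = 2 * w / (C * len(data))
            if y_hat != y_true:
                g = g + joint_feature(x, y_hat) - joint_feature(x, y_true)
            w -= lr * g
    return w
```

After training, prediction is the plain (non-augmented) argmax over the same candidate set.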
15
Outline
– Structured Output SVM
– Optimizing Average Precision (AP-SVM)
– High-Order Information
– Missing Information
– Related Work
(Yue, Finley, Radlinski and Joachims, SIGIR 2007)
16
Problem Formulation
Single input X: Φ(x_i) for all i ∈ P (positives), Φ(x_k) for all k ∈ N (negatives)
17
Problem Formulation
Single output R: R_ik = +1 if i is ranked better than k, -1 if k is ranked better than i
18
Problem Formulation
Scoring function:
s_i(w) = w^T Φ(x_i) for all i ∈ P
s_k(w) = w^T Φ(x_k) for all k ∈ N
S(X, R; w) = Σ_{i∈P} Σ_{k∈N} R_ik (s_i(w) - s_k(w))
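The joint score decomposes over positive/negative pairs, so it can be computed directly from the per-sample scores; a small sketch (names illustrative):

```python
def joint_score(pos_scores, neg_scores, R):
    """S(X, R; w) = sum over i in P, k in N of R[i][k] * (s_i(w) - s_k(w)),
    where R[i][k] = +1 if positive i is ranked above negative k, else -1."""
    return sum(R[i][k] * (si - sk)
               for i, si in enumerate(pos_scores)
               for k, sk in enumerate(neg_scores))
```

A ranking that places every positive above every negative (all R[i][k] = +1) maximises S whenever the positives have the higher scores.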
19
Learning Formulation
Loss function: Δ(R*, R) = 1 - AP of ranking R
20
Optimization for Learning
Cutting plane computation: an optimal greedy algorithm runs in O(|P||N|) time.
(Yue, Finley, Radlinski and Joachims, SIGIR 2007)
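For small problems, the most violated ranking can also be found by brute force over all interleavings of positives and negatives (the within-class order is fixed by score), which is a handy correctness check against the O(|P||N|) greedy algorithm. This sketch is an illustration, not the algorithm from the paper:

```python
from itertools import combinations

def ap_of_interleaving(labels):
    """AP of a ranking given binary labels (1 = positive) in rank order."""
    hits, ap = 0, 0.0
    for rank, y in enumerate(labels, start=1):
        if y:
            hits += 1
            ap += hits / rank
    return ap / sum(labels)

def most_violated_ranking(pos_scores, neg_scores):
    """Brute-force argmax_R [ S(X,R;w) + Delta(R*,R) ] over all interleavings,
    assuming pos_scores and neg_scores are each sorted in decreasing order."""
    P, N = len(pos_scores), len(neg_scores)
    best, best_val = None, float("-inf")
    for pos_slots in combinations(range(P + N), P):
        labels = [1 if t in pos_slots else 0 for t in range(P + N)]
        # assign scores to slots, preserving within-class order
        ps, ns = iter(pos_scores), iter(neg_scores)
        slot_score = [next(ps) if lab else next(ns) for lab in labels]
        # S(X,R;w): each pos/neg pair contributes +(s_i - s_k) if the
        # positive sits above the negative, -(s_i - s_k) otherwise
        S = 0.0
        for tp in range(P + N):
            if labels[tp]:
                for tn in range(P + N):
                    if not labels[tn]:
                        diff = slot_score[tp] - slot_score[tn]
                        S += diff if tp < tn else -diff
        val = S + (1.0 - ap_of_interleaving(labels))
        if val > best_val:
            best_val, best = val, labels
    return best, best_val
```

When the scores separate the classes by a wide margin the correct ranking is returned; when the margin is small, the AP loss term can make a wrong interleaving the most violated one.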
21
Ranking
Sort in decreasing order of the individual scores s_i(w).
(Yue, Finley, Radlinski and Joachims, SIGIR 2007)
22
Experiments
PASCAL VOC 2011 action classes: Jumping, Phoning, Playing Instrument, Reading, Riding Bike, Riding Horse, Running, Taking Photo, Using Computer, Walking
10 ranking tasks; cross-validation; Poselets features
23
AP-SVM vs. SVM
PASCAL VOC ‘test’ dataset, difference in AP: better in 8 classes, tied in 2 classes
24
AP-SVM vs. SVM
Folds of the PASCAL VOC ‘trainval’ dataset, difference in AP: AP-SVM is statistically better in 3 classes; SVM is statistically better in 0 classes
25
Outline
– Structured Output SVM
– Optimizing Average Precision
– High-Order Information (M4-AP-SVM)
– Missing Information
– Related Work
(Kumar, Behl, Jawahar and Kumar, Submitted)
29
High-Order Information
People perform similar actions; people strike similar poses; objects are of the same or similar sizes; “friends” have similar habits.
How can we use such information for ranking, not just classification?
30
Problem Formulation
Input x = {x_1, x_2, x_3}; output y ∈ {-1, +1}^3
Ψ(x, y) = [Ψ1(x, y); Ψ2(x, y)] (Ψ1: unary features, Ψ2: pairwise features)
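A sketch of one possible joint feature vector of this form: Ψ1 sums the unary features of positively labelled persons, and Ψ2 counts label-agreeing edges. The exact pairwise term used in the paper may differ; this is only an assumed instance:

```python
import numpy as np

def joint_feature(unary, edges, y):
    """Psi(x, y) = [Psi_1(x, y); Psi_2(x, y)] for labels y in {-1, +1}^n.
    unary: per-person feature vectors; edges: (i, j) pairs of similar persons."""
    # Psi_1: sum of unary features over persons labelled +1
    psi1 = sum((u for u, yi in zip(unary, y) if yi == +1),
               np.zeros_like(unary[0]))
    # Psi_2: number of edges whose endpoints take the same label
    psi2 = np.array([float(sum(y[i] == y[j] for i, j in edges))])
    return np.concatenate([psi1, psi2])
```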
31
Learning Formulation
Input x = {x_1, x_2, x_3}; output y ∈ {-1, +1}^3
Δ(y*, y) = fraction of incorrectly classified persons
32
Optimization for Learning
Input x = {x_1, x_2, x_3}; output y ∈ {-1, +1}^3
max_y w^T Ψ(x, y) + Δ(y*, y): graph cuts (if supermodular); otherwise LP relaxation or exhaustive search
33
Classification
Input x = {x_1, x_2, x_3}; output y ∈ {-1, +1}^3
max_y w^T Ψ(x, y): graph cuts (if supermodular); otherwise LP relaxation or exhaustive search
34
Ranking?
Input x = {x_1, x_2, x_3}; output y ∈ {-1, +1}^3
Use the difference of max-marginals.
35
Max-Marginal for the Positive Class
Input x = {x_1, x_2, x_3}; output y ∈ {-1, +1}^3
mm+(i; w) = max_{y, y_i = +1} w^T Ψ(x, y)
Best possible score when person i is positive; convex in w
36
Max-Marginal for the Negative Class
Input x = {x_1, x_2, x_3}; output y ∈ {-1, +1}^3
mm-(i; w) = max_{y, y_i = -1} w^T Ψ(x, y)
Best possible score when person i is negative; convex in w
37
Ranking (HOB-SVM)
Input x = {x_1, x_2, x_3}; output y ∈ {-1, +1}^3
Use the difference of max-marginals: s_i(w) = mm+(i; w) - mm-(i; w)
Difference-of-convex in w
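The max-marginals on these slides can be computed by exhaustive search when the number of persons is tiny; here `score_fn(y)` stands in for w^T Ψ(x, y). A sketch:

```python
from itertools import product

def max_marginal_score(score_fn, n, i):
    """s_i(w) = mm+(i; w) - mm-(i; w), by exhaustive search over all
    labellings y in {-1, +1}^n (feasible only for small n)."""
    mm_pos = max(score_fn(y) for y in product((-1, +1), repeat=n) if y[i] == +1)
    mm_neg = max(score_fn(y) for y in product((-1, +1), repeat=n) if y[i] == -1)
    return mm_pos - mm_neg
```

With a purely unary score the difference reduces to twice person i's unary term; pairwise terms shift it, which is how the high-order information enters the ranking.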
38
Ranking
s_i(w) = mm+(i; w) - mm-(i; w)
Why not optimize AP directly? Max-Margin Max-Marginal AP-SVM (M4-AP-SVM)
39
Problem Formulation
Single input X: Φ(x_i) for all i ∈ P (positives), Φ(x_k) for all k ∈ N (negatives)
40
Problem Formulation
Single output R: R_ik = +1 if i is ranked better than k, -1 if k is ranked better than i
41
Problem Formulation
Scoring function:
s_i(w) = mm+(i; w) - mm-(i; w) for all i ∈ P
s_k(w) = mm+(k; w) - mm-(k; w) for all k ∈ N
S(X, R; w) = Σ_{i∈P} Σ_{k∈N} R_ik (s_i(w) - s_k(w))
42
Learning Formulation
Loss function: Δ(R*, R) = 1 - AP of ranking R
43
Optimization for Learning
Difference-of-convex program; very efficient CCCP:
– Linearization step by dynamic graph cuts (Kohli and Torr, ECCV 2006)
– Update step equivalent to AP-SVM
(Kumar, Behl, Jawahar and Kumar, Submitted)
44
Ranking
Sort in decreasing order of the individual scores s_i(w).
45
Experiments
PASCAL VOC 2011 action classes: Jumping, Phoning, Playing Instrument, Reading, Riding Bike, Riding Horse, Running, Taking Photo, Using Computer, Walking
10 ranking tasks; cross-validation; Poselets features
46
HOB-SVM vs. AP-SVM
PASCAL VOC ‘test’ dataset, difference in AP: better in 4 classes, worse in 3, tied in 3
47
HOB-SVM vs. AP-SVM
Folds of the PASCAL VOC ‘trainval’ dataset, difference in AP: HOB-SVM is statistically better in 0 classes; AP-SVM is statistically better in 0 classes
48
M4-AP-SVM vs. AP-SVM
PASCAL VOC ‘test’ dataset, difference in AP: better in 7 classes, worse in 2, tied in 1
49
M4-AP-SVM vs. AP-SVM
Folds of the PASCAL VOC ‘trainval’ dataset, difference in AP: M4-AP-SVM is statistically better in 4 classes; AP-SVM is statistically better in 0 classes
50
Outline
– Structured Output SVM
– Optimizing Average Precision
– High-Order Information
– Missing Information (Latent-AP-SVM)
– Related Work
(Behl, Jawahar and Kumar, CVPR 2014)
51
Fully Supervised Learning
52
Weakly Supervised Learning Rank images by relevance to ‘jumping’
53
Two Approaches
– Use Latent Structured SVM with the AP loss: unintuitive prediction; loose upper bound on the loss; NP-hard optimization for cutting planes
– Carefully design Latent-AP-SVM: intuitive prediction; tight upper bound on the loss; optimal, efficient cutting plane computation
54
Results
55
Outline
– Structured Output SVM
– Optimizing Average Precision
– High-Order Information
– Missing Information (Latent-AP-SVM)
– Related Work
(Mohapatra, Jawahar and Kumar, In Preparation)
56
AP-CNN
[Network diagram: conv1–conv5, fc6, fc7, fc8 / fcA, fcB; softmax + cross-entropy loss vs. AP loss on the weights W]
Small but statistically significant improvements
57
Questions? Code + Data Available