Ranking with High-Order and Missing Information
M. Pawan Kumar (Ecole Centrale Paris), Aseem Behl, Puneet Dokania, Pritish Mohapatra, C. V. Jawahar

PASCAL VOC “Jumping” Classification: Features → Processing → Training → Classifier

PASCAL VOC “Jumping” Classification ✗: Features → Processing → Training → Classifier. Think of a classifier !!!

PASCAL VOC “Jumping” Ranking: Features → Processing → Training → Classifier. Think of a classifier !!!

Ranking vs. Classification: Rank 1, Rank 2, Rank 3, Rank 4, Rank 5, Rank 6. Average Precision = 1

Ranking vs. Classification: Rank 1, Rank 2, Rank 3, Rank 4, Rank 5, Rank 6. Average Precision = 1, Accuracy = 1; alternative orderings give values of 0.92, 0.67 and 0.81

Ranking vs. Classification. Ranking is not the same as classification, and average precision is not the same as accuracy. Should we use 0-1 loss based classifiers, or AP loss based rankers?
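The distinction can be made concrete in code. Below is a minimal Python sketch, using hypothetical six-image rankings in the spirit of the slide's example, of average precision and top-k accuracy for a binary ranked list:

```python
def average_precision(ranking):
    """AP of a ranked list of labels (1 = positive, 0 = negative)."""
    hits, total = 0, 0.0
    for rank, label in enumerate(ranking, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at each positive's rank
    return total / hits if hits else 0.0

def accuracy(ranking, k):
    """Accuracy when the top-k items are predicted positive."""
    correct = sum((label == 1) == (rank <= k)
                  for rank, label in enumerate(ranking, start=1))
    return correct / len(ranking)

perfect = [1, 1, 1, 0, 0, 0]   # all positives ranked on top
swapped = [1, 1, 0, 1, 0, 0]   # one positive slips to rank 4
```

With the perfect ordering both measures equal 1; after a single swap, AP drops to about 0.92 while a top-3 classifier's accuracy drops to about 0.67, so the two measures move very differently.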

Outline: Optimizing Average Precision (AP-SVM); High-Order Information; Missing Information. [Yue, Finley, Radlinski and Joachims, SIGIR 2007]

Problem Formulation. Single input X: Φ(x_i) for all i ∈ P (positives), Φ(x_k) for all k ∈ N (negatives)

Problem Formulation. Single output R: R_ik = +1 if i is ranked better than k, −1 if k is ranked better than i

Problem Formulation. Scoring function: s_i(w) = w^T Φ(x_i) for all i ∈ P, s_k(w) = w^T Φ(x_k) for all k ∈ N, and S(X,R;w) = Σ_{i∈P} Σ_{k∈N} R_ik (s_i(w) − s_k(w))
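The scoring function is straightforward to evaluate directly. This is an illustrative Python sketch with made-up feature vectors, not code from the paper:

```python
def sample_scores(w, features):
    """s_i(w) = w^T Phi(x_i), with w and each Phi(x_i) as plain float lists."""
    return [sum(wj * fj for wj, fj in zip(w, f)) for f in features]

def joint_score(scores_pos, scores_neg, R):
    """S(X,R;w) = sum over i in P, k in N of R[i][k] * (s_i(w) - s_k(w))."""
    return sum(R[i][k] * (si - sk)
               for i, si in enumerate(scores_pos)
               for k, sk in enumerate(scores_neg))
```

For example, one positive scoring 2.0 and one negative scoring 1.0 give S = 1.0 when ranked correctly (R = [[+1]]) and S = −1.0 when inverted.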

Ranking at Test-Time. R(w) = argmax_R S(X,R;w): sort the samples according to their individual scores s_i(w)
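That sorting by individual scores maximizes S(X,R;w) can be verified by brute force on toy inputs. Below is a small sketch with hypothetical scores, where the first len(scores_pos) sample indices are the positives:

```python
from itertools import permutations

def score_of_order(scores_pos, scores_neg, order):
    """S(X,R;w) for the ranking R induced by the permutation `order`."""
    n_pos = len(scores_pos)
    rank_of = {idx: r for r, idx in enumerate(order)}
    return sum((1 if rank_of[i] < rank_of[n_pos + k] else -1) * (si - sk)
               for i, si in enumerate(scores_pos)
               for k, sk in enumerate(scores_neg))

def best_score_bruteforce(scores_pos, scores_neg):
    """Maximum of S(X,R;w) over all orderings of the samples."""
    n = len(scores_pos) + len(scores_neg)
    return max(score_of_order(scores_pos, scores_neg, order)
               for order in permutations(range(n)))
```

For positives scoring [2.0, 0.5] and a negative scoring [1.0], the score-sorted order (0, 2, 1) attains the brute-force maximum.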

Learning Formulation. Loss function: Δ(R*,R(w)) = 1 − AP of ranking R(w). Non-convex; the parameter cannot be regularized

Learning Formulation. Upper bound of loss function: Δ(R*,R(w)) = Δ(R*,R(w)) + S(X,R(w);w) − S(X,R(w);w)

Learning Formulation. Upper bound of loss function: Δ(R*,R(w)) ≤ Δ(R*,R(w)) + S(X,R(w);w) − S(X,R*;w)

Learning Formulation. Upper bound of loss function: Δ(R*,R(w)) ≤ max_R { Δ(R*,R) + S(X,R;w) } − S(X,R*;w). Convex; the parameter can be regularized: min_w ||w||² + Cξ s.t. S(X,R;w) + Δ(R*,R) − S(X,R*;w) ≤ ξ, for all R

Optimization for Learning. Cutting plane computation: max_R S(X,R;w) + Δ(R*,R). Sort the positive samples according to their scores s_i(w), sort the negative samples according to their scores s_k(w), and find the best rank of each negative sample independently
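The cutting-plane step computes argmax_R S(X,R;w) + Δ(R*,R). The efficient procedure on the slide exploits the sorted structure; the brute-force stand-in below, feasible only for tiny toy inputs and shown purely for illustration, makes the objective itself explicit:

```python
from itertools import permutations

def average_precision(labels):
    """AP of labels (1 = positive, 0 = negative) in ranked order."""
    hits, total = 0, 0.0
    for rank, label in enumerate(labels, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def most_violated_ranking(scores_pos, scores_neg):
    """Brute-force argmax_R S(X,R;w) + Delta(R*,R), with Delta = 1 - AP.

    Sample indices 0..len(scores_pos)-1 are the positives.
    """
    n_pos = len(scores_pos)
    scores = scores_pos + scores_neg
    n = len(scores)
    best_order, best_val = None, float('-inf')
    for order in permutations(range(n)):
        rank_of = {idx: r for r, idx in enumerate(order)}
        S = sum((1 if rank_of[i] < rank_of[k] else -1) * (scores[i] - scores[k])
                for i in range(n_pos) for k in range(n_pos, n))
        delta = 1.0 - average_precision([1 if idx < n_pos else 0
                                         for idx in order])
        if S + delta > best_val:
            best_order, best_val = order, S + delta
    return best_order, best_val
```

This enumerates all n! orderings, whereas the procedure on the slide only needs the two sorts plus an independent placement of each negative.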

Optimization for Learning. Cutting plane computation, training time relative to the 0-1 SVM: the standard AP-SVM cutting plane is about 5x slower, while the efficient computation of Mohapatra, Jawahar and Kumar (NIPS 2014) is slightly faster

Experiments. PASCAL VOC 2011 action classes: Jumping, Phoning, Playing Instrument, Reading, Riding Bike, Riding Horse, Running, Taking Photo, Using Computer, Walking. 10 ranking tasks; cross-validation; Poselets features

AP-SVM vs. SVM. PASCAL VOC ‘test’ dataset, difference in AP: better in 8 classes, tied in 2 classes

AP-SVM vs. SVM. Folds of PASCAL VOC ‘trainval’ dataset, difference in AP: AP-SVM is statistically better in 3 classes; SVM is statistically better in 0 classes

Outline: Optimizing Average Precision; High-Order Information (HOAP-SVM); Missing Information. [Dokania, Behl, Jawahar and Kumar, ECCV 2014]

High-Order Information. People perform similar actions, people strike similar poses, objects are of same/similar sizes, “friends” have similar habits. How can we use this information for ranking, rather than only for classification?

Problem Formulation. Input x = {x_1, x_2, x_3}, output y ∈ {−1,+1}³. Joint feature vector Ψ(x,y) = [Ψ_1(x,y); Ψ_2(x,y)]: unary features and pairwise features

Learning Formulation. Input x = {x_1, x_2, x_3}, output y ∈ {−1,+1}³. Δ(y*,y) = fraction of incorrectly classified persons
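This loss is a simple Hamming-style count; a one-function Python sketch (labels in {−1,+1}, hypothetical inputs):

```python
def fraction_incorrect(y_star, y):
    """Delta(y*, y): fraction of persons whose predicted label is wrong."""
    return sum(a != b for a, b in zip(y_star, y)) / len(y_star)
```

For instance, one wrong label out of three gives a loss of 1/3.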

Optimization for Learning. max_y w^T Ψ(x,y) + Δ(y*,y): graph cuts (if supermodular), otherwise LP relaxation or exhaustive search

Classification. max_y w^T Ψ(x,y): graph cuts (if supermodular), otherwise LP relaxation or exhaustive search

Ranking? Input x = {x_1, x_2, x_3}, output y ∈ {−1,+1}³. Use the difference of max-marginals

Max-Marginal for Positive Class. mm_+(i;w) = max_{y: y_i = +1} w^T Ψ(x,y): the best possible score when person i is positive. Convex in w

Max-Marginal for Negative Class. mm_−(i;w) = max_{y: y_i = −1} w^T Ψ(x,y): the best possible score when person i is negative. Convex in w

Ranking. Use the difference of max-marginals: s_i(w) = mm_+(i;w) − mm_−(i;w). Difference-of-convex in w. (HOB-SVM)
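For small outputs the two max-marginals can be computed by exhaustive search. The sketch below uses a hypothetical joint score, a unary term per person plus a fixed reward for each pair of agreeing labels, purely as a stand-in for w^T Ψ(x,y); none of these names come from the paper:

```python
from itertools import product

def max_marginals(unary, coupling, i):
    """Return (mm+(i), mm-(i)) for a toy pairwise model over {-1,+1}^n."""
    n = len(unary)
    best = {+1: float('-inf'), -1: float('-inf')}
    for y in product([-1, +1], repeat=n):
        score = sum(u * yj for u, yj in zip(unary, y))   # unary term
        score += coupling * sum(y[a] == y[b]             # pairwise agreement
                                for a in range(n)
                                for b in range(a + 1, n))
        best[y[i]] = max(best[y[i]], score)
    return best[+1], best[-1]

def ranking_score(unary, coupling, i):
    """s_i = mm+(i) - mm-(i), the difference of max-marginals."""
    mm_pos, mm_neg = max_marginals(unary, coupling, i)
    return mm_pos - mm_neg
```

The exhaustive loop is exponential in n; the point of the graph-cuts machinery on the earlier slides is to avoid exactly this enumeration.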

Ranking. s_i(w) = mm_+(i;w) − mm_−(i;w). Why not optimize AP directly? High-Order AP-SVM (HOAP-SVM)

Problem Formulation. Single input X: Φ(x_i) for all i ∈ P (positives), Φ(x_k) for all k ∈ N (negatives)

Problem Formulation. Single output R: R_ik = +1 if i is ranked better than k, −1 if k is ranked better than i

Problem Formulation. Scoring function: s_i(w) = mm_+(i;w) − mm_−(i;w) for all i ∈ P, s_k(w) = mm_+(k;w) − mm_−(k;w) for all k ∈ N, and S(X,R;w) = Σ_{i∈P} Σ_{k∈N} R_ik (s_i(w) − s_k(w))

Ranking at Test-Time. R(w) = argmax_R S(X,R;w): sort the samples according to their individual scores s_i(w)

Learning Formulation. Loss function: Δ(R*,R(w)) = 1 − AP of ranking R(w)

Learning Formulation. Upper bound of loss function: min_w ||w||² + Cξ s.t. S(X,R;w) + Δ(R*,R) − S(X,R*;w) ≤ ξ, for all R

Optimization for Learning. Difference-of-convex program, solved very efficiently by CCCP: the linearization step uses dynamic graph cuts [Kohli and Torr, ECCV 2006], and the update step is equivalent to AP-SVM

Experiments. PASCAL VOC 2011 action classes: Jumping, Phoning, Playing Instrument, Reading, Riding Bike, Riding Horse, Running, Taking Photo, Using Computer, Walking. 10 ranking tasks; cross-validation; Poselets features

HOB-SVM vs. AP-SVM. PASCAL VOC ‘test’ dataset, difference in AP: better in 4, worse in 3 and tied in 3 classes

HOB-SVM vs. AP-SVM. Folds of PASCAL VOC ‘trainval’ dataset, difference in AP: HOB-SVM is statistically better in 0 classes; AP-SVM is statistically better in 0 classes

HOAP-SVM vs. AP-SVM. PASCAL VOC ‘test’ dataset, difference in AP: better in 7, worse in 2 and tied in 1 class

HOAP-SVM vs. AP-SVM. Folds of PASCAL VOC ‘trainval’ dataset, difference in AP: HOAP-SVM is statistically better in 4 classes; AP-SVM is statistically better in 0 classes

Outline: Optimizing Average Precision; High-Order Information; Missing Information (Latent-AP-SVM). [Behl, Jawahar and Kumar, CVPR 2014]

Fully Supervised Learning

Weakly Supervised Learning Rank images by relevance to ‘jumping’

Two Approaches. (1) Use Latent Structured SVM with AP loss: unintuitive prediction, loose upper bound on loss, NP-hard optimization for cutting planes. (2) Carefully design a Latent-AP-SVM: intuitive prediction, tight upper bound on loss, optimal and efficient cutting plane computation

Results

Questions? Code + Data Available