Graph Cut based Inference with Co-occurrence Statistics Ľubor Ladický, Chris Russell, Pushmeet Kohli, Philip Torr
Image Labelling Problems: Image Denoising, Geometry Estimation, Object Segmentation – assign a label to each image pixel (e.g. Building, Sky, Tree, Grass)
Pairwise CRF models – Standard CRF Energy: data term + smoothness term. Restricted expressive power.
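Written out, the standard pairwise CRF energy the slide refers to is (usual notation; the unary potentials form the data term, the pairwise potentials the smoothness term):

E(\mathbf{x}) = \sum_{i \in \mathcal{V}} \psi_i(x_i) + \sum_{(i,j) \in \mathcal{E}} \psi_{ij}(x_i, x_j)

where \mathcal{V} indexes the pixels and \mathcal{E} the neighbouring pixel pairs.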
Structures in CRF: Taskar et al. 02 – associative potentials; Kohli et al. 08 – segment consistency; Woodford et al. 08 – planarity constraint; Vicente et al. 08 – connectivity constraint; Nowozin & Lampert 09 – connectivity constraint; Roth & Black 09 – field of experts; Ladický et al. 09 – consistency over several scales; Woodford et al. 09 – marginal probability; Delong et al. 10 – label occurrence costs
Pairwise CRF models: the standard CRF energy for object segmentation captures only local context. It cannot encode global consistency of labels!
Detection Suppression [image from Torralba et al. 10]: if we have 1000 categories (detectors), and each detector produces 1 false positive every 10 images, we will have 100 false alarms per image… pretty much garbage… [Torralba et al. 10, Leibe & Schiele 09, Barinova et al. 10]
Encoding Co-occurrence: co-occurrence is a powerful cue [Heitz et al. '08, Rabinovich et al. '07] – Thing–Thing, Stuff–Stuff, Stuff–Thing [images from Rabinovich et al. 07]. Proposed solutions: 1. Csurka et al. – hard decision for label estimation; 2. Torralba et al. – GIST-based unary potential; 3. Rabinovich et al. – fully-connected CRF
So... what properties should these global co-occurrence potentials have?
Desired properties 1. No hard decisions
Desired properties 1. No hard decisions – incorporation in a probabilistic framework; unlikely possibilities are not completely ruled out
Desired properties 1. No hard decisions 2. Invariance to region size
Desired properties 1. No hard decisions 2. Invariance to region size – cost for the occurrence of {people, house, road, etc.} is invariant to image area
Desired properties 1. No hard decisions 2. Invariance to region size The only possible solution: a cost defined over the set of assigned labels L(x), combining local context with global context
Desired properties 1. No hard decisions 2. Invariance to region size 3. Parsimony – simple solutions preferred L(x)={ building, tree, grass, sky } L(x)={ aeroplane, tree, flower, building, boat, grass, sky }
Desired properties 1. No hard decisions 2. Invariance to region size 3. Parsimony – simple solutions preferred 4. Efficiency
Desired properties 1. No hard decisions 2. Invariance to region size 3. Parsimony – simple solutions preferred 4. Efficiency a) Memory requirements grow as O(n) with the image size and number of labels b) Inference remains tractable
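To make properties 2 and 3 concrete (a sketch in standard notation, not taken verbatim from the slides): the global term should depend only on the set of labels present in the labelling, and should not decrease when further labels are added:

L(\mathbf{x}) = \{\, l \in \mathcal{L} : \exists i \ \text{such that} \ x_i = l \,\}
C : 2^{\mathcal{L}} \rightarrow \mathbb{R}, \qquad L_1 \subseteq L_2 \ \Rightarrow \ C(L_1) \le C(L_2)

Depending only on L(x) gives invariance to region size; monotonicity encodes the parsimony preference for simple label sets.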
Previous work: Torralba et al. (2003) – GIST-based unary potentials; Rabinovich et al. (2007) – complete pairwise graphs; Csurka et al. (2008) – hard estimation of labels present
Related work: Zhu & Yuille 1996 – MDL prior; Bleyer et al. – Surface Stereo MDL prior; Hoiem et al. – 3D Layout CRF MDL prior; Delong et al. – label occurrence cost. C(\mathbf{x}) = K\,|L(\mathbf{x})| and C(\mathbf{x}) = \sum_{l \in \mathcal{L}} K_l\, \delta_l(\mathbf{x}). All special cases of our model.
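A cost defined over the whole label set can additionally encode interactions between pairs (or larger groups) of labels, which the per-label occurrence costs above cannot. An illustrative second-order form (my notation, matching the 2nd-order approximation used later for training):

C(L(\mathbf{x})) = \sum_{l \in L(\mathbf{x})} K_l \;+\; \sum_{\{l, l'\} \subseteq L(\mathbf{x})} K_{l l'}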
Inference Pairwise CRF Energy
Inference IP formulation (Schlesinger 73)
Inference Pairwise CRF Energy with co-occurrence
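Adding the co-occurrence term to the pairwise CRF gives the energy to be minimized (same notation as before):

E(\mathbf{x}) = \sum_{i} \psi_i(x_i) + \sum_{(i,j)} \psi_{ij}(x_i, x_j) + C\big(L(\mathbf{x})\big)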
Inference IP formulation with co-occurrence
Inference IP formulation with co-occurrence Pairwise CRF cost Pairwise CRF constraints
Inference IP formulation with co-occurrence Co-occurrence cost
Inference IP formulation with co-occurrence Inclusion constraints
Inference IP formulation with co-occurrence Exclusion constraints
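A hedged reconstruction of the four ingredients named on these slides (y_{i;l} and y_{ij;ll'} are the usual Schlesinger indicator variables, z_l indicates that label l is used somewhere in the image, and C(\mathbf{z}) is understood to be suitably linearized; the paper's exact formulation may differ in details):

\min \ \sum_{i,l} \psi_i(l)\, y_{i;l} + \sum_{(i,j),l,l'} \psi_{ij}(l,l')\, y_{ij;ll'} + C(\mathbf{z})   (pairwise CRF cost + co-occurrence cost)
\text{s.t.} \ \sum_l y_{i;l} = 1, \qquad \sum_{l'} y_{ij;ll'} = y_{i;l}   (pairwise CRF constraints)
z_l \ge y_{i;l} \ \ \forall i   (inclusion: a label used by any pixel must be switched on)
z_l \le \sum_i y_{i;l}   (exclusion: a label used by no pixel must be switched off)
y, z \in \{0,1\}; the LP relaxation replaces these binary constraints with [0,1].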
Inference LP relaxation Relaxed constraints
Inference LP relaxation: very slow! An 80 x 50 subsampled image takes 20 minutes
Inference: Our Contribution – Pairwise representation: one auxiliary variable Z \in 2^{\mathcal{L}}; infinite pairwise costs if x_i \notin Z [see technical report]. Solvable using standard methods (BP, TRW, etc.) – relatively faster, but still computationally expensive!
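A sketch of this pairwise construction (assuming a monotone cost C, as in the parsimony property): add one auxiliary variable Z taking values in 2^{\mathcal{L}}, put the co-occurrence cost on Z as a unary term, and forbid any pixel from using a label outside Z:

E(\mathbf{x}, Z) = \sum_i \psi_i(x_i) + \sum_{(i,j)} \psi_{ij}(x_i, x_j) + C(Z) + \sum_i \phi(x_i, Z), \qquad \phi(x_i, Z) = \begin{cases} 0 & x_i \in Z \\ \infty & x_i \notin Z \end{cases}

For fixed \mathbf{x} the best Z is exactly L(\mathbf{x}), so minimizing over (\mathbf{x}, Z) recovers the original energy; the price is that Z has 2^{|\mathcal{L}|} states, which is why this route remains expensive.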
Inference using Moves: Graph Cut based move making algorithms [Boykov et al. 01]. α-expansion transformation function; a series of locally optimal moves; each move reduces the energy; the optimal move is found by minimizing a submodular function. Space of solutions (x): L^N; move space (t): 2^N (N = number of variables, L = number of labels).
Inference using Moves Graph Cut based move making algorithms [Boykov, Veksler, Zabih 01] α-expansion transformation function
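The α-expansion transformation function referred to here, in one common convention (t_i \in \{0,1\} is the binary move variable; the opposite convention for which value means "switch to α" also appears in the literature):

T_\alpha(x_i, t_i) = \begin{cases} \alpha & \text{if } t_i = 1 \\ x_i & \text{if } t_i = 0 \end{cases}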
Inference using Moves Label indicator functions Co-occurrence representation
Inference using Moves Move Energy Cost of current label set
Inference using Moves Move Energy: decomposition into α-dependent and α-independent parts
Inference using Moves Move Energy: decomposition into α-dependent and α-independent parts – after the move, only α or labels already present in the image can occur
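Spelled out (a sketch consistent with the slide, not a quote): an expansion move t can only add the label α and remove old labels, so

L\big(T_\alpha(\mathbf{x}, \mathbf{t})\big) = \{\, \alpha : \exists i,\ t_i = 1 \,\} \;\cup\; \{\, l \in L(\mathbf{x}) : \exists i,\ x_i = l \ \text{and} \ t_i = 0 \,\}

The first part depends only on whether α is introduced (α-dependent); the second records which of the current labels survive the move (α-independent).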
Inference using Moves Move Energy: submodular part + non-submodular part
Inference Move Energy (non-submodular): the non-submodular energy is overestimated by E'(t), with E'(t) = E(t) for the current solution and E'(t) ≥ E(t) for any other labelling
Occurrence cost – overestimation is tight
Co-occurrence cost – overestimation
General case [see the paper]
Quadratic representation
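Why overestimation still yields a valid move-making algorithm (the standard argument, spelled out): if \mathbf{0} denotes the identity move (nothing switches to α) and \mathbf{t}^* minimizes the surrogate E', then

E\big(T_\alpha(\mathbf{x}, \mathbf{t}^*)\big) \le E'(\mathbf{t}^*) \le E'(\mathbf{0}) = E(\mathbf{x})

so every accepted move can only decrease the true energy, even though E' is exact only at the current solution. "Quadratic representation" refers to writing the overestimate E' with at most pairwise terms in the move variables t, so each move can be solved with a single graph cut.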
Application: Object Segmentation – standard MRF model for object segmentation plus label-based costs, i.e. a cost defined over the assigned labels L(x)
Training of label-based potentials: indicator variables for the occurrence of each label; label set costs approximated by a 2nd-order representation
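A minimal sketch of how label-set costs of this 2nd-order form could be fit from training data. The function names and the choice of negative-log (co-)occurrence frequencies are illustrative assumptions, not the paper's exact training procedure:

import numpy as np

def fit_label_set_costs(label_sets, num_labels, eps=1e-6):
    """Fit a 2nd-order approximation
        C(L) ~ sum_{l in L} c1[l] + sum_{l<m in L} c2[l, m]
    from training images, where each image contributes the set of labels that
    occur in its ground-truth segmentation.

    label_sets : list of sets of label indices, one per training image.
    Costs are negative log empirical frequencies (an illustrative choice).
    """
    n_img = float(len(label_sets))
    occ = np.zeros(num_labels)                 # how often each label occurs
    cooc = np.zeros((num_labels, num_labels))  # how often pairs of labels co-occur
    for labels in label_sets:
        for l in labels:
            occ[l] += 1
            for m in labels:
                cooc[l, m] += 1
    occ /= n_img
    cooc /= n_img

    # Unary label costs: rare labels are expensive to switch on.
    c1 = -np.log(occ + eps)
    # Pairwise label costs: pairs that co-occur less often than independence
    # would predict are penalised (negative pointwise mutual information).
    c2 = -np.log((cooc + eps) / (np.outer(occ, occ) + eps))
    np.fill_diagonal(c2, 0.0)
    return c1, c2

def label_set_cost(labels, c1, c2):
    """Evaluate the approximated co-occurrence cost C(L) for a label set."""
    labels = sorted(labels)
    cost = sum(c1[l] for l in labels)
    cost += sum(c2[l, m] for i, l in enumerate(labels) for m in labels[i + 1:])
    return cost

# Toy example: 3 training images over 4 labels {0: grass, 1: sky, 2: cow, 3: boat}.
c1, c2 = fit_label_set_costs([{0, 1, 2}, {0, 1}, {1, 3}], num_labels=4)
print(label_set_cost({0, 1, 2}, c1, c2))  # common combination -> low cost
print(label_set_cost({2, 3}, c1, c2))     # cow + boat never co-occur -> high cost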
Experiments – Methods: Segment CRF; Segment CRF + Co-occurrence Potential; Associative HCRF [Ladický et al. '09]; Associative HCRF + Co-occurrence Potential. Datasets: MSRC-21 (591 images, 21 classes, 50% training / 50% test) and PASCAL VOC 2009 (1499 images, 21 classes, 50% training / 50% test).
MSRC - Qualitative
VOC 2010 - Qualitative
Quantitative Results MSRC-21 PASCAL VOC 2009
Summary and further work: incorporated label-based potentials in CRFs; proposed feasible inference. Open questions – optimal training method for co-occurrence; bounds of graph cut based inference. Questions?