Higher-order and/or non-submodular optimization. Yuri Boykov (Western University, Canada), jointly with O. Veksler, A. Delong, L. Gorelick, C. Nieuwenhuis, E. Toppe, I. Ben Ayed, H. Isack, A. Osokin, M. Tang, C. Olsson. Tutorial on Medical Image Segmentation: Beyond Level-Sets, MICCAI 2014.

Basic segmentation energy E(S): segment appearance (linear / unary terms) plus boundary smoothness (quadratic / pairwise terms). Such second-order functions can be minimized exactly via graph cuts [Greig et al. '91, Sullivan '94, Boykov-Jolly '01]. [figure: s-t graph with terminals s and t, n-links, t-links, and a cut]

Basic segmentation energy E(S): segment appearance (linear / unary terms) plus boundary smoothness (quadratic / pairwise terms). Submodular second-order functions can be minimized exactly via graph cuts [Hammer 1968, Picard & Ratliff 1973, Greig et al. '91, Sullivan '94, Boykov-Jolly '01]. [figure: s-t graph with terminals s and t, n-links, t-links, and a cut]
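
To make the graph-cut claim concrete, here is a minimal sketch of minimizing such a basic energy (unary appearance terms plus a 4-connected Potts smoothness term) with the PyMaxflow library; the library choice, the toy image, and all parameter values are assumptions made for illustration, not something prescribed by the tutorial.

```python
# Minimal sketch, assuming the PyMaxflow library and a grayscale image `img` in [0, 1].
import numpy as np
import maxflow

img = np.random.rand(64, 64)                  # placeholder image
lam = 2.0                                     # pairwise weight w_pq (>= 0 keeps the energy submodular)

d_obj = -np.log(np.clip(img, 1e-6, 1))        # unary cost of labeling a pixel "object" (bright model)
d_bkg = -np.log(np.clip(1 - img, 1e-6, 1))    # unary cost of labeling a pixel "background"

g = maxflow.Graph[float]()
nodes = g.add_grid_nodes(img.shape)
g.add_grid_edges(nodes, lam)                  # n-links: 4-connected Potts smoothness
g.add_grid_tedges(nodes, d_obj, d_bkg)        # t-links: unary terms (PyMaxflow convention)
g.maxflow()                                   # one max-flow call = exact global minimum
segmentation = g.get_grid_segments(nodes)     # boolean mask: sink side of the cut (the "object" label here)
```

Because the pairwise weight is non-negative, every pairwise term satisfies the submodularity condition discussed on the following slides, which is exactly why a single max-flow computation yields the global minimum.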

Submodular set functions. Any binary segmentation functional E(S) is a set function E: 2^Ω → R, defined on subsets S ⊆ Ω of the pixel set Ω.

A set function E is submodular if for any S, T ⊆ Ω: E(S) + E(T) ≥ E(S ∪ T) + E(S ∩ T) [Edmonds 1970, for arbitrary lattices]. Significance: any submodular set function can be globally optimized in polynomial time [Grötschel et al. 1981, 88; Schrijver 2000].

Submodular set functions: an equivalent, intuitive interpretation via "diminishing returns". A set function E is submodular if for any S ⊆ T ⊆ Ω and any element v ∉ T: E(S ∪ {v}) − E(S) ≥ E(T ∪ {v}) − E(T). This follows easily from the previous definition. Significance: any submodular set function can be globally optimized in polynomial time [Grötschel et al. 1981, 88; Schrijver 2000].

Submodular set functions. Assume a set Ω and a 2nd-order (quadratic) function of the indicator variables s_p ∈ {0,1}, with s_p = 1 iff p ∈ S: E(S) = Σ_p θ_p(s_p) + Σ_{pq} θ_pq(s_p, s_q). The function E(S) is submodular if every pairwise term satisfies θ_pq(0,0) + θ_pq(1,1) ≤ θ_pq(0,1) + θ_pq(1,0) [Boros & Hammer 2000, Kolmogorov & Zabih 2003]. Significance: a submodular 2nd-order boolean (set) function can be globally optimized in polynomial time by graph cuts [Hammer 1968, Picard & Ratliff 1973].
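
As a small illustration of the condition above, the following sketch checks whether every pairwise term of a quadratic pseudo-boolean energy satisfies θ_pq(0,0) + θ_pq(1,1) ≤ θ_pq(0,1) + θ_pq(1,0); the data layout (a dict of 2×2 tables) is an assumption made for this example, not a format used in the tutorial.

```python
import numpy as np

def is_submodular(pairwise_terms):
    """pairwise_terms: dict mapping an edge (p, q) to a 2x2 array t,
    where t[a, b] = theta_pq(a, b) for labels a, b in {0, 1}."""
    for (p, q), t in pairwise_terms.items():
        if t[0, 0] + t[1, 1] > t[0, 1] + t[1, 0]:
            return False          # this term violates submodularity
    return True

# A Potts term w*[s_p != s_q] is submodular iff w >= 0 (cf. the w_pq < 0 remark a few slides later).
potts = lambda w: np.array([[0.0, w], [w, 0.0]])
print(is_submodular({(0, 1): potts(+2.0)}))   # True
print(is_submodular({(0, 1): potts(-2.0)}))   # False (non-submodular)
```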

Global optimization: is submodularity for combinatorial optimization what convexity is for continuous optimization?

Global optimization: submodularity vs. convexity. EXAMPLE: f(x, y) = λ·xy for λ ≤ 0. On binary arguments it satisfies f(0,0) + f(1,1) ≤ f(0,1) + f(1,0), so it is a submodular binary energy, and it admits a convex continuous extension. [figure: surface plots of the binary energy and its continuous extension over the unit square, corners A, B, C, D]

Global optimization: submodularity vs. convexity. The same EXAMPLE: f(x, y) = λ·xy for λ ≤ 0 satisfies f(0,0) + f(1,1) ≤ f(0,1) + f(1,0), so it is a submodular binary energy, yet it also admits a concave continuous extension. [figure: surface plots over the unit square, corners A, B, C, D]
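
The question mark on these slides is usually answered through the Lovász extension: a set function is submodular exactly when its Lovász extension is convex [Lovász '83, cited later in this deck]. The sketch below computes the Lovász extension of a small set function and numerically probes midpoint convexity; the construction is standard, but it is an addition to the slides, not something shown in them.

```python
import numpy as np

def lovasz_extension(F, x):
    """Lovász extension of a set function F (dict frozenset -> value, with F[frozenset()] == 0)
    evaluated at a point x in [0,1]^n."""
    order = np.argsort(-np.asarray(x))            # sort coordinates in decreasing order
    value, prev = 0.0, frozenset()
    for k in order:
        cur = prev | {int(k)}
        value += x[k] * (F[cur] - F[prev])        # marginal gain weighted by x
        prev = cur
    return value

# Submodular example on two elements: F(S) = lam * [S == {0,1}] with lam <= 0.
lam = -1.0
F = {frozenset(): 0.0, frozenset({0}): 0.0, frozenset({1}): 0.0, frozenset({0, 1}): lam}

# Probe midpoint convexity on random pairs of points in the unit square.
rng = np.random.default_rng(0)
ok = all(
    lovasz_extension(F, (a + b) / 2) <= 0.5 * (lovasz_extension(F, a) + lovasz_extension(F, b)) + 1e-9
    for a, b in (rng.random((2, 2)) for _ in range(1000))
)
print("midpoint convexity holds on all sampled pairs:", ok)
```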

Assume a Gibbs distribution P(S) ∝ exp(−E(S)) over binary random variables s_p, as in posterior optimization for Markov Random Fields. Theorem [Boykov, Delong, Kolmogorov, Veksler; unpublished book, 2014?]: all pairs of random variables s_p are positively correlated if and only if the set function E(S) is submodular. That is, submodularity implies an MRF with a "smoothness" prior.

Second-order submodular energy E(S) = Σ_p D_p(s_p) + Σ_{pq} w_pq·[s_p ≠ s_q]: segment region/appearance (unary terms) plus boundary smoothness (pairwise terms). Submodularity requires w_pq ≥ 0; a weight w_pq < 0 would imply non-submodularity.

Now, more difficult energies: non-submodular energies and high-order energies.

Beyond submodularity: even second order is challenging. Example: deconvolution of an image I blurred with a mean kernel. Expanding the quadratic data term produces both submodular and non-submodular pairwise terms: the cross-products s_q·s_r enter with positive coefficients, which makes those terms supermodular. Compared solvers: QPBO, TRWS, "partial enumeration".

Beyond submodularity: even second order is challenging. Example: a quadratic volumetric prior, e.g. (|S| − V_0)^2 for a target volume V_0. NOTE: any convex cardinality potential is supermodular (not submodular) [Lovász '83].
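
To see why such a prior is supermodular, here is a short worked expansion for the quadratic case; the target volume V_0 is just a placeholder name, and the argument is the standard one behind the Lovász reference above.

```latex
(|S| - V_0)^2 \;=\; \Big(\sum_p s_p - V_0\Big)^2
           \;=\; (1 - 2V_0)\sum_p s_p \;+\; 2\sum_{p<q} s_p s_q \;+\; V_0^2
\qquad (\text{using } s_p^2 = s_p \text{ for } s_p \in \{0,1\}).
```

Each cross term 2·s_p·s_q has θ(0,0) + θ(1,1) = 2 > 0 = θ(0,1) + θ(1,0), violating the submodularity inequality from the earlier slide; the same argument extends to any convex cardinality potential.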

Many useful higher-order energies: cardinality potentials, curvature of the boundary, shape convexity, segment connectivity, appearance entropy / color consistency, distribution consistency, high-order shape moments, …

Optimization of non-submodular and/or high-order functions E(S) is a very active area of research: QPBO [survey by Kolmogorov & Rother, 2007], LP relaxations [e.g. Schlesinger, Komodakis, Kolmogorov, Savchynskyy, …], message passing such as TRWS [Kolmogorov], partial enumeration [Olsson & Boykov, 2013], trust region [e.g. Gorelick et al., 2013], bound optimization [Bilmes et al. 2006, Ben Ayed et al. 2013, Tang et al. 2014], submodularization [e.g. Gorelick et al., 2014].

Prior approximation techniques are based on linearization: gradient descent + level sets (in the continuous case), LP relaxations (QPBO, TRWS, etc.), local linear approximations (parallel ICM). Our recent methods use local submodular approximations (submodularization): Fast Trust Region [CVPR 13], Auxiliary Cuts [CVPR 13], LSA [CVPR 14], Pseudo-bounds [ECCV 14], [Bilmes et al. 2009].

Trust Region approximation. The energy combines submodular terms (appearance log-likelihoods, boundary length) with a non-submodular term (a volume constraint on |S|). The non-submodular term is replaced by its linear approximation at the current solution S_0, giving a submodular approximation that is trusted only within a trust region, i.e. within a bounded L2 distance to S_0.
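
The slides do not give pseudocode, so the following toy loop is only a schematic sketch in the spirit of Fast Trust Region [CVPR 13]: the energy, the image size, and the trust-region control rule are illustrative assumptions, and plain enumeration stands in for the graph cut that would solve the submodular subproblem in practice.

```python
import itertools
import numpy as np

# Toy trust-region loop for E(S) = <unary, S> + lam * boundary(S) + (|S| - V0)^2.
# The quadratic volume term is non-submodular; each iteration replaces it by its linear
# approximation around the current solution S0 and minimizes the resulting submodular
# energy within a trust region around S0 (here by enumeration, since the image is 3x3).
H, W = 3, 3
rng = np.random.default_rng(1)
unary = rng.normal(0, 1, (H, W))          # appearance log-likelihood ratios (illustrative)
lam, V0 = 0.3, 4                          # smoothness weight and target volume

def boundary(S):                          # number of 4-connected label discontinuities
    return np.abs(np.diff(S, axis=0)).sum() + np.abs(np.diff(S, axis=1)).sum()

def energy(S):                            # full (non-submodular) energy
    return (unary * S).sum() + lam * boundary(S) + (S.sum() - V0) ** 2

def approx_energy(S, S0, d):              # submodular approximation within the trust region
    g = 2.0 * (S0.sum() - V0)             # gradient of the volume term at S0 (same for all pixels)
    dist = np.abs(S - S0).sum()           # Hamming distance = squared L2 distance for binary S
    if dist > d:
        return np.inf
    return (unary * S).sum() + lam * boundary(S) + g * S.sum()

def minimize_approx(S0, d):               # exact minimization (enumeration stands in for a graph cut)
    best, best_val = S0, np.inf
    for bits in itertools.product([0, 1], repeat=H * W):
        S = np.array(bits, float).reshape(H, W)
        v = approx_energy(S, S0, d)
        if v < best_val:
            best, best_val = S, v
    return best

S, d = np.zeros((H, W)), 4.0
for _ in range(10):
    S_new = minimize_approx(S, d)
    if energy(S_new) < energy(S):         # accept the move and grow the trust region
        S, d = S_new, d * 2
    else:                                 # reject the move and shrink the trust region
        d = d / 2
        if d < 1:
            break
print("final volume:", int(S.sum()), "energy:", round(energy(S), 3))
```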

Volume constraint for vertebrae segmentation: log-likelihood + length. [figure: segmentation results]

Trust region vs. bound optimization vs. gradient descent and level sets: all are fast iterative techniques; the sub-problems of the first two can be solved with a convex continuous formulation or with graph cuts (a submodular discrete formulation).

Trust Region vs. Gradient Descent [Gorelick et al., CVPR 2013]. [figure: iso-surfaces of the approximation and of the energy E; the trust region ||S − S_0|| ≤ d and the gradient direction]

Trust Region vs. Gradient Descent [e.g. Gorelick et al., CVPR 2013]. Within the trust region ||S − S_0|| ≤ d the approximation can be solved globally with graph cuts or a convex relaxation. [figure: iso-surfaces of the approximation and of the energy E, with the gradient direction]

Bound optimization (Majorize-Minimize, auxiliary function, surrogate function) [e.g. Ben Ayed et al., CVPR 2013]. [figure: over the solution space, a bound A_t(S) touches E(S) at S_t; minimizing it yields S_{t+1} with E(S_{t+1}) ≤ E(S_t)]

Bound optimization, continued. [figure: a new bound A_{t+1}(S) built at S_{t+1} yields S_{t+2}; the iterations monotonically decrease E and converge to a local minimum]

Bound optimization (Majorize-Minimize, auxiliary function, surrogate function) [e.g. Ben Ayed et al., CVPR 2013]: each bound A_t(S) can be minimized globally with graph cuts or a convex relaxation.
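
A generic majorize-minimize loop, to make the picture concrete. The toy energy below (unary costs plus a concave cardinality term) admits a valid auxiliary function by linearizing the concave part at the current solution, since a concave function lies below its tangents; this bound happens to be unary and is minimized by thresholding, whereas in the papers cited on the slide the bounds are submodular and minimized by graph cuts. All functions and parameters here are illustrative assumptions.

```python
import numpy as np

# Toy bound optimization for E(S) = <u, S> + h(|S|) with h concave. Since a concave h lies
# below its tangents, A_t(S) = <u, S> + h(|S_t|) + h'(|S_t|) * (|S| - |S_t|) is an auxiliary
# function: A_t(S) >= E(S) for all S, and A_t(S_t) = E(S_t). Minimizing A_t therefore never
# increases E (the "monotone decrease" property of bound optimization).
rng = np.random.default_rng(0)
u = rng.normal(0, 1, 100)                     # unary costs (illustrative)
h = lambda v: 10.0 * np.sqrt(v)               # concave cardinality term
h_prime = lambda v: 5.0 / np.sqrt(max(v, 1))  # its derivative (guarded at v = 0)

def energy(S):
    return u @ S + h(S.sum())

S = np.ones_like(u)                           # start from the full set
for t in range(20):
    slope = h_prime(S.sum())                  # tangent slope of h at the current volume
    S_new = (u + slope < 0).astype(float)     # exact minimizer of the unary bound A_t
    if energy(S_new) >= energy(S) - 1e-12:    # stop at a fixed point
        break
    S = S_new
print("final |S| =", int(S.sum()), " E =", round(energy(S), 3))
```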

Pseudo-bound optimization (Majorize-Minimize, auxiliary function, surrogate function) [Tang et al., ECCV 2014]: relax the bound requirement in order to make larger moves than a true bound allows. [figure: (a)-(c) pseudo-bounds around S_t in the solution space, with E(S_t) and E(S_{t+1})]
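
In the cited paper the bound requirement is relaxed to a parametric family of pseudo-bounds, explored efficiently with parametric max-flow, and the candidate with the lowest actual energy is kept. The sketch below reuses the toy energy from the previous snippet and only illustrates that selection logic; the family of slopes is an arbitrary illustrative choice, not the parameterization used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.normal(0, 1, 100)                     # same toy energy as in the previous sketch
h = lambda v: 10.0 * np.sqrt(v)
energy = lambda S: u @ S + h(S.sum())

def pseudo_bound_step(S_t, slopes):
    """Try a family of linear 'pseudo-bounds' (one per slope) around S_t and
    return the candidate with the lowest *actual* energy E."""
    candidates = [(u + a < 0).astype(float) for a in slopes]   # exact minimizer per pseudo-bound
    return min(candidates, key=energy)

S = np.ones_like(u)
for _ in range(20):
    S_new = pseudo_bound_step(S, slopes=np.linspace(0.0, 10.0, 50))
    if energy(S_new) >= energy(S) - 1e-12:
        break
    S = S_new
print("final |S| =", int(S.sum()), " E =", round(energy(S), 3))
```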

An example illustrating the difference between the three frameworks: Gradient Descent. [figure panels: initialization, adding volume, log-likelihood]

An example illustrating the difference between the three frameworks: Trust Region. [figure panels: initialization, adding volume, log-likelihood]

An example illustrating the difference between the three frameworks: Bound Optimization. [figure panels: initialization, adding volume, log-likelihood]

An example illustrating the difference between the three frameworks: Pseudo-Bound Optimization. [figure panels: initialization, adding volume, log-likelihood]

Summary of main differences/similarities:
Framework | Properties | Applicability
Gradient Descent (e.g. level sets) | fixed small moves; linear approximation | any differentiable functional
Trust Region | adaptive large moves; submodular or convex approximation | any differentiable functional
Bound Optimization | unlimited large moves; submodular or convex bounds | needs a bound (not always easy)

Trust Region vs. Level Sets: experimental comparisons. The level-set baseline is "Level set evolution without re-initialization" [C. Li et al., CVPR 2005]: it keeps the level-set function close to a distance function by adding a penalty term of the form ∫ ½(|∇φ| − 1)² dx, requires no ad hoc re-initialization procedures, allows relatively large time steps dt, and is frequently used by the community (>1500 citations).

Trust Region vs. Level Sets, Example 1: compactness (circularity) prior [Ben Ayed et al., MICCAI 14]. The prior is minimal for a circle. [figure: prior values from low to high]

Trust Region vs. Level Sets, Example 1: compactness (circularity) prior. [figure: trust-region results with and without the compactness term]

Trust Region vs. Level Sets, Example 2: volume constraint + length. [figure: discrete vs. continuous results]

Trust Region vs. Level Sets, Example 2: volume constraint + length. [figure panels: initialization; level sets at t = 1, 5, 10, 50, 1000; FTR with α = 10, 5, 2]

Trust Region vs. Level Sets, Example 2: volume constraint + length. [figure]

Trust Region vs. Level Sets, Example 3: shape-moment constraint + length; moments up to order 2 are learned from a user-provided ellipse. [figure panels: initialization; level sets at t = 1, 5, 10, 50, 1000; FTR with α = 10, 5, 2]

Trust Region vs. Level Sets, Example 3: shape-moment constraint + length. [figure panels: level sets at t = 1, 5, 10, 50, 1000]

Trust Region vs. Level Sets, Example 3: shape-moment constraint + length. [figure panels: level sets at t = 1 and t = 1000]

Bound Optimization vs. Level Sets: L2 bin-count constraint. [figure panels: initialization; surrogate; level sets with dt = 1, 50, 1000]

Bound Optimization vs. Level Sets: L2 bin-count constraint. [figure]

Entropy-based segmentation: interactive segmentation with a box. The energy combines volume balancing and color-consistency (appearance entropy) terms over the segment sizes |S| and color-bin counts |S_i|, which are high-order and non-submodular, with a submodular boundary-smoothness term.
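
One of the baselines compared on the next slide is GrabCut-style block-coordinate descent, which alternates between fitting the two appearance models and re-solving a submodular graph-cut problem with the models fixed. Below is a minimal sketch of that loop with histogram color models over a grayscale image; the helper names, the use of PyMaxflow, and all parameters are illustrative assumptions, not code from the tutorial.

```python
import numpy as np
import maxflow

def fit_histograms(img, seg, bins=32):
    """Fit normalized intensity histograms for the current foreground / background."""
    edges = np.linspace(0, 1, bins + 1)
    fg, _ = np.histogram(img[seg], bins=edges)
    bg, _ = np.histogram(img[~seg], bins=edges)
    fg = (fg + 1.0) / (fg.sum() + bins)          # add-one smoothing
    bg = (bg + 1.0) / (bg.sum() + bins)
    idx = np.clip(np.digitize(img, edges) - 1, 0, bins - 1)
    return -np.log(fg[idx]), -np.log(bg[idx])    # per-pixel negative log-likelihoods

def segment(img, box, lam=1.0, iters=5):
    """GrabCut-style block-coordinate descent (sketch): alternate model fitting and graph cut."""
    seg = np.zeros(img.shape, bool)
    seg[box] = True                              # initialize foreground from the user box
    for _ in range(iters):
        nll_fg, nll_bg = fit_histograms(img, seg)
        g = maxflow.Graph[float]()
        nodes = g.add_grid_nodes(img.shape)
        g.add_grid_edges(nodes, lam)             # submodular boundary smoothness
        g.add_grid_tedges(nodes, nll_fg, nll_bg) # unary appearance terms (PyMaxflow convention)
        g.maxflow()
        seg = g.get_grid_segments(nodes)         # sink side of the cut = current foreground
    return seg

img = np.random.rand(48, 48)                     # placeholder image
print(segment(img, box=(slice(12, 36), slice(12, 36))).sum(), "pixels in the final segment")
```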

Entropy-based segmentation, runtime comparison (1.8 sec / 1.3 sec / 11.9 sec / 576 sec): Pseudo-Bound optimization [ECCV 2014], One-Cut [ICCV 2013], GrabCut (block-coordinate descent), Dual Decomposition. [figure: segmentation results with per-method runtimes]

Curvature optimization instead of boundary length: length is a pairwise submodular term, while curvature is a 3rd-order non-submodular potential (see next slide).

General intuition [Nieuwenhuis et al., CVPR 2014]: consider 3-cliques (p−ε, p, p+ε) along straight lines through each pixel p; configurations (0,1,0) and (1,0,1) across the segment boundary produce more responses where the curvature of the boundary is higher. [figure: example segment S with the 3-clique responses]

So we need to penalize the 3-clique configurations (0,1,0) and (1,0,1); this 3rd-order non-submodular potential is optimized using a submodular approximation and Trust Region [CVPR 13, CVPR 14] (Nieuwenhuis et al., CVPR 2014).
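
As a concrete reading of the two slides above, the sketch below simply counts the penalized (0,1,0) and (1,0,1) configurations of triples (p−ε, p, p+ε) over a binary mask for a few straight-line offsets ε. It only evaluates the 3rd-order penalty on a given segmentation (optimizing it is the hard part addressed by the trust-region machinery), and the offsets and the test shapes are illustrative assumptions rather than the exact neighborhood system of Nieuwenhuis et al.

```python
import numpy as np

def curvature_penalty(mask, offsets=((0, 1), (1, 0), (1, 1), (1, -1))):
    """Count 3-clique configurations (0,1,0) and (1,0,1) along lines p-eps, p, p+eps.
    Uses np.roll, so borders wrap around (harmless when the mask's border is empty)."""
    m = mask.astype(bool)
    total = 0
    for dy, dx in offsets:
        fwd = np.roll(np.roll(m, -dy, axis=0), -dx, axis=1)   # label at p + eps
        bwd = np.roll(np.roll(m, dy, axis=0), dx, axis=1)     # label at p - eps
        total += int(np.sum(~fwd & m & ~bwd))                 # configuration (0,1,0)
        total += int(np.sum(fwd & ~m & bwd))                  # configuration (1,0,1)
    return total

yy, xx = np.mgrid[:40, :40]
disc = (yy - 20) ** 2 + (xx - 20) ** 2 < 14 ** 2              # smooth, low-curvature boundary
rng = np.random.default_rng(0)
noisy = disc ^ (rng.random(disc.shape) < 0.05)                # isolated flips: very high curvature
print("smooth disc:", curvature_penalty(disc), " noisy disc:", curvature_penalty(noisy))
```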

Segmentation examples: length-based regularization.

Segmentation examples: elastica [Heber, Ranftl, Pock, 2012].

Segmentation examples: 90-degree curvature [El-Zehiry & Grady, 2010].

Segmentation examples: our squared curvature.

Take-home messages: submodular or convex approximations (Trust Region, Auxiliary Functions, Pseudo-bounds, etc.) [CVPR 2013, CVPR 2014, ECCV 2014] are a good alternative to linearization methods (Level-Sets, QPBO, TRWS, etc.); code is available.