Speeding Up MRF Optimization using Graph Cuts for Computer Vision
Vibhav Vineet
Adviser: Prof. P. J. Narayanan
Labelling Problem
- Pixel labeling: assigning a label to each pixel in the image.
- Image segmentation: separating the foreground layer from the background layer (example: Flower image and the extracted foreground object).
- Stereo correspondence: calculating a depth (disparity) map using left and right images (example: left Tsukuba image and its disparity map).
- Image denoising: assigning a denoised intensity value to each pixel in the image (example: noisy House image and the denoised image).
Labelling Problem - MAP Estimation
- Goal: find the best possible configuration, evaluated using joint or conditional probabilities.
- The complexity increases with the number of variables/pixels and with the number of labels in the label set.
- Exact evaluation is very hard with limited computation and memory.
- Energy minimization methods, through the MAP-MRF equivalence, provide approximate solutions in moderate time.
- In computer vision, an energy function generally involves unary costs and pairwise interactions between variables.
Labelling Problem: Image-Graph Equivalence
- The image maps to a graph G(V, E).
- Unary cost (per-vertex cost): in the example shown, the cost of assigning “fg” is low and the cost of assigning “bg” is high.
- Pairwise cost (per-edge cost): the cost of assigning the “same label” to both endpoints is low; the cost of assigning “different labels” is high.
- Total cost = unary cost + pairwise cost. Different labelings have different costs.
- Energy E(X) = unary potential + pairwise potential.
- Labeling problem: find a labeling X with minimum cost (energy value).
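To make the cost decomposition concrete, here is a minimal sketch that evaluates E(X) for two labelings of a three-pixel strip. The Potts-style pairwise term (a fixed penalty whenever two neighbors disagree), the function name `labeling_energy`, and all numeric costs are illustrative assumptions, not values from the slides.

```python
def labeling_energy(unary, labels, edges, pairwise_weight):
    """Energy of a labeling: sum of unary costs plus a Potts pairwise cost.

    unary[v][l] is the cost of giving vertex v the label l.
    edges is a list of (u, v) neighbor pairs.
    pairwise_weight is paid once for every edge whose endpoints disagree
    (an assumed Potts model, chosen only for illustration).
    """
    unary_cost = sum(unary[v][label] for v, label in enumerate(labels))
    pairwise_cost = sum(pairwise_weight for u, v in edges if labels[u] != labels[v])
    return unary_cost + pairwise_cost


# Three pixels in a row; label 0 = "bg", label 1 = "fg" (hypothetical costs).
unary = [[0, 5], [4, 1], [3, 2]]
edges = [(0, 1), (1, 2)]

energy_mixed = labeling_energy(unary, [0, 1, 1], edges, pairwise_weight=2)
energy_all_bg = labeling_energy(unary, [0, 0, 0], edges, pairwise_weight=2)
print(energy_mixed, energy_all_bg)
```

The mixed labeling pays one pairwise penalty but saves on unary cost, which is the trade-off the minimization has to resolve.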
MAP-MRF Formulation
- MAP(X) <=> Min Energy(X*): the MAP estimate of a configuration X is equivalent to the minimum of the energy defined over the configuration.
- Energy = Data Term + Smoothness Term.
Graph Cuts in Computer Vision
- Pipeline: Image -> Energy Function -> Graph Construction -> st-MinCut.
Image-Graph Equivalence: Graph Construction
- One vertex per pixel, plus the two terminal vertices s and t.
- Add n-edges between neighboring pixels and t-edges between pixels and the terminals.
- The cut separates background pixels from foreground pixels.
- Graphs constructed for vision problems are grid graphs with low connectivity; connectivity is limited to 4, 8, or 27.
The st-Mincut Problem
- Given a graph G(V, E, W) and two vertices s and t.
- Partition G into two disjoint components, one containing s and the other containing t, such that the sum of the weights of the edges going from the s component to the t component is minimum.
Computing the st-Mincut
- Solve the dual maximum flow problem: in every network, the value of the maximum flow equals the cost of the st-mincut.
- Two approaches: the Edmonds-Karp augmenting path method and Goldberg’s push-relabel method.
Edmonds-Karp Method
1: Initialize the flow in G to 0.
2: Find a shortest path from s to t in the residual graph.
3: Augment the flow along that path by the minimum residual capacity on it.
4: Repeat steps 2 and 3 while a path from s to t exists.
- Constraints: flow <= edge capacity; edge capacity must be positive.
- Worked example (animation frames): the current flow grows from 0 to 3, then to 10, and ends at 16.
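The steps above can be sketched as a short sequential program. This is a minimal illustration on a dense capacity matrix, not the implementation used in this work; the function name `edmonds_karp` and the four-vertex example graph are assumptions made for the example.

```python
from collections import deque


def edmonds_karp(capacity, s, t):
    """Return (max_flow, history), where history lists the flow after each augmentation."""
    n = len(capacity)
    residual = [row[:] for row in capacity]  # residual capacities, updated in place
    flow = 0
    history = []
    while True:
        # Breadth-first search finds a shortest s-t path in the residual graph.
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            return flow, history  # no augmenting path is left
        # Bottleneck = minimum residual capacity along the path.
        bottleneck = float("inf")
        v = t
        while v != s:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Augment: reduce forward capacities, increase reverse capacities.
        v = t
        while v != s:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck
        history.append(flow)


# Hypothetical graph: vertex 0 = s, vertex 3 = t.
capacity = [
    [0, 3, 2, 0],
    [0, 0, 1, 2],
    [0, 0, 0, 3],
    [0, 0, 0, 0],
]
max_flow, history = edmonds_karp(capacity, 0, 3)
print(max_flow, history)
```

The `history` list plays the same role as the "Current Flow" counter in the animation: it records the total flow after every augmentation.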
Goldberg’s Push-Relabel Algorithm
1: Initialize excess flow and heights in G.
2: Perform an applicable push or relabel operation.
3: Repeat step 2 while an applicable push or relabel operation exists.
- Constraints: flow <= edge capacity; edge capacity must be positive.
- Worked example (animation frames, with vertex heights shown): the current flow grows from 0 to 9 and ends at 16.
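For comparison with the augmenting path method, the following is a minimal sequential sketch of the generic push-relabel loop. It is not the CUDA formulation discussed later; the function name `push_relabel`, the dense matrix representation, and the example graph are assumptions made for illustration.

```python
def push_relabel(capacity, s, t):
    """Generic (sequential) push-relabel; returns the value of the maximum flow."""
    n = len(capacity)
    residual = [row[:] for row in capacity]
    height = [0] * n
    excess = [0] * n

    # Initialization: the source sits at height n and saturates its outgoing edges.
    height[s] = n
    for v in range(n):
        if residual[s][v] > 0:
            amount = residual[s][v]
            residual[s][v] = 0
            residual[v][s] += amount
            excess[v] += amount

    while True:
        active = [v for v in range(n) if v not in (s, t) and excess[v] > 0]
        if not active:
            break  # no applicable push or relabel remains
        u = active[0]
        pushed = False
        for v in range(n):
            # Push is allowed only along residual edges going exactly one level down.
            if excess[u] > 0 and residual[u][v] > 0 and height[u] == height[v] + 1:
                amount = min(excess[u], residual[u][v])
                residual[u][v] -= amount
                residual[v][u] += amount
                excess[u] -= amount
                excess[v] += amount
                pushed = True
        if not pushed:
            # Relabel: one more than the lowest neighbor reachable in the residual graph.
            height[u] = 1 + min(height[v] for v in range(n) if residual[u][v] > 0)
    return excess[t]


# Hypothetical graph: vertex 0 = s, vertex 3 = t.
capacity = [
    [0, 3, 2, 0],
    [0, 0, 1, 2],
    [0, 0, 0, 3],
    [0, 0, 0, 0],
]
max_flow = push_relabel(capacity, 0, 3)
print(max_flow)
```

Both operations touch only a vertex and its neighbors, which is the locality that makes the method attractive for a thread-per-pixel mapping.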
Motivation
- Fast computation is required for robot navigation, surveillance, video processing, etc.
- Real-time video processing, for example for YouTube and other web servers.
- Large image processing: even off-the-shelf cameras take high resolution images.
- Interactive tools.
Mapping to CUDA
- The image grid maps to a CUDA grid made of CUDA blocks, with one thread per pixel.
Push-Relabel Algorithm on CUDA
- Push is a local operation: each node sends flow to its neighbors.
- Relabel is also a local operation: each vertex updates its own height.
- Problems faced: read-after-write consistency and synchronization of threads.
Handling Problems using Atomics
- With atomic operations, push can be performed without any read-after-write inconsistencies.
- Relabel is a per-vertex operation.
- Atomic capabilities allow the push and pull kernels to be combined, leaving a Push Kernel and a Relabel Kernel.
- This lowers global memory access; empirically, faster convergence is observed.
The Push Kernel
1: Load heights from global memory into shared memory.
2: Synchronize threads to ensure the load operation has completed.
3: Push flows to eligible neighbors atomically.
4: Update the edge weights in the residual graph atomically.
5: Update the excess flow in the residual graph atomically.
The Relabel Kernel
1: Load heights from global memory into shared memory.
2: Synchronize threads to ensure the load operation has completed.
3: Compute the minimum height over all neighbors in the residual graph and set the vertex's own height to that minimum plus one.
4: Write the new height to global memory.
Using Shared Memory
- Each CUDA block requires a (Block_size + 2) x (Block_size + 2) region of heights to be loaded into shared memory.
- Pixels on the block boundary, such as corner pixels, need heights from other blocks.
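The halo requirement can be illustrated on the CPU. The sketch below is a Python stand-in for the shared-memory load, not CUDA code: it gathers the (Block_size + 2) x (Block_size + 2) tile of heights that one block would stage. The function name `load_tile_with_halo` and the `pad_value` used outside the image are assumptions.

```python
def load_tile_with_halo(height, block_row, block_col, block_size, pad_value):
    """Return the (block_size + 2) x (block_size + 2) tile of heights for one block.

    The tile covers the block's own pixels plus a one-pixel border (halo) taken
    from neighboring blocks. Positions outside the image get pad_value (an
    assumed convention for this sketch).
    """
    rows, cols = len(height), len(height[0])
    first_row = block_row * block_size - 1  # one row above the block
    first_col = block_col * block_size - 1  # one column left of the block
    tile = []
    for r in range(first_row, first_row + block_size + 2):
        tile_row = []
        for c in range(first_col, first_col + block_size + 2):
            inside = 0 <= r < rows and 0 <= c < cols
            tile_row.append(height[r][c] if inside else pad_value)
        tile.append(tile_row)
    return tile


# 4x4 height map split into 2x2 blocks; fetch the tile of the bottom-right block.
height = [[r * 4 + c for c in range(4)] for r in range(4)]
tile = load_tile_with_halo(height, block_row=1, block_col=1, block_size=2, pad_value=-1)
for row in tile:
    print(row)
```

In the tile, the block's own pixels occupy the interior and the first row and column come from the neighboring blocks, which is why every thread can read all four neighbor heights without going back to global memory.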
Heuristics on Push and Relabel
- On grid graphs, global relabel (BFS based) is an expensive operation; local relabel performs better empirically.
- Multiple pushes can be performed before applying a relabel step, using the schedule (m*Push + Relabel)*k + Global Relabel.
- For most general graphs, m=3 and k=7 are found to be optimal.
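The schedule (m*Push + Relabel)*k + Global Relabel can be read as a fixed sequence of kernel launches. The sketch below simply expands that expression into an ordered list; the function name `kernel_schedule` and the operation strings are hypothetical labels, not names from the actual implementation.

```python
def kernel_schedule(m=3, k=7):
    """Expand (m*Push + Relabel)*k + Global Relabel into an ordered list of steps."""
    steps = []
    for _ in range(k):
        steps.extend(["push"] * m)  # m push passes
        steps.append("relabel")     # followed by one local relabel
    steps.append("global_relabel")  # one BFS-based global relabel at the end
    return steps


steps = kernel_schedule(m=3, k=7)
print(len(steps), steps[:5])
```

With m=3 and k=7, one round consists of 21 push passes, 7 local relabels, and a single global relabel.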
Stochastic Cuts
- An MRF consists of simple and difficult pixels.
- Simple pixels get their correct labels within the first few iterations.
- A few difficult pixels keep exchanging flow with their neighbors in later iterations.
- Stochastic Cuts processes pixels based on their activity, measured as the change in flow from the previous iteration to the current one.
- Simple pixels show low activity; heuristically, they are processed only after a fixed number of iterations.
Experimental Results
Image      Size      Time CPU (ms)  Time Non Atomic  Time Atomic  Stochastic
Sponge     640x480   142            28               16           11
Flower     608x456   188            33               26           -
Person     -         140            31               27           20
Synthetic  1Kx1K     655            19               10           7
(- marks a value that is absent from the slide; the Flower row lists only three timings, shown in the order given.)
Graph Reparameterization
- A constant is added to two edge capacities of the graph between S and t; in the example, the capacities 2 and 5 become 2+2 = 4 and 5+2 = 7.
- Running graph cuts on the original graph and on the reparameterized graph gives the same result: no change in cut.
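The "no change in cut" property can be checked by brute force on a tiny graph. The sketch below assumes the standard form of reparameterization, in which the same constant is added to both terminal edges of one vertex; the two-vertex graph, its capacities, and the function names are illustrative assumptions and are not a transcription of the figure.

```python
from itertools import product


def cut_cost(source_cap, sink_cap, edges, side):
    """Cost of the cut where side[i] is 0 if vertex i stays with s, 1 if it goes with t."""
    cost = 0
    for i, where in enumerate(side):
        # The s->i edge is cut when i is on the t side; the i->t edge otherwise.
        cost += source_cap[i] if where == 1 else sink_cap[i]
    for u, v, weight in edges:
        if side[u] == 0 and side[v] == 1:  # directed edge crossing from s side to t side
            cost += weight
    return cost


def best_cut(source_cap, sink_cap, edges):
    """Exhaustively find the minimum cut (only feasible for tiny graphs)."""
    n = len(source_cap)
    return min((cut_cost(source_cap, sink_cap, edges, side), side)
               for side in product((0, 1), repeat=n))


source_cap = [2, 9]            # capacities of s->0 and s->1
sink_cap = [5, 4]              # capacities of 0->t and 1->t
edges = [(0, 1, 1), (1, 0, 2)]

before = best_cut(source_cap, sink_cap, edges)
# Reparameterize vertex 0: add 2 to both of its terminal edges.
after = best_cut([2 + 2, 9], [5 + 2, 4], edges)
print(before, after)
```

Every cut severs exactly one of the two terminal edges of a vertex, so adding the same constant to both raises the cost of all cuts equally and leaves the minimizing partition unchanged.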
Dynamic Cuts
- Problem instance 1 (energy EA, solution SA) and problem instance 2 (energy EB, solution SB) differ only slightly.
- Solving each instance independently is computationally expensive.
- Example: consecutive frames in a video.
Dynamic Cuts: Steps Involved
- Edge capacities are updated and reparameterized using the previous frame's edge capacities (ci), the previous frame's flow and residual capacities after its st-MinCut (fi, ri), and the current frame's edge capacities (ci’).
- Update step: ri’ = ri + ci’ - ci. This gives an approximate cut based on the previous frame and its st-MinCut.
- Reparameterization step, applied when the updated residual of a terminal edge would become negative: rsi’ = 0, rit’ = cit - fit + fsi - csi’.
- A final st-MinCut of the current frame then yields the flow fi’.
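The two steps can be sketched for the pair of terminal edges of a single vertex. This is a minimal illustration under the assumption that reparameterization is triggered only when an updated residual turns negative; the function name `update_terminal_edges` and the numeric values are hypothetical.

```python
def update_terminal_edges(c_s, c_t, f_s, f_t, c_s_new, c_t_new):
    """Update and reparameterize the terminal-edge residuals of one vertex.

    c_s, c_t: previous capacities of the s->i and i->t edges.
    f_s, f_t: flows on those edges after the previous st-mincut.
    c_s_new, c_t_new: capacities of the same edges in the current frame.
    Returns the non-negative residuals (r_s_new, r_t_new) used to restart max-flow.
    """
    r_s = c_s - f_s
    r_t = c_t - f_t
    # Update step: r' = r + c' - c.
    r_s_new = r_s + c_s_new - c_s
    r_t_new = r_t + c_t_new - c_t
    # Reparameterization step: shift a negative residual onto the opposite
    # terminal edge, which adds the same constant to both terminal edges.
    if r_s_new < 0:
        r_t_new += -r_s_new
        r_s_new = 0
    if r_t_new < 0:
        r_s_new += -r_t_new
        r_t_new = 0
    return r_s_new, r_t_new


# Capacity of s->i drops from 5 to 2 while 4 units of flow were already routed through it.
decreased = update_terminal_edges(c_s=5, c_t=6, f_s=4, f_t=4, c_s_new=2, c_t_new=6)
# Capacity of s->i rises from 5 to 8: only the update step is needed.
increased = update_terminal_edges(c_s=5, c_t=6, f_s=4, f_t=4, c_s_new=8, c_t_new=6)
print(decreased, increased)
```

In the first case the result matches the slide's formula: rit’ = cit - fit + fsi - csi’ = 6 - 4 + 4 - 2 = 4, with rsi’ = 0. Each vertex is handled independently of the others.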
Dynamic Cuts are parallelizable
- Update and reparameterization are independent, parallelizable operations that work locally at every vertex.
- The st-Mincut is computed using a parallel implementation of the push-relabel algorithm.
Dynamic Cuts: Empirical Results
- Running time depends on the percentage of weights that changed.
- On a low resolution video, dynamic cuts takes about 2 ms, compared to 7 ms for a full st-MinCut on the same image.
- Figure: consecutive frames of a video segmented using dynamic cuts.
The Multilabeling Problem
- Multi-way cut on a general graph is an NP-hard problem for L > 2 labels.
- Approximate solutions based on graph cuts: α-Expansion and αβ-Swap.
The α-Expansion
1: Initialize the MRF with an arbitrary labeling X
2: For each label α ∈ L do
3:   Construct the graph based on the current configuration
4:   Perform one α-Expansion step (st-cut)
5:   Update the configuration if the energy decreases
6: End For
7: Repeat steps 2 to 6 until convergence.
Steps 2-6 form a cycle; steps 3-5 form an iteration.
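The loop structure above can be sketched on a toy problem. Note the substitution: in this sketch the binary expansion step is solved by exhaustive search instead of an st-cut, which is only feasible for a handful of variables. The Potts pairwise term, the function names, and the four-variable chain are all illustrative assumptions.

```python
from itertools import product


def energy(labels, unary, edges, weight):
    """Unary costs plus a Potts penalty for every edge with differing labels."""
    return (sum(unary[i][label] for i, label in enumerate(labels))
            + sum(weight for u, v in edges if labels[u] != labels[v]))


def expansion_move(labels, alpha, unary, edges, weight):
    """Best labeling in which every variable keeps its label or switches to alpha.

    Exhaustive search stands in for the st-cut of the real algorithm.
    """
    best, best_energy = list(labels), energy(labels, unary, edges, weight)
    for switch in product((False, True), repeat=len(labels)):
        candidate = [alpha if s else old for s, old in zip(switch, labels)]
        candidate_energy = energy(candidate, unary, edges, weight)
        if candidate_energy < best_energy:
            best, best_energy = candidate, candidate_energy
    return best, best_energy


def alpha_expansion(unary, edges, weight, num_labels):
    labels = [0] * len(unary)                  # step 1: arbitrary initial labeling
    current = energy(labels, unary, edges, weight)
    improved = True
    while improved:                            # step 7: repeat cycles until convergence
        improved = False
        for alpha in range(num_labels):        # steps 2-6: one cycle over all labels
            candidate, candidate_energy = expansion_move(labels, alpha, unary, edges, weight)
            if candidate_energy < current:     # step 5: accept only if energy decreases
                labels, current = candidate, candidate_energy
                improved = True
    return labels, current


# Chain of four variables with three labels (hypothetical costs).
unary = [[0, 3, 3], [3, 0, 3], [3, 1, 0], [3, 3, 0]]
edges = [(0, 1), (1, 2), (2, 3)]
labels, final_energy = alpha_expansion(unary, edges, weight=1, num_labels=3)
print(labels, final_energy)
```

On this example the first cycle lowers the energy from 9 to 5 (expansion of label 1) and then to 2 (expansion of label 2); the second cycle finds no further improvement, so the loop stops.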
Incremental α-Expansion
- Flows are reused, as in dynamic MRFs, giving better initializations for the next graph cut.
- Incremental/dynamic: reuse the flows from label to label and recycle flows from cycle to cycle.
- Figure: Input; Cycle 1 (Label1, Label2, Label3); Cycle 2.
Incremental α-Expansion Results
- Datasets: Tsukuba, Teddy, Penguin, Panorama.
- Figure: total timings on different datasets.
Processing on High Detailed Scene
- Highly detailed scenes: high resolution images, high dynamic range of colors, wide view angles.
- Challenges: high computation cost, high memory requirement, interaction with high resolution images.
- Statistics of image sizes available on Google Images: an overwhelming fraction of images are between 2 and 10 million pixels; only 0.6% or fewer of the images had more than 40 megapixels.
Processing on High Detailed Scene
- Define E(x) for the coarsest image and compute the final result for this level.
- Define E(x) for the next finer level and compute the final result at this level.
- Idea: solve an optimization problem at the coarser level and use it to dynamically update the optimization instance for the next level, giving a better initialization.
Pyramid Reparameterization: Construction
- A pyramid is constructed with the largest (input) image at its base.
- Each pixel of a coarser level image is the mean of 4 pixels at the previous finer level.
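The construction rule can be sketched directly. The code below assumes a single-channel image stored as a list of lists with even dimensions at every level, and that the 4 pixels are a non-overlapping 2x2 block; the function names `downsample` and `build_pyramid` are hypothetical.

```python
def downsample(image):
    """Halve each dimension: every output pixel is the mean of a 2x2 input block."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [[(image[2 * r][2 * c] + image[2 * r][2 * c + 1]
              + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(cols)]
            for r in range(rows)]


def build_pyramid(image, levels):
    """Return [base, coarser, ...] with the input image at the base of the pyramid."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


# 4x4 test image with intensities 0..15.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
pyramid = build_pyramid(image, levels=3)
for level in pyramid:
    print(level)
```

Each level has a quarter of the pixels of the one below it, so the coarsest graph is cheap to cut and its result can seed the finer levels.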
Pyramid Reparameterization
- At level (i-1): the image gives a graph G(V,E); minimization at (i-1) produces the segmented image at (i-1) and the residual graph G(V,E).
- Upsampling step: produces the upsampled initial graph and the upsampled residual graph.
- Direct route: running graph cuts on the graph at the current level to obtain its final residual graph is computationally expensive.
- Cheaper solution: start from the upsampled graph of the previous level, reuse its flows, and handle only the difference between the two graphs.
Residual Graph Upsampling
- Figure: upsampling rules, showing graph upsampling, graph cuts, and residual graph upsampling.
Image Segmentation Results Horse 3.3 MP (2048x1600)
Interactive Image Segmentation Tool
- User interaction is important in foreground/background separation.
- Results of a user study (User 1, User 2) on image size for comfortable manipulation, for two display sizes: average subjective response for six image sizes.
- Images that are larger than the display are disfavored by users.
Pyramid Segmentation System
- Display window: the user interacts with a display image of comfortable size.
- Actual image: the actual segmentation runs in the background on the other levels.
- Quick Segment: displays the segmentation result on the display image, giving the user a perceptual response to start planning further interactions.
User Study
- Results of a user study on the CPU and GPU versions of Pyramid Segmentation, alongside GrabCut and Quick Selection.
- Measures: interaction time, response time, total time, and subjective response of users.
Multiresolution α-Expansion
1: Build a pyramid of graphs.
2: Perform α-expansion on a lower resolution graph.
3: Save the initial and final residual graphs for all the labels.
4: Upsample and reparameterize the previous resolution's initial and final graphs together with the current resolution's initial graph.
5: Perform α-expansion at this level.
6: Repeat for all the levels in the pyramid.
Stereo Correspondence
- Image size: 1328 x 1104
- Number of labels (disparity): 200 - 290
- Plots: optimization time, and total time (optimization time + graph construction time + energy function calculation time).
- Running time in seconds for stereo correspondence using Pyramid Cuts on the GPU (G-PyCut), on the CPU (C-PyCut), and single level Graph Cuts (GCuts). A speedup of 5-6 times on the CPU is observed.
Image Denoising
- Image size: 1000 x 1000
- Number of labels (intensity): 256
- Plots: optimization time, and total time (optimization time + graph construction time + energy function calculation time).
- Running time in seconds for image denoising using Pyramid Cuts on the GPU (G-PyCut), on the CPU (C-PyCut), and single level Graph Cuts (GCuts). A speedup of 5-6 times on the CPU is observed.
Future Work
- Higher order interactions of variables in MRFs: computationally more challenging; to be modelled in our hierarchical and multiresolution framework.
- Using multiple GPUs to parallelize α-expansion.
- Better interactive tools, with both global and local interactions.
Conclusion
- Two methods to speed up the basic graph cuts algorithm: using the facilities provided by parallel accelerators such as GPUs, and modelling graph cuts in a hierarchical and dynamic framework for better initialization.
- Graph cuts methods have proved instrumental in solving many computationally challenging problems.
- Successes of graph cuts -> promising future in the realm of energy minimization methods.
Related Publications
- P. J. Narayanan, Vibhav Vineet and Timo Stitch. Fast Graph Cuts on the GPU. GPU Computing Gems (GCG), Volume 1, Dec. 2010 (Book Chapter).
- Vibhav Vineet and P. J. Narayanan. Solving Multi-label MRFs using incremental alpha-expansion move on the GPUs. In Proceedings of the Ninth Asian Conference on Computer Vision (ACCV-2009), China, 2009.
- Vibhav Vineet and P. J. Narayanan. CUDA Cuts: Fast Graph Cuts on the GPU. In Proceedings of the CVPR workshop on Visual Computer Vision on GPUs (CVGPU-2008), Alaska, USA, 2008.
- Vibhav Vineet, Pawan Harish, Suryakant Patidar and P. J. Narayanan. Fast Minimum Spanning Tree for Large Graphs on the GPU. In Proceedings of ACM SIGGRAPH High Performance Graphics (HPG-2009), New Orleans, LA, USA, 2009.
- Pawan Harish, Vibhav Vineet and P. J. Narayanan. Large Graph Algorithms for Massively Multithreaded Architectures. IIIT Tech Report, IIIT/TR/2009/74.
- CUDA Cuts: Fast Graph Cuts on the GPU. http://cvit.iiit.ac.in/index.php?page=resources. (Software).
- Vibhav Vineet, Pawan Harish, Suryakant Patidar and P. J. Narayanan. Fast Minimum Spanning Tree for Large Graphs on the GPU. GPU Computing Gems (GCG). (Book Chapter).
Changes made to the thesis
- Reviewer 1 (Dr. Kishore): tables containing experimental results on more standard images (around 60 images) added; sections on related and background work expanded.
- Reviewer 2 (Dr. Srinivasan): missing references added to the related work section; formal results on speedup for dynamic graph cuts on video segmentation added to the results section; figure captions properly referenced to the paper of Kohli and Torr; other minor changes made as recommended.
Thank You