Michael Bleyer LVA Stereo Vision

Global Methods 1

What happened last time?
- Local methods
- Pros and cons
- Adaptive windows
- Slanted surfaces
- Occlusion handling in local stereo

Outline
- Stereo as an Energy Minimization Problem
- Dynamic Programming (DP)
  - Basic algorithm
  - DP algorithms: Scanline Optimization, Tree DP, Semi-Global Matching, Simple Tree

Stereo as an Energy Minimization Problem

Stereo as an Energy Minimization Problem
Define an energy (cost) function that measures the quality of a disparity map:
- High energy means that the disparity map is bad.
- Low energy means that it is good.
The energy function typically has the form
$$E(D) = E_{data}(D) + E_{smooth}(D)$$
where
- $D$ is the disparity map of the left image,
- $E_{data}$ measures photo consistency,
- $E_{smooth}$ measures smoothness.
Global methods express the smoothness assumption explicitly, as a smoothness term.
Let us take a closer look at $E_{data}$ and $E_{smooth}$.

The Data Term
Measures the color dissimilarity for each pixel p of the left image I:
$$E_{data}(D) = \sum_{p \in I} m(p, d_p)$$
where
- $d_p$ is the disparity of p in the disparity map D,
- $m()$ is a function computing the color dissimilarity between pixels of the left and right images.

The Smoothness Term
The smoothness assumption states that neighbouring pixels should be assigned the same (or similar) disparities.
In graph terms:
- Nodes correspond to pixels of the left image.
- Edges represent interactions between pixels. In our case, interactions occur between a pixel and its 4 spatial neighbours.
We state that a pixel should have the same disparity as its neighbours.

The Smoothness Term
Let us write this idea as a term:
$$E_{smooth}(D) = \sum_{(p,q) \in N} s(d_p, d_q)$$
where
- $N$ is the set of all pairs of spatially neighbouring pixels in the left image,
- $s()$ is a smoothness function that imposes a penalty if two disparities differ.
Let us use the following smoothness function:
$$s(d_p, d_q) = \begin{cases} 0 & \text{if } d_p = d_q \\ P & \text{otherwise} \end{cases}$$
$P$ is a user-defined penalty that balances the data and smoothness terms.
This particular smoothness function is called the Potts model.

Balancing Data and Smoothness Terms
[Figure: disparity maps for P = 0, 5, 10, 30, 50 and 5000, generated by energy optimization with the alpha-expansion algorithm (which does not guarantee the global optimum).]

The Smoothness Term
Our smoothness term defines the following set of (smoothness) interactions: every pixel is linked to its 4 spatial neighbours.
This is called the 4-connected grid.
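To make the energy concrete, here is a minimal sketch that evaluates the data term plus the Potts smoothness term on the 4-connected grid. The function name `potts_energy`, the nested-list layout `costs[y][x][d]` and the toy numbers are assumptions for illustration; they do not come from the lecture.

```python
def potts_energy(costs, disparity, penalty):
    """Return E(D) = E_data(D) + E_smooth(D) with a Potts smoothness term.

    costs[y][x][d] is the matching cost m(p, d) (a hypothetical layout);
    disparity[y][x] is the label d_p assigned to pixel p.
    """
    height, width = len(disparity), len(disparity[0])
    energy = 0.0
    for y in range(height):
        for x in range(width):
            d = disparity[y][x]
            energy += costs[y][x][d]  # data term m(p, d_p)
            # Visit each 4-connected neighbour pair once (right and down).
            if x + 1 < width and disparity[y][x + 1] != d:
                energy += penalty
            if y + 1 < height and disparity[y + 1][x] != d:
                energy += penalty
    return energy


# Toy 2x2 image with two disparity labels (made-up costs).
costs = [[[1, 4], [2, 3]],
         [[5, 0], [1, 1]]]
flat = [[0, 0], [0, 0]]    # every pixel gets label 0
mixed = [[0, 0], [1, 0]]   # one pixel switches to label 1
energy_flat = potts_energy(costs, flat, penalty=10)
energy_mixed = potts_energy(costs, mixed, penalty=10)
print(energy_flat, energy_mixed)
```

In this toy case the mixed map has the lower data cost, but its two label transitions cost 2P, so the flat map wins once P is large. This is the balancing role of P.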

Optimizing the 4-connected grid
We are looking for a disparity map D that has the minimum energy E(D) among all possible disparity maps.
In general, this is a very difficult problem: it is NP-hard, so most likely the optimal disparity map cannot be computed in reasonable time.
Why is it difficult?
- Every pixel is connected, directly or indirectly, to every other pixel and therefore influences all of them.
- Changing the disparity of the pixel in the top-left corner might change the disparity of the pixel in the bottom-left corner.
We will spend two sessions on optimization algorithms:
- Dynamic Programming (this session)
- Belief Propagation (next session)
- Graph-Cuts (next session)

Application to Other CV Problems
Our energy function measures the quality of an assignment of labels to pixels.
In our case, labels correspond to disparities. However, labels can have a different meaning.
=> We can use energy minimization approaches to solve many other computer vision problems.

Optical Flow (Very Similar to Stereo)
- Input: 2 consecutive frames of a video
- Desired output: Map of 2D vectors representing the movement of each pixel
- Labels: All allowed 2D displacement vectors
- Data term: Color dissimilarity between corresponding pixels
- Smoothness term: Penalty if neighbouring pixels have different 2D vectors

Image Denoising
- Input: Noisy image
- Desired output: Noise-free image
- Labels: 256 intensity values (0 to 255)
- Data term: Dissimilarity between a pixel's intensity and its assigned intensity
- Smoothness term: Penalty if neighbouring pixels are assigned different intensities

Inpainting
- Input: Image with partially missing information (red rectangle)
- Desired output: Complete image
- Labels: 256 intensity values (0 to 255)
- Data term: 0 for each label assignment
- Smoothness term: Penalty if neighbouring pixels are assigned different intensities

Interactive Image Segmentation
- Input: Color image, plus foreground and background scribbles provided by the user
- Desired output: Binary map (label 0: pixel belongs to background; label 1: pixel belongs to foreground)
- Data term: Dissimilarity between a pixel's color and the color models of foreground and background
- Smoothness term: Penalty on 0/1 label transitions
There are many more computer vision problems that can be modelled by our energy function.

Generality of energy functions
Apart from smoothness, we can model other assumptions in the energy function. Some examples for stereo:
- The energy assigns infinite cost if the uniqueness assumption is violated.
- The energy is lower if disparity borders coincide with intensity edges.
In general, if you have a computer vision problem:
- Think about what a perfect solution should look like.
- Try to express the properties of this perfect solution as an energy function.
- Apply one of the many existing optimization algorithms to find the solution that minimizes your energy function.
Roughly 30% of vision papers work like this.

Limitations of Energy Minimization
I have implemented an energy minimization approach, but it gives poor results. Why?
There are two possible reasons:
- Energy modelling: Your energy is a poor model of your problem. Ideally, the correct solution should have lower energy than all other possible solutions. If the correct disparity map has lower energy than all other possible disparity maps, you have done a good job in the energy modelling step.
- Energy minimization: Your optimization algorithm delivers a solution that is far from the exact minimum of your energy. [Figure: results of applying two different optimization algorithms, ICM and Graph-Cuts, to the same energy function.]
Problem: You usually do not know which of the two reasons applies to your approach. However, there are strong indications that energy modelling is the major problem at the current state of the art.
We will spend this session and the next on the energy optimization problem. We will then focus on the modelling component.

Dynamic Programming

Special Case of our Energy Function
Let us come back to our energy function, where $E_{smooth}$ is implemented by the Potts model.
Optimization of E is NP-hard: there is most likely no algorithm that can give you the exact minimum in reasonable time.
However, there is a special case: if the smoothness interactions form a tree in the grid graph, the optimal solution can be computed efficiently using dynamic programming (DP).

Example of a tree
A tree is a connected graph that does not contain cycles.

Dynamic Programming - Algorithm
The definitions below determine the following recursion over the tree:
$$L(p, d) = m(p, d) + \sum_{c \in C_p} \min_{d' \in D} \big( s(d, d') + L(c, d') \big)$$
Evaluated at the root, $\min_{d \in D} L(r, d)$ gives the exact energy optimum.
- r is the root of the tree (it can be chosen arbitrarily).
- D is the set of all allowed disparities.
- m(p,d) is the cost for matching pixel p at disparity d.
- s(d,d') gives a penalty if the disparities d and d' have different values (smoothness function).
- $C_p$ is the set of all children of p (those pixels that have p as their direct predecessor on the path to the root node).
If the graph were not a tree, this recursion would run forever.
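The recursion over the tree can be sketched in a few lines: a node's cost for disparity d is its own matching cost plus, for each child, the cheapest combination of smoothness penalty and child subtree cost. The names `subtree_cost` and `optimal_energy`, the dictionary-based tree and the node names and costs are hypothetical, chosen only for illustration; they are not the numbers from the lecture's example.

```python
def potts(d, d_other, penalty):
    """Potts smoothness function s(d, d')."""
    return 0 if d == d_other else penalty


def subtree_cost(p, d, children, m, disparities, penalty):
    """Minimum energy of the subtree rooted at p, given that p takes disparity d."""
    total = m[p][d]
    for c in children.get(p, []):
        # Best label for child c, paying the penalty if it differs from d.
        total += min(
            potts(d, dc, penalty)
            + subtree_cost(c, dc, children, m, disparities, penalty)
            for dc in disparities
        )
    return total


def optimal_energy(root, children, m, disparities, penalty):
    """Exact energy optimum: the best root label over all allowed disparities."""
    return min(
        subtree_cost(root, d, children, m, disparities, penalty)
        for d in disparities
    )


# Hypothetical 4-node tree: root -> left, root -> right, right -> leaf.
children = {"root": ["left", "right"], "right": ["leaf"]}
m = {"root": [4, 6], "left": [7, 2], "right": [3, 8], "leaf": [1, 9]}
best = optimal_energy("root", children, m, disparities=[0, 1], penalty=3)
print(best)
```

This plain recursion recomputes child subtrees for every parent label. A practical version would cache each (pixel, disparity) value or process the tree bottom-up, which brings the cost down to roughly the number of pixels times the square of the number of disparities.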

Dynamic Programming - An Example
We will use the Potts model to implement s() with P = 10.
Matching costs m(p,d) for the tree nodes r, s, t, u (only three values per row survive in this transcript, without their column assignment):
d1: 5 20 10
d2: 15 30 25
The energy of the optimal disparity assignment is 55.
What is the disparity assignment that has led to this optimal energy?

Dynamic Programming - An Example
We can find the disparity assignment that has led to the optimum by back-tracking: we look at which disparity was chosen at each pixel and follow this path.
Result: dr = 1, ds = 1, dt = 1, du = 1.
Setting all pixels to disparity 1 is the optimal disparity assignment in our example.
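Back-tracking can be sketched as a bottom-up pass that remembers, for every child and every parent label, which child label was cheapest, followed by a top-down pass that reads those choices back. The function `solve_tree`, the tree and the costs below are assumptions for illustration and deliberately differ from the lecture's r, s, t, u example.

```python
def solve_tree(root, children, m, labels, penalty):
    """Return (minimum energy, label per node) for a tree with Potts smoothness."""
    # Pre-order from the root; reversing it visits children before parents.
    order, stack = [], [root]
    while stack:
        p = stack.pop()
        order.append(p)
        stack.extend(children.get(p, []))

    cost = {}    # cost[p][d]: minimum energy of p's subtree if p takes label d
    choice = {}  # choice[c][d]: best label of child c if its parent takes label d
    for p in reversed(order):
        cost[p] = {d: m[p][d] for d in labels}
        for c in children.get(p, []):
            choice[c] = {}
            for d in labels:
                best = min(
                    labels,
                    key=lambda dc: cost[c][dc] + (0 if dc == d else penalty),
                )
                choice[c][d] = best
                cost[p][d] += cost[c][best] + (0 if best == d else penalty)

    # Back-tracking: fix the root label, then follow the stored choices downwards.
    assignment = {root: min(labels, key=lambda d: cost[root][d])}
    for p in order:
        for c in children.get(p, []):
            assignment[c] = choice[c][assignment[p]]
    return cost[root][assignment[root]], assignment


# Hypothetical tree and costs (not the lecture's numbers).
children = {"root": ["left", "right"], "right": ["leaf"]}
m = {"root": [4, 6], "left": [7, 2], "right": [3, 8], "leaf": [1, 9]}
energy, assignment = solve_tree("root", children, m, labels=[0, 1], penalty=3)
print(energy, assignment)
```

Here only the node `left` deviates from label 0: its data cost saving of 5 outweighs the single penalty of 3.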

DP on the 4-connected Grid
Problem: The 4-connected grid is definitely not a tree!
Idea: We can remove edges (smoothness interactions) so that the 4-connected grid becomes a tree.
The following approaches differ only in how they remove edges.

Scanline DP
All vertical edges are deleted from the 4-connected grid, so every scanline becomes an independent chain.
That is what the majority of DP-based approaches do.
Often, these approaches also implement the ordering assumption (I will skip this, because it is no longer state of the art).
What will be the problem of this approach?

The Scanline Streaking Problem
Deleting the vertical smoothness edges leads to horizontal streaks in the disparity maps.
The problem is that smoothness between neighbouring scanlines is not enforced.
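A minimal sketch of scanline DP, assuming a cost layout `costs[y][x][d]` and the Potts model: each row is a chain, so it can be solved exactly on its own, and nothing couples one row to the next. The function names and toy costs are hypothetical.

```python
def scanline_dp(row_costs, penalty):
    """Minimise data + Potts smoothness along one scanline (a chain, i.e. a tree).

    row_costs[x][d] is the matching cost of pixel x at disparity d.
    Returns the optimal disparity for every pixel of the scanline.
    """
    n_labels = len(row_costs[0])
    acc = list(row_costs[0])  # acc[d]: best energy of the prefix ending in label d
    back = []                 # back[x - 1][d]: best predecessor label for (x, d)
    for x in range(1, len(row_costs)):
        best_prev = min(range(n_labels), key=lambda d: acc[d])
        new_acc, pointers = [], []
        for d in range(n_labels):
            # Potts model: either keep the label (no penalty) or jump from the
            # cheapest predecessor label and pay the penalty once.
            if acc[d] <= acc[best_prev] + penalty:
                prev, step = d, acc[d]
            else:
                prev, step = best_prev, acc[best_prev] + penalty
            new_acc.append(row_costs[x][d] + step)
            pointers.append(prev)
        acc = new_acc
        back.append(pointers)
    # Back-track from the cheapest final label.
    d = min(range(n_labels), key=lambda k: acc[k])
    labels = [d]
    for pointers in reversed(back):
        d = pointers[d]
        labels.append(d)
    return labels[::-1]


def scanline_stereo(costs, penalty):
    """Solve every row independently: vertical smoothness edges are ignored."""
    return [scanline_dp(row, penalty) for row in costs]


# Two toy scanlines with two labels each (made-up costs).
row_a = [[0, 5], [4, 3], [0, 5], [5, 0]]
row_b = [[5, 0], [5, 0], [5, 0], [5, 0]]
labels_a = scanline_dp(row_a, penalty=3)
disparity = scanline_stereo([row_a, row_b], penalty=3)
print(labels_a, disparity)
```

Because `scanline_stereo` never compares labels across rows, two vertically adjacent pixels can receive different disparities at no cost, which is where the horizontal streaks come from.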

Tree DP by [Veksler, CVPR2005]
We can obtain a tree structure in a smarter way.
Observation: Disparity discontinuities are typically aligned with intensity edges, hence:
- Two neighbouring pixels of similar intensities are very likely to lie on the same disparity.
- Two neighbouring pixels of different intensities are less likely to lie on the same disparity.
=> The smoothness edges between neighbouring pixels of very different intensities are the least important ones.
=> We should remove those.

Tree DP by [Veksler, CVPR2005]
Algorithm for obtaining the tree structure:
- For each smoothness edge between two pixels p and q, compute a weight
$$w(p, q) = |I(p) - I(q)|$$
where I(p) denotes pixel p's intensity.
- Build the minimum spanning tree (MST) using the computed weights. The MST is the tree connecting all pixels whose sum of weights is minimum among all such trees.
- The MST can be computed in nearly linear time with standard graph algorithms (for example Kruskal's algorithm, because the small integer weights can be bucket-sorted).
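The tree construction can be sketched with Kruskal's algorithm and a union-find structure. This is an illustrative sketch under the assumption that the edge weight is the absolute intensity difference; the function name `intensity_mst`, the grey-value image layout `image[y][x]` and the toy image are hypothetical, and a general-purpose sort stands in for bucket sorting.

```python
def intensity_mst(image):
    """Kruskal's algorithm on the 4-connected grid with w(p, q) = |I(p) - I(q)|.

    image[y][x] is a grey value; returns the kept smoothness edges as
    ((y1, x1), (y2, x2)) pairs.
    """
    height, width = len(image), len(image[0])
    edges = []
    for y in range(height):
        for x in range(width):
            if x + 1 < width:
                w = abs(image[y][x] - image[y][x + 1])
                edges.append((w, (y, x), (y, x + 1)))
            if y + 1 < height:
                w = abs(image[y][x] - image[y + 1][x])
                edges.append((w, (y, x), (y + 1, x)))
    edges.sort(key=lambda e: e[0])  # cheapest (most similar) edges first

    parent = {(y, x): (y, x) for y in range(height) for x in range(width)}

    def find(p):
        # Union-find root lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    tree = []
    for _, p, q in edges:
        root_p, root_q = find(p), find(q)
        if root_p != root_q:  # the edge joins two components: keep it
            parent[root_p] = root_q
            tree.append((p, q))
    return tree


# Toy image with a strong vertical intensity edge before the last column.
image = [[10, 12, 200],
         [11, 13, 205]]
tree = intensity_mst(image)
print(tree)
```

In this toy image exactly one edge crosses the intensity boundary, because the tree must stay connected; the other crossing edge is dropped.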

Tree DP by [Veksler, CVPR2005]
[Figure: resulting tree. Pixels of high intensity difference are not linked by a smoothness edge; pixels of low intensity difference are linked by one.]

Tree DP by [Veksler, CVPR2005]
Horizontal streaks are effectively reduced. However, vertical streaks are now present as well.

Semi-Global Matching by [Hirschmueller, CVPR2005]
Disclaimer: The original paper is written from a completely different perspective (no tree DP).
- Construct an individual tree at each pixel p.
- The tree contains the vertical, horizontal and diagonal lines on which p resides (star shape).
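Read as tree DP, the star-shaped tree of a pixel p can be optimised by running a chain DP along each ray towards p and adding the ray costs, subtracting the extra copies of p's own matching cost. The sketch below assumes the Potts model and, for brevity, uses only the four horizontal and vertical rays (no diagonals). The function names and toy costs are hypothetical, and Hirschmueller's own formulation differs, for instance by using two penalty levels.

```python
def path_costs(costs, penalty, dy, dx):
    """Accumulate chain-DP costs along direction (dy, dx) for every pixel.

    out[y][x][d] = m(p, d) + cheapest continuation from the predecessor pixel
    (y - dy, x - dx): same label for free, or any label plus the penalty.
    """
    height, width, n_labels = len(costs), len(costs[0]), len(costs[0][0])
    out = [[None] * width for _ in range(height)]
    ys = range(height) if dy >= 0 else range(height - 1, -1, -1)
    xs = range(width) if dx >= 0 else range(width - 1, -1, -1)
    for y in ys:
        for x in xs:
            py, px = y - dy, x - dx
            if 0 <= py < height and 0 <= px < width:
                prev = out[py][px]
                jump = min(prev) + penalty
                out[y][x] = [costs[y][x][d] + min(prev[d], jump)
                             for d in range(n_labels)]
            else:
                out[y][x] = list(costs[y][x])  # the ray starts at the border
    return out


def star_tree_disparities(costs, penalty):
    """Pick, for every pixel, the root label of its own star-shaped tree."""
    directions = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # diagonals omitted
    paths = [path_costs(costs, penalty, dy, dx) for dy, dx in directions]
    height, width, n_labels = len(costs), len(costs[0]), len(costs[0][0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # m(p, d) is contained in every ray, so subtract the extra copies.
            total = [sum(path[y][x][d] for path in paths)
                     - (len(paths) - 1) * costs[y][x][d]
                     for d in range(n_labels)]
            row.append(min(range(n_labels), key=lambda d: total[d]))
        result.append(row)
    return result


# 3x3 toy cost volume: every pixel prefers label 0 except a noisy centre pixel.
costs = [[[0, 4] for _ in range(3)] for _ in range(3)]
costs[1][1] = [3, 0]
smoothed = star_tree_disparities(costs, penalty=2)
unsmoothed = star_tree_disparities(costs, penalty=0)
print(smoothed, unsmoothed)
```

With the penalty switched off the centre pixel keeps its noisy label, while a penalty of 2 lets its four rays pull it back to label 0.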

Semi-Global Matching by [Hirschmueller, CVPR2005]
No streaks, but isolated pixels.

Semi-Global Matching by [Hirschmueller, CVPR2005]
If the tree for pixel p does not capture texture, the algorithm will fail.

Simple Tree by [Bleyer, VISAPP2008]
Motivation: Overcome the problem of [Hirschmueller, CVPR2005] in untextured regions.
Idea:
- Also generate an individual tree at each pixel p.
- This tree contains all pixels of the reference view (=> the problem of missing texture is avoided).
- 2 tree structures: Horizontal Tree and Vertical Tree.

Simple Tree by [Bleyer, VISAPP2008]
Texture cannot be missed by these trees.

Simple Tree by [Bleyer, VISAPP2008]
No streaks, no problems in untextured regions.

Dynamic Programming - Pros and Cons
Pros:
- DP algorithms are very fast (comparable to local methods).
- Good tradeoff between speed and accuracy.
Cons:
- DP can only be applied to tree structures.
- Erasing smoothness edges degrades the results.
- Optimization algorithms that operate on the full 4-connected grid perform better (next session).

Summary
- Principle of global methods: energy modelling and energy minimization
- Energy minimization for CV problems other than stereo
- Dynamic Programming: Scanline Optimization, Tree DP, Semi-Global Matching, Simple Tree

References
- M. Bleyer, M. Gelautz, Simple but Effective Tree Structures for Dynamic Programming-based Stereo Matching, VISAPP 2008.
- H. Hirschmueller, Accurate and Efficient Stereo Processing by Semi-Global Matching and Mutual Information, CVPR 2005.
- O. Veksler, Stereo Correspondence by Dynamic Programming on a Tree, CVPR 2005.