Michael Bleyer LVA Stereo Vision

Data Term

What happened last time? We have looked at our energy function. We have learned about an optimization algorithm that can minimize our energy: Belief Propagation. We have investigated the smoothness function s(): first-order smoothness functions (linear, Potts model, truncated linear), second-order smoothness, and edge/segment-sensitive smoothness functions.

What is Going to Happen Today? We will look at the data term


The Data Term Measures the goodness of correspondences: the colors of corresponding pixels are compared. Formally, it is defined as the sum over all pixels p of m(p, q), where m() computes the intensity/color dissimilarity between two pixels, p is a pixel of the left image, and q is p’s correspondence in the right image according to disparity map D. We will now look at different methods for defining the pixel dissimilarity function m(). For the first match measures, I will assume that the photo-consistency assumption holds; I will relax this assumption later on. For now, we will just consider grey-scale images; I will speak about color matching later on.
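The definition above can be sketched in a few lines of Python (a minimal illustration with hypothetical names, assuming rectified grey-scale images stored as 2D arrays, so that q lies on the same scanline as p at x − d):

```python
import numpy as np

def data_term(left, right, disparity, m):
    """Sum the pixel dissimilarities m(Ip, Iq) over all pixels p,
    where q is p's correspondence under the disparity map D."""
    h, w = left.shape
    cost = 0.0
    for y in range(h):
        for x in range(w):
            xq = x - disparity[y, x]   # q = p shifted by its disparity
            if 0 <= xq < w:            # skip pixels whose match falls outside
                cost += m(left[y, x], right[y, xq])
    return cost
```

With m chosen as the absolute intensity difference, a disparity map that aligns the images perfectly yields a data cost of zero.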

Absolute Intensity Difference Pixel dissimilarity is computed as the absolute difference in intensity values: m(p, q) = |Ip − Iq|, where Ip denotes the intensity of pixel p. This is a quite common choice in stereo matching.

Squared Intensity Difference Pixel dissimilarity is computed as the squared difference in intensity values: m(p, q) = (Ip − Iq)². This is probably not a good choice for stereo matching: it is sensitive to outliers and typically performs slightly worse than the absolute difference.

Truncated Absolute Difference Pixel dissimilarity is computed as the truncated absolute difference in intensity values: m(p, q) = min(|Ip − Iq|, k), where k is a user-defined value. Advantage: robustness against outliers and better performance in occluded regions. Since an occluded pixel does not have a matching point, it will have a high pixel dissimilarity; truncation lowers the data costs for the “correct” disparity, so there is still hope that the correct disparity can be propagated from surrounding non-occluded pixels. Disadvantage: you have an additional parameter, and an optimal value for k is difficult to find or might not even exist.
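The three measures introduced so far are one-liners; a sketch (hypothetical function names, and the default k is an arbitrary choice):

```python
def abs_diff(ip, iq):
    # absolute intensity difference
    return abs(ip - iq)

def squared_diff(ip, iq):
    # squared difference: punishes outliers disproportionately
    return (ip - iq) ** 2

def truncated_abs_diff(ip, iq, k=20):
    # truncation caps the penalty, e.g. for occluded pixels
    return min(abs(ip - iq), k)
```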

Sampling Insensitive Measure [Birchfield,PAMI98] In the “real world”, intensity is a continuous function of the x-coordinate. [Plot: continuous intensity curve over x-coordinates, left image]

Sampling Insensitive Measure [Birchfield,PAMI98] When we take a photo, we sample this continuous curve to derive discrete pixels. [Plot: sampled intensity curve, left image]

Sampling Insensitive Measure [Birchfield,PAMI98] Problem: samples are typically taken at different curve positions in the left and right images. [Plots: sampled intensity curves, left and right images]

Sampling Insensitive Measure [Birchfield,PAMI98] Due to this different sampling, corresponding pixels have different intensity values (and this will oftentimes lead to wrong matches). [Plots: p and p’ are corresponding pixels, but have different intensity values]

Sampling Insensitive Measure [Birchfield,PAMI98] Idea of [Birchfield,PAMI98]: we also look at p’s horizontal neighbors q and r. [Plots: left image with p and its neighbors q, r; right image with p’]

Sampling Insensitive Measure [Birchfield,PAMI98] We interpolate the intensity of the point p⁻ that lies in between p and q as Ip⁻ = (Ip + Iq) / 2.

Sampling Insensitive Measure [Birchfield,PAMI98] We also interpolate the intensity of the point p⁺ that lies in between p and r as Ip⁺ = (Ip + Ir) / 2.

Sampling Insensitive Measure [Birchfield,PAMI98] We compute the sampling insensitive matching cost as m(p, p’) = max(0, Ip’ − Imax, Imin − Ip’), where Imin and Imax are the minimum and maximum of Ip⁻, Ip⁺, and Ip.

Sampling Insensitive Measure [Birchfield,PAMI98] Advantage: we have also included the “correctly sampled” point p⁻ => low intensity dissimilarity and high chances for a correct match.

Sampling Insensitive Measure [Birchfield,PAMI98] We should do this in a symmetric way: the same interpolation is applied around p’ in the right image, and the final cost is the minimum of the costs computed in the two directions.
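Putting the slides above together, the symmetric Birchfield–Tomasi dissimilarity on one scanline might look as follows (a sketch under the stated assumptions: rectified images, rows given as 1D sequences; function names are hypothetical):

```python
def bt_dissimilarity(left_row, right_row, x, xq):
    """Sampling-insensitive dissimilarity between left pixel x and
    right pixel xq on one scanline, following [Birchfield,PAMI98]."""
    def interval(row, i):
        # half-pixel intensities i- and i+ plus the pixel itself
        lo = 0.5 * (row[i] + row[max(i - 1, 0)])
        hi = 0.5 * (row[i] + row[min(i + 1, len(row) - 1)])
        vals = (lo, hi, row[i])
        return min(vals), max(vals)

    # compare the left intensity against the interval around the right pixel
    r_min, r_max = interval(right_row, xq)
    d_left = max(0, left_row[x] - r_max, r_min - left_row[x])
    # symmetric direction: roles of the two images swapped
    l_min, l_max = interval(left_row, x)
    d_right = max(0, right_row[xq] - l_max, l_min - right_row[xq])
    return min(d_left, d_right)
```

On a linear intensity ramp sampled half a pixel apart in the two images, the plain absolute difference reports a spurious cost while this measure reports zero.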

Violations of the Photo-Consistency Assumption In real-world stereo images, the photo-consistency assumption is almost never perfectly fulfilled. We call a pixel radiometrically distorted if its intensity differs between the left and right images. There are various reasons for radiometric distortions, e.g.: different illumination conditions in the images, different exposure times, different sensor characteristics. (Left image) (Right image)

Radiometrically Insensitive Match Measures The treatment of radiometric distortions has a great impact on the quality of results. Unfortunately, none of the above match measures is able to cope with radiometric distortions. We will now learn about three radiometrically insensitive measures: Mutual Information, Zero-mean Normalized Cross-Correlation (ZNCC), and Census.

Mutual Information [Hirschmueller,PAMI08] Advantage: Mutual Information is a pixel-based measure; in contrast to window-based measures, artifacts at disparity borders are avoided. To compute Mutual Information matching scores, we need the disparity map: a chicken-and-egg problem. If we knew the disparity map, we would already be done. This dilemma is typically solved in an iterative fashion: (1) we compute an initial disparity map (e.g., using absolute differences as the dissimilarity function); (2) we compute Mutual Information scores using our current disparity map; (3) we compute a new disparity map using the Mutual Information scores; (4) go to step 2. How can we compute the Mutual Information matching scores? Disclaimer: there is quite a lot of theory behind that. I will focus on the practical implementation as described in [Hirschmueller,PAMI08].

Computing Mutual Information Scores For each pixel p, we look up its matching point q in the right image using our current disparity map. We look up the intensity values Ip and Iq and make an entry at <Ip,Iq> in the diagram below. For each possible pair of intensity values <Ip,Iq>, the diagram stores how often this pair occurred in the disparity map. [Diagram: joint intensity histogram with axes “Intensity Left Image” and “Intensity Right Image”; example entry at Ip=150, Iq=100]

Computing Mutual Information Scores Let us assume that all corresponding pixels have identical intensity values, i.e. there is no radiometric distortion. [Diagram: entries form a 45˚ diagonal line]

Computing Mutual Information Scores Let us assume that the right image is darker than the left one. [Diagram: entries form a line at less than 45˚]

Computing Mutual Information Scores Let us assume that the left image is darker than the right one. [Diagram: entries form a line at more than 45˚]

Computing Mutual Information Scores This is what the diagram looks like for the Teddy test set: Black means that the intensity pair occurred very frequently in the disparity map Images taken from [Hirschmueller, PAMI08]

Computing Mutual Information Scores Those intensity pairs that occurred frequently should be given low matching costs.

Computing Mutual Information Scores We compute −log(P), where P is our diagram normalized to a probability distribution. These are our Mutual Information scores, where white pixels mean low matching costs. Side note: for simplicity, I left out two steps in which Gaussian smoothing is applied to P and −log(P). [Images: P and −log(P)]
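The construction of the cost table from a given disparity map can be sketched as follows (hypothetical names; the Gaussian smoothing steps mentioned above are omitted, and intensities are assumed to be integers usable as histogram indices):

```python
import numpy as np

def mi_cost_table(left, right, disparity, bins=256):
    """Joint histogram P of corresponding intensity pairs <Ip, Iq>,
    normalized and turned into matching costs via -log(P)."""
    h, w = left.shape
    P = np.zeros((bins, bins))
    for y in range(h):
        for x in range(w):
            xq = x - disparity[y, x]
            if 0 <= xq < w:
                P[left[y, x], right[y, xq]] += 1   # entry at <Ip, Iq>
    P /= max(P.sum(), 1.0)
    # frequent pairs get low cost, rare pairs get high cost
    return -np.log(P + 1e-12)
```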

Disadvantage of Mutual Information Global model: Mutual Information can only model radiometric changes that are valid for the whole image, e.g., the whole left image being darker than the right one. It cannot model radiometric changes that occur only locally, e.g., only the bottom-left part of the image being darker in the left view. Unfortunately, radiometric distortions are oftentimes local.

Zero-mean Normalized Cross-Correlation (ZNCC) Is defined on windows => will lead to artifacts at disparity discontinuities. The pixel dissimilarity m() is computed from the normalized correlation of the two windows: the mean intensity is subtracted from every pixel in the window Wp centered at p and in the window centered at q, the resulting values are multiplied and summed, and the sum is divided by the product of the windows’ standard deviations, where Wp is the set of all pixels in the window centered at p and the mean intensity is computed over all pixels inside Wp. The subtraction of the mean value serves to normalize the intensity values (robustness against radiometric changes).
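A sketch of the window correlation (hypothetical name; windows given as equally sized NumPy arrays):

```python
import numpy as np

def zncc(win_p, win_q):
    """Zero-mean normalized cross-correlation of two windows.
    Returns a value in [-1, 1]; 1 means a perfect (linear) match."""
    a = win_p - win_p.mean()   # subtracting the mean normalizes intensities
    b = win_q - win_q.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0
```

Because both gain and offset cancel out, a window and a brightened copy of it correlate perfectly, which is exactly the robustness against radiometric changes mentioned above.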

Census We center a window on pixel p in the left image and apply the following transformation: if a pixel has a smaller intensity than the window’s center pixel, write 0; else write 1. Write the binary values as a bit string. Apply the same operations to the window centered on pixel q in the right image. The Census matching cost is computed as the Hamming distance between the two bit strings: the number of positions at which the binary values differ.
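The transformation and the Hamming distance can be sketched like this (hypothetical names; windows as nested lists with odd side lengths):

```python
def census_bitstring(window):
    """Write 0 where a pixel is darker than the window's center pixel, else 1."""
    h, w = len(window), len(window[0])
    center = window[h // 2][w // 2]
    return [0 if v < center else 1 for row in window for v in row]

def census_cost(win_left, win_right):
    """Hamming distance: number of positions where the bit strings differ."""
    a, b = census_bitstring(win_left), census_bitstring(win_right)
    return sum(x != y for x, y in zip(a, b))
```

Adding a constant brightness offset to one window leaves its bit string, and hence the cost, unchanged.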

Census – Discussion We do not directly match the intensities, but the local texture represented as a bit string. If one image is darker than the other, the bit strings should still agree: you can, for example, add a value of 10 to the intensity values in p’s window on the previous slide and will still get the same bit string. Increased robustness if the window overlaps a disparity discontinuity. Problems in untextured regions: if all pixels have very similar intensities, the values of the bit string largely depend on image noise, which leads to noisy results in untextured regions.

How Can We Incorporate Color Information? Typically this is done in the simplest way: compute the match measure individually for each color channel and sum up the values over all color channels. Let us now investigate the role of color.

Why Should Color Help? We have a blue pixel in the left image. We have 2 candidate matches in the right image: A blue pixel A yellow pixel It is quite clear that the blue pixel is the correct match. (Left Image) (Right Image)

Why Should Color Help? Let us now convert the color images into grey-scale images. In our example, blue and yellow map to the same grey value. It is no longer clear which of our two candidate pixels is the correct match => color information reduces ambiguity! However, a lot of stereo algorithms do not use color information. Is this for a reason? (Left Image) (Right Image)

Evaluation of Color Matching [Bleyer,3DPVT10] We evaluate the performance of 8 different color systems and grey-scale matching: the different color systems affect the data term of our energy. Truncated linear is used as the smoothness term. Energy optimization is accomplished using the Simple Tree dynamic programming method (see the session on dynamic programming). We use 30 ground-truth pairs from Middlebury as test data. The error is computed as the percentage of pixels having an absolute disparity error > 1 pixel in non-occluded regions.

Test Set

Absolute Intensity Difference We compute the pixel dissimilarity as the absolute difference in intensity values. Results (see plot below): grey-scale matching nearly always performs worst; the color system LUV seems to perform better than RGB. So using color is a good thing? Well, this is not the whole story.

Example Results for Dolls Test Set Let us have a closer look at the disparity maps. I show two disparity maps: one computed using the absolute difference of grey values as the match function, and one computed using the summed-up absolute differences in LUV values. (Left Image) (Grey – Disparity) (LUV – Disparity)

Example Results for Dolls Test Set We should rather look at the error maps. Black pixels are pixels whose disparity error is larger than 1 pixel in comparison to the ground truth. The errors are clearly smaller when using LUV. (Left Image) (Grey – Errors) (LUV – Errors)


Radiometric Problems in the Dolls Set We have the ground-truth disparity map for the Dolls set => for each pixel p of the left image, we know its correct correspondence q in the right image. If there are no radiometric distortions, |Ip−Iq| should be equal to 0. In practice, we obtain the image shown on the right, where bright pixels have a large value of |Ip−Iq|. These bright pixels are the result of radiometric distortions. (Data Costs of Ground Truth Solution)

Radiometric Problems in the Dolls Set We can apply thresholding to the ground-truth data cost image. (Data Costs of Ground Truth Solution) (Smoothed Thresholding)

Radiometric Problems in the Dolls Set There seems to be a large overlap between the errors in the grey-scale matching result and the radiometrically distorted regions. (Data Costs of Ground Truth Solution) (Smoothed Thresholding) (Disparity Error when using Grey-Scale Matching)

Radiometric Problems in the Dolls Set The errors in radiometrically distorted regions seem to be effectively reduced when using color matching. Color might be of specific importance in radiometrically distorted image regions. (Data Costs of Ground Truth Solution) (Smoothed Thresholding) (Disparity Error when using LUV Matching)

Color Helps in Radiometrically Distorted Regions We have extracted the radiometrically distorted regions for all 30 test images. We now analyze the disparity error separately in regions affected by radiometric distortions and in regions unaffected by them. [Plot: average error percentage in regions affected by radiometric distortions] A large improvement can be observed when using color (e.g., LUV) instead of grey-scale matching (Grey).

Color Helps in Radiometrically Distorted Regions [Plot: average error percentage in regions unaffected by radiometric distortions] Almost no improvement.

Color Helps in Radiometrically Distorted Regions [Plot: average error percentage in all regions] The overall improvement is largely due to the considerable improvement in radiometrically distorted regions.

Why Not Directly Use a Radiometrically Insensitive Measure? If color only helps in radiometrically distorted regions, we can directly use radiometrically insensitive measures instead of color. Three radiometrically insensitive measures are tested: Mutual Information (MI), Zero-mean Normalized Cross-Correlation (NCC), and Census (CENSUS). All three match measures are applied to grey-scale images.

Why Not Directly Use a Radiometrically Insensitive Measure? NCC and CENSUS seem to perform best. This result is consistent with [Hirschmueller,PAMI09]. NCC and CENSUS have the same effect as using color (AbsDif (LUV)): they improve performance in radiometrically distorted regions (blue line). NCC and CENSUS are considerably better than color in this respect.

Using Color with Radiometrically Insensitive Measures NCC used with grey scale and 8 color spaces; CENSUS used with grey scale and 8 color spaces. It seems to be a bad idea to use color in conjunction with radiometrically insensitive measures: grey-scale matching performs better than all 8 color spaces. How can this happen? The increased robustness of color in radiometrically distorted regions is not important anymore (NCC and CENSUS do a better job). You practically do not lose texture when discarding color. Intensity is probably captured more robustly by today's cameras (less noise in the intensity channel). My advice: you should not use color, but you should definitely use a radiometrically insensitive match measure.

Support Aggregation in Global Stereo Matching In the session on local stereo methods, we learned about support aggregation: we do not match single pixels, but windows. Until recently, using windows in global stereo was considered a bad idea: you get artifacts at disparity discontinuities!

Support Aggregation in Global Stereo Matching We have also learned about new segmentation-based aggregation schemes: they deliver excellent performance, and in general they will even improve the performance near depth discontinuities. Apart from increased computational costs, there is relatively little that speaks against using these aggregation methods for implementing your pixel dissimilarity function m(). (Standard Support Weights) (Geodesic Support Weights)

Summary Data Term: Standard dissimilarity measures: Absolute / Squared intensity differences Sampling insensitive measures Radiometric insensitive measures: Mutual information ZNCC Census The role of color Segmentation-based aggregation schemes

References
[Birchfield,PAMI98] S. Birchfield, C. Tomasi, A Pixel Dissimilarity Measure That Is Insensitive to Image Sampling, PAMI, vol. 20, 1998.
[Bleyer,3DPVT10] M. Bleyer, S. Chambon, Does Color Really Help in Dense Stereo Matching?, 3DPVT, 2010.
[Hirschmueller,PAMI08] H. Hirschmueller, Stereo Processing by Semi-Global Matching and Mutual Information, PAMI, vol. 30, no. 2, 2008.
[Hirschmueller,PAMI09] H. Hirschmueller, D. Scharstein, Evaluation of Stereo Matching Costs on Images with Radiometric Differences, PAMI, vol. 31, no. 9, 2009.