© 2002 by Davi Geiger, Computer Vision, October 2002, L1.1
Binocular Stereo
[Figure: left image and right image of a stereo pair.]
Stereo Correspondence: Ambiguities
Each potential match is represented by a square. The black squares represent the most likely scene to "explain" the images, but other combinations (e.g., the red squares) could have given rise to the same images.
The set of black squares is preferred because its disparity values are similar, the ordering constraint is satisfied, and each point has a unique match. Any other set that could have given rise to the two images would have more widely varying disparity values and would violate either the ordering constraint or the uniqueness constraint.
Disparity values are inversely proportional to depth values.
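The inverse relation between disparity and depth can be illustrated with a minimal sketch. It assumes a rectified pair of parallel pinhole cameras, for which depth = focal length x baseline / disparity. The focal length and baseline do not appear in the slides; the function and parameter names below are hypothetical and serve only as an illustration.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth (in metres) of a point seen with the given disparity.

    Assumes rectified, parallel cameras (an assumption, not stated in
    the slides): depth = focal_px * baseline_m / disparity_px.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px


# Doubling the disparity halves the depth.
near = depth_from_disparity(20, focal_px=500, baseline_m=0.1)
far = depth_from_disparity(10, focal_px=500, baseline_m=0.1)
```

Here `far` is twice `near`, which is the inverse proportionality stated on the slide.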
Stereo Correspondence: Matching Space
[Figure: matching space for scene points A to F as seen along the left and right epipolar lines. Labels in the figure: "Right boundary no match", "Boundary no match", "Left", "depth discontinuity", "Surface orientation discontinuity".]
Stereo Correspondence: Constraints
Smoothness (similar depth values): In nature most surfaces are smooth compared to their distance to the observer, but depth discontinuities also occur.
Uniqueness: Given a point in the left image, there is only one point in the right image that matches it, i.e., there should be only one disparity value associated with each point.
Ordering constraint (monotonicity): Points to the right of q_l match points to the right of q_r. In the matching space this implies that the matches form a monotonic non-decreasing curve.
[Figure: matching space spanned by the left and right epipolar lines, with diagonals of constant disparity w = -2, 0, 2, 4.]
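The uniqueness and ordering constraints can be checked on a candidate set of matches along one epipolar line. The sketch below is only an illustration: it assumes matches are given as (j, t) pairs of left and right pixel indices, and the function names are hypothetical.

```python
def satisfies_uniqueness(matches):
    """True if no left index j and no right index t is used twice."""
    lefts = [j for j, _ in matches]
    rights = [t for _, t in matches]
    return len(set(lefts)) == len(lefts) and len(set(rights)) == len(rights)


def satisfies_ordering(matches):
    """True if t is non-decreasing when the matches are sorted by j."""
    ordered = sorted(matches)
    return all(t1 <= t2 for (_, t1), (_, t2) in zip(ordered, ordered[1:]))


good = [(1, 2), (2, 3), (3, 5)]   # monotonic curve, one match per point
crossed = [(1, 3), (2, 2)]        # second match lies to the left: order violated
shared = [(1, 2), (2, 2)]         # right pixel 2 is matched twice
```

In this example `good` passes both checks, `crossed` fails the ordering check, and `shared` fails the uniqueness check.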
Cooperative Stereo Algorithm: Data
C_0(e,j,t) ∈ [0,1] represents how good the match is between a point (e,j) in the left image and a point (e,t) in the right image, where t = j + d_j and d_j is the disparity at j. The epipolar lines are indexed by e.
To account for occlusions, we extend the matrix C_0(e,j,t) to include elements for j=0 and t=0, representing the total mismatch of a pixel (a half-occlusion). For example, if C_0(e,0,t) = 1, then pixel (e,t) is likely to be half-occluded.
Cooperative Stereo: Smoothing and Limit Disparity
The stereovision algorithm produces a series of matrices C_n, which converges to a good solution in many cases, with 0 < …, but such an update excludes the t=0 and j=0 nodes. The positive feedback is given by the two neighbors of node (e,j,t) with matches at the same disparity d = t-j.
The matrix is updated only within a disparity range of width 2D+1, i.e., |t-j| ≤ D. The rationale is:
(i) fewer computations;
(ii) larger-disparity matches imply larger errors in 3D estimation.
[Figure: matching space with left index j and right index t, diagonals w = -2, 0, 2, 4, and disparity limit D = 4.]
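The disparity limit can be pictured as a band around the main diagonal of the (j, t) matching space. The following sketch is an illustration under stated assumptions: it takes the band to be |t - j| ≤ D, uses 0-based indices, and ignores the occlusion nodes; the function name is hypothetical.

```python
def disparity_band(n, max_disparity):
    """n x n table of booleans: True where node (j, t) is updated.

    Assumes the band is |t - j| <= max_disparity, so each interior row
    holds 2 * max_disparity + 1 active nodes.
    """
    return [[abs(t - j) <= max_disparity for t in range(n)] for j in range(n)]


band = disparity_band(n=5, max_disparity=1)
active = sum(sum(row) for row in band)  # nodes touched per update
```

For n = 5 and D = 1 only 13 of the 25 nodes are active, which is the saving in computation referred to in (i).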
Cooperative Stereo: Uniqueness
We use the Sinkhorn algorithm to normalize C_n along both j and t and produce a doubly stochastic matrix C_n (every row and every column sums to 1). We index the normalization iterates by k and loop over k (typically 6 times).
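A minimal sketch of Sinkhorn normalization for one epipolar line is given below. It assumes a square matrix with strictly positive entries and leaves out the occlusion nodes, so it is not necessarily the exact update used in the lecture; the function name and the default of 6 iterations (taken from the "typically 6 times" remark) are illustrative.

```python
def sinkhorn(matrix, iterations=6):
    """Alternately normalize rows and columns of a positive square matrix.

    Returns a new matrix that approaches a doubly stochastic one as the
    number of iterations grows. The input is not modified.
    """
    m = [row[:] for row in matrix]
    size = len(m)
    for _ in range(iterations):
        # Normalize along t: make every row sum to 1.
        for row in m:
            total = sum(row)
            for i in range(size):
                row[i] /= total
        # Normalize along j: make every column sum to 1.
        for c in range(size):
            total = sum(row[c] for row in m)
            for row in m:
                row[c] /= total
    return m


scores = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]
balanced = sinkhorn(scores, iterations=50)
```

After enough iterations every row and every column of `balanced` sums to approximately 1, which is how the uniqueness constraint is softly enforced.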
Cooperative Stereo: Smoothing and Discontinuities
[Figure: matching space spanned by the left and right epipolar lines, showing node (j,t) and its neighbors at j-1, j+1, t-1, t+1, with diagonals w = -2, 0, 2, 4.]
Note that each term in (2) has been normalized to 1 so that 0 < …, where …
Cooperative Stereo: Epipolar Lines
Cyclopean Coordinate System
Let us assume N >> D, which is typical in stereo images. Then, for efficiency, the simplest representation for C(e,j,t) is C(e,x,w+D), with an increase in resolution (subpixel), where x = (t+j)/2 and w = t-j, with w varying in the range (-D, …, D) and x varying in the range (1, 1.5, …, N-0.5, N) at subpixel accuracy. This is known as the cyclopean coordinate system. We can recover (j,t) from (x,w) via t = (2x+w)/2 and j = (2x-w)/2.
[Figure: matching space in cyclopean coordinates, with left and right epipolar lines, rows t-1, t = 5, t+1, axes x and w, diagonals w = -4, 0, 4, and occluded units marked. Hypothesis: a match at the blue circle and at the blue "x", i.e., a horizontal jump of 4 units (v = 2.5) along x, from x to x+v.]
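The change of coordinates can be sketched as a pair of conversion functions, using x = (t+j)/2 and w = t-j together with the inverse relations t = (2x+w)/2 and j = (2x-w)/2. The function names are hypothetical, and exact fractions are used here only to keep the half-pixel values of x free of rounding.

```python
from fractions import Fraction


def to_cyclopean(j, t):
    """Map a (left, right) index pair to cyclopean coordinates (x, w)."""
    return Fraction(t + j, 2), t - j


def from_cyclopean(x, w):
    """Recover (j, t) from cyclopean coordinates (x, w)."""
    t = (2 * x + w) / 2
    j = (2 * x - w) / 2
    return j, t


x, w = to_cyclopean(j=3, t=4)    # x = 7/2 (a half-pixel position), w = 1
j, t = from_cyclopean(x, w)      # back to j = 3, t = 4
```

Note that x falls on a half-pixel position exactly when w is odd, which is where the subpixel resolution in x comes from.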
Cyclopean Cooperative Stereo
The initialization of C_0(e,x,w+D) can now use the gradient information at subpixel resolution and be written as … where C_0(e,x,w+D) is of size N x 2N x (2D+1). Since typically N >> D, this is much smaller than N x N x N.
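The saving in storage can be made concrete with a short calculation. The values N = 512 and D = 32 below are illustrative assumptions, not figures from the lecture, and the function names are hypothetical.

```python
def cyclopean_size(n, d):
    """Number of entries in C(e, x, w+D): N lines, 2N x-positions, 2D+1 disparities."""
    return n * (2 * n) * (2 * d + 1)


def full_size(n):
    """Number of entries in the full C(e, j, t) representation: N x N x N."""
    return n ** 3


n, d = 512, 32                      # hypothetical image width and disparity limit
compact = cyclopean_size(n, d)      # 34078720 entries
full = full_size(n)                 # 134217728 entries
ratio = compact / full              # equals 2 * (2D + 1) / N
```

With these values the cyclopean array needs about a quarter of the entries of the full array, and the ratio 2(2D+1)/N shrinks further as N grows relative to D.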
Cyclopean Cooperative Stereo (cont.)
We can update C_n as follows: …
The occlusion units and the normalization become simpler, because both can be obtained by focusing on each x coordinate.
[Figure: 7 x 7 matching-space grid of units marked "o", with axis labels j-1, j, j+1 and t-1, t, t+1.]
[Figure: matching-space grid at half-pixel resolution, with rows of units marked "x" alternating with rows of units marked "o"; axis labels j-1/2, j, j+1/2 and t-1, t-1/2, t, t+1/2, t+1; the disparity axis w is shown, with w = 2 marked.]
[Figure: matching-space grid at half-pixel resolution, with rows of units marked "x" alternating with rows of units marked "o"; axis labels j-1/2, j, j+1/2 and t-1, t-1/2, t, t+1/2, t+1; the disparity axis w is shown, with w = 2 marked, and the occluded units are indicated.]
Hypothesis: a match at the orange unit marked "o" followed by another match at the orange unit marked "x", i.e., a horizontal jump of 3 units (v = 4) along x.