Introduction to Computer Vision VIII. Jean Ponce. Web:


Slides (posted after each class) at: Next exercise: mean-shift (due November 24):

Basic stereo matching algorithm
For each pixel in the first image:
– Find the corresponding epipolar line in the right image.
– Examine all pixels on the epipolar line and pick the best match.
– Triangulate the matches to get depth information.
Simplest case: epipolar lines are scanlines. When does this happen?
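A minimal sketch of the loop above for the simplest (rectified) case, where the epipolar line of a pixel is the same scanline in the right image. The function name, the sum-of-squared-differences window cost, and the convention of storing the disparity as a non-negative leftward shift u - u' are assumptions made for illustration, not something the slides prescribe.

```python
import numpy as np

def scanline_block_matching(left, right, max_disp, half_win=2):
    """Brute-force window matching along scanlines of a rectified pair.

    Returns an integer map of shifts s >= 0 such that the window around
    left[y, x] best matches the window around right[y, x - s].
    Border pixels without a full window are left at 0.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    h, w = left.shape
    disp = np.zeros((h, w), dtype=int)
    for y in range(half_win, h - half_win):
        for x in range(half_win, w - half_win):
            ref = left[y - half_win:y + half_win + 1,
                       x - half_win:x + half_win + 1]
            best_cost, best_s = np.inf, 0
            for s in range(max_disp + 1):
                xr = x - s
                if xr - half_win < 0:
                    break  # candidate window would leave the image
                cand = right[y - half_win:y + half_win + 1,
                             xr - half_win:xr + half_win + 1]
                cost = np.sum((ref - cand) ** 2)  # SSD (assumed cost)
                if cost < best_cost:
                    best_cost, best_s = cost, s
            disp[y, x] = best_s
    return disp
```

Triangulation is not included here; it only needs the disparity and the baseline once the images are rectified.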

Rectification All epipolar lines are parallel in the rectified image plane.

Rectification example

Epipolar constraint example

Reconstruction from Rectified Images. Disparity: $d = u' - u$. Depth: $z = -B/d$.
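As a small worked example of the two formulas above, the sketch below computes depth from a pair of matched coordinates. It assumes normalized image coordinates (unit focal length) and follows the slide's sign convention; the function name and the handling of zero disparity are illustrative choices.

```python
import math

def depth_from_disparity(u, u_prime, baseline):
    """Depth z = -B/d with d = u' - u (normalized coordinates assumed)."""
    d = u_prime - u
    if d == 0:
        return math.inf  # zero disparity: the point is at infinity
    return -baseline / d

# With B = 0.1, u = 0.10 and u' = 0.05 give d = -0.05, so z is about 2.
z = depth_from_disparity(0.10, 0.05, 0.1)
```

Halving the magnitude of the disparity doubles the depth, which is why depth resolution degrades for distant points.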

Binocular fusion: a problem of correspondence

A Cooperative Model (Marr and Poggio, 1976). Excitatory connections: continuity. Inhibitory connections: uniqueness. Iterate: $C = \sum C_e - w \sum C_i + C_0$. Reprinted from Vision: A Computational Investigation into the Human Representation and Processing of Visual Information by David Marr. © 1982 by David Marr. Reprinted by permission of Henry Holt and Company, LLC.

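Correlation Methods (1970--). Slide the window along the epipolar line until $w \cdot w'$ is maximized, which is equivalent to minimizing $|w - w'|^2$ when the windows have fixed norm. Normalized correlation: minimize the angle $\theta$ between $w$ and $w'$ instead.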
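The sketch below shows one common way to compute a normalized correlation score between two windows, treated as vectors. Subtracting the window means before normalizing is an assumption (the slide's exact formula is not preserved), and returning 0 for a constant window is an illustrative choice.

```python
import numpy as np

def normalized_correlation(w, w_prime):
    """Cosine of the angle between two mean-subtracted windows, in [-1, 1]."""
    a = np.asarray(w, dtype=float).ravel()
    b = np.asarray(w_prime, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 0.0  # textureless window: the score is undefined, report 0
    return float(a @ b / denom)

def squared_distance_unit(a, b):
    """For unit vectors, |a - b|^2 = 2 - 2 a.b, so minimizing the distance
    and maximizing the dot product select the same match."""
    a = np.asarray(a, dtype=float) / np.linalg.norm(a)
    b = np.asarray(b, dtype=float) / np.linalg.norm(b)
    return float(np.sum((a - b) ** 2)), float(2.0 - 2.0 * (a @ b))
```

Because the score is invariant to gain and offset changes of either window, it tolerates moderate exposure differences between the two cameras.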

Correlation-based methods. Figure labels: left image, right image, scanline, normalized correlation score.

Failures of correlation-based methods
– Textureless surfaces
– Occlusions, repetition
– Non-Lambertian surfaces, specularities

Effect of window size
– Smaller window: more detail (+), more noise (-)
– Larger window: smoother disparity maps (+), less detail (-)
Figure panels: W = 3, W = 20.

Results. Figure panels: data, correlation-based matching, ground truth.

Correlation Methods: Foreshortening Problems. Solution: add a second pass using disparity estimates to warp the correlation windows, e.g. Devernay and Faugeras (1994). Reprinted from “Computing Differential Properties of 3D Shapes from Stereopsis without 3D Models,” by F. Devernay and O. Faugeras, Proc. IEEE Conf. on Computer Vision and Pattern Recognition (1994). © 1994 IEEE.

How can we improve window-based matching? The similarity constraint is local: each reference window is matched independently. Need to enforce global correspondence constraints.

Non-local constraints
– Uniqueness: for any point in one image, there should be at most one matching point in the other image.
– Ordering: corresponding points should be in the same order in both views. The ordering constraint does not (always) hold.
– Smoothness: we expect disparity values to change slowly (for the most part).

Scanline stereo. Try to coherently match pixels on the entire scanline. Different scanlines are still optimized independently. Figure panels: left image, right image.

“Shortest paths” for scan-line stereo. Can be implemented with dynamic programming (Baker & Binford ’81, Ohta & Kanade ’85). Figure labels: left image, right image, correspondence, left occlusion, right occlusion, p, q, s, t. Slide credit: Y. Boykov
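To make the shortest-path idea concrete, here is a minimal dynamic-programming sketch that aligns one left scanline with one right scanline. Each step either matches a pair of pixels or declares one pixel occluded. The squared intensity difference as matching cost, the constant occlusion penalty `occ_cost`, and the function name are assumptions for illustration; the cited papers use richer costs.

```python
import numpy as np

def dp_scanline(left, right, occ_cost):
    """Align two scanlines; returns (list of (i, j) matches, total cost)."""
    n, m = len(left), len(right)
    cost = np.zeros((n + 1, m + 1))
    move = np.zeros((n + 1, m + 1), dtype=int)  # 0 match, 1 skip left, 2 skip right
    cost[:, 0] = occ_cost * np.arange(n + 1)
    cost[0, :] = occ_cost * np.arange(m + 1)
    move[1:, 0] = 1
    move[0, 1:] = 2
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            options = (
                cost[i - 1, j - 1] + (left[i - 1] - right[j - 1]) ** 2,  # match
                cost[i - 1, j] + occ_cost,  # left pixel has no match
                cost[i, j - 1] + occ_cost,  # right pixel has no match
            )
            move[i, j] = int(np.argmin(options))
            cost[i, j] = options[move[i, j]]
    # Backtrack from the end of both scanlines to recover the path.
    matches, i, j = [], n, m
    while i > 0 or j > 0:
        if move[i, j] == 0:
            matches.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move[i, j] == 1:
            i -= 1
        else:
            j -= 1
    return matches[::-1], float(cost[n, m])
```

Because the path only moves forward in both scanlines, this formulation builds in the uniqueness and ordering constraints; each scanline is still solved independently of its neighbors.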

Shortest path stereo in real life. Scanline stereo generates streaking artifacts. Can’t use dynamic programming to find spatially coherent disparities and correspondences on a 2D grid.

Stereo matching as energy minimization
$E(D) = \sum_i \left(W_1(i) - W_2(i + D(i))\right)^2 + \lambda \sum_{\text{neighbors } i,j} \rho\left(D(i) - D(j)\right)$
Here $W_1(i)$ and $W_2(i + D(i))$ are matching windows in images $I_1$ and $I_2$, and $D$ is the disparity map.
Energy functions of this form can be minimized using “graph cuts” (aka min-cut/max-flow algorithms).
Y. Boykov, O. Veksler, and R. Zabih, “Fast Approximate Energy Minimization via Graph Cuts,” PAMI 2001.
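The following sketch only evaluates an energy of the usual "data term plus smoothness term" form for a candidate disparity map on a single scanline; it does not perform the graph-cut minimization. The one-pixel windows, the truncated absolute difference used as the smoothness penalty, the weight `lam`, and the clamping of out-of-range indices are all assumptions for illustration.

```python
import numpy as np

def stereo_energy(I1, I2, D, lam=1.0, trunc=2):
    """Energy of disparity map D on one scanline.

    Data term: squared difference between I1[i] and I2[i + D[i]].
    Smoothness term: truncated |D[i] - D[i+1]| over neighboring pixels.
    """
    I1 = np.asarray(I1, dtype=float)
    I2 = np.asarray(I2, dtype=float)
    D = np.asarray(D, dtype=int)
    # Clamp indices that fall outside the second image (assumed policy).
    idx = np.clip(np.arange(len(I1)) + D, 0, len(I2) - 1)
    data = np.sum((I1 - I2[idx]) ** 2)
    smooth = np.sum(np.minimum(np.abs(np.diff(D)), trunc))
    return float(data + lam * smooth)
```

A minimizer would search over D for the lowest value of this function; the truncation keeps the penalty bounded at depth discontinuities.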

Combinatorial optimization with unary and binary terms: quadratic pseudo-Boolean function optimization (binary variables).

Generalization to integer variables: n integer variables in 0..K-1 become nK binary variables (Darbon, 2009), with $x_k = 0$ if $x \le k$ and 1 otherwise.
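A short sketch of this encoding for a single integer variable, assuming the indicator is written $x_k$ for k = 0..K-1. The function names are hypothetical.

```python
def to_binary(x, K):
    """Encode an integer x in 0..K-1 as K binary variables:
    x_k = 0 if x <= k, and 1 otherwise."""
    if not 0 <= x < K:
        raise ValueError("x must lie in 0..K-1")
    return [0 if x <= k else 1 for k in range(K)]

def from_binary(bits):
    """Decode: exactly x of the K indicators are 1, so x is their sum."""
    return sum(bits)
```

A valid code is always a run of ones followed by zeros, and the last indicator is always 0 since x never exceeds K-1.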

Quadratic integer function optimization
– Submodular case: min-cut/max-flow problems (Boros & Hammer, 2002), with efficient exact algorithms (Ford & Fulkerson ’56; Goldberg & Tarjan ’88; Boykov & Kolmogorov ’04).
– Otherwise: NP-hard.

Quadratic integer function optimization
– For example, with g convex (Ishikawa ’03): min-cut/max-flow problems (Boros & Hammer, 2002), with efficient exact algorithms (Ford & Fulkerson ’56; Goldberg & Tarjan ’88; Boykov & Kolmogorov ’04).
– Otherwise: NP-hard.

Quadratic pseudo-Boolean function optimization
– For example, with g convex (Ishikawa ’03): min-cut/max-flow problems (Boros & Hammer, 2002).
– Otherwise: NP-hard, with efficient approximate algorithms (Boykov et al. ’01).

Combinatorial optimization: submodularity is “too restrictive” for certain stereo settings (for example, those using a non-convex g). Use iterative approximate solutions such as alpha expansion (Boykov et al. 2001).

Back to stereopsis as energy minimization. Figure labels: images $I_1$ and $I_2$, disparity map $D$, windows $W_1(i)$ and $W_2(i + D(i))$, disparity $D(i)$. Energy functions of this form can be minimized using “graph cuts” (aka min-cut/max-flow algorithms). Y. Boykov, O. Veksler, and R. Zabih, “Fast Approximate Energy Minimization via Graph Cuts,” PAMI 2001.

Stereo matching as energy minimization
Note: the above formulation does not treat the two images symmetrically, does not enforce uniqueness, and does not take occlusions into account.
It is possible to come up with an energy that does all these things, but it is a bit more complex:
– Defined over all possible sets of matches, not over all disparity maps with respect to the first image
– Includes an occlusion term
– The smoothness term looks different and more complicated
V. Kolmogorov and R. Zabih, “Computing Visual Correspondences with Occlusions using Graph Cuts,” ICCV 2001.

Results. Figure panels: graph cuts, ground truth. Y. Boykov, O. Veksler and R. Zabih, “Fast Approximate Energy Minimization via Graph Cuts,” PAMI 2001. For the latest and greatest:

More Views (Okutomi and Kanade, 1993). Pick a reference image, and slide the corresponding window along the corresponding epipolar lines of all other images, using inverse depth relative to the first image as the search parameter. Use the sum of correlation scores to rank matches. Reprinted from “A Multiple-Baseline Stereo System,” by M. Okutomi and T. Kanade, IEEE Trans. on Pattern Analysis and Machine Intelligence, 15(4): (1993). © 1993 IEEE.
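A one-dimensional sketch of the multiple-baseline search described above. For each candidate inverse depth, the shift in image k is taken to be proportional to that camera's baseline, and the window scores are summed over all images. Using a sum of squared differences (lower is better) in place of a correlation score, integer pixel shifts, and all names are assumptions for illustration.

```python
import numpy as np

def multi_baseline_inverse_depth(ref, others, baselines, x, candidates, half_win=3):
    """Pick the inverse depth whose summed window cost over all images is lowest.

    ref: reference scanline; others[k]: scanline of camera k, whose shift
    for inverse depth zeta is assumed to be round(baselines[k] * zeta).
    The caller must keep every shifted window inside the scanline.
    """
    ref = np.asarray(ref, dtype=float)
    patch = ref[x - half_win:x + half_win + 1]
    best_zeta, best_cost = None, np.inf
    for zeta in candidates:
        cost = 0.0
        for img, b in zip(others, baselines):
            d = int(round(b * zeta))
            cand = np.asarray(img, dtype=float)[x + d - half_win:x + d + half_win + 1]
            cost += np.sum((patch - cand) ** 2)
        if cost < best_cost:
            best_zeta, best_cost = zeta, cost
    return best_zeta
```

Searching in inverse depth gives every camera a common parameter even though the pixel shifts differ, so ambiguous minima in one image tend to be cancelled by the others.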

Figure panels: I1, I2, I10. Reprinted from “A Multiple-Baseline Stereo System,” by M. Okutomi and T. Kanade, IEEE Trans. on Pattern Analysis and Machine Intelligence, 15(4): (1993). © 1993 IEEE.

Multi-view geometry questions
– Scene geometry (structure): given 2D point matches in two or more images, where are the corresponding points in 3D?
– Correspondence (fusion): given a point in just one image, how does it constrain the position of the corresponding point in another image?
– Camera geometry (motion): given a set of corresponding points in two or more images, what are the camera matrices for these views?

The Euclidean (perspective) Structure-from-Motion Problem
Given m (internally) calibrated perspective images of n fixed points $P_j$, we can write $p_{ij} = \frac{1}{z_{ij}} \begin{pmatrix} R_i & t_i \end{pmatrix} \begin{pmatrix} P_j \\ 1 \end{pmatrix}$, where $z_{ij}$ is the depth of $P_j$ in camera i.
Problem: estimate the m 3x4 matrices $M_i = [R_i \; t_i]$ and the n positions $P_j$ from the mn correspondences $p_{ij}$.
2mn equations in 11m (or rather 5m) + 3n unknowns: an overconstrained problem that can be solved using (non-linear) least squares!

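The Euclidean Ambiguity of Euclidean SFM
If $R_i$, $t_i$, and $P_j$ are solutions, so are $R_i'$, $t_i'$, and $P_j'$ obtained by applying a common rigid transformation to the scene and the cameras.
In fact, the absolute scale cannot be recovered either, since scaling all $t_i$ and $P_j$ by the same factor leaves the images unchanged.
When the intrinsic parameters are known: Euclidean ambiguity, up to a similarity transformation.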
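One consistent way to write this ambiguity, under an assumed parametrization (a rotation $R$, a translation $t$, and a scale $\lambda \neq 0$; the slide's own notation is not preserved in this transcript), is sketched below. The first line shows that a rigid change of world frame leaves every camera-frame point unchanged; the second shows that a global scale cancels in the perspective division.

```latex
\begin{align*}
P_j' = R P_j + t,\quad R_i' = R_i R^T,\quad t_i' = t_i - R_i R^T t
  &\;\Longrightarrow\; R_i' P_j' + t_i' = R_i P_j + t_i, \\
p_{ij} = \frac{1}{z_{ij}} \left( R_i P_j + t_i \right)
  &= \frac{1}{\lambda z_{ij}} \left( R_i (\lambda P_j) + \lambda t_i \right).
\end{align*}
```

Together these give a 7-parameter similarity: 3 for the rotation, 3 for the translation, and 1 for the scale.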

Euclidean motion from E (Longuet-Higgins, 1981)
Given F computed from n > 7 point correspondences, and its SVD $F = U \Sigma V^T$, compute $E = U \,\mathrm{diag}(1,1,0)\, V^T$.
There are two solutions $t' = u_3$ (the third column of U) and $t'' = -t'$ to $E^T t = 0$.
Define $R' = U W V^T$ and $R'' = U W^T V^T$, where $W = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$. (It is easy to check that $R'$ and $R''$ are rotations.)
Then $[t'_\times] R' = -E$ and $[t'_\times] R'' = E$. Similar reasoning for $t''$.
Four solutions. Only one of them places the reconstructed points in front of both cameras.
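The decomposition above can be sketched with NumPy as follows. The sketch assumes E is already an essential matrix (two equal singular values, one zero) and takes W to be the 90-degree rotation about the z axis; the sign flips on U and V are an implementation choice that makes both candidate matrices proper rotations. The helper names are hypothetical, and selecting the physically valid candidate (points in front of both cameras) is left out.

```python
import numpy as np

def skew(t):
    """Matrix [t_x] such that skew(t) @ v equals the cross product t x v."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def decompose_essential(E):
    """Return the four (R, t) candidates obtained from the SVD of E."""
    U, _, Vt = np.linalg.svd(E)
    # E is only defined up to sign, so flipping U or V is harmless and
    # guarantees det(U) = det(V) = +1.
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0]])  # assumed form of W
    t = U[:, 2]  # unit vector with E^T t = 0
    R1 = U @ W @ Vt
    R2 = U @ W.T @ Vt
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]
```

In practice one triangulates a few matches with each candidate and keeps the one giving positive depth in both views.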

Euclidean reconstruction. Mean relative error: 3.1%

Euclidean (= similarity) ambiguity

If P is unconstrained: Projective ambiguity

Projective ambiguity

Structure from motion. Let us now look at simpler, affine cameras (center at infinity).

The Affine Structure-from-Motion Problem
Given m images of n fixed points $P_j$, we can write $p_{ij} = M_i \begin{pmatrix} P_j \\ 1 \end{pmatrix}$.
Problem: estimate the m 2x4 matrices $M_i$ and the n positions $P_j$ from the mn correspondences $p_{ij}$.
2mn equations in 8m + 3n unknowns: an overconstrained problem that can be solved using (non-linear) least squares!

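The Affine Ambiguity of Affine SFM
If $M_i$ and $P_j$ are solutions, so are $M_i'$ and $P_j'$, where $M_i' = M_i Q$ and $\begin{pmatrix} P_j' \\ 1 \end{pmatrix} = Q^{-1} \begin{pmatrix} P_j \\ 1 \end{pmatrix}$, and Q is an affine transformation.
This is the ambiguity when the intrinsic parameters are unknown.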
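The ambiguity can be checked numerically. The sketch below assumes 2x4 affine camera matrices acting on homogeneous points and a 4x4 affine transformation Q whose last row is (0, 0, 0, 1); the function names are hypothetical.

```python
import numpy as np

def project_affine(M, P):
    """Project 3xn points P with a 2x4 affine camera M."""
    P_h = np.vstack([P, np.ones((1, P.shape[1]))])
    return M @ P_h

def apply_affine_ambiguity(M, P, Q):
    """Return M' = M Q and the points P' with [P'; 1] = Q^{-1} [P; 1]."""
    P_h = np.vstack([P, np.ones((1, P.shape[1]))])
    P2_h = np.linalg.solve(Q, P_h)  # Q^{-1} [P; 1], last row stays 1
    return M @ Q, P2_h[:3]
```

Since M' [P'; 1] = M Q Q^{-1} [P; 1], the images are identical, so no amount of image data alone can distinguish the two reconstructions.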

Affine ambiguity

Affine ambiguity

Types of ambiguity
Projective (15 dof): preserves intersection and tangency
Affine (12 dof): preserves parallelism, volume ratios
Similarity (7 dof): preserves angles, ratios of lengths
Euclidean (6 dof): preserves angles, lengths
With no constraints on the camera calibration matrix or on the scene, we get a projective reconstruction.
Need additional information to upgrade the reconstruction to affine, similarity, or Euclidean.

Affine Structure from Motion. Given m pictures of n points, can we recover the three-dimensional configuration of these points (structure) and the camera configurations (motion)? Reprinted with permission from “Affine Structure from Motion,” by J.J. Koenderink and A.J. Van Doorn, Journal of the Optical Society of America A, 8: (1990). © 1990 Optical Society of America.

Affine projections induce affine transformations from planes onto their images.

Affine coordinates are planar affine invariants, and so are (signed) ratios of distances along parallel lines.

Geometric affine scene reconstruction from two images (Koenderink and Van Doorn, 1991).