3D Computer Vision and Video Computing 3D Vision Topic 3 of Part II Stereo Vision CSc I6716 Spring 2011 Zhigang Zhu, City College of New York


1 3D Computer Vision and Video Computing 3D Vision Topic 3 of Part II Stereo Vision CSc I6716 Spring 2011 Zhigang Zhu, City College of New York zhu@cs.ccny.cuny.edu

2 Stereo Vision
- Problem
  - Infer 3D structure of a scene from two or more images taken from different viewpoints
- Two primary sub-problems
  - Correspondence problem (stereo match) -> disparity map
    - "Similar" instead of "same"
    - Occlusion problem: some parts of the scene are visible in only one eye
  - Reconstruction problem -> 3D
    - What we need to know about the cameras' parameters
    - Often a stereo calibration problem
- Lectures on Stereo Vision
  - Stereo Geometry: Epipolar Geometry (*)
  - Correspondence Problem (*): two classes of approaches
  - 3D Reconstruction Problems: three approaches

3 A Stereo Pair
- Problems
  - Correspondence problem (stereo match) -> disparity map
  - Reconstruction problem -> 3D
- Image source: CMU CIL Stereo Dataset, Castle sequence
  http://www-2.cs.cmu.edu/afs/cs/project/cil/ftp/html/cil-ster.html

4 More Images…
- Problems
  - Correspondence problem (stereo match) -> disparity map
  - Reconstruction problem -> 3D

9 Part I. Stereo Geometry
- A Simple Stereo Vision System
  - Disparity Equation
  - Depth Resolution
  - Fixated Stereo System
    - Zero-disparity Horopter
- Epipolar Geometry
  - Epipolar lines: where to search correspondences
    - Epipolar Plane, Epipolar Lines and Epipoles
    - http://www.ai.sri.com/~luong/research/Meta3DViewer/EpipolarGeo.html
  - Essential Matrix and Fundamental Matrix
    - Computing E & F by the Eight-Point Algorithm
    - Computing the Epipoles
- Stereo Rectification

10 Stereo Geometry
- Converging axes: the usual setup of human eyes
- Depth obtained by triangulation
- Correspondence problem: p_l and p_r are the left and right projections of P(X, Y, Z), respectively

11 A Simple Stereo System
[Figure: left camera (left image: reference) and right camera (right image: target) separated by the baseline; labels mark the disparity, the depth Z, and the elevation Z_w above the plane Z_w = 0]

12 Disparity Equation
- Stereo system with parallel optical axes
- Disparity: dx = x_r - x_l
[Figure: point P(X, Y, Z) at depth Z projects to p_l(x_l, y_l) on the left image plane (optical center O_l) and to p_r(x_r, y_r) on the right image plane (optical center O_r); f = focal length, B = baseline]
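
For the parallel-axis geometry on this slide, similar triangles give a depth of Z = f * B / |dx|. The following is a minimal Python sketch of that relation, not the course's own code. It assumes f is expressed in pixels and B in metres, and it uses the magnitude of the disparity because the sign of dx = x_r - x_l depends on the image-axis convention. The function name is illustrative.

```python
def depth_from_disparity(f_pixels, baseline_m, dx_pixels):
    """Depth Z = f * B / |dx| for a stereo pair with parallel optical axes."""
    if dx_pixels == 0:
        # Zero disparity corresponds to a point at infinity.
        return float("inf")
    return f_pixels * baseline_m / abs(dx_pixels)
```

Doubling the baseline doubles the disparity of a point at a fixed depth, which is why a longer baseline gives finer depth resolution.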

13 Disparity vs. Baseline
- Stereo system with parallel optical axes
- Disparity: dx = x_r - x_l
[Figure: point P(X, Y, Z) at depth Z projects to p_l(x_l, y_l) in the left camera (optical center O_l) and to p_r(x_r, y_r) in the right camera (optical center O_r); f = focal length, B = baseline]

14 Depth Accuracy
- Given the same image localization error
  - Angle of cones in the figure
- Depth accuracy (depth resolution) vs. baseline
  - Depth error ∝ 1/B (baseline length)
  - PROS of a longer baseline
    - Better depth estimation
  - CONS
    - Smaller common FOV
    - Correspondence harder due to occlusion
- Depth accuracy (depth resolution) vs. depth
  - Disparity (> 0) ∝ 1/Depth
  - Depth error ∝ Depth^2
  - The nearer the point, the better the depth estimation
- An example
  - f = 16 x 512/8 pixels, B = 0.5 m
  - Depth error vs. depth
[Figure: two viewpoints O_l and O_r observing points at depths Z_1 and Z_2 > Z_1; plots of absolute error and relative error against depth]
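
The two proportionalities above follow from differentiating Z = f * B / d, which gives a first-order depth error of Z^2 / (f * B) per unit of disparity error. The sketch below evaluates that expression for the slide's example values. The one-pixel disparity error and the sample depths are assumptions made only for illustration.

```python
def depth_error(z_m, f_pixels, baseline_m, disparity_error_px=1.0):
    """First-order depth error dZ = Z^2 / (f * B) * dd, derived from Z = f * B / d."""
    return z_m ** 2 / (f_pixels * baseline_m) * disparity_error_px

F_PIXELS = 16 * 512 / 8   # 1024 pixels, from the slide's example
BASELINE_M = 0.5          # metres, from the slide's example

# Sample depths are illustrative, not taken from the slide.
for z in (1.0, 5.0, 10.0, 50.0):
    abs_err = depth_error(z, F_PIXELS, BASELINE_M)
    print(f"Z = {z:5.1f} m  absolute error = {abs_err:8.4f} m  "
          f"relative error = {abs_err / z:.4%}")
```

The absolute error grows with the square of the depth, while the relative error grows linearly with depth.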

15 Stereo with Converging Cameras
- Stereo with parallel axes
  - Short baseline
    - Large common FOV
    - Large depth error
  - Long baseline
    - Small depth error
    - Small common FOV
    - More occlusion problems
- Two optical axes intersect at the fixation point, forming the converging angle
  - The common FOV increases
[Figure: left and right fields of view]

17 Stereo with Converging Cameras
- Two optical axes intersect at the fixation point, forming the converging angle
  - The common FOV increases
- Disparity properties
  - Disparity uses angle instead of distance
  - Zero disparity at the fixation point
    - and on the zero-disparity horopter
  - Disparity increases with the distance of objects from the fixation point
    - > 0: outside the horopter
    - < 0: inside the horopter
- Depth accuracy vs. depth
  - Depth error ∝ Depth^2
  - The nearer the point, the better the depth estimation
[Figure: left and right fields of view, converging angle, fixation point]

18 Stereo with Converging Cameras (bullets repeat slide 17)
[Figure: a point on the horopter; the right and left viewing angles are equal, so the angular disparity is 0]

19 Stereo with Converging Cameras (bullets repeat slide 17)
[Figure: the right viewing angle is larger than the left one, so the angular disparity is > 0]

20 Stereo with Converging Cameras (bullets repeat slide 17)
[Figure: the right viewing angle is smaller than the left one, so the angular disparity is < 0]

21 Stereo with Converging Cameras (bullets repeat slide 17)
[Figure: left and right viewing angles and the angular disparity relative to the horopter and the fixation point]

22 Break
- Homework #4 online, due on May 03 before class

23 Parameters of a Stereo System
- Intrinsic parameters
  - Characterize the transformation from camera to pixel coordinate systems of each camera
  - Focal length, image center, aspect ratio
- Extrinsic parameters
  - Describe the relative position and orientation of the two cameras
  - Rotation matrix R and translation vector T
[Figure: point P seen as P_l and P_r from optical centers O_l and O_r, with image points p_l and p_r, focal lengths f_l and f_r, and camera axes (X_l, Y_l, Z_l) and (X_r, Y_r, Z_r) related by R, T]

24 Epipolar Geometry
- Notation
  - P_l = (X_l, Y_l, Z_l), P_r = (X_r, Y_r, Z_r)
    - Vectors of the same 3-D point P in the left and right camera coordinate systems, respectively
  - Extrinsic parameters
    - Translation vector T = (O_r - O_l)
    - Rotation matrix R
  - p_l = (x_l, y_l, z_l), p_r = (x_r, y_r, z_r)
    - Projections of P on the left and right image planes, respectively
    - For all image points, z_l = f_l and z_r = f_r
[Figure: stereo configuration showing P, P_l, P_r, O_l, O_r, p_l, p_r, f_l, f_r, the camera axes, and R, T]

25 Epipolar Geometry
- Motivation: where to search correspondences?
  - Epipolar plane
    - A plane through the point P and the centers of projection (COPs) of the two cameras
  - Conjugated epipolar lines
    - Lines where the epipolar plane intersects the image planes
  - Epipoles
    - The image of the COP of one camera in the other
- Epipolar constraint
  - Corresponding points must lie on conjugated epipolar lines
[Figure: epipolar plane, epipolar lines, and epipoles e_l and e_r for P, O_l, O_r, p_l, p_r]

26 Essential Matrix
- Equation of the epipolar plane
  - Co-planarity condition of the vectors P_l, T and P_l - T
- Essential matrix E = RS
  - 3x3 matrix constructed from R and T (extrinsic only)
  - Rank(E) = 2, with two equal nonzero singular values
  - Rank(R) = 3, Rank(S) = 2
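
The equations on this slide did not survive in the transcript. One consistent reconstruction is sketched below. It assumes the common convention P_r = R (P_l - T), chosen here because it produces E = RS as stated on the slide; S denotes the skew-symmetric matrix of T.

```latex
\begin{aligned}
(P_l - T)^{\top}\,(T \times P_l) &= 0
  && \text{(coplanarity of } P_l,\ T,\ P_l - T\text{)} \\
T \times P_l &= S\,P_l, \qquad
S = \begin{bmatrix} 0 & -T_z & T_y \\ T_z & 0 & -T_x \\ -T_y & T_x & 0 \end{bmatrix} \\
P_r = R\,(P_l - T) \;&\Rightarrow\; (R^{\top} P_r)^{\top} S\,P_l = P_r^{\top} R\,S\,P_l = 0 \\
E = R\,S, \qquad P_r^{\top} E\,P_l &= 0 \;\Rightarrow\; p_r^{\top} E\,p_l = 0
\end{aligned}
```

The last step uses p_l = f_l P_l / Z_l and p_r = f_r P_r / Z_r: the nonzero scale factors cancel from a homogeneous equation.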

27 Essential Matrix
- Essential matrix E = RS
  - A natural link between the stereo point pair and the extrinsic parameters of the stereo system
    - One correspondence -> a linear equation in the 9 entries
    - Given 8 pairs of (p_l, p_r) -> E
  - The mapping between points and epipolar lines we are looking for
    - Given p_l and E -> p_r lies on the projective line in the right plane
    - The equation represents the epipolar line of p_r (or p_l) in the right (or left) image
- Note
  - p_l and p_r are in the camera coordinate system, not the pixel coordinates that we can measure

28 Fundamental Matrix
- Mapping between points and epipolar lines in the pixel coordinate systems
  - With no prior knowledge of the stereo system
- From camera to pixels: matrices of intrinsic parameters
  - Rank(M_int) = 3
- Questions
  - What are f_x, f_y, o_x, o_y?
  - How to measure p_l in images?

29 Fundamental Matrix
- Fundamental matrix
  - Rank(F) = 2
  - Encodes information on both intrinsic and extrinsic parameters
  - Enables full reconstruction of the epipolar geometry
  - Works in pixel coordinate systems without any knowledge of the intrinsic and extrinsic parameters
  - Linear equation in the 9 entries of F
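
The defining equations are missing from the transcript. A reconstruction in the usual form is given below, where M_l and M_r are assumed to be the 3x3 intrinsic matrices that map camera coordinates to homogeneous pixel coordinates, and the barred symbols denote pixel coordinates (notation introduced here for illustration).

```latex
\begin{aligned}
\bar{p}_l = M_l\,p_l, \qquad \bar{p}_r &= M_r\,p_r \\
p_r^{\top} E\,p_l = 0 \;&\Rightarrow\;
  \bar{p}_r^{\top}\, M_r^{-\top} E\, M_l^{-1}\, \bar{p}_l = 0 \\
F = M_r^{-\top} E\, M_l^{-1}, \qquad \bar{p}_r^{\top} F\, \bar{p}_l &= 0
\end{aligned}
```

For a given left pixel, the vector F \bar{p}_l holds the coefficients of its epipolar line in the right image. Because E has rank 2 and the intrinsic matrices are invertible, F also has rank 2.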

30 Computing F: The Eight-point Algorithm
- Input: n point correspondences (n >= 8)
  - Construct the homogeneous system Ax = 0 from the n correspondences
    - x = (f_11, f_12, f_13, f_21, f_22, f_23, f_31, f_32, f_33): the entries of F
    - Each correspondence gives one equation
    - A is an n x 9 matrix
  - Obtain the estimate F^ by SVD of A
    - x (up to a scale) is the column of V corresponding to the least singular value
  - Enforce the singularity constraint, since Rank(F) = 2
    - Compute the SVD of F^: F^ = U D V^T
    - Set the smallest singular value to 0: D -> D'
    - Corrected estimate of F: F' = U D' V^T
- Output: the estimate of the fundamental matrix, F'
- Similarly, we can compute E given the intrinsic parameters
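
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a production routine: it assumes the constraint is written as p_r^T F p_l = 0 with F stored row by row, and it omits the coordinate normalization that is normally applied first when working with raw pixel coordinates. The function name is illustrative.

```python
import numpy as np

def eight_point(pts_l, pts_r):
    """Estimate F from n >= 8 correspondences so that p_r^T F p_l = 0.

    pts_l, pts_r: (n, 2) arrays of matching image coordinates.
    """
    pts_l = np.asarray(pts_l, dtype=float)
    pts_r = np.asarray(pts_r, dtype=float)
    if pts_l.shape != pts_r.shape or pts_l.shape[0] < 8:
        raise ValueError("need at least 8 matching point pairs")
    xl, yl = pts_l[:, 0], pts_l[:, 1]
    xr, yr = pts_r[:, 0], pts_r[:, 1]
    ones = np.ones_like(xl)
    # One row per correspondence; the unknowns are f11..f33 in row-major order.
    A = np.column_stack([xr * xl, xr * yl, xr,
                         yr * xl, yr * yl, yr,
                         xl, yl, ones])
    # Null vector of A: right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    F_hat = vt[-1].reshape(3, 3)
    # Enforce rank 2 by zeroing the smallest singular value of F_hat.
    U, D, Vt = np.linalg.svd(F_hat)
    D[2] = 0.0
    return U @ np.diag(D) @ Vt
```

Calling the same function with points expressed in camera coordinates (pixel coordinates with the intrinsic parameters removed) yields an estimate of E instead of F.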

31 Locating the Epipoles from F
- Input: fundamental matrix F
  - Find the SVD of F
  - The epipole e_l is the column of V corresponding to the null singular value (as shown above)
  - The epipole e_r is the column of U corresponding to the null singular value
- Output: epipoles e_l and e_r
- Reasoning: e_l lies on all the epipolar lines of the left image, so p_r^T F e_l = 0 for every p_r; since F is not identically zero, F e_l = 0
[Figure: epipolar plane, epipolar lines, and epipoles e_l and e_r for P, O_l, O_r, p_l, p_r]
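
A minimal NumPy sketch of this procedure follows, assuming F is an (approximately) rank-2 matrix satisfying p_r^T F p_l = 0. The epipoles come back as homogeneous 3-vectors of unit length; dividing by the third component gives image coordinates unless the epipole lies at infinity. The function name is illustrative.

```python
import numpy as np

def epipoles_from_F(F):
    """Return (e_l, e_r) as homogeneous 3-vectors with F e_l = 0 and F^T e_r = 0."""
    U, _, Vt = np.linalg.svd(np.asarray(F, dtype=float))
    e_l = Vt[-1]      # column of V for the (near-)zero singular value
    e_r = U[:, -1]    # column of U for the (near-)zero singular value
    return e_l, e_r
```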

33 Stereo Rectification
- Rectification
  - Given a stereo pair and its intrinsic and extrinsic parameters, find the image transformation that yields a stereo system with horizontal epipolar lines
  - A simple algorithm: assuming calibrated stereo cameras
- Stereo system with parallel optical axes
  - Epipoles are at infinity
  - Horizontal epipolar lines
[Figure: rectified camera frames (X'_l, Y'_l, Z'_l) at O_l and (X'_r, Y'_r, Z'_r) at O_r, separated by T and viewing P]

34 Stereo Rectification
- Algorithm
  - Rotate both the left and right cameras so that they share the same X axis: O_r - O_l = T
  - Define a rotation matrix R_rect for the left camera
  - The rotation matrix for the right camera is R_rect R^T
  - The rotation can be implemented by an image transformation
- New left axes: X'_l = T, Y'_l = X'_l x Z_l, Z'_l = X'_l x Y'_l
[Figure: original camera frames (X_l, Y_l, Z_l) and (X_r, Y_r, Z_r) related by R, T, with the new axis X'_l along T]

36 Stereo Rectification (algorithm bullets repeat slide 34)
- After rectification: T' = (B, 0, 0), P'_r = P'_l - T'
[Figure: rectified frames (X'_l, Y'_l, Z'_l) and (X'_r, Y'_r, Z'_r) at O_l and O_r, with the original Z_r and R, T shown for reference]
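
The construction of R_rect can be sketched as follows, assuming the convention P_r = R (P_l - T). Two choices here are assumptions rather than statements from the slides: the axes are normalized to unit length, and the new Y axis is taken as Z_l x X'_l (the slide writes the product as X'_l x Z_l) so that the rectified frame stays right-handed with its Z axis close to the old optical axis. The function name is illustrative.

```python
import numpy as np

def rectifying_rotations(R, T):
    """Return (R_l, R_r), the rectifying rotations for P_r = R (P_l - T)."""
    R = np.asarray(R, dtype=float)
    T = np.asarray(T, dtype=float)
    e1 = T / np.linalg.norm(T)            # new X axis along the baseline
    z_old = np.array([0.0, 0.0, 1.0])     # old left optical axis
    e2 = np.cross(z_old, e1)              # new Y axis, orthogonal to both
    norm_e2 = np.linalg.norm(e2)
    if norm_e2 < 1e-12:
        raise ValueError("baseline is parallel to the optical axis")
    e2 = e2 / norm_e2
    e3 = np.cross(e1, e2)                 # new Z axis completes the frame
    R_rect = np.vstack([e1, e2, e3])      # rows are the new axes
    return R_rect, R_rect @ R.T           # left rotation, right rotation
```

With these rotations the baseline maps to (B, 0, 0) and corresponding points satisfy P'_r = P'_l - T', so they differ only in their X coordinate.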

37 Epipolar Geometry: Summary
- Purpose
  - Where to search correspondences
- Epipolar plane, epipolar lines, and epipoles
  - Known intrinsic (f) and extrinsic (R, T)
    - Co-planarity equation
  - Known intrinsic but unknown extrinsic
    - Essential matrix
  - Unknown intrinsic and extrinsic
    - Fundamental matrix
- Rectification
  - Generate a stereo pair (by software) with parallel optical axes and thus horizontal epipolar lines

38 Part II. Correspondence Problem
- Three questions
  - What to match?
    - Features: point, line, area, structure?
  - Where to search for correspondences?
    - Epipolar line?
  - How to measure similarity?
    - Depends on the features
- Approaches
  - Correlation-based approach
  - Feature-based approach
- Advanced topics
  - Image filtering to handle illumination changes
  - Adaptive windows to deal with multiple disparities
  - Local warping to account for perspective distortion
  - Sub-pixel matching to improve accuracy
  - Self-consistency to reduce false matches
  - Multi-baseline stereo

39 Correlation Approach
- For each point (x_l, y_l) in the left image, define a window centered at that point
[Figure: left image]

40 Correlation Approach
- … search for its corresponding point within a search region in the right image
[Figure: right image with the position (x_l, y_l) marked]

41 Correlation Approach
- … the disparity (dx, dy) is the displacement at which the correlation is maximum
[Figure: right image showing (x_l, y_l), the displacement dx, and (x_r, y_r)]

42 Correlation Approach
- Elements to be matched
  - Image window of fixed size centered at each pixel in the left image
- Similarity criterion
  - A measure of similarity between windows in the two images
  - The corresponding element is given by the window that maximizes the similarity criterion within a search region
- Search regions
  - Theoretically, the search region can be reduced to a 1-D segment along the epipolar line and within the disparity range
  - In practice, search a slightly larger region because of errors in calibration

43 Correlation Approach
- Equations
  - Disparity
  - Similarity criterion
    - Cross-correlation
    - Sum of squared differences (SSD)
    - Sum of absolute differences (SAD)
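
The equations for these criteria are not in the transcript. The sketch below gives their standard definitions and a 1-D search along a scanline. It assumes rectified grayscale images stored as 2-D NumPy arrays, uses the left image as reference, and searches the right image at x_r = x_l - d for d >= 0 (so dx = x_r - x_l = -d in the slides' notation). The normalized form of the cross-correlation and all function names are illustrative choices.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences; smaller is more similar."""
    return float(np.sum((a - b) ** 2))

def sad(a, b):
    """Sum of absolute differences; smaller is more similar."""
    return float(np.sum(np.abs(a - b)))

def ncc(a, b):
    """Normalized cross-correlation in [-1, 1]; larger is more similar."""
    a0, b0 = a - a.mean(), b - b.mean()
    denom = np.sqrt(np.sum(a0 ** 2) * np.sum(b0 ** 2))
    return float(np.sum(a0 * b0) / denom) if denom > 0 else 0.0

def match_along_row(left, right, x, y, half, max_disp, cost=ssd):
    """Return d in [0, max_disp] minimizing `cost` between the left window
    centered at (x, y) and the right window centered at (x - d, y).
    The caller must keep the left window inside the image."""
    win_l = left[y - half:y + half + 1, x - half:x + half + 1].astype(float)
    best_d, best_c = 0, float("inf")
    for d in range(max_disp + 1):
        xr = x - d
        if xr - half < 0:
            break  # the right window would leave the image
        win_r = right[y - half:y + half + 1, xr - half:xr + half + 1].astype(float)
        c = cost(win_l, win_r)
        if c < best_c:
            best_d, best_c = d, c
    return best_d
```

To use a similarity measure that should be maximized, such as `ncc`, pass a cost that negates it, for example `cost=lambda a, b: -ncc(a, b)`.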

44 Correlation Approach
- PROS
  - Easy to implement
  - Produces a dense disparity map
- CONS
  - May be slow
  - Needs textured images to work well
  - Inadequate for matching image pairs from very different viewpoints due to illumination changes
  - A window may cover points with quite different disparities
  - Inaccurate disparities on the occluding boundaries

45 Correlation Approach
- A stereo pair of the UMass campus: texture, boundaries and occlusion

46 Feature-based Approach
- Features
  - Edge points
  - Lines (length, orientation, average contrast)
  - Corners
- Matching algorithm
  - Extract features in the stereo pair
  - Define a similarity measure
  - Search for correspondences using the similarity measure and the epipolar geometry

47 Feature-based Approach
- For each feature in the left image…
[Figure: left image with corner, line, and structure features marked]

48 Feature-based Approach
- Search in the right image… the disparity (dx, dy) is the displacement at which the similarity measure is maximum
[Figure: right image with corner, line, and structure features marked]

49 Feature-based Approach
- PROS
  - Relatively insensitive to illumination changes
  - Good for man-made scenes with strong lines but weak texture or textureless surfaces
  - Works well on the occluding boundaries (edges)
  - Could be faster than the correlation approach
- CONS
  - Only a sparse depth map
  - Feature extraction may be tricky
    - Lines (edges) might be only partially extracted in one image
    - How to measure the similarity between two lines?

51 Advanced Topics
- Mainly used in the correlation-based approach, but can be applied to feature-based matching
- Image filtering to handle illumination changes
  - Image equalization
    - To make the two images more similar in illumination
  - Laplacian filtering (2nd-order derivative)
    - Use the derivative rather than the intensity (or original color)

52 Advanced Topics
- Adaptive windows to deal with multiple disparities
  - Adaptive window approach (Kanade and Okutomi)
    - Statistically adaptive technique that selects, at each pixel, the window size that minimizes the uncertainty in the disparity estimates
    - T. Kanade and M. Okutomi, "A Stereo Matching Algorithm with an Adaptive Window: Theory and Experiment," Proc. 1991 IEEE International Conference on Robotics and Automation, Vol. 2, April 1991, pp. 1088-1095
  - Multiple window algorithm (Fusiello et al.)
    - Use 9 windows instead of just one to compute the SSD measure
    - The point with the smallest SSD error among the 9 windows and the various search locations is chosen as the best estimate for the given point
    - A. Fusiello, V. Roberto and E. Trucco, "Efficient stereo with multiple windowing," IEEE CVPR, pp. 858-863, 1997

53 Advanced Topics
- Multiple windows to deal with multiple disparities
[Figure: window placements for smooth regions, corners, and edges between near and far surfaces]

54 Advanced Topics
- Sub-pixel matching to improve accuracy
  - Find the peak in the correlation curves
- Self-consistency to reduce false matches, especially for occlusions
  - Check the consistency of matches from L to R and from R to L
- Multiple resolution approach
  - From coarse to fine for efficiency in searching for correspondences
- Local warping to account for perspective distortion
  - Warp from one view to the other for a small patch, given an initial estimate of the (planar) surface normal
- Multi-baseline stereo
  - Improves both correspondences and 3D estimation by using more than two cameras (images)
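
Two of these refinements are short enough to sketch. Both functions are illustrative assumptions rather than the course's method: the first fits a parabola through the matching cost at the best integer disparity and its two neighbours (for a correlation score that is maximized, negate the scores first), and the second applies the left-to-right and right-to-left check on one scanline of integer disparities, with the left pixel x matched to the right pixel x - d.

```python
def subpixel_offset(c_prev, c_best, c_next):
    """Offset of the parabola minimum through the costs at d-1, d, d+1.

    Add the result (in [-0.5, 0.5] when c_best is the smallest) to d.
    """
    denom = c_prev - 2.0 * c_best + c_next
    if denom <= 0:
        return 0.0  # flat or non-convex neighbourhood: keep the integer value
    return 0.5 * (c_prev - c_next) / denom

def left_right_consistent(disp_l, disp_r, tol=1):
    """Flag, per left pixel, whether the right-to-left match agrees.

    disp_l[x] sends left pixel x to right pixel x - disp_l[x];
    disp_r[x] is the disparity found with the right image as reference.
    """
    ok = []
    for x, d in enumerate(disp_l):
        xr = x - d
        ok.append(0 <= xr < len(disp_r) and abs(disp_r[xr] - d) <= tol)
    return ok
```

Pixels that fail the check are typically occluded in one view or lie in poorly textured regions, so they are usually discarded or filled in from neighbours.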

55 3D Reconstruction Problem
- What we have done
  - Correspondences using either correlation-based or feature-based approaches
  - Epipolar geometry from at least 8 point correspondences
- Three cases of 3D reconstruction, depending on the amount of a priori knowledge of the stereo system
  - Both intrinsic and extrinsic known -> can solve the reconstruction problem unambiguously by triangulation
  - Only intrinsic known -> recover structure and extrinsic parameters up to an unknown scaling factor
  - Only correspondences -> reconstruction only up to an unknown, global projective transformation (*)

56 Reconstruction by Triangulation
- Assumption and problem
  - Both intrinsic and extrinsic parameters are known
  - Compute the 3-D location of a point from its projections, p_l and p_r
- Solution
  - Triangulation: the two rays are known and their intersection can be computed
  - Problem: the two rays will not actually intersect in space because of errors in calibration and correspondences, and pixelization
  - Solution: find the point in space with minimum distance from both rays
[Figure: rays from O_l through p_l and from O_r through p_r toward P]
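
One common way to realize the "minimum distance from both rays" solution is the midpoint of the shortest segment joining the rays. The sketch below assumes the convention P_r = R (P_l - T), takes p_l and p_r as 3-vectors in their own camera coordinates, and returns the point in left-camera coordinates. It is an illustration of the idea, not necessarily the formulation used in the course.

```python
import numpy as np

def triangulate_midpoint(p_l, p_r, R, T):
    """Midpoint of the shortest segment between the two viewing rays,
    in left-camera coordinates, for the convention P_r = R (P_l - T)."""
    p_l = np.asarray(p_l, dtype=float)
    T = np.asarray(T, dtype=float)
    # Direction of the right ray expressed in the left camera frame.
    d_r = np.asarray(R, dtype=float).T @ np.asarray(p_r, dtype=float)
    # Left ray: a * p_l.  Right ray: T + b * d_r.
    # Solve a * p_l - b * d_r ~= T in the least-squares sense.
    A = np.column_stack([p_l, -d_r])
    (a, b), *_ = np.linalg.lstsq(A, T, rcond=None)
    point_on_left = a * p_l
    point_on_right = T + b * d_r
    return 0.5 * (point_on_left + point_on_right)
```

When the rays do intersect, the two closest points coincide and the midpoint is the exact intersection; with noisy measurements the length of the connecting segment gives a rough quality indicator.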

57 Reconstruction up to a Scale Factor
- Assumption and problem statement
  - Only the intrinsic parameters and at least 8 point correspondences are given
  - Compute the 3-D locations from the projections, p_l and p_r, as well as the extrinsic parameters
- Solution
  - Compute the essential matrix E from at least 8 correspondences
  - Estimate T (up to a scale and a sign) from E (= RS) using the orthogonality constraint on R, and then estimate R
    - This yields four different estimates of the pair (T, R)
  - Reconstruct the depth of each point, and pick the correct signs of R and T
  - Result: reconstructed 3D points (up to a common scale)
  - The scale can be determined if the distance between two points (in space) is known

58 Reconstruction up to a Projective Transformation
- Assumption and problem statement
  - Only n (>= 8) point correspondences are given
  - Compute the 3-D locations from the projections, p_l and p_r
- Solution
  - Compute the fundamental matrix F from at least 8 correspondences, and the two epipoles
  - Determine the projection matrices
    - Select five points (from the correspondence pairs) as the projective basis
  - Compute the projective reconstruction
    - Unique up to the unknown projective transformation fixed by the choice of the five points
- (* not required for this course; needs advanced knowledge of projective geometry)

59 Summary
- Fundamental concepts and problems of stereo
- Epipolar geometry and stereo rectification
- Estimation of the fundamental matrix from 8 point pairs
- Correspondence problem and two techniques: correlation-based and feature-based matching
- Reconstruct 3-D structure from image correspondences, given
  - Fully calibrated stereo cameras
  - Partially calibrated stereo cameras
  - Uncalibrated stereo cameras (*)

60 Next
- Motion: understanding 3D structure and events from motion
- Homework #4 online, due on May 03 before class

