3D Vision Yang Wang National ICT Australia

Introduction. Single camera: perspective projection, camera parameters. Two cameras: depth computation, epipolar geometry. Range sensors. Examples.

Coordinate Transformation P_B = R(P_A - t), where P_A = (x_A, y_A, z_A)^T and P_B = (x_B, y_B, z_B)^T are the same point expressed in frames A and B, R is a 3×3 rotation matrix, and t is a 3×1 translation vector.

Homogeneous Coordinates P_B = R(P_A - t) = R P_A - R t, which in homogeneous coordinates becomes a single linear map X_B = T X_A.
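As an illustration, here is a minimal numpy sketch of the two equivalent forms above; the rotation and translation values are made up for the example, not taken from the slides:

```python
import numpy as np

# Illustrative rotation about the z-axis by 30 degrees.
theta = np.deg2rad(30.0)
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([1.0, 2.0, 0.5])  # frame B's origin expressed in frame A

# Direct form: P_B = R (P_A - t)
P_A = np.array([3.0, 1.0, 2.0])
P_B_direct = R @ (P_A - t)

# Homogeneous form: X_B = T X_A with T = [[R, -R t], [0, 1]]
T = np.eye(4)
T[:3, :3] = R
T[:3, 3] = -R @ t
X_A = np.append(P_A, 1.0)
P_B_homog = (T @ X_A)[:3]

assert np.allclose(P_B_direct, P_B_homog)  # both forms agree
```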

Pinhole Camera Pinhole perspective projection is mathematically simple and convenient.

The Perspective Imaging Model: the 1-D case.

The Perspective Imaging Model For a camera-frame point P = (x_c, y_c, z_c) and its image p = (u, v): u = f x_c / z_c, v = f y_c / z_c.
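A small sketch of this projection; the point and focal length below are illustrative values, not from the slides:

```python
import numpy as np

def project(P, f):
    """Perspective projection of a camera-frame point P = (x_c, y_c, z_c):
    u = f * x_c / z_c, v = f * y_c / z_c."""
    x_c, y_c, z_c = P
    return np.array([f * x_c / z_c, f * y_c / z_c])

print(project(np.array([0.5, 0.2, 2.0]), f=0.035))  # point 2 m away, 35 mm lens
```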

Single Perspective Camera In an affine image coordinate system the ideal projection u = f x_c / z_c, v = f y_c / z_c becomes u = a(f x_c / z_c) + b(f y_c / z_c) + x_0, v = c(f y_c / z_c) + y_0.

Intrinsic Parameters αU = K X_c, where K collects the intrinsic parameters and α = z_c is the projective scale.
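A sketch of projecting through an intrinsic matrix; the focal lengths and principal point below are hypothetical values for illustration:

```python
import numpy as np

# Hypothetical intrinsics: focal lengths fx, fy in pixels, principal point (x0, y0).
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

X_c = np.array([0.5, 0.2, 2.0])   # point in camera coordinates
U = K @ X_c                        # alpha * (u, v, 1)^T with alpha = z_c
u, v = U[:2] / U[2]                # divide out the projective scale
```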

Extrinsic Parameters X_c and X_w are the point's 3-D camera and world coordinates; X_c = R(X_w - t), where R and t are the extrinsic parameters.

Projective Matrix Combining X_c = R(X_w - t) with U = K X_c gives U = K R(X_w - t), i.e. the projective matrix form U = M X.
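Putting the pieces together, one plausible way to compose M in numpy, reusing the hypothetical intrinsics from the sketch above (rotation and translation values are also assumptions):

```python
import numpy as np

def projection_matrix(K, R, t):
    """Compose M = K [R | -R t], so alpha*(u, v, 1)^T = M (x_w, y_w, z_w, 1)^T."""
    return K @ np.hstack([R, (-R @ t).reshape(3, 1)])

# Illustrative: identity rotation, camera 1.5 units above the world origin.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
M = projection_matrix(K, np.eye(3), np.array([0.0, 0.0, 1.5]))
U = M @ np.array([0.5, 0.2, 3.5, 1.0])   # homogeneous world point
u, v = U[:2] / U[2]                       # pixel coordinates
```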

Single Camera Calibration U = M X. In general, at least 6 pairs of (u_i, v_i) and (x_i, y_i, z_i) are required to solve for M.

Single Camera Calibration Each pair of (u_i, v_i) and (x_i, y_i, z_i) contributes two homogeneous linear equations in the 12 entries of M.
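A compact direct-linear-transform sketch of this system, under the usual assumption that the calibration points are not all coplanar; the function name is mine, not the slides':

```python
import numpy as np

def calibrate_dlt(uv, xyz):
    """Estimate the 3x4 matrix M from n >= 6 correspondences
    (u_i, v_i) <-> (x_i, y_i, z_i). Each pair contributes two
    homogeneous linear equations in the 12 entries of M."""
    A = []
    for (u, v), (x, y, z) in zip(uv, xyz):
        X = [x, y, z, 1.0]
        A.append([*X, 0.0, 0.0, 0.0, 0.0, *[-u * c for c in X]])
        A.append([0.0, 0.0, 0.0, 0.0, *X, *[-v * c for c in X]])
    # The singular vector for the smallest singular value gives M up to scale.
    _, _, Vt = np.linalg.svd(np.asarray(A))
    return Vt[-1].reshape(3, 4)
```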

Depth Perception from Stereo Simple stereo system: the X-axes of the two cameras are collinear, and their Y- and Z-axes are parallel. The disparity d = x_l - x_r is the difference in the image location of the same 3-D point.
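For this canonical configuration, similar triangles give the standard depth relation z = f b / d, with baseline b. A minimal sketch; the numeric values are illustrative, not from the slides:

```python
def depth_from_disparity(d, f, b):
    """Canonical stereo: z = f * b / d for disparity d = x_l - x_r > 0.
    With f in pixels and b in metres, z comes out in metres."""
    return f * b / d

z = depth_from_disparity(d=4.0, f=800.0, b=0.12)   # -> 24.0 m
```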

Correspondence Problem [Figure: left and right input images and the recovered depth map.]

Depth Perception from Stereo Establishing correspondences: the most difficult part of a stereo vision system is not the depth calculation, but the determination of the correspondences used in the depth calculation. Cross-correlation: for a pixel P of image I1, a selected region of I2 is searched to find the pixel that maximises the response of the cross-correlation operator (a sketch follows below). Symbolic matching and relational constraints: look for a feature in one image that matches a feature in the other; typical features used are junctions, line segments, or regions.
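A hedged sketch of correlation-based matching, assuming a rectified pair so the search runs along one scanline; the window size and disparity range are hypothetical parameters:

```python
import numpy as np

def ncc(patch_a, patch_b):
    """Normalised cross-correlation between two equally sized patches."""
    a = patch_a - patch_a.mean()
    b = patch_b - patch_b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / denom if denom > 0 else 0.0

def best_match(left, right, row, col, half=5, max_disp=64):
    """For pixel (row, col) of the left image, search the same scanline of
    the right image for the disparity that maximises NCC (borders ignored
    for brevity in this sketch)."""
    ref = left[row - half:row + half + 1, col - half:col + half + 1]
    scores = []
    for d in range(0, max_disp + 1):
        c = col - d
        if c - half < 0:
            break
        cand = right[row - half:row + half + 1, c - half:c + half + 1]
        scores.append(ncc(ref, cand))
    return int(np.argmax(scores))  # disparity with the highest correlation
```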

Epipolar geometry Key terms: baseline, epipole, epipolar plane, epipolar line.

Depth Perception from Stereo The epipolar constraint: the two-dimensional search space for the point in one image that corresponds to a given point in a second image is reduced to a one-dimensional search by the so-called epipolar geometry of the image pair. The plane that contains the 3-D point P, the two centres of projection C1 and C2, and the two image points P1 and P2 to which P projects is called the epipolar plane. The two lines e1 and e2 formed by the intersection of the epipolar plane with the two image planes I1 and I2 are called epipolar lines. The epipole of an image of a stereo pair is the point at which all of its epipolar lines intersect. Given the point P1 on epipolar line e1 in image I1 and the relative orientation of the cameras, the corresponding epipolar line e2 in image I2, on which the corresponding point P2 must lie, can be found.

Depth Perception from Stereo The ordering constraint: given a pair of points in the scene and their corresponding projections in each of the two images, if these points lie on a continuous surface in the scene they will be ordered in the same way along the epipolar lines in each of the images. Error versus coverage: increasing the baseline improves accuracy but decreases the coverage of correspondences.

General Stereo Configuration U = K X and U' = K' X', where X and X' are the positions of P in the left and right camera coordinates, U and U' are the positions of p1 and p2 in the left and right images, and K and K' are the intrinsic parameters of the left and right cameras.

Fundamental Matrix With U = K X, U' = K' X', and X' = R(X - t) for rotation R and translation t, coplanarity of X, t, and R^(-1) X' gives X^T (t × R^(-1) X') = 0. Substituting X = K^(-1) U and X' = K'^(-1) U' (up to scale) yields U^T (K^(-1))^T [t]_× R^(-1) (K')^(-1) U' = 0, i.e. U^T F U' = 0 with fundamental matrix F = (K^(-1))^T [t]_× R^(-1) (K')^(-1).
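A sketch of the slide's formula; skew() is a helper name I introduce for the cross-product matrix [t]_×:

```python
import numpy as np

def skew(t):
    """Cross-product matrix: skew(t) @ x equals np.cross(t, x)."""
    return np.array([[0.0, -t[2],  t[1]],
                     [t[2],  0.0, -t[0]],
                     [-t[1], t[0],  0.0]])

def fundamental(K, Kp, R, t):
    """F = (K^-1)^T [t]x R^-1 (K'^-1), so that U^T F U' = 0 for
    corresponding homogeneous image points U (left) and U' (right)."""
    return np.linalg.inv(K).T @ skew(t) @ np.linalg.inv(R) @ np.linalg.inv(Kp)
```

Given a left-image point U in homogeneous coordinates, its match U' must then lie on the right-image line F^T U, which is what makes the one-dimensional epipolar search of the earlier slide possible.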

Essential Matrix Given the intrinsic parameters K and K': (K^(-1) U)^T (t × R^(-1) (K')^(-1) U') = 0. With normalised coordinates V = K^(-1) U and V' = (K')^(-1) U', this becomes V^T E V' = 0 with essential matrix E = [t]_× R^(-1).
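Under the same conventions, a brief sketch of the essential matrix, reusing skew() and the R, t from the fundamental-matrix sketch above:

```python
import numpy as np

def essential(R, t):
    """E = [t]x R^-1. Normalised coordinates V = K^-1 U and V' = K'^-1 U'
    then satisfy the coplanarity constraint V^T E V' = 0."""
    return skew(t) @ np.linalg.inv(R)
```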

Depth Perception from Stereo Canonical configuration and image rectification.

Range Sensor LIDAR (light detection and ranging), RADAR (radio detection and ranging), and structured light.

Time-of-Flight Camera Comparison

3-D Cues Available in 2-D Images An image is a 2-D projection of the world, but cues exist in 2-D images for interpreting the 3-D world. Interposition occurs when one object occludes another, indicating that the occluding object is closer to the viewer than the occluded one. Perspective scaling indicates that the distance to an object is inversely proportional to its image size.

3-D Cues Available in 2-D Images Texture gradient is the change of image texture along some direction in the image. Motion parallax indicates that the images of closer objects move faster than the images of distant objects.

Other Phenomena Shape from shading: smooth objects often present a highlight at points where a ray from the light source makes equal angles with the reflection toward the viewer, and they get increasingly darker as the surface normal becomes perpendicular to the rays of illumination. By itself, this is only expected to work well in highly controlled environments.

Other Phenomena Shape from texture: whenever texture is assumed to lie on a single 3-D surface and to be uniform, the texture gradient in 2-D can be used to compute the 3-D orientation of the surface. Shape from silhouette: extracts the silhouettes of an object from multiple images with known camera orientations so that the 3-D shape of the object can be reconstructed.

Other Phenomena Depth from focus: by bringing an object into focus, the sensor obtains information on the range to that object. Motion phenomena: when a moving visual sensor pursues an object in 3-D, points on that object appear to expand in the 2-D image as the sensor closes in on the object. Boundaries and virtual lines: virtual lines or curves are formed by a compelling grouping of similar points or objects along an image line or curve.

Other Phenomena Vanishing points: a 3-D line skew to the optical axis appears to vanish at a point in the 2-D image. Vanishing lines are formed by the vanishing points from different groups of lines parallel to the same plane, and a horizon line is formed from the vanishing points of the different groups of parallel lines on the ground plane. Using these principles, 3-D models of scenes can be built from an ordinary video taken from several viewpoints in the scene.

Example 1: Traffic Monitoring Road traffic monitoring (Wang 2006). Assumptions: the road surface is level and there is no camera misalignment.

Traffic Monitoring Road and camera geometry: camera height H, camera tilt and pan angles, focal length f.

Traffic Monitoring U = K R(X - t), ignoring the affine parameters, with translation t = (0, 0, H)^T and the rotation determined by the tilt and pan angles.

Traffic Monitoring Mapping from the ground plane to the image, and the reverse transformation.
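The slide's actual mapping equations were rendered as figures. As a stand-in, here is one hedged way to build a ground-to-image homography from U = K R(X - t) with t = (0, 0, H)^T; the angle conventions and the simplified K = diag(f, f, 1) are my assumptions, not the slides':

```python
import numpy as np

def ground_to_image_homography(f, H, tilt, pan):
    """Homography mapping ground-plane points (x, y, 1)^T to image points,
    derived from U = K R (X - t) with t = (0, 0, H)^T and z = 0 on the ground.
    Rotation modelled as pan about the vertical axis followed by tilt about
    the camera x-axis (an assumed convention)."""
    K = np.diag([f, f, 1.0])
    cp, sp = np.cos(pan), np.sin(pan)
    ct, st = np.cos(tilt), np.sin(tilt)
    R_pan = np.array([[cp, -sp, 0.0], [sp, cp, 0.0], [0.0, 0.0, 1.0]])
    R_tilt = np.array([[1.0, 0.0, 0.0], [0.0, ct, -st], [0.0, st, ct]])
    R = R_tilt @ R_pan
    t = np.array([0.0, 0.0, H])
    P = K @ np.hstack([R, (-R @ t).reshape(3, 1)])   # 3x4 projection matrix
    return P[:, [0, 1, 3]]                            # drop the z column for z = 0
```

The reverse transformation from image to ground is then the inverse of this 3×3 matrix (np.linalg.inv), applied to homogeneous image points.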

Traffic Monitoring Camera view simulation under different camera settings: camera height, focal length, roadside/on-street placement, lane/intersection view.

Example 2: Segmentation Bi-layer segmentation (Kolmogorov et al. 2005): two layers, foreground and background. Task: accurately segment foreground objects with two cameras.

Bi-layer Segmentation Stereo: the foreground has large disparity. Color/contrast: background and foreground have distinct color distributions. Coherence: spatial and temporal. Probabilistic approach: p(label | disparity, data).

Bi-layer Segmentation [Figure: color/contrast + coherence (left) versus stereo + coherence (right).]

Bi-layer Segmentation Fusing stereo with color/contrast: the stereo and color cues complement each other. Application: background substitution.

Example 3: Make3D Depth from a single image: learn the relations between various parts of the image and use monocular cues to learn depth from data (Saxena et al. 2008).

Make3D Approach Over-segment the image into superpixels, then infer the 3-D location and orientation of each superpixel.

Make3D Image properties. Local features: for a particular region, are the image features strong indicators of the 3-D depth and orientation? Co-planarity: except in cases of occlusion, neighboring planes are more likely to be connected to each other. Co-linearity: long straight lines in the image represent straight lines in 3-D.

Make3D Local features: texture/gradient, color channels, neighbours, and multiple scales.

Make3D Co-planarity and co-linearity. [Figure: superpixels A, B, C illustrating the two constraints.]

Make3D Experimental results. [Figure: input image and the estimated 3-D model.]