Download presentation
Presentation is loading. Please wait.
1
3D Computer Vision and Video Computing 3D Vision Topic 4 of Part II Visual Motion CSc I6716 Fall 2011 Cover Image/video credits: Rick Szeliski, MSR Zhigang Zhu, City College of New York zhu@cs.ccny.cuny.edu
2
3D Computer Vision and Video Computing Outline of Motion n Problems and Applications l The importance of visual motion l Problem Statement n The Motion Field of Rigid Motion l Basics – Notations and Equations l Three Important Special Cases: Translation, Rotation and Moving Plane l Motion Parallax n Optical Flow l Optical flow equation and the aperture problem l Estimating optical flow l 3D motion & structure from optical flow n Feature-based Approach l Two-frame algorithm l Multi-frame algorithm l Structure from motion – Factorization method n Advanced Topics l Spatio-Temporal Image and Epipolar Plane Image l Video Mosaicing and Panorama Generation l Motion-based Segmentation and Layered Representation
3
3D Computer Vision and Video Computing The Importance of Visual Motion n Structure from Motion l Apparent motion is a strong visual clue for 3D reconstruction n More than a multi-camera stereo system n Recognition by motion (only) l Biological visual systems use visual motion to infer properties of 3D world with little a priori knowledge of it n Blurred image sequence n Visual Motion = Video ! [Go to CVPR 2004-2010 Sites for Workshops] l Video Coding and Compression: MPEG 1, 2, 4, 7… l Video Mosaicing and Layered Representation for IBR l Surveillance (Human Tracking and Traffic Monitoring) l HCI using Human Gesture (video camera) l Image-based Rendering l …
4
3D Computer Vision and Video Computing Blurred Sequence An up-sampling from images of resolution 15x20 pixels From: James W. Davis. MIT Media Lab Recognition by Actions: Recognize object from motion even if we cannot distinguish it in any images …
5
3D Computer Vision and Video Computing Problem Statement n Two Subproblems l Correspondence: Which elements of a frame correspond to which elements in the next frame? l Reconstruction :Given a number of correspondences, and possibly the knowledge of the camera’s intrinsic parameters, how to recovery the 3-D motion and structure of the observed world n Main Difference between Motion and Stereo l Correspondence: the disparities between consecutive frames are much smaller due to dense temporal sampling l Reconstruction: the visual motion could be caused by multiple motions ( instead of a single 3D rigid transformation) n The Third Subproblem, and Fourth…. l Motion Segmentation: what are the regions the the image plane corresponding to different moving objects? l Motion Understanding: lip reading, gesture, expression, event…
6
3D Computer Vision and Video Computing Approaches n Two Subproblems l Correspondence: n Differential Methods - >dense measure (optical flow) n Matching Methods -> sparse measure l Reconstruction : More difficult than stereo since n Motion (3D transformation betw. Frames) as well as structure needs to be recovered n Small baseline causes large errors n The Third Subproblem l Motion Segmentation: Chicken and Egg problem n Which should be solved first? Matching or Segmentation n Segmentation for matching elements n Matching for Segmentation
7
3D Computer Vision and Video Computing The Motion Field of Rigid Objects n Motion: l 3D Motion ( R, T): n camera motion (static scene) n or single object motion n Only one rigid, relative motion between the camera and the scene (object) l Image motion field: n 2D vector field of velocities of the image points induced by the relative motion. n Data: Image sequence l Many frames n captured at time t=0, 1, 2, … l Basics: only consider two consecutive frames n We consider a reference frame and its consecutive frame l Image motion field n can be viewed disparity map of the two frames captured at two consecutive camera locations ( assuming we have a moving camera)
8
3D Computer Vision and Video Computing The Motion Field of Rigid Objects n Notations l P = (X,Y,Z) T : 3-D point in the camera reference frame l p = (x,y,f) T : the projection of the scene point in the pinhole camera n Relative motion between P and the camera l T= (T x,T y,T z ) T : translation component of the motion l x y z : the angular velocity n Note: l How to connect this with stereo geometry (with R, T)? l Image velocity v= ? p O X PV f Z Y v
9
3D Computer Vision and Video Computing The Motion Field of Rigid Objects n Notations l P = (X,Y,Z) T : 3-D point in the camera reference frame l p = (x,y,f) T : the projection of the scene point in the pinhole camera n Relative motion between P and the camera l T= (T x,T y,T z ) T : translation component of the motion l x y z : the angular velocity n Note: l How to connect this with stereo geometry (with R, T)?
10
3D Computer Vision and Video Computing Basic Equations of Motion Field n Notes: l Take the time derivative of both sides of the projection equation l The motion field is the sum of two components n Translational part n Rotational part l Assume known intrinsic parameters Rotation part: no depth information Translation part: depth Z
11
3D Computer Vision and Video Computing Motion Field vs. Disparity n Correspondence and Point Displacements StereoMotion DisparityMotion field Displacement – (dx, dy)Differential concept – velocity (v x, v y ), i.e. time derivative (dx/dt, dy/dt) No such constraintConsecutive frame close to guarantee good discrete approximation
12
3D Computer Vision and Video Computing Special Case 1: Pure Translation n Pure Translation ( =0) n Radial Motion Field (Tz <> 0) l Vanishing point p0 =(x 0, y 0 ) T : n motion direction l FOE (focus of expansion) n Vectors away from p0 if Tz < 0 l FOC (focus of contraction) n Vectors towards p0 if Tz > 0 l Depth estimation n depth inversely proportional to magnitude of motion vector v, and also proportional to distance from p to p 0 n Parallel Motion Field (Tz= 0) l Depth estimation: n depth inversely proportional to magnitude of motion vector v Tz =0
13
3D Computer Vision and Video Computing Special Case 2: Pure Rotation n Pure Rotation (T =0) l Does not carry 3D information n Motion Field (approximation) l Small motion l A quadratic polynomial in image coordinates (x,y,f) T n Image Transformation between two frames (accurate) l Motion can be large l Homography (3x3 matrix) for all points n Image mosaicing from a rotating camera l 360 degree panorama
14
3D Computer Vision and Video Computing Special Case 3: Moving Plane n Planes are common in the man-made world n Motion Field (approximation) l Given small motion l a quadratic polynomial in image n Image Transformation between two frames (accurate) l Any amount of motion (arbitrary) l Homography (3x3 matrix) for all points l See Topic 5 Camera Models n Image Mosaicing for a planar scene l Aerial image sequence l Video of blackboard Only has 8 independent parameters (write it out!)
15
3D Computer Vision and Video Computing Special Cases: A Summary n Pure Translation l Vanishing point and FOE (focus of expansion) l Only translation contributes to depth estimation n Pure Rotation l Does not carry 3D information l Motion field: a quadratic polynomial in image, or l Transform: Homography (3x3 matrix R) for all points l Image mosaicing from a rotating camera n Moving Plane l Motion field is a quadratic polynomial in image, or l Transform: Homography (3x3 matrix A) for all points l Image mosaicing for a planar scene
16
3D Computer Vision and Video Computing Motion Parallax n [Observation 1] The relative motion field of two instantaneously coincident points l Does not depend on the rotational component of motion l Points towards (away from) the vanishing point of the translation direction n [Observation 2] The motion field of two frames after rotation compensation l only includes the translation component l points towards (away from) the vanishing point p0 ( the instantaneous epipole) l the length of each motion vector is inversely proportional to the depth, and also proportional to the distance from point p to the vanishing point p0 of the translation direction l Question: how to remove rotation? n Active vision : rotation known approximately?
17
3D Computer Vision and Video Computing Motion Parallax n [Observation 1] The relative motion field of two instantaneously coincident points l Does not depend on the rotational component of motion l Points towards (away from) the vanishing point of the translation direction (the instantaneous epipole) Epipole (x 0, y 0 ) At instant t, three pairs of points happen to be coincident The difference of the motion vectors of each pair cancels the rotational components. … and the relative motion field point in ( towards or away from) the VP of the translational direction (Fig 8.5 ???)
18
3D Computer Vision and Video Computing Motion Parallax n [Observation 2] The motion field of two frames after rotation compensation l only includes the translation component l points towards (away from) the vanishing point p0 ( the instantaneous epipole) l the length of each motion vector is inversely proportional to the depth, l and also proportional to the distance from point p to the vanishing point p0 of the translation direction (if Tz <> 0) Question: how to remove rotation? n Active vision : rotation known approximately? n Rotation compensation can be done by image warping after finding three (3) pairs of coincident points FOE p0p0 p v
19
3D Computer Vision and Video Computing Summary n Importance of visual motion (apparent motion) l Many applications… l Problems: n correspondence, reconstruction, segmentation, understanding in x-y-t space n Image motion field of rigid objects l Time derivative of both sides of the projection equation n Three important special cases l Pure translation – FOE l Pure rotation – no 3D information, but lead to mosaicing l Moving plane – homography with arbitrary motion n Motion parallax l Only depends on translational component of motion
20
3D Computer Vision and Video Computing n Next lecture
21
3D Computer Vision and Video Computing Notion of Optical Flow n The Notion of Optical Flow l Brightness constancy equation n Under most circumstance, the apparent brightness of moving objects remain constant l Optical Flow Equation n Relation of the apparent motion with the spatial and temporal derivatives of the image brightness n Aperture problem l Only the component of the motion field in the direction of the spatial image gradient can be determined l The component in the direction perpendicular to the spatial gradient is not constrained by the optical flow equation ?
22
3D Computer Vision and Video Computing Estimating Optical Flow n Constant Flow Method l Assumption: the motion field is well approximated by a constant vector within any small region of the image plane l Solution: Least square of two variables (u,v) from NxN Equations – NxN (=5x5) planar patch l Condition: A T A is NOT singular (null or parallel gradients) n Weighted Least Square Method l Assumption: the motion field is approximated by a constant vector within any small region, and the error made by the approximation increases with the distance from the center where optical flow is to be computed l Solution: Weighted least square of two variables (u,v) from NxN Equations – NxN patch n Affine Flow Method l Assumption: the motion field is well approximated by a affine parametric model u T = Ap T +b (a plane patch with arbitrary orientation) l Solution: Least square of 6 variables (A,b) from NxN Equations – NxN planar patch
23
3D Computer Vision and Video Computing Using Optical Flow n 3D motion and structure from optical flow (p 208- 212) l Input: n Intrinsic camera parameters n dense motion field (optical flow) of single rigid motion l Algorithm n ( good comprise between ease of implementation and quality of results ) n Stage 1: Translation direction n Epipole (x0, y0) through approximate motion parallax n Key: Instantaneously coincident image points n Approximation: estimating differences for ALMOST coincident image points n Stage 2: Rotation flow and Depth n Knowns: flow vector, and direction of translational component n One point, one equation (without depth)– w Least square approximation of the rotational component of flow n From motion field to depth l Output n Direction of translation (f Tx/Tz, f Ty/Tz, f) = (x0, y0, f) n Angular velocity n 3-D coordinates of scene points (up to a common unknown scale)
24
3D Computer Vision and Video Computing Some Details n Step 1. Get (Tx, Ty, Tz) = s (x0,y0,f) Step 2. For every point (x,y,f) with known v, get one equation about from the motion equation (by eliminate Z since it’s different from point to point) Step 3. Get Z (up to a scale s) given T/s and Rotation part: no depth information Translation part: depth Z
25
3D Computer Vision and Video Computing Feature-Based Approach n Two frame method - Feature matching l An Algorithm Based on the Constant Flow Method n Features – corners detection by observing the coefficient matrix of the spatial gradient evaluation (2x2 matrix A T A) n Iteration approach: estimation – warping – comparison n Multiple frame method - Feature tracking l Kalman Filter Algorithm n Estimating the position and uncertainty of a moving feature in the next frame n Two parts: prediction (from previous trajectory) and measurement from feature matching n Using a sparse motion field l 3D motion and structure by feature tracking over frames l Factorization method n Orthographic projection model n Feature tracking over multiple frames n SVD
26
3D Computer Vision and Video Computing Motion-Based Segmentation n Change Detection l Stationary camera(s), multiple moving subjects l Background modeling and updating l Background subtraction l Occlusion handling n Layered representation (I)– rotating camera l Rotating camera + Independent moving objects l Sprite - background mosaicing l Synopsis – foreground object sequences n Layered representation (II)– translating (and rotating) camera l Arbitrary camera motion l Scene segmentation into layers
27
3D Computer Vision and Video Computing Summary n After learning motion, you should be able to l Explain the fundamental problems of motion analysis l Understand the relation of motion and stereo l Estimate optical flow from a image sequence l Extract and track image features over time l Estimate 3D motion and structure from sparse motion field l Extract Depth from 3D ST image formation under translational motion l Know some important application of motion, such as change detection, image mosaicing and motion-based segmentation
28
3D Computer Vision and Video Computing Next n Reviews, Exam and Projects Exam & Project Presentations n Homework #4 due in May 03, 2011 before class
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.