CSCE 643 Computer Vision: Structure from Motion
Jinxiang Chai

Stereo reconstruction
Given two or more images of the same scene or object, compute a representation of its shape, assuming known camera viewpoints.
How do we estimate the camera parameters?
- where is the camera?
- where is it pointing?
- what are its internal parameters, e.g. focal length?

Calibration from 2D motion
Structure from motion (SFM):
- track points over a sequence of images
- estimate the 3D point positions and camera positions
- intrinsic camera parameters are calibrated beforehand
Self-calibration:
- solve for both intrinsic and extrinsic camera parameters

SFM: the Holy Grail of 3D Reconstruction
- take a movie of an object
- reconstruct a 3D model
- would be highly viable commercially

How to Get Feature Correspondences
Feature-based approach
- good for images
- feature detection (corners or SIFT features)
- feature matching using RANSAC (epipolar line constraint)
Pixel-based approach
- good for video sequences
- patch-based registration with the Lucas-Kanade algorithm
- register features across the entire sequence
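
As a rough illustration of corner-based feature detection (not the slides' own code), here is a minimal Harris-style response computed with NumPy on a made-up toy image; the window size and the constant k are conventional choices, not values from the lecture:

```python
import numpy as np

def harris_response(img, k=0.05):
    """Harris corner response: large where gradients vary in two directions."""
    # image gradients via finite differences (axis 0 = rows = y)
    Iy, Ix = np.gradient(img.astype(float))

    # structure-tensor entries, smoothed with a 3x3 box filter
    def box(a):
        p = np.pad(a, 1, mode="edge")
        return sum(p[i:i + a.shape[0], j:j + a.shape[1]]
                   for i in range(3) for j in range(3)) / 9.0

    Sxx, Syy, Sxy = box(Ix * Ix), box(Iy * Iy), box(Ix * Iy)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2

# toy image: a bright square on a dark background
img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0
R = harris_response(img)

# the response peaks near the square's corners, not along its edges
corner = R[5, 5]   # a corner of the square
edge = R[5, 10]    # the middle of the top edge
print(corner > edge)  # → True
```

Edges produce a rank-1 structure tensor (det near zero, so a negative response), while corners make both eigenvalues large; this is why the response separates the two cases.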

A Brief Introduction to Feature-based Matching
- find a few important features (a.k.a. interest points)
- match them across two images
- compute the image transformation H

Feature Detection
Two images taken from the same place at different angles are related by a 3x3 projective transformation H.

Feature Matching
Given features detected in two such images, how do we match features across the images? What criterion should we use?

Feature Matching
Feature similarity (intensity or SIFT signature)
- the pixels around corresponding features should have similar intensities
- measured by cross-correlation or SSD
Distance constraint
- the displacement of a feature should be smaller than a given threshold
Epipolar line constraint
- corresponding pixels must satisfy the epipolar constraint, encoded by the fundamental matrix F
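
The two similarity scores named above can be made concrete. A minimal NumPy sketch of SSD and normalized cross-correlation on hypothetical patches (the patch data is made up for illustration):

```python
import numpy as np

def ssd(patch_a, patch_b):
    """Sum of squared differences: lower means more similar."""
    d = patch_a.astype(float) - patch_b.astype(float)
    return float(np.sum(d * d))

def ncc(patch_a, patch_b):
    """Normalized cross-correlation: +1 means identical up to gain and bias."""
    a = patch_a.astype(float).ravel()
    b = patch_b.astype(float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

patch = np.arange(25.0).reshape(5, 5)
same = 2.0 * patch + 3.0                          # same pattern, new gain/bias
noise = np.random.default_rng(0).normal(size=(5, 5))

print(ssd(patch, patch))            # → 0.0 for identical patches
print(round(ncc(patch, same), 6))   # → 1.0: NCC ignores gain/bias changes
```

SSD is cheap but sensitive to lighting changes; NCC normalizes out a per-patch gain and bias, which is why it is preferred when exposure differs between the two images.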

Feature-space Outlier Rejection
Matches are labeled good or bad by feature similarity.
Can we now compute H from the good (blue) matches alone?
No! There are still too many outliers.
What can we do? Robust estimation!

Robust Estimation: A Toy Example
How do we fit a line to a set of 2D points that contains outliers?

RANSAC for Estimating a Projective Transformation
RANSAC loop:
1. Select four feature pairs (at random)
2. Compute the transformation H (exact solution)
3. Compute inliers: pairs with SSD(p_i', H p_i) < ε
4. Keep the largest set of inliers
5. Re-compute a least-squares estimate of H using all of the inliers
For more detail, see:
- http://research.microsoft.com/en-us/um/people/zhang/INRIA/software-FMatrix.html
- Philip H. S. Torr (1997). "The Development and Comparison of Robust Methods for Estimating the Fundamental Matrix". International Journal of Computer Vision 24 (3): 271–300.
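
The same loop can be sketched on the earlier toy problem, line fitting, where the minimal sample is two points instead of four pairs. A minimal Python illustration with synthetic data (all values are hypothetical): sample a minimal set, fit exactly, count inliers, keep the best, then refit by least squares on the inliers:

```python
import numpy as np

def ransac_line(points, n_iters=200, thresh=0.1, seed=None):
    """Fit y = a*x + b to 2D points with a RANSAC loop."""
    rng = np.random.default_rng(seed)
    best_inliers = None
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if x1 == x2:
            continue                      # degenerate minimal sample
        a = (y2 - y1) / (x2 - x1)         # exact fit to the 2-point sample
        b = y1 - a * x1
        resid = np.abs(points[:, 1] - (a * points[:, 0] + b))
        inliers = resid < thresh
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers = inliers        # keep the largest inlier set
    # least-squares refit on all inliers of the best model
    xi, yi = points[best_inliers, 0], points[best_inliers, 1]
    A = np.stack([xi, np.ones_like(xi)], axis=1)
    a, b = np.linalg.lstsq(A, yi, rcond=None)[0]
    return a, b, best_inliers

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 50)
y = 2.0 * x + 0.5 + rng.normal(scale=0.01, size=50)  # inliers near y = 2x + 0.5
y[:10] += rng.uniform(1, 3, size=10)                 # 20% gross outliers
a, b, inliers = ransac_line(np.stack([x, y], axis=1))
print(a, b)  # close to 2.0 and 0.5 despite the outliers
```

An ordinary least-squares fit on the same data would be dragged far off by the ten corrupted points; the minimal-sample-and-vote structure is what makes RANSAC robust.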

Structure from Motion: Two Principal Solutions
- Bundle adjustment (nonlinear optimization)
- Factorization (SVD, via an orthographic approximation; affine geometry)

Projection Matrix
Perspective projection: a point's 2D image coordinates are a nonlinear function of its 3D coordinates and the camera parameters K, R, T:
P = K [R | T],   x ~ P X
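
As a sketch of the projection equation x ~ K [R | T] X, with made-up parameter values (focal length, principal point, and pose are all hypothetical):

```python
import numpy as np

def project(K, R, T, X):
    """Perspective projection x ~ K [R | T] X, returning inhomogeneous 2D."""
    Xc = R @ X + T        # world frame -> camera frame (extrinsics)
    u = K @ Xc            # apply intrinsics
    return u[:2] / u[2]   # perspective divide: this is the nonlinearity

K = np.array([[500.0,   0.0, 320.0],   # fx, skew, cx
              [  0.0, 500.0, 240.0],   # fy, cy
              [  0.0,   0.0,   1.0]])
R = np.eye(3)                          # camera aligned with the world axes
T = np.array([0.0, 0.0, 5.0])          # scene 5 units in front of the camera
X = np.array([1.0, 0.5, 0.0])          # a 3D point

print(project(K, R, T, X))  # → [420. 290.]
```

The division by depth in the last step is why SFM is a nonlinear problem even though everything before it is linear.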

Nonlinear Approach for SFM
What is the difference between camera calibration and SFM?
- camera calibration: known 3D points and known 2D projections
- SFM: unknown 3D points and known 2D projections
- how does each relate to the 3D-to-2D registration problem?

SFM: Bundle Adjustment
SFM is a nonlinear least-squares problem. Minimize the reprojection error via:
- gradient descent
- conjugate gradient
- Gauss-Newton
- Levenberg-Marquardt (the most common method)
Prone to local minima.
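
A toy illustration of the nonlinear least-squares machinery: Gauss-Newton refining a single 3D point under two known cameras. This is a miniature bundle adjustment over structure only; real bundle adjustment also optimizes the camera poses, and Levenberg-Marquardt adds a damping term to the same normal equations. All numbers here are invented:

```python
import numpy as np

def project(K, R, T, X):
    """Perspective projection of a 3D point X."""
    u = K @ (R @ X + T)
    return u[:2] / u[2]

def refine_point(X0, cams, obs, n_iters=20):
    """Gauss-Newton on the reprojection error of one 3D point."""
    X = X0.astype(float).copy()
    for _ in range(n_iters):
        # stacked reprojection residuals over all cameras
        r = np.concatenate([project(*cam, X) - z for cam, z in zip(cams, obs)])
        # numeric Jacobian of the residual w.r.t. the 3D point
        J = np.zeros((len(r), 3))
        eps = 1e-6
        for k in range(3):
            Xp = X.copy()
            Xp[k] += eps
            rp = np.concatenate([project(*cam, Xp) - z
                                 for cam, z in zip(cams, obs)])
            J[:, k] = (rp - r) / eps
        # Gauss-Newton step: solve the normal equations
        X = X - np.linalg.solve(J.T @ J, J.T @ r)
    return X

K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
R1, T1 = np.eye(3), np.array([0.0, 0.0, 5.0])
R2, T2 = np.eye(3), np.array([-1.0, 0.0, 5.0])   # second camera, shifted
cams = [(K, R1, T1), (K, R2, T2)]
X_true = np.array([0.5, 0.2, 1.0])
obs = [project(*cam, X_true) for cam in cams]     # exact 2D observations

X = refine_point(np.array([0.0, 0.0, 2.0]), cams, obs)
print(X)  # converges to X_true
```

With noisy observations and many points and cameras, this same residual-and-Jacobian structure becomes large and sparse, which is where local minima and specialized sparse solvers enter the picture.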

Counting Constraints vs. Unknowns
- M camera poses and N points give 2MN point constraints
- unknowns: 6M + 3N + 4 (6 per camera pose, 3 per point, plus 4 shared intrinsic parameters)
- this suggests we need 2MN ≥ 6M + 3N + 4
But can we really recover all parameters? No:
- we can't recover the global origin and orientation (6 parameters)
- we can't recover the global scale (1 parameter)
Thus we need 2MN ≥ 6M + 3N + 4 - 7.
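
The counting argument can be checked in a few lines (assuming, as above, 4 shared intrinsic unknowns and the 7-parameter gauge freedom):

```python
def enough_constraints(m, n, shared_intrinsics=4):
    """Do 2mn point measurements cover the free unknowns of SFM?

    Unknowns: 6 per camera pose, 3 per point, plus the shared
    intrinsics, minus the 7 gauge parameters (global origin,
    orientation, and scale) that SFM can never recover.
    """
    unknowns = 6 * m + 3 * n + shared_intrinsics - 7
    return 2 * m * n >= unknowns

print(enough_constraints(2, 9))  # → True:  2*2*9 = 36 >= 12 + 27 + 4 - 7 = 36
print(enough_constraints(2, 8))  # → False: 32 < 33, one constraint short
```

So with two views and these assumptions, nine tracked points are just enough in this count; more views or points give the redundancy that least squares needs to average out noise.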

Are We Done? No, bundle adjustment has many local minima.

SFM Using Factorization
Assume an orthographic camera, so an image point is a linear function of the corresponding world point plus a per-frame translation. Subtracting the mean of each frame's measurements cancels the translation.

SFM Using Factorization
Stack all the features from the same frame into a 2 x N block, then stack all F frames into a single 2F x N measurement matrix W.
Factorize W into a motion matrix M and a shape matrix S using SVD: W = M S.
The factorization is only determined up to an invertible 3x3 matrix Q, since W = (M Q)(Q^-1 S). How do we compute Q?

SFM Using Factorization
M is a stack of (the top two rows of) rotation matrices, which gives orthogonality constraints: each frame's two rows must have unit norm and be mutually orthogonal.
These constraints are linear in QQ^T, a symmetric 3x3 matrix:
- 3F linear equations in 9 unknowns (6 independent, since QQ^T is symmetric)
- compute QQ^T as the least-squares solution
- recover Q from QQ^T with another SVD

SFM Using Factorization: Summary
1. Form the measurement matrix W
2. Decompose W into motion and shape matrices using SVD: W = M S
3. Compute the matrix Q with least squares and SVD
4. Recover the metric rotation and shape matrices: M Q and Q^-1 S
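
Steps 1 and 2 can be sketched on synthetic orthographic data with NumPy. This covers only the affine stage: the metric upgrade via Q (steps 3 and 4) is omitted, so motion and shape are recovered up to an invertible 3x3 transformation. All data here is randomly generated for illustration:

```python
import numpy as np

def factorize(W):
    """Tomasi-Kanade affine factorization of a 2F x N measurement matrix.

    Returns motion M (2F x 3), shape S (3 x N), and translations t
    such that W ≈ M @ S + t. The metric upgrade (solving for Q) is
    not performed here.
    """
    # subtract the per-row mean: the image-space centroid of each frame
    t = W.mean(axis=1, keepdims=True)
    Wc = W - t
    # an orthographic camera makes Wc exactly rank 3; keep the top 3
    U, s, Vt = np.linalg.svd(Wc, full_matrices=False)
    M = U[:, :3] * np.sqrt(s[:3])
    S = np.sqrt(s[:3])[:, None] * Vt[:3]
    return M, S, t

# synthetic data: 5 random orthographic cameras viewing 20 random points
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 20))
rows = []
for _ in range(5):
    R, _ = np.linalg.qr(rng.normal(size=(3, 3)))      # orthonormal rows
    rows.append(R[:2] @ X + rng.normal(size=(2, 1)))  # project + translate
W = np.vstack(rows)                                   # 10 x 20 matrix

M, S, t = factorize(W)
print(np.allclose(M @ S + t, W))  # → True: a rank-3 model explains the data
```

The rank-3 property of the centered measurement matrix is the whole trick: noise would make the remaining singular values small but nonzero, and truncating them is the optimal rank-3 fit.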

Weak-perspective Projection
Factorization also works for weak-perspective projection (scaled orthographic projection), in which all points are scaled by a common factor determined by a reference depth z0.

Factorization for Full-perspective Cameras [Han and Kanade]

SFM for Deformable Objects
For detail, click here.

SFM for Articulated Objects For video, click here

Bundle Adjustment vs. Factorization
Bundle adjustment (nonlinear optimization):
- works with a perspective camera model
- works with incomplete data
- prone to local minima
Factorization:
- closed-form solution for the weak-perspective camera
- simple and efficient
- usually needs complete data
- becomes complicated for a full-perspective camera model
Phil Torr's structure from motion toolkit in Matlab (click here)
Voodoo camera tracker (click here)

All Together
Video: click here
- feature detection
- feature matching (epipolar geometry)
- structure from motion
- stereo reconstruction
- triangulation
- texture mapping