CSCE 641 Computer Graphics: Image-based Modeling Jinxiang Chai.

Outline Stereo matching - Traditional stereo - Multi-baseline stereo - Active stereo Volumetric stereo - Visual hull - Voxel coloring - Space carving

Multi-baseline stereo reconstruction Need to choose reference frames! Does not fully utilize pixel information across all images

Volumetric stereo Scene Volume V Input Images (Calibrated) Goal: Determine occupancy, “color” of points in V

Discrete formulation: Voxel Coloring Discretized Scene Volume Input Images (Calibrated) Goal: Assign RGBA values to voxels in V photo-consistent with images

Complexity and computability: with the scene volume discretized into N³ voxels and C possible colors per voxel, there are C^(N³) possible scene assignments. The photo-consistent scenes form a subset of these, and the true scene is one of them.
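As a concrete check of this count — a tiny sketch, where `num_scenes` is a hypothetical helper name, not something from the slides:

```python
def num_scenes(C, N):
    """Number of possible color assignments for an N x N x N voxel grid
    with C colors per voxel: C ** (N ** 3)."""
    return C ** (N ** 3)

# Even a toy binary (C = 2) volume of 4 x 4 x 4 voxels already
# has 2 ** 64 possible scenes -- exhaustive search is hopeless.
```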

Issues Theoretical Questions Identify class of all photo-consistent scenes Practical Questions How do we compute photo-consistent models?

Voxel coloring solutions 1. C=2 (shape from silhouettes): volume intersection [Baumgart 1974]. For more info: R. Szeliski, "Rapid Octree Construction from Image Sequences," CVGIP: Image Understanding, 58(1):23-32, July 1993; or W. Matusik, C. Buehler, R. Raskar, L. McMillan, and S. J. Gortler, "Image-Based Visual Hulls," SIGGRAPH 2000. 2. C unconstrained, viewpoint constraints: voxel coloring algorithm [Seitz & Dyer 97]. 3. General case: space carving [Kutulakos & Seitz 98].

Why use silhouettes? They can be computed robustly and efficiently: image − background = foreground.

Reconstruction from silhouettes (C = 2) Binary Images How to construct 3D volume?

Reconstruction from silhouettes (C = 2) Binary Images Approach: Backproject each silhouette Intersect backprojected volumes

Volume intersection: the reconstruction contains the true scene, but is generally not the same. In the limit (all views) we get the visual hull: the complement of all lines that don't intersect S.

Voxel algorithm for volume intersection: color a voxel black if it lies on the silhouette in every image. For M images and N³ voxels this is O(MN³) — we don't have to search the 2^(N³) possible scenes!
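The voxel test above can be sketched in a few lines of NumPy — a minimal illustration assuming calibrated 3×4 projection matrices and boolean silhouette masks (the function name, argument layout, and shapes are my own conventions, not from the slides):

```python
import numpy as np

def visual_hull(voxels, cameras, silhouettes):
    """Keep a voxel only if it projects inside the silhouette in EVERY image.

    voxels      : (V, 3) array of 3D voxel centers
    cameras     : list of 3x4 projection matrices P = K [R | t]
    silhouettes : list of 2D boolean arrays (True = foreground)
    """
    keep = np.ones(len(voxels), dtype=bool)
    homog = np.hstack([voxels, np.ones((len(voxels), 1))])   # (V, 4)
    for P, sil in zip(cameras, silhouettes):
        uvw = homog @ P.T                                    # project: (V, 3)
        u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
        v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
        inside = (u >= 0) & (u < sil.shape[1]) & (v >= 0) & (v < sil.shape[0])
        on_sil = np.zeros(len(voxels), dtype=bool)
        on_sil[inside] = sil[v[inside], u[inside]]
        keep &= on_sil                  # intersection of backprojected volumes
    return voxels[keep]
```

The cost is exactly the O(MN³) on the slide: every voxel is projected into every image once.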

Visual Hull Results Download data and results from

Properties of volume intersection Pros Easy to implement, fast Accelerated via octrees [Szeliski 1993] or interval techniques [Matusik 2000] Cons No concavities Reconstruction is not photo-consistent Requires identification of silhouettes

Voxel coloring solutions 1. C=2 (silhouettes): volume intersection [Baumgart 1974]. 2. C unconstrained, viewpoint constraints: voxel coloring algorithm [Seitz & Dyer 97]. 3. General case: space carving [Kutulakos & Seitz 98].

Voxel coloring approach: 1. Choose a voxel. 2. Project and correlate. 3. Color if consistent (standard deviation of pixel colors below a threshold). Visibility problem: in which images is each voxel visible?
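The consistency test in step 3 might look like this minimal sketch (the threshold value is an assumed tuning parameter, and the scalar "pooled" spread is one of several reasonable choices):

```python
import numpy as np

def photo_consistent(colors, threshold=10.0):
    """Voxel-coloring consistency test: a voxel is colored if the pixel
    colors it projects to (in the images where it is visible) agree.

    colors : (k, 3) array of RGB samples, one per image the voxel is visible in
    Returns (is_consistent, mean_color).
    """
    colors = np.asarray(colors, dtype=float)
    spread = colors.std(axis=0).max()   # worst per-channel standard deviation
    return spread < threshold, colors.mean(axis=0)
```

A voxel seen as nearly the same gray in two views passes; one seen as black in one view and white in another is carved.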

The visibility problem: which points are visible in which images? Forward visibility (known scene): Z-buffer, painter's algorithm. Inverse visibility (known images, unknown scene): the problem voxel coloring must solve.

Layers: depth ordering — visit occluders first! Scene traversal condition: the depth order is the same for all input views.

Panoramic depth ordering Cameras oriented in many different directions Planar depth ordering does not apply

Panoramic depth ordering Layers radiate outwards from cameras

Panoramic layering: layers radiate outwards from the cameras.

Compatible camera configurations. Depth-order constraint: the scene lies outside the convex hull of the camera centers. Outward-looking: cameras inside the scene. Inward-looking: cameras above the scene.

Calibrated image acquisition Calibrated Turntable 360° rotation (21 images) Selected Dinosaur Images Selected Flower Images

Voxel coloring results Dinosaur Reconstruction 72 K voxels colored 7.6 M voxels tested 7 min. to compute on a 250MHz SGI Flower Reconstruction 70 K voxels colored 7.6 M voxels tested 7 min. to compute on a 250MHz SGI

Limitations of depth ordering: a view-independent depth order may not exist (two points p and q can occlude each other in different orders from different views). We need more powerful general-case algorithms: unconstrained camera positions, unconstrained scene geometry/topology.

Voxel coloring solutions 1. C=2 (silhouettes): volume intersection [Baumgart 1974]. 2. C unconstrained, viewpoint constraints: voxel coloring algorithm [Seitz & Dyer 97]. 3. General case: space carving [Kutulakos & Seitz 98].

Space carving algorithm: initialize to a volume V containing the true scene. Repeat until convergence: choose a voxel on the current surface, project it to the visible input images, and carve it if it is not photo-consistent.

Which shape do you get? The photo hull is the union of all photo-consistent scenes in V. It is itself a photo-consistent scene reconstruction, and the tightest possible bound on the true scene.

Space carving algorithm: the basic algorithm is unwieldy (complex update procedure). Alternative: multi-pass plane sweep — efficient, can use texture-mapping hardware, converges quickly in practice, easy to implement.

Multi-pass plane sweep: sweep a plane through the volume in each of the 6 principal directions, considering only the cameras on one side of the plane, and repeat until convergence.

Space carving results: African violet. Input image (1 of 45); reconstruction.

Space carving results: hand. Input image (1 of 100); views of reconstruction.

Properties of space carving. Pros: the voxel coloring version is easy to implement and fast; results are photo-consistent; no smoothness prior is imposed. Cons: bulging; no smoothness prior (so surfaces can be noisy).

Next lecture: the spectrum of image-based modeling and rendering — from images (plus user input or range scans) to a model: camera + geometry, geometry + images, geometry + materials, images + depth, light field, panorama; plus kinematics, dynamics, etc.

Stereo reconstruction: given two or more images of the same scene or object, compute a representation of its shape from known camera viewpoints.

Stereo reconstruction: given two or more images of the same scene or object, compute a representation of its shape from known camera viewpoints. How to estimate camera parameters? - where is the camera? - where is it pointing? - what are the internal parameters, e.g. focal length?

Spectrum of IBMR: from images (plus user input or range scans) to a model — camera + geometry, geometry + images, geometry + materials, images + depth, light field, panorama; plus kinematics, dynamics, etc.

How can we estimate the camera parameters?

Camera calibration: augmented pinhole camera — extrinsics (focal point, orientation) and intrinsics (focal length, aspect ratio, center, lens distortion). Classical calibration: known 3D points and their 2D correspondences. Camera calibration resources are available online.

Camera and calibration target

Classical camera calibration Known 3D coordinates and 2D coordinates - known 3D points on calibration targets - find corresponding 2D points in image using feature detection algorithm

Camera parameters: the mapping from known 3D coordinates to 2D coordinates factors into a view transformation, a perspective projection, and a viewport projection:

[u v 1]ᵀ ∝ K [R | t] [X Y Z 1]ᵀ,   K = | s_x  α    u_0 |
                                       | 0    s_y  v_0 |
                                       | 0    0    1   |

Intrinsic camera parameters (5): s_x, s_y, u_0, v_0, and skew α. Extrinsic camera parameters (6): rotation (3) and translation (3).

Camera matrix Fold intrinsic calibration matrix K and extrinsic pose parameters (R,t) together into a camera matrix M = K [R | t ] (put 1 in lower r.h. corner for 11 d.o.f.)
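Folding K, R, t into M and projecting a point can be written as a small sketch (the numeric intrinsics in the test values are purely illustrative):

```python
import numpy as np

def camera_matrix(K, R, t):
    """Fold intrinsics K (3x3) and pose (R, t) into M = K [R | t] (3x4)."""
    return K @ np.hstack([R, t.reshape(3, 1)])

def project(M, X):
    """Project a 3D point X with camera matrix M; divide out the w coordinate."""
    u, v, w = M @ np.append(X, 1.0)
    return np.array([u / w, v / w])
```

With f = 500, principal point (320, 240), R = I and t = (0, 0, 5), the world origin projects to the principal point, as expected.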

Camera matrix calibration Directly estimate 11 unknowns in the M matrix using known 3D points (X i,Y i,Z i ) and measured feature positions (u i,v i )

Camera matrix calibration: linear regression — bring the denominator over and solve a set of (over-determined) linear equations. How? Least squares (pseudo-inverse): 11 unknowns (up to scale), 2 equations per point (homogeneous coordinates), so 6 points are sufficient.
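The least-squares setup can be written out as a direct linear transform (DLT) sketch: each correspondence contributes two rows to a homogeneous system, solved via SVD. The function name and conventions are mine; with exact correspondences the recovered M reprojects the points exactly (up to scale).

```python
import numpy as np

def calibrate_dlt(X3d, x2d):
    """Estimate the 3x4 camera matrix M from >= 6 3D-2D correspondences.

    Each correspondence (X, Y, Z) <-> (u, v) gives two homogeneous linear
    equations in the 12 entries of M; the solution is the right singular
    vector for the smallest singular value (M is recovered up to scale).
    """
    A = []
    for (X, Y, Z), (u, v) in zip(X3d, x2d):
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -u*X, -u*Y, -u*Z, -u])
        A.append([0, 0, 0, 0, X, Y, Z, 1, -v*X, -v*Y, -v*Z, -v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    return Vt[-1].reshape(3, 4)
```

The correspondences must not be coplanar (a flat calibration target alone makes the system degenerate), which is why classical targets use two or three planes.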

Nonlinear camera calibration: under perspective projection, the 2D coordinates (u, v) are a nonlinear function f(K, R, T, P) of the 3D point P and the camera parameters K, R, T.

Multiple calibration images: find camera parameters that satisfy the constraints from M images and N points (j = 1, …, M; i = 1, …, N). This can be formulated as a nonlinear optimization problem — minimize the sum of squared reprojection errors — and solved with nonlinear optimization techniques such as Gauss-Newton or Levenberg-Marquardt.
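A toy Gauss-Newton iteration for a single calibration parameter — focal length only, with zero skew and the principal point at the origin. These are simplifying assumptions for illustration, not the full multi-parameter method on the slide:

```python
import numpy as np

def refine_focal(f0, X3d, x2d, R, t, iters=10):
    """Gauss-Newton refinement of the focal length f given a fixed pose.

    Residuals: predicted projection minus observed 2D point, stacked over
    all points; the Jacobian has one column since f is the only unknown.
    """
    f = f0
    for _ in range(iters):
        r, J = [], []
        for X, x in zip(X3d, x2d):
            Xc = R @ X + t                 # transform to camera coordinates
            pred = f * Xc[:2] / Xc[2]      # perspective projection
            r.extend(pred - x)             # residual (2 entries per point)
            J.extend(Xc[:2] / Xc[2])       # d(pred)/df
        r, J = np.asarray(r), np.asarray(J)
        f -= (J @ r) / (J @ J)             # normal-equation step, 1 unknown
    return f
```

Because the residual is linear in f here, a single step lands on the optimum; with the full parameter set (pose, distortion) the problem is genuinely nonlinear and needs a good initialization.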

Nonlinear approach. Advantages: can solve for more than one camera pose at a time; fewer degrees of freedom than the linear approach; standard technique in photogrammetry, computer vision, and computer graphics — [Tsai 87] also estimates lens distortion. Disadvantages: more complex update rules; needs a good initialization (e.g., recover K [R | t] from a linear estimate of M).

How can we estimate the camera parameters?

Application: camera calibration for sports video [Farin et al.] — registering images to a court model.

Calibration from 2D motion. Structure from motion (SFM): track points over a sequence of images; estimate 3D point positions and camera poses; intrinsic camera parameters are calibrated beforehand. Self-calibration: solve for both intrinsic and extrinsic camera parameters.

SFM = Holy Grail of 3D Reconstruction Take movie of object Reconstruct 3D model Would be commercially highly viable

How to get feature correspondences. Feature-based approach — good for images: feature detection (corners), feature matching using RANSAC (with the epipolar-line constraint). Pixel-based approach — good for video sequences: patch-based registration with the Lucas-Kanade algorithm; register features across the entire sequence.

Structure from motion Two Principal Solutions Bundle adjustment (nonlinear optimization) Factorization (SVD, through orthographic approximation, affine geometry)

Nonlinear approach for SFM. What's the difference between camera calibration and SFM? - camera calibration: known 3D and 2D - SFM: unknown 3D and known 2D - so SFM is a 3D-to-2D registration problem in which the 3D structure must be estimated too.

SFM: bundle adjustment. SFM = nonlinear least squares problem. Minimize via gradient descent, conjugate gradient, Gauss-Newton, or Levenberg-Marquardt (the common method). Prone to local minima.

Count #constraints vs. #unknowns: M camera poses and N points give 2MN point constraints and 6M + 3N unknowns (assuming known intrinsic camera parameters). This suggests we need 2MN ≥ 6M + 3N. But can we really recover all parameters? No — we cannot recover the origin and orientation (6 params) or the scale (1 param). Thus we need 2MN ≥ 6M + 3N − 7.
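The counting argument can be checked directly (a trivial sketch):

```python
def min_points(M):
    """Smallest N satisfying 2MN >= 6M + 3N - 7: two constraints per
    observation; 6 pose unknowns per camera and 3 per point, minus the
    7-parameter similarity (gauge) ambiguity."""
    N = 1
    while 2 * M * N < 6 * M + 3 * N - 7:
        N += 1
    return N
```

For M = 2 cameras this gives N = 5, matching the classical five-point minimal case for calibrated two-view relative pose.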

Are we done? No, bundle adjustment has many local minima.

SFM using factorization: assume an orthographic camera, so image coordinates are a linear (rotation plus translation) function of world coordinates; subtracting the per-frame mean of the feature points removes the translation.

SFM using factorization: stack all the features from the same frame into a pair of rows, and stack all frames into the measurement matrix W (size 2F × N for F frames and N points).

SFM using factorization: factorize the measurement matrix using SVD, W = U Σ Vᵀ ≈ M̂ Ŝ, keeping the top three singular values, with M̂ = U₃ Σ₃^{1/2} (motion) and Ŝ = Σ₃^{1/2} V₃ᵀ (shape). The factorization is only determined up to an invertible 3×3 matrix Q — how do we compute Q?

SFM using factorization: M is the stack of per-frame rotation rows (i_fᵀ, j_fᵀ), with the true motion given by M = M̂Q. Orthogonality constraints from the rotation matrices: for every frame f, i_fᵀ QQᵀ i_f = 1, j_fᵀ QQᵀ j_f = 1, and i_fᵀ QQᵀ j_f = 0, where QQᵀ is a symmetric 3×3 matrix.

Computing QQᵀ is a least-squares problem: 3F linear equations in 9 unknowns, only 6 of them independent since QQᵀ is symmetric. Q is then recovered from QQᵀ by SVD (eigendecomposition of the symmetric matrix).

SFM using factorization: 1. Form the measurement matrix W. 2. Decompose W into M̂ and Ŝ using SVD. 3. Compute the matrix Q with least squares and SVD. 4. Compute the rotation (motion) matrix M = M̂Q and the shape matrix S = Q⁻¹Ŝ.
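Steps 1-2 — centering and the rank-3 SVD factorization, without the metric upgrade via Q — can be sketched as follows; the data-layout assumptions (rows i_f, j_f per frame) are mine:

```python
import numpy as np

def factorize(W):
    """Rank-3 factorization step of Tomasi-Kanade (affine reconstruction).

    W : (2F, N) measurement matrix of tracked feature coordinates.
    Returns Mhat (2F, 3) and Shat (3, N) with centered(W) ~= Mhat @ Shat.
    The metric upgrade (solving for Q from the rotation constraints)
    is omitted in this sketch.
    """
    W = W - W.mean(axis=1, keepdims=True)      # subtract per-frame centroid
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    Mhat = U[:, :3] * np.sqrt(s[:3])           # motion factor
    Shat = np.sqrt(s[:3])[:, None] * Vt[:3]    # shape factor
    return Mhat, Shat
```

With noise-free orthographic data the centered W has rank at most 3, so the truncated SVD reproduces it exactly; with noise, truncation acts as a least-squares fit.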

Weak-perspective projection: factorization also works for a weak-perspective (scaled orthographic) camera, where the orthographic projection in each frame is scaled by a factor depending on the average scene depth z₀.

SFM using factorization vs. bundle adjustment. Bundle adjustment (nonlinear optimization): works with a perspective camera model; works with incomplete data; prone to local minima. Factorization: closed-form solution for a weak-perspective camera; simple and efficient; usually needs complete data; becomes complicated for a full-perspective camera model. Tools: Phil Torr's structure-from-motion toolkit in MATLAB; the Voodoo camera tracker.

Results from SFM [Han and Kanade]

All together (video): feature detection, feature matching (epipolar geometry), structure from motion, stereo reconstruction, triangulation, texture mapping.

Challenging objects Why are they so difficult to model? How to model them?