Multiview stereo. Volumetric stereo Scene Volume V Input Images (Calibrated) Goal: Determine occupancy, “color” of points in V.

Presentation transcript:

Multiview stereo

Volumetric stereo. Scene volume V; input images (calibrated). Goal: determine the occupancy and “color” of points in V.

Discrete formulation: voxel coloring. Discretized scene volume; input images (calibrated). Goal: assign RGBA values to voxels in V that are photo-consistent with the images.

Complexity and computability. A discretized scene volume with $N^3$ voxels and $C$ colors admits $C^{N^3}$ possible scenes. All scenes ⊃ photo-consistent scenes ⊃ true scene.
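To get a feel for how quickly this search space grows, the count can be evaluated directly. This is only an illustrative sketch: the function name and the grid sizes are assumptions chosen for the example, not values from the slides.

```python
def count_candidate_scenes(n_per_side: int, n_colors: int) -> int:
    """Number of ways to color an n x n x n voxel grid with n_colors per voxel."""
    n_voxels = n_per_side ** 3
    return n_colors ** n_voxels


# A tiny 2 x 2 x 2 grid with 2 colors already has 2**8 candidate scenes.
print(count_candidate_scenes(2, 2))

# Python integers are arbitrary precision, so the digit count of a larger
# case (10 x 10 x 10 voxels, 256 colors) can be inspected without overflow.
print(len(str(count_candidate_scenes(10, 256))))
```

Exhaustive enumeration is clearly out of reach even for small grids, which is why the methods that follow prune the space using photo-consistency.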

Voxel coloring. 1. Choose a voxel. 2. Project and correlate. 3. Color if consistent (standard deviation of pixel colors below a threshold). Visibility problem: in which images is each voxel visible?
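The consistency test in step 3 can be sketched as follows. The per-channel treatment, the handling of a single observation, and the function names are assumptions made for illustration; the slide only specifies a standard-deviation threshold.

```python
from statistics import pstdev


def is_photo_consistent(pixel_colors, threshold):
    """pixel_colors: one (r, g, b) tuple per image in which the voxel is visible.

    Returns True if the standard deviation of every channel is below threshold.
    """
    if len(pixel_colors) < 2:
        # Assumption: a single observation cannot be contradicted.
        return True
    channels = zip(*pixel_colors)  # regroup into (all r), (all g), (all b)
    return all(pstdev(channel) < threshold for channel in channels)


def mean_color(pixel_colors):
    """Color assigned to a consistent voxel: the mean of its observations."""
    n = len(pixel_colors)
    return tuple(sum(channel) / n for channel in zip(*pixel_colors))


similar = [(200, 10, 10), (204, 12, 8), (198, 9, 11)]
different = [(200, 10, 10), (20, 200, 30)]
print(is_photo_consistent(similar, 5.0), is_photo_consistent(different, 5.0))
```

This check is only meaningful once the visibility question is answered, since colors from images in which the voxel is occluded must be left out of `pixel_colors`.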

Depth ordering: occluders first! Traverse the scene in layers. Condition: the depth order is the same for all input views.
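A near-to-far sweep makes visibility easy to track: any pixel already explained by a nearer colored voxel is treated as occluded for later layers. The sketch below assumes the layers are given in a valid depth order and that `project` is a hypothetical callable supplied by the caller; both, along with the data layout, are illustrative choices rather than the presenters' implementation.

```python
from statistics import pstdev


def consistent(colors, threshold):
    """True if every RGB channel has standard deviation below threshold."""
    if len(colors) < 2:
        return True  # assumption: one observation cannot be contradicted
    return all(pstdev(channel) < threshold for channel in zip(*colors))


def voxel_coloring(layers, images, project, threshold):
    """Color voxels in one near-to-far sweep.

    layers  -- list of voxel lists, nearest layer first
    images  -- list of dicts mapping pixel -> (r, g, b)
    project -- hypothetical callable (voxel, image_index) -> pixel or None
    """
    claimed = [set() for _ in images]  # pixels explained by nearer voxels
    colored = {}
    for layer in layers:
        claims = []
        for voxel in layer:
            seen = []
            for i, image in enumerate(images):
                pixel = project(voxel, i)
                if pixel in image and pixel not in claimed[i]:
                    seen.append((i, pixel, image[pixel]))
            colors = [color for _, _, color in seen]
            if colors and consistent(colors, threshold):
                colored[voxel] = tuple(sum(c) / len(colors) for c in zip(*colors))
                claims.extend((i, pixel) for i, pixel, _ in seen)
        # Voxels occlude only later layers, so claims are applied per layer.
        for i, pixel in claims:
            claimed[i].add(pixel)
    return colored


# Toy setup: voxels are (x, depth); both cameras map a voxel to pixel x.
images = [{0: (255, 0, 0), 1: (0, 0, 255)}, {0: (255, 0, 0), 1: (0, 255, 0)}]
layers = [[(0, 0), (1, 0)], [(0, 1), (1, 1)]]
print(voxel_coloring(layers, images, lambda voxel, i: voxel[0], threshold=10.0))
```

In the toy setup only the voxel at x = 0 in the nearest layer is colored: the voxel behind it is occluded in both images, and the voxels at x = 1 see conflicting colors.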

Panoramic depth ordering. Cameras are oriented in many different directions, so planar depth ordering does not apply.

Layers radiate outwards from cameras

Compatible camera configurations: outward-looking cameras inside the scene; inward-looking cameras above the scene.

Voxel coloring results. Dinosaur reconstruction: 72 K voxels colored, 7.6 M voxels tested, 7 min. to compute on a 250 MHz SGI. Flower reconstruction: 70 K voxels colored, 7.6 M voxels tested, 7 min. to compute on a 250 MHz SGI.

Limitations of depth ordering. A view-independent depth order may not exist (figure: points p and q). We need more powerful general-case algorithms that allow unconstrained camera positions and unconstrained scene geometry/topology.

Space carving algorithm (input: Image 1 ... Image N). Initialize to a volume V containing the true scene. Repeat until convergence: choose a voxel on the current surface; project it to the visible input images; carve it if it is not photo-consistent.
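The loop structure can be sketched as below. The three callables (`surface`, `visible_colors`, `is_consistent`) are hypothetical placeholders for the geometry, visibility, and photo-consistency machinery, which the slides do not spell out; the stubs in the usage example are deliberately trivial.

```python
def space_carve(initial_volume, surface, visible_colors, is_consistent):
    """Carve inconsistent surface voxels until nothing changes.

    initial_volume -- iterable of voxels containing the true scene
    surface        -- hypothetical callable: volume -> voxels on its surface
    visible_colors -- hypothetical callable: (voxel, volume) -> colors seen
                      in the images where the voxel is currently visible
    is_consistent  -- hypothetical callable: colors -> bool
    """
    volume = set(initial_volume)
    changed = True
    while changed:  # repeat until convergence
        changed = False
        for voxel in list(surface(volume)):
            if voxel not in volume:
                continue
            colors = visible_colors(voxel, volume)
            if colors and not is_consistent(colors):
                volume.remove(voxel)  # carve
                changed = True
    return volume


# Trivial stubs: every voxel is on the surface and has fixed observations.
observations = {"a": [100, 102], "b": [10, 200], "c": [50]}
carved = space_carve(
    observations,
    surface=lambda volume: volume,
    visible_colors=lambda voxel, volume: observations[voxel],
    is_consistent=lambda colors: max(colors) - min(colors) < 10,
)
print(sorted(carved))
```

In a real implementation, carving a voxel changes which voxels are on the surface and which images see them, which is why the outer loop repeats until no voxel is removed.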

Convergence. Consistency property: the resulting shape is photo-consistent, because all inconsistent points are removed. Convergence property: carving converges to a non-empty shape, because a point p on the true scene is never removed.

Which shape do you get? The photo hull is the UNION of all photo-consistent scenes in V. It is a photo-consistent scene reconstruction and the tightest possible bound on the true scene. (Figure: true scene in V; photo hull in V.)

Space carving results: African violet. Input image (1 of 45); reconstruction.

Space carving results: hand. Input image (1 of 100); views of reconstruction.

Multi-Camera Scene Reconstruction via Graph Cuts

Comparison with stereo. This is a much harder problem than stereo. In stereo, most scene elements are visible in both cameras, and it is common to ignore occlusions. Here, almost no scene elements are visible in all cameras, so visibility reasoning is vital.

Key issues. Visibility reasoning. Incorporating spatial smoothness. Computational tractability: only certain energy functions can be minimized using graph cuts! Handling a large class of camera configurations. Treating input images symmetrically.

Approach. Problem formulation: discrete labels, not voxels; a carefully constructed energy function. Minimizing the energy via graph cuts: a local minimum in a strong sense; uses the regularity construction. Experimental results: strong preliminary results.

Problem formulation. A discrete set of labels corresponds to different depths (for example, depths from a single camera). Camera pixel plus label = 3D point. Goal: find the best configuration, that is, a labeling for each pixel in each camera. Minimize an energy function over configurations. Finding the exact minimum is NP-hard.
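The identity "camera pixel plus label = 3D point" can be made concrete with a small back-projection sketch. It assumes a pinhole camera at the origin looking along +z and labels that index a list of depths; the function name, intrinsics, and depth values are all hypothetical.

```python
def pixel_label_to_point(pixel, label, depths, focal, principal_point):
    """Back-project a pixel to the 3D point at the depth selected by its label.

    pixel           -- (u, v) image coordinates
    label           -- index into depths
    depths          -- depth (z) associated with each label (assumed layout)
    focal           -- focal length in pixels
    principal_point -- (cx, cy) image center
    """
    u, v = pixel
    cx, cy = principal_point
    z = depths[label]
    return ((u - cx) * z / focal, (v - cy) * z / focal, z)


depths = [1.0, 2.0, 4.0]  # hypothetical depth for labels 0, 1, 2
print(pixel_label_to_point((420, 240), 1, depths, 500.0, (320.0, 240.0)))
```

A configuration is then just a choice of one label per pixel per camera, and every term of the energy can be phrased in terms of the resulting 3D points.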

Sample configuration (figure: cameras C1 and C2; pixels p, q, r; depth labels l = 2, l = 3).

Energy function has 3 terms: smoothness, data, visibility. Neighborhood systems involve 3D points. Smoothness: spatial coherence (within a camera). Data: photoconsistency (between cameras); two pixels looking at the same scene point should see similar intensities. Visibility: prohibit certain configurations (between cameras); a pixel in one camera can have its view blocked by a scene element visible from another camera.

Smoothness neighborhood (figure: camera C1; pixels p, r; depth labels l = 2, l = 3; smoothness neighbors).

Smoothness term. The smoothness neighborhood involves pairs of 3D points from the same camera. We'll assume it only depends on a pair of labels for neighboring pixels, using the usual 4- or 8-connected system among pixels. The smoothness penalty for configuration f is the sum of V over neighboring pixel pairs, $E_{smooth}(f) = \sum_{\{p,q\} \in N} V(f(p), f(q))$. V must be a metric, e.g. a robustified L1 distance (needed for regularity).
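A minimal sketch of such a smoothness penalty, assuming a truncated absolute difference as the robustified L1 metric and a 4-connected grid within one camera. The cap value and the function names are illustrative assumptions.

```python
def truncated_l1(a, b, cap=2):
    """Robustified L1: absolute label difference, truncated at cap."""
    return min(abs(a - b), cap)


def smoothness_energy(labels, penalty=truncated_l1):
    """Sum the penalty over 4-connected neighbor pairs of a 2D label grid."""
    rows, cols = len(labels), len(labels[0])
    total = 0
    for y in range(rows):
        for x in range(cols):
            if x + 1 < cols:  # right neighbor
                total += penalty(labels[y][x], labels[y][x + 1])
            if y + 1 < rows:  # bottom neighbor
                total += penalty(labels[y][x], labels[y + 1][x])
    return total


print(smoothness_energy([[0, 0], [0, 3]]))  # two discontinuities, each capped at 2
```

Truncation keeps the penalty for a genuine depth discontinuity bounded, so object boundaries are not smoothed away, while the function still satisfies the triangle inequality.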

Photoconsistency constraint (figure: cameras C1 and C2; pixels p, q; depth labels l = 2, l = 3). If this 3D point is visible in both cameras, pixels p and q should have similar intensities.

Photoconsistency neighborhood (figure: cameras C1 and C2; pixels p, q; depth labels l = 2, l = 3; photoconsistency neighbors).

Data (photoconsistency) term. The photoconsistency neighborhood N_photo is an arbitrary set of pairs of 3D points at the same depth. Current implementation: a pair is included if the projection of the 3D point from C1 onto C2 is nearest to q. Our data penalty for configuration f is kept negative for technical reasons (regularity).
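One way to obtain a data penalty that is never positive is to subtract a constant from a squared intensity difference and clamp at zero. The exact form, the constant `k`, and the data layout below are assumptions made for illustration; the transcript does not preserve the slide's formula.

```python
def data_penalty(intensity_p, intensity_q, k=30.0):
    """Non-positive penalty: a reward for similar intensities, 0 otherwise.

    The squared difference and the constant k are illustrative assumptions.
    """
    return min(0.0, (intensity_p - intensity_q) ** 2 - k)


def data_energy(config, n_photo, intensity, k=30.0):
    """Sum the penalty over photoconsistency pairs present in the configuration.

    config    -- set of 3D points, each a (camera, pixel, label) tuple
    n_photo   -- iterable of pairs of 3D points (same depth)
    intensity -- dict mapping (camera, pixel) -> intensity
    """
    total = 0.0
    for a, b in n_photo:
        if a in config and b in config:
            total += data_penalty(intensity[a[:2]], intensity[b[:2]], k)
    return total


a = ("C1", (0, 0), 2)
b = ("C2", (1, 0), 2)
intensity = {("C1", (0, 0)): 100, ("C2", (1, 0)): 103}
print(data_energy({a, b}, [(a, b)], intensity))
```

Because a pair contributes only when both of its 3D points are present, a configuration is rewarded for each photoconsistent match it explains rather than punished for each mismatch.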

Visibility constraint (figure: cameras C1 and C2; pixels p, q; depth labels l = 2, l = 3). The pictured configuration is impossible.

Visibility neighborhood (figure: cameras C1 and C2; pixels p, q; depth labels l = 2, l = 3; visibility neighbors).

Visibility term. The visibility neighborhood N_vis is all pairs of 3D points that violate the visibility constraint. It is an arbitrary set of pairs of points at different depths (needed for regularity), and the points in a pair come from different cameras. Current implementation: based on the photoconsistency neighborhood. A configuration containing any pair of 3D points in the visibility neighborhood has infinite cost.
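The infinite-cost rule can be sketched as a hard constraint check. The representation of 3D points as (camera, pixel, label) tuples and the function name are assumptions for illustration only.

```python
import math


def visibility_energy(config, n_vis):
    """Return infinity if the configuration contains a forbidden pair, else 0.

    config -- set of 3D points, each a (camera, pixel, label) tuple
    n_vis  -- iterable of pairs of 3D points that cannot coexist
    """
    for a, b in n_vis:
        if a in config and b in config:
            return math.inf
    return 0.0


near = ("C1", (0, 0), 2)
far = ("C2", (1, 0), 3)
print(visibility_energy({near, far}, [(near, far)]))  # forbidden pair present
print(visibility_energy({near}, [(near, far)]))       # only one of the pair
```

Since any finite-energy configuration beats an infinite one, the minimizer never selects a configuration that violates the visibility constraint.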

Energy minimization via the expansion move algorithm (figure: cameras C1 and C2; input (non-binary) configuration f; red expansion move). We must solve the binary energy minimization problem of finding the α-expansion move that most reduces E. We only need to show that all the terms in E are regular!
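The structure of the expansion move algorithm can be illustrated on a tiny problem. In this sketch the binary subproblem is solved by exhaustive search rather than by a graph cut, so it only scales to a handful of pixels; the toy energy and all names are assumptions for illustration.

```python
from itertools import product


def best_expansion_move(labels, alpha, energy):
    """Try every binary choice (keep label / switch to alpha); keep the best."""
    best, best_energy = list(labels), energy(labels)
    for mask in product((0, 1), repeat=len(labels)):
        candidate = [alpha if m else l for m, l in zip(mask, labels)]
        e = energy(candidate)
        if e < best_energy:
            best, best_energy = candidate, e
    return best, best_energy


def expansion_moves(labels, label_set, energy):
    """Cycle over labels until no expansion move lowers the energy."""
    labels, current = list(labels), energy(labels)
    improved = True
    while improved:
        improved = False
        for alpha in label_set:
            candidate, e = best_expansion_move(labels, alpha, energy)
            if e < current:
                labels, current, improved = candidate, e, True
    return labels, current


observed = [0, 0, 2, 0]  # noisy 1D "depth" signal (toy data)


def toy_energy(labels):
    data = sum(abs(l - o) for l, o in zip(labels, observed))
    smooth = sum(2 * min(abs(a - b), 1) for a, b in zip(labels, labels[1:]))
    return data + smooth


print(expansion_moves(observed, [0, 1, 2], toy_energy))
```

On the toy signal the isolated outlier is smoothed away, since paying the data cost once is cheaper than paying for two discontinuities. With regular terms, a single minimum cut replaces the exhaustive inner search.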

Smoothness term is regular. True because V is a metric.
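The link between the metric property and regularity can be checked numerically. For an expansion move on neighboring pixels with current labels fp and fq, the binary term takes the values V(fp, fq), V(fp, α), V(α, fq), and V(α, α), and regularity is exactly the triangle inequality once V(α, α) = 0. The sketch below, with illustrative names and label ranges, brute-forces that inequality.

```python
from itertools import product


def expansion_term_is_regular(v, labels):
    """Check A + D <= B + C for every label pair and every alpha."""
    for fp, fq, alpha in product(labels, repeat=3):
        a = v(fp, fq)        # both pixels keep their labels
        b = v(fp, alpha)     # only q switches to alpha
        c = v(alpha, fq)     # only p switches to alpha
        d = v(alpha, alpha)  # both switch to alpha
        if a + d > b + c:
            return False
    return True


def truncated_l1(a, b):
    return min(abs(a - b), 2)  # a metric


def squared(a, b):
    return (a - b) ** 2  # violates the triangle inequality


print(expansion_term_is_regular(truncated_l1, range(5)))
print(expansion_term_is_regular(squared, range(5)))
```

The squared difference fails, for instance, with fp = 0, fq = 2, α = 1, where 4 + 0 exceeds 1 + 1.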

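Visibility term is regular. Consider a pair of pixels p, q, and write A, B, C, D for the term's values at the binary labelings (0,0), (0,1), (1,0), (1,1); regularity requires A + D ≤ B + C. The input configuration has finite cost, therefore A = 0. 3D points at the same depth are not in the visibility neighborhood N_vis, therefore D = 0. B and C can be 0 or ∞, hence non-negative.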
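The argument amounts to a four-case check, which can be enumerated directly. The helper name is an assumption; the inequality is the standard regularity condition for a binary pairwise term.

```python
import math
from itertools import product


def is_regular(a, b, c, d):
    """Regularity of a binary pairwise term: E(0,0) + E(1,1) <= E(0,1) + E(1,0)."""
    return a + d <= b + c


# A = 0 and D = 0 as argued; B and C each range over {0, infinity}.
cases = [(0, b, c, 0) for b, c in product((0, math.inf), repeat=2)]
all_regular = all(is_regular(*case) for case in cases)
print(all_regular)
```

Every case passes because the left-hand side is always 0 while the right-hand side is non-negative.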

Data term is regular

Tsukuba images. Our results, 4 iterations.

Comparison. Our results, 10 iterations; best results [SS ’02].