Lecture 25: Multi-view stereo, continued

Slides:

Advertisements

Similar presentations

Using Strong Shape Priors for Multiview Reconstruction Yunda SunPushmeet Kohli Mathieu BrayPhilip HS Torr Department of Computing Oxford Brookes University.

Advertisements

Multi-view Stereo via Volumetric Graph-cuts George Vogiatzis, Philip H. S. Torr Roberto Cipolla.

Multi-view Stereo via Volumetric Graph-cuts

Multi-View Stereo for Community Photo Collections

Stereo matching Class 7 Read Chapter 7 of tutorial Tsukuba dataset.

Multiview Reconstruction. Why More Than 2 Views? BaselineBaseline – Too short – low accuracy – Too long – matching becomes hard.

Some books on linear algebra Linear Algebra, Serge Lang, 2004 Finite Dimensional Vector Spaces, Paul R. Halmos, 1947 Matrix Computation, Gene H. Golub,

Stereo Many slides adapted from Steve Seitz. Binocular stereo Given a calibrated binocular stereo pair, fuse it to produce a depth image Where does the.

Recent work in image-based rendering from unstructured image collections and remaining challenges Sudipta N. Sinha Microsoft Research, Redmond, USA.

MultiView Stereo Steve Seitz CSE590SS: Vision for Graphics.

Multiview Reconstruction. Why More Than 2 Views? BaselineBaseline – Too short – low accuracy – Too long – matching becomes hard.

Silhouettes in Multiview Stereo Ian Simon. Multiview Stereo Problem Input: – a collection of images of a rigid object (or scene) – camera parameters for.

Reconstructing Building Interiors from Images Yasutaka Furukawa Brian Curless Steven M. Seitz University of Washington, Seattle, USA Richard Szeliski Microsoft.

Reconstructing Building Interiors from Images Yasutaka Furukawa Brian Curless Steven M. Seitz University of Washington, Seattle, USA 2011/01/16 蔡禹婷.

Last Time Pinhole camera model, projection

Project 3 extension: Wednesday at noon Final project proposal extension: Friday at noon >consult with Steve, Rick, and/or Ian now! Project 2 artifact winners...

Lecture 12: Structure from motion CS6670: Computer Vision Noah Snavely.

Computer Vision CSE576, Spring 2005 Richard Szeliski

Computational Photography: Image-based Modeling Jinxiang Chai.

CSCE 641 Computer Graphics: Image-based Modeling Jinxiang Chai.

Multiple View Geometry : Computational Photography Alexei Efros, CMU, Fall 2005 © Martin Quinn …with a lot of slides stolen from Steve Seitz and.

Project 2 winners Think about Project 3 Guest lecture on Monday: Aseem Announcements.

1 Image-Based Visual Hulls Paper by Wojciech Matusik, Chris Buehler, Ramesh Raskar, Steven J. Gortler and Leonard McMillan [

Image-Based Visual Hulls Wojciech Matusik Chris Buehler Leonard McMillan Wojciech Matusik Chris Buehler Leonard McMillan Massachusetts Institute of Technology.

Multiview stereo. Volumetric stereo Scene Volume V Input Images (Calibrated) Goal: Determine occupancy, “color” of points in V.

Multi-view stereo Many slides adapted from S. Seitz.

High-Quality Video View Interpolation

The plan for today Camera matrix

3D from multiple views : Rendering and Image Processing Alexei Efros …with a lot of slides stolen from Steve Seitz and Jianbo Shi.

Stereo and Structure from Motion

Project 1 grades out Announcements. Multiview stereo Readings S. M. Seitz and C. R. Dyer, Photorealistic Scene Reconstruction by Voxel Coloring, International.

Multiview Reconstruction. Why More Than 2 Views? BaselineBaseline – Too short – low accuracy – Too long – matching becomes hard.

Announcements Project Update Extension: due Friday, April 20 Create web page with description, results Present your project in class (~10min/each) on Friday,

Stereo Guest Lecture by Li Zhang

From Images to Voxels Steve Seitz Carnegie Mellon University University of Washington SIGGRAPH 2000 Course on 3D Photography.

Constructing immersive virtual space for HAI with photos Shingo Mori Yoshimasa Ohmoto Toyoaki Nishida Graduate School of Informatics Kyoto University GrC2011.

Multiple View Geometry : Computational Photography Alexei Efros, CMU, Fall 2006 © Martin Quinn …with a lot of slides stolen from Steve Seitz and.

Accurate, Dense and Robust Multi-View Stereopsis Yasutaka Furukawa and Jean Ponce Presented by Rahul Garg and Ryan Kaminsky.

Manhattan-world Stereo Y. Furukawa, B. Curless, S. M. Seitz, and R. Szeliski 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.

Review: Binocular stereo If necessary, rectify the two stereo images to transform epipolar lines into scanlines For each pixel x in the first image Find.

CSCE 641 Computer Graphics: Image-based Modeling Jinxiang Chai.

Computer Vision Spring ,-685 Instructor: S. Narasimhan WH 5409 T-R 10:30am – 11:50am Lecture #15.

Reconstructing Relief Surfaces George Vogiatzis, Philip Torr, Steven Seitz and Roberto Cipolla BMVC 2004.

Some books on linear algebra Linear Algebra, Serge Lang, 2004 Finite Dimensional Vector Spaces, Paul R. Halmos, 1947 Matrix Computation, Gene H. Golub,

Peter Sturm INRIA Grenoble – Rhône-Alpes (Institut National de Recherche en Informatique et Automatique) An overview of multi-view stereo and other topics.

Xiaoguang Han Department of Computer Science Probation talk – D Human Reconstruction from Sparse Uncalibrated Views.

Graph Cut Algorithms for Binocular Stereo with Occlusions

Interactive Graph Cuts for Optimal Boundary & Region Segmentation of Objects in N-D Images (Fri) Young Ki Baik, Computer Vision Lab.

Object Stereo- Joint Stereo Matching and Object Segmentation Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on Michael Bleyer Vienna.

I 3D: Interactive Planar Reconstruction of Objects and Scenes Adarsh KowdleYao-Jen Chang Tsuhan Chen School of Electrical and Computer Engineering Cornell.

Stereo Many slides adapted from Steve Seitz.

Stereo Many slides adapted from Steve Seitz. Binocular stereo Given a calibrated binocular stereo pair, fuse it to produce a depth image image 1image.

Presenter : Jia-Hao Syu

875 Dynamic Scene Reconstruction

CSCE 641 Computer Graphics: Image-based Rendering (cont.) Jinxiang Chai.

Paper presentation topics 2. More on feature detection and descriptors 3. Shape and Matching 4. Indexing and Retrieval 5. More on 3D reconstruction 1.

The University of Ontario From photohulls to photoflux optimization Yuri Boykov University of Western Ontario Victor Lempitsky Moscow State University.

CSCE 641 Computer Graphics: Image-based Modeling Jinxiang Chai.

Image-Based Rendering Geometry and light interaction may be difficult and expensive to model –Think of how hard radiosity is –Imagine the complexity of.

Photoconsistency constraint C2 q C1 p l = 2 l = 3 Depth labels If this 3D point is visible in both cameras, pixels p and q should have similar intensities.

Center for Machine Perception Department of Cybernetics Faculty of Electrical Engineering Czech Technical University in Prague Segmentation Based Multi-View.

Energy minimization Another global approach to improve quality of correspondences Assumption: disparities vary (mostly) smoothly Minimize energy function:

CSE 185 Introduction to Computer Vision Stereo 2.

Robot Vision SS 2011 Matthias Rüther / Matthias Straka 1 ROBOT VISION Lesson 7: Volumetric Object Reconstruction Matthias Straka.

A Globally Optimal Algorithm for Robust TV-L 1 Range Image Integration Christopher Zach VRVis Research Center Thomas Pock, Horst Bischof.

From Photohulls to Photoflux Optimization

Announcements Midterm due now Project 2 artifacts: vote today!

MSR Image based Reality Project

Announcements Project 3 out today (help session at end of class)

Stereo vision Many slides adapted from Steve Seitz.

Presentation transcript:

Lecture 25: Multi-view stereo, continued CS6670: Computer Vision Noah Snavely Lecture 25: Multi-view stereo, continued

Figures by Carlos Hernandez Multi-view Stereo Input: calibrated images from several viewpoints Output: 3D object model Figures by Carlos Hernandez

The visibility problem Which points are visible in which images? Known Scene Forward Visibility known scene Inverse Visibility known images Unknown Scene

Volumetric stereo Goal: Determine occupancy, “color” of points in V Scene Volume V Input Images (Calibrated) Goal: Determine occupancy, “color” of points in V

Discrete formulation: Voxel Coloring Discretized Scene Volume Input Images (Calibrated) Goal: Assign RGBA values to voxels in V photo-consistent with images

Complexity and computability Discretized Scene Volume 3 N voxels C colors All Scenes (CN3) True Scene Photo-Consistent Scenes

Issues Theoretical Questions Practical Questions Identify class of all photo-consistent scenes Practical Questions How do we compute photo-consistent models?

Voxel coloring solutions 1. C=2 (shape from silhouettes) Volume intersection [Baumgart 1974] For more info: Rapid octree construction from image sequences. R. Szeliski, CVGIP: Image Understanding, 58(1):23-32, July 1993. (this paper is apparently not available online) or W. Matusik, C. Buehler, R. Raskar, L. McMillan, and S. J. Gortler, Image-Based Visual Hulls, SIGGRAPH 2000 ( pdf 1.6 MB ) 2. C unconstrained, viewpoint constraints Voxel coloring algorithm [Seitz & Dyer 97] 3. General Case Space carving [Kutulakos & Seitz 98]

Reconstruction from Silhouettes (C = 2) Binary Images Approach: Backproject each silhouette Intersect backprojected volumes

Volume intersection Reconstruction Contains the True Scene But is generally not the same In the limit (all views) get visual hull Complement of all lines that don’t intersect S

Voxel algorithm for volume intersection O(MN^3) Color voxel black if on silhouette in every image for M images, N3 voxels Don’t have to search 2N3 possible scenes! O( ? ),

Properties of Volume Intersection Pros Easy to implement, fast Accelerated via octrees [Szeliski 1993] or interval techniques [Matusik 2000] Cons No concavities Reconstruction is not photo-consistent Requires identification of silhouettes

Voxel Coloring Solutions 1. C=2 (silhouettes) Volume intersection [Baumgart 1974] 2. C unconstrained, viewpoint constraints Voxel coloring algorithm [Seitz & Dyer 97] For more info: http://www.cs.washington.edu/homes/seitz/papers/ijcv99.pdf 3. General Case Space carving [Kutulakos & Seitz 98]

Voxel Coloring Approach 1. Choose voxel Color if consistent (standard deviation of pixel colors below threshold) 2. Project and correlate Visibility Problem: in which images is each voxel visible?

Depth Ordering: visit occluders first! Layers Scene Traversal Condition: depth order is the same for all input views

Panoramic Depth Ordering Cameras oriented in many different directions Planar depth ordering does not apply

Panoramic Depth Ordering Layers radiate outwards from cameras

Panoramic Layering Layers radiate outwards from cameras

Panoramic Layering Layers radiate outwards from cameras

Compatible Camera Configurations Depth-Order Constraint Scene outside convex hull of camera centers Outward-Looking cameras inside scene Inward-Looking cameras above scene

Calibrated Image Acquisition Selected Dinosaur Images Calibrated Turntable 360° rotation (21 images) Selected Flower Images

Voxel Coloring Results Dinosaur Reconstruction 72 K voxels colored 7.6 M voxels tested 7 min. to compute on a 250MHz SGI Flower Reconstruction 70 K voxels colored 7.6 M voxels tested 7 min. to compute on a 250MHz SGI

Limitations of Depth Ordering A view-independent depth order may not exist p q Need more powerful general-case algorithms Unconstrained camera positions Unconstrained scene geometry/topology

Voxel Coloring Solutions 1. C=2 (silhouettes) Volume intersection [Baumgart 1974] 2. C unconstrained, viewpoint constraints Voxel coloring algorithm [Seitz & Dyer 97] 3. General Case Space carving [Kutulakos & Seitz 98] For more info: http://www.cs.washington.edu/homes/seitz/papers/kutu-ijcv00.pdf

Space Carving Algorithm Initialize to a volume V containing the true scene Carve if not photo-consistent Choose a voxel on the current surface Project to visible input images Image 1 Image N …... Space Carving Algorithm Repeat until convergence

Which shape do you get? Photo Hull V V True Scene The Photo Hull is the UNION of all photo-consistent scenes in V It is a photo-consistent scene reconstruction Tightest possible bound on the true scene

Space Carving Algorithm Basic algorithm is unwieldy Complex update procedure Alternative: Multi-Pass Plane Sweep Efficient, can use texture-mapping hardware Converges quickly in practice Easy to implement Results Algorithm

Space Carving Results: African Violet Input Image (1 of 45) Reconstruction Reconstruction Reconstruction

Space Carving Results: Hand Input Image (1 of 100) Views of Reconstruction

Properties of Space Carving Pros Voxel coloring version is easy to implement, fast Photo-consistent results No smoothness prior Cons Bulging

Improvements Unconstrained camera viewpoints Evolving a surface Space carving [Kutulakos & Seitz 98] Evolving a surface Level sets [Faugeras & Keriven 98] More recent work by Pons et al. Global optimization Graph cut approaches [Kolmogoriv & Zabih, ECCV 2002] [Vogiatzis et al., PAMI 2007] Modeling shiny (and other reflective) surfaces e.g., Zickler et al., Helmholtz Stereopsis

Questions? 2-minute break

Reconstructing Building Interiors from Images Yasutaka Furukawa Brian Curless Steven M. Seitz University of Washington, Seattle, USA Richard Szeliski Microsoft Research, Redmond, USA

Reconstruction & Visualization of Architectural Scenes Manual (semi-automatic) Google Earth & Virtual Earth Façade [Debevec et al., 1996] CityEngine [Müller et al., 2006, 2007] Automatic Ground-level images [Cornelis et al., 2008, Pollefeys et al., 2008] Aerial images [Zebedin et al., 2008] Google Earth Virtual Earth Müller et al. Zebedin et al.

Reconstruction & Visualization of Architectural Scenes Manual (semi-automatic) Google Earth & Virtual Earth Façade [Debevec et al., 1996] CityEngine [Müller et al., 2006, 2007] Automatic Ground-level images [Cornelis et al., 2008, Pollefeys et al., 2008] Aerial images [Zebedin et al., 2008] Google Earth Virtual Earth Müller et al. Zebedin et al.

Reconstruction & Visualization of Architectural Scenes Manual (semi-automatic) Google Earth & Virtual Earth Façade [Debevec et al., 1996] CityEngine [Müller et al., 2006, 2007] Automatic Ground-level images [Cornelis et al., 2008, Pollefeys et al., 2008] Aerial images [Zebedin et al., 2008] Google Earth Virtual Earth Müller et al. Zebedin et al.

Reconstruction & Visualization of Architectural Scenes Little attention paid to indoor scenes Google Earth Virtual Earth Müller et al. Zebedin et al.

Our Goal Fully automatic system for indoors/outdoors Reconstructs a simple 3D model from images Provides real-time interactive visualization

What are the challenges?

Challenges - Reconstruction Multi-view stereo (MVS) typically produces a dense model We want the model to be Simple for real-time interactive visualization of a large scene (e.g., a whole house) Accurate for high-quality image-based rendering

Challenges - Reconstruction Multi-view stereo (MVS) typically produces a dense model We want the model to be Simple for real-time interactive visualization of a large scene (e.g., a whole house) Accurate for high-quality image-based rendering Simple mode is effective for compelling visualization

Challenges – Indoor Reconstruction Texture-poor surfaces Remove bullets and put text behing each photo

Challenges – Indoor Reconstruction Texture-poor surfaces Complicated visibility Remove bullets and put text behing each photo

Challenges – Indoor Reconstruction Texture-poor surfaces Complicated visibility Remove bullets and put text behing each photo Prevalence of thin structures (doors, walls, tables)

System pipeline Images Images

System pipeline Structure-from-Motion Images Bundler by Noah Snavely Structure from Motion for unordered image collections http://phototour.cs.washington.edu/bundler/ Images

System pipeline Images SFM

System pipeline Multi-view Stereo Images SFM PMVS by Yasutaka Furukawa and Jean Ponce Patch-based Multi-View Stereo Software http://grail.cs.washington.edu/software/pmvs/ Images SFM

System pipeline Images SFM MVS

Manhattan-world Stereo System pipeline Manhattan-world Stereo [Furukawa et al., CVPR 2009] Images SFM MVS

Manhattan-world Stereo System pipeline Manhattan-world Stereo [Furukawa et al., CVPR 2009] Images SFM MVS

Manhattan-world Stereo System pipeline Manhattan-world Stereo [Furukawa et al., CVPR 2009] Images SFM MVS

Manhattan-world Stereo System pipeline Manhattan-world Stereo [Furukawa et al., CVPR 2009] Images SFM MVS

Manhattan-world Stereo System pipeline Manhattan-world Stereo [Furukawa et al., CVPR 2009] Images SFM MVS

Manhattan-world Stereo System pipeline Manhattan-world Stereo [Furukawa et al., CVPR 2009] Images SFM MVS

System pipeline Images SFM MVS MWS

Manhattan-World Stereo (MWS) 0. Assume that surfaces are oriented along three mutually orthogonal directions Axis-align model by detecting dominant orientations Generate hypothesis planes for each direction Label each pixel in each image with a plane (graph cuts)

Finding hypothesis planes

Example Input image Computed depth map Mesh model Texture-mapped model

Manhattan-World Stereo (MWS) Video

Axis-aligned depth map merging System pipeline Axis-aligned depth map merging (our contribution) Images SFM MVS MWS

Rendering: simple view-dependent texture mapping System pipeline Rendering: simple view-dependent texture mapping Images SFM MVS MWS Merging

Outline System pipeline (system contribution) Algorithmic details (technical contribution) Experimental results Conclusion and future work

Axis-aligned Depth-map Merging Basic framework is similar to volumetric MRF [Vogiatzis 2005, Sinha 2007, Zach 2007, Hernández 2007] Hernandez is the name

Axis-aligned Depth-map Merging Basic framework is similar to volumetric MRF [Vogiatzis 2005, Sinha 2007, Zach 2007, Hernández 2007] First explain the algorithm then differences from competing approaches.

Axis-aligned Depth-map Merging Basic framework is similar to volumetric MRF [Vogiatzis 2005, Sinha 2007, Zach 2007, Hernández 2007] First explain the algorithm then differences from competing approaches.

Axis-aligned Depth-map Merging Basic framework is similar to volumetric MRF [Vogiatzis 2005, Sinha 2007, Zach 2007, Hernández 2007] First explain the algorithm then differences from competing approaches.

Axis-aligned Depth-map Merging Basic framework is similar to volumetric MRF [Vogiatzis 2005, Sinha 2007, Zach 2007, Hernández 2007] First explain the algorithm then differences from competing approaches.

Axis-aligned Depth-map Merging Basic framework is similar to volumetric MRF [Vogiatzis 2005, Sinha 2007, Zach 2007, Hernández 2007] First explain the algorithm then differences from competing approaches.

Key Feature 1 - Penalty terms Binary penalty

Key Feature 1 - Penalty terms

Key Feature 1 - Penalty terms

Axis-aligned Depth-map Merging Align voxel grid with the dominant axes Put here how typical approaches do, and why they do not work Talk about 4d neighborhood, Put texts at the bottom of the figures

Axis-aligned Depth-map Merging Align voxel grid with the dominant axes Data term (unary) Put here how typical approaches do, and why they do not work Talk about 4d neighborhood, Put texts at the bottom of the figures

Axis-aligned Depth-map Merging Align voxel grid with the dominant axes Data term (unary) Put here how typical approaches do, and why they do not work Talk about 4d neighborhood, Put texts at the bottom of the figures

Axis-aligned Depth-map Merging Align voxel grid with the dominant axes Data term (unary) Put here how typical approaches do, and why they do not work Talk about 4d neighborhood, Put texts at the bottom of the figures

Axis-aligned Depth-map Merging Align voxel grid with the dominant axes Data term (unary) Smoothness (binary) Put here how typical approaches do, and why they do not work Talk about 4d neighborhood, Put texts at the bottom of the figures

Axis-aligned Depth-map Merging Align voxel grid with the dominant axes Data term (unary) Smoothness (binary) Put here how typical approaches do, and why they do not work Talk about 4d neighborhood, Put texts at the bottom of the figures

Axis-aligned Depth-map Merging Align voxel grid with the dominant axes Data term (unary) Smoothness (binary) Graph-cuts Put here how typical approaches do, and why they do not work Talk about 4d neighborhood, Put texts at the bottom of the figures

Key Feature 2 – Regularization For large scenes, data info are not complete

Key Feature 2 – Regularization For large scenes, data info are not complete Typical volumetric MRFs bias to general minimal surface [Boykov and Kolmogorov, 2003] We bias to piece-wise planar axis-aligned for architectural scenes

Key Feature 2 – Regularization

Key Feature 2 – Regularization

Key Feature 2 – Regularization

Key Feature 2 – Regularization

Key Feature 2 – Regularization

Key Feature 2 – Regularization Same energy (ambiguous)

Key Feature 2 – Regularization Same energy (ambiguous) Data penalty: 0

Key Feature 2 – Regularization Same energy (ambiguous) Data penalty: 0 Data penalty: 0 Smoothness penalty: Data penalty: 0 Smoothness penalty: 24

Key Feature 2 – Regularization shrinkage

Key Feature 2 – Regularization Axis-aligned neighborhood + Potts model Ambiguous Break ties with the minimum-volume solution Piece-wise planar axis-aligned model

Key Feature 3 – Sub-voxel accuracy

Key Feature 3 – Sub-voxel accuracy

Key Feature 3 – Sub-voxel accuracy

Key Feature 3 – Sub-voxel accuracy

Summary of Depth-map Merging For a simple and axis-aligned model Explicit regularization in binary Axis-aligned neighborhood system & minimum-volume solution For an accurate model Sub-voxel refinement

Outline System pipeline (system contribution) Algorithmic details (technical contribution) Experimental results Conclusion and future work

Kitchen - 22 images 1364 triangles hall - 97 images 3344 triangles house - 148 images 8196 triangles gallery - 492 images 8302 triangles

Demo

Questions?