A Bayesian Approach For 3D Reconstruction From a Single Image

Presentation transcript:

A Bayesian Approach For 3D Reconstruction From a Single Image
Presented by: Erick Delage
Supervisor: Prof. Andrew Y. Ng
AI Laboratory, Stanford University

Autonomous Monocular Depth Reconstruction for Indoor Images
Main problem: can a robot reconstruct 3D structure from a single image?

Review of Publications
Popular 3D reconstruction: stereo vision (Trucco and Verri, 1998); structure from motion.
Single-view 3D reconstruction: shape from shading (Zhang et al., 1999); 3D metrology (Criminisi et al., 2000).
Our goal: to develop an autonomous algorithm that recovers 3D information from a single image in a complex environment.
Shape from stereo vision depends on the baseline, and the state of the art in single-view reconstruction (shape from shading, single-view metrology) has its own limitations; compared to our setting, this leads to the idea of learning prior knowledge for depth reconstruction.

Simplification of the Problem
Assumptions:
- The image contains a flat floor and walls.
- The camera's optical axis is parallel to the ground plane.
- The camera is at a known height above the ground.
- The image is obtained by perspective projection.
Our claim: given the floor boundary position, the 3D coordinates of all points in the image can be recovered.
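To make this claim concrete, here is a minimal sketch of the back-projection of a floor pixel under these assumptions. It presumes a pinhole camera with focal length f (in pixels), principal point (u0, v0), and known camera height; these symbols and the function name are introduced here for illustration and are not taken from the slides.

```python
import numpy as np

def floor_point_3d(u, v, f, u0, v0, cam_height):
    """Back-project a pixel (u, v) known to lie on the floor into 3D camera
    coordinates, assuming the optical axis is parallel to the ground plane.

    u, v        : pixel column / row (row index grows downward)
    f           : focal length in pixels (assumed known from calibration)
    u0, v0      : principal point; row v0 is also the horizon line here
    cam_height  : height of the camera above the floor (metres)
    """
    if v <= v0:
        raise ValueError("Pixel is at or above the horizon; not a floor point.")
    Z = f * cam_height / (v - v0)       # depth along the optical axis
    X = (u - u0) * Z / f                # lateral offset
    Y = cam_height                      # the floor lies cam_height below the camera
    return np.array([X, Y, Z])

# Example: a floor pixel 80 rows below the horizon, f = 500 px, camera 1.2 m high
# gives a depth of 500 * 1.2 / 80 = 7.5 m.
print(floor_point_3d(u=400, v=320, f=500.0, u0=320.0, v0=240.0, cam_height=1.2))
```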

General Approach
[Images: original input image → detected floor boundary → 3D reconstruction]

General Approach (2)
Prior knowledge about indoor scenes + machine learning.
Pipeline: image analysis → floor boundary detection (machine learning) → 3D reconstruction.

Floor Boundary Detection
[Image panels: input image; magnitude of the image gradient; difference in chromatic space; difference from the floor color]
How can we combine these image features for floor boundary detection?
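The slides do not give exact feature definitions, so the NumPy sketch below only approximates the three cues; the rg-chromaticity representation and the use of the bottom image rows to estimate the floor color are assumptions made here.

```python
import numpy as np

def floor_features(rgb):
    """Per-pixel features loosely mirroring the three cues on the slide.
    `rgb` is an H x W x 3 float array in [0, 1]."""
    gray = rgb.mean(axis=2)
    gy, gx = np.gradient(gray)
    grad_mag = np.hypot(gx, gy)                      # 1) magnitude of the image gradient

    s = rgb.sum(axis=2, keepdims=True) + 1e-6
    chroma = rgb[..., :2] / s                        # rg chromaticity (intensity-free)
    cy = np.abs(np.diff(chroma, axis=0, prepend=chroma[:1]))
    cx = np.abs(np.diff(chroma, axis=1, prepend=chroma[:, :1]))
    chroma_diff = (cy + cx).sum(axis=2)              # 2) local difference in chromatic space

    floor_ref = chroma[-20:].reshape(-1, 2).mean(0)  # assume the bottom rows are floor
    floor_dist = np.linalg.norm(chroma - floor_ref, axis=2)  # 3) difference from the floor color

    return np.stack([grad_mag, chroma_diff, floor_dist], axis=-1)
```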

Floor Boundary Detection
Using logistic regression (Martin et al., 2002), we develop and train a logistic model of the probability that a point in the image is part of the edge of the floor, based on the evaluation of local feature functions.
[Image panels: input image; training mask]
The model was trained on 25 labeled images covering a diverse range of indoor environments on Stanford's campus.
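A sketch of this training step, using scikit-learn's logistic regression in place of whatever implementation the original work used; the function names, the feature format (from the earlier sketch), and the balanced class weighting are choices made here, not details from the slides.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_boundary_classifier(feature_maps, boundary_masks):
    """Fit a per-pixel logistic model P(boundary | features).

    feature_maps   : list of H x W x K arrays (e.g. from floor_features above)
    boundary_masks : list of H x W binary arrays, 1 where a human labelled
                     the floor/wall boundary (the 'training mask' on the slide)
    """
    X = np.concatenate([f.reshape(-1, f.shape[-1]) for f in feature_maps])
    y = np.concatenate([m.reshape(-1) for m in boundary_masks]).astype(int)
    clf = LogisticRegression(max_iter=1000, class_weight="balanced")
    clf.fit(X, y)
    return clf

def boundary_probability(clf, feature_map):
    """Return an H x W map of P(pixel lies on the floor boundary)."""
    h, w, k = feature_map.shape
    p = clf.predict_proba(feature_map.reshape(-1, k))[:, 1]
    return p.reshape(h, w)
```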

Floor Boundary Detection: Results
[Precision-recall curves showing the trade-off between accuracy and noise as the detector threshold varies]
Precision is the fraction of detections that are true positives: precision = true positives / all detections.
Recall is the fraction of true boundary pixels that are detected: recall = true positives / all true boundary pixels.
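These definitions translate directly into code; the small helper below is illustrative only, and the threshold sweep mirrors how the curves on the slide would be traced out.

```python
import numpy as np

def precision_recall(pred, truth):
    """pred, truth: binary H x W masks (detections vs. labelled boundary pixels).
    Precision = true positives / all detections.
    Recall    = true positives / all true boundary pixels."""
    tp = np.logical_and(pred, truth).sum()
    precision = tp / max(pred.sum(), 1)
    recall = tp / max(truth.sum(), 1)
    return precision, recall

# Sweeping a threshold on the detector's probability map yields the curve:
# curve = [precision_recall(prob_map > t, truth) for t in np.linspace(0, 1, 20)]
```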

Bayesian Inference on the Floor Boundary
Can we use prior knowledge about the structure of floors and their boundaries?
[Image: grid of small squares illustrating how the floor boundary evolves from column to column]

Bayesian Inference
[Graphical model: a chain over image columns with nodes Y1, D1, X1, …, YN, DN, XN and a shared node C]
Di : direction of the floor boundary in column i
Yi : position of the floor boundary in column i
Xi : local image features
C : color of the floor

Bayesian Inference (model build-up)
[A sequence of slides builds up the graphical model: the direction chain D1, D2, …, DN; the boundary-position chain Y1, Y2, …, YN; the observed features X1, X2, …, XN; and the shared floor color C. The build starts from the initial distribution of the variables and ends with the Xi supplied by the floor boundary detection algorithm.]

Training / Bayesian Inference
- 60 images of indoor environments in 8 different buildings on Stanford's campus.
- Leave-one-out cross-validation: train on 7 buildings, test on the remaining one.
- Parameters of the density models estimated from training data by maximum likelihood.
- Exact inference on the graph performed with a Viterbi-like algorithm (a sketch follows below).
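The exact potentials of the Bayesian network are not spelled out on the slides, so the dynamic program below is only a simplified stand-in: it keeps the column-by-column chain structure and the Viterbi-style maximization, but replaces the full model (with direction variables Di and floor color C) by the detector's per-column log-probabilities plus an assumed smoothness penalty.

```python
import numpy as np

def map_floor_boundary(log_emission, jump_penalty=0.5):
    """Viterbi-style dynamic program for the most likely floor boundary.

    log_emission : H x W array, log P(boundary passes through row y | column u),
                   e.g. the log of the boundary-probability map from the detector.
    jump_penalty : assumed cost per row of vertical jump between neighbouring
                   columns (a stand-in for the full transition model over Y and D).

    Returns an array of length W with the chosen boundary row per column.
    """
    H, W = log_emission.shape
    rows = np.arange(H)
    # pairwise log-potential: penalise how far the boundary moves between columns
    pairwise = -jump_penalty * np.abs(rows[:, None] - rows[None, :])   # H x H

    score = log_emission[:, 0].copy()            # best score ending at row y in column 0
    backptr = np.zeros((H, W), dtype=int)
    for u in range(1, W):
        candidate = score[:, None] + pairwise    # candidate[y_prev, y]
        backptr[:, u] = np.argmax(candidate, axis=0)
        score = candidate[backptr[:, u], rows] + log_emission[:, u]

    # backtrack the optimal path
    boundary = np.zeros(W, dtype=int)
    boundary[-1] = int(np.argmax(score))
    for u in range(W - 1, 0, -1):
        boundary[u - 1] = backptr[boundary[u], u]
    return boundary
```

Raising jump_penalty makes the recovered boundary smoother at the cost of fidelity to the detector output.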

Results – Floor Boundary Detection
[Result images for floor boundary detection]

3D Reconstruction
[Reconstructed 3D views of example indoor scenes]
Extra material: examples #1, #2, #3, or at http://www.stanford.edu/~edelage/indoor3drecon
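The original reconstruction procedure is not detailed on the slides; the following is a minimal sketch of how a per-column boundary could be turned into floor and wall points under the flat-floor assumptions stated earlier, with f, (u0, v0), and cam_height as assumed calibration inputs.

```python
import numpy as np

def reconstruct_scene(boundary_rows, f, u0, v0, cam_height):
    """Turn a per-column floor boundary into a rough 3D point cloud.

    boundary_rows : array of length W; boundary_rows[u] is the image row of the
                    floor/wall boundary in column u (e.g. from the inference step).
    Returns (floor_pts, wall_pts) in camera coordinates (X right, Y down, Z forward).
    """
    floor_pts, wall_pts = [], []
    for u, vb in enumerate(boundary_rows):
        if vb <= v0:                         # boundary at/above the horizon: skip
            continue
        Z = f * cam_height / (vb - v0)       # depth of the wall base in this column
        X = (u - u0) * Z / f
        floor_pts.append((X, cam_height, Z))  # the wall/floor junction point
        # Pixels above the boundary are treated as lying on a vertical wall,
        # so they share the junction's X and Z and vary only in height.
        for v in range(int(vb)):
            Y = (v - v0) * Z / f
            wall_pts.append((X, Y, Z))
    return np.array(floor_pts), np.array(wall_pts)
```

Interior floor pixels below the boundary could be back-projected the same way as in the earlier floor_point_3d sketch; they are omitted here for brevity.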

Performance
[Plots: precision of the floor boundary in image segmentation; precision of the floor boundary in 3D localization]

Conclusion
Monocular 3D reconstruction is a good example of an ambiguous problem that can be resolved using prior knowledge about the domain.
The presented Bayesian network is highly effective at learning the prior knowledge needed for this application.
This is the first autonomous algorithm for depth recovery in rich, textured indoor scenes.

Future Work
Apply graphical modeling to more complex geometry.
Formulate the problem so that precision scales with the depth of objects.
Embed this approach in a real robot navigation problem (e.g., an RC car, nighttime indoor navigation).

Questions?
Erick Delage, Stanford University, 2005

References
Criminisi, A., Reid, I., & Zisserman, A. (2000). Single View Metrology. IJCV, 40, 123-148.
Martin, D. R., Fowlkes, C. C., & Malik, J. (2002). Learning to Detect Natural Image Boundaries Using Brightness and Texture. NIPS.
Trucco, E., & Verri, A. (1998). Introductory Techniques for 3-D Computer Vision. Prentice Hall.
Zhang, R., Tsai, P.-S., Cryer, J. E., & Shah, M. (1999). Shape from Shading: A Survey. IEEE Trans. on PAMI, 21, 690-706.