Recovering Surface Layout from a Single Image. D. Hoiem, A.A. Efros, M. Hebert (Robotics Institute, CMU). Presenter: Derek Hoiem. CS 598, Spring 2009. Jan 29, 2009

Why worry about 3D scenes?

Reason 1: We may want to interact with the scene (Navigation, Manipulation)

Reason 2: We need context

2D Object Detection

What the 2D Detector Sees

Computers need context too (legend: True Detection, True Detections Missed, False Detections). Local Detector: [Dalal-Triggs 2005]

Context in Image Space [Kumar Hebert 2005] [Torralba Murphy Freeman 2004] [He Zemel Carreira-Perpiñán 2004]

We need 3D info to reason about 3D relationships (Close vs. Not Close)

How to represent scene space?

Holistic Scene Space: “Gist” [Oliva & Torralba 2001] [Torralba & Oliva 2002]

How to represent scene space? Depth Map [Saxena, Chung & Ng 2005, 2007]

Gibson’s Surface Layout (slide from Aude Oliva). Gibson: “The elementary impressions of a visual world are those of surface and edge.” The Perception of the Visual World (1950). Focus on texture gradients.

Gibson’s Surface Layout, cont. (slide from Aude Oliva)

Marr’s 2½-D Sketch (figures from Aude Oliva slide)

Surface Layout (this paper)
Goal: Label image into 7 Geometric Classes:
Support
Vertical
– Planar: facing Left, Center, Right
– Non-planar: Solid (X), Porous or wiry (O)
Sky

Our Main Challenge: recovering 3D geometry from a single 2D projection. Infinite number of possible solutions! …

Our World is Structured (panels: Abstract World, Our World). Image credit (left): F. Cunin and M.J. Sailor, UCSD

Most Early Work Tried to Manually Specify the Structure: Hansen & Riseman 1978 (VISIONS); Barrow & Tenenbaum 1978 (Intrinsic Images); Brooks 1979 (ACRONYM); Marr 1982 (2½-D Sketch); Ohta & Kanade 1978; Guzman 1968

Learn the Structure of the World …

Infer Most Likely Scene (panels: Unlikely, Likely)

1. Use All Available Cues: vanishing points and lines; color, texture, image location; texture gradient

Use All Available Cues

2. Get Good Spatial Support (example: 50x50 patch)

Image Segmentation: a single segmentation won’t work. Solution: multiple segmentations …

Labeling Segments: for each segment, get P(good segment | data) and P(label | good segment, data)

Image Labeling: from labeled segmentations … to labeled pixels
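A minimal sketch of how the two per-segment quantities above might be turned into labeled pixels, assuming each pixel sums P(good segment | data) × P(label | good segment, data) over every segment that contains it and then normalizes. The function name, the dictionary layout, the final normalization, and the toy numbers are all assumptions for illustration, not taken from the slides.

```python
from collections import defaultdict

LABELS = ["support", "vertical", "sky"]


def pixel_label_confidences(segmentations, num_pixels, labels):
    """Combine several labeled segmentations into per-pixel label confidences.

    Each segmentation is a list of segments; each segment is a dict with
      "pixels":  pixel indices covered by the segment,
      "p_good":  P(good segment | data),
      "p_label": {label: P(label | good segment, data)}.
    """
    conf = [defaultdict(float) for _ in range(num_pixels)]
    for segmentation in segmentations:
        for seg in segmentation:
            for px in seg["pixels"]:
                for label in labels:
                    # Weight the segment's label belief by how trustworthy the segment is.
                    conf[px][label] += seg["p_good"] * seg["p_label"].get(label, 0.0)

    # Normalize so each pixel's confidences sum to 1 (an assumption of this sketch).
    result = []
    for c in conf:
        total = sum(c[label] for label in labels)
        if total > 0:
            result.append({label: c[label] / total for label in labels})
        else:
            result.append({label: 1.0 / len(labels) for label in labels})
    return result


# Toy example: 4 pixels, two different segmentations of the same image.
segmentations = [
    [
        {"pixels": [0, 1], "p_good": 0.9, "p_label": {"support": 0.8, "vertical": 0.2}},
        {"pixels": [2, 3], "p_good": 0.5, "p_label": {"vertical": 0.6, "sky": 0.4}},
    ],
    [
        {"pixels": [0], "p_good": 0.6, "p_label": {"support": 0.9, "vertical": 0.1}},
        {"pixels": [1, 2, 3], "p_good": 0.3, "p_label": {"vertical": 0.7, "sky": 0.3}},
    ],
]

confidences = pixel_label_confidences(segmentations, 4, LABELS)
for px, c in enumerate(confidences):
    print(px, max(c, key=c.get))
```

Pixel 1 falls in a confident "support" segment in one segmentation and a weak "vertical" segment in the other; the weighting lets the more reliable segment dominate, which is the redundancy argument the talk makes for multiple segmentations.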

Decision Trees + Adaboost [Collins et al. 2002]. Example tree questions (Yes/No branches): Gray? High in Image? Many Long Lines? Very High Vanishing Point? Smooth? Green? Blue? Outputs: Ground, Vertical, Sky
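The slide names decision trees boosted with Adaboost [Collins et al. 2002]. As a rough illustration only, the sketch below runs plain discrete AdaBoost over one-question trees (stumps) on made-up data; it is a simplified stand-in, not the classifier used in the paper. The two features (height in image, blueness), the toy data, and all function names are hypothetical.

```python
import math


def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best one-question tree."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for polarity in (1, -1):
                preds = [polarity if row[f] > thr else -polarity for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, polarity)
    return best


def stump_predict(stump, row):
    _, f, thr, polarity = stump
    return polarity if row[f] > thr else -polarity


def adaboost(X, y, rounds=5):
    """Discrete AdaBoost: reweight examples so later stumps focus on earlier mistakes."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # clamp to avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def score(ensemble, row):
    """Weighted vote of all stumps; positive means 'sky' in this toy setup."""
    return sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)


# Hypothetical features per segment: [height in image (0..1), blueness (0..1)].
X = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.7], [0.3, 0.2], [0.2, 0.6], [0.1, 0.1]]
y = [1, 1, 1, -1, -1, -1]  # +1 = sky, -1 = not sky

ensemble = adaboost(X, y)
print(score(ensemble, [0.85, 0.5]) > 0)  # a high segment
print(score(ensemble, [0.15, 0.9]) > 0)  # a low but blue segment
```

On this toy set the "High in Image?" question alone separates the classes, so every round picks the same stump; real cues are noisier, which is where combining many weak questions pays off.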

Surface Confidence Maps for a test image: P(Support), P(Vertical), P(Sky), P(Planar Left), P(Planar Center), P(Planar Right), P(Non-Planar Porous), P(Non-Planar Solid)
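Assuming the final label is simply the most confident class (a decision rule the slide does not state), the confidence maps could be read off per pixel as below. The class identifiers and the function `label_pixel` are hypothetical names; subclasses are consulted only when the main class is vertical, following the class hierarchy given earlier in the talk.

```python
MAIN_CLASSES = ("support", "vertical", "sky")
VERTICAL_SUBCLASSES = (
    "planar_left",
    "planar_center",
    "planar_right",
    "nonplanar_porous",
    "nonplanar_solid",
)


def label_pixel(p_main, p_sub):
    """Pick the most confident main class, and a subclass only for vertical pixels.

    p_main: {main class: confidence}, p_sub: {vertical subclass: confidence}.
    """
    main = max(MAIN_CLASSES, key=lambda c: p_main[c])
    if main != "vertical":
        return main, None
    sub = max(VERTICAL_SUBCLASSES, key=lambda c: p_sub[c])
    return main, sub


# Toy confidences for two pixels (made-up numbers).
wall_pixel = label_pixel(
    {"support": 0.1, "vertical": 0.8, "sky": 0.1},
    {"planar_left": 0.5, "planar_center": 0.2, "planar_right": 0.1,
     "nonplanar_porous": 0.1, "nonplanar_solid": 0.1},
)
sky_pixel = label_pixel(
    {"support": 0.05, "vertical": 0.15, "sky": 0.8},
    {"planar_left": 0.2, "planar_center": 0.2, "planar_right": 0.2,
     "nonplanar_porous": 0.2, "nonplanar_solid": 0.2},
)
print(wall_pixel, sky_pixel)
```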

Experiments: Input Image

Experiments: Ground Truth

Experiments: Our Result

Surface Estimates: Outdoor (panels: Input Image, Ground Truth, Our Result). Avg. Accuracy: Main Class 88%, Subclass 62%

Surface Estimates: Outdoor (panels: Input Image, Ground Truth, Our Result)

Surface Estimates: Paintings (panels: Input Image, Our Result)

Surface Estimates: Indoor (panels: Input Image, Ground Truth, Our Result). Avg. Accuracy: Main Class 93%, Subclass 76%

Failures: Reflections and Shadows (panels: Input Image, Our Result)

Average Accuracy: Main Class 88%, Subclasses 61%
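A small sketch of how two such numbers could be computed, assuming accuracy is the fraction of pixels labeled correctly and that subclass accuracy is measured only over pixels whose ground truth is vertical. Both are assumptions of this sketch; the slides give only the final percentages. The function name and toy labels are hypothetical.

```python
def main_and_subclass_accuracy(pred, truth):
    """pred and truth are per-pixel lists of (main_class, subclass_or_None)."""
    main_correct = sum(p[0] == t[0] for p, t in zip(pred, truth))
    main_acc = main_correct / len(truth)

    # Subclasses exist only for vertical surfaces, so score them on those pixels.
    vertical = [(p, t) for p, t in zip(pred, truth) if t[0] == "vertical"]
    if vertical:
        sub_acc = sum(p[1] == t[1] for p, t in vertical) / len(vertical)
    else:
        sub_acc = float("nan")
    return main_acc, sub_acc


truth = [
    ("support", None),
    ("vertical", "planar_left"),
    ("vertical", "nonplanar_solid"),
    ("sky", None),
]
pred = [
    ("support", None),
    ("vertical", "planar_left"),
    ("vertical", "planar_right"),
    ("vertical", "planar_left"),
]

main_acc, sub_acc = main_and_subclass_accuracy(pred, truth)
print(main_acc, sub_acc)  # 0.75 0.5
```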

Importance of Many Cues
Columns: All, Position Only, Color Only, Texture Only, Perspective Only
Main: 88%, 83%, 72%, 80%, 68%
Subclass: 61%, 43%, 55%, 52%
Columns: All, All But Position, All But Color, All But Texture, All But Perspective
Main: 88%, 84%, 87%, 88%
Subclass: 61%, 60%, 58%, 57%

Importance of Many Cues

Spatial Support Matters

Automatic Photo Popup [Hoiem Efros Hebert 2005]. Steps: Labeled Image; Fit Ground-Vertical Boundary with Line Segments; Form Segments into Polylines; Cut and Fold; Final Pop-up Model

Surfaces Not Enough – Need Occlusion Reasoning (panels: Image, Surface Labels, 3D Model)

Surfaces + Occlusions + Objects = Better 3D Models. Diagram labels: Surfaces; Occlusions; Objects and Viewpoint; Support; Surface Maps; Depth, Boundaries; Horizon, Object Maps; Viewpoint/Size Reasoning

video 2

Contributions
General principles
– Learn the structure of the world
– Use all available cues
– Spatial support matters
– Use redundancy to deal with unreliable processes (segmentation)
Results include entire spread of failure and success
First work to convincingly demonstrate single-view reconstruction

Criticisms
– Still just 2D pattern recognition?
– Not clear how to generalize to arbitrary 3D angles
– Restricted to visible portion of scene
– Coarse layout: not clear if applicable to personal space or object shapes

Ideas for improvement
– Try improving features (e.g., add bag of words)
– Extend to characterize object shapes?
– Combine this surface-based layout with depth estimates from Saxena et al.

Discussion
– Use for context (Eamon)
– Multiple segmentations (Duan, Sanketh)
– Subcategories (Duan, Sanketh)
– Global info, use of object knowledge (Binbin)
– Combination with multiview cues (Mani)
– Landmarks (Gang)

Thank you

Things to cover when you present
– Background
– Overview of method
– Results
– Things you like
– Things you don’t
– Ideas for improvement
– Address bulletin board postings
