Perceptual real-time 2D-to-3D conversion using cue fusion

Presentation transcript:

Perceptual real-time 2D-to-3D conversion using cue fusion
Thomas Leimkühler, Petr Kellnhofer, Tobias Ritschel, Karol Myszkowski, Hans-Peter Seidel

Stereo 3D
Stereo 3D has become a significant part of visual media production; 3D television has arrived. Problem 1: viewing discomfort. Solution: careful content production and improved hardware technology. Problem 2: increased production costs compared to 2D. Solution: 2D-to-3D conversion.
Image source: http://vr-zone.com/articles/sony-and-panasonic-to-use-lgs-3d-technology-in-their-tvs/15416.html
Leimkühler et al.: Perceptual real-time 2D-to-3D conversion using cue fusion. GI 2016, Victoria/Canada, June 1st – 3rd, 2016

2D-to-3D Conversion
The least expensive method for producing Stereo 3D, and the only method that can deal with 2D legacy content. Ideally it runs at real-time performance, i.e., on-the-fly conversion: mono image → disparity map → stereo image.
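To make the disparity-to-stereo step concrete, here is a minimal NumPy sketch (not the paper's implementation) of forward-warping a mono image into a second view by shifting each pixel by its disparity; `synthesize_stereo` and its naive hole filling are invented for illustration:

```python
import numpy as np

def synthesize_stereo(mono, disparity):
    """Forward-warp: shift each pixel horizontally by its disparity to
    form the second view. Real systems resolve overlaps by depth
    ordering and fill holes more carefully; this is the bare idea."""
    h, w = mono.shape
    right = np.zeros_like(mono)
    filled = np.zeros((h, w), dtype=bool)
    for y in range(h):
        for x in range(w):
            xt = x + int(disparity[y, x])
            if 0 <= xt < w:
                right[y, xt] = mono[y, x]
                filled[y, xt] = True
    # Naive hole filling: copy the nearest filled value from the left.
    for y in range(h):
        for x in range(1, w):
            if not filled[y, x]:
                right[y, x] = right[y, x - 1]
    return right
```

A production renderer would also handle sub-pixel disparities and occlusion ordering; this sketch only shows why a per-pixel disparity map suffices to generate the second eye's image.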

2D-to-3D Conversion
There is massive research on accurate 3D reconstruction, but it is a very hard problem, far from being solved; higher-quality systems take minutes per frame. Exact 3D reconstruction, however, is not necessary for Stereo 3D: we can relax the problem by exploiting the limits of human stereo perception, which low-pass filters disparity in space and time, except at luminance discontinuities [Kane et al. 2014, Kellnhofer et al. 2015].

Plausible 2D-to-3D Conversion
The depth estimated from a mono input is distorted, i.e., not equal to the ground-truth depth. Yet the stereo image generated from the distorted depth can be perceptually equivalent to the stereo image generated from ground-truth depth.

Monocular Depth Cues
Aerial perspective, defocus, perspective, occlusion, motion parallax, …

The Idea
Use several monocular cues to produce binocular disparity. Wish list: a real-time system (efficient inference, GPU processing); robust fusion (resolve contradicting cues); spatial and temporal coherence (long-range exchange of information); a probabilistic model (per-pixel normal distributions of disparity) enabling confidence-aware processing.

Pipeline
1. Learning Disparity Priors (pre-process) → 2. Cue Extraction → 3. Cue Fusion → 4. Stereo Image Generation (runtime)

Disparity Priors
Acquired from our own stereo database (publicly available). Conditioned on: scene class (Close-up, Coast, Forest, Indoor, Inside City, Mountain, Open Country, Portrait, Street, Tall Buildings), location in the image plane, and appearance. An SVM is trained for scene classification. [Examples: “Close-up”, “Forest”]
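The class- and location-conditioned prior can be sketched as a per-class mean and variance over a database of disparity maps; `build_priors` is a hypothetical helper, not the paper's code, and the variance stands in for the (inverse) confidence of the per-pixel normal-distribution model:

```python
import numpy as np

def build_priors(disparity_maps, labels):
    """Per-class disparity prior: mean and variance at each image
    location, over a database of ground-truth disparity maps.
    disparity_maps: (N, H, W) array; labels: length-N scene-class ids."""
    labels = np.asarray(labels)
    priors = {}
    for c in np.unique(labels):
        maps = disparity_maps[labels == c]
        # Mean disparity and its spatial variance per scene class.
        priors[int(c)] = (maps.mean(axis=0), maps.var(axis=0))
    return priors
```

At runtime, the scene classifier selects which class's (mean, variance) pair serves as the prior term in the fusion.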

Disparity Priors
[Figure: appearance samples with mean-disparity and confidence maps for the classes “Open Country” and “Portrait”]

Pipeline
1. Learning Disparity Priors (pre-process) → 2. Cue Extraction → 3. Cue Fusion → 4. Stereo Image Generation (runtime)

Defocus
[Figure: input → Laplacian pyramid → disparity mean and confidence]
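A minimal sketch of the defocus idea, assuming box filters in place of the Gaussians of a true Laplacian pyramid: in-focus regions retain high-frequency energy at fine pyramid levels, blurred regions do not. `defocus_cue` is an invented illustration, not the paper's extractor:

```python
import numpy as np

def box_blur(a, r=1):
    """Box filter with edge padding; stands in for a Gaussian."""
    k = 2 * r + 1
    pad = np.pad(a, r, mode='edge')
    out = np.zeros_like(a, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += pad[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out / (k * k)

def defocus_cue(img, levels=3):
    """Per-pixel high-frequency energy across a small pyramid."""
    sharpness = np.zeros(img.shape, dtype=float)
    current = img.astype(float)
    for level in range(levels):
        blurred = box_blur(current)
        sharpness += np.abs(current - blurred) / (level + 1)
        current = blurred
    return sharpness  # higher = sharper = (heuristically) nearer
```

The confidence of such a cue is naturally low in textureless regions, where no sharpness evidence exists at any scale.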

Aerial Perspective
[Figure: input → disparity mean and confidence]
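The aerial-perspective cue rests on atmospheric scattering washing out distant regions. A dark-channel-style heuristic (bright and desaturated suggests far), sketched here with an invented `aerial_perspective_cue` and not the paper's exact model:

```python
import numpy as np

def aerial_perspective_cue(rgb):
    """Bright, desaturated pixels suggest distance.
    rgb: (H, W, 3) floats in [0, 1]. Returns a relative 'farness' proxy."""
    mx = rgb.max(axis=2)
    mn = rgb.min(axis=2)
    saturation = np.where(mx > 0, (mx - mn) / np.maximum(mx, 1e-6), 0.0)
    return mx * (1.0 - saturation)  # bright and gray -> hazy -> far
```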

Vanishing Points
[Figure: input → line accumulation → disparity mean and confidence]
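Once a vanishing point has been found by accumulating line segments (the Hough-style detection is omitted here), it induces a depth gradient: scene surfaces recede toward it. A minimal sketch with the invented `vp_depth_cue`, assuming the vanishing point is already known:

```python
import numpy as np

def vp_depth_cue(shape, vp):
    """Assign larger depth to pixels closer to the vanishing point.
    shape: (H, W); vp: (x, y) in image coordinates."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    dist = np.hypot(xs - vp[0], ys - vp[1])
    return 1.0 - dist / dist.max()  # 1 at the VP (far), 0 at the farthest pixel
```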

Occlusion
[Figure: input → T-junction detection with a filter bank → disparity mean and confidence]

Motion
[Figure: input → optical flow → disparity mean and confidence]
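The motion-parallax cue can be sketched as follows: under a translating camera, image motion is inversely proportional to depth, so optical-flow magnitude serves as a nearness proxy. `motion_cue` is an invented illustration, assuming the flow field is already computed:

```python
import numpy as np

def motion_cue(flow):
    """Flow magnitude as a 'nearness' proxy.
    flow: (H, W, 2) optical-flow vectors."""
    mag = np.linalg.norm(flow, axis=2)
    return mag / max(mag.max(), 1e-6)  # normalized: 1 = fastest = nearest
```

This proxy is only valid for (roughly) translational camera motion; rotational flow components carry no depth information, which is one reason the cue carries its own confidence.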

Pipeline
1. Learning Disparity Priors (pre-process) → 2. Cue Extraction → 3. Cue Fusion → 4. Stereo Image Generation (runtime)

Step 1: Maximum Likelihood Estimation
Each cue $i$ provides a mean disparity $\mu_i$ and a confidence $\sigma_i^{-2}$; the cues are fused by precision weighting:
$\mu_{\mathrm{MLE}}(\mathbf{x}) = \frac{1}{Z(\mathbf{x})} \sum_{i=1}^{n_c} \sigma_i^{-2}(\mathbf{x})\,\mu_i(\mathbf{x})$
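The precision-weighted fusion above is a one-liner per pixel; a minimal NumPy sketch (`fuse_mle` is an invented name):

```python
import numpy as np

def fuse_mle(means, sigmas):
    """Precision-weighted fusion of per-pixel Gaussian cue estimates:
    each cue mean is weighted by its inverse variance. Returns the
    fused mean and the fused variance (the new confidence).
    means, sigmas: lists of (H, W) arrays."""
    weights = [s ** -2.0 for s in sigmas]
    Z = np.sum(weights, axis=0)                  # normalization Z(x)
    mu = np.sum([w * m for w, m in zip(weights, means)], axis=0) / Z
    return mu, 1.0 / Z
```

Appending the learned prior (mean μ₀, confidence σ₀⁻²) as one more entry in the lists turns this maximum-likelihood estimate into the maximum-a-posteriori estimate of the next step.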

Step 2: Maximum a Posteriori Estimation
The learned prior (mean $\mu_0$, confidence $\sigma_0^{-2}$) enters as prior evidence alongside the cue evidence:
$\mu_{\mathrm{MAP}}(\mathbf{x}) = \frac{1}{Z(\mathbf{x})} \left( \sigma_0^{-2}(\mathbf{x})\,\mu_0(\mathbf{x}) + \sum_{i=1}^{n_c} \sigma_i^{-2}(\mathbf{x})\,\mu_i(\mathbf{x}) \right)$

Step 3: Robust Estimation
[Figure: per-cue disparity estimates (prior, aerial perspective, defocus, vanishing point, occlusion, motion) on a far-to-near scale, comparing plain MAP estimation with robust MAP estimation]

Step 4: Pairwise Estimation
The robust MAP disparity and confidence are aggregated over the entire space-time domain $\Omega$:
$\mu(\mathbf{x}) = \frac{1}{Z(\mathbf{x})} \int_\Omega v(\mathbf{x},\mathbf{y})\,\sigma_{\mathrm{MAP}}^{-2}(\mathbf{y})\,\mu_{\mathrm{MAP}}(\mathbf{y})\,\mathrm{d}\mathbf{y}$
where the kernel $v(\mathbf{x},\mathbf{y}) = \mathcal{N}(\lVert\mathbf{x}-\mathbf{y}\rVert, \sigma_d) \cdot \mathcal{N}(\lVert I(\mathbf{x})-I(\mathbf{y})\rVert, \sigma_r)$ is related to perceptual findings.
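The kernel $v$ is a cross-bilateral weight: a spatial Gaussian times an image-range Gaussian, so smoothing stops at luminance edges. A brute-force single-frame sketch over a small window (the talk integrates over the full space-time domain, and `bilateral_fuse` with its parameters is an invented illustration):

```python
import numpy as np

def bilateral_fuse(mu, var, img, sigma_d=2.0, sigma_r=0.2):
    """Confidence-weighted cross-bilateral aggregation of disparity.
    mu, var: robust MAP disparity mean/variance; img: guide image."""
    h, w = mu.shape
    out = np.zeros_like(mu)
    r = int(2 * sigma_d)  # truncate the spatial kernel
    for y in range(h):
        for x in range(w):
            num = den = 0.0
            for yy in range(max(0, y - r), min(h, y + r + 1)):
                for xx in range(max(0, x - r), min(w, x + r + 1)):
                    # Spatial and image-range Gaussian weights.
                    wd = np.exp(-((y - yy) ** 2 + (x - xx) ** 2) / (2 * sigma_d ** 2))
                    wr = np.exp(-((img[y, x] - img[yy, xx]) ** 2) / (2 * sigma_r ** 2))
                    wgt = wd * wr / max(var[yy, xx], 1e-6)  # confidence weighting
                    num += wgt * mu[yy, xx]
                    den += wgt
            out[y, x] = num / den
    return out
```

The range kernel keeps the disparity edge aligned with the luminance edge, matching the perceptual finding that disparity may be low-pass filtered everywhere except at luminance discontinuities.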

Pipeline
1. Learning Disparity Priors (pre-process) → 2. Cue Extraction → 3. Cue Fusion → 4. Stereo Image Generation (runtime)

Results
[Figure: example conversions with the dominant priors (“Tall Buildings”, “Mountain”, “Forest”) and cues (vanishing point, defocus, occlusion, aerial perspective) per scene]

Results
[Figure: “Forest” prior with the aerial-perspective cue]

Results
[Figure: “Street” prior with defocus, motion, and aerial-perspective cues]

Evaluation
Perceptual study: we outperform existing real-time 2D-to-3D conversion systems and achieve similar (and sometimes better) user preference compared to offline methods. Quantitative evaluation: very similar results across the tested methods; no reliable quality metric for Stereo 3D content exists.

Conclusion
Real-time 2D-to-3D conversion can be successful if we aim at reconstructing plausible (rather than exact) disparity, combine multiple sources of information, and use a simple yet expressive probabilistic model that allows for parallel inference.

Results Gallery & Prior Database: http://resources.mpi-inf.mpg.de/StereoCueFusion
Acknowledgments: Adam Laskowski, Dushyant Mehta, Elena Arabadzhiyska, Krzysztof Templin, and Waqar Khan
Contact: tleimkueh@mpi-inf.mpg.de
Thank you!