THE UNIVERSITY OF BRITISH COLUMBIA
Random Forests-Based 2D-to-3D Video Conversion
Presenter: Mahsa Pourazad
M. Pourazad, P. Nasiopoulos, and A. Bashashati

Slide 2: Outline. Introduction to 3D TV and 3D content; motivation for 2D-to-3D video conversion; the proposed 2D-to-3D video conversion scheme; conclusions.

Slide 3: Introduction to 3D TV and 3D content. [Diagram] Two routes to 3D content: a stereoscopic dual camera, which produces stereo video directly, and the image-based rendering technique, in which a depth-range camera pairs 2D video with a depth map.

Slide 4: Motivation for 2D-to-3D video conversion. Industry is investing in 3D TV and broadcasting, and Hollywood is already investing in 3D technology. Are we ready for this? No! One of the issues is a lack of content. Converting existing 2D material to 3D lets studios resell existing content (movies, TV series, etc.).

Slide 5: How it works: 3D perception. Monocular depth cues include sharpness, motion, occlusion, texture, perspective, and more.

Slide 6: 2D-to-3D video conversion. A 2D-to-3D converter estimates a depth map from the 2D video using monocular depth cues (motion parallax, sharpness, occlusion, and so on). Properly integrating more monocular depth cues yields a more accurate depth-map estimate, imitating what the human visual system does.

Slide 7: Depth estimation for 2D video using motion. [Diagram] With stereoscopic cameras, a disparity vector relates the left and right views; with a single 2D video camera, a motion vector relates consecutive frames over time. Idea: the motion vector resembles a disparity vector.

Slide 8: Motion-based 2D-to-3D video conversion*. Near objects move faster across the retina than farther objects do. Pipeline: 2D video → motion vectors (MVs) → motion correction (camera motion correction and object-based motion correction, with object-based motion estimation) → non-linear transformation model → estimated depth map. Main issue: estimating depth information for static objects. * Pourazad, M.T., Nasiopoulos, P., and Ward, R.K. (2009), "An H.264-based scheme for 2D to 3D video conversion", IEEE Transactions on Consumer Electronics, vol. 55, no. 2.

Slide 9: Our suggested scheme (integrating multiple monocular depth cues). The 2D video is divided into 4x4 blocks, and features representing monocular depth cues are extracted. First cue: motion parallax, obtained by implementing a block-matching technique between consecutive frames.

Slide 10: Motion parallax depth cue. Disparity over time (motion vectors) is found by implementing the Depth Estimation Reference Software (designed for multiview streams): the 2D video camera plays the role of the center camera, while time-shifted frame sequences (F0 F1 F2 F3; F1 F2 F3 F4; F3 F4 F5 F6; …) stand in for virtual left and right cameras (F: 2D video frames). The disparity estimated over time for each 4x4 block represents the motion-parallax cue.
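The motion-parallax feature ultimately rests on block matching between consecutive frames. As a rough, self-contained stand-in for the Depth Estimation Reference Software used in the slides, here is a minimal exhaustive-search SAD block matcher; the search range and the use of raw motion-vector magnitude as a depth proxy are our assumptions, not the authors' code:

```python
import numpy as np

def block_motion(prev, curr, block=4, search=4):
    """Exhaustive SAD block matching: for each 4x4 block of `curr`,
    find the best-matching block in `prev` within +/-`search` pixels.
    Returns per-block motion-vector magnitudes, used here as a crude
    depth proxy: larger motion suggests a closer object."""
    h, w = curr.shape
    mags = np.zeros((h // block, w // block))
    for by in range(h // block):
        for bx in range(w // block):
            y, x = by * block, bx * block
            ref = curr[y:y+block, x:x+block].astype(np.int32)
            best, best_mv = None, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y + dy, x + dx
                    if yy < 0 or xx < 0 or yy + block > h or xx + block > w:
                        continue  # candidate window falls outside the frame
                    cand = prev[yy:yy+block, xx:xx+block].astype(np.int32)
                    sad = np.abs(ref - cand).sum()  # sum of absolute differences
                    if best is None or sad < best:
                        best, best_mv = sad, (dy, dx)
            mags[by, bx] = np.hypot(*best_mv)
    return mags
```

Under the motion-parallax assumption, blocks with larger magnitudes are treated as nearer to the camera.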

Slide 11: Our suggested scheme (continued). Second cue: texture variation. The surface texture of a textured material is more apparent when it is closer.

Slide 12: Texture variation depth cue. Laws' texture energy masks (L3L3, L3E3, L3S3, E3L3, E3E3, E3S3, S3L3, S3E3, S3S3) are applied to each 4x4 block's luma information (I: luma (Y) of each 4x4 block; F: Laws' mask). A feature set with 18 components represents the texture-variation depth cue for each 4x4 block.
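A minimal NumPy sketch of the Laws'-mask feature extraction above. The nine 3x3 masks are outer products of the 1-D Level, Edge and Spot kernels; the slides only give the 18-component count, so splitting each mask's response into an absolute-sum and a squared-sum energy is our assumption:

```python
import numpy as np

# 1-D Laws kernels: Level, Edge, Spot
L3 = np.array([1, 2, 1]); E3 = np.array([-1, 0, 1]); S3 = np.array([-1, 2, -1])

def laws_masks():
    """The nine 3x3 Laws texture masks as outer products of L3/E3/S3."""
    kernels = [L3, E3, S3]
    return [np.outer(a, b) for a in kernels for b in kernels]

def conv2_same(img, k):
    """Minimal 'same'-size 2-D convolution with zero padding (no SciPy)."""
    kh, kw = k.shape
    pad = np.pad(img, ((kh // 2,) * 2, (kw // 2,) * 2))
    out = np.zeros_like(img, dtype=float)
    kf = np.flipud(np.fliplr(k))  # flip kernel for true convolution
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            out[y, x] = (pad[y:y+kh, x:x+kw] * kf).sum()
    return out

def texture_features(luma_block):
    """18 texture-energy features per block: for each of the 9 Laws masks,
    the sum of absolute responses and the sum of squared responses."""
    feats = []
    for m in laws_masks():
        r = conv2_same(luma_block.astype(float), m)
        feats += [np.abs(r).sum(), (r ** 2).sum()]
    return np.array(feats)
```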

Slide 13: Our suggested scheme (continued). Third cue: haze. Distant objects appear less distinct and more bluish than nearby objects because of atmospheric haze.

Slide 14: Haze depth cue. Haze is reflected in the low-frequency information of the chroma channels (U and V). The L3L3 Laws mask (local averaging) is applied to each 4x4 block's chroma information (C: chroma (U and V) of each 4x4 block; F: Laws' mask). A feature set with 4 components represents the haze depth cue for each 4x4 block.
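The haze feature can be sketched the same way as the texture feature, but with only the low-pass L3L3 mask and applied to chroma. The slides give only the 4-component count; splitting it as absolute-sum and squared-sum energy for each of U and V is our assumption:

```python
import numpy as np

L3L3 = np.outer([1, 2, 1], [1, 2, 1])  # 3x3 local-averaging Laws mask

def haze_features(u_block, v_block):
    """4-component haze feature for one 4x4 block: low-frequency chroma
    energy from the L3L3 mask on U and V (abs-sum and squared-sum each)."""
    def energies(c):
        c = c.astype(float)
        pad = np.pad(c, 1)          # zero padding; L3L3 is symmetric, no flip needed
        resp = np.zeros_like(c)
        for y in range(c.shape[0]):
            for x in range(c.shape[1]):
                resp[y, x] = (pad[y:y+3, x:x+3] * L3L3).sum()
        return [np.abs(resp).sum(), (resp ** 2).sum()]
    return np.array(energies(u_block) + energies(v_block))
```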

Slide 15: Our suggested scheme (continued). Fourth cue: perspective. The more lines converge, the farther away they appear to be. The Radon transform is applied to the luma information of each block (θ ∈ {0°, 30°, 60°, 90°, 120°, 150°}), and the amplitude and phase of the most dominant edge are selected: a feature set with 2 components.
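To keep the sketch self-contained, the Radon transform below is a coarse nearest-bin reimplementation rather than a library call, and using the peak projection value as the "amplitude" of the dominant edge is our simplification of the slide's amplitude-and-phase feature:

```python
import numpy as np

def radon_projection(block, theta_deg):
    """Project block intensities onto the axis perpendicular to theta,
    accumulating into integer bins (a coarse discrete Radon transform)."""
    h, w = block.shape
    t = np.deg2rad(theta_deg)
    ys, xs = np.mgrid[0:h, 0:w]
    # signed distance of each pixel centre from the line through the block centre
    r = (xs - (w - 1) / 2) * np.cos(t) + (ys - (h - 1) / 2) * np.sin(t)
    bins = np.round(r).astype(int)
    bins -= bins.min()
    proj = np.zeros(bins.max() + 1)
    np.add.at(proj, bins.ravel(), block.ravel().astype(float))
    return proj

def perspective_features(luma_block, angles=(0, 30, 60, 90, 120, 150)):
    """2-component perspective feature: amplitude (peak projection value)
    and orientation of the most dominant edge over six angles."""
    best_amp, best_angle = -np.inf, None
    for a in angles:
        amp = radon_projection(luma_block, a).max()
        if amp > best_amp:
            best_amp, best_angle = amp, a
    return np.array([best_amp, best_angle], dtype=float)
```

A vertical line in the block concentrates all its mass into one bin at θ = 0°, so that angle wins.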

Slide 16: Our suggested scheme (continued). Fifth cue: vertical coordinate. In general, objects closer to the bottom border of the image are closer to the viewer. The feature set includes the vertical spatial coordinate of each 4x4 block (as a percentage of the frame's height).

Slide 17: Our suggested scheme (continued). Sixth cue: sharpness. Closer objects appear sharper. The sharpness of each 4x4 block is measured by implementing the diagonal Laplacian method*. * A. Thelen, S. Frey, S. Hirsch, and P. Hering, "Improvements in shape-from-focus for holographic reconstructions with regard to focus operators, neighborhood-size, and height value interpolation", IEEE Trans. on Image Processing, vol. 18, no. 1, 2009.
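One plausible reading of the diagonal Laplacian focus measure of Thelen et al. is a modified Laplacian summed over the horizontal, vertical and both diagonal directions; the exact weights here are our assumption, not the paper's code:

```python
import numpy as np

def diagonal_laplacian_sharpness(luma_block):
    """Sharpness of a block as the sum of absolute second differences
    along horizontal, vertical and both diagonal directions (diagonals
    weighted by 1/sqrt(2) for their longer step length). Better-focused,
    hence typically closer, blocks yield larger values."""
    i = luma_block.astype(float)
    p = np.pad(i, 1, mode='edge')   # replicate borders so flat blocks score 0
    c = p[1:-1, 1:-1]
    horiz = np.abs(2 * c - p[1:-1, :-2] - p[1:-1, 2:])
    vert  = np.abs(2 * c - p[:-2, 1:-1] - p[2:, 1:-1])
    diag1 = np.abs(2 * c - p[:-2, :-2] - p[2:, 2:]) / np.sqrt(2)
    diag2 = np.abs(2 * c - p[:-2, 2:] - p[2:, :-2]) / np.sqrt(2)
    return float((horiz + vert + diag1 + diag2).sum())
```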

Slide 18: Our suggested scheme (continued). Seventh cue: occlusion. An object that overlaps or partly obscures our view of another object is closer. All feature sets are extracted for each 4x4 patch at three image-resolution levels (1, 1/2, and 1/4), which captures occlusion and makes the features globally accountable.
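The three-resolution extraction can be sketched as follows; the 2x2 box-filter downsampling and the coordinate clipping at frame borders are our assumptions about details the slides do not specify:

```python
import numpy as np

def multiscale_blocks(frame, by, bx, block=4):
    """Return the 4x4 patch covering block (by, bx) at resolutions 1, 1/2
    and 1/4, so each block's features also see progressively wider context,
    a cheap way to expose occlusion relations to the learner."""
    def halve(img):
        h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
        img = img[:h, :w].astype(float)
        # 2x2 box-filter downsampling
        return (img[0::2, 0::2] + img[0::2, 1::2]
                + img[1::2, 0::2] + img[1::2, 1::2]) / 4
    patches = []
    img, y, x = frame.astype(float), by * block, bx * block
    for _ in range(3):
        patches.append(img[y:y+block, x:x+block])
        img = halve(img)
        # clip so border blocks still get a full 4x4 patch at coarse scales
        y = min(y // 2, max(img.shape[0] - block, 0))
        x = min(x // 2, max(img.shape[1] - block, 0))
    return patches
```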

Slide 19: Our suggested scheme (continued). Depth-map model estimation: the 81-dimensional feature vectors of the 4x4 blocks are fed to Random Forests (RF) machine learning. RF is a classification and regression technique consisting of a collection of individual decision trees (DTs)*, each trained on randomly selected input feature vectors. It suits applications where individual DTs do not generalize well to unseen test data but their combined contribution does. Training set: input = feature vectors of 4x4 blocks of key frames whose pixels mostly belong to a common object; output = known depth values. Test set: 4x4 blocks of an unseen video. * L. Breiman, "Random forests", Machine Learning, vol. 45, pp. 5-32, 2001.
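As a sketch of the regression step, scikit-learn's RandomForestRegressor already performs the per-tree bootstrap sampling and random feature selection that Breiman's method prescribes. This is a stand-in, not the authors' code, and the synthetic data (a linear "depth" depending on two of the 81 features) is purely illustrative:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def train_depth_model(features, depths, n_trees=50, seed=0):
    """Fit a random-forest regressor mapping 81-D block feature vectors
    to per-block depth values."""
    rf = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
    rf.fit(features, depths)
    return rf

# Tiny synthetic demonstration: "depth" depends on two of the 81 features
# (think vertical coordinate and motion magnitude); the rest are noise.
rng = np.random.default_rng(0)
X = rng.random((500, 81))
y = 0.7 * X[:, 0] + 0.3 * X[:, 1]
model = train_depth_model(X[:400], y[:400])   # training set
pred = model.predict(X[400:])                 # unseen "video" blocks
```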

Slide 20: Our suggested scheme (continued). The depth map estimated by the RF model is refined with object-based depth information obtained from mean-shift image segmentation*, producing the final depth map. * D. Comaniciu and P. Meer, "Mean shift: a robust approach toward feature space analysis", IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, no. 5, 2002.
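The fusion step can be sketched as assigning every block inside one segment that segment's median estimated depth, so depth is constant within each object. Using the median (rather than, say, the mean) is our assumption; any segmenter producing integer labels works here, mean shift being the one the slides use:

```python
import numpy as np

def object_based_depth(block_depths, segment_labels):
    """Smooth per-block RF depth estimates with a segmentation map:
    each segment gets its median estimated depth everywhere."""
    out = np.empty_like(block_depths, dtype=float)
    for lab in np.unique(segment_labels):
        mask = segment_labels == lab
        out[mask] = np.median(block_depths[mask])
    return out
```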

Slide 21: Experiments. Training sequences and test sequences (shown as video frames).

Slide 22: Results. Columns: 2D video, available depth map, existing motion-based technique, our proposed technique. Subjective test (ITU-R BT): 18 people graded the stereo videos from 1 to 10.

Slide 23: Conclusions. A new and efficient 2D-to-3D video conversion method was presented. The method uses random-forest regression to estimate the depth-map model from multiple depth cues. Performance evaluations show that our approach outperforms a state-of-the-art motion-based method. The subjective visual quality of the generated 3D streams was also confirmed by viewing them on a stereoscopic display. Our method runs in real time and can be implemented at the receiver side without adding any burden to the network.