Implementation of a Biologically Inspired Stereoscopic Vision Model in C+CUDA Most of the depth perception processing is done in the visual cortex, mainly.

Slides:



Advertisements
Similar presentations
Chapter 4: The Visual Cortex and Beyond
Advertisements

The Primary Visual Cortex
Gabor Filter: A model of visual processing in primary visual cortex (V1) Presented by: CHEN Wei (Rosary) Supervisor: Dr. Richard So.
for image processing and computer vision
What is vision Aristotle - vision is knowing what is where by looking.
COMPUTATIONAL NEUROSCIENCE FINAL PROJECT – DEPTH VISION Omri Perez 2013.
Real-Time Accurate Stereo Matching using Modified Two-Pass Aggregation and Winner- Take-All Guided Dynamic Programming Xuefeng Chang, Zhong Zhou, Yingjie.
Chapter 6 Opener. Figure 6.1 The Euclidean geometry of the three-dimensional world turns into something quite different on the curved, two-dimensional.
Chapter 6 The Visual System
Higher Processing of Visual Information: Lecture III
2009/04/07 Yun-Yang Ma.  Overview  What is CUDA ◦ Architecture ◦ Programming Model ◦ Memory Model  H.264 Motion Estimation on CUDA ◦ Method ◦ Experimental.
What is Stereopsis? The process in visual perception that leads to the sensation of depth due to the slightly different perspectives that our two eyes.
Read Pinker article for Thurs.. Seeing in Stereo.
Color vision Different cone photo- receptors have opsin molecules which are differentially sensitive to certain wavelengths of light – these are the physical.
Binocular Disparity points nearer than horopter have crossed disparity
Laurent Itti: CS599 – Computational Architectures in Biological Vision, USC Lecture 8: Stereoscopic Vision 1 Computational Architectures in Biological.
3D Computer Vision and Video Computing 3D Vision Lecture 14 Stereo Vision (I) CSC 59866CD Fall 2004 Zhigang Zhu, NAC 8/203A
Chapter 10 The Central Visual System. Introduction Neurons in the visual system –Neural processing results in perception Parallel pathway serving conscious.
The visual system Lecture 1: Structure of the eye
A Performance and Energy Comparison of FPGAs, GPUs, and Multicores for Sliding-Window Applications From J. Fowers, G. Brown, P. Cooke, and G. Stitt, University.
Outline 1) Goal-Directed Feature Learning (Weber & Triesch, IJCNN 2009)‏ Task 4.1 Visual processing based on feature abstraction 2) Emergence of Disparity.
Copyright © 2007 Wolters Kluwer Health | Lippincott Williams & Wilkins Neuroscience: Exploring the Brain, 3e Chapter 10: The Central Visual System.
Computer Vision Spring ,-685 Instructor: S. Narasimhan Wean 5403 T-R 3:00pm – 4:20pm Lecture #15.
Lecture 11 Stereo Reconstruction I Lecture 11 Stereo Reconstruction I Mata kuliah: T Computer Vision Tahun: 2010.
Another viewpoint: V1 cells are spatial frequency filters
1 Computational Vision CSCI 363, Fall 2012 Lecture 26 Review for Exam 2.
Binocular Vision, Fusion, and Accommodation
Implementation in C+CUDA of Multi-Label Text Categorizers In automated multi-label text categorization problems with large numbers of labels, the training.
THE UNIVERSITY OF BRITISH COLUMBIA Random Forests-Based 2D-to- 3D Video Conversion Presenter: Mahsa Pourazad M. Pourazad, P. Nasiopoulos, and A. Bashashati.
Low Level Visual Processing. Information Maximization in the Retina Hypothesis: ganglion cells try to transmit as much information as possible about the.
ECE532 Final Project Demo Disparity Map Generation on a FPGA Using Stereoscopic Cameras ECE532 Final Project Demo Team 3 – Alim, Muhammad, Yu Ting.
The Visual Cortex: Anatomy
Slide 1 Neuroscience: Exploring the Brain, 3rd Ed, Bear, Connors, and Paradiso Copyright © 2007 Lippincott Williams & Wilkins Bear: Neuroscience: Exploring.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 34, NO. 2, FEBRUARY Leonardo De-Maeztu, Arantxa Villanueva, Member, IEEE, and.
1 Perception, Illusion and VR HNRS 299, Spring 2008 Lecture 8 Seeing Depth.
CS332 Visual Processing Department of Computer Science Wellesley College Binocular Stereo Vision Region-based stereo matching algorithms Properties of.
Introduction What is GPU? It is a processor optimized for 2D/3D graphics, video, visual computing, and display. It is highly parallel, highly multithreaded.
Adam Wagner Kevin Forbes. Motivation  Take advantage of GPU architecture for highly parallel data-intensive application  Enhance image segmentation.
Fast Census Transform-based Stereo Algorithm using SSE2
1 Computational Vision CSCI 363, Fall 2012 Lecture 24 Computing Motion.
Efficient Color Boundary Detection with Color-opponent Mechanisms CVPR2013 Posters.
1 Computational Vision CSCI 363, Fall 2012 Lecture 16 Stereopsis.
Digital Image Processing Additional Material : Imaging Geometry 11 September 2006 Digital Image Processing Additional Material : Imaging Geometry 11 September.
Stereo March 8, 2007 Suggested Reading: Horn Chapter 13.
Grenoble Images Parole Signal Automatique Modeling of visual cortical processing to estimate binocular disparity Introduction - The objective is to estimate.
How do we see in 3 dimensions?
Outline Of Today’s Discussion 1.LGN Projections & Color Opponency 2.Primary Visual Cortex: Structure 3.Primary Visual Cortex: Individual Cells.
An H.264-based Scheme for 2D to 3D Video Conversion Mahsa T. Pourazad Panos Nasiopoulos Rabab K. Ward IEEE Transactions on Consumer Electronics 2009.
Independent Component Analysis features of Color & Stereo images Authors: Patrik O. Hoyer Aapo Hyvarinen CIS 526: Neural Computation Presented by: Ajay.
Processing visual information - pathways
A Neurodynamical Cortical Model of Visual Attention and Invariant Object Recognition Gustavo Deco Edmund T. Rolls Vision Research, 2004.
Jure Zbontar, Yann LeCun
Brodmann’s Areas. Brodmann’s Areas The Primary Visual Cortex Hubel and Weisel discovered simple, complex and hypercomplex cells in the striate.
Early Processing in Biological Vision
The Visual System: Higher Cortical Mechanisms
Binocular Stereo Vision
Ascending Visual Pathways
Binocular Stereo Vision
Visual Perceptions: Motion, Depth, Form
Perception: Structures
Binocular Stereo Vision
Stereopsis Current Biology
Binocular Disparity and the Perception of Depth
Neural Mechanisms of Visual Motion Perception in Primates
Machine Vision By: Reza Ebrahimpour 2009
Receptive Fields of Disparity-Tuned Simple Cells in Macaque V1
From Functional Architecture to Functional Connectomics
Scalable light field coding using weighted binary images
Construction of Complex Receptive Fields in Cat Primary Visual Cortex
Volume 27, Issue 1, Pages (January 2017)
Presentation transcript:

Implementation of a Biologically Inspired Stereoscopic Vision Model in C+CUDA Most of the depth perception processing is done in the visual cortex, mainly in the primary (V1) and medial temporal (MT) areas. In this work, we modeled the neural architecture of the V1 and MT cortices using as building blocks previous models of cortical cells and log-polar mapping. A sequential implementation of our model can build a tridimensional representation of the external world using stereoscopic image pairs obtained from a pair of fronto-parallel cameras. A C+CUDA parallel implementation is almost 60 times faster and allows real-time 3D reconstruction. Once we have vergence, we can build a disparity map by taking the disparity of the least active neuron from each column of MT layers as the disparity of the corresponding point of the right image (Figure 3). By performing the inverse of our log-polar transformation, we can map each point of the right image back onto 3D space (Figure 4). Camilo A. Carvalho, Lucas de P. Veronese, Hallysson Oliveira, Alberto F. De Souza Departamento de Informática – Laboratório de Computação de Alto Desempenho Universidade Federal do Espírito Santo, Av. F. Ferrari 514, Vitória-ES, Brazil {camilo, lucas.veronese, hallysson, In our model, a MT “layer” is a group of log-polar- retinotopically-arranged MT cells with the same disparity, d (see Figure 2). A group of MT layers codes different degrees of disparity, as shown in Figure 3. By summing up the output of all neurons of each MT layer we can perform vergence, i.e., we can find the point in the left image that corresponds to the center of the log-polar mapping of the right image onto MT – we just need to take de disparity of the least active MT layer (the MT cells are tuned inhibitory cells). [1]ANZAI, A.; OHZAWA, I.; FREEMAN, R. D. Neural Mechanisms for Processing Binocular Information I. Simple Cells. The Journal of Neurophysiology, Vol. 82 No. 2, August 1999, pp [2]OHZAWA, I.; DeANGELIS, G. C.; FREEMAN, R. D. Encoding of Binocular Disparity by Complex Cells in the Cat’s Visual Cortex. The Journal of Neurophysiology, Vol. 77, No. 6, June 1997, pp [3]GONZALEZ, F.; PEREZ, R. Neural Mechanisms Underlying Stereoscopic Vision. Progress in Neurobiology, Vol. 55, June 1998, pp BibliographyBibliography Table 1: One stereo frame 3D reconstruction: experimental results. C (s)C+CUDA (s)Speedup 16,88060, ,38 We ran the sequential and C+CUDA versions of our model in an AMD Athlon 64 X2 (Dual Core) 5,200+ of 2.7 GHz, with 3GB of 800 MHz DRAM DDR2, and video card NVIDIA GeForce GTX 285, with 1GB of DRAM GDDR3. Table 1 shows the execution times of each implementation (columns), in addition to the speed-up over the sequential implementation (last column). As the table shows, the speed-up achieved (about 60) with C+CUDA allows real-time 3D reconstruction (Figure 4). Figure 4: 3D reconstruction. Figure 3: From MT to disparity map. Figure 2: MT neural layers. Figure 1: MT cell model. Log-Polar Gabor Filter V1 Cortex V5 Cortex or MT Area Left RetinaRight Retina COLUMN DISPARITY MAP IntroductionIntroduction In our model (Figure 1), we use: (i) two types of simple cells (see simple cells models in [1]) – monocular simple cells (S L, S L Q, S R and S R Q ) and binocular simple cells (S LR and S LR Q ); (ii) two types of complex cells – monocular complex cell, build from two monocular simple cells in quadrature phase (90º) (C L and C R ), and binocular complex cell build from four monocular simple cells (C LR – energy model [2]); and our medial temporal (MT) cell. Our MT cell divides the output of the binocular complex cell by the sum of the output of two monocular complex cells and a constant (see equation in Figure 1), which can be adjusted to make the MT cell behave like a tuned inhibitory cell [3]. Our Model Experimental and Results