Laurent Itti: CS599 – Computational Architectures in Biological Vision, USC, Spring 2001. Lecture 8: Stereoscopic Vision.


Slide 1: Stereoscopic Vision
Computational Architectures in Biological Vision, USC, Spring 2001. Lecture 8: Stereoscopic Vision.
Reading assignment: second part of Chapter 10.

Slide 2: Seeing With Multiple Eyes
From a single eye we can analyze the color, luminance, orientation, etc. of objects. But to locate objects in depth, we need multiple projective views: objects at different depths/distances yield different projections onto the retinas/cameras.

Slide 3: Depth Perception
Several cues allow us to locate objects in depth:
- Stereopsis: based on correlating cues from two spatially separated eyes.
- Optic flow: based on cues provided to one eye at moments separated in time.
- Accommodation: determines which focal length best brings an object into focus.
- Size constancy: our knowledge of an object's real size allows us to estimate its distance from its perceived size.
- Direct measurements (for machine vision systems): e.g., range-finders, sonars, etc.

Slide 4: Stereoscopic Vision
1) Extract features from each image that can be matched between the two images.
2) Establish the correspondence between features in one image and those in the other. Difficulty: partial occlusions!
3) Compute disparity, i.e., the difference in image position between matching features. From this and the known optical geometry of the setup, recover the distance to objects.
4) Interpolation/denoising/filling-in: from the recovered depth at feature locations, infer a dense depth field over the entire image.
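Step 3, recovering distance from disparity, rests on the standard triangulation relation for a rectified two-camera setup: Z = f·B/d, with focal length f, baseline B, and disparity d. A minimal sketch; the function name, units, and the rectified-geometry assumption are ours, not from the lecture:

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Recover depth (in the baseline's units) from disparity,
    assuming rectified cameras with parallel optical axes: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px

# Example (hypothetical numbers): f = 700 px, baseline = 0.065 m,
# disparity = 10 px gives Z = 700 * 0.065 / 10 = 4.55 m
```

Note the inverse relation: nearby objects produce large disparities, so depth resolution degrades quickly with distance.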

Slide 5: The Correspondence Problem
16 possible objects… but only 4 were actually present. Problem: how do we pair the Li points with the Ri points?

Slide 6: The Correspondence Problem
The correspondence problem: match corresponding points on the two retinas so that their depth can be triangulated. Why is this a problem? Because it is ambiguous: "ghosts" appear. A scene with objects A and B yields exactly the same two retinal views as a scene with objects C and D. Given the two images, which objects were in the scene?

Slide 7: Computing Correspondence: Naïve Approach
- Extract "features" in both views.
- Loop over features in one view; find the best-matching feature by searching over the entire other view.
- For all paired features, compute depth.
- Interpolate to the whole scene.
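The naïve search above can be sketched as exhaustive descriptor matching; a toy numpy sketch under our own assumptions (feature descriptors as vectors, sum-of-squared-differences as the match score):

```python
import numpy as np

def match_features(feats_left, feats_right):
    """Naive correspondence: for each left feature (a descriptor vector),
    exhaustively compare against every right feature and keep the best
    SSD match. Cost is O(N_left * N_right) -- no geometric constraint."""
    matches = []
    for i, fl in enumerate(feats_left):
        ssd = np.sum((feats_right - fl) ** 2, axis=1)  # distance to every candidate
        matches.append((i, int(np.argmin(ssd))))
    return matches
```

This brute-force search is exactly what the epipolar constraint, introduced on the following slides, lets real systems avoid.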

Slide 8: Epipolar Geometry
- Baseline: the line joining the two eyes' optical centers.
- Epipole: the intersection of the baseline with the image plane.

Slide 9: Epipolar Geometry
- Epipolar plane: the plane defined by a 3D point and both optical centers.
- Epipolar line: the intersection of the epipolar plane with an image plane.
Given the projection of a 3D point onto one image plane, we can construct the epipolar plane; the projection of that same 3D point onto the other image plane must lie on that plane's epipolar line. So, for a given point in one image, the search for the corresponding point in the other image is 1D rather than 2D!
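In computer vision this 1D constraint is usually expressed through the fundamental matrix F, which the slides do not introduce by name: for a point x in one image, F·x gives the coefficients of the epipolar line on which the match must lie. A hypothetical numpy sketch:

```python
import numpy as np

def epipolar_line(F, x):
    """Given a fundamental matrix F (3x3) and a point x = (u, v) in image 1,
    return coefficients (a, b, c) of the epipolar line a*u' + b*v' + c = 0
    that the corresponding point must satisfy in image 2."""
    x_h = np.array([x[0], x[1], 1.0])  # homogeneous coordinates
    return F @ x_h

# Any correct match x' satisfies x'^T F x = 0, i.e., x' lies on this line.
```

For a pure horizontal camera translation (the rectified case), the epipolar lines are simply the image rows, which is why disparity search then reduces to a scan along one scanline.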

Slide 10: Feature Matching
The main issue for computer vision systems: what should the features be?
- Edges?
- Corners, junctions?
- "Rich" edges, corners, and junctions (i.e., using not only edge information but also local color, intensity, etc.)?
- Jets, i.e., vectors of responses from a basis of wavelets; textures?
- Small parts of objects?
- Whole objects?

Slide 11: How About Biology?
A classical question in psychology: do we recognize objects first and then infer their depth, or can we perceive depth before recognizing an object? That is:
- Does the brain take the image from each eye separately to recognize, for example, a house, and then use the disparity between the two house images to recognize the depth of the house in space? Or
- Does our visual system match local stimuli presented to both eyes, building up a depth map of surfaces and small objects in space that provides the input for perceptual recognition?
Bela Julesz (1971) answered this question using random-dot stereograms.

Slide 12: Random-Dot Stereograms
- Start with a random dot pattern and a depth map.
- Cut out a region of the random dot pattern for one eye, shift it according to the disparity inferred from the depth map, and paste it into the pattern for the other eye.
- Fill any blanks with new randomly chosen dots.
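The construction above can be sketched in a few lines of numpy; the image size, the shifted square, and all parameter names are our own illustrative choices:

```python
import numpy as np

def make_rds(shape=(100, 100), square=(30, 70, 30, 70), disparity=4, seed=0):
    """Build a random-dot stereogram: the left image is random dots; the
    right image is a copy in which a central square is shifted horizontally
    by `disparity` pixels, with the uncovered gap refilled by fresh dots."""
    rng = np.random.default_rng(seed)
    left = rng.integers(0, 2, size=shape)
    right = left.copy()
    r0, r1, c0, c1 = square
    # paste the square shifted left by `disparity` pixels
    right[r0:r1, c0 - disparity:c1 - disparity] = left[r0:r1, c0:c1]
    # fill the uncovered strip with new random dots
    right[r0:r1, c1 - disparity:c1] = rng.integers(0, 2, size=(r1 - r0, disparity))
    return left, right
```

Neither image alone contains any monocular shape cue; only when the pair is fused binocularly does the square appear to float in depth, which is the point of Julesz's experiment.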

Slide 13: Example Random-Dot Stereogram

Slide 14: Associated Depth Map

Slide 15: Conclusion from RDS
We can perceive depth before we recognize objects. Thus, the brain is able to solve the correspondence problem using only simple features; it does not rely (solely) on matching views of recognized objects.

Slide 16: Reverse Correlation Technique
Simplified view: show a random sequence of all possible stimuli and record the responses. Starting from an empty image, add up all stimuli that elicited a response. The result is the average stimulus profile that causes the cell to fire.
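The simplified procedure amounts to a spike-triggered average. A minimal numpy sketch (the array names and shapes are ours):

```python
import numpy as np

def spike_triggered_average(stimuli, spikes):
    """Reverse correlation in its simplest form: average all stimulus frames
    that elicited a spike. `stimuli` has shape (n_frames, h, w); `spikes` is
    a boolean array of length n_frames marking frames followed by a spike."""
    fired = stimuli[np.asarray(spikes, dtype=bool)]  # keep only spike-eliciting frames
    return fired.mean(axis=0)  # estimate of the cell's receptive field
```

With enough uncorrelated stimuli, the average converges to the cell's linear receptive field, which is how the Gabor-like RFs on the next slide were measured.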

Slide 17: Spatial RFs
Simple cells in V1 of the cat are well modeled by Gabor functions with various preferred orientations (here all normalized to vertical) and spatial phases.
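A Gabor function is a Gaussian envelope multiplying a sinusoidal carrier. A sketch of the model (this parameterization is ours):

```python
import numpy as np

def gabor(x, y, sigma=2.0, wavelength=6.0, theta=0.0, phase=0.0):
    """2D Gabor function: Gaussian envelope times a cosine carrier.
    `theta` sets the preferred orientation, `phase` the spatial phase --
    varying `phase` between the eyes models the interocular RF phase
    differences discussed on later slides."""
    xr = x * np.cos(theta) + y * np.sin(theta)    # rotate into preferred orientation
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2))
    carrier = np.cos(2 * np.pi * xr / wavelength + phase)
    return envelope * carrier
```

Phase 0 gives an even-symmetric RF peaking at the center; phase π/2 gives an odd-symmetric RF with a zero-crossing there.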

Slide 18: RFs Are Spatio-Temporal!

Slide 19: Parametrizing the Results

Slide 20: Binocular-Responsive Simple Cells in V1
These cells respond well to stimuli presented to either eye, but the phase of their RF depends on the eye! (Ohzawa et al., 1996)

Slide 21: Space-Time Analysis

Slide 22: Summary of Results
- Approximately 30% of the neurons studied showed differences in their spatial RF between the two eyes.
- Of these, nearly all preferred orientations between oblique and vertical, and hence could be involved in processing horizontal disparity.
- Conversely, most cells with a horizontal preferred orientation showed no RF difference between the eyes.
- RF properties change over time, but in a similar way for both eyes.

Slide 23: Main Issue with Local Features
A depth map inferred from local features will not be complete:
- missing information in uniform image regions;
- partial occlusions (features seen by one eye but occluded in the other);
- ghosts and ambiguous correspondences;
- false matches due to noise.
Typical solution: use a regularization process to infer depth in regions where its direct computation is ambiguous, based on neighboring regions where it was unambiguous.

Slide 24: The Dev Model
An example of a depth-reconstruction model that includes a regularization process: Arbib, Boylls, and Dev's model.
Regularizing hypotheses:
- the scene contains a small number of continuous surfaces;
- at any one location, there is only one depth.
Consequently:
- depth at a given location, if ambiguous, is inferred from depth at neighboring locations;
- at a given location, multiple candidate depth values compete.

Slide 25: The Dev Model
Consider a 1D input along an axis "q"; the object at each location lies at a given depth, corresponding to a given disparity along the "d" axis.
- Along q: units cooperate and interpolate through an excitatory field.
- Along d: units compete, enforcing a single active disparity per location through winner-take-all.
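A loose numerical sketch of this cooperation/competition scheme; the update rule and coefficients below are our own illustration, not the model's original equations:

```python
import numpy as np

def dev_step(activity, excite=0.5, inhibit=1.0):
    """One relaxation step of a Dev-style cooperative stereo network.
    `activity[d, q]` is the support for disparity d at position q.
    - Along q: excitatory smoothing with spatial neighbors (cooperation).
    - Along d: subtract the mean over disparities (competition), pushing
      toward a single winning disparity per location (soft winner-take-all)."""
    neighbors = 0.5 * (np.roll(activity, 1, axis=1) + np.roll(activity, -1, axis=1))
    cooperated = activity + excite * neighbors
    competed = cooperated - inhibit * cooperated.mean(axis=0, keepdims=True)
    return np.maximum(competed, 0.0)  # rectify: activities stay non-negative

def dev_disparity(activity, n_steps=20):
    """Iterate the network and read out the winning disparity per location."""
    for _ in range(n_steps):
        activity = dev_step(activity)
    return activity.argmax(axis=0)
```

Spurious matches (isolated blips at the wrong disparity) receive little support from their spatial neighbors and are suppressed by the competition along d, while a continuous surface reinforces itself, which is exactly the regularizing behavior the two hypotheses demand.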

Slide 26: Regularization in Biology
Regularization is omnipresent in the biological visual system (e.g., filling-in of the blind spot). We saw that some V1 cells are tuned to disparity, and (last lecture) that long-range (non-classical) interactions, both excitatory and inhibitory, exist among V1 cells. So biology appears to have the basic elements for a regularized depth-reconstruction algorithm; its detailed understanding will require more research.

Slide 27: Low-Level Disparity Is Not the Only Cue
As exemplified by size-constancy illusions: even when we have no disparity cue from which to infer depth (e.g., a 2D image of a 3D scene), we still tend to perceive the scene in 3D, inferring depth from the known relative sizes of the various elements in the scene.

Slide 28: More Biological Depth Tuning
Dobbins, Jeo & Allman (Science) recorded from V1, V2, and V4 in the awake monkey, showing disks of various sizes on a computer screen placed at variable distances from the animal. Typical cells:
- are size-tuned, i.e., prefer the same retinal image size regardless of distance;
- but their response may be modulated by screen distance!

Slide 29: Distance Tuning
A: nearness cell (fires more when the object is near, for the same retinal size); B: farness cell; C: distance-independent cell.

Slide 30: Outlook
- Depth can be computed by inferring distance from disparity, i.e., the displacement between an object's projections onto two cameras or eyes.
- The major computational challenge is the correspondence problem, i.e., pairing visual features across the two eyes.
- Neurons in early biological visual areas, with small RF sizes, are already disparity-tuned, suggesting that biological brains solve the correspondence problem in part from localized, low-level cues.
- However, low-level cues provide only sparse depth maps; regularization processes and higher-level cues (e.g., whole objects) provide increased robustness.