1
Human Vision and Cameras
CS 678 Spring 2018
2
Outline
Human vision system
Human vision for computer vision
Cameras and image formation
Projection geometry
Reading: Chapter 1 (F&P book), Chapter 2 (Szeliski)
3
Human Eyes
The human eye is the organ which gives us the sense of sight, allowing us to observe and learn more about the surrounding world than we do with any of the other four senses.
The eye allows us to see and interpret the shapes, colors, and dimensions of objects in the world by processing the light they reflect or emit.
The eye is able to detect bright light or dim light, but it cannot sense objects when light is absent.
4
Anatomy of the Human Eye
5
Some Concepts
Retina – The retina is the innermost layer of the eye and is comparable to the film inside a camera. It is composed of nerve tissue that senses the light entering the eye.
6
Concepts
The macula lutea is the small, yellowish central portion of the retina. It is the area providing the clearest, most distinct vision.
The center of the macula is called the fovea centralis, an area where all of the photoreceptors are cones; there are no rods in the fovea.
Learn more concepts at
7
Rods and Cones
The retina contains two types of photoreceptors, rods and cones.
The rods are more numerous, some 120 million, and are more sensitive than the cones. However, they are not sensitive to color.
The 6 to 7 million cones provide the eye's color sensitivity and they are much more concentrated in the central yellow spot known as the macula.
8
The Electromagnetic Spectrum
9
Human Vision System
We do not “see” with our eyes, but with our brains.
10
Human Vision for Computer Vision
"Human vision is vastly better at recognition than any of our current computer systems, so any hints of how to proceed from biology are likely to be very useful." – David Lowe
11
Feedforward Processing
LGN – The lateral geniculate nucleus
V1 – The primary visual cortex
V2 – Visual area V2
IT – Inferior temporal cortex
Thorpe & Fabre-Thorpe 2001
12
A Hierarchical Model
Contains alternating layers of simple and complex cell units, creating increasing complexity [Hubel & Wiesel 1962, Riesenhuber & Poggio 1999, Serre et al. 2007]:
Simple cell (linear operation) – selective
Complex cell (nonlinear operation) – invariant
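The selective/invariant split can be illustrated with a toy 1D sketch. This is not the model from the cited papers: the "simple cell" here is a plain sliding dot product with a made-up template, and the "complex cell" is a MAX over all positions. The names `simple_cell` and `complex_cell` and all values are assumptions for illustration.

```python
import numpy as np

def simple_cell(signal, template):
    """Linear stage: slide the template over the signal (cross-correlation)."""
    return np.correlate(signal, template, mode="valid")

def complex_cell(responses):
    """Nonlinear stage: MAX over all positions."""
    return responses.max()

template = np.array([1.0, 2.0, 1.0])
signal_a = np.array([0, 0, 1, 2, 1, 0, 0, 0, 0], dtype=float)
signal_b = np.roll(signal_a, 3)   # the same pattern, shifted by 3 samples

s_a = simple_cell(signal_a, template)   # peaks where the pattern sits
s_b = simple_cell(signal_b, template)   # peak moves with the pattern
c_a = complex_cell(s_a)                 # the MAX does not depend on position
c_b = complex_cell(s_b)
```

The simple-cell responses differ between the two signals (the peak moves), while the complex-cell outputs are equal, which is the sense in which the MAX stage adds invariance.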
13
S1 Layer
Being selective
Apply Gabor filters to the input image, with parameters set to match what is known about the primate visual system
8 bands and 4 orientations
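A minimal sketch of a Gabor filter bank with four orientations is given below. The kernel size, wavelength, sigma, and aspect ratio gamma are placeholder values chosen for illustration, not the tuned parameters of any published model, and only a single band (one filter size) is shown. The function name `gabor_kernel` is hypothetical.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma, gamma=0.3):
    """Real-valued Gabor kernel: cosine carrier under a Gaussian envelope."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate coordinates so the carrier oscillates along direction theta.
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr**2 + (gamma * yr)**2) / (2.0 * sigma**2))
    kernel = envelope * np.cos(2.0 * np.pi * xr / wavelength)
    kernel -= kernel.mean()               # zero mean: no response to flat regions
    return kernel / np.linalg.norm(kernel)

# Four orientations, as on the slide; the other values are placeholders.
thetas = [0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4]
bank = [gabor_kernel(size=11, wavelength=5.6, theta=t, sigma=4.5) for t in thetas]
```

Repeating the construction over a range of sizes would give the multiple bands; each S1 response map is then the image filtered with one kernel from the bank.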
14
Models with Four Layers (S1 C1 S2 C2)
Powerful for object category recognition
Representative works include Riesenhuber & Poggio ’99, Serre et al. ’05, ’07, Mutch & Lowe ’06
[Figure: example categories: airplanes, motorbikes, faces, cars]
15
Serre et al.’s model (PAMI’07)
[Figure: model diagram. Labels: Gabor filtering; pixelwise MAX; spatial pooling MAX; template matching; global MAX; layers C1, S2, C2; pre-learned prototypes p1, p2, …, pN; outputs v1, v2, …, vN; on-line and off-line stages]
16
Biologically Inspired Models
T. Serre, L. Wolf, S. Bileschi, M. Riesenhuber, and T. Poggio. Robust object recognition with cortex-like mechanisms. IEEE Trans. Pattern Anal. Mach. Intell., 29(3):411–426, 2007.
G. Guo, G. Mu, Y. Fu, and T. S. Huang. Human age estimation using bioinspired features. In IEEE CVPR, 2009.
Any other biologically inspired models?
Optional homework: read Serre et al.’s paper or other newer models, discuss and compare those models, and think about their advantages and disadvantages.
17
Cameras
18
Image Formation
[Figure panels: digital camera, film]
Alexei Efros’ slide
19
How do we see the world?
Let’s design a camera
Idea 1: put a piece of film in front of an object
Do we get a reasonable image?
Slide by Steve Seitz
20
Pinhole camera
Add a barrier to block off most of the rays
This reduces blurring
The opening is known as the aperture
The image gets inverted
Slide by Steve Seitz
21
Pinhole camera model
Pinhole model:
Captures pencil of rays – all rays through a single point
The point is called the Center of Projection (focal point)
The image is formed on the Image Plane
Slide by Steve Seitz
22
Dimensionality Reduction Machine (3D to 2D)
3D world → 2D image
What have we lost?
Angles
Distances (lengths)
Slide by A. Efros. Figures © Stephen E. Palmer, 2002
23
Projection properties
Many-to-one: any points along the same ray map to the same point in the image
Points → points
  But projection of points on the focal plane is undefined
Lines → lines (collinearity is preserved)
  But a line through the focal point projects to a point
Planes → planes (or half-planes)
  But a plane through the focal point projects to a line
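Two of these properties can be checked numerically with a small sketch. It assumes a pinhole at the origin, a camera looking down the negative z axis, and an image plane at distance d, so that (x, y, z) maps to (-d x/z, -d y/z); the symbol d, the function `project`, and the sample points are all illustrative choices.

```python
import numpy as np

def project(point, d=1.0):
    """Pinhole projection of (x, y, z) to (-d*x/z, -d*y/z); z must be nonzero."""
    x, y, z = point
    if z == 0:
        raise ValueError("points on the plane z = 0 through the pinhole have no projection")
    return np.array([-d * x / z, -d * y / z])

# Many-to-one: every point s * p on the ray through the pinhole and p
# lands on the same image point.
p = np.array([1.0, 2.0, -4.0])
same_ray = [project(s * p) for s in (1.0, 2.5, 10.0)]

# Lines -> lines: three collinear scene points stay collinear in the image.
a, b = np.array([0.0, 0.0, -2.0]), np.array([1.0, 1.0, -1.0])
q0, q1, q2 = (project(a + t * b) for t in (0.0, 1.0, 3.0))
u, v = q1 - q0, q2 - q0
cross = u[0] * v[1] - u[1] * v[0]   # zero when q0, q1, q2 are collinear
```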
24
Projection properties
Parallel lines converge at a vanishing point
  Each direction in space has its own vanishing point
  But parallels parallel to the image plane remain parallel
All directions in the same plane have vanishing points on the same line
How do we construct the vanishing point/line?
25
Vanishing points
Each set of parallel lines meets at a different point
  The vanishing point for this direction
Sets of parallel lines on the same plane lead to collinear vanishing points
  The line is called the horizon for that plane
Good ways to spot faked images
  scale and perspective don’t work
  vanishing points behave badly
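One way to see where a vanishing point comes from is to push a point very far along a line and watch its projection settle. The sketch below assumes the projection (x, y, z) → (-d x/z, -d y/z) with the camera looking down the negative z axis; under that assumption a direction (dx, dy, dz) with dz ≠ 0 has vanishing point (-d dx/dz, -d dy/dz), regardless of where the line starts. Function names and sample values are illustrative.

```python
import numpy as np

def project(point, d=1.0):
    x, y, z = point
    return np.array([-d * x / z, -d * y / z])

def vanishing_point(direction, d=1.0):
    """Image of the point at infinity along `direction` (needs direction[2] != 0)."""
    dx, dy, dz = direction
    return np.array([-d * dx / dz, -d * dy / dz])

direction = np.array([1.0, 0.0, -1.0])
# Two parallel lines: same direction, different starting points.
starts = [np.array([0.0, -1.0, -2.0]), np.array([3.0, -1.0, -2.0])]
far = [project(a + 1e6 * direction) for a in starts]   # points far along each line
vp = vanishing_point(direction)

# Several directions lying in a horizontal plane (normal along y):
# their vanishing points fall on one image line, the horizon of that plane.
ground_dirs = [np.array([1.0, 0.0, -1.0]),
               np.array([-2.0, 0.0, -1.0]),
               np.array([0.5, 0.0, -3.0])]
horizon_pts = [vanishing_point(v) for v in ground_dirs]
```

Both far points approach `vp`, and every entry of `horizon_pts` has the same y coordinate, so the vanishing points of the horizontal directions are collinear.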
26
Distant objects are smaller
Size is inversely proportional to distance.
27
Perspective distortion
What does a sphere project to?
28
Shrinking the aperture
Why not make the aperture as small as possible?
Less light gets through
Diffraction effects…
Slide by Steve Seitz
29
Shrinking the aperture
30
The reason for lenses
31
Adding a lens
A lens focuses light onto the film
Rays passing through the center are not deviated
Slide by Steve Seitz
32
Adding a lens
A lens focuses light onto the film
Rays passing through the center are not deviated
All parallel rays converge to one point on a plane located at the focal length f
[Figure labels: focal point, f]
Slide by Steve Seitz
33
Adding a lens
A lens focuses light onto the film
There is a specific distance at which objects are “in focus”
Other points project to a “circle of confusion” in the image
Slide by Steve Seitz
34
Thin lens formula
[Figure: thin lens with distances D’, D and focal length f]
Frédo Durand’s slide
35
Thin lens formula
Similar triangles everywhere!
[Figure labels: D’, D, f]
Frédo Durand’s slide
36
Thin lens formula
Similar triangles everywhere!
$y'/y = D'/D$
[Figure labels: D’, D, f, y, y’]
Frédo Durand’s slide
37
Thin lens formula
Similar triangles everywhere!
$y'/y = D'/D$
$y'/y = (D'-f)/f$
[Figure labels: D’, D, f, y, y’]
Frédo Durand’s slide
38
Thin lens formula
$$\frac{1}{D'} + \frac{1}{D} = \frac{1}{f}$$
Any point satisfying the thin lens equation is in focus.
Frédo Durand’s slide
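The step from the two similar-triangle relations to the thin lens equation is not spelled out on the slides. A short derivation, using only the slide symbols y, y', D, D', and f:

```latex
\begin{aligned}
\frac{y'}{y} = \frac{D'}{D}
\quad\text{and}\quad
\frac{y'}{y} = \frac{D' - f}{f}
\;&\Longrightarrow\;
\frac{D'}{D} = \frac{D' - f}{f} \\
&\Longrightarrow\; D' f = D D' - D f \\
&\Longrightarrow\; \frac{1}{D} = \frac{1}{f} - \frac{1}{D'}
\qquad \text{(divide both sides by } D\,D' f\text{)} \\
&\Longrightarrow\; \frac{1}{D'} + \frac{1}{D} = \frac{1}{f}
\end{aligned}
```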
39
Depth of Field
Slide by A. Efros
40
How can we control the depth of field?
Changing the aperture size affects depth of field
A smaller aperture increases the range in which the object is approximately in focus
But a small aperture reduces the amount of light – need to increase exposure
Slide by A. Efros
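The effect of aperture on blur can be sketched with the thin lens equation and one more pair of similar triangles: light from an out-of-focus point forms a cone from the aperture to its own focus point, and the sensor cuts that cone in a disc. The function names, the 50 mm focal length, and the distances below are illustrative assumptions, and the sketch ignores diffraction and lens aberrations.

```python
def image_distance(D, f):
    """Thin lens equation 1/D' + 1/D = 1/f solved for D' (requires D > f)."""
    return 1.0 / (1.0 / f - 1.0 / D)

def blur_diameter(D, D_focus, f, aperture):
    """Diameter of the circle of confusion for an object at distance D
    when the sensor sits where objects at distance D_focus are sharp."""
    sensor = image_distance(D_focus, f)    # where the sensor is placed
    converge = image_distance(D, f)        # where rays from the object meet
    return aperture * abs(sensor - converge) / converge

f = 0.05  # focal length in meters (illustrative)
wide = blur_diameter(D=3.0, D_focus=2.0, f=f, aperture=0.025)
narrow = blur_diameter(D=3.0, D_focus=2.0, f=f, aperture=0.00625)
in_focus = blur_diameter(D=2.0, D_focus=2.0, f=f, aperture=0.025)
```

In this model the blur diameter scales linearly with the aperture diameter, so stopping down by a factor of four shrinks the blur disc by the same factor, which widens the range of distances that look acceptably sharp.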
41
Varying the aperture
Large aperture = small DOF
Small aperture = large DOF
Slide by A. Efros
42
Nice Depth of Field effect
Source: F. Durand
43
Field of View (Zoom)
Slide by A. Efros
44
Field of View (Zoom)
Slide by A. Efros
45
Field of View
FOV depends on focal length and size of the camera retina
Smaller FOV = larger focal length
[Figure labels: f]
Slide by A. Efros
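For a pinhole-style camera focused at infinity, the dependence can be written as FOV = 2 arctan(w / 2f), where w is the sensor ("retina") width. A small sketch under that assumption, with a hypothetical function name and illustrative values in millimeters:

```python
import math

def field_of_view_deg(sensor_width, focal_length):
    """Horizontal field of view in degrees: 2 * atan(w / (2 f))."""
    return math.degrees(2.0 * math.atan(sensor_width / (2.0 * focal_length)))

wide_angle = field_of_view_deg(36.0, 18.0)    # w / (2 f) = 1, so 2 * 45 = 90 degrees
normal = field_of_view_deg(36.0, 50.0)
telephoto = field_of_view_deg(36.0, 200.0)
```

With the sensor width fixed, increasing the focal length monotonically narrows the field of view.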
46
Field of View / Focal Length
Large FOV, small f: camera close to the car
Small FOV, large f: camera far from the car
Sources: A. Efros, F. Durand
47
Same effect for faces
[Figure panels: standard, wide-angle, telephoto]
Source: F. Durand
48
Approximating an affine camera
Source: Hartley & Zisserman
49
Lens systems
A good camera lens may contain 15 elements and cost a thousand dollars
The best modern lenses may contain aspherical elements
50
Lens Flaws: Chromatic Aberration
Lens has different refractive indices for different wavelengths: causes color fringing
[Figure panels: near lens center; near lens outer edge]
51
Lens flaws: Spherical aberration
Spherical lenses don’t focus light perfectly
Rays farther from the optical axis focus closer
52
Lens flaws: Vignetting
53
Radial Distortion
Caused by imperfect lenses
Deviations are most noticeable for rays that pass through the edge of the lens
[Figure panels: no distortion, pin cushion, barrel]
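A common way to model this, though not one given on the slide, is a one-parameter polynomial: each ideal image point is scaled by (1 + k1 r²), where r is its distance from the image center. Under this forward convention (ideal to distorted), a negative k1 gives barrel distortion and a positive k1 gives pincushion. The function name and the coefficient values are assumptions for illustration.

```python
import numpy as np

def distort(points, k1):
    """Scale each normalized image point by (1 + k1 * r^2), r = distance from center."""
    points = np.asarray(points, dtype=float)
    r2 = np.sum(points**2, axis=1, keepdims=True)
    return points * (1.0 + k1 * r2)

grid = np.array([[0.0, 0.0], [0.1, 0.0], [0.5, 0.5], [1.0, 1.0]])
barrel = distort(grid, k1=-0.2)       # points pulled toward the center
pincushion = distort(grid, k1=0.2)    # points pushed away from the center

# Size of the displacement for each point, ordered from center to edge.
shift = np.linalg.norm(barrel - grid, axis=1)
```

The displacement grows like r³, so the center is untouched and points near the edge move the most, matching the slide's remark about rays through the edge of the lens.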
54
Digital camera
A digital camera replaces film with a sensor array
Each cell in the array is a light-sensitive diode that converts photons to electrons
Two common types
  Charge Coupled Device (CCD)
  Complementary metal oxide semiconductor (CMOS)
Slide by Steve Seitz
55
CCD vs. CMOS
CCD: transports the charge across the chip and reads it at one corner of the array. An analog-to-digital converter (ADC) then turns each pixel's value into a digital value by measuring the amount of charge at each photosite and converting that measurement to binary form.
CMOS: uses several transistors at each pixel to amplify and move the charge using more traditional wires. The conversion to digital values typically happens on the sensor chip itself, so the chip outputs a digital signal and needs no separate ADC.
56
Color sensing in camera: Color filter array
Bayer grid
Estimate missing components from neighboring values (demosaicing)
Why more green?
[Figure: human luminance sensitivity function]
Source: Steve Seitz
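A rough sketch of demosaicing by neighbor averaging follows. It assumes an RGGB layout (other layouts exist) and fills each missing color sample with the mean of the known samples of that color in the surrounding 3x3 window. This is a crude stand-in for bilinear interpolation, not the method of any particular camera, and the function names are hypothetical.

```python
import numpy as np

def bayer_masks(h, w):
    """Boolean masks for an RGGB Bayer layout (an assumed layout)."""
    rows, cols = np.mgrid[0:h, 0:w]
    r = (rows % 2 == 0) & (cols % 2 == 0)
    b = (rows % 2 == 1) & (cols % 2 == 1)
    g = ~(r | b)
    return r, g, b

def box_sum3(a):
    """Sum over each 3x3 neighborhood, with zero padding at the border."""
    p = np.pad(a, 1)
    h, w = a.shape
    return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3))

def demosaic(mosaic):
    """Fill each channel by averaging the known samples in the 3x3 neighborhood."""
    h, w = mosaic.shape
    out = np.zeros((h, w, 3))
    for c, mask in enumerate(bayer_masks(h, w)):
        known = np.where(mask, mosaic, 0.0)
        avg = box_sum3(known) / box_sum3(mask.astype(float))
        out[..., c] = np.where(mask, mosaic, avg)   # keep measured samples unchanged
    return out

# A uniformly colored scene seen through the filter array.
r, g, b = bayer_masks(6, 6)
mosaic = 0.2 * r + 0.5 * g + 0.8 * b
rgb = demosaic(mosaic)
```

The green mask covers twice as many pixels as the red or blue mask, which is the "more green" of the Bayer grid. On a uniform scene the averaging recovers the color exactly; on fine detail it does not, which is where color moiré comes from.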
57
Problem with demosaicing: color moire
Slide by F. Durand
58
The cause of color moire
Fine black and white detail in image misinterpreted as color information
[Figure label: detector]
Slide by F. Durand
59
Color sensing in camera: Prism
Requires three chips and precise alignment
More expensive
[Figure labels: CCD(R), CCD(G), CCD(B)]
60
Color sensing in camera: Foveon X3
CMOS sensor
Takes advantage of the fact that red, blue and green light penetrate silicon to different depths
Better image quality
Source: M. Pollefeys
61
Issues with digital cameras
Noise
  low light is where you most notice noise
  light sensitivity (ISO) / noise tradeoff
  stuck pixels
Resolution: Are more megapixels better?
  requires higher quality lens
  noise issues
In-camera processing
  oversharpening can produce halos
RAW vs. compressed
  file size vs. quality tradeoff
Blooming
  charge overflowing into neighboring pixels
Color artifacts
  purple fringing from microlenses, artifacts from Bayer patterns
  white balance
More info online:
Slide by Steve Seitz
62
Historical context
Pinhole model: Mozi ( BCE), Aristotle ( BCE)
Principles of optics (including lenses): Alhacen ( CE)
Camera obscura: Leonardo da Vinci ( ), Johann Zahn ( )
First photo: Joseph Nicephore Niepce (1822)
Daguerréotypes (1839)
Photographic film (Eastman, 1889)
Cinema (Lumière Brothers, 1895)
Color Photography (Lumière Brothers, 1908)
Television (Baird, Farnsworth, Zworykin, 1920s)
First consumer camera with CCD: Sony Mavica (1981)
First fully digital camera: Kodak DCS100 (1990)
[Figures: Alhacen’s notes; Niepce, “La Table Servie,” 1822; CCD chip]
63
Modeling projection
The coordinate system
  We will use the pinhole model as an approximation
  Put the optical center (O) at the origin
  Put the image plane (Π’) in front of O
    This way the image is right-side-up
[Figure: coordinate axes x, y, z]
Source: J. Ponce, S. Seitz
64
Modeling projection
Projection equations
  Compute intersection with Π’ of ray from P = (x,y,z) to O
  Derived using similar triangles
  We get the projection by throwing out the last coordinate:
[Figure: coordinate axes x, y, z]
Source: J. Ponce, S. Seitz
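The equations themselves did not survive in this transcript. One consistent reconstruction, assuming the camera looks down the negative z axis and the image plane Π' lies at z = -d (the symbol d and this sign convention are assumptions, suggested by the depth -z0 used on the weak perspective slide): the ray from P to O consists of the points tP, and it meets Π' at t = -d/z.

```latex
(x, y, z) \;\to\; \left(-d\,\frac{x}{z},\; -d\,\frac{y}{z},\; -d\right)
\qquad\Longrightarrow\qquad
(x, y, z) \;\to\; \left(-d\,\frac{x}{z},\; -d\,\frac{y}{z}\right)
```

The second mapping is the first with its constant last coordinate dropped.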
65
Homogeneous coordinates
Is this a linear transformation?
  No: division by z is nonlinear
Trick: add one more coordinate:
  homogeneous image coordinates
  homogeneous scene coordinates
Converting from homogeneous coordinates
Slide by Steve Seitz
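The formulas for this slide are missing from the transcript. The standard definitions, written here with w for the extra coordinate (a symbol chosen for illustration), are:

```latex
(x, y) \;\Rightarrow\; \begin{bmatrix} x \\ y \\ 1 \end{bmatrix},
\qquad
(x, y, z) \;\Rightarrow\; \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix},
\qquad
\begin{bmatrix} x \\ y \\ w \end{bmatrix} \;\Rightarrow\; \left(\frac{x}{w},\; \frac{y}{w}\right),
\qquad
\begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} \;\Rightarrow\; \left(\frac{x}{w},\; \frac{y}{w},\; \frac{z}{w}\right)
```

The first two map image and scene points to homogeneous coordinates; the last two convert back by dividing by the final coordinate.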
66
Perspective Projection Matrix
Projection is a matrix multiplication using homogeneous coordinates:
divide by the third coordinate
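The matrix is missing from the transcript. A reconstruction under the assumption that the camera looks down the negative z axis with the image plane at distance d (d and the sign convention are assumptions, not stated in the surviving text):

```latex
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & -1/d & 0
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
=
\begin{bmatrix} x \\ y \\ -z/d \end{bmatrix}
\;\Rightarrow\;
\left(-d\,\frac{x}{z},\; -d\,\frac{y}{z}\right)
```

Dividing the first two entries of the product by the third gives the perspective projection of (x, y, z).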
67
Perspective Projection Matrix
Projection is a matrix multiplication using homogeneous coordinates, followed by a divide by the third coordinate
In practice: lots of coordinate transformations…
2D point (3x1) = Camera to pixel coord. trans. matrix (3x3) × Perspective projection matrix (3x4) × World to camera coord. trans. matrix (4x4) × 3D point (4x1)
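The chain of transformations can be sketched with NumPy. Every number below is made up for illustration: the world-to-camera matrix is a pure translation, the projection assumes a camera looking down the negative z axis with the image plane at distance d, and the camera-to-pixel matrix is a simple scale and shift with hypothetical parameters s, cx, cy.

```python
import numpy as np

d = 1.0  # distance from the optical center to the image plane (assumed symbol)

# World-to-camera transform (4x4): here a pure translation along z.
world_to_camera = np.eye(4)
world_to_camera[:3, 3] = [0.0, 0.0, -5.0]

# Perspective projection (3x4) for a camera looking down the negative z axis.
projection = np.array([[1.0, 0.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0, 0.0],
                       [0.0, 0.0, -1.0 / d, 0.0]])

# Camera-to-pixel transform (3x3): s pixels per unit, image center at (cx, cy).
s, cx, cy = 100.0, 320.0, 240.0
camera_to_pixel = np.array([[s, 0.0, cx],
                            [0.0, s, cy],
                            [0.0, 0.0, 1.0]])

P = camera_to_pixel @ projection @ world_to_camera   # combined 3x4 matrix

X = np.array([1.0, 2.0, 0.0, 1.0])    # homogeneous 3D point (4x1)
u, v, w = P @ X                        # homogeneous 2D point (3x1)
pixel = np.array([u / w, v / w])       # divide by the third coordinate
```

Because all three steps are matrices, they collapse into the single 3x4 matrix `P`; only the final division is nonlinear.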
68
Weak perspective
Assume object points are all at the same depth -z0
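Replacing each point's depth with the common depth -z0 turns the perspective division into a uniform scale. The sketch below compares the two models for a small object at two distances, assuming the projection (x, y, z) → (-d x/z, -d y/z); the function names, the object, and the distances are illustrative.

```python
import numpy as np

def perspective(points, d=1.0):
    """Full perspective: divide each point by its own depth."""
    points = np.asarray(points, dtype=float)
    return -d * points[:, :2] / points[:, 2:3]

def weak_perspective(points, z0, d=1.0):
    """Treat every point as if it were at depth z = -z0: a uniform scale by d / z0."""
    points = np.asarray(points, dtype=float)
    return (d / z0) * points[:, :2]

def max_error(z0, half_depth=0.5):
    # A small object whose points lie at depths between -z0 - half_depth
    # and -z0 + half_depth.
    obj = np.array([[1.0, 1.0, -z0 + half_depth],
                    [-1.0, 0.5, -z0 - half_depth],
                    [0.5, -1.0, -z0]])
    return np.abs(perspective(obj) - weak_perspective(obj, z0)).max()

near_error = max_error(z0=5.0)
far_error = max_error(z0=50.0)
```

The discrepancy between the two models shrinks as the object moves away, since its depth range becomes small relative to its distance.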
69
Orthographic Projection
Special case of perspective projection
  Distance from center of projection to image plane is infinite
  Also called “parallel projection”
  What’s the projection matrix?
[Figure labels: Image, World]
Slide by Steve Seitz
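One standard answer to the question, offered as a sketch since the slide's own matrix is not in the transcript: orthographic projection simply drops z, so in homogeneous coordinates

```latex
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
=
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\;\Rightarrow\; (x, y)
```

The third coordinate of the result is always 1, so no division is needed and the mapping is linear in (x, y, z).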
70
Pros and Cons of These Models
Weak perspective (including orthographic) has simpler mathematics
  Accurate when object is small relative to its distance
  Most useful for recognition
Perspective is much more accurate for scenes
  Used in structure from motion
When accuracy really matters, we must model the real camera
  Use perspective projection with other calibration parameters (e.g., radial lens distortion)