1
CSE 415 -- (c) S. Tanimoto, 2007 Image Understanding
Outline:
Motivation
Human vision and illusions
Image representation: sampling, quantization, thresholding
Stereo vision as an AI problem: stereograms, geometry of stereograms, computing correspondences
Letting cues vote for hypotheses: polar representation of a line, Hough transform
Gestalt grouping
2
Motivation
Allow computers and robots to read books.
Allow mobile robots to navigate using vision.
Support applications in industrial inspection, medical image analysis, security and surveillance, and remote sensing of the environment.
Permit computers to recognize users’ faces and fingerprints, and to track users in various environments.
Provide prostheses for the blind.
Develop artistic intelligence.
3
Human Vision
25% of brain volume is allocated to visual perception.
Human vision is a parallel and distributed system, involving 2 eyes, retinal processing, and multiple layers of processing in the striate cortex.
Most humans are trichromats and perceive color in a 3-D color space; the exceptions are dichromats and monochromats.
Vision provides a high-bandwidth input mechanism: “a picture is worth 1000 words.”
4
Visual Illusions
Visual illusions provide insights into the nature of the human visual system, helping us understand how it works.
Example: Mueller-Lyer illusion
5
Hermann Grid Illusion
6
Hermann Grid Illusion (dark on light)
7
Subjective Contour (Triangle)
8
Image Representation
Sampling: Number and density of “pixel” measurements.
Quantization: Number of levels permitted in pixel values.
9
Image Representation (cont.)
Sampling: e.g., 4 by 4, square grid, 1 pixel/cm
Quantization: e.g., binary, {0, 1}, 0 = black, 1 = white.
[Figure: example binary image on the pixel grid]
10
Aliasing due to Under-sampling
Here the apparent frequency is about 1/5 the true frequency.
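The 1/5 ratio can be reproduced numerically. The following is a minimal sketch, assuming a 5 Hz cosine sampled at 6 samples per second; both values are illustrative choices, not taken from the slides. Under those assumptions the samples are indistinguishable from those of a 1 Hz cosine.

```python
import math

def sample_cosine(freq_hz, rate_hz, count):
    """Return `count` samples of cos(2*pi*freq*t) taken at t = n / rate."""
    return [math.cos(2 * math.pi * freq_hz * n / rate_hz) for n in range(count)]

TRUE_FREQ = 5.0    # assumed signal frequency (Hz)
SAMPLE_RATE = 6.0  # assumed sampling rate (Hz); below the 10 Hz needed to avoid aliasing

# Apparent (alias) frequency: distance from the true frequency
# to the nearest integer multiple of the sampling rate.
alias_freq = abs(TRUE_FREQ - SAMPLE_RATE * round(TRUE_FREQ / SAMPLE_RATE))

true_samples = sample_cosine(TRUE_FREQ, SAMPLE_RATE, 12)
alias_samples = sample_cosine(alias_freq, SAMPLE_RATE, 12)

# Largest difference between the two sample sequences (floating-point noise only).
max_diff = max(abs(a - b) for a, b in zip(true_samples, alias_samples))
print(alias_freq / TRUE_FREQ, max_diff)
```

Sampling at more than twice the true frequency (here, above 10 Hz) would remove the ambiguity.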
11
Quantization
Capturing a wide dynamic range of brightness levels or colors requires fine quantization. A common choice is 256 levels for each of red, green, and blue.
Segmentation is simplified by having a small number of levels, provided foreground and background pixels are reliably distinguished by their dark or light value.
Grayscale thresholding is typically used to reduce the number of quantization levels to 2.
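As an illustration of reducing the number of levels, here is a minimal sketch in plain Python. The image values, the helper names `quantize` and `threshold`, and the cutoff of 128 are all assumptions made for the example.

```python
def quantize(value, levels, max_value=255):
    """Map an integer in 0..max_value to one of `levels` equal-width bins (0..levels-1)."""
    return min(value * levels // (max_value + 1), levels - 1)

def threshold(image, cutoff):
    """Reduce a grayscale image (list of rows) to binary: 1 where pixel >= cutoff, else 0."""
    return [[1 if pixel >= cutoff else 0 for pixel in row] for row in image]

# Hypothetical 3x4 grayscale image: dark background with a bright foreground shape.
image = [
    [12,  20, 200, 210],
    [15, 190, 220,  18],
    [10,  14, 205,  16],
]

four_level = [[quantize(p, 4) for p in row] for row in image]  # 256 levels -> 4 levels
binary = threshold(image, 128)                                  # 256 levels -> 2 levels

for row in binary:
    print(row)
```

Thresholding works here because the assumed foreground and background values are well separated; with overlapping brightness ranges a single cutoff would misclassify pixels.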
12
Vision as Inferring Information from Clues
Deriving 3D structure from 2D info requires additional information: e.g., constraints.
Deriving global descriptions from local data requires information fusion, i.e., inference.
13
Stereo Vision as an AI Problem
Projection from 3 dimensions to 2 loses information.
With 2 projections, we can gain back some of that information.
Recovering the missing information is an inference problem.
The missing information is constrained by knowledge about the real world and assumptions about the scene.
The use of knowledge and assumptions to make inferences is a standard approach in artificial intelligence.
14
Stereograms
Two-view stereograms:
1. Spatially separated left-eye/right-eye pair (including virtual-reality goggles).
2. Superimposed, with separation using color filters.
3. Superimposed, with temporal shuttering.
4. Superimposed, with separation using polarizing filters.
Single-view stereograms:
1. Magic-eye pictures with depth-modulated carrier.
2. Wallpaper offering depth effects due to its periodicity.
15
Geometry of Stereograms
16
Computing Correspondence
Approach 1: Extract features and find a consistent matching of features in each view.
Approach 2: Directly compute a disparity map, performing local correlations of the views.
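Approach 2 can be sketched on a single pair of scanlines. The code below is a simplified illustration, not the method from the lecture: it assumes the two views are already aligned row by row, and it swaps correlation for the sum of absolute differences (SAD) as the local matching score. The function name, window size, and test data are hypothetical.

```python
def disparity_row(left, right, window=1, max_disparity=4):
    """Estimate per-pixel disparity along one scanline by SAD block matching.

    For each position x in the left row, try shifts d = 0..max_disparity and
    keep the d whose window around right[x - d] best matches the window
    around left[x]. Positions whose window leaves the row keep disparity 0.
    """
    n = len(left)
    result = []
    for x in range(n):
        best_d, best_cost = 0, float("inf")
        for d in range(max_disparity + 1):
            cost, valid = 0, True
            for k in range(-window, window + 1):
                xl, xr = x + k, x + k - d
                if not (0 <= xl < n and 0 <= xr < n):
                    valid = False  # window falls outside one of the rows
                    break
                cost += abs(left[xl] - right[xr])
            if valid and cost < best_cost:
                best_d, best_cost = d, cost
        result.append(best_d)
    return result

# Hypothetical scanlines: the bright feature sits 2 pixels further right in the left view.
right_row = [0, 0, 5, 9, 5, 0, 0, 0, 0, 0]
left_row  = [0, 0, 0, 0, 5, 9, 5, 0, 0, 0]

disparities = disparity_row(left_row, right_row)
print(disparities[4:7])
```

In flat regions every shift matches equally well, so the estimate there is arbitrary; this ambiguity is one reason additional constraints and assumptions about the scene are needed.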
17
Inferring Trends via Voting Methods
The classical Hough Transform identifies prominent lines in a scene by letting each edge point vote for the line(s) it is on.
Voting methods can do well under noisy conditions.
Votes are tallied in an array of accumulators, indexed by theta and rho (polar parameters of a line).
ρ = x cos θ + y sin θ.
18
Letting a Point Vote for all the Lines that Pass Through It
19
Hough Transform: Polar representation
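ρ = x cos θ + y sin θ.
[Figure: a line through the point (x, y); ρ is the perpendicular distance from the origin (0, 0) to the line, and θ is the angle of that perpendicular measured from the x-axis.]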
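A quick numerical check of the polar form, using an example line chosen for illustration (x + y = 2, whose normal points at 45 degrees): every point on the line should give the same ρ. The helper name `rho_for` is hypothetical.

```python
import math

def rho_for(x, y, theta):
    """Evaluate rho = x cos(theta) + y sin(theta) for a point and a normal angle."""
    return x * math.cos(theta) + y * math.sin(theta)

theta = math.radians(45)                   # normal direction of the line x + y = 2
points_on_line = [(0, 2), (1, 1), (2, 0)]  # all satisfy x + y = 2

rhos = [rho_for(x, y, theta) for x, y in points_on_line]
print(rhos)  # each value is approximately sqrt(2)

# A vertical line x = 3 has theta = 0 and rho = 3, regardless of y.
vertical_rhos = [rho_for(3, y, 0.0) for y in range(4)]
```

The vertical-line case shows one practical advantage of the (θ, ρ) parameters: a slope-intercept form would need an infinite slope there.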
20
Hough Transform (Cont.)
Nondirectional, unweighted Hough Transform:
H(θ, ρ) = Σ_x Σ_y f(x, y) δ(x cos θ + y sin θ - ρ)
δ(u) = 1 if |u| < 1, 0 otherwise
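The accumulator sum can be sketched directly in Python. This is a minimal, unoptimized illustration under several assumptions that are not from the slides: f is a binary edge map stored as a list of rows, x is the column index and y the row index, θ is sampled in 1-degree steps over [0°, 180°), ρ uses integer bins, and δ is treated as an indicator equal to 1 when its argument has magnitude below 1 (the threshold printed on the slide) and 0 otherwise. The function names are hypothetical.

```python
import math

def hough_transform(image, n_theta=180, half_width=1.0):
    """Return (acc, max_rho), where acc[t][rho + max_rho] tallies votes for (theta_t, rho)."""
    height, width = len(image), len(image[0])
    max_rho = math.ceil(math.hypot(width, height))
    acc = [[0] * (2 * max_rho + 1) for _ in range(n_theta)]
    for y, row in enumerate(image):
        for x, f in enumerate(row):
            if f == 0:
                continue  # non-edge pixels contribute nothing to the sum
            for t in range(n_theta):
                theta = math.pi * t / n_theta
                r = x * math.cos(theta) + y * math.sin(theta)
                # Vote for every integer rho bin with |r - rho| < half_width.
                for rho in range(math.floor(r - half_width), math.ceil(r + half_width) + 1):
                    if abs(r - rho) < half_width and -max_rho <= rho <= max_rho:
                        acc[t][rho + max_rho] += f
    return acc, max_rho

def strongest_line(acc, max_rho):
    """Return (votes, theta in degrees, rho) for the first accumulator cell with the most votes."""
    best = (0, 0.0, 0)
    for t, row in enumerate(acc):
        for i, votes in enumerate(row):
            if votes > best[0]:
                best = (votes, 180.0 * t / len(acc), i - max_rho)
    return best

# Hypothetical 8x8 edge map containing the vertical line x = 3.
image = [[1 if x == 3 else 0 for x in range(8)] for y in range(8)]
acc, max_rho = hough_transform(image)
votes, theta_deg, rho = strongest_line(acc, max_rho)
print(votes, theta_deg, rho)
```

Because the indicator is wider than one ρ bin, neighboring cells also collect votes, so peaks are blurred; a narrower indicator or finer bins would sharpen them at the cost of more sensitivity to noise.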
21
Gestalt Grouping
22
Gestalt Grouping
Texture element = “texel”
Texel directionality
Texel granularity
Alignments of endpoints
Spacing of texels
Groups serve as cues for surfaces and objects.