CSE 415 -- (c) S. Tanimoto, 2007 Image Understanding
Outline:
Motivation
Human vision and illusions
Image representation: sampling, quantization, thresholding
Stereo vision as an AI problem
Stereograms, geometry of stereograms, computing correspondences
Letting cues vote for hypotheses: polar representation of a line, Hough transform
Gestalt grouping

Motivation
Allow computers and robots to read books.
Allow mobile robots to navigate using vision.
Support applications in industrial inspection, medical image analysis, security and surveillance, and remote sensing of the environment.
Permit computers to recognize users’ faces and fingerprints, and to track users in various environments.
Provide prostheses for the blind.
Develop artistic intelligence.

Human Vision
25% of brain volume is allocated to visual perception.
Human vision is a parallel and distributed system, involving 2 eyes, retinal processing, and multiple layers of processing in the striate cortex.
Most humans are trichromats and perceive color in a 3-D color space (the exceptions are dichromats and monochromats).
Vision provides a high-bandwidth input mechanism: “a picture is worth 1000 words.”

Visual Illusions
They provide insights about the nature of the human visual system, helping us understand how it works.
Mueller-Lyer illusion

Hermann Grid Illusion

Hermann Grid Illusion (dark on light)

Subjective Contour (Triangle)

Image Representation
Sampling: number and density of “pixel” measurements.
Quantization: number of levels permitted in pixel values.

Image Representation (cont.)
Sampling: e.g., 4 by 4, square grid, 1 pixel/cm.
Quantization: e.g., binary, {0, 1}, 0 = black, 1 = white.
[Figure: example 4 by 4 binary image]

Aliasing due to Under-sampling
Here the apparent frequency is about 1/5 the true frequency.
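The 1/5 ratio can be reproduced numerically. The following is a minimal sketch, assuming a cosine of 5 cycles per unit sampled at 6 samples per unit; both numbers are hypothetical values chosen to give the 1/5 ratio, and the slide does not state the actual frequencies used in its figure. The helper names are illustrative only.

```python
import math

def apparent_frequency(true_freq, sample_rate):
    """Frequency that the samples appear to have after folding."""
    # Fold the true frequency onto the nearest multiple of the sampling rate.
    nearest = round(true_freq / sample_rate) * sample_rate
    return abs(true_freq - nearest)

def sample_cosine(freq, sample_rate, n):
    """Return n samples of cos(2*pi*freq*t) taken at t = k / sample_rate."""
    return [math.cos(2 * math.pi * freq * k / sample_rate) for k in range(n)]

true_freq = 5.0     # assumed true frequency (cycles per unit length)
sample_rate = 6.0   # assumed sampling rate, below the Nyquist rate of 10

alias = apparent_frequency(true_freq, sample_rate)

# The samples of the fast wave coincide with samples of the slow alias.
fast = sample_cosine(true_freq, sample_rate, 12)
slow = sample_cosine(alias, sample_rate, 12)
```

With these assumed values the alias frequency is 1 cycle per unit, and the two sample sequences agree up to rounding error, so the sampled data cannot distinguish the two waves.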

Quantization
Capturing a wide dynamic range of brightness levels or colors requires fine quantization. A common choice is 256 levels for each of red, green, and blue.
Segmentation is simplified by having a small number of levels, provided foreground and background pixels are reliably distinguished by their dark or light value.
Grayscale thresholding is typically used to reduce the number of quantization levels to 2.
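Thresholding to 2 levels can be sketched in a few lines. This is an illustration only: the function name, the sample pixel values, and the threshold of 128 are assumptions, and the 0 = black, 1 = white convention follows the binary example on the image representation slide.

```python
def threshold(image, t):
    """Map each gray value to 1 (white) if it is >= t, else 0 (black)."""
    return [[1 if value >= t else 0 for value in row] for row in image]

# Hypothetical 2 by 4 grayscale image with values in 0..255.
gray = [
    [10, 200, 30, 220],
    [15, 210, 25, 230],
]

binary = threshold(gray, 128)
```

Choosing the threshold well is the hard part in practice; a fixed midpoint only works when foreground and background are clearly separated in brightness.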

Vision as Inferring Information from Clues
Deriving 3D structure from 2D info requires additional information, e.g., constraints.
Deriving global descriptions from local data requires information fusion, i.e., inference.

Stereo Vision as an AI Problem
Projection from 3 dimensions to 2 loses information.
With 2 projections, we can gain back some of that information.
Recovering the missing information is an inference problem.
The missing information is constrained by knowledge about the real world and assumptions about the scene.
The use of knowledge and assumptions to make inferences is a standard approach in artificial intelligence.

Stereograms
Two-view stereograms:
1. Spatially separated left-eye/right-eye pair (including virtual-reality goggles).
2. Superimposed, with separation using color filters.
3. Superimposed, with temporal shuttering.
4. Superimposed, with separation using polarizing filters.
Single-view stereograms:
1. Magic-eye pictures with depth-modulated carrier.
2. Wallpaper offering depth effects due to its periodicity.

Geometry of Stereograms

Computing Correspondence
Approach 1: Extract features and find a consistent matching of features in each view.
Approach 2: Directly compute a disparity map, performing local correlations of the views.
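Approach 2 can be illustrated on a single pair of scanlines. The sketch below is not the method from the lecture: it swaps correlation for a sum of squared differences (SSD) over a small window, assumes the views are already aligned so that matching points lie on the same row, and assumes a feature at position x in the left view appears at x - d in the right view. The function name and parameters are hypothetical.

```python
def disparity_row(left, right, half_window=1, max_disparity=3):
    """Estimate a disparity for every pixel of one scanline.

    For each x, try shifts d = 0..max_disparity and keep the d whose
    window in `right` (centered at x - d) best matches the window in
    `left` (centered at x), using the sum of squared differences.
    """
    n = len(left)
    result = []
    for x in range(n):
        best_d, best_cost = 0, float("inf")
        for d in range(max_disparity + 1):
            cost = 0
            for k in range(-half_window, half_window + 1):
                # Clamp indices so windows near the border stay in range.
                xl = min(max(x + k, 0), n - 1)
                xr = min(max(x + k - d, 0), n - 1)
                cost += (left[xl] - right[xr]) ** 2
            if cost < best_cost:
                best_d, best_cost = d, cost
        result.append(best_d)
    return result

# Hypothetical scanlines: the pattern 9, 5, 7 is shifted by 2 pixels.
right_row = [0, 0, 9, 5, 7, 0, 0, 0, 0, 0]
left_row = [0, 0, 0, 0, 9, 5, 7, 0, 0, 0]

disparities = disparity_row(left_row, right_row)
```

In the textured region (positions 4 to 6 of the left row) the estimate is 2. In the flat regions every shift matches equally well, which is why local methods need texture or additional constraints.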

Inferring Trends via Voting Methods
The classical Hough Transform identifies prominent lines in a scene by letting each edge point vote for the line(s) it is on.
Voting methods can do well under noisy conditions.
Votes are tallied in an array of accumulators, indexed by θ and ρ (the polar parameters of a line):
ρ = x cos θ + y sin θ.

Letting a Point Vote for all the Lines that Pass Through It

Hough Transform: Polar representation
ρ = x cos θ + y sin θ.
[Figure: a line through the point (x, y); ρ is the perpendicular distance of the line from the origin (0, 0), and θ is the angle of that perpendicular.]
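A quick numerical check of the polar form: every point on one line gives the same ρ when evaluated at that line's θ. The sketch below assumes the line x + y = 2, whose normal direction is θ = π/4; the line, the points, and the function name are chosen for illustration.

```python
import math

def rho(x, y, theta):
    """Signed distance parameter of the line through (x, y) with normal angle theta."""
    return x * math.cos(theta) + y * math.sin(theta)

theta = math.pi / 4                   # normal direction of the line x + y = 2
points = [(0, 2), (1, 1), (2, 0)]     # three points on that line

rhos = [rho(x, y, theta) for x, y in points]
```

All three values equal the square root of 2 up to rounding, which is the distance from the origin to the line. This shared (θ, ρ) pair is what lets collinear points accumulate votes in one cell.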

Hough Transform (Cont.)
Nondirectional, unweighted Hough Transform:
H(θ, ρ) = Σ_x Σ_y f(x, y) δ(x cos θ + y sin θ − ρ),
where δ(x) = 1 if |x| < 1, and 0 otherwise.
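The accumulator formula can be evaluated directly for a small set of edge points. This is a minimal sketch under several assumptions not stated on the slide: f(x, y) is 1 at the listed edge points and 0 elsewhere, δ takes the values 1 and 0, θ is sampled in 1-degree steps over [0, π), and ρ takes integer values from -10 to 10. The function names are hypothetical.

```python
import math

def delta(x):
    # Window function: 1 if |x| < 1, otherwise 0 (the values 1 and 0 are assumed).
    return 1 if abs(x) < 1 else 0

def hough(points, n_theta=180, rho_max=10):
    """Return (thetas, rhos, H), where H[i][j] counts votes for (thetas[i], rhos[j])."""
    thetas = [i * math.pi / n_theta for i in range(n_theta)]
    rhos = list(range(-rho_max, rho_max + 1))
    H = [
        [sum(delta(x * math.cos(t) + y * math.sin(t) - r) for x, y in points)
         for r in rhos]
        for t in thetas
    ]
    return thetas, rhos, H

def peak(thetas, rhos, H):
    """Return (theta, rho, votes) for the first accumulator cell with the most votes."""
    best_i, best_j = 0, 0
    for i in range(len(thetas)):
        for j in range(len(rhos)):
            if H[i][j] > H[best_i][best_j]:
                best_i, best_j = i, j
    return thetas[best_i], rhos[best_j], H[best_i][best_j]

# Hypothetical edge points on the vertical line x = 3.
edge_points = [(3, y) for y in range(5)]

thetas, rhos, H = hough(edge_points)
theta_best, rho_best, votes = peak(thetas, rhos, H)
```

For these points the first maximal cell is θ = 0, ρ = 3 with 5 votes. Neighboring θ values also collect 5 votes because the window is wide relative to the point spacing, so practical implementations usually follow the tally with some form of peak sharpening or non-maximum suppression.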

Gestalt Grouping

Gestalt Grouping
Texture element = “texel”
Texel directionality
Texel granularity
Alignments of endpoints
Spacing of texels
Groups serve as cues for surfaces and objects.
