Artificial Intelligence
Chapter 6: Robot Vision
Biointelligence Lab, School of Computer Sci. & Eng., Seoul National University
(c) SNU CSE Biointelligence Lab
Introduction
Computer vision means endowing machines with the means to “see”: creating an image of a scene and extracting features from it. This is a very difficult problem for machines. Several different scenes can produce identical images, images can be noisy, and the image cannot be directly “inverted” to reconstruct the scene.
Figure 6.1 The Many-to-One Nature of the Imaging Process
Steering an Automobile
The ALVINN system [Pomerleau 1991, 1993] uses an artificial neural network to steer a vehicle. Its input is a 30 × 32 TV image (960 input nodes), which feeds 5 hidden nodes and 30 output nodes.
Training regime: modified “on-the-fly” training. A human driver drives the car, and his actual steering angles are taken as the correct labels for the corresponding inputs. Shifted and rotated images were also used for training.
ALVINN has driven for 120 consecutive kilometers at speeds up to 100 km/h.
Steering an Automobile: the ALVINN network
Figure 6.2 The ALVINN Network
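The layer sizes quoted for ALVINN (960 inputs, 5 hidden nodes, 30 outputs) can be illustrated with a minimal forward pass. This is only a sketch: the sigmoid activation, the bias terms, the random weights, and reading the most active output as a steering bin are all assumptions made for illustration, not details taken from the slides.

```python
import math
import random

N_INPUT, N_HIDDEN, N_OUTPUT = 30 * 32, 5, 30  # layer sizes from the slide

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_layer(n_in, n_out, rng):
    # One weight row per unit; the last entry of each row is a bias (assumed).
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)]
            for _ in range(n_out)]

def forward(layer, inputs):
    # zip stops at len(inputs), so row[-1] is used only as the bias.
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + row[-1])
            for row in layer]

rng = random.Random(0)
hidden_layer = make_layer(N_INPUT, N_HIDDEN, rng)
output_layer = make_layer(N_HIDDEN, N_OUTPUT, rng)

image = [rng.random() for _ in range(N_INPUT)]  # stand-in for a 30 x 32 image
hidden = forward(hidden_layer, image)
outputs = forward(output_layer, hidden)

# Hypothetical read-out: index of the most active output unit.
steering_bin = max(range(N_OUTPUT), key=lambda i: outputs[i])
```

With untrained random weights the chosen bin is meaningless; the point is only the shape of the computation, 960 values in and 30 values out.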
Two stages of Robot Vision (1/3)
To find the objects in a scene, one can look for “edges” in the image or attempt to segment the image into regions.
Edge: a part of the image across which the image intensity or some other property of the image changes abruptly.
Region: a part of the image in which the image intensity or some other property of the image changes only gradually.
Figure 6.3 Scene Discontinuities
Two stages of Robot Vision (2/3)
Image processing stage: transforms the original image into one that is more amenable to the scene analysis stage. It involves various filtering operations that help reduce noise, accentuate edges, and find regions.
Scene analysis stage: attempts to create an iconic or a feature-based description of the original scene, providing the task-specific information.
Figure 6.4 The Two Stages of Robot Vision
Two stages of Robot Vision (3/3)
The scene analysis stage produces task-specific information. If only the disposition of the blocks is important, an appropriate iconic model can be (C B A FLOOR). If it is important to determine whether there is another block on top of the block labeled C, an adequate description will include the value of a feature, CLEAR_C.
Figure 6.5 A Robot in a Room with Toy Blocks
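The difference between the iconic model and the feature-based description can be sketched in a few lines. The sketch assumes the list (C B A FLOOR) is ordered from top to bottom, so that C rests on B, B on A, and A on the floor; the helper name `is_clear` is hypothetical.

```python
def is_clear(stack, block):
    """Return True if nothing sits on top of `block`.

    `stack` is an iconic model listed from top to bottom (assumed ordering).
    """
    return stack[0] == block

# Iconic model: the whole arrangement of blocks.
iconic_model = ["C", "B", "A", "FLOOR"]

# Feature-based description: only the task-relevant fact is kept.
features = {"CLEAR_C": is_clear(iconic_model, "C")}
```

The iconic model keeps the full arrangement, while the feature dictionary keeps a single Boolean that is enough for a task such as deciding whether C can be picked up.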
Averaging (1/4)
The original image can be represented as an m × n array of numbers. The numbers represent the light intensities at corresponding points in the image. Certain irregularities in the image can be smoothed by an averaging operation, which involves sliding an averaging window all over the image array.
Averaging (2/4)
The smoothing operation thickens broad lines and eliminates thin lines and small details. The averaging window is centered at each pixel, and the weighted sum of all the pixel numbers within the averaging window is computed. This sum then replaces the original value at that pixel.
Figure 6.6 Elements of the Averaging Operation
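The sliding-window procedure described above can be sketched as follows. The sketch assumes equal weights inside a 3 × 3 window and, at the image border, averages only over the pixels that actually exist; both choices are illustrative assumptions rather than something the slides specify.

```python
def average_filter(image, size=3):
    """Replace each pixel by the mean of the size x size window around it."""
    half = size // 2
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total, count = 0.0, 0
            for dr in range(-half, half + 1):
                for dc in range(-half, half + 1):
                    rr, cc = r + dr, c + dc
                    # Border handling (assumed): skip positions off the image.
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += image[rr][cc]
                        count += 1
            out[r][c] = total / count
    return out

# A single bright pixel, standing in for a small detail or a noise spike.
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 9, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
smoothed = average_filter(image)
```

The isolated value 9 is spread over its 3 × 3 neighborhood as 1.0 per pixel, which is the sense in which averaging suppresses small details.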
Averaging (3/4)
A common function used for smoothing is a Gaussian of two dimensions. Convolving an image with a Gaussian is equivalent to finding the solution to a diffusion equation when the initial condition is given by the image intensity field.
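For reference, the standard forms of these statements can be written out. The notation is an assumption introduced here: I is the image intensity, G the Gaussian with standard deviation σ, I* the smoothed image, and u the diffusing field.

```latex
\begin{align}
  G(x, y) &= \frac{1}{2\pi\sigma^{2}}
             \exp\!\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right) \\
  I^{*}(x, y) &= (I * G)(x, y)
             = \iint I(u, v)\, G(x - u,\, y - v)\, du\, dv \\
  \frac{\partial u}{\partial t} &= \nabla^{2} u,
  \qquad u(x, y, 0) = I(x, y)
\end{align}
```

Under this form of the diffusion equation, the solution at time t is the image convolved with a Gaussian of variance σ² = 2t, so a longer diffusion time corresponds to a wider smoothing window.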
Averaging (4/4)
Figure 6.7 The Gaussian Smoothing Function
Figure 6.8 Image Smoothing with a Gaussian Filter
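A discrete version of the Gaussian smoothing function can be sketched as a small weight table. The kernel size of 5 and σ = 1.0 are arbitrary illustrative values, and normalizing the weights so they sum to 1 is an assumption that keeps the overall image brightness unchanged.

```python
import math

def gaussian_kernel(size, sigma):
    """Build a size x size table of Gaussian weights that sum to 1."""
    half = size // 2
    kernel = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
               for x in range(-half, half + 1)]
              for y in range(-half, half + 1)]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]

kernel = gaussian_kernel(5, 1.0)
```

Used as the averaging window, this table gives the most weight to the center pixel and progressively less to pixels farther away.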
Edge enhancement (1/2)
Edge: any boundary between parts of the image with markedly different values of some property. Edges are often related to important object properties. Edges in the image occur at places where the second derivative of the image intensity is zero, or more precisely where it changes sign, since the intensity is changing most rapidly there.
Edge enhancement (2/2)
Figure 6.9 Edge Enhancement
Figure 6.10 Taking Derivatives of Image Intensity
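The link between edges and the second derivative can be checked on a one-dimensional intensity profile. The sketch below approximates the second derivative by a discrete second difference and reports where it changes sign; the sample values and function names are made up for illustration.

```python
def second_difference(signal):
    """Discrete second derivative at the interior samples of `signal`."""
    return [signal[i - 1] - 2 * signal[i] + signal[i + 1]
            for i in range(1, len(signal) - 1)]

def zero_crossings(values):
    """Indices i where values[i] and values[i + 1] have opposite signs."""
    return [i for i in range(len(values) - 1)
            if values[i] * values[i + 1] < 0]

# A smooth ramp from dark (0) to bright (13).
intensity = [0, 0, 0, 1, 4, 9, 12, 13, 13, 13]
d2 = second_difference(intensity)   # [0, 1, 2, 2, -2, -2, -1, 0]
crossings = zero_crossings(d2)      # [3]
```

Because `d2[i]` belongs to `intensity[i + 1]`, the crossing at index 3 lies between samples 4 and 5 of the profile, which is where the intensity rises fastest (from 4 to 9).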
Combining Edge Enhancement with Averaging (1/2)
Edge enhancement alone would tend to emphasize noise elements along with enhancing edges. To be less sensitive to noise, both operations are needed (first averaging, then edge enhancing). We can convolve the one-dimensional image with the second derivative of a Gaussian curve to combine both operations.
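The combined operation can be sketched in one dimension by sampling the second derivative of a Gaussian and convolving it with an intensity profile. The value σ = 1.0, the kernel support of ±4 samples, and the replicated border values are assumptions chosen for illustration.

```python
import math

def gaussian_second_derivative(x, sigma):
    """Second derivative of a 1-D Gaussian with standard deviation sigma."""
    g = math.exp(-x * x / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))
    return (x * x - sigma * sigma) / sigma ** 4 * g

def convolve_same(signal, kernel):
    """Convolve and return an output of the same length as `signal`."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + half - k
            # Border handling (assumed): repeat the nearest edge sample.
            j = min(max(j, 0), len(signal) - 1)
            acc += w * signal[j]
        out.append(acc)
    return out

sigma = 1.0
kernel = [gaussian_second_derivative(x, sigma) for x in range(-4, 5)]

# An ideal step edge between samples 9 and 10.
step = [0.0] * 10 + [1.0] * 10
response = convolve_same(step, kernel)
```

The response is positive just before the step and negative just after it, so the zero crossing marks the edge, while flat regions give a response close to zero.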
Combining Edge Enhancement with Averaging (2/2)
The Laplacian is a second-derivative-type operation that enhances edges of any orientation. The Laplacian of the two-dimensional Gaussian function looks like an upside-down hat and is often called a sombrero function. The entire averaging/edge-finding operation can be achieved by convolving the image with the sombrero function (called Laplacian filtering).
Figure 6.11 The Sombrero Function Used in Laplacian Filtering
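The shape of the sombrero function can be seen by sampling the Laplacian of a two-dimensional Gaussian on a grid. The closed form below is the standard one; the 9 × 9 grid and σ = 1.4 are arbitrary illustrative choices.

```python
import math

def laplacian_of_gaussian(x, y, sigma):
    """Laplacian of a 2-D Gaussian with standard deviation sigma."""
    r2 = x * x + y * y
    g = math.exp(-r2 / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)
    return (r2 - 2 * sigma ** 2) / sigma ** 4 * g

def log_kernel(size, sigma):
    """Sample the function on a size x size grid centered at the origin."""
    half = size // 2
    return [[laplacian_of_gaussian(x, y, sigma)
             for x in range(-half, half + 1)]
            for y in range(-half, half + 1)]

kernel = log_kernel(9, 1.4)
center = kernel[4][4]   # the dip of the upside-down hat
brim = kernel[4][7]     # three pixels from the center
```

The center value is negative and the surrounding ring is positive, which matches the upside-down hat shape. Convolving an image with this table smooths and takes second derivatives in a single pass.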