1
Visual Attention
Jeremy Wyatt
2
Where to look?
Many visual processes are expensive.
Humans don't process the whole visual field.
How do we decide what to process?
How can we use insights about this to make machine vision more efficient?
3
Visual salience
Salience ~ visual prominence.
It must be cheap to calculate.
It is related to features that we collect from very early stages of visual processing.
Colour, orientation, intensity change and motion are all important indicators of salience.
4
On/Off cells
Recall centre-surround cells.
[Figure: a receptive field with ON and OFF areas and a light spot; traces over time for the light, an ON cell and an OFF cell.]
5
Colour-sensitive On/Off cells
Recall that some ganglion ON cells are sensitive to the outputs of cones.
[Figure: ON and OFF areas.]
6
An intensity change map
I = (r + g + b)/3 gives I, the intensity map.
The intensity change map is formed from a grid of on/off cells (they overlap).
There are several maps, each from cells with receptive fields at a different scale.
Each cell fires for its area.
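As a minimal sketch of the intensity formula above, assuming the image is stored as a list of rows of (r, g, b) tuples (that layout and the name `intensity_map` are illustrative choices, not taken from the slides):

```python
def intensity_map(rgb_image):
    """Return I = (r + g + b) / 3 for every pixel.

    Assumption: rgb_image is a list of rows, each row a list of (r, g, b) tuples.
    """
    return [[(r + g + b) / 3 for (r, g, b) in row] for row in rgb_image]


# A tiny 2x2 test image: black, white, pure red and a mixed colour.
image = [
    [(0, 0, 0), (255, 255, 255)],
    [(255, 0, 0), (30, 60, 90)],
]
intensity = intensity_map(image)
```

Pure red ends up at a third of the intensity of white, because only one of the three channels contributes.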
7
How do we calculate the maps?
We can create each on cell using a pair of Gaussians.
[Figure: one Gaussian minus another gives the cell's profile; labels: ON area, OFF area, light spot.]
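A pair of Gaussians can be turned into an on cell roughly as follows. This is a sketch under assumptions: the thin Gaussian is taken as the centre and the fat Gaussian as the surround, and the widths `sigma_centre` and `sigma_surround` are made-up values for illustration.

```python
import math


def gaussian(distance, sigma):
    """Normalised 2-D Gaussian evaluated at a given distance from its centre."""
    return math.exp(-distance**2 / (2 * sigma**2)) / (2 * math.pi * sigma**2)


def on_cell_weight(distance, sigma_centre=1.0, sigma_surround=3.0):
    """Thin (centre) Gaussian minus fat (surround) Gaussian.

    The sigma values are illustrative assumptions, not taken from the slides.
    """
    return gaussian(distance, sigma_centre) - gaussian(distance, sigma_surround)


centre = on_cell_weight(0.0)    # positive: light here excites the cell (ON area)
surround = on_cell_weight(4.0)  # negative: light here inhibits the cell (OFF area)
```

Negating the weight gives the matching off cell, which is inhibited by light in the centre and excited by light in the surround.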
8
How do we calculate the maps?
Imagine grids of fat and thin Gaussians.
We calculate the value of each Gaussian in each grid and then subtract one grid (here with 16 elements) from the other.
This implements our grid of on cells.
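The two grids can be sketched like this. Assumptions: each Gaussian value is a Gaussian-weighted average of the intensity image around a cell centre, the fat grid is subtracted from the thin one, and the 8x8 image, the cell centres and the sigma values are all made up for illustration.

```python
import math


def gaussian_response(image, cy, cx, sigma):
    """Gaussian-weighted average of the image around the point (cy, cx)."""
    total = 0.0
    weight_sum = 0.0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            w = math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2 * sigma**2))
            total += w * value
            weight_sum += w
    return total / weight_sum


def on_cell_grid(image, centres, sigma_thin=1.0, sigma_fat=3.0):
    """Grid of thin-Gaussian values minus grid of fat-Gaussian values."""
    thin = [[gaussian_response(image, cy, cx, sigma_thin) for cx in centres]
            for cy in centres]
    fat = [[gaussian_response(image, cy, cx, sigma_fat) for cx in centres]
           for cy in centres]
    return [[t - f for t, f in zip(thin_row, fat_row)]
            for thin_row, fat_row in zip(thin, fat)]


# Dark 8x8 image with a single bright spot at row 3, column 3.
image = [[0.0] * 8 for _ in range(8)]
image[3][3] = 1.0

centres = [1, 3, 5, 7]              # 4 x 4 = 16 cells
grid = on_cell_grid(image, centres)
```

The cell centred on the bright spot, `grid[1][1]`, gives the largest response; cells that only see the spot in their surround come out negative.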
9
Calculating the intensity change map
We do this for a mix of scales.
We have to interpolate the values of some maps to match the outputs of others (this corresponds to cells that have overlapping receptive fields).
By aligning and then combining the maps at different scales we have implemented a grid of on cells, or a grid of off cells.
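One way to align and combine maps at different scales is sketched below. Assumptions: bilinear interpolation is used to bring every map to the size of the finest one, and the aligned maps are combined by simple addition; the slides do not specify either choice.

```python
def resize_bilinear(grid, new_h, new_w):
    """Resample a 2-D grid to new_h x new_w using bilinear interpolation."""
    old_h, old_w = len(grid), len(grid[0])
    out = []
    for i in range(new_h):
        # Fractional source row that output row i corresponds to.
        y = i * (old_h - 1) / (new_h - 1) if new_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, old_h - 1)
        fy = y - y0
        row = []
        for j in range(new_w):
            x = j * (old_w - 1) / (new_w - 1) if new_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, old_w - 1)
            fx = x - x0
            top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
            bottom = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


def combine_scales(maps):
    """Align every map to the size of the first (finest) map, then add them."""
    h, w = len(maps[0]), len(maps[0][0])
    aligned = [resize_bilinear(m, h, w) for m in maps]
    return [[sum(m[i][j] for m in aligned) for j in range(w)] for i in range(h)]


fine = [[0, 0, 0], [0, 4, 0], [0, 0, 0]]   # 3x3 map from small receptive fields
coarse = [[0, 2], [2, 4]]                  # 2x2 map from large receptive fields
combined = combine_scales([fine, coarse])
```

Here the coarse map is stretched to 3x3 before being added, so its centre value becomes the average of its four cells.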
10
Other maps
We can now do this for red, green, yellow and blue.
We also do this for intensity changes of a certain orientation.
[Figure: the resulting map.]
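The slides do not say how the red, green, yellow and blue inputs are obtained from a pixel. One common choice, used in the Itti, Koch and Niebur saliency model and assumed here purely for illustration, is a set of broadly tuned colour channels with negative values clipped to zero:

```python
def colour_channels(r, g, b):
    """Broadly tuned red, green, blue and yellow channels for one pixel.

    Assumption: these formulas follow the Itti, Koch and Niebur model;
    the slides themselves do not give them.
    """
    red = max(0.0, r - (g + b) / 2)
    green = max(0.0, g - (r + b) / 2)
    blue = max(0.0, b - (r + g) / 2)
    yellow = max(0.0, (r + g) / 2 - abs(r - g) / 2 - b)
    return red, green, blue, yellow


pure_red = colour_channels(1.0, 0.0, 0.0)
pure_yellow = colour_channels(1.0, 1.0, 0.0)
white = colour_channels(1.0, 1.0, 1.0)
```

Each channel responds most strongly to its own hue and gives zero for white or grey, so the colour maps pick out colour changes rather than brightness.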
11
Combining maps to calculate saliency
We now add the maps to obtain the saliency of each group of pixels in the scene.
We normalise each map to the same range before adding.
We weight each map before combining it.
We attend to the most active point in the saliency map.
[Figure: saliency map.]
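The combination step can be sketched as follows. Assumptions: each map is normalised to the range 0 to 1 by min-max scaling, and the example maps and weights are invented for illustration.

```python
def normalise(feature_map):
    """Rescale a map so its values span the range 0 to 1."""
    values = [v for row in feature_map for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat map carries no information about where to look.
        return [[0.0 for _ in row] for row in feature_map]
    return [[(v - lo) / (hi - lo) for v in row] for row in feature_map]


def saliency_map(feature_maps, weights):
    """Weighted sum of the normalised feature maps."""
    normed = [normalise(m) for m in feature_maps]
    h, w = len(normed[0]), len(normed[0][0])
    return [[sum(wt * m[i][j] for wt, m in zip(weights, normed))
             for j in range(w)] for i in range(h)]


def most_salient(saliency):
    """Return (row, column) of the most active point in the saliency map."""
    points = [(i, j) for i in range(len(saliency)) for j in range(len(saliency[0]))]
    return max(points, key=lambda p: saliency[p[0]][p[1]])


# Made-up 2x2 feature maps on very different scales.
intensity = [[0, 10], [0, 0]]
colour = [[0, 0], [0, 2]]
orientation = [[1, 1], [1, 3]]

saliency = saliency_map([intensity, colour, orientation], weights=[1.0, 1.0, 0.5])
attended = most_salient(saliency)
```

Without normalisation the intensity map would dominate because of its larger raw values; after normalisation the bottom-right point wins because two maps agree on it.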
12
Attending to areas of the scene
We use the salience model I have described to attend to certain areas of the scene.
We can now use this salience model to make other visual processes more efficient (e.g. object recognition).
13
Learning names and appearances of objects
14
Salience can be modulated by language
15
Modulating visual salience by language: results
Number of fixations
Columns: Package, Full scene, Bottom-up salience, Modulated by context
Sprite can 14.51
Diet Coke can 173.1
Coke can 13.51
Magic 122
Lucozade bottle 112
Fanta bottle 111.911.7
16
Summary
Visual attention is guided by many features.
A good model of attention involves parts of early visual processing we have already seen.
We can use this to make object learning in robots more efficient.