A Neurodynamical Cortical Model of Visual Attention and Invariant Object Recognition. Gustavo Deco and Edmund T. Rolls, Vision Research, 2004.

1 A Neurodynamical Cortical Model of Visual Attention and Invariant Object Recognition
Gustavo Deco and Edmund T. Rolls, Vision Research, 2004

2 Outline
1. Deco and Rolls' network designed for visual object recognition and attention
   - Architecture
   - Low-level features
   - Dynamics of neuron activity
   - Learning the weights
   - How to bias attention
2. Experiments and results
3. Discussion

3 Architecture: General Features
- Hierarchical (multi-module)
- Areas: V1, V2, V4, IT, PP, PF46v and PF46d
- Two separate pathways:
  - V1 → V2 → V4 → IT → PF46v ("what" pathway)
  - (V1, V2) → MT → PP → PF46d ("where" pathway)
- Bottom-up connections: receptive fields grow larger up to IT
- Top-down connections: object and area biasing effects
- Lateral inhibitory connections within each layer

4 Architecture: Diagram
Note: columnar feature stacking (depth) in all layers except PP.

5 Low-Level (Input) Features: Gabor Filters
- Product of two functions:
  1. Complex plane wave
  2. Gaussian envelope
- Daugman (1985): general 2D form
- Lee (1996), "Image Representation Using 2D Gabor Wavelets," IEEE TPAMI:
  - Derived a constrained form (used in this paper) for a family of Gabor filters
  - Satisfies neurophysiological constraints and completeness
Left: family of 1.5-octave-bandwidth filters covering the spatial-frequency plane, satisfying Lee's constraints.

6 Low-Level (Input) Features: Gabor Filters
- Remaining degrees of freedom:
  - Spatial center position (p, q)
  - Spatial resolution (octave) k
  - Orientation index l
- "Mother wavelet" of the filter family
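To make the parameterization concrete, here is a minimal NumPy sketch of a complex 2D Gabor patch indexed by center (p, q), octave k and orientation l. The frequency and bandwidth constants are placeholders and do not reproduce the specific constraints derived by Lee (1996).

```python
import numpy as np

def gabor_filter(size, p, q, k, l, num_orientations=8, sigma_per_wavelength=0.56):
    """Complex 2D Gabor patch centered at (p, q), octave index k, orientation index l.

    The constants below (base frequency, envelope width) are placeholders; Lee (1996)
    derives specific bandwidth and normalization constraints not reproduced here.
    """
    theta = l * np.pi / num_orientations        # orientation of the plane wave
    freq = 2.0 ** (-k)                          # spatial frequency halves per octave (placeholder scale)
    sigma = sigma_per_wavelength / freq         # Gaussian envelope scales with wavelength

    y, x = np.mgrid[0:size, 0:size].astype(float)
    xr = (x - p) * np.cos(theta) + (y - q) * np.sin(theta)
    yr = -(x - p) * np.sin(theta) + (y - q) * np.cos(theta)

    envelope = np.exp(-(xr ** 2 + yr ** 2) / (2.0 * sigma ** 2))   # Gaussian envelope
    carrier = np.exp(1j * 2.0 * np.pi * freq * xr)                 # complex plane wave
    return envelope * carrier
```

Filtering an image patch then amounts to a complex inner product with this kernel, e.g. `np.abs(np.sum(np.conj(g) * patch))`.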

7 Features of the V1 Module
- Nv1 x Nv1 hypercolumns covering the N x N scene
- Each hypercolumn has L orientation columns at different spatial frequencies
- Magnification factor: more high-spatial-resolution filters near the fovea
  - Modeled by a Gaussian centered at the fovea
- Finally, note that the input to the filters is the image with its DC component removed (mean intensity subtracted)
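A small sketch of the two preprocessing ideas on this slide, under assumed constants: removing the DC component by subtracting the mean intensity, and a Gaussian magnification factor centered at the fovea that governs how densely high-resolution filters are allocated (the width `sigma` is invented).

```python
import numpy as np

def remove_dc(image):
    """Subtract the mean intensity so the Gabor filters see a zero-mean input."""
    return image - image.mean()

def magnification(p, q, fovea, sigma=10.0):
    """Gaussian magnification factor centered at the fovea (sigma is a placeholder).

    Larger values near the fovea mean more high-resolution filters are allocated there.
    """
    d2 = (p - fovea[0]) ** 2 + (q - fovea[1]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))
```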

8 Neuron "Pool" Dynamics
- Mean-field approximation: track the average activity level of a pool of neurons
- Dynamical equations govern the pool activity levels
- Wilson and Cowan (1972); Gerstner (2000)
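The equations themselves were on slides 9–11 and did not survive the transcript. As a reminder of the generic Wilson & Cowan-style form such mean-field pool models take, written here with assumed input terms rather than the paper's exact ones:

```latex
\tau \frac{\partial A_l(t)}{\partial t} = -A_l(t)
  + F\!\left( I_l^{\mathrm{sensory}}(t) + I_l^{\mathrm{top\text{-}down}}(t)
  - I_l^{\mathrm{inhib}}(t) + I_0 \right)
```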

9

10

11

12 Non-linear response function F
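The plot of F on this slide was lost in extraction. As a stand-in only (not necessarily the transduction function used in the paper), a thresholded logistic sigmoid gives the qualitative monotonic, saturating shape:

```python
import numpy as np

def response_F(x, gain=1.0, threshold=0.0):
    """Placeholder non-linear response function: a thresholded logistic sigmoid.

    Gain and threshold are invented constants; this only stands in for a
    monotonic, saturating, thresholded transfer function.
    """
    return 1.0 / (1.0 + np.exp(-gain * (x - threshold)))
```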

13 Temporal Evolution of the Entire System
- p, q: spatial position
- k: spatial resolution (V1 only)
- l: pool index
- V1:

14 Temporal Evolution of the Entire System
- p, q: spatial position
- k: spatial resolution (V1 only)
- l: pool index
- V1:
- Other modules:
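Putting the pieces together, a schematic Euler update for all pool activities at once. Here `compute_input` is a hypothetical callback bundling the bottom-up drive (Gabor responses for V1), top-down bias, lateral inhibition and background input, and `tau` is an assumed time constant; dt = 1 ms matches the integration step mentioned on slide 19.

```python
import numpy as np

def evolve(A, compute_input, tau=0.016, dt=0.001, steps=1000):
    """Euler-integrate  tau * dA/dt = -A + F(I)  for all pools at once.

    A:             current pool activities, flattened over all modules.
    compute_input: hypothetical callback returning each pool's total input
                   (bottom-up drive, top-down bias, lateral inhibition, background).
    tau:           placeholder time constant; dt = 1 ms matches the paper's Euler step.
    """
    for _ in range(steps):
        I = compute_input(A)
        F = 1.0 / (1.0 + np.exp(-I))      # placeholder sigmoid (see the sketch under slide 12)
        A = A + dt * (-A + F) / tau
    return A
```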

15 Top-Down Biasing
1. Spatial attention, e.g.:
   - Weights from PP as Gaussians over spatial position (see the sketch below)
2. Feature-based attention is similar, but uses explicit weights
- Note: top-down input is not used during the learning phase
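A sketch of how the Gaussian spatial bias from PP might look in code; `positions`, `sigma` and `strength` are assumed names and constants, not taken from the paper.

```python
import numpy as np

def spatial_bias(positions, attended_xy, sigma=3.0, strength=0.1):
    """Gaussian top-down bias over retinotopic positions (placeholder constants).

    positions:   (N, 2) array of pool coordinates.
    attended_xy: attended location (x, y).
    """
    d2 = ((positions - np.asarray(attended_xy, dtype=float)) ** 2).sum(axis=-1)
    return strength * np.exp(-d2 / (2.0 * sigma ** 2))
```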

16 Lateral Inhibition
- The only negative term in the input, defined as:
- Decays with distance via Gaussian modulation (see the sketch below)
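A sketch of a purely negative, Gaussian-modulated inhibition term under assumed constants: each pool is suppressed by its neighbours' activity, with the suppression falling off with distance.

```python
import numpy as np

def lateral_inhibition(A, positions, sigma=2.0, strength=0.05):
    """Gaussian-weighted inhibitory input for each pool (placeholder constants).

    A:         pool activities, shape (N,).
    positions: pool coordinates, shape (N, 2).
    Returns a purely negative contribution to each pool's input.
    """
    diff = positions[:, None, :] - positions[None, :, :]      # pairwise offsets
    w = np.exp(-(diff ** 2).sum(-1) / (2.0 * sigma ** 2))     # Gaussian fall-off with distance
    np.fill_diagonal(w, 0.0)                                  # no self-inhibition
    return -strength * w @ A
```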

17 "Trace" Learning Rule
- How do they learn invariance?
- Slowly change the object's appearance
- Retain an output trace for every pool (slow decay)
- Hebbian-like update rule for the change in weight strength (see the sketch below):
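A compact sketch of a trace-style Hebbian update consistent with the bullets above; `eta` (trace decay) and `alpha` (learning rate) are placeholder values.

```python
import numpy as np

def trace_update(w, pre, post, trace, eta=0.6, alpha=0.01):
    """One Hebbian trace-rule step (eta and alpha are placeholder constants).

    trace: slowly decaying memory of each post-synaptic pool's recent output,
           updated as (1 - eta) * post + eta * trace.
    The weight change is proportional to the trace times the current
    pre-synaptic activity, so temporally adjacent views of an object
    strengthen the same output pools.
    """
    trace = (1.0 - eta) * post + eta * trace
    w = w + alpha * np.outer(trace, pre)
    return w, trace
```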

18 How They Control Attention
- Two modes:
  1. Object recognition
  2. Visual search

19 Experiments: Simulating fMRI Data
- Simultaneously presented stimuli compete
- Attentional modulation: measure the effect of attention vs. no attention
- Two conditions: simultaneous and sequential presentation
- 250 ms stimulus presentation, 750 ms blank
- 1 ms per time step (integrated with the Euler method)
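A tiny helper capturing the presentation schedule described here (250 ms on, 750 ms off, checked at every 1 ms Euler step). How the two conditions would reuse it, e.g. a shared on-window for simultaneous presentation versus alternating on-windows for sequential presentation, is an assumption about the setup rather than a detail taken from the slide.

```python
def stimulus_on(t_ms, on_ms=250, off_ms=750):
    """True while the stimulus is shown: 250 ms on followed by 750 ms blank, repeating."""
    return (t_ms % (on_ms + off_ms)) < on_ms
```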

20 Experiments: IT Receptive Field
- One object; learn to recognize it in all positions
- Test with no background vs. a cluttered (natural-scene) background
- With and without object attention

21 Experiments: Distractor Object and Placement

22 Experiment: Visual Search and Object Attention
Stimuli: monkey face on a cluttered background

23 Discussion
- Effective at performing object recognition and visual attention
- Straightforward implementation
- But many parameters
- Scalability has not been demonstrated (only two IT neurons)
- Thoughts?

24 Receptive Field Convergence

