A Neurodynamical Cortical Model of Visual Attention and Invariant Object Recognition. Gustavo Deco and Edmund T. Rolls. Vision Research, 2004.


Outline
1. Deco and Rolls' network designed for visual object recognition and attention
   - Architecture
   - Low-level features
   - Dynamics of neuron activity
   - Learning the weights
   - How to bias attention
2. Experiments and results
3. Discussion

Architecture: General Features
- Hierarchical (multi-module)
- Areas: V1, V2, V4, IT, PP, PF46v and PF46d
- Two separate pathways
  - V1 – V2 – V4 – IT – PF46v ("What")
  - (V1, V2) – MT – PP – PF46d ("Where")
- Bottom-up connections: larger receptive fields up to IT
- Top-down connections: object and area biasing effects
- Lateral inhibitory connections in each layer

Architecture: Diagram
Note: Columnar feature stacking (depth) in all layers except PP

Low-Level (Input) Features: Gabor Filters
- Product of two functions:
  1. Complex plane wave
  2. Gaussian envelope
- Daugman (1985): general 2D form
- Lee (1996): "Image Representation Using 2D Gabor Wavelets", in PAMI
  - Derived a constrained form (used in this paper) for a family of Gabor filters
  - Satisfies neurophysiological constraints and completeness
Figure (left): family of 1.5-octave-bandwidth filters covering the spatial frequency plane, satisfying Lee's constraints

Low-Level (Input) Features: Gabor Filters
- Remaining DOF:
  - Spatial center position (p, q)
  - Spatial resolution k
  - Orientation: l
- "Mother wavelet" in filter family
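The "plane wave times Gaussian envelope" construction can be sketched in a few lines of Python. This is a minimal illustration, not the deck's own code: the envelope shape and the bandwidth constant `kappa` follow the commonly cited form of Lee's (1996) mother wavelet and should be checked against that paper, and the names `kappa_for_bandwidth`, `mother_wavelet` and `gabor_kernel`, as well as the grid size and scale, are hypothetical.

```python
import cmath
import math

def kappa_for_bandwidth(octaves):
    """Constant linking the wave number to the bandwidth in octaves (assumed Lee-style form)."""
    return math.sqrt(2.0 * math.log(2.0)) * (2.0 ** octaves + 1.0) / (2.0 ** octaves - 1.0)

def mother_wavelet(x, y, kappa):
    """Gaussian envelope multiplied by a DC-corrected complex plane wave along x."""
    envelope = math.exp(-(4.0 * x * x + y * y) / 8.0) / math.sqrt(2.0 * math.pi)
    # Subtracting exp(-kappa^2 / 2) makes the continuous wavelet integrate to zero.
    wave = cmath.exp(1j * kappa * x) - math.exp(-kappa * kappa / 2.0)
    return envelope * wave

def gabor_kernel(size, scale, theta, kappa):
    """Sample a rotated, dilated copy of the mother wavelet on a size x size grid (size odd)."""
    half = size // 2
    kernel = []
    for row in range(-half, half + 1):
        line = []
        for col in range(-half, half + 1):
            # Rotate the pixel coordinates by theta, then dilate by `scale`.
            xr = (col * math.cos(theta) + row * math.sin(theta)) / scale
            yr = (-col * math.sin(theta) + row * math.cos(theta)) / scale
            line.append(mother_wavelet(xr, yr, kappa) / scale)
        kernel.append(line)
    return kernel

kappa = kappa_for_bandwidth(1.5)  # 1.5-octave bandwidth, as on the slide
kernel = gabor_kernel(size=21, scale=2.0, theta=math.pi / 4, kappa=kappa)
```

Varying `scale` and `theta` over a discrete set, and shifting the kernel centre, yields a filter family indexed by resolution, orientation and position, matching the degrees of freedom listed above.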

Features of V1 Module
- Nv1 x Nv1 hypercolumns covering the N x N scene
- Each hypercolumn has L orientation columns with different spatial frequencies
- Magnification factor: more high-spatial-resolution filters nearer the fovea
  - Modeled by a Gaussian centered at the fovea
- Finally, note that the input to the filters is the image without its DC component
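Removing the DC component amounts to subtracting the mean intensity from every pixel before filtering. A minimal sketch, assuming the image is a list of rows of numbers; the function name `remove_dc` is hypothetical.

```python
def remove_dc(image):
    """Subtract the mean pixel value so that the image sums to zero."""
    n_pixels = sum(len(row) for row in image)
    mean = sum(sum(row) for row in image) / n_pixels
    return [[pixel - mean for pixel in row] for row in image]

centered = remove_dc([[1.0, 2.0], [3.0, 6.0]])
```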

Neuron "Pool" Dynamics
- Mean-field approximation: average activity level of a pool of neurons
- Dynamical equations for neuronal pool activity levels
  - Wilson and Cowan (1972), Gerstner (2000)

Non-linear response function F

Temporal Evolution of Entire System
p, q: spatial position; k: spatial resolution (only for V1); l: pool number
- V1:

Temporal Evolution of Entire System
p, q: spatial position; k: spatial resolution (only for V1); l: pool number
- V1:
- Other modules:
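The equations themselves did not survive in this transcript, so the following is only a generic sketch of mean-field pool dynamics in the Wilson-Cowan style, integrated with the Euler method at 1 ms per step. Everything specific is an assumption: the form of the response function F, the single shared inhibitory pool, the function names `response` and `simulate`, and all parameter values are illustrative placeholders rather than values taken from the slides.

```python
import math

def response(x, tau=7.0, t_ref=1.0):
    """Assumed transfer function F: zero below threshold, saturating at 1 / t_ref."""
    if tau * x <= 1.0:
        return 0.0
    arg = 1.0 - 1.0 / (tau * x)
    if arg <= 0.0:
        return 0.0
    return 1.0 / (t_ref - tau * math.log(arg))

def simulate(inputs, bias, steps=500, dt=1.0, tau=7.0, tau_inh=20.0,
             a=0.95, b=0.8, c=2.0, d=0.1, i0=0.025):
    """Euler integration of excitatory pools coupled through one inhibitory pool."""
    act = [0.0] * len(inputs)  # excitatory pool activities
    inh = 0.0                  # shared inhibitory pool activity
    for _ in range(steps):
        rates = [response(x) for x in act]
        inh_rate = response(inh)
        # Decay, self-excitation, common inhibition, bottom-up input,
        # top-down bias and a diffuse background input i0.
        act = [x + dt / tau * (-x + a * r - b * inh_rate + i + t + i0)
               for x, r, i, t in zip(act, rates, inputs, bias)]
        inh += dt / tau_inh * (-inh + c * sum(rates) - d * inh_rate)
    return act, inh

# Two pools with equal bottom-up input; only the first receives a top-down bias.
act, inh = simulate(inputs=[0.2, 0.2], bias=[0.05, 0.0])
```

Because both pools share the same inhibition, the additive bias is the only thing that separates them, which is the essence of biased competition: the biased pool ends up with the higher activity.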

Top-Down Biasing
1. Spatial attention, e.g.:
- Weights from PP as Gaussians based on spatial position
- Feature-based attention is similar, but using explicit weights
- Note that top-down input is not used in the learning phase
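A Gaussian spatial bias of this kind can be sketched as a map over pool positions (p, q) that peaks at the attended location. This is a minimal illustration under assumed values: the function name `spatial_bias`, the grid size, `sigma` and `gain` are hypothetical and not taken from the slides.

```python
import math

def spatial_bias(n, center, sigma=2.0, gain=0.1):
    """Top-down bias for each position (p, q) on an n x n grid, peaked at `center`."""
    p0, q0 = center
    return [[gain * math.exp(-((p - p0) ** 2 + (q - q0) ** 2) / (2.0 * sigma ** 2))
             for q in range(n)]
            for p in range(n)]

bias = spatial_bias(n=9, center=(4, 4))
```

Each entry would be added to the input of the pools at that position, so pools near the attended location gain an advantage in the competition.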

Lateral Inhibition
- Only negative term, defined as:
- Decays with distance by Gaussian modulation
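Since the defining equation is missing from the transcript, the following is only a sketch of the idea: each pool receives a purely negative input from its neighbours, weighted by a Gaussian of the distance between them. The one-dimensional layout, the exclusion of self-inhibition, the function name `lateral_inhibition` and the parameters `strength` and `sigma` are all assumptions.

```python
import math

def lateral_inhibition(rates, strength=0.5, sigma=1.5):
    """Negative input to each pool from all other pools on a 1D line."""
    n = len(rates)
    out = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if j != i:
                # Gaussian fall-off with the distance between pools i and j.
                total += math.exp(-((i - j) ** 2) / (sigma ** 2)) * rates[j]
        out.append(-strength * total)
    return out

inhibition = lateral_inhibition([0.0, 0.0, 1.0, 0.0, 0.0])
```

With a single active pool in the middle, its immediate neighbours are inhibited most strongly and the inhibition fades toward the ends of the line.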

"Trace" Learning Rule
- How do they learn invariance?
  - Slowly change object appearance
  - Retain a trace output for all pools (slow decay)
- Hebbian-like updating rule for change in weight strength:
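The update equation is missing from the transcript. A minimal sketch of a trace rule in its commonly used form is given below: the trace is a slowly decaying average of the pool's output, and the weight change is the Hebbian product of that trace with the current input. The function name `trace_update` and the values of `eta` and `alpha` are assumptions, and any weight normalization is omitted.

```python
def trace_update(weights, inputs, output, prev_trace, eta=0.8, alpha=0.1):
    """One step of a Hebbian-like trace rule for a single pool."""
    # The trace mixes the current output with its own previous value (slow decay).
    trace = (1.0 - eta) * output + eta * prev_trace
    new_weights = [w + alpha * trace * x for w, x in zip(weights, inputs)]
    return new_weights, trace

# View 1 of an object drives input 0 and activates the pool.
weights, trace = trace_update([0.0, 0.0], inputs=[1.0, 0.0], output=1.0, prev_trace=0.0)
# View 2 drives input 1; the pool is silent, but the lingering trace still strengthens that weight.
weights, trace = trace_update(weights, inputs=[0.0, 1.0], output=0.0, prev_trace=trace)
```

Because the trace outlasts the stimulus, inputs that occur close together in time, such as successive views of a slowly changing object, become associated with the same pool.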

How They Control Attention
- Two modes:
  1. Object recognition
  2. Visual search

Experiments: Simulate fMRI Data
- Simultaneous stimuli compete
- Attentional modulation: measure the effect of attention vs. no attention
- Two conditions: simultaneous and sequential
- 250 ms presentation, 750 ms without
- 1 ms per time step (solved via Euler method)

Experiments: IT Receptive Field
- One object; learn to recognize it in all positions
- Test with no background vs. cluttered background (natural scene)
- With and without object attention

Experiments: Distractor Object and Placement

Experiment: Visual Search and Object Attention
Stimuli: Monkey face on cluttered background

Discussion!
- Effective in performing object recognition and visual attention
- Straightforward implementation
- But many parameters
- Scalability has not been demonstrated (only two IT neurons)
- Thoughts?

Receptive Field Convergence