Visual Saliency: the signal from V1 to capture attention
Li Zhaoping
Head, Laboratory of Natural Intelligence, Department of Psychology, University College London
December 2002

Visual tasks: object recognition and localization. Pre-condition: object segmentation. Question: how can one segment an object before recognizing it? (How can there be the egg before the chicken?) I recently proposed (Li 1998, 1999, 2000; in particular Li, TICS, Jan. 2002) pre-attentive segmentation by highlighting conspicuous image areas where homogeneity breaks down; these areas are candidate locations for object boundaries and thus serve segmentation. V1 produces a saliency map containing such highlights through intracortical interactions.
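To make the homogeneity-breakdown idea concrete, here is a minimal Python sketch (my own illustration, not the V1 model; the function name, neighbourhood size, and test texture are all hypothetical). It scores each texture element by how much its local feature value differs from those of its neighbours; high scores mark candidate object boundaries.

```python
import numpy as np

def homogeneity_breakdown(feature_map, neighborhood=1):
    """Score each location by the mean absolute difference between its
    feature value and those of its neighbours.  High scores mark places
    where homogeneity breaks down, i.e. candidate object boundaries.
    (A generic illustration of the idea, not the V1 model itself.)"""
    f = np.asarray(feature_map, dtype=float)
    score = np.zeros_like(f)
    h, w = f.shape
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - neighborhood), min(h, y + neighborhood + 1)
            x0, x1 = max(0, x - neighborhood), min(w, x + neighborhood + 1)
            score[y, x] = np.abs(f[y0:y1, x0:x1] - f[y, x]).mean()
    return score

# A texture of 0-degree bars with a 90-degree region on the right half:
texture = np.zeros((8, 8))
texture[:, 5:] = 90
print(homogeneity_breakdown(texture).round(1))
# The highest scores line up along the texture border (columns 4 and 5),
# which is where a pre-attentive segmentation should place its highlights.
```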

V1 produces a saliency map. [Figure: contrast input to V1 is passed through the V1 model (intra-cortical interactions), and the saliency output from the model highlights important image locations.] The V1 model is based on V1 physiology and anatomy (e.g., horizontal connections linking cells tuned to similar orientations) and has been tested to be consistent with physiological data on contextual influences (e.g., iso-orientation suppression, Knierim and van Essen 1992; co-linear facilitation, Kapadia et al. 1995).
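In the published model these contextual influences arise from recurrent dynamics among V1 neurons (Li 1998, 1999). Purely as a rough feed-forward caricature of the two effects just named, the sketch below suppresses each bar's response in proportion to nearby parallel bars and mildly boosts it for parallel neighbours lying along the bar's own axis; every function name and coefficient here is a hypothetical choice made for illustration, not a value from the model.

```python
import numpy as np

def toy_v1_responses(orientations, spacing=1.0, radius=1.5,
                     w_suppress=0.12, w_facilitate=0.05):
    """Caricature of V1 contextual interactions (illustration only).

    orientations: dict mapping grid position (x, y) -> bar orientation
    in degrees (0-180).  Every bar starts with the same feed-forward
    response (1.0); nearby bars of similar orientation suppress it
    (iso-orientation suppression), and similarly oriented neighbours
    that also lie along the bar's own axis boost it slightly
    (co-linear facilitation)."""
    positions = list(orientations)
    responses = {}
    for p in positions:
        r = 1.0  # identical feed-forward drive for every bar
        for q in positions:
            if q == p:
                continue
            dx, dy = q[0] - p[0], q[1] - p[1]
            if np.hypot(dx, dy) * spacing > radius:
                continue  # only near neighbours interact
            d_ori = abs(orientations[p] - orientations[q]) % 180.0
            d_ori = min(d_ori, 180.0 - d_ori)
            if d_ori < 20.0:                 # similarly oriented neighbour
                r -= w_suppress              # iso-orientation suppression
                axis = np.degrees(np.arctan2(dy, dx)) % 180.0
                d_axis = abs(axis - orientations[p]) % 180.0
                if min(d_axis, 180.0 - d_axis) < 20.0:
                    r += w_facilitate        # co-linear facilitation
        responses[p] = max(r, 0.0)
    return responses
```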

Pop-out. Saliency of an item is assumed to increase with its evoked V1 response, and the efficiency of a visual search task is assumed to increase with the salience of the target (or of its most salient part, e.g., the horizontal bar in the target cross above). Salience is measured by the z score z = (S - S̄)/σ, where S is an item's evoked response and S̄ and σ are the mean and standard deviation of the histogram of all responses S in the image, regardless of the features that evoked them. [Figure: original input, V1 responses, and the response histogram; example responses S = 0.2, 0.4, 0.12, 0.22 with z scores 1.0, 7, -1.3, 1.7.] The high z score of the horizontal bar, z = 7, a measure of the cross's salience, lets the cross pop out: its evoked V1 response (to the horizontal bar) is much higher than the average population response to the whole image. The cross has a unique feature, the horizontal bar, which evokes the highest response because it experiences no iso-orientation suppression while all the distractors do. Hence, intra-cortical interaction is a neural basis for why feature searches are often efficient.
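A minimal sketch of the z-score computation just described, assuming the per-item V1 responses are already in hand (the response values below are invented for illustration and are not those in the figure):

```python
import numpy as np

def saliency_z_scores(responses):
    """z = (S - mean(S)) / std(S), computed over the histogram of all
    responses S in the image, regardless of which features evoked them."""
    r = np.asarray(responses, dtype=float)
    return (r - r.mean()) / r.std()

# Hypothetical scene: 35 distractor bars that all suppress one another
# (responses near 0.2) plus one uniquely oriented bar that escapes
# iso-orientation suppression and responds more strongly (0.4).
rng = np.random.default_rng(0)
responses = np.append(0.2 + 0.02 * rng.standard_normal(35), 0.4)
z = saliency_z_scores(responses)
print(z.argmax(), round(float(z.max()), 1))  # the last item has by far the largest z
```

Fed with the output of the toy_v1_responses sketch above (e.g., a grid of vertical bars containing one horizontal bar), the same computation assigns the uniquely oriented bar the largest z score, i.e. it pops out.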

V1's output is viewed as a saliency map under the idealization that top-down feedback to V1 is disabled, e.g., shortly after visual exposure or under anesthesia. The map signals saliency regardless of object features: contrary to common belief, cells signaling saliency can also be tuned to features, so V1 can produce a saliency map even though its cells are feature-tuned. The firing rate of V1 neurons is the currency for saliency, just as a US dollar serves as currency regardless of the particular nationality of its holder, Chinese, American, or other.

The V1 saliency map agrees with visual search behavior. [Figure: inputs and V1 outputs. Feature search (target = +, z = 7): pop out. Conjunction search: serial search.]

The saliency map also explains spatial and distractor effects. [Figure: the same target on different backgrounds (inputs and V1 outputs). Homogeneous background with identical, regularly placed distractors: z = 3.4, the easiest search. Irregularly placed distractors: z = 0.22. Distractors dissimilar to each other: z = 0.25.]

The V1 saliency map also explains:
- Visual search asymmetry: e.g., searching for a longer line among shorter ones is easier than the reverse, and searching for a circle with a gap among closed circles is easier than the reverse.
- Why some conjunction searches are easier than others: e.g., searching for a motion-orientation conjunction is easier than searching for a color-orientation conjunction.
The V1 saliency map has also made testable predictions that were confirmed by subsequent tests. E.g., Snowden (1998) found that texture segmentation by the orientation of texture bars is impaired by random color variations of the bars; the model's prediction that this impairment is reduced by using thinner bars was tested and confirmed (Zhaoping & Snowden 2002).

Potential interactions with other team members of the collaboration (self-centered):
- Pre-attentive segmentation, V1 saliency map: Li Zhaoping
- Feature binding: Chen Lin
- Visual attention: He Sheng
- Perceptual learning: Lu Zhonglin
- Mathematical modeling: Zhang Jun
- Artificial vision: Zhang Jiajie
- Visual physiology: Cheng Kang
Linking themes: learning vs. attention, engineering, feature coding dynamics and representation, neural mechanisms, top-down and bottom-up interactions.