Modeling Visual Attention and some other things

Developing and Evolving Neural Networks
by David Northmore (24 February 2014)

Zenon Pylyshyn: Multiple Object Tracking and FINSTs
FINSTs: an interface between the world of stimuli and the world of concepts. Tracking depends on "...the figure's identity over time, or its persisting individuality. ... we have a mechanism that allows preconceptual tracking of a primitive perceptual individuality". (?*#!?)
How could a FINST be implemented in the brain, e.g. by a neural network?
Uses of a pointer: directing eye movements; selection of features; conjunction of features; serial operations; saving to working memory; constructing a subjective "panorama".

IAC Modeling
Using a simple, yet plausible neural architecture. The IAC architecture is inspired by the Jets & Sharks model of McClelland. Turing machine analogy: if an IAC network can do it, the brain certainly can too. The Zen of modeling: insight, enlightenment, etc. More of us should try it.
Other models: Itti & Koch (2001), Niebur et al. (1993), Kazanovich & Borisyuk (2006), Yilmaz (2012). These employ "advanced features" like oscillations, synchrony, or computer science.

Model Units
IAC units, with simple first-order dynamics, are used here; synapses are excitatory or inhibitory. The computational units come from Interactive Activation and Competition (IAC) models (McClelland & Rumelhart, 1991). Each unit's activation A moves between MinA and MaxA and relaxes toward RestA.
Update rule for activation A:
∆A := (MaxA − A)·ExcitInp − (A − MinA)·InhibInp − (A − RestA)·DecayA
A := A + ∆A
Output := [A]+ (the positive part of A)
I also used no-delay, linear-threshold units ("AUX units"), usually for inhibitory layers.
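The IAC update rule above can be sketched as follows. This is a minimal single-unit implementation; the parameter values (MaxA = 1.0, MinA = −0.2, RestA = 0.0, DecayA = 0.1) are illustrative assumptions, not values taken from the model.

```python
def iac_update(a, excit_inp, inhib_inp,
               max_a=1.0, min_a=-0.2, rest_a=0.0, decay=0.1):
    """One time step of a single IAC unit's first-order dynamics."""
    delta = ((max_a - a) * excit_inp
             - (a - min_a) * inhib_inp
             - (a - rest_a) * decay)
    a = a + delta
    output = max(a, 0.0)  # [A]+ : only positive activation is transmitted
    return a, output

# Sustained excitatory input drives A up toward (but never past) MaxA;
# removing the input lets A decay back toward RestA.
a = 0.0
for _ in range(100):
    a, out = iac_update(a, excit_inp=0.5, inhib_inp=0.0)
```

Note how the (MaxA − A) and (A − MinA) factors automatically bound the activation: the closer A gets to a limit, the weaker the drive toward it.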

Connections
Units are all arranged in 2-D layers, typically 14 x 10. The input layer analyzes color, motion, or shape from a scene bitmap. Connections are made topographically from layer to layer by generic, quasi-developmental rules with two parameters: connection radius and weight (negative weights are inhibitory), with a Gaussian spatial distribution of weights.
Connection specs:
Connect ColorLayer to Layer0
8 //radius
0.5 //weight
Connect Layer0 to Layer1
10 //radius
0.25 //weight
30 //radius
-0.1 //weight
Connect Layer1 to Layer1
1000 //radius
-0.05 //weight

A Pointer Network
A 3-layer pointer network: gain layer, inhibitory layer, pointer layer. Its job: latch onto one stimulus, ignore others, and deal with motion. The attended stimulus is processed advantageously, e.g. amplified responses, faster RT, better recognition. This 3-layer pointer network "wants to work": the response to the attended/tracked object is amplified compared to distractor objects. An additional connection (shown in the figure) improves the faithfulness of tracking by inhibiting distractors.

Probe Stimulus Experiments
Flashed probe stimuli showed detection enhancement at a target, and detection inhibition at a non-target, both in human subjects performing MOT (Pylyshyn, 2006) and in the 3-layer pointer model tracking a single target. Gain-layer responses in the 3-layer pointer to probe stimuli show the same pattern: enhanced at the target, inhibited at non-targets.

Multiple Object Tracking
Tracking of N objects can be achieved by arranging N pointer layers connecting to a common gain layer and inhibitory layer. However, it seems more efficient to employ all pointer resources all the time by partitioning them according to tracking demand. This is achieved with a "pointer slab" consisting of several layers (six here, L1-L6; gain layers omitted for clarity). Different patterns of inter-layer inhibitories (short-, medium- and long-range) partition the slab for tracking 1-6 targets. Devoting more pointer resources to a given target should increase the accuracy or robustness of tracking, especially in the presence of noise.
Tracking 2 targets with 6 distractors, by enabling only the long-range inhibitories: the top half of the slab tracks one target, the bottom half another (500 time steps; randomly moving discs; squares show pointers; pointer error plotted vs. time).
Tracking 4 targets with 4 distractors, by enabling half the short-range inhibitories plus the medium- and long-range connections: note that sublayers 1, 2, 3 & 6 track different targets, while 4 & 5 track the same target.
Tracking different numbers of targets requires intricate adjustment of intra- and inter-layer inhibition, presumably by top-down control guided by a representation of "intended targets" and modified by perceived performance. It didn't seem fruitful to pursue MOT further without understanding top-down control, or change detection (next slide), which determines what gets to be tracked.

Change Detection
Change grabs attention; attention grabs change. Here I compare two networks that detect change, e.g. in the color of a moving object. In each, a change (onset) layer compares the current state against a past state held in memory. A seemingly minor alteration in the spread of one connection has important consequences.
Small-field, strictly retinotopic (cf. retina to V1): exquisitely sensitive to change and movement over small regions.
Large-field, vaguely retinotopic: the change layer signals feature change, uncontaminated by movement. A useful step toward "objecthood", but subject to crowding.

Change Detection: Onset Detector Responses
This network has 3 color-detecting layers (Red, Green, Blue), 3 color-onset layers (Red+ etc.) and 3 color-memory layers (RedMem etc.). The latter are IAC units with a long decay time. Compare the effect of doubling the spread of the inhibitory connections from the memory layers to the onset layers.
Input display, 3 phases: (1) colored discs are stationary and change color one by one; (2) all discs rotate without switching colors; (3) discs change color while rotating.
In the crowded condition, the memory layers with wide inhibitory spread create inhibitory patches over the change-detector layers that merge together and largely silence the color-change responses during movement. An explanation of the Suchow-Alvarez illusion, maybe? See next slide.

Suchow-Alvarez Illusion
When fixating the central spot with the colored discs stationary, changes in their color are easily seen; when the discs rotate about the center, changes in their color go largely unseen. The foregoing model suggests that the color-change detectors are silenced because of crowding of the discs.

Visual Short-term Memory (Luck & Vogel, 1997)
"...visual working memory stores up to 4 integrated objects." Once I had the memory units and change detectors working, it was only a small step to implement the paradigm widely used to study visual short-term memory (see next slide). In Luck & Vogel's experiments, a set of stimuli (e.g. left above) is presented to a subject, followed by a blank screen for a period of time, followed by the same set of stimuli with one feature changed. The subject had to indicate whether a change had occurred. Performance was nearly perfect for up to 4 stimulus objects and declined with more objects. Accuracy was unaffected by which feature dimension was changed, or by whether change occurred in one or two feature dimensions, leading the authors to conclude that "visual working memory stores integrated object percepts rather than individual features". However, the next network, which does not deal in "objects", at least not explicitly, yields very similar results to Luck & Vogel's.

Visual Short-term Memory in a Model Network
The network has feature-detector layers (Vertical, Horizontal, White, Black), onset detectors (Vert+, Horiz+, White+, Black+), and memory units (VertMem, HorizMem, WhiteMem, BlackMem), feeding a "Change?" decision. The network was presented in sequence with (a) a variable number of objects differing in orientation and contrast, (b) a blank screen, and (c) a test screen in which one feature had been changed on 50% of the trials. The entire set of onset detectors was "questioned" as to whether a change had occurred. Stimulus position was varied from trial to trial.
Performance of the model resembled that of Luck & Vogel's subjects. In the model, increasing the number of objects produces a crowding effect due to the spread of inhibition that the memory units exert over the onset-detecting layers. Crowding the stimuli still further led to an overall degradation of the model's performance (not shown). Crowding effects may need to be considered in VSTM experiments.

Salience Processing
Bottom-up attentional capture: Jan Theeuwes (2010). In Theeuwes's paradigm, a subject has to search for a singleton shape (e.g. a diamond) and reaction time is measured. In the presence of a salient distractor, reaction time is increased. Theeuwes's interpretation is that salience irresistibly captures attention by a bottom-up process; top-down processes can control feature selection, but only later. The following network models the bottom-up processes.

Salience Processing in a Model Network
The input scene is a set of rotating colored discs. The network has color-detector layers (Red, Green, Blue, Yellow, White, Black), corresponding IAC layers (RedIAC etc.), inhibitory units (InhSalience, InhPtr), a SalienceIAC layer, and a pointer layer. The inhibitory units sum the activity of each color; the more activity due to one color, the more the corresponding color-IAC layer is inhibited. The least prevalent color is therefore represented most strongly in the SalienceIAC layer, and the pointer layer amplifies this. The slow dynamics of the InhPtr layer give it a memory for recent pointer locations. This "inhibition of return" makes the pointer visit other discs after visiting the most salient one, in this case the yellow.
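The inhibition-of-return behavior described above can be sketched with a slowly decaying inhibitory trace standing in for the slow InhPtr dynamics. The salience values, gain and decay constant below are illustrative assumptions.

```python
def ior_scan(salience, steps=6, gain=1.0, tau=0.9):
    """Pointer repeatedly selects the most salient item not yet
    suppressed by a slowly decaying inhibitory trace (cf. InhPtr)."""
    trace = [0.0] * len(salience)
    visits = []
    for _ in range(steps):
        drive = [s - t for s, t in zip(salience, trace)]
        i = max(range(len(drive)), key=drive.__getitem__)  # pointer location
        visits.append(i)
        trace = [tau * t for t in trace]  # slow decay of past inhibition
        trace[i] += gain                  # inhibit the visited location
    return visits

# three discs; the most salient (index 0) is visited first, then the
# lingering inhibition pushes the pointer on to the others
visits = ior_scan([0.9, 0.5, 0.7])  # → [0, 2, 1, 0, 2, 1]
```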

Evolution
Novel network solutions for tracking targets are obtained by applying a genetic algorithm. The existing network-building system is readily adapted for a GA using "chromosomes" specifying layers (L) and connections (C) above a visual sensor layer. Each individual network is evaluated with a 3-phase display: (1) a rotating target disc alone for the first 1/3 of the time; (2) addition of a distractor disc for the next 1/3; (3) addition of a second distractor disc for the remaining 1/3.

Genetic Algorithm
Each layer is evaluated as a possible pointer layer over the 3 testing phases by calculating a fitness according to the formula:
Fitness = 1000 − dist − hops − #active units
"Hops" is the number of times a pointer switches nearest object; "#active units" penalizes layers that are active overall.
The GA loop: make an initial population → evaluate fitness → copy the population → cross-over with p = 0.3 → mutate with p = 0.05 → rank the resulting temporary population by fitness → replace unchanged individuals in the population with the top-ranked individuals of the temporary population → repeat until a criterion fitness is achieved.
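The GA loop above can be sketched as follows. This is a toy version: the chromosome here is a list of floats and the fitness function is a stand-in supplied by the caller, not the network-evaluation fitness of the model; population size, gene count and replacement fraction are illustrative assumptions.

```python
import random

def evolve(fitness, pop_size=10, genes=6, p_cross=0.3, p_mut=0.05,
           generations=200, seed=0):
    """Toy version of the GA loop sketched in the flowchart above."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(genes)] for _ in range(pop_size)]
    for _ in range(generations):
        temp = [ind[:] for ind in pop]                 # copy population
        for i in range(0, pop_size - 1, 2):            # cross-over, p = 0.3
            if rng.random() < p_cross:
                cut = rng.randrange(1, genes)
                temp[i][cut:], temp[i + 1][cut:] = temp[i + 1][cut:], temp[i][cut:]
        for ind in temp:                               # mutate, p = 0.05 per gene
            for g in range(genes):
                if rng.random() < p_mut:
                    ind[g] = rng.random()
        # rank the temporary population; its best members replace the
        # worst of the current population (the current best survive)
        pop.sort(key=fitness)
        temp.sort(key=fitness, reverse=True)
        n = pop_size // 2
        pop[:n] = [ind[:] for ind in temp[:n]]
    return max(pop, key=fitness)

# stand-in fitness: maximize the mean gene value
best = evolve(lambda ind: sum(ind) / len(ind))
```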

Cross-over
Parent individuals A and B produce offspring C and D. While other cross-over schemes could be used, this one was chosen because it allows one or more layers, with their associated connections, to be transferred intact.
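A minimal sketch of this layer-level cross-over. The chromosome representation (a list of layer specs, each carrying its own connection specs) is an assumption for illustration.

```python
import random

def crossover(parent_a, parent_b, rng=None):
    """Swap chromosome tails at a layer boundary, so that whole layers
    move between individuals with their connection specs intact."""
    rng = rng or random.Random()
    cut = rng.randrange(1, min(len(parent_a), len(parent_b)))
    child_c = parent_a[:cut] + parent_b[cut:]
    child_d = parent_b[:cut] + parent_a[cut:]
    return child_c, child_d

# each element stands for one layer plus its connection specs
indiv_a = ["A-L0", "A-L1", "A-L2"]
indiv_b = ["B-L0", "B-L1", "B-L2"]
indiv_c, indiv_d = crossover(indiv_a, indiv_b)
```

Because the cut always falls at a layer boundary, no layer or connection spec is ever split in two.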

Some Results
In this evolution run (population size 10), two families of individuals excelled. The first to emerge (#82, fitness = 920.1), in the 11th generation, used Layer2 as the pointer; Layer1 was inactive and Layer0 was an amplifying relay. A more efficient exemplar of the other family (#120, fitness = 931.3) emerged at about generation 20 and employed the same connectional motif as #82, making Layer0 the pointer.
The graphs show the progress of evolution. On the left (best fitness for each layer), the yellow symbols show the emergence of networks like #82 and the blue symbols networks like #120. On the right (mean & best fitness, numbers of IAC & AUX units, numbers of excitatory & inhibitory synapses) is shown the convergence to good solutions involving all IAC units and a 2:1 ratio of inhibitory to excitatory synapses.

How does the evolved network work?
Network #120 can be simplified to the network shown here, with even better tracking results (fitness = 942.2):
// ----- CONNECTIONS -----
Connect White to Layer0
8 //radius
0.25 //weight
Connect Layer0 to Layer0
24 //radius
0.150 //weight
1000 //radius
-0.050 //weight
Interpretation: the excitatory feedback connection amplifies and spreads the excitation over Layer0 due to the target input. The weak, widespread inhibitory connection limits the spread of excitation and prevents the distractor inputs from driving Layer0 units above their threshold. Tracking performance over the 3 phases of testing (1, 2, 3 discs) shows this network tracks about as well as my 3-layer pointer network in the "training" condition, and it still works well with different numbers of distractors. It remains to be seen how well it performs in the foregoing applications. A triumph of Evolution over Intelligent Design!

Conclusions
Simple networks can clarify attentional mechanisms, yield insights, and suggest possible brain circuitry.
MOT: the basic function is adequately modeled without synchrony etc., but orchestrating inhibition for variable numbers of targets is complex and likely involves top-down control.
Networks show promise in clarifying the processing of change and salience, and their roles in attention and VSTM.
For discovering network mechanisms, evolution extends imagination.