A Neural Model for Detecting and Labeling Motion Patterns in Image Sequences
Marc Pomplun (1), Julio Martinez-Trujillo (2), Yueju Liu (2), Evgueni Simine (2), John Tsotsos (2)
(1) UMass Boston; (2) York University, Toronto, Canada
“Data Flow Diagram” of Visual Areas in Macaque Brain
Blue: motion perception pathway; Green: object recognition pathway
Receptive Fields in Hierarchical Neural Networks
[Figure: neuron A in the top layer and the receptive field of A in the input layer]
Problems with Information Routing in Hierarchical Networks
– poor localization
– crosstalk
The Selective Tuning Concept (Tsotsos, 1988)
[Figure labels: processing pyramid; unit of interest at top; input; pass pathways: hierarchical restriction of input space; inhibited pathways]
Hierarchical Winner-Take-All
– top-down, coarse-to-fine WTA hierarchy for selection and localization
– unselected connections are inhibited
– WTA is achieved through local gating networks
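The top-down, coarse-to-fine selection can be sketched as below. This is only an illustration under stated assumptions, not the gating-network circuit from the slides: the pyramid is a hypothetical 1-D structure, each unit pools a fixed number of children by taking their maximum, and the WTA at each level is a plain argmax restricted to the children of the previous winner. The names `build_pyramid`, `top_down_wta`, and `fan_in` are made up for this example.

```python
def build_pyramid(inputs, fan_in=2):
    """Build a 1-D pyramid bottom-up; each unit takes the max of its children.
    Max pooling and the fixed fan-in are assumptions made for illustration."""
    layers = [list(inputs)]
    while len(layers[-1]) > 1:
        below = layers[-1]
        layers.append([max(below[i:i + fan_in])
                       for i in range(0, len(below), fan_in)])
    return layers  # layers[0] is the input layer, layers[-1] the top


def top_down_wta(layers, fan_in=2):
    """Trace the winner from the top unit down to the input layer.
    At each level only the children of the current winner compete;
    all other connections are treated as inhibited (they are ignored)."""
    winner = 0  # the single unit at the top of the pyramid
    path = [winner]
    for level in range(len(layers) - 2, -1, -1):
        layer = layers[level]
        start = winner * fan_in
        children = range(start, min(start + fan_in, len(layer)))
        winner = max(children, key=lambda i: layer[i])
        path.append(winner)
    return path  # index of the winning unit in each layer, top to bottom


layers = build_pyramid([0.1, 0.4, 0.2, 0.9, 0.3, 0.0, 0.5, 0.6])
path = top_down_wta(layers)
print(path)  # last entry is the location of the winner in the input layer
```

Because each level only searches the children of the previous winner, the selection localizes the strongest input without comparing every input unit directly.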
Selection Circuits
[Figure legend: unit and connection in the interpretive network; unit and connection in the gating network; unit and connection in the top-down bias network. Layer labels: layer +1, layer -1, layer I]
3D Visualization of the Selective Tuning Network
Red: WTA phase 1 active; Green: WTA phase 2 active; Blue: inhibition; Yellow: WTA winner
The Motion Perception Pathway
[Figure: input, V1, MT, and MST, linked by feed-forward and feedback connections]
What do We Know about Area V1?
– cells have small receptive fields
– each cell has a preferred direction of motion [Plot: activation vs. direction of motion, with the preferred direction marked]
– there are three types of speed selectivity: low-speed cells, medium-speed cells, and high-speed cells [Plot: activation vs. speed of motion for the three cell types]
What do We Know about Area MT?
– cells have larger receptive fields than in V1
– as in V1, each cell has a preferred combination of direction and speed of motion
– MT cells also have a preferred orientation of the speed gradient [Plot: activation vs. orientation of speed gradient, with the preferred orientation marked; curves with and without speed gradient]
What do We Know about Area MST?
cells respond to motion patterns such as
– translation (objects shifting positions)
– rotation (clockwise and counterclockwise)
– expansion (approaching objects)
– contraction (receding objects)
– spiral motion (combinations of rotation and expansion/contraction)
the response of a cell is almost independent of the position of the motion pattern in the visual field
The Motion Hierarchy Model: V1
– V1 receives image sequences as input and extracts the direction and speed of motion
[Figure panels: counterclockwise rotation, clockwise rotation, contraction, expansion]
The Motion Hierarchy Model: V1
– V1 is simulated as 60x60 hypercolumns
– each column contains 36 cells: one for each combination of direction (12) and speed tuning (3)
– direction and speed selectivity are achieved with spatiotemporal filters
– these filters process local information from the last seven images in the sequence
– example: cells tuned towards upward motion, input pattern: counterclockwise rotation [Figure panels: high-speed cells, medium-speed cells, low-speed cells]
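The cell counts for the simulated V1 follow directly from the numbers on the slide. The sketch below works them out and shows one possible way to address a cell inside a hypercolumn. The flat index layout and the even 30-degree spacing of preferred directions are assumptions for illustration, and `v1_cell_index` and `preferred_direction_deg` are hypothetical helper names.

```python
N_DIRECTIONS = 12  # preferred directions per hypercolumn (from the slide)
N_SPEEDS = 3       # low-, medium-, and high-speed tuning (from the slide)
GRID = 60          # V1 is simulated as 60x60 hypercolumns (from the slide)


def v1_cell_index(direction, speed):
    """Flat index of a cell within one hypercolumn (hypothetical layout)."""
    if not (0 <= direction < N_DIRECTIONS and 0 <= speed < N_SPEEDS):
        raise ValueError("direction or speed index out of range")
    return direction * N_SPEEDS + speed


def preferred_direction_deg(direction):
    """Preferred direction in degrees, assuming evenly spaced directions."""
    return direction * 360.0 / N_DIRECTIONS


cells_per_column = N_DIRECTIONS * N_SPEEDS
total_v1_cells = GRID * GRID * cells_per_column
print(cells_per_column, total_v1_cells)
```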
The Motion Hierarchy Model: MT
– MT is simulated as 30x30 hypercolumns
– each column contains 432 cells: one for each combination of direction (12), speed (3), and speed gradient tuning (12)
– problem: how can gradient tuning be realized from activation patterns in V1?
– solution: detect gradient differences across the three types of speed-selective cells
– this solution leads to a simple network structure and remarkably good noise reduction
– the activation of an MT cell is the product of its activation by direction, speed, and gradient
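The product rule for MT activation can be illustrated with a small sketch. Only the product itself and the 12 x 3 x 12 cell count come from the slide; the Gaussian tuning curve, its 30-degree width, and the names `angular_tuning` and `mt_activation` are assumptions chosen for illustration.

```python
import math


def angular_tuning(angle_deg, preferred_deg, width_deg=30.0):
    """Hypothetical Gaussian tuning over a circular variable, value in (0, 1]."""
    # Wrap the difference into [-180, 180) so 350 and 10 degrees are 20 apart.
    diff = (angle_deg - preferred_deg + 180.0) % 360.0 - 180.0
    return math.exp(-0.5 * (diff / width_deg) ** 2)


def mt_activation(direction_act, speed_act, gradient_act):
    """Product rule from the slide: all three components must be active."""
    return direction_act * speed_act * gradient_act


cells_per_mt_column = 12 * 3 * 12  # direction x speed x speed gradient

# Stimulus matches the preferred direction and gradient; speed response is 0.8.
matched = mt_activation(angular_tuning(90, 90), 0.8, angular_tuning(0, 0))
# Same stimulus, but no response from the gradient component.
no_gradient = mt_activation(angular_tuning(90, 90), 0.8, 0.0)
print(cells_per_mt_column, matched, no_gradient)
```

A multiplicative combination means that a weak or absent response in any one component suppresses the cell, which is one plausible reading of why the slide associates this design with noise reduction.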
The Motion Hierarchy Model: MST
– how can MST cells detect motion patterns such as rotation, expansion, and contraction based on the activation of MT cells?
– idea: the presence of these motion patterns is indicated by a consistent angle between the local movement and the speed gradient
[Figure panels: counterclockwise, clockwise, contraction, expansion; arrows: movement, speed gradient]
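The "consistent angle" idea can be checked on idealized flow fields. The sketch below assumes patterns centered at the origin whose speed grows linearly with distance from the center (so the speed gradient points radially outward), and it uses mathematical coordinates with y pointing up; in image coordinates with y pointing down, clockwise and counterclockwise swap. The functions `flow` and `relative_angle_deg` are hypothetical names for this example.

```python
import math


def flow(x, y, pattern):
    """Velocity at (x, y) for an idealized pattern centered at the origin."""
    if pattern == "expansion":
        return (x, y)
    if pattern == "contraction":
        return (-x, -y)
    if pattern == "counterclockwise":
        return (-y, x)
    if pattern == "clockwise":
        return (y, -x)
    raise ValueError(pattern)


def relative_angle_deg(x, y, pattern):
    """Angle from the speed gradient to the direction of movement, in [0, 360)."""
    vx, vy = flow(x, y, pattern)
    motion = math.degrees(math.atan2(vy, vx))
    # In all four idealized patterns the speed equals the distance from the
    # center, so the speed gradient points radially outward.
    gradient = math.degrees(math.atan2(y, x))
    return round(motion - gradient) % 360


for pattern in ("expansion", "counterclockwise", "contraction", "clockwise"):
    angles = {relative_angle_deg(x, y, pattern)
              for x, y in [(1, 2), (-3, 0.5), (0.2, -4)]}
    print(pattern, angles)  # one angle per pattern, independent of position
```

Under these assumptions each pattern yields a single relative angle at every position, which is what makes the angle usable as a position-independent label.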
The Motion Hierarchy Model: MST
[Figure panels: direction of movement; orientation of speed gradient]
The Motion Hierarchy Model: MST
– MST cells integrate the activation of MT cells that respond to a particular angle between motion and speed gradient
– this integration is performed across a large part of the visual field and across all 12 directions
– therefore, MST can detect 12 different motion patterns
– we simulate 5x5 MST hypercolumns, each containing 36 neurons (tuned for 12 different motion patterns, 3 different speeds)
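A minimal sketch of the MST integration step, under simplifying assumptions: MT activity is given as `(direction_bin, gradient_bin, activation)` tuples already pooled over positions, the 12 bins are evenly spaced at 30 degrees, and the speed dimension is left out. The tuple layout and the name `mst_responses` are hypothetical. The example input encodes a flow in which movement is 90 degrees (3 bins) ahead of the speed gradient everywhere, as in an idealized counterclockwise rotation in coordinates with y pointing up.

```python
N_BINS = 12  # direction and gradient-orientation bins (assumed 30 degrees apart)


def mst_responses(mt_cells):
    """Sum MT activation per relative angle between movement and gradient.
    mt_cells: iterable of (direction_bin, gradient_bin, activation) tuples."""
    responses = [0.0] * N_BINS
    for direction_bin, gradient_bin, activation in mt_cells:
        k = (direction_bin - gradient_bin) % N_BINS  # relative-angle bin
        responses[k] += activation
    return responses


# Every direction is present, always 3 bins ahead of the gradient orientation.
mt_cells = [(d, (d - 3) % N_BINS, 1.0) for d in range(N_BINS)]
responses = mst_responses(mt_cells)
best = max(range(N_BINS), key=lambda k: responses[k])

neurons_per_mst_column = 12 * 3                  # patterns x speeds
total_mst_neurons = 5 * 5 * neurons_per_mst_column
print(best, responses[best], total_mst_neurons)
```

Summing over all absolute directions while keeping the relative angle fixed is what gives one response per pattern type, so the 12 relative-angle bins correspond to the 12 detectable motion patterns.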
[Figure: activation in V1, MT, and MST; panels: direction of movement, speed gradient]
Simulation: clockwise rotation
[Figure panels: direction of movement; speed gradient]
Simulation: counterclockwise rotation
[Figure panels: direction of movement; speed gradient]
Simulation: receding object
[Figure panels: direction of movement; speed gradient]
Attention in the Motion Hierarchy
What happens if there are multiple motion patterns in the visual input? Visual attention can be used to
– determine the type and location of the most salient motion pattern,
– focus on it by eliminating all interfering information,
– sequentially inspect all objects in the visual field.
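The sequential inspection described above can be sketched as a loop that repeatedly selects the most salient remaining pattern. This is an illustration only: the slides do not say how an attended pattern is released, so removing the winner from further competition is an assumption here, as are the scalar saliency values and the name `inspect_sequentially`.

```python
def inspect_sequentially(saliency, threshold=0.0):
    """Visit patterns in order of decreasing saliency.
    saliency: dict mapping a pattern label to a non-negative saliency value."""
    remaining = dict(saliency)
    order = []
    while remaining:
        winner = max(remaining, key=remaining.get)
        if remaining[winner] <= threshold:
            break  # nothing salient enough is left to attend to
        order.append(winner)
        # Assumption: the attended pattern is inhibited afterwards so that
        # the next most salient pattern can win the following competition.
        del remaining[winner]
    return order


order = inspect_sequentially(
    {"translation": 0.2, "rotation": 0.9, "expansion": 0.5}
)
print(order)
```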
[Figure panels: direction of movement; speed gradient]
Conclusions and Outlook
– the motion hierarchy model provides a plausible explanation for cell properties in areas V1, MT, and MST
– its use of distinct speed tuning functions in V1 and speed gradient selectivity in MT leads to a relatively simple network structure combined with robust and precise detection of motion patterns
– visual attention is employed to segregate and sequentially inspect multiple motion patterns
Conclusions and Outlook
– the model predicts inhibition of visual functions around any attended motion pattern
– the model also predicts that different motion patterns induce different activation patterns in V1, MT, and MST:
  – linear motion activates V1, MT, and MST
  – speed gradients increase MT and MST activation
  – rotation, expansion, and contraction increase MST activation
– this is currently being tested by fMRI scanning experiments in Magdeburg, Germany
Conclusions and Outlook
– the model is well-suited for mobile robots to estimate parameters of ego-motion
– the area MST in the simulated hierarchy is very sensitive to any translational or rotational ego-motion
– in biological vision, MST is massively connected to the vestibular system
– in mobile robots, the simulated area MST could interact with position and orientation sensors to stabilize ego-motion estimation
Conclusions and Outlook
Future work:
– lateral interaction across neighboring sets of gating units for improved perceptual grouping
– simultaneous simulation of both the motion perception and object recognition pathways
– introduction of working memory for an adequate internal representation of the current visual scene