Class 21, 1999. CBCL/AI, MIT. Neuroscience II. T. Poggio
Brain Overview
The Ventral Visual Pathway (modified from Ungerleider and Haxby, 1994)
Visual Areas
Face-tuned Cells in IT
Model of view-invariant recognition: learning from views (view-tuned units along the view angle axis). Poggio & Edelman, Nature 1990: a graphical rewriting of the mathematics of regularization (GRBF), a learning technique
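As a rough sketch of the GRBF idea (a toy illustration, not the network of Poggio & Edelman; the function name, the feature vectors, and the uniform weights are all assumptions made for this example):

```python
import numpy as np

def grbf_response(x, stored_views, weights, sigma=1.0):
    """GRBF network: each view-tuned unit is a Gaussian centered on a
    stored training view; the object-level unit sums them linearly."""
    x = np.asarray(x, dtype=float)
    units = np.exp(-np.sum((stored_views - x) ** 2, axis=1) / (2 * sigma ** 2))
    return weights @ units

# Toy example: three stored views of one object, as 2-D feature vectors.
views = np.array([[0.0, 1.0], [0.5, 0.8], [1.0, 0.2]])
w = np.ones(len(views)) / len(views)        # uniform weights (an assumption)
print(grbf_response([0.4, 0.9], views, w))  # large: close to a stored view
print(grbf_response([5.0, 5.0], views, w))  # ~0: far from every stored view
```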
Learning to Recognize 3D Objects in IT Cortex (Logothetis, Pauls, Poggio, 1995). Examples of visual stimuli. After human psychophysics (Buelthoff, Edelman, Tarr, Sinha, ...), which supports models based on view-tuned units... physiology!
Task Description (figure): stimulus with blue or yellow fixspot; response via left or right lever; T = target, D = distractor. Logothetis, Pauls, Poggio, 1995
Recording Sites in Anterior IT. Logothetis, Pauls, and Poggio, 1995; Logothetis and Pauls, 1995
Model's Predictions: View-tuned Neurons (tuning curves along the view angle axis)
The Cortex: Neurons Tuned to Object Views. Logothetis, Pauls, Poggio, 1995
A View-tuned Cell. Logothetis, Pauls, Poggio, 1995
Model's Predictions: View-invariant, Object-specific Neurons (flat response across view angle)
The Cortex: View-invariant, Object-specific Neurons. Logothetis, Pauls, Poggio, 1995
Recognition of Wire Objects
Generalization Field
View-dependent Response of an IT Neuron
Sparse Representations in IT. In the recording area in AMTS -- a specialized region for paperclips (!) -- we estimate that after training there are, within an order of magnitude or two, about 400 view-tuned cells and perhaps 20 view-invariant cells per object. Logothetis, Pauls, Poggio, 1997
Previous Glimpses: Cells Tuned to Face Identity and View. Perrett, 1989
2. View-tuned IT Neurons. View-tuned cells in IT cortex: how do they work? How do they achieve selectivity and invariance? Max Riesenhuber and T. Poggio, Nature Neuroscience, just published
MAX. Some of our funding is from Honda...
Model's View-tuned Neurons (tuning curves along the view angle axis)
Scale-invariant Responses of an IT Neuron (training on one size only!). Logothetis, Pauls and Poggio, 1995
Invariances: Overview. Invariance around the training view; invariance while maintaining specificity. Measure: spike rate, (target response)/(mean of best distractors). Logothetis, Pauls and Poggio, 1995
Our quantitative model builds upon previous hierarchical models:
- Hubel & Wiesel (1962): simple to complex to "higher-order hypercomplex" cells
- Fukushima (1980): alternation of "S" and "C" layers to build up feature specificity and translation invariance, respectively
- Perrett & Oram (1993): pooling as a general mechanism to achieve invariance
Model of View-tuned Cells (MAX). Riesenhuber and Tommy Poggio, 1999
Model Diagram: "V1" ... "V4" ... "IT"; view-specific learning: synaptic plasticity (weights w)
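A minimal sketch of the kind of hierarchy the diagram shows, assuming toy filters, layer sizes, and pooling ranges (none of these match the published model's parameters): "S"-style template matching alternates with "C"-style MAX pooling, ending in a Gaussian view-tuned unit.

```python
import numpy as np

def s_layer(image, templates):
    """'S' stage: template matching at every position (feature specificity).
    Returns one response map per template."""
    h, w = image.shape
    th, tw = templates[0].shape
    maps = np.zeros((len(templates), h - th + 1, w - tw + 1))
    for k, t in enumerate(templates):
        for i in range(maps.shape[1]):
            for j in range(maps.shape[2]):
                maps[k, i, j] = np.sum(image[i:i + th, j:j + tw] * t)
    return maps

def c_layer(maps):
    """'C' stage: MAX over all positions for each feature (invariance)."""
    return maps.reshape(maps.shape[0], -1).max(axis=1)

def view_tuned_unit(features, stored, sigma=1.0):
    """Top-level view-tuned unit: Gaussian around a stored feature vector."""
    return np.exp(-np.sum((features - stored) ** 2) / (2 * sigma ** 2))

rng = np.random.default_rng(0)
img = rng.random((16, 16))                           # toy input image
templates = [rng.random((3, 3)) for _ in range(4)]   # illustrative filters
feats = c_layer(s_layer(img, templates))
print(view_tuned_unit(feats, stored=feats))          # 1.0 at the stored view
```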
Max (or "softmax") is the key mechanism in the model; it is computationally equivalent to selection (and to scanning in our object detection system)
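A quick numerical illustration of why a softmax can stand in for max; the pooling rule below (a ratio of power sums) is one standard softmax form, assumed here purely for illustration:

```python
import numpy as np

def softmax_pool(x, p):
    """Softmax-like pooling: ratio of power sums. Equals the mean of the
    inputs at p = 0 and approaches max(x) as p grows."""
    x = np.asarray(x, dtype=float)
    return np.sum(x ** (p + 1)) / np.sum(x ** p)

x = [0.2, 0.5, 0.9]
for p in (0, 1, 10, 100):
    print(p, round(softmax_pool(x, p), 4))  # tends to 0.9, the max
```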
V1: Simple Features, Small Receptive Fields. Simple cells respond to bars (Hubel & Wiesel, 1959). "Complex cells": translation invariance; they pool over simple cells of the same orientation (Hubel & Wiesel)
Two Possible Pooling Mechanisms (thanks to Pawan Sinha)
An Example: Simple to Complex Cells. "Simple" cells to "complex" cell?
Simple to Complex: Invariance to Position and Feature Selectivity? "Simple" cells to "complex" cell
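A toy illustration of the point (the "simple cell" responses below are made-up numbers, not data): a MAX over positions keeps the "complex cell" response unchanged when the bar shifts, while a non-preferred stimulus still evokes only a weak response.

```python
import numpy as np

# Toy responses of 'simple cells' of one orientation at four positions.
bar_at_pos1 = np.array([0.9, 0.1, 0.0, 0.0])
bar_at_pos3 = np.array([0.0, 0.0, 0.9, 0.1])   # same bar, shifted
wrong_orient = np.array([0.1, 0.1, 0.1, 0.1])  # non-preferred stimulus

for name, resp in [("bar at pos 1", bar_at_pos1),
                   ("bar at pos 3", bar_at_pos3),
                   ("wrong orientation", wrong_orient)]:
    # 'Complex cell' via MAX: position-invariant, still selective.
    print(name, "-> complex cell response:", resp.max())
```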
3. Some Predictions of the Model
- Scale and translation invariance of view-tuned AIT neurons
- Response to pseudo-mirror views
- Effect of scrambling
- Multiple objects
- Robustness to clutter
- Consistency with K. Tanaka's simplification procedure
- More and more complex features from V1 to AIT
Testing Selectivity and Invariance of Model Neurons: test specificity AND transformation tolerance of view-tuned model neurons; same objects as in Logothetis' experiment; 60 distractors
Invariances of an IT (view-tuned) Model Neuron
Invariances: Experiment vs. Model (view-tuned cells)
MAX vs. Summation
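An illustrative contrast, with made-up afferent values: under MAX pooling, adding weak clutter afferents leaves the response to the preferred input unchanged, while under summation the response shifts.

```python
import numpy as np

preferred = np.array([0.9])             # afferent driven by the target alone
with_clutter = np.array([0.9, 0.3, 0.2])  # target plus two clutter afferents

for pool, name in [(np.max, "MAX"), (np.sum, "SUM")]:
    print(name, "target alone:", pool(preferred),
          " target + clutter:", pool(with_clutter))
# MAX: 0.9 vs 0.9 (robust to clutter); SUM: 0.9 vs 1.4 (changed by clutter)
```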
Response to Pseudo-mirror Views: as in the experiment, some model neurons show tuning to the pseudo-mirror image
Robustness to Scrambling: Model and IT Neurons. Experiments: Vogels, 1999
Recognition in Context: Two Objects
Recognition in Context: Some Experimental Support. Sato: response of IT cells to two stimuli in the RF. Sato, 1989
Recognition in Clutter: Data. How does the response of IT neurons change if a background is introduced? Missal et al., 1997
Recognition in Clutter: Model. Average model neuron response; recognition rates
Further Support: Keiji Tanaka just mentioned his simplification paradigm... Wang et al., 1998
Consistent Behaviour of the Model
Higher Complexity and Invariances in Higher Areas. Kobatake & Tanaka, 1994
Fujita and Tanaka's Dictionary of Shapes (about 3000) in posterior IT (columnar organization)
Similar Properties in the Model... M. Tarr, Nature Neuroscience
Layers with Linear Pooling and with MAX Pooling. Linear pooling yields more complex features (e.g. from LGN inputs to simple cells and, perhaps, from PIT to AIT cells); MAX pooling yields features invariant to position and scale over a larger receptive field (e.g. from simple to complex V1 cells)
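In symbols (generic notation, not the slide's own): linear pooling is a weighted sum over different afferents, MAX pooling takes the strongest afferent of the same feature type.

```latex
% Linear pooling: a weighted sum over DIFFERENT afferents builds a
% more complex, template-like feature.
% MAX pooling: the strongest afferent of the SAME feature type at
% different positions/scales wins, giving invariance.
\[
y_{\text{linear}} = \sum_j w_j\, x_j ,
\qquad
y_{\text{max}} = \max_j x_j
\]
```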
4. Hypothetical Circuitry for Softmax. The MAX operation is at the core of the model's properties. Which biophysical mechanisms and circuitry underlie the max operation?
Softmax Circuitry. The SOFTMAX operation may arise from cortical microcircuits of lateral inhibition between neurons in a cortical layer. An example: a circuit based on feedforward (or recurrent) shunting presynaptic (or postsynaptic) inhibition. Key elements: (1) shunting inhibition; (2) nonlinear transformation of the signals (synaptic nonlinearities or active membrane properties). The circuit performs a gain-control operation (as in the canonical microcircuit of Martin and Douglas...) and, for certain values of the parameters, a softmax operation:
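The equation that this colon pointed to did not survive extraction; one standard softmax form consistent with the text (the exponent p and the normalization constant k are generic symbols here, not the slide's) is:

```latex
\[
y \;=\; \frac{\sum_j x_j^{\,p+1}}{k + \sum_j x_j^{\,p}}
\qquad\longrightarrow\qquad \max_j x_j
\quad \text{as } p \to \infty .
\]
```

For p near 0 (and k > 0) this behaves like gain-controlled linear pooling; as p grows it approximates the max.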
Summary. The model is consistent with:
- the psychophysics of Buelthoff, Edelman, Tarr
- Logothetis' physiology on view-tuned IT cells
- one-view invariance ranges in AIT to shift, scale and rotation
- the response to pseudo-mirror views
- Sato's data on two stimuli in IT cells
- the simplification data from Tanaka
- clutter experiments in Orban's lab
- scrambling experiments in Orban's lab
- Biederman's psychophysics with different "geons"
Summary: Main Points of the Model
- Max-like operation, computationally similar to scanning and selecting
- Hypothetical inhibitory microcircuit for softmax in cortex
- Easy grafting of top-down attentional effects onto the circuitry
- Segmentation is a byproduct of recognition
- No binding problem; synchronization not needed
- The model is an extension of the classical hierarchical Hubel-Wiesel scheme
- The model deals with nice object classes (e.g. faces) and can be extended to object classification (rather than subordinate-level recognition)
- Just a plausibility proof! Experiments wanted (to prove it wrong)!
Novel 3D morphing system to create new objects that are linear combinations of 3D prototypes. Stimuli span the category boundary: prototypes (100% cat), 80% cat morphs, 60% cat morphs, 60% dog morphs, 80% dog morphs, prototypes (100% dog)
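A minimal sketch of the morphing idea (the vertex arrays and blending function are toy assumptions; the actual system morphs full 3D models): a morph is a convex combination of prototype shape vectors, and the nominal category follows the dominant weight.

```python
import numpy as np

def morph(proto_a, proto_b, alpha):
    """Linear 3D morph: convex combination of two prototype shapes,
    each given as an array of 3D vertex coordinates."""
    return alpha * proto_a + (1.0 - alpha) * proto_b

cat = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.5]])   # toy 'cat' vertices
dog = np.array([[0.0, 0.2, 0.1], [0.8, 0.1, 0.9]])   # toy 'dog' vertices

for alpha in (1.0, 0.8, 0.6, 0.4, 0.2, 0.0):          # cat fraction
    label = "cat" if alpha > 0.5 else "dog"           # dominant prototype
    print(f"{int(alpha * 100)}% cat morph -> {label}",
          morph(cat, dog, alpha)[0])
```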
Object Classification Task for Monkey Physiology. Trial phases: fixation, sample, delay, test (nonmatch), delay (match), test (match). Durations from the slide: 600 ms, ... ms, 500 ms
Preliminary Results from Prefrontal Cortex Recordings (cell l04.spk 1301). Spike rate (Hz) vs. time (ms) for dog 100%, dog 80%, dog 60%, cat 60%, cat 80%, cat 100% morphs; dog vs. cat activity across fixation, sample-on, and delay periods. This suggests that prefrontal neurons carry information about the category of objects
Recognition in Context: Some Experimental Support. Sato: response of IT cells to two stimuli in the RF (Sato, 1989). The summation index is 0 for MAX and 1 for SUM; Sato finds -0.1 on average
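One common definition of such a summation index (an assumption here, since the slide does not give the formula) that yields exactly these values is:

```latex
% SI = 0 when the pair response equals the stronger single response (MAX);
% SI = 1 when the pair response is the sum of the two responses (SUM).
\[
\mathrm{SI} \;=\; \frac{R_{AB} - \max(R_A, R_B)}{\min(R_A, R_B)}
\]
```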
Simulation