2/11/2007 ACQ and the Basal Ganglia. Jimmy Bonaiuto, USC Brain Project, 2/12/2007
Outline
- Alstermark’s Cat
- ACQ
- ACQ → Basal Ganglia
- Basal Ganglia Model Implementations (NSL)
- The Search for Executability
Alstermark’s Cat
ACQ
ACQ - Executability
- 2D Gaussian kernel populations
- Food location relative to mouth
- Food location relative to paw
- Food location relative to tube opening
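A 2D Gaussian kernel population such as those listed above could be sketched as follows. This is a minimal illustration, not the model's implementation: the function name `gaussian_population`, the grid size, the spatial extent, and the kernel width `sigma` are all assumptions chosen for the example.

```python
import numpy as np

def gaussian_population(x, y, grid_size=10, extent=1.0, sigma=0.15):
    """Encode a 2D position as firing rates of a grid of Gaussian-tuned units.

    All parameter values are hypothetical; the slides do not specify them.
    """
    centers = np.linspace(-extent, extent, grid_size)
    # cx varies along columns, cy along rows
    cx, cy = np.meshgrid(centers, centers, indexing="xy")
    return np.exp(-((cx - x) ** 2 + (cy - y) ** 2) / (2.0 * sigma ** 2))

# Example: food location relative to the mouth (coordinates are made up)
food_rel_mouth = gaussian_population(0.3, -0.2)
peak_row, peak_col = np.unravel_index(np.argmax(food_rel_mouth), food_rel_mouth.shape)
```

One such population per reference frame (mouth, paw, tube opening) would give three activity maps, each peaking at the unit whose preferred location is closest to the food.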
Learning Executability
- Success or failure is signaled by the match or mismatch between efferent signals and mirror system output
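The match/mismatch signal can be illustrated with a small sketch. The slide does not give the update rule, so the delta-rule update, the learning rate, and the function name `update_executability` are assumptions made only for illustration.

```python
def update_executability(executability, efferent_action, mirror_action, rate=0.1):
    """Move an executability estimate toward 1 on success and toward 0 on failure.

    Success is taken to be a match between the action that was commanded
    (efferent signal) and the action the mirror system reports. The delta-rule
    form and the rate are hypothetical.
    """
    success = efferent_action == mirror_action
    target = 1.0 if success else 0.0
    return executability + rate * (target - executability), success

# Example: the commanded grasp was recognized as a grasp, so the estimate rises
new_value, success = update_executability(0.5, "grasp", "grasp")
```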
Learning Desirability
- Eligibility signal computed from:
  – Internal state
  – Mirror system output
  – Efferent signal
Priority
- Simplified form: priority = executability × desirability
- Leaky integrator form: [equation not preserved in the slide text]
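The simplified form is a per-action product. Since the leaky integrator equation is not preserved here, the sketch below assumes a generic first-order form, tau * dp/dt = -p + executability × desirability, integrated with Euler steps. The time constant, step size, and function names are assumptions, not values from the model.

```python
import numpy as np

def priority_simple(executability, desirability):
    """Simplified form: element-wise product, one value per action."""
    return np.asarray(executability, dtype=float) * np.asarray(desirability, dtype=float)

def priority_leaky(executability, desirability, tau=0.1, dt=0.001, steps=2000):
    """Assumed leaky-integrator form: tau * dp/dt = -p + executability * desirability."""
    drive = priority_simple(executability, desirability)
    p = np.zeros_like(drive)
    for _ in range(steps):
        p += (dt / tau) * (-p + drive)  # forward Euler step
    return p

# Hypothetical values for three actions
executability = [1.0, 0.5, 0.0]
desirability = [0.2, 0.8, 0.9]
simple = priority_simple(executability, desirability)
leaky = priority_leaky(executability, desirability)
```

Under this assumed form the integrator settles to the simplified product, so the two forms agree at steady state and differ only in their dynamics. Note that the third action has the highest desirability but zero priority because it is not executable.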
Action Selection
- A winner is declared when the maximum CC layer element firing rate is greater than or equal to ε₁ (0.9) and all other element firing rates are less than or equal to ε₂ (0.1).
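The winner criterion above translates directly into a check on a vector of firing rates. The thresholds 0.9 and 0.1 come from the slide; the function name `declare_winner` and the example rates are illustrative.

```python
import numpy as np

def declare_winner(rates, eps1=0.9, eps2=0.1):
    """Return the index of the winning element, or None if no winner yet."""
    rates = np.asarray(rates, dtype=float)
    winner = int(np.argmax(rates))
    others = np.delete(rates, winner)
    if rates[winner] >= eps1 and np.all(others <= eps2):
        return winner
    return None

settled = declare_winner([0.05, 0.95, 0.10])    # one element dominates
contested = declare_winner([0.50, 0.95, 0.10])  # a competitor is still active
```

Requiring the losers to fall below ε₂ means that selection waits until the competition has actually resolved, not just until one element is strongly active.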
ACQ
ACQ Selection Properties
- Contrast-dependent latency
ACQ Selection Properties
- Approximation to Boltzmann equation (T = temperature)
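For reference, the Boltzmann (softmax) distribution that the selection behavior is said to approximate can be computed as below. This shows the standard equation only, not the ACQ network itself; the function name and the example priorities are assumptions.

```python
import numpy as np

def boltzmann_probabilities(priorities, temperature=1.0):
    """Standard Boltzmann distribution: P(i) is proportional to exp(priority_i / T)."""
    scaled = np.asarray(priorities, dtype=float) / temperature
    scaled -= scaled.max()  # subtract the max for numerical stability
    weights = np.exp(scaled)
    return weights / weights.sum()

priorities = [1.0, 2.0, 3.0]
probs = boltzmann_probabilities(priorities, temperature=1.0)
cold = boltzmann_probabilities(priorities, temperature=0.1)   # near-greedy
hot = boltzmann_probabilities(priorities, temperature=100.0)  # near-uniform
```

Low temperature concentrates probability on the highest-priority action, while high temperature makes the choice close to uniform.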
ACQ – TD Learning
[Figure: Effects of Desirability Weight Initialization on Mean Trial Length During TD Learning. Conditions: no initialized weights; eat initialized; reach-grasp initialized]
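The TD learning of desirability weights could be sketched as a linear TD(0) update. The slides do not give the update equations, so the linear readout of desirability from an internal-state vector, the learning rate `alpha`, the discount `gamma`, and the function name `td_update` are all assumptions.

```python
import numpy as np

def td_update(weights, action, state, reward, next_desirability, alpha=0.1, gamma=0.9):
    """One TD(0) step on the desirability weights of the executed action.

    weights: array of shape (n_actions, n_state), modified in place.
    Desirability of an action is assumed to be weights[action] @ state.
    """
    current = weights[action] @ state
    td_error = reward + gamma * next_desirability - current
    weights[action] += alpha * td_error * state
    return td_error

# Two hypothetical actions, three internal-state units, weights start at zero
weights = np.zeros((2, 3))
state = np.array([1.0, 0.0, 0.0])
first_error = td_update(weights, action=0, state=state, reward=1.0, next_desirability=0.0)
```

Initializing some rows of `weights` to nonzero values before training would correspond to the "initialized" conditions compared in the figure.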
ACQ – Simulation Results
[Figure panels: Final Desirability Weights; Mean Trial Length; Mean Unsuccessful Action Attempts]
ACQ – Simulation Results
[Figure panels: MF - Eat; MF - Grasp Jaw; PF - Reach Food; PF - Reach Tube]
Where in the Brain is ACQ?
- Affordances
  – Posterior parietal cortex
- Object-directed motor schemas
  – Premotor cortex
- Winner-Take-All
  – Basal ganglia (Winner-Lose-All)
- Desirability Learning
  – Striatum with TD error signal from midbrain dopaminergic system (SNc, VTA)
- What about Executability?
Basal Ganglia Model Implementations (NSL)
- The following models are implemented in NSL and available for extension or experimentation:
  – GPR
  – Brown, Bullock, & Grossberg
  – RDDR
Gurney, Prescott, Redgrave (GPR)
- Interlayer winner-lose-all
- A control signal calculated from the sum of the cortical signal provides a gain signal to the competition
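The winner-lose-all idea can be illustrated with a heavily simplified, feedforward sketch. These are not the published GPR equations: the control pathway (Str-D2, GPe) is omitted, and the weights, threshold, and function name `gpr_selection_pathway` are assumptions. Each channel receives focused inhibition from its own striatal D1 unit and diffuse excitation proportional to the summed cortical signal, so the selected channel is the one whose output is driven lowest.

```python
import numpy as np

def gpr_selection_pathway(salience, w_stn=0.5, w_d1=1.0, str_threshold=0.2):
    """Toy steady-state output of GPi/SNr for each channel (hypothetical parameters)."""
    salience = np.asarray(salience, dtype=float)
    d1 = np.maximum(salience - str_threshold, 0.0)  # focused striatal inhibition
    diffuse = w_stn * salience.sum()                # excitation from summed cortical input
    return np.maximum(diffuse - w_d1 * d1, 0.0)     # rectified output

gpi = gpr_selection_pathway([0.2, 0.8, 0.3])
selected = int(np.argmin(gpi))  # winner-lose-all: lowest output is disinhibited
```

Because the diffuse term grows with total cortical input, a channel needs more salience to be released when many channels are active, which is one way to read the gain role of the summed signal.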
GPR
[Diagram nodes: Cortex, Str-D1, Str-D2, STN, GPe, GPi/SNr]
GPR
- What does a consideration of the GPR model bring to ACQ?
  – Intralayer WTA → Interlayer WTA
  – WTA → WLA
- Do we need a control (gain) signal?
  – We may want to explore the possibility of chunking when two actions are activated to similar levels
Brown, Bullock, & Grossberg
- Ventral striatum → ventral pallidum → PPTN
  – Learns to activate SNc given secondary reinforcer
- Cortex → Striosomes
  – Learns to inhibit SNc response to primary reinforcer
  – Learns timing between primary and secondary reinforcers
Brown, Bullock, & Grossberg
- What does a consideration of the Brown, Bullock, & Grossberg model bring to ACQ?
  – A neural method of computing the TD error signal
  – Can we extend it to have multiple primary reinforcers (dimensions of reinforcement)?
Reinforcement Driven Dimensionality Reduction (RDDR)
- Extension of PCA neural network methods to include reinforcement
- Feedforward connections: normalized multi-Hebbian with reinforcement
- Lateral connections: normalized anti-Hebbian
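The two learning rules named above can be sketched in a toy network. This is an illustration of the general idea (reinforcement-gated Hebbian feedforward learning with row normalization, plus anti-Hebbian lateral inhibition), not the RDDR equations: the one-step lateral inhibition, the learning rates, the lack of lateral-weight normalization, and the function name `rddr_step` are all assumptions.

```python
import numpy as np

def rddr_step(W, L, x, reinforcement, eta=0.01, mu=0.01):
    """One update of a toy reinforcement-gated PCA-style network.

    W: feedforward weights, shape (n_out, n_in), rows kept at unit norm.
    L: lateral inhibitory strengths, shape (n_out, n_out), zero diagonal.
    Both arrays are modified in place.
    """
    feedforward = W @ x
    y = feedforward - L @ feedforward              # one-step lateral inhibition
    W += eta * reinforcement * np.outer(y, x)      # Hebbian, gated by reinforcement
    W /= np.linalg.norm(W, axis=1, keepdims=True)  # normalize each row
    dL = mu * np.outer(y, y)                       # anti-Hebbian: co-active outputs
    np.fill_diagonal(dL, 0.0)                      # inhibit each other more
    L += dL
    return y

rng = np.random.default_rng(0)
W = rng.normal(size=(2, 4))
W /= np.linalg.norm(W, axis=1, keepdims=True)
L = np.zeros((2, 2))
for _ in range(50):
    rddr_step(W, L, rng.normal(size=4), reinforcement=1.0)
```

Setting `reinforcement` to zero freezes the feedforward weights, which is the sense in which reinforcement gates what the network learns to represent.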
RDDR - Pretraining
RDDR - Mid-training
RDDR - Trained
RDDR - Retraining
RDDR - Retrained
RDDR
- What does a consideration of the RDDR model bring to ACQ?
  – Maybe nothing, but it may be useful in chunking actions in hACQ
Where is Executability?
- We can map ACQ onto the basic BG architecture by modeling an interlayer WLA network whose cortico-striatal connection weights encode desirability and are modified via TD learning
- How does executability fit in?
Parietal / Basal Ganglia Projections
- Petras (1971)
  – Projections from the inferior and superior parietal lobules to the striatum and thalamus
- Cavada & Goldman-Rakic (1991)
  – Subregions of parietal area 7 project to portions of the striatum bilaterally
- Flaherty & Graybiel (1991)
  – Somatotopic projections from S1 to the striatum
  – These innervate only the matrix, not the striosomes
- Graziano & Gross (1993)
  – Bimodal somatotopic map in putamen
- Lawrence et al. (2000)
  – Dorsal stream projects to the anterodorsal striatum
ACQ Basal Ganglia
- Could executability and desirability be represented in segregated regions of the striatum and be combined in the globus pallidus?
- Or perhaps they are combined in the striatum?
References
Bar-Gad, I., Morris, G., Bergman, H. (2003) Information processing, dimensionality reduction and reinforcement learning in the basal ganglia. Progress in Neurobiology, 71: 439–473.
Brown, J., Bullock, D., Grossberg, S. (1999) How the Basal Ganglia Use Parallel Excitatory and Inhibitory Learning Pathways to Selectively Respond to Unexpected Rewarding Cues. J. Neurosci., 19(23):
Cavada, C., Goldman-Rakic, P.S. (1991) Topographic Segregation of Corticostriatal Projections from Posterior Parietal Subdivisions in the Macaque Monkey. Neuroscience, 42(3):
Flaherty, A.W., Graybiel, A.M. (1991) Corticostriatal Transformations in the Primate Somatosensory System. Projections from Physiologically Mapped Body-Part Representations. J. Neurophys., 66(4):
Graziano, M.S.A., Gross, C.G. (1993) A bimodal map of space: Somatosensory receptive fields in the macaque putamen with corresponding visual receptive fields. Exp. Brain Res., 97:
Gurney, K., Prescott, T.J., Redgrave, P. (2001) A computational model of action selection in the basal ganglia. I. A new functional anatomy. Biol. Cybern., 84:
Lawrence, A.D., Watkins, L.H.A., Sahakian, B.J., Hodges, J.R., Robbins, T.W. (2000) Visual object and visuospatial cognition in Huntington’s disease: implications for information processing in corticostriatal circuits. Brain, 123:
Petras, J.M. (1971) Connections of the Parietal Lobe. J. Psychiat. Res., 8: