ACQ and the Basal Ganglia
Jimmy Bonaiuto, USC Brain Project, 6/26/2007



Actor-Critic Learning
Actor: learns the action policy
Critic: learns value functions
Different actor-critic architectures have been proposed for learning different value functions:
– V(s) = state values (most common)
– V(a) = action values
– Q(s,a) = state-action pair values
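The actor/critic split can be illustrated with a minimal tabular sketch. It assumes the most common variant, a critic that learns V(s), and a hypothetical four-state chain task in which reaching the last state yields a reward of 1. The task, the softmax actor, and all parameter names (`alpha`, `beta`, `gamma`) are illustrative assumptions, not part of the slides.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into selection probabilities."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def run_actor_critic(episodes=500, alpha=0.1, beta=0.1, gamma=0.9, seed=0):
    """Tabular actor-critic on a hypothetical 4-state chain (assumed task).

    Action 0 moves left (clamped at state 0), action 1 moves right.
    Entering the last state ends the episode with reward 1.
    """
    rng = random.Random(seed)
    n_states, n_actions = 4, 2
    V = [0.0] * n_states                                   # critic: state values
    prefs = [[0.0] * n_actions for _ in range(n_states)]   # actor: policy preferences

    for _ in range(episodes):
        s = 0
        for _ in range(50):  # cap on episode length
            probs = softmax(prefs[s])
            a = 0 if rng.random() < probs[0] else 1
            s_next = max(0, s - 1) if a == 0 else s + 1
            done = s_next == n_states - 1
            r = 1.0 if done else 0.0

            # TD error: the single teaching signal shared by actor and critic
            target = r if done else r + gamma * V[s_next]
            delta = target - V[s]

            V[s] += alpha * delta          # critic update
            prefs[s][a] += beta * delta    # actor update
            if done:
                break
            s = s_next
    return V, prefs


if __name__ == "__main__":
    V, prefs = run_actor_critic()
    print("state values:", [round(v, 2) for v in V])
    print("preferences:", [[round(p, 2) for p in row] for row in prefs])
```

In this sketch the critic and the actor are separate tables that never read each other directly; they are coupled only through `delta`.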

Actor-Critic Architecture
[Figure: actor-critic architecture (from Barto, 1995)]
Core data: recordings of midbrain dopaminergic neurons in appetitive learning tasks (Schultz, 1992; Schultz, 1998)

Critic: V(s), V(a), or Q(s,a)?
How do dopamine cells know about reward value?
– The largest input to the striatum is from cortex (Haber and Gdowski, 2004)
– V(s) and Q(s,a) learning may require the ventral striatum, SNc, and/or VTA to receive a copy of the same cortical projections that the dorsal striatum receives (state information)
– V(a) may only require a projection from the dorsal striatum or globus pallidus (actor) to the ventral striatum, SNc, and/or VTA (critic)
– The largest forebrain input to dopamine neurons is from the striatum (Haber and Gdowski, 2004)
– V(a) may therefore be more biologically plausible in terms of connectivity
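To make the connectivity argument concrete, the TD errors for the three candidate critics can be written side by side. The V(s) and Q(s,a) forms are the standard TD(0) and SARSA-style errors; the V(a) form is an assumption written by analogy, since the slides do not give an equation for it. The symbols r_t (reward) and γ (discount factor) are likewise assumed notation.

```latex
\begin{align*}
\delta^{V(s)}_t   &= r_t + \gamma\, V(s_{t+1}) - V(s_t)
  && \text{needs state information} \\
\delta^{V(a)}_t   &= r_t + \gamma\, V(a_{t+1}) - V(a_t)
  && \text{needs only the selected action} \\
\delta^{Q(s,a)}_t &= r_t + \gamma\, Q(s_{t+1}, a_{t+1}) - Q(s_t, a_t)
  && \text{needs both state and action}
\end{align*}
```

Only the V(a) error can be computed from the actor's output alone, which is why it would require no copy of the cortical state input.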

Actor-Critic in the Basal Ganglia
Dopamine targets (striatum) are the site of value and policy learning (Suri & Schultz, 2001)
The striatum is split into dorsal and ventral divisions (some say dorsolateral and ventromedial) (Voorn et al., 2004)
– Ventral striatum: inputs from limbic structures (critic?)
– Dorsal striatum: connected with motor and associative cortices (actor?)

Role of Dopamine (Joel & Weiner, 2000)
Dopamine neurons are found in the ventral tegmental area (VTA) and substantia nigra pars compacta (SNc)
– VTA projects to ventral striatum: learning state values
– SNc projects to dorsal striatum: policy learning
Little difference between VTA and SNc firing (Schultz et al., 1993)
– Predicted by the TD learning equation, since the policy and the values are both updated using the TD error

ACQ
Reinforcement learning should maximize total utility, not necessarily total reward. Motivations map outcomes to utilities (Niv et al., 2006)
Multiple critics: one for each dimension of interoception (hunger, thirst, etc.)
– Q(s,a), s = internal state, a = action
Actor
– Composite policy
  Desirability: based on internal state
  Executability: based on environmental state
– Eligibility trace from mirror and canonical motor signals
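One way to picture the composite policy is sketched below. The slide does not state how desirability and executability are combined, so two assumptions are made here and should be read as illustrative only: desirability is a drive-weighted sum over the per-dimension critics, and the two terms are multiplied before a winner-take-all choice. All function and variable names are hypothetical.

```python
def action_priorities(drives, q_values, executability):
    """Combine desirability and executability into one priority per action.

    drives:        {dimension: drive level}, the internal state
    q_values:      {dimension: {action: value}}, one critic per dimension
    executability: {action: value in [0, 1]}, from the environmental state

    Assumed rule (not from the slides): priority = desirability * executability,
    with desirability a drive-weighted sum over the critics.
    """
    priorities = {}
    for action, exe in executability.items():
        desirability = sum(
            level * q_values[dim][action] for dim, level in drives.items()
        )
        priorities[action] = desirability * exe
    return priorities


def select_action(drives, q_values, executability):
    """Winner-take-all choice over the combined priorities."""
    priorities = action_priorities(drives, q_values, executability)
    return max(priorities, key=priorities.get)


if __name__ == "__main__":
    drives = {"hunger": 0.9, "thirst": 0.1}
    q_values = {
        "hunger": {"eat": 1.0, "drink": 0.0},
        "thirst": {"eat": 0.0, "drink": 1.0},
    }
    # Eating is hard but possible: the hungry agent still prefers it.
    print(select_action(drives, q_values, {"eat": 0.2, "drink": 1.0}))
    # Eating is impossible: the agent falls back to drinking.
    print(select_action(drives, q_values, {"eat": 0.0, "drink": 1.0}))
```

Keeping the two terms separate means that a change in internal state shifts what the agent wants without retraining what it believes is possible, and vice versa.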

ACQ: Actor/Multiple Critics
[Figure: actor with multiple critics; x = executed action, x̂ = recognized action]

ACQ: Eligibility Trace
x = executed action (from efference copy)
x̂ = recognized action (from mirror system)
Action outcomes tabulated against x, x̂, and ε: Not Attempted (0.0), Unsuccessful, Unintended, Successful
Idealized situations (perfect recognition). A realistic implementation would have confidence values between 0.0 and 1.0 for x and x̂, but the pattern of values for ε would be the same.

ACQ: Weight Modification
Desirability and executability are updated using the same eligibility and reinforcement signals
This requires different weight change rules for desirability and executability:
– Tonic dopamine level, d, added to the TD error: makes the sign of the weight change depend on ε(t)
– Don't update the value of the last action unless some action is currently recognized
– Step function of the eligibility trace: makes the sign of the weight change depend on r(t)

Multiple Critics: Q(s,a)
Is there evidence for multiple critics gated by interoceptive information?
– The lateral hypothalamus projects to the SNc, VTA, and the ventral striatum (Saper et al., 1979; Fadel & Deutch, 2002; Brog et al., 1993)
– The accumbens shell of the ventral striatum is reciprocally connected with the lateral hypothalamus and has been called a “sensory sentinel” or “visceral striatum” (Kelley, 1999, 2004)
– Motivational state, such as food deprivation, can influence the magnitude of dopamine release in the ventral striatum (Wilson et al., 1995; Ahn & Phillips, 1999)
– Sexual satiety is signaled by serotonin from the lateral hypothalamus to the ventral striatum, which reduces dopamine levels (Lorrain et al., 1999)

Internal State-Dependent Policy
Is there evidence for internal state-dependent policies? (Kelley et al., 2005)
– Information from the lateral hypothalamus reaches the dorsal striatum through the paraventricular nucleus
– Hypothalamic-midline thalamic-striatal projections carry internal state information to cholinergic interneurons of the dorsal striatum
  These are thought to modulate dorsal striatal output neurons

Eligibility Trace from the Mirror System
What is the evidence for an eligibility signal from mirror neurons?
– People can implicitly learn sequences through action observation (Bird et al., 2005)
– The striatum is consistently implicated in implicit sequence learning, and the magnitude of activation is correlated with reaction time improvement (Rauch et al., 1997, 1998)
– The basal ganglia are active during action observation (Frey & Gerry, 2006)
– There is a projection from ventral premotor cortex (including the arcuate sulcus) to dorsal and ventral striatum in the macaque (McFarland & Haber, 2000)

References
Ahn S, Phillips AG (1999) Dopaminergic correlates of sensory-specific satiety in the medial prefrontal cortex and nucleus accumbens of the rat. The Journal of Neuroscience, 19:RC29:1-6.
Bird G, Osman M, Saggerson A, Heyes C (2005) Sequence learning by action, observation and action observation. British Journal of Psychology, 96:371–388.
Brog JS, Salyapongse A, Deutch AY, Zahm DS (1993) The patterns of afferent innervation of the core and shell in the accumbens part of the rat ventral striatum: Immunohistochemical detection of retrogradely transported fluoro-gold. The Journal of Comparative Neurology, 338(2).
Fadel J, Deutch AY (2002) Anatomical substrates of orexin-dopamine interactions: Lateral hypothalamic projections to the ventral tegmental area. Neuroscience, 111(2).
Frey SH, Gerry VE (2006) Modulation of neural activity during observational learning of actions and their sequential orders. The Journal of Neuroscience, 26(51).
Haber SN, Gdowski MJ (2004) The basal ganglia. In: The human nervous system (Paxinos G, Mai JK, eds), Ed 2, pp. 676–738. New York: Elsevier Academic.
Joel D, Weiner I (2000) The connections of the dopaminergic system with the striatum in rats and primates: An analysis with respect to the functional and compartmental organization of the striatum. Neuroscience, 96:451–474.
Kelley AE (1999) Functional specificity of ventral striatal compartments in appetitive behaviors. Annals New York Academy of Sciences.
Kelley AE (2004) Ventral striatal control of appetitive motivation: role in ingestive behavior and reward-related learning. Neurosci Biobehav Rev, 27.
Kelley AE, Baldo BA, Pratt WE (2005) A proposed hypothalamic-thalamic-striatal axis for the integration of energy balance, arousal, and food reward. J Comp Neurol, 493(1):72–85.
Lorrain DS, Riolo JV, Matuszewich L, Hull EM (1999) Lateral hypothalamic serotonin inhibits nucleus accumbens dopamine: Implications for sexual satiety. The Journal of Neuroscience, 19(17).
McFarland NR, Haber SN (2000) Convergent inputs from thalamic motor nuclei and frontal cortical areas to the dorsal striatum in the primate. The Journal of Neuroscience, 20(10):3798–3813.
Niv Y, Joel D, Dayan P (2006) A normative perspective on motivation. Trends in Cognitive Sciences, 10(8).
Rauch SL, Whalen PJ, Savage CR, Curran T, Kendrick A, Brown HD, Bush G, Breiter HC, Rosen BR (1997) Striatal recruitment during an implicit sequence learning task as measured by functional magnetic resonance imaging. Human Brain Mapping, 5:124–132.
Rauch SL, Whalen PJ, Curran T, McInerney S, Heckers S, Savage CR (1998) Thalamic deactivation during early implicit sequence learning: a functional MRI study. NeuroReport, 9:865–870.
Saper CB, Swanson LW, Cowan WM (1979) An autoradiographic study of the efferent connections of the lateral hypothalamic area in the rat. J Comp Neurol, 183(4).
Schultz W (1992) Activity of dopamine neurons in the behaving primate. Seminars in the Neurosciences, 4:129–138.
Schultz W (1998) Predictive reward signal of dopamine neurons. Journal of Neurophysiology, 80:1–27.
Schultz W, Apicella P, Ljungberg T (1993) Responses of monkey dopamine neurons to reward and conditioned stimuli during successive steps of learning a delayed response task. Journal of Neuroscience, 13:900–913.
Suri RE, Schultz W (2001) Temporal difference model reproduces predictive neural activity. Neural Computation, 13:841–862.
Voorn P, Vanderschuren LJ, Groenewegen HJ, Robbins TW, Pennartz CM (2004) Putting a spin on the dorsal-ventral divide of the striatum. Trends in Neurosciences, 27:468–474.
Wilson C, Nomikos GG, Collu M, Fibiger HC (1995) Dopaminergic correlates of motivated behavior: importance of drive. Journal of Neuroscience, 15.