Instrumental Learning & Operant Reinforcement

Chapters 5: Instrumental Learning & Operant Reinforcement

Operant Learning Stimulus Response Outcome

Classical vs. Operant. Classical: requires reflex action; neutral stimulus associated with US; outside of subject’s control. Operant: strengthening/weakening of “voluntary” action; subject responds or doesn’t. The two can operate together.

What’s in a Name? Operant learning: subject operates on environment Instrumental conditioning: subject is instrumental in obtaining outcome

Trial and Error Learning E.L. Thorndike Animal intelligence Maze studies

Puzzle Box Cats Cage with mechanism to open door Escape latency Discrete trial procedure

Law of Effect Any behaviour followed by an appetitive stimulus will increase in frequency

Terms Operant (response): any behaviour that operates on the environment to produce an effect Reinforcer: any event that increases the frequency of a behaviour Punisher: any event that decreases the frequency of a behaviour

Operant Learning B.F. Skinner Operant chamber Free operant procedure

Discrete Trial & Free Operant. Discrete trial: one trial at a time; “apparatus” must be re-set; measure some behaviour; e.g., mazes. Free operant: operant can occur at any time; operant can occur repeatedly; measure response rate; e.g., operant chamber.

Four Contingencies Positive reinforcement Negative reinforcement Positive punishment Negative punishment

Positive and Negative Positive: presents some stimulus Negative: removes some stimulus

Reinforcers and Punishers Reinforcer: increases a behaviour Punisher: decreases a behaviour

Contingencies
Response causes stimulus to be:   Response rate increases                               Response rate decreases
Presented                         Positive Reinforcement (Lever press --> Food)         Positive Punishment (Lever press --> Shock)
Removed                           Negative Reinforcement (Lever press --> Shock off)    Negative Punishment (Lever press --> Food removed)
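The four contingencies can be read off two questions: is the stimulus presented or removed, and does the response rate go up or down? A minimal sketch of that lookup follows. The function name and string labels are illustrative, not from the slides.

```python
def classify_contingency(stimulus_change: str, rate_change: str) -> str:
    """Name the contingency from two observations.

    stimulus_change: "presented" or "removed" (what the response does to the stimulus)
    rate_change:     "increases" or "decreases" (what then happens to the response rate)
    """
    # Positive/negative describes the stimulus, not whether the outcome is pleasant.
    sign = {"presented": "positive", "removed": "negative"}[stimulus_change]
    # Reinforcement/punishment describes the effect on behaviour.
    kind = {"increases": "reinforcement", "decreases": "punishment"}[rate_change]
    return f"{sign} {kind}"


# The lever-press examples from the slide, written as (stimulus change, rate change).
slide_examples = {
    "Lever press --> Food": ("presented", "increases"),
    "Lever press --> Shock": ("presented", "decreases"),
    "Lever press --> Shock off": ("removed", "increases"),
    "Lever press --> Food removed": ("removed", "decreases"),
}

for example, (stimulus_change, rate_change) in slide_examples.items():
    print(example, "=", classify_contingency(stimulus_change, rate_change))
```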

Types of Reinforcers. Primary: not dependent on an association with other reinforcers. Secondary: initially neutral stimulus; paired with primary reinforcer; also called “conditioned reinforcer”.

Secondary Reinforcers. “Bridging”, “clicker”. Secondary reinforcers extinguish without periodic pairings with a primary reinforcer. Generally weaker than primary. Generalized reinforcer: paired with many other kinds of reinforcers, e.g., money.

Strength of Operant Learning Can condition practically any behaviour Shaping (successive approximations)

Shaping a Lever Press Gradual process Reinforce more appropriate/precise responses Feedback

Response Chains Sequences of behaviours in specific order Objective: primary reinforcer Conditioned reinforcers Discriminative stimuli

Forward Chaining Start with first response in sequence, then work through to last response in additive steps

Backwards Chaining Often used with “complex” training Start with last response in chain Next, second last response Third last, etc.
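The difference between forward and backward chaining is the order in which links are added during training. The sketch below lists the training stages for each method. The three-step chain is a hypothetical example, not one given in the slides.

```python
def forward_chaining(chain):
    """Training stages: first response alone, then the first two, and so on."""
    return [chain[:n] for n in range(1, len(chain) + 1)]


def backward_chaining(chain):
    """Training stages: last response alone, then the last two, and so on."""
    return [chain[-n:] for n in range(1, len(chain) + 1)]


# Hypothetical response chain that ends in the primary reinforcer.
chain = ["pull string", "climb ladder", "press lever"]

for stage in forward_chaining(chain):
    print("forward: ", stage)
for stage in backward_chaining(chain):
    print("backward:", stage)
```

In backward chaining every stage ends with the final response, so each training stage finishes at the point where the primary reinforcer is delivered.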

Factors in Operant Learning

Contingency Correlation between behaviour & outcome Strong contingency --> better learning Random contingency --> no learning Both reinforcement and punishment
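One common way to put a number on the behaviour-outcome contingency is the difference between the probability of the outcome given a response and the probability of the outcome given no response. This measure is often written as delta-P. The slides do not name a specific measure, so treat this as an assumed illustration with made-up counts.

```python
def delta_p(resp_outcome, resp_no_outcome, no_resp_outcome, no_resp_no_outcome):
    """P(outcome | response) - P(outcome | no response), computed from counts.

    Values near +1 or -1 indicate a strong contingency; 0 indicates that the
    outcome is equally likely whether or not the response occurs (random).
    """
    p_given_response = resp_outcome / (resp_outcome + resp_no_outcome)
    p_given_no_response = no_resp_outcome / (no_resp_outcome + no_resp_no_outcome)
    return p_given_response - p_given_no_response


# Hypothetical counts over 20 intervals with a response and 20 without.
strong = delta_p(18, 2, 1, 19)    # outcome almost always follows the response
random = delta_p(10, 10, 10, 10)  # outcome unrelated to the response

print(round(strong, 2), round(random, 2))
```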

Contiguity Time between behaviour & outcome Shorter = better learning Delays let other behaviours occur, forgetting, extinction (behaviour w/o reinforcement) Learning with delay if stimulus “placeholder” provided (conditioned reinforcer?) More important for punishment

Reinforcer Characteristics Larger reinforcers --> stronger learning Not a linear effect Qualitative differences in reinforcers and punishers Species & individual differences Intensity of punisher

Task Characteristics Some tasks easier to learn than others Species & individual differences Innate and/or prior conditioning

Deprivation Levels Generally, the greater the deprivation, the more effective the reinforcer Reinforcers can satiate Deprivation can provide motivation to engage in punishable behaviours

Extinction Behaviour does not lead to same outcome Response no longer produces same outcome Extinction burst (with reinforcement) Variability of behaviour Aggression and frustration Spontaneous recovery Resurgence

Theories of Reinforcement

Hull’s Drive Reduction Theory Animals have motivational states (drives) Necessary for survival Reinforcers are things that reduce drives Physiological value Reduce physiological state

Drive Reduction Reinforcers Works well with primary reinforcers Many secondary reinforcers have no physiological value Hull: association links secondary to drive Some reinforcers hard to classify as primary or secondary Some increase a physiological state Some necessities undetectable Roller coasters Vitamins Saccharin

Relative Value Theory & Premack Principle Treat reinforcers as behaviours Is it the food, or the behaviour of eating that is the reinforcer? Behavioural probability scale Greater or lesser value of behaviours relative to one another No distinction between primary and secondary

Premack Principle. One behaviour will reinforce a second behaviour. High probability behaviour reinforces low probability behaviour. Baseline probability scale: time; rank order. Reinforcement relativity: no absolutes. Probability of response = (time spent on response) / (total time)

Example. Behaviours: eat ice cream (I), play video game (V), read book (B). Baseline (30 minutes). Student 1: I (2min), V (8min), B (20min); Scale: I -- V -- B. Student 2: I (8min), V (20min), B (2min); Scale: B -- I -- V. Student 1: V reinforces I, B reinforces V & I. Student 2: I reinforces B, V reinforces I & B.
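The two-student example can be worked through mechanically: divide each behaviour's baseline time by the total, rank the behaviours, and let each behaviour reinforce everything below it on the scale. The sketch below uses the times from the example. The function names are illustrative.

```python
def response_probabilities(minutes, total_minutes):
    """Probability of response = time spent on response / total time."""
    return {behaviour: t / total_minutes for behaviour, t in minutes.items()}


def premack_scale(minutes):
    """Behaviours ordered from lowest to highest baseline probability."""
    return sorted(minutes, key=minutes.get)


def reinforces(minutes):
    """Map each behaviour to the lower-probability behaviours it can reinforce."""
    scale = premack_scale(minutes)
    return {behaviour: scale[:i] for i, behaviour in enumerate(scale)}


# Baseline minutes out of 30: I = eat ice cream, V = play video game, B = read book.
student_1 = {"I": 2, "V": 8, "B": 20}
student_2 = {"I": 8, "V": 20, "B": 2}

for name, minutes in (("Student 1", student_1), ("Student 2", student_2)):
    print(name, "scale:", " -- ".join(premack_scale(minutes)))
    print(name, "reinforces:", reinforces(minutes))
```

The same behaviour (reading) sits at the top of one student's scale and the bottom of the other's, which is the sense in which reinforcement is relative.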

Problems. Baseline phase: fair rating? How to compare very different behaviours? Time problems: what if time not important to behaviour? Behaviour duration? Length of baseline period?

Response Deprivation Theory Deprived behaviours = reinforcing behaviours Drop below baseline level of performance Not relative frequency of one behaviour compared to another (i.e., Premack) Level of deprivation for a behaviour Praise? “Yes”?
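Under response deprivation, what matters is whether access to a behaviour has been pushed below its own baseline, not where it ranks against other behaviours. The sketch below shows that comparison in its simplest form. The function names and the fractional "level" measure are assumptions made for illustration, and the minutes are hypothetical.

```python
def is_response_deprived(baseline_minutes, allowed_minutes):
    """True when access to the behaviour is held below its baseline level."""
    return allowed_minutes < baseline_minutes


def deprivation_level(baseline_minutes, allowed_minutes):
    """Fraction of baseline that is being withheld (0.0 means no deprivation)."""
    if baseline_minutes <= 0:
        raise ValueError("baseline must be positive")
    return max(0.0, (baseline_minutes - allowed_minutes) / baseline_minutes)


# A low-probability behaviour (2 min baseline) restricted to 1 min is deprived,
# so on this account it can act as a reinforcer despite its low rank.
print(is_response_deprived(2, 1), deprivation_level(2, 1))

# A high-probability behaviour (20 min baseline) with free access is not deprived.
print(is_response_deprived(20, 20), deprivation_level(20, 20))
```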

Escape and Avoidance

Definitions. Escape: get away from aversive stimulus that is in progress. Avoidance: get away from aversive stimulus before it begins.

Shuttle Box Solomon & Wynne (1953) Dogs Chamber with barrier; Shock Light off as signal

Two-Process Theory Classical and operant conditioning Shock = US Fear/pain/jump/twitch/squeal = UR Darkness = CS Fear of dark = CR Fear: heart rate, breathing, stomach cramps, etc. Negative reinforcement Removal of fear (CR) Escape of CS, not avoidance of shock

Support for Two-Process Theory Rescorla & LoLordo (1965) Dog in shuttlebox No signal Response gives “safe time” Pair tone with shock Tone increases rate of response CS can amplify avoidance Conditioned inhibition can reduce avoidance

Problems with Two-Process Theory Avoidance without observable fear Heart rate Not consistent Fear diminishes with avoidance learning

Measuring Fear Kamin, Brimer, and Black (1963) Lever press ---> food Auditory CS ---> avoidance in shuttle box until: 1, 3, 9, 27 avoidances in a row CS in Skinner box; check for suppression of lever press
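Suppression of lever pressing during a CS is conventionally summarized as a suppression ratio: responses during the CS divided by responses during the CS plus responses in an equal period just before it. The slide does not say which index was used, so the ratio here is an assumed, standard choice, and the counts are made up.

```python
def suppression_ratio(cs_responses, pre_cs_responses):
    """CS responses / (CS responses + pre-CS responses).

    0.0 means lever pressing stopped completely during the CS (strong fear);
    0.5 means the CS had no effect on lever pressing (no measurable fear).
    """
    total = cs_responses + pre_cs_responses
    if total == 0:
        raise ValueError("no responses recorded in either period")
    return cs_responses / total


# Hypothetical lever-press counts.
print(suppression_ratio(0, 20))   # complete suppression
print(suppression_ratio(20, 20))  # no suppression
```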

Results Fear decreases during extended avoidance training But, avoidance still strong Even low fear is enough? [Figure: responding plotted against number of avoidance responses (1, 3, 9, 27)]

Extinction in Avoidance Behaviour Odd prediction from two-process theory “Yo-yo” effect Avoidance should toggle But! Avoidance is extremely persistent [Figure axes: successful avoidance trials; # of US received]

One-Process Theory Classical conditioning component unnecessary Avoidance, not fear reduction, is reinforcer “Safety”

Sidman Avoidance Task Free-operant avoidance Can avoidance be learned if no warning CS? Shock at random intervals Response gives safe time Extensive training --> learn avoidance But, usually never perfect High variability across subjects Two-process theory suggests: Time becomes a CS (time elicits fear)
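The "response gives safe time" rule can be sketched as a filter over scheduled shocks: any shock that falls inside the safe period following a response is cancelled. The times, the safe-time length, and the function name below are hypothetical. The sketch only illustrates the rule and is not a model of the original procedure.

```python
def delivered_shocks(scheduled_shocks, responses, safe_time):
    """Return the scheduled shock times that are actually delivered.

    A response at time r cancels any shock scheduled in [r, r + safe_time).
    There is no warning signal; only responding postpones shock.
    """
    delivered = []
    for shock_time in scheduled_shocks:
        protected = any(r <= shock_time < r + safe_time for r in responses)
        if not protected:
            delivered.append(shock_time)
    return delivered


# Hypothetical session, times in seconds.
scheduled = [5, 12, 20, 31]

print(delivered_shocks(scheduled, responses=[], safe_time=5))       # no responding
print(delivered_shocks(scheduled, responses=[4, 18], safe_time=5))  # some responding
```

Because only some shocks happen to fall inside a safe period, occasional responding reduces shocks without eliminating them, which fits the slide's note that avoidance is usually never perfect.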

Herrnstein & Hineline (1966) Rapid and slow shock rate schedules Lever press switches schedules Shocks presented randomly, no signal Responses give shock reduction Reduction in shock is reinforcer

Learned Helplessness Behaviour has no effect on situation Generalizes Laboratory Give inescapable shocks Shuttle box Will not switch sides Expectation that behaviour has no effect

Learned Helplessness in Humans Depression Situations beyond your control Three dimensions Situation: specific or global Attribute: internal or external Time: short-term or long-term

Maier & Seligman (1976) Motivational impairment Cognitive impairment Emotional impairment

Therapeutic Application Confidence building (“can not fail”) Implementation issues Tasks that can be successfully completed Produces immunization Escapable condition … inescapable condition Learned helplessness less likely to develop