Schedules of Reinforcement

Presentation transcript:

Schedules of Reinforcement
Continuous reinforcement:
– Reinforce every single time the animal performs the response
– Use for teaching the animal the contingency
– Problem: satiation
Solution: only reinforce occasionally
– Partial reinforcement
– Can reinforce occasionally based on time
– Can reinforce occasionally based on amount (number of responses)
– Can make it predictable or unpredictable

Partial Reinforcement Schedules
– Fixed ratio: every nth response is reinforced
– Fixed interval: the first response after x amount of time is reinforced
– Variable ratio: on average, every nth response is reinforced
– Variable interval: the first response after an average of x amount of time is reinforced
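The four basic schedules can be sketched as small Python classes. This is only an illustration, not part of the original slides: the class names and the `respond(now)` method are hypothetical, the variable ratio is implemented as a random ratio (each response pays off with probability 1/n), and the variable interval draws exponentially distributed intervals. Both are assumptions chosen to keep the sketch short.

```python
import random


class FixedRatio:
    """FR n: every nth response is reinforced."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, now):
        # `now` is unused; kept so all schedules share one interface.
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """VR n (random-ratio assumption): on average every nth response pays off."""

    def __init__(self, n, rng=None):
        self.p = 1.0 / n
        self.rng = rng or random.Random()

    def respond(self, now):
        return self.rng.random() < self.p


class FixedInterval:
    """FI x: the first response at least x seconds after the last reinforcer pays off."""

    def __init__(self, x):
        self.x = x
        self.available_at = x

    def respond(self, now):
        if now >= self.available_at:
            self.available_at = now + self.x
            return True
        return False


class VariableInterval:
    """VI x: like FI, but each interval is random with mean x (exponential assumption)."""

    def __init__(self, x, rng=None):
        self.x = x
        self.rng = rng or random.Random()
        self.available_at = self.rng.expovariate(1.0 / x)

    def respond(self, now):
        if now >= self.available_at:
            self.available_at = now + self.rng.expovariate(1.0 / self.x)
            return True
        return False


# Ten responses on FR 5: only the 5th and 10th are reinforced.
fr5 = FixedRatio(5)
fr_outcomes = [fr5.respond(t) for t in range(10)]
```

Real experiments usually build VR and VI schedules from fixed lists of ratios or intervals; the random versions above only approximate that.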

Differential Reinforcement Schedules
– Only reinforce some responses
– There is a criterion regarding the rate or type of the response
– Several examples:
  – DRO
  – DRL
  – DRH

DRO: differential reinforcement of other behavior (responses)
– Use when you want to decrease a target behavior and increase anything BUT that response
– Reinforce any response BUT the target response
– Often used as an alternative to extinction
  – E.g., self-injurious behavior (SIB)
  – Reinforce anything EXCEPT hitting self

DRH: differential reinforcement of HIGH rates of responding
– Use when you want to maintain a high rate of responding
– Reinforce as long as the rate of responding remains at or above a set rate for X amount of time
– Often used to maintain on-task behavior
  – E.g., data entry: must maintain so many keystrokes/min or begin to lose pay
  – Use in a clinical setting for attention: as long as the client engages in X academic behavior at or above a certain rate, they get a reinforcer

DRL: differential reinforcement of LOW rates of responding
– Use when you want to maintain a low rate of responding
– Reinforce as long as the rate of responding remains at or below a set rate for X amount of time
– Often used to control inappropriate behavior
  – E.g., talking out: as long as the student has no more than 3 talk-outs per school day, they earn points on the behavior chart
  – Use when it is virtually impossible to extinguish the behavior, but it can be held at the lowest rate possible.
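The DRH and DRL criteria are the same rate check pointing in opposite directions. A minimal sketch, assuming responses are logged as timestamps in seconds and the rate is measured over a single observation window; the function names and parameters are hypothetical:

```python
def response_rate(timestamps, window_start, window_len):
    """Responses per minute within [window_start, window_start + window_len) seconds."""
    n = sum(window_start <= t < window_start + window_len for t in timestamps)
    return n * 60.0 / window_len


def drh_earned(timestamps, window_start, window_len, min_rate):
    """DRH: reinforce only if the response rate stayed at or above min_rate."""
    return response_rate(timestamps, window_start, window_len) >= min_rate


def drl_earned(timestamps, window_start, window_len, max_rate):
    """DRL: reinforce only if the response rate stayed at or below max_rate."""
    return response_rate(timestamps, window_start, window_len) <= max_rate


# Four responses in a 60-second window is a rate of 4 per minute.
talk_outs = [1, 5, 20, 50]
meets_drl = drl_earned(talk_outs, 0, 60, max_rate=4)
```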

Variations of Reinforcement: Limited Hold
– There is a limited time when the reinforcer is available:
  – Like a “fast pass”: earned the reinforcer, but must pick it up within 5 seconds or it is lost
– Applied when a faster rate of responding is desired with a fixed interval schedule
– By limiting how long the reinforcer is available following the end of the interval, responding can be sped up
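A limited hold can be sketched as a fixed-interval schedule whose reinforcer expires. The class below is hypothetical, and it assumes that a missed reinforcer is simply lost and that the next interval starts when the hold window closes. The slides do not specify that detail.

```python
class FixedIntervalLimitedHold:
    """FI schedule where the reinforcer is available for only `hold` seconds."""

    def __init__(self, interval, hold):
        self.interval = interval
        self.hold = hold
        self.available_at = interval

    def respond(self, now):
        # Skip any availability windows that expired without a response.
        while now > self.available_at + self.hold:
            self.available_at += self.hold + self.interval
        if now >= self.available_at:
            # Collected within the hold window; time the next interval from now.
            self.available_at = now + self.interval
            return True
        return False


schedule = FixedIntervalLimitedHold(interval=60, hold=5)
too_early = schedule.respond(30)    # interval not yet elapsed
in_window = schedule.respond(62)    # within the 5-second hold
missed = schedule.respond(130)      # window at 122-127 s already closed
```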

Time-based Schedules
– Unlike typical schedules, NO response contingency
– Passage of time provides reinforcement
– Fixed Time or Variable Time schedules
  – FT 60 sec: every 60 seconds a reinforcer is delivered independent of responding
  – VT 60 sec: on average, every 60 seconds…
– Often used to study superstitious behavior
– Or: used as a convenience once responding is established (organism may not pick up that the contingency is gone)
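Because time-based schedules ignore responding, the delivery times can be generated ahead of a session. A rough sketch with hypothetical function names; the VT version assumes exponentially distributed gaps, which is one convenient way to get a mean of x seconds and is not something the slides specify:

```python
import random


def fixed_time_deliveries(period, session_len):
    """FT: one reinforcer every `period` seconds, regardless of behavior."""
    return list(range(period, session_len + 1, period))


def variable_time_deliveries(mean_period, session_len, rng=None):
    """VT: reinforcers at random times, `mean_period` seconds apart on average."""
    rng = rng or random.Random()
    times = []
    t = rng.expovariate(1.0 / mean_period)
    while t <= session_len:
        times.append(t)
        t += rng.expovariate(1.0 / mean_period)
    return times


ft_times = fixed_time_deliveries(60, 300)  # [60, 120, 180, 240, 300]
```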

Contingency-Shaped vs. Rule-Governed Behaviors
– Contingency-shaped behaviors: behavior that is controlled by the schedule of reinforcement or punishment.
– Rule-governed behaviors: behavior that is controlled by a verbal or mental rule about how to behave.

Operant Behavior can involve BOTH
– Obviously, reinforcement schedules can control responding
– So can “rules”:
  – heuristics
  – algorithms
  – concepts and concept formation
– Operant conditioning can have rules, for example, the factors affecting reinforcement.

Comparison of Ratio and Interval Schedules: Why different patterns?
– Similarities:
  – Both show fixed vs. variable effects
  – More pausing with fixed schedules: greater post-reinforcement pause
  – Variable schedules produce faster, steadier responding
– But: important differences
  – Reynolds (1975) compared pecking rate of pigeons on VI vs. VR schedules
  – FASTER responding for VR schedule

Why faster VR than VI responding?
– Second part of Reynolds (1975)
  – Used a yoked schedule: one bird on VR, one on VI
  – Yoked the rate of reinforcement
  – When the bird on the VR schedule was 1 response shy of a reinforcer, the waiting time ended for the bird on the VI schedule
  – Thus, both birds got the same number of reinforcers
– Even with this, the bird on the VI schedule pecked more slowly
– Replications support this finding
  – In pigeons, rats, college students
  – Appears to be a strong phenomenon


Explanation 1: IRT reinforcement
– IRTs: inter-response times
  – If a subject is reinforced for responding that occurs shortly after the preceding response, then a short IRT is reinforced and a long IRT is not
  – And vice versa: if reinforced for long IRTs, then make more long IRTs
– Compare VR and VI schedules:
  – Short IRTs are reinforced on VR schedules
  – Long IRTs are more likely reinforced on VI schedules
  – Even when the rate of reinforcement is controlled!
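One way to see why the two schedules favor different IRTs is to compute the chance that a response is reinforced as a function of the pause that preceded it. The sketch below rests on simplifying assumptions that are not in the slides: the VR is a random ratio (every response pays off with probability 1/n), and the VI sets up reinforcers after exponentially distributed intervals. Function names are hypothetical.

```python
import math


def p_reinforced_vr(irt, n):
    """VR n: the payoff chance per response does not depend on the IRT."""
    return 1.0 / n


def p_reinforced_vi(irt, mean_interval):
    """VI: chance that a reinforcer was set up during the `irt` seconds
    since the previous response (exponential-interval assumption)."""
    return 1.0 - math.exp(-irt / mean_interval)


# On VI 60 s, waiting longer before responding raises the chance of payoff.
short_pause = p_reinforced_vi(0.5, 60)
long_pause = p_reinforced_vi(5.0, 60)
```

Under these assumptions, pausing on a VR schedule gains nothing per response, so short IRTs earn more reinforcers per unit time. On a VI schedule, each additional second of waiting makes the next response more likely to pay.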

Explanation 2: Feedback functions
– Molar vs. molecular explanations of behavior
  – Molar: global assessment
    – Animal compares behavior across a long time horizon
    – Whole-session or even across-session assessment
  – Molecular: momentary assessment
    – Animal compares next response to last response
    – Moment-to-moment assessment of setting
– But which does the animal do?
  – Answer is, as usual, both
  – We momentarily maximize
  – But we also engage in molar maximizing!

Explanation 2: Feedback functions
– Organisms do not base rate of responding only on the rate of reinforcement directly tied to that responding
– Instead, organisms compare within and across settings
– Use CONTEXT to compare response rate
  – Again, momentary in some situations
  – More molar in others

Explanation 2: Feedback functions
– Feedback functions:
  – Reinforcement strengthens the relationship between the response and the reinforcer
  – Does this by providing information regarding this relationship
– The feedback functions of reward and punishment are critical for developing these contingency rules and more molar patterns of responding

Feedback on VR vs VI schedules
– Relationship between responding and reinforcement on VR schedule:
  – More responses = more reinforcers
  – The way to increase reinforcement rate is to increase response rate
  – In a sense, organism “is in charge” of its own payoff rate
  – Faster responding = more reinforcers

Feedback on VR vs VI schedules
– Relationship between responding and reinforcement on VI schedule:
  – Passage of time = reinforcer
  – No way to “speed up” the reinforcement rate
  – In a sense, time “is in charge” of payoff rate
  – Faster responding does not “pay”, is not optimizing
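The contrast between the two feedback functions can be made concrete with a small calculation. This is a sketch under stated assumptions, not a formula from the slides: on VR n the reinforcement rate is simply the response rate divided by n; on VI the animal must wait the mean interval T for the reinforcer to be set up and then, assuming responses occur at random moments at rate B, about 1/B more until the next response collects it. The function names are hypothetical.

```python
def vr_feedback(response_rate, n):
    """Reinforcers per minute on VR n at `response_rate` responses per minute."""
    return response_rate / n


def vi_feedback(response_rate, mean_interval_min):
    """Reinforcers per minute on a VI with the given mean interval (minutes),
    assuming responses are spread randomly in time."""
    return 1.0 / (mean_interval_min + 1.0 / response_rate)


# Doubling the response rate doubles the payoff on VR 10 ...
vr_slow, vr_fast = vr_feedback(60, 10), vr_feedback(120, 10)
# ... but barely changes it on VI 1 min, which can never exceed 1 per minute.
vi_slow, vi_fast = vi_feedback(60, 1.0), vi_feedback(120, 1.0)
```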

What happens when we combine schedules of reinforcement?
– Concurrent schedules
– Conjunctive schedules
– Chained schedules
– And so on…

Concurrent Schedules
– Two or more basic schedules operating independently at the same time for two or more different behaviors
  – Organism has a choice of behaviors and schedules
  – You can take notes or daydream (but not really do both at the same time)
– Provides a better analog for real-life situations

Concurrent Schedules (cont’d)
– When similar reinforcement is scheduled for each of the concurrent responses:
  – the response receiving the higher frequency of reinforcement will increase in rate
  – the response requiring the least effort will increase in rate
  – the response providing the most immediate reinforcement will increase in rate
– Important in applied situations!

Multiple Schedules
– Two or more basic schedules operating independently and ALTERNATING, such that one is in effect when the other is not
  – Organism is presented with first one schedule and then the other
  – You can attend Psy 463 or you can attend P462, but you can’t go to both at the same time
– Organism makes comparisons ACROSS the schedules
  – Which is more reinforcing?
  – More responding for the richer schedule
– Again, provides a better analog for real-life situations

Chained Schedules
– Two or more basic schedule requirements are in place,
  – one schedule occurring at a time
  – but in a specified sequence
– Usually a cue is presented to signal the specific schedule
  – present as long as the schedule is in effect
– Reinforcement for responding in the 1st component is the presentation of the 2nd
– Reinforcement does not occur until the final component is performed
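A chained schedule behaves like a small state machine: completing one component switches the cue and starts the next, and only the last component delivers the reinforcer. The sketch below uses fixed-ratio components only, to keep it short; the class name, cue labels, and ratio values are hypothetical.

```python
class ChainedFR:
    """Chain of fixed-ratio components, each signalled by its own cue."""

    def __init__(self, links):
        self.links = links  # list of (cue, ratio) pairs, in order
        self.index = 0      # which component is currently in effect
        self.count = 0      # responses made in the current component

    @property
    def cue(self):
        """Cue shown while the current component is in effect."""
        return self.links[self.index][0]

    def respond(self):
        """Register one response; True only when the final component is completed."""
        self.count += 1
        if self.count < self.links[self.index][1]:
            return False
        self.count = 0
        if self.index == len(self.links) - 1:
            self.index = 0  # reinforcer delivered, chain restarts
            return True
        self.index += 1     # the next cue is the only consequence here
        return False


chain = ChainedFR([("red", 2), ("green", 3)])
chain_outcomes = [chain.respond() for _ in range(5)]
```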

Conjunctive Schedules
– The requirements for two or more schedules must be met simultaneously
  – E.g., an FI and an FR schedule
  – Must complete the scheduled time to the reinforcer, then must complete the FR requirement before getting the reinforcer
– Task/interval interactions
  – When the task requirements are high and the interval is short, steady work throughout the interval will be the result
  – When the task requirements are low and the interval is long, many nontask behaviors will be observed
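A conjunctive FI FR schedule can be sketched as a check that both requirements hold at the moment of a response. The class below is hypothetical, and it assumes that responses made during the interval count toward the ratio; a stricter reading in which the ratio only starts after the interval has elapsed would need a small change.

```python
class ConjunctiveFIFR:
    """Reinforce a response only when both the FI and the FR requirement are met."""

    def __init__(self, interval, ratio):
        self.interval = interval
        self.ratio = ratio
        self.start = 0.0  # time of the last reinforcer (or session start)
        self.count = 0    # responses since the last reinforcer

    def respond(self, now):
        self.count += 1
        time_met = now - self.start >= self.interval
        ratio_met = self.count >= self.ratio
        if time_met and ratio_met:
            self.start = now
            self.count = 0
            return True
        return False


conj = ConjunctiveFIFR(interval=10, ratio=3)
conj_outcomes = [conj.respond(t) for t in (1, 2, 3, 11)]
```

Here the third response meets the ratio but not the interval, so the reinforcer arrives only with the response at 11 seconds.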

Organism now “compares” across settings
– With 2 or more schedules of reinforcement in effect, the animal will compare the schedules
  – Assume that the organism will maximize
  – Get the most reinforcement it can get out of the situations
– Smart organisms will split their time between the various schedules or form an exclusive choice

Organism now “compares” across settings
– Conc VI VI schedules:
  – Two VI schedules in effect at the same time
  – One is better than the other: conc VI 60 VI 15
    – VI 60 pays off 1 time per minute
    – VI 15 pays off 4 times per minute
– What is the MAX amount of reinforcers (on average) an organism can earn per minute?
– How should the organism split its time?
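The upper bound asked about here is a short piece of arithmetic: if the organism collects every reinforcer that both VI schedules set up, the two payoff rates add. A minimal sketch with hypothetical function names, assuming interval values are in seconds:

```python
def vi_payoffs_per_minute(mean_interval_sec):
    """Average number of reinforcers a VI schedule sets up per minute."""
    return 60.0 / mean_interval_sec


def max_concurrent_vi_rate(*mean_intervals_sec):
    """Upper bound per minute: every scheduled reinforcer on every VI is collected."""
    return sum(vi_payoffs_per_minute(x) for x in mean_intervals_sec)


best_case = max_concurrent_vi_rate(60, 15)    # 1 + 4 = 5.0 per minute
only_rich_side = vi_payoffs_per_minute(15)    # 4.0 per minute
```

Staying exclusively on the VI 15 side caps earnings at 4 per minute, so reaching the maximum of 5 requires at least occasional visits to the VI 60 side.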

Organism now “compares” across settings
– Conc VR VR schedules:
  – Two VR schedules in effect at the same time
  – One is better (richer) than the other: conc VR 10 VR 5
    – VR 10 pays off after an average of 10 responses
    – VR 5 pays off after an average of 5 responses
– What is the MAX amount of reinforcers (on average) an organism can earn per minute?
– How should the organism split its time?
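With ratio schedules the answer depends on how many responses the organism makes, so it helps to compute expected reinforcers for a fixed response budget. The sketch below is illustrative only; the function name and the budget of 100 responses are assumptions.

```python
def expected_reinforcers(total_responses, share_on_first, n_first, n_second):
    """Expected reinforcers when `share_on_first` of the responses go to
    VR n_first and the remainder go to VR n_second."""
    first = total_responses * share_on_first
    second = total_responses - first
    return first / n_first + second / n_second


# conc VR 10 VR 5 with 100 responses to allocate.
all_on_vr10 = expected_reinforcers(100, 1.0, 10, 5)   # 10.0
even_split = expected_reinforcers(100, 0.5, 10, 5)    # 15.0
all_on_vr5 = expected_reinforcers(100, 0.0, 10, 5)    # 20.0
```

Every response moved from VR 10 to VR 5 raises the expected payoff, which is why maximizing on concurrent ratio schedules leads to exclusive choice of the richer schedule.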

Interesting phenomenon: Behavioral Contrast
– Behavioral contrast: a change in the strength of one response that occurs when the rate of reward of a second response, or of the first response under different conditions, is changed.
– Reynolds, 1966: pigeon in an operant chamber pecks a key for food reward.
– Equivalent multiple schedule:
  – VI 60 second schedule when key is red
  – VI 60 second schedule when key is green
  – Food comes with equal frequency in either case.
– Then the schedules change:
  – RED light predicts the same VI 60 sec schedule
  – GREEN light predicts EXT in one phase
  – GREEN light predicts VI 15 sec schedule in the next phase

Behavior change in Behavioral Contrast
– Positive contrast: occurs when the rate of responding to the red key goes UP, even though the frequency of reward in the red component remains unchanged.
  – Remember: Phase 1: mult VI 60 (red) VI 60 (green) → mult VI 60 (red) EXT (green)
  – VI 60 for red key did NOT change, only the green key schedule changed
– Negative contrast: occurs when the rate of responding to the red key goes DOWN, even though the frequency of reward in the red component remains unchanged.
  – Remember: Phase 1: mult VI 60 (red) VI 60 (green) → mult VI 60 (red) VI 15 (green)
  – VI 60 for red key did NOT change, only the green key schedule changed

Robust phenomenon
– Contrast effect may occur following changes in the amount, frequency, or nature of the reward
– Occurs with concurrent as well as multiple schedules
– Shown to occur with various experimental designs and response measures (e.g., response rate, running speed)
– Shown to occur across many species (can’t say all because not all have been tested!)

Pullman Effect
[Map: Pullman, WA; Spokane, WA; Seattle]

Reinforcement options in Pullman:
– In Boston:
  – Go out to bars (many, many options)
  – Take a warm bath (CONSTANT component)
– In Pullman:
  – Go out to bar (1 bar, only 1 bar)
  – Take a warm bath
– Remember, Pullman is:
  – 100 miles from Spokane
  – 500 miles from Seattle
  – Next “other” city over 100,000:
    – Minneapolis
    – Las Vegas
– What happens to the rate of warm bath taking in Pullman compared to Boston?

Why behavioral contrast?
– Why does the animal change its response rate to the unchanged/constant component?
– Is this optimizing?
  – Remember, this is a VI schedule, not a VR schedule
  – If you use VR schedules, you get exclusive choice of the easier/faster schedule.