
Schedules of Reinforcement and Choice

Simple Schedules
- Two dimensions: ratio vs. interval, and fixed vs. variable
- Crossing them yields the four simple schedules: FR, VR, FI, VI

Fixed Ratio
- Continuous reinforcement (CRF) = FR 1
- FR > 1 is partial/intermittent reinforcement
- Produces a post-reinforcement pause (PRP)

Causes of the FR Post-Reinforcement Pause
- Fatigue hypothesis
- Satiation hypothesis
- Remaining-responses hypothesis
  – The reinforcer is a discriminative stimulus signalling that the next reinforcer will not arrive any time soon

Evidence
- PRP increases as FR size increases
  – Does not support the satiation hypothesis
- Multiple FR schedules (long and short requirements intermixed)
  – The PRP is longer when the upcoming schedule is long, shorter when it is short
  – Does not support the fatigue hypothesis
[Diagram: a multiple schedule alternating FR 10 and FR 40 components, with pauses marked long (L) or short (S)]

Fixed Interval
- Also produces a PRP
- Not due to remaining responses, though
- Reflects time estimation: animals learn roughly when the interval will elapse
- Pausing minimizes the cost-to-benefit ratio, since responses early in the interval are never reinforced

Variable Ratio
- Steady response pattern
- PRPs are unusual
- High response rate

Variable Interval
- Steady response pattern
- Slower response rate than VR

Comparison of VR and VI Response Rates
- The response rate on VR is faster than on VI
- Molecular theories
  – Focus on small-scale events
  – Reinforcement on a trial-by-trial basis
- Molar theories
  – Focus on large-scale events
  – Reinforcement over the whole session

IRT Reinforcement Theory
- A molecular theory
- IRT (interresponse time): the time between two consecutive responses
- VI schedules
  – Long IRTs are differentially reinforced
- VR schedules
  – Time is irrelevant
  – Short IRTs are reinforced
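
A minimal sketch of why the two schedules favour different IRTs, assuming the VI timer behaves like a Poisson process with a 60 s mean and the VR requirement averages 30 responses (both values are illustrative, not from the slides):

```python
import math

def vi_payoff_prob(irt, mean_interval=60.0):
    # On a VI schedule (approximated as a Poisson process), the chance
    # that reinforcement has been "set up" grows with the time since
    # the last response, so long IRTs are more likely to pay off.
    return 1.0 - math.exp(-irt / mean_interval)

def vr_payoff_prob(irt, mean_ratio=30):
    # On a VR schedule every response pays off with a fixed probability,
    # no matter how long the animal waited before responding.
    return 1.0 / mean_ratio

for irt in (1, 5, 30, 120):
    print(f"IRT {irt:>3} s: VI p={vi_payoff_prob(irt):.3f}, "
          f"VR p={vr_payoff_prob(irt):.3f}")
```

The VI payoff probability climbs with waiting time while the VR probability stays flat, which is the molecular explanation for the slower VI response rate.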

[Diagram: one list of random numbers (mean = 5, 30 reinforcer deliveries) read two ways: as seconds between reinforcer setups (an interval schedule) or as the number of responses required per reinforcer (a ratio schedule)]
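
The slide's procedure can be sketched in code: draw 30 numbers averaging 5 and read them either as seconds (interval schedule) or as response counts (ratio schedule). The exponential distribution is an assumption; the slide specifies only a random number generator with mean 5.

```python
import random

random.seed(1)  # reproducible illustration
draws = [random.expovariate(1 / 5) for _ in range(30)]  # mean = 5, 30 deliveries

# Interval reading: each draw is the number of SECONDS that must elapse
# before the next response can be reinforced.
setup_times, t = [], 0.0
for d in draws:
    t += d
    setup_times.append(round(t, 1))

# Ratio reading: each draw is the number of RESPONSES required for the
# next reinforcer (at least 1).
requirements = [max(1, round(d)) for d in draws]

print("first 5 VI setup times (s):", setup_times[:5])
print("first 5 VR requirements:  ", requirements[:5])
```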

Response-Reinforcer Correlation Theory
- A molar theory
- Looks at the response-reinforcer relationship across the whole experimental session
  – Long-range reinforcement outcome
  – Trial-by-trial events unimportant
- Criticism: too cognitive
[Graph: feedback functions for VI 60 s and VR 60, plotting reinforcers/hour against responses/minute]

Choice
- Two-key (or two-lever) protocol
- Ratio-ratio or interval-interval schedules
- Typically concurrent VI-VI
- Changeover delays (CODs) discourage rapid switching between alternatives

The Matching Law
- B = behaviour (responses), R = reinforcement
- B1/(B1 + B2) = R1/(R1 + R2), or equivalently B1/B2 = R1/R2
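
As a quick worked example (the reinforcement rates are hypothetical), the law predicts response allocation directly from relative reinforcement rates:

```python
def matching_share(r1, r2):
    # Matching law: B1 / (B1 + B2) = R1 / (R1 + R2)
    return r1 / (r1 + r2)

# If alternative 1 pays 40 reinforcers/hour and alternative 2 pays 20,
# about two thirds of responses should go to alternative 1.
print(matching_share(40, 20))  # 0.666...
```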

Bias
- Spending more time on one alternative than the matching law predicts
- Sources include side preferences, biological predispositions, and reinforcer quality and amount
- Related deviations: undermatching and overmatching

Qualities and Amounts
- Q1, Q2: quality of the first and second reinforcers
- A1, A2: amount of the first and second reinforcers

Undermatching
- The most common deviation
- Response proportions are less extreme than reinforcement proportions

Overmatching
- Response proportions are more extreme than reinforcement proportions
- Rare; found when a large penalty is imposed for switching
  – e.g., a barrier between the keys

Undermatching/Overmatching
[Graphs: response proportion B1/(B1+B2) plotted against reinforcement proportion R1/(R1+R2); the undermatching curve stays closer to indifference than perfect matching, the overmatching curve is more extreme]

Baum's Variation
- B1/B2 = b(R1/R2)^s
- s = sensitivity of behaviour to the relative rate of reinforcement
  – Perfect matching: s = 1
  – Undermatching: s < 1
  – Overmatching: s > 1
- b = response bias
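
A small sketch of how the sensitivity parameter reshapes predictions (the 4:1 reinforcement ratio is an arbitrary example):

```python
def baum(r1, r2, b=1.0, s=1.0):
    # Generalized matching law: B1/B2 = b * (R1/R2)**s
    return b * (r1 / r2) ** s

# With a 4:1 reinforcement ratio, undermatching (s < 1) pulls the
# predicted response ratio toward indifference, while overmatching
# (s > 1) pushes it toward exclusivity.
for s in (1.0, 0.5, 1.5):
    print(f"s={s}: predicted B1/B2 = {baum(4, 1, s=s):.2f}")
```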

Matching as a Theory of Choice
- The claim: animals match because they have evolved to do so
- A nice, simple approach, but ultimately wrong
- Consider a concurrent VR-VR schedule
  – Animals exclusively choose one alternative: whichever ratio is lower
  – The matching law cannot explain this

Melioration Theory
- Invest effort in the locally "best" alternative
- In concurrent VI-VI, responses are partitioned to get the best reinforcer-to-response ratio
  – Overshooting the goal sets up a feedback loop
- In concurrent VR-VR, responding keeps shifting toward the lower schedule, which gives the best reinforcer-to-response ratio
- The mixture of responding matters over the long run, but trial-by-trial responding shifts the balance (see the sketch below)
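
A toy simulation of melioration under stated assumptions: the choice proportion p shifts step by step toward whichever side currently yields more reinforcers per unit of responding. The VI feedback functions (120 and 60 reinforcers/hour with steady responding) are illustrative values, not from the slides.

```python
def meliorate(local_rate_1, local_rate_2, p=0.5, step=0.01, iters=2000):
    # Melioration: keep shifting the choice proportion p toward the
    # alternative with the higher LOCAL payoff.
    for _ in range(iters):
        if local_rate_1(p) > local_rate_2(p):
            p = min(1.0, p + step)
        elif local_rate_1(p) < local_rate_2(p):
            p = max(0.0, p - step)
    return p

# Concurrent VR 10 / VR 40: payoff per response is fixed (1/10 vs 1/40),
# so melioration drives responding exclusively to the richer VR 10.
print(meliorate(lambda p: 1 / 10, lambda p: 1 / 40))        # -> 1.0

# Concurrent VI 30 s / VI 60 s: each side's local payoff falls as more
# responding is invested in it, so the process settles near matching.
print(meliorate(lambda p: 120 / max(p, 1e-9),
                lambda p: 60 / max(1 - p, 1e-9)))           # -> ~0.667
```

The same local rule thus produces exclusive preference on VR-VR and matching on VI-VI, which is exactly the pattern the plain matching law could not explain.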

Optimization Theory
- Optimize reinforcement over the long term
- Minimum work for maximum gain
- Respond to both choices so as to maximize overall reinforcement

Momentary Maximization Theory
- A molecular theory
- Select whichever alternative has the highest value at that moment
- Trades short-term against long-term benefits

Delay-Reduction Theory
- Handles immediate and delayed reinforcement
- Builds on the basic principles of the matching law, and...
- Choice is directed toward whichever alternative gives the greatest reduction in delay to the next reinforcer
- Has both molar (matching of responses to reinforcement) and molecular (control by the shorter delay) features
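
A sketch of one standard formulation (Fantino's); the delays below are made-up numbers for illustration:

```python
def delay_reduction_choice(T, t1, t2):
    # Fantino's delay-reduction prediction:
    #   B1 / (B1 + B2) = (T - t1) / ((T - t1) + (T - t2))
    # T      = average delay to reinforcement from the start of a cycle
    # t1, t2 = remaining delays signalled by alternatives 1 and 2
    return (T - t1) / ((T - t1) + (T - t2))

# Alternative 1 cuts the expected wait from 60 s to 10 s; alternative 2
# only cuts it to 30 s, so alternative 1 should draw more choices.
print(delay_reduction_choice(T=60, t1=10, t2=30))  # 0.625
```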

Self-Control
- A conflict between short-term and long-term choices
- Choice between a small, immediate reward and a larger, delayed reward
- Self-control is easier if the immediate reinforcer is delayed or made harder to get

Value-Discounting Function
- V = M/(1 + KD)
  – V = value of the reinforcer
  – M = reward magnitude
  – K = discounting-rate parameter
  – D = reward delay
- Example: set M = 10, K = 5
  – If D = 0, then V = 10/(1 + 0) = 10
  – If D = 10, then V = 10/(1 + 5·10) = 10/51 ≈ 0.196
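
The slide's formula and arithmetic, as a short sketch:

```python
def value(M, K, D):
    # Hyperbolic value-discounting: V = M / (1 + K*D)
    return M / (1 + K * D)

print(value(M=10, K=5, D=0))   # 10.0      (no delay, full value)
print(value(M=10, K=5, D=10))  # 0.196...  (10/51, heavily discounted)

# The next slide's question in reverse: solving V = M/(1 + K*D) for M
# gives the magnitude needed to offset a delay of D = 5.
target = value(M=5, K=5, D=1)  # 5/6 ~= 0.833
print(target * (1 + 5 * 5))    # ~21.67, the slide's M = 21.66
```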

Reward Size and Delay
- Set M = 5, K = 5, D = 1
  – V = 5/(1 + 5·1) = 5/6 ≈ 0.833
- Set M = 10, K = 5, D = 5
  – V = 10/(1 + 5·5) = 10/26 ≈ 0.385
- To get the same V ≈ 0.833 at D = 5, M must be raised to 21.66

Ainslie-Rachlin Theory
- The value of a reinforcer decreases as the delay between choosing it and getting it increases
- Choose whichever reinforcer has the higher value at the moment of choice
- Because the value curves of two rewards can cross, preference reverses over time; hence the ability to change one's mind and the appeal of binding decisions
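
A short demonstration of the resulting preference reversal, using the hyperbolic function from the earlier slide with made-up magnitudes and delays:

```python
def value(M, K, D):
    # Hyperbolic discounting: V = M / (1 + K*D)
    return M / (1 + K * D)

# Small-sooner reward (M=5) vs. large-later reward (M=10, arriving 4 s later).
K = 1.0
for wait in (5.0, 0.5):  # time remaining until the SMALL reward arrives
    v_small = value(5, K, wait)
    v_large = value(10, K, wait + 4)
    pick = "small-sooner" if v_small > v_large else "large-later"
    print(f"{wait:>4} s out: small={v_small:.2f}, large={v_large:.2f} -> {pick}")

# Far from both rewards the larger one wins; close to the small reward,
# preference reverses -- the signature Ainslie-Rachlin crossover.
```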