
PRINCIPLES OF APPETITIVE CONDITIONING Chapter 6 1

Early Contributors 2
  Thorndike's contribution
    Emphasized laws of behavior
    Demonstrated trial-by-trial learning
    S-R learning
  Skinner's contribution
    Emphasized contingency: a specified relationship between behavior and reinforcement in a given situation
    The environment "sets" the contingencies: S(R -> O)

A "Faux" Distinction 3
  Instrumental conditioning: a conditioning procedure in which the environment constrains the opportunity for reward (discrete trials)
  Operant conditioning: a specific response produces reinforcement, and the frequency of the response determines the amount of reinforcement obtained (continuous responding, schedules of reinforcement)

Thorndike's Law of Effect 4
  S-R associations are "stamped in" by reward (satisfiers)
  [Diagram: Stimulus -> Response]

Thorndike: "What is learned?" 5
  Reinforcement "stamps in" the S-R connection
  Habit learning
  [Diagram: S -> R]

Is that it? 6
  [Diagram: S, R, and O, with the S-O link labeled "Pavlovian association" and the R-O link labeled "instrumental association"]

"O" matters 7
  The importance of past experience
  Depression/negative contrast: the effect in which a shift from high to low reward magnitude produces a lower level of responding than if the reward magnitude had always been low
  Elation/positive contrast: the effect in which a shift from low to high reward magnitude produces a greater level of responding than if the reward magnitude had always been high

Negative and Positive Contrast 8

Logic of Devaluation Experiment 9
  R-O or goal-directed: responding is controlled by the current value of the reinforcer, and so it should be reduced to zero after devaluation
  S-R or habit: responding is not controlled by the current value of the reward, and so it is insensitive to reinforcer devaluation
  [Graph: responding (max to min) under normal vs. devalued conditions]

R-O Association (aka the instrumental association) 10

                  Phase 1     Devaluation      Test
  Push Left   ->  Pellet      Pellet + LiCl    Right?
  Push Right  ->  Sucrose     Sucrose + LiCl   Left?

  [Graph: number of left and right pushes when the pellet or the sucrose was devalued]

Summary of Devaluation 11
  Neutered male rats lower, but do not eliminate, responding previously associated with access to a receptive ("ripe") female rat
  Rats satiated on reward #1 lower responding for reward #1 more than for reward #2
  Goal-devaluation effects tend to shrink with continued training, as goal-directed responding is replaced by habit learning

S-O Association (aka Pavlovian association) 12

            Stage 1           Stage 2             Test
  Right ->  Pellet            Tone  -> Pellet     Tone: Left? Right?
  Left  ->  Sucrose           Light -> Sucrose    Light: Left? Right?

  [Graph: number of left and right presses during the tone and the light]

Skinner's Contributions 13
  Automatic
  Easy measurements that can be compared across species

Three Terms Define the Contingency 14
  The three-term contingency:
    Discriminative stimulus (S+ or S-)
    Operant (R)
    Consequence (O)

[Diagram: Skinner box. S+ (light on) -> R (push lever, among other behaviors such as bite, groom, lick, rear) -> O (reinforcer); the reinforced operant is strengthened] 15

Techniques and Concepts 16
  Shaping: successive approximations
    Require closer and closer approximations to the target behavior
  Secondary reinforcers: stimuli accompanying reinforcer delivery
  Marking: feedback that a response has occurred

Shaping 17
  Shaping (or the successive approximation procedure): select a frequently occurring operant behavior, then slowly change the contingency until the desired behavior is learned

Training a Rat to Bar Press 18
  Step 1: reinforce the rat for eating out of the food dispenser
  Step 2: reinforce it for moving away from the food dispenser
  Step 3: reinforce it for moving in the direction of the bar
  Step 4: reinforce it for pressing the bar
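The stepwise tightening of the reinforcement criterion can be sketched as a toy simulation. This is only an illustrative sketch, not a procedure from the chapter: the shape function, the criterion values, and the learning-rate numbers are all hypothetical.

```python
import random

# Toy shaping-by-successive-approximation sketch (hypothetical parameters).
# "behavior" is an abstract 0-to-1 score for how close the rat's current action
# is to a full bar press. The criterion is raised in steps, and reinforcement
# nudges the mean of the rat's behavior toward whatever was just reinforced.

def shape(criteria=(0.2, 0.4, 0.6, 0.8, 1.0), trials_per_step=200, seed=1):
    rng = random.Random(seed)
    mean, spread = 0.1, 0.15          # rat starts far from bar pressing, with some variability
    for criterion in criteria:
        reinforced = 0
        for _ in range(trials_per_step):
            behavior = min(1.0, max(0.0, rng.gauss(mean, spread)))
            if behavior >= criterion:                          # meets the current approximation
                reinforced += 1
                mean += 0.05 * (behavior - mean) + 0.01        # behavior drifts toward the reinforced form
        print(f"criterion {criterion:.1f}: {reinforced} reinforcers, mean behavior now {mean:.2f}")

shape()
```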

Appetitive Reinforcers 19
  Primary reinforcer: an activity whose reinforcing properties are innate
  Secondary reinforcer: an event that has developed its reinforcing properties through its association with primary reinforcers

Primary Reward Magnitude 20
  The acquisition of an instrumental or operant response
    The greater the magnitude of the reward, the faster the task is learned
    The differences in performance may reflect motivational differences

Magnitude 21

Primary Reward and Degraded Contingency 22
  [Diagram: event records of bar presses and food deliveries under each contingency]
  Perfect contingency -> strong responding
  Degraded contingency -> weak responding
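One common way to quantify response-reinforcer contingency (the slides show it only graphically; the delta-P measure below is a standard formalization, not taken from the text) is the difference between the probability of food given a bar press and the probability of food given no bar press.

```python
# Contingency sketch: delta-P = P(food | bar press) - P(food | no bar press).
# (An assumed, standard measure; the probabilities below are hypothetical.)

def delta_p(p_food_given_press, p_food_given_no_press):
    return p_food_given_press - p_food_given_no_press

# Perfect contingency: food only ever follows a bar press.
print(delta_p(1.0, 0.0))   # 1.0 -> strong responding

# Degraded contingency: "free" food arrives whether or not the rat presses.
print(delta_p(1.0, 0.8))   # 0.2 -> weak responding
```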

Strength of Secondary Reinforcers 23
  Several variables affect the strength of a secondary reinforcer:
    The magnitude of the primary reinforcer
    The number of primary-secondary pairings (the more pairings, the stronger the reinforcing power)
    The time elapsing between the presentation of the secondary reinforcer and the primary reinforcer

Primary-Secondary Pairings 24

Schedules of Reinforcement 25
  Schedule of reinforcement: a contingency that specifies how often or when we must act to receive reinforcement

Schedules of Reinforcement  Fixed Ratio  Reinforcement is given after a given number of responses  Short pauses  Variable Ratio  After a varying number of responses 26

Schedules of Reinforcement  Fixed Interval  First response after a given interval is rewarded  FI Scallop  Variable Interval  Like FI but varies with a given average  Scallop disappears 27

Fixed Interval Schedule 28
  Fixed interval schedule: reinforcement is available only after a specified period of time, and the first response emitted after the interval has elapsed is reinforced
  Scallop effect
    The ability to withhold responding until close to the end of the interval increases with experience
    The pause is longer with longer FI schedules

Variable Interval Schedules 29
  Variable interval schedule: there is an average interval of time between available reinforcers, but the interval varies from one reinforcement to the next
  Characterized by steady rates of responding
  The longer the average interval, the lower the response rate
  The scallop effect does not occur on VI schedules
  Encourages S-R habit learning
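The behavior of the four basic schedules can be sketched in a toy simulation. This is a hedged illustration only: the run_schedule function, the constant one-response-per-second responder, and the schedule values are hypothetical, not from the text. It shows the key structural difference that ratio schedules pay off in proportion to responding, while interval schedules cap the reinforcement rate.

```python
import random

# Toy schedule simulation (hypothetical): a responder emits one response per second;
# the schedule decides which responses produce a reinforcer.

def run_schedule(kind, value, seconds=600, seed=0):
    rng = random.Random(seed)
    reinforcers = 0
    count_since_rft = 0                      # responses since the last reinforcer (FR/VR)
    time_since_rft = 0.0                     # seconds since the last reinforcer (FI/VI)
    requirement = value if kind == "FR" else max(1, round(rng.expovariate(1.0 / value)))
    interval = value if kind == "FI" else rng.expovariate(1.0 / value)

    for _ in range(seconds):                 # one response per second, for simplicity
        count_since_rft += 1
        time_since_rft += 1.0
        if kind in ("FR", "VR") and count_since_rft >= requirement:
            reinforcers += 1
            count_since_rft = 0
            requirement = value if kind == "FR" else max(1, round(rng.expovariate(1.0 / value)))
        elif kind in ("FI", "VI") and time_since_rft >= interval:
            reinforcers += 1                 # the first response after the interval collects it
            time_since_rft = 0.0
            interval = value if kind == "FI" else rng.expovariate(1.0 / value)
    return reinforcers

for kind, value in [("FR", 10), ("VR", 10), ("FI", 30), ("VI", 30)]:
    print(f"{kind}-{value}: {run_schedule(kind, value)} reinforcers in 10 minutes")
```

With the same 600 responses, the ratio schedules earn about 60 reinforcers while the interval schedules earn about 20, which is consistent with the steadier, lower response rates that interval schedules sustain.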

Some Other Schedules 30
  DRL: differential reinforcement of low rates of responding
  DRH: differential reinforcement of high rates of responding
  DRO: differential reinforcement of other behavior (anything but the target behavior)

Compound Schedules 31
  Compound schedule: a complex contingency in which two or more schedules of reinforcement are combined

Schedule this... 32
  Concurrent schedules permit the subject to alternate between different schedules, or to repeatedly choose between working on different schedules
  [Diagram: two concurrent alternatives, A and B (e.g., VI-30 vs. VI-60, or "$5 today" vs. "$50 if you wait")]

Matching Law 33
  B1/(B1+B2) = R1/(R1+R2)
  B stands for the number of responses on an alternative
  R stands for the number of reinforcers earned on that alternative
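A small worked example (the schedule values and response counts are hypothetical, just to show the arithmetic): on concurrent VI-30 VI-10 schedules the maximum reinforcement rates are about 2 and 6 reinforcers per minute, so matching predicts 2/(2+6) = 25% of responding on the VI-30 alternative.

```python
# Matching-law arithmetic sketch (hypothetical numbers).

def matching_share(r1, r2):
    """Predicted proportion of behavior allocated to alternative 1: R1/(R1+R2)."""
    return r1 / (r1 + r2)

# Concurrent VI-30 vs. VI-10: reinforcement rates of about 2/min and 6/min.
print(matching_share(2.0, 6.0))   # 0.25 -> 25% of responses predicted on the VI-30 key

# Obtained behavior, e.g., 150 presses on key 1 and 450 presses on key 2:
b1, b2 = 150, 450
print(b1 / (b1 + b2))             # 0.25 -> relative responding matches relative reinforcement
```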

Sniffy the Rat 34

  Schedule ("1" vs. "2")    Behavior B1/(B1+B2)    R1/(R1+R2)
  VI-30 vs. VI-10                                  25%
  VI-10 vs. VI-30                                  75%
  VI-10 vs. VI-
  VI-50 vs. VI-
  VI-30 vs. VI-30                                  50%
  VI-10 vs. VI-10                                  50%

Typical Result 35

Deviations From Matching 36
  Bias: a preference for one response over the other that has nothing to do with the schedules programmed, for example:
    One pigeon key requires more force to close its contact than the other, so the pigeon has to peck harder on it
    One food hopper delivers food more quickly than the other

Sensitivity 37
  Overmatching: the relative rate of responding is more extreme than predicted by matching; the subject appears to be "too sensitive" to the schedule differences
  Undermatching: the relative rate of responding on a key is less extreme than predicted by matching; the subject appears to be "insensitive" to the schedule differences
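Bias and sensitivity are usually folded into the generalized matching law, B1/B2 = b(R1/R2)^s, where b captures bias and the exponent s captures sensitivity (s < 1 is undermatching, s > 1 is overmatching). The equation itself is standard (Baum, 1974), but it is not spelled out in the slides, and the numbers in the sketch below are hypothetical.

```python
# Generalized matching law sketch: B1/B2 = b * (R1/R2)**s (hypothetical values).
# s = 1 and b = 1 reduce to strict matching; s < 1 undermatching; s > 1 overmatching.

def predicted_behavior_ratio(r1, r2, s=1.0, b=1.0):
    """Predicted response ratio B1/B2 given reinforcement rates R1 and R2."""
    return b * (r1 / r2) ** s

r1, r2 = 6.0, 2.0                     # e.g., VI-10 vs. VI-30 reinforcement rates (3:1)
for s in (0.6, 1.0, 1.4):             # under-, strict, and overmatching
    ratio = predicted_behavior_ratio(r1, r2, s=s)
    share = ratio / (1 + ratio)       # convert B1/B2 into B1/(B1+B2)
    print(f"s = {s}: predicted share on the richer key = {share:.0%}")
```

Strict matching predicts 75% on the richer key; with s = 0.6 the predicted share drops to about 66% (undermatching), and with s = 1.4 it rises to about 82% (overmatching).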

Overmatching

Poor Self-Control 39
  [Diagram: direct choice on a concurrent schedule between alternative A (small reward) and alternative B (LARGE reward)]

Self-Control and Overmatching 40
  Concurrent choice
    Humans and nonhumans often choose an immediate small reward over a larger delayed reward (delayed rewards are "discounted")

Another Example of Impulsivity 41
  "Free" reinforcers are given every 20 s (at 20 s, 40 s, 60 s, ...)
  A lever press advances delivery of the first pellet but deletes the second pellet
  So if you press at 2 seconds, you get a pellet immediately, but you get no other pellet until the 60-second pellet is available

Delay of Reinforcement 42
  Delayed reinforcers are steeply discounted
  Loss of self-control and impulsivity
  [Graph: reinforcer potency declining as a function of delay]
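The discounting curve in the graph is commonly described with Mazur's hyperbolic equation, V = A / (1 + kD), where A is reward amount, D is delay, and k is a discounting-rate parameter. The slides only say that delayed reinforcers are steeply discounted; the equation is a standard formalization added here, and the k and reward values below are hypothetical.

```python
# Hyperbolic delay discounting sketch (assumed model: V = A / (1 + k*D); hypothetical numbers).

def discounted_value(amount, delay, k=0.2):
    """Subjective value of a reward of size `amount` delivered after `delay` seconds."""
    return amount / (1 + k * delay)

small_now   = discounted_value(1, delay=2)    # small reward, nearly immediate
large_later = discounted_value(3, delay=30)   # three times larger, but long delayed

print(round(small_now, 2), round(large_later, 2))   # 0.71 vs. 0.43: the small, immediate
                                                    # reward has the higher subjective value,
                                                    # so the "impulsive" choice wins
```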

Concurrent Chain (Pre-commitment) 43
  [Diagram: initial links A and B leading to terminal links with the small (A) and LARGE (B) rewards]

Behavioral Methods for Self-Control 44
  Pre-commitment (self-exclusion contracts)
  Distraction
  Modeling
  Shaping waiting
    Reduce the delay for the small reward
    Increase the delay for the large reward

The Discontinuance of Reinforcement 45
  Extinction: the elimination or suppression of a response caused by the discontinuation of reinforcement or the removal of the unconditioned stimulus
  When reinforcement is first discontinued, the rate of responding remains high; under some conditions, it even increases

Extinction Paradox 46
  Stronger learning ≠ slower extinction
  Partial reinforcement extinction effect (PREE)

Importance of Consistency of Reward 47
  Extinction is slower following partial rather than continuous reinforcement
  Partial reinforcement extinction effect (PREE): the greater resistance to extinction of an instrumental or operant response following intermittent rather than continuous reinforcement during acquisition
  One of the most reliable phenomena in psychology

Acquisition with Differing Percentages 48
  [Graph: response speed across days of acquisition for the 100% group vs. the 80/50/30% groups]

Extinction with Differing Percentages 49
  [Graph: response speed across days of extinction for the 100%, 80%, 50%, and 30% groups]

Explanations 50
  Mowrer-Bitterman discrimination hypothesis
  Amsel's frustration theory (emotional)
  Capaldi's sequential theory (cognitive)

Theios Experiment (not just discrimination) 51

        Phase 1    Phase 2    Extinction
  G1    100%                  0%
  G2    100%                  0%
  G3    50%       100%        0%
  G4    50%       -           0%

  [Graph: response speed across extinction trials for the same four groups; G3 and G4 (50% in Phase 1) remain faster than G1 and G2 (100% in Phase 1)] 52

Amsel’s Frustration Theory 53

Amsel’s Frustration Theory 100% Reinforcement Group 54

Amsel’s Frustration Theory 50% Reinforcement Group 55

Amsel (Percentage Reinforcement) 56
  [Graph: response speed across extinction trials for the 100% and 50% groups]

Amsel's Frustration Theory 57

  Between-subject design
    Group 1:  T -> F  (100%)    Extinction: T-
    Group 2:  N -> F  (50%)     Extinction: N-
    Result: PREE

  Within-subject design
    Trials 1, 3, 6, ...:  T -> F  (100%)    Extinction: T-
    Trials 2, 4, 5, ...:  N -> F  (50%)     Extinction: N-
    Result: reversed PREE

Influence of Reward Magnitude 58
  The influence of reward magnitude on resistance to extinction depends on the amount of acquisition training
  With extended acquisition, a small consistent reward may produce more resistance to extinction than a large reward (the absence of a large reward is more frustrating)

Reward Magnitude and Percentage 59

Sequential Theory 60
  If reward follows nonreward, the animal will associate the memory of the nonrewarded experience with the operant or instrumental response
  During extinction, the only memory present after the first nonrewarded experience is that of the nonrewarded experience

61
  Animals receiving continuous reward do not experience nonrewarded responses, so they do not associate nonrewarded responses with later reward
  Thus, the memory of receiving a reward after persistence in the face of nonreward becomes a cue for continued responding
  Variables that strengthen the effect (illustrated in the sketch below):
    Number of N-R transitions (nonrewarded trials followed by rewarded trials)
    N-length (the number of consecutive nonrewarded trials preceding a reward)
    Variability of N-length
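To make the sequence variables concrete, here is a toy counter. The sequence_stats helper and the example sequences are hypothetical, not taken from Capaldi's experiments; they simply tally N-R transitions and N-lengths in a training sequence written as R (rewarded) and N (nonrewarded) trials.

```python
# Count Capaldi's sequence variables in a string of trial outcomes (hypothetical helper).

def sequence_stats(trials):
    transitions = 0        # number of N-R transitions
    n_lengths = []         # length of each run of N trials that ended in an R
    run = 0
    for outcome in trials:
        if outcome == "N":
            run += 1
        else:              # outcome == "R"
            if run > 0:
                transitions += 1
                n_lengths.append(run)
            run = 0
    return transitions, n_lengths

# Two 50%-reinforcement sequences with different structure:
print(sequence_stats("RNRNRNRN"))   # (3, [1, 1, 1]) -> many short N-runs
print(sequence_stats("NNNRNNNR"))   # (2, [3, 3])    -> fewer but longer N-runs
```

On the sequential-theory account, the second sequence, with its longer N-lengths, should produce the greater resistance to extinction.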

62
  What is the significance of the PREE?
  It encourages organisms to persist even though every behavior is not reinforced
  In the natural environment, not every attempt to attain a desired goal is successful
  The PREE is adaptive because it motivates animals not to give up too easily