Published by Alban Watts. Modified over 9 years ago.
1
Instrumental Conditioning: Motivational Mechanisms
2
Contingency-Shaped Behaviour
Uses three-term contingency
Reinforcement schedule (e.g., FR10) imposes contingency
Seen in non-humans and humans
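As a minimal sketch of how a fixed-ratio schedule such as FR10 imposes a contingency, the class below delivers a reinforcer on every tenth response. The class and method names are hypothetical and chosen only for illustration.

```python
class FixedRatioSchedule:
    """Deliver a reinforcer after every `ratio`-th response (FR10 -> ratio=10)."""

    def __init__(self, ratio: int) -> None:
        if ratio < 1:
            raise ValueError("ratio must be a positive integer")
        self.ratio = ratio
        self.count = 0  # responses made since the last reinforcer

    def respond(self) -> bool:
        """Register one response; return True if it earns a reinforcer."""
        self.count += 1
        if self.count == self.ratio:
            self.count = 0
            return True
        return False


fr10 = FixedRatioSchedule(10)
outcomes = [fr10.respond() for _ in range(25)]
# 1-based positions of the responses that were reinforced
reinforced_at = [i + 1 for i, earned in enumerate(outcomes) if earned]
print(reinforced_at)
```

With 25 responses, only the 10th and 20th are reinforced; the remaining five responses count toward the next reinforcer.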
3
Rule-Governed Behaviour
Particularly in humans
Behaviour can be varied and unpredictable
Invent rules or use (in)appropriate rules across conditions (e.g., language)
Age-dependent, primary vs. secondary reinforcers, experience
4
Role of Response in Operant Conditioning
Thorndike
–Performance of response necessary
Tolman
–Formation of expectation
McNamara, Long & Wike (1956)
–Maze
–Running rats or riding rats (cart)
–Association is what is needed
5
Role of the Reinforcer
Is reinforcement necessary for operant conditioning?
Tolman & Honzik (1930)
Latent learning
–Not necessary for learning
–Necessary for performance
6
Results
[Figure: average errors across days for three groups: food, no food, and no food until day 11. Day 11 is marked.]
7
Associative Structure in Instrumental Conditioning
Basic forms of association
–S = stimulus, R = response, O = outcome
S-R
Thorndike, Law of Effect
Role of reinforcer: stamps in S-R association
No R-O association acquired
8
Hull and Spence
Law of Effect, plus a classical conditioning process
Stimulus evokes response via Thorndike’s S-R association
Also, S-O association creates expectancy of reward
Two-process approach
–Classical and instrumental are different
9
One Process or Two Processes?
Are instrumental and classical the same (one process) or different (two processes)?
Omission control procedure
–US presentation depends on non-occurrence of CR
–No CR, then CS ---> US
–CR, then CS ---> no US
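The omission contingency can be sketched as a single rule per trial. The function below is only an illustration: the function name and the group labels "classical" and "omission" are assumptions, with "classical" standing for a comparison group that receives the US on every trial.

```python
def us_delivered(cr_occurred: bool, group: str) -> bool:
    """Return True if the US follows the CS on this trial.

    group == "classical": US on every trial, regardless of the CR.
    group == "omission":  US only when no CR was made to the CS.
    """
    if group == "classical":
        return True
    if group == "omission":
        return not cr_occurred
    raise ValueError(f"unknown group: {group!r}")


for group in ("classical", "omission"):
    for cr in (True, False):
        print(group, "CR" if cr else "no CR", "->",
              "US" if us_delivered(cr, group) else "no US")
```

In the omission group every CR cancels the US, so a subject that responds reliably also experiences many CS-alone trials.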
10
Omission Control
[Figure: timelines of CS, US, and CR for a trial with a CR and a trial without a CR.]
11
Gormezano & Coleman (1973)
Eyeblink with rabbits
US = shock, CS = tone
Classical group: 5 mA shock each trial, regardless of response
Omission group: making eyeblink CR to CS prevents delivery of US
12
One-process prediction:
–CR acquisition faster and stronger for Omission group
–Reinforcement for CR is shock avoidance
–In Classical group CR will be present because it somehow reduces shock aversiveness
BUT…
–CR acquisition slower in Omission group
–Classical conditioning extinction (not all CSs followed by US)
Supports two-process theory
13
Classical in Instrumental
Classical conditioning process provides motivation
Stimulus substitution
S acquires properties of O
–r_g = fractional anticipatory goal response
Response leads to feedback
–s_g = sensory feedback
r_g-s_g constitutes expectancy of reward
14
Timecourse
[Figure: timeline showing S, R, O, and r_g-s_g.]
Through stimulus substitution, S elicits r_g-s_g, giving motivational expectation of reward
15
Prediction
According to r_g-s_g, CR should occur before operant response; but doesn’t always
Dog lever pressing on FR33 ---> PRP (post-reinforcement pause)
Low lever presses early, then higher; but salivation only later
[Figure: magnitude of lever pressing and salivation against time from start of trial.]
16
Modern Two-Process Theory
Classical conditioning in instrumental
Neutral stimulus ---> elicits motivation
Central Emotional State (CES)
CES is a characteristic of the nervous system (“mood”)
CES won’t produce only one response
–Bit annoying re: prediction of effect
17
Prediction
Rate of operant response modified by presentation of CS
CES develops to motivate operant response
CS from classical conditioning also elicits CES
Therefore, giving CS during instrumental conditioning should alter CES that motivates instrumental response
18
“Explicit” Predictions
Emotional states
CS     US: Appetitive (e.g., food)    US: Aversive (e.g., shock)
CS+    Hope                           Fear
CS-    Disappointment                 Relief
19
Behavioural predictions
Aversive US
Instrumental schedule     CS+ (fear)    CS- (relief)
Positive reinforcement    decrease      increase
Negative reinforcement    increase      decrease
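The two tables above can be restated as lookups, which makes it easy to check a prediction for a given schedule and CS. This is only a sketch: the dictionary and function names are hypothetical, and only the aversive-US case is covered because that is the only case the slides tabulate.

```python
# Emotional state elicited by a CS, keyed by (US type, CS sign),
# copied from the "Explicit" Predictions table.
EMOTIONAL_STATE = {
    ("appetitive", "CS+"): "hope",
    ("appetitive", "CS-"): "disappointment",
    ("aversive", "CS+"): "fear",
    ("aversive", "CS-"): "relief",
}

# Predicted change in operant rate when a CS trained with an aversive US
# is presented during an instrumental schedule (Behavioural predictions table).
AVERSIVE_CS_EFFECT = {
    ("positive reinforcement", "CS+"): "decrease",
    ("positive reinforcement", "CS-"): "increase",
    ("negative reinforcement", "CS+"): "increase",
    ("negative reinforcement", "CS-"): "decrease",
}


def predict(schedule: str, cs_sign: str) -> tuple[str, str]:
    """Return (emotional state, predicted rate change) for an aversive-US CS."""
    state = EMOTIONAL_STATE[("aversive", cs_sign)]
    change = AVERSIVE_CS_EFFECT[(schedule, cs_sign)]
    return state, change


print(predict("positive reinforcement", "CS+"))
print(predict("negative reinforcement", "CS+"))
```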
20
R-O and S(R-O)
Earlier interpretations had no response-reinforcement associations
Intuitive explanation, though
Perform response to get reinforcer
21
Colwill & Rescorla (1986)
R-O association
Devalue reinforcer post-conditioning
Does operant response decrease?
Bar push right or left for different reinforcers
–Food or sucrose
[Figure: Testing of Reinforcers. Mean responses/min. across blocks of extinction trials, for the devalued reinforcer and the normal reinforcer.]
22
Interpretation
Can’t be S-R
–No reinforcer in this model
Can’t be S-O
–Two responses, same stimuli (the bar), but only one response affected
Conclusion
–Each response associated with its own reinforcer
–R-O association
23
Hierarchical S-(R-O)
R-O model lacks stimulus component
Stimulus required to activate association
Really, Skinner’s (1938) three-term contingency
Old idea; recent empirical testing
24
Colwill & Delamater (1995)
Rats trained on pairs of S+
Biconditional discrimination problem
–Two stimuli
–Two responses
–One reinforcer
Match the correct response to the stimuli to be reinforced
Training, reinforcer devaluation, testing
25
Training
–Tone: lever --> food; chain --> nothing
–Noise: chain --> food; lever --> nothing
–Light: poke --> sucrose; handle --> nothing
–Flash: handle --> sucrose; poke --> nothing
Aversion conditioning
Testing: marked reduction in previously reinforced response
–Tone: lever press vs. chain
–Noise: chain vs. lever
–Light: poke vs. handle
–Flash: handle vs. poke
26
Analysis
Can’t be S-O
–Each stimulus associated with same reinforcer
Can’t be R-O
–Each response reinforced with same outcome
Can’t be S-R
–Due to devaluation of outcome
Each S activates a corresponding R-O association
27
Reinforcer Prediction, A Priori
Simple definition
–A stimulus that increases the future probability of a behaviour
–Circular explanation
Would be nice if we could predict beforehand
28
Need Reduction Approach
Primary reinforcers reduce biological needs
Biological needs: e.g., food, water
Not biological needs: e.g., sex, saccharin
Undetectable biological needs: e.g., trace elements, vitamins
29
Drive Reduction
Clark Hull
Homeostasis
–Drive systems
Strong stimuli aversive
Reduction in stimulation is reinforcer
–Drive is reduced
Problems
–Objective measurement of stimulus intensity
–Where stimulation doesn’t change or increases!
30
Trans-situationality
A stimulus that is a reinforcer in one situation will be a reinforcer in others
Subsets of behaviour
–Reinforcing behaviours
–Reinforceable behaviours
Often works with primary reinforcers
Problems with other stimuli
31
Primary and Incentive Motivation
Where does motivation to respond come from?
Primary: biological drive state
Incentive: from reinforcer itself
32
But…
Consider: What if we treat a reinforcer not as a stimulus or an event, but as a behaviour in and of itself?
Fred Sheffield (1950s)
Consummatory-response theory
–E.g., not the food, but the eating of food that is the reinforcer
–E.g., saccharin has no nutritional value, can’t reduce drive, but is reinforcing due to its consumability
33
Premack’s Principle
Reinforcing responses occur more than the responses they reinforce
H = high probability behaviour
L = low probability behaviour
If L ---> H, then H reinforces L
But, if H ---> L, L does not reinforce H
“Differential probability principle”
No fundamental distinction between reinforcers and operant responses
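A minimal sketch of the differential probability principle: a contingent activity is predicted to reinforce an operant only if its baseline probability is higher. The function name and the baseline values are hypothetical, chosen only to illustrate the comparison.

```python
def premack_reinforces(p_contingent: float, p_operant: float) -> bool:
    """True if the contingent activity should reinforce the operant,
    i.e. the contingent activity has the higher baseline probability."""
    return p_contingent > p_operant


# Hypothetical baseline probabilities for one individual (not from the slides).
baseline = {"H": 0.7, "L": 0.3}

# L ---> H: performing L gives access to H, so H is the contingent activity.
print(premack_reinforces(baseline["H"], baseline["L"]))
# H ---> L: performing H gives access to L, so L is the contingent activity.
print(premack_reinforces(baseline["L"], baseline["H"]))
```

The same function applied to each individual's own baseline captures the point that what counts as a reinforcer depends on that individual's preferences.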
34
Premack (1965)
Two alternatives
–Eat candy, play pinball
–Phase I: determine individual behaviour probability (baseline)
Gr1: pinball (operant) to eat (reinforcer)
Gr2: eating candy (operant) to play pinball (reinforcer)
–Phase II (testing)
T1: play pinball (operant) to eat (reinforcer)
–Only Gr1 kids increased operant
T2: eat (operant) to play pinball (reinforcer)
–Only Gr2 kids increased operant
35
Premack in Brief
Any activity could be a reinforcer if it is more likely to be “preferred” than the operant response.
36
Response Deprivation Hypothesis
Restriction to reinforcer response
Theory:
–Impose response deprivation
–Now, low probability responses can reinforce high probability responses
Instrumental procedures withhold reinforcer until response made; in essence, deprived of access to reinforcer
Reinforcer produced by operant contingency itself
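The slide does not state a formal condition. A common formulation, usually attributed to Timberlake and Allison (1974), says a schedule deprives the subject of the contingent response when I/C > O_i/O_c, where I is the amount of instrumental responding required, C is the amount of contingent responding earned, and O_i and O_c are the two baseline levels. The sketch below checks that inequality; the function name and all numbers are hypothetical.

```python
def response_deprived(required_i: float, earned_c: float,
                      baseline_i: float, baseline_c: float) -> bool:
    """Check the response deprivation condition I/C > O_i/O_c.

    Cross-multiplied to avoid dividing by zero; all inputs are assumed
    to be non-negative amounts (e.g. minutes or response counts).
    """
    return required_i * baseline_c > earned_c * baseline_i


# Hypothetical baselines: H is the high-probability behaviour (30 units),
# L the low-probability behaviour (10 units).
baseline_h, baseline_l = 30.0, 10.0

# H is instrumental and L is contingent: 10 units of H earn 1 unit of L.
# L is held below its baseline share, so L is predicted to reinforce H.
print(response_deprived(10, 1, baseline_h, baseline_l))

# A 1:1 schedule gives more L per unit of H than baseline, so no deprivation.
print(response_deprived(1, 1, baseline_h, baseline_l))
```

The first case shows how a low probability response can act as a reinforcer for a high probability one once access to it is restricted enough.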
37
Behavioural Regulation
Physiological homeostasis
Analogous process in behavioural regulation
Preferred/optimal distribution of activities
Stressors move organism away from optimum behavioural state
Respond in ways to return to ideal state
38
Behavioural Bliss Point
Unconstrained condition: distribute activities in a way that is preferred
Behavioural bliss point (BBP)
Relative frequency of all behaviours in unconstrained condition
Across conditions
–BBP shifts
Within condition
–BBP stable across time
39
Imposing a Contingency
Puts pressure on BBP
Act to defend challenges to BBP
But requirements of contingency (may) make achieving BBP impossible
Compromise required
Redistribute responses so as to get as close to BBP as possible
40
Minimum Deviation Model
Behavioural regulation
Due to imposed contingency:
Redistribute behaviour
Minimize deviation of responses from BBP
–Get as close as you can
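"Get as close as you can" can be sketched geometrically under two assumptions not stated on the slide: the contingency is a straight line through the origin (y = ratio * x, a fixed exchange rate between two activities), and deviation is plain Euclidean distance from the BBP. The closest reachable point is then the orthogonal projection of the BBP onto the schedule line. Function names and numbers are hypothetical.

```python
import math


def closest_point_on_schedule(bliss: tuple[float, float],
                              ratio: float) -> tuple[float, float]:
    """Project the bliss point onto the schedule line y = ratio * x.

    Assumes equal weighting of both activities (Euclidean distance).
    """
    bx, by = bliss
    # Minimise (x - bx)^2 + (ratio * x - by)^2 over x.
    x = (bx + ratio * by) / (1 + ratio ** 2)
    return (x, ratio * x)


def deviation(point: tuple[float, float], bliss: tuple[float, float]) -> float:
    """Euclidean distance between an allocation and the bliss point."""
    return math.dist(point, bliss)


# Hypothetical bliss point: 30 units of one activity, 10 of the other.
bbp = (30.0, 10.0)
# A 1:1 contingency forces equal amounts of the two activities.
best = closest_point_on_schedule(bbp, ratio=1.0)
print(best, deviation(best, bbp))
```

Under these assumptions the compromise allocation gives up some of the preferred activity and takes on more of the other one; a weighted distance would shift the compromise toward the activity that matters more.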
41
[Figure: time running plotted against time drinking (axes from 10 to 40), with restricted-running and restricted-drinking conditions.]
42
Strengths of BBP Theory
Reinforcers: not special stimuli or responses
No difference between operant and reinforcer
Explains new allocation of behaviour
Fits with findings on cognition for cost:benefit optimization