Psychology of Learning EXP4404 Chapter 5: Reinforcement Dr. Steve
Topics Covered Law of Effect Operant (Instrumental) Conditioning Reinforcement Escape-Avoidance Learning Operant vs. Classical Conditioning Primary vs. Secondary Reinforcers Shaping & Chaining Variables Affecting Reinforcement Neuromechanics of Reinforcement Extinction Theories of Reinforcement
Law of Effect E. L. Thorndike Puzzle Box for studying “animal intelligence” Trial and Error Learning Law of Effect www.youtube.com/watch?v=BDujDOLre-8
Operant Learning Operant (Instrumental) Conditioning – B. F. Skinner’s Operant Chamber
Operant Learning Reinforcement Reinforcer Positive Reinforcer Negative Reinforcer Note that positive and negative in this context do not mean good and bad, but rather presented or removed. Positive reinforcement is sometimes called reward learning, but the two are not the same: a reward may not be reinforcing (e.g., a bonus may not strengthen behavior, and the reward of public recognition may serve as a punisher). Negative reinforcement is often confused with punishment; the chart shows the difference.
Operant Learning Escape-Avoidance Learning Learned helplessness occurs when an organism is prevented from behaving in a manner that would allow it to escape an aversive stimulus. Animals experiencing learned helplessness act very similarly to depressed humans (they may sit in a corner and whimper).
Operant Learning Learned Helplessness If Sparky the dog receives a shock regardless of his behavior (including jumping over the barrier, which previously worked), he “gives up” and may cower in the corner and simply accept the shock. “Broke his spirit”
Operant vs. Classical Conditioning Identify OC and CC in this picture sequence.
Primary vs. Secondary Reinforcers Primary Reinforcers Secondary Reinforcers
Secondary Reinforcers Clicker Training
Operant Learning Shaping Bird shaping video Do class exercise Fromm vs. Skinner - Skinner tells the story with obvious relish and enjoyment, and the audience laughs along with him. Then he tells another in which he himself is the prankster: at a panel discussion in which he had participated, another renowned expert on human behavior of the 1960s was "talking too much for good communication," so Skinner glanced at him every time he gestured agitatedly with one of his hands toward Skinner. By the end of the discussion, says Skinner, the psychologist was gesticulating so wildly that his wristwatch nearly flew off his wrist. In his autobiography, Fred Skinner revealed the identity of his victim: Erich Fromm.
Operant Learning Chaining Forward chaining Backward chaining Funny example of Chaining Behavior
Variables Affecting Reinforcement Contingency Contiguity Contingency – the reinforcer needs to follow the behavior frequently during learning. Contiguity – longer delays between R and S lead to the wrong behaviors being reinforced, so learning is worse. However, in humans the R-S interval can be extended through internal rehearsal (articulatory loop or subvocalization).
Variables Affecting Reinforcement Reinforcer Characteristics Crespi Effect (aka Contrast Effect) Negative Effect Positive Effect The relationship between reinforcer size and benefit is not linear: the more the magnitude is increased, the smaller the proportional payoff. The negative effect occurs frequently in laboratory experiments; the positive effect is rarely shown.
Variables Affecting Reinforcement Token Economies
Variables Affecting Reinforcement Task Characteristics Motivating Operation Competing Reinforcers Task characteristics – it is easier to condition behaviors the animal is predisposed to perform; at SeaWorld trainers teach behaviors, not “tricks.” Competing reinforcers – a child is rewarded by parents for doing homework, but is rewarded by friends, and by the activity itself, for playing instead.
Operant Learning Reward Pathway The original biological function of the reward pathway (RP), which runs from the ventral tegmental area (VTA) in the brain stem via the nucleus accumbens (NA) to the prefrontal cortex, is to issue feelings of reward for the fulfillment of actions that secure or enhance the chances of survival, such as eating, drinking, and reproduction. Without the RP, organisms would lose interest in life and be incapable of supporting themselves, because life-saving and reproductive activities would not be reinforced by reward. The RP is therefore the built-in motivator designed to maximize the chances of survival and, as such, to maintain the process of biological evolution. With the historical arrival of culture and its (initially) biologically advantageous and useful role in survival, the RP had to take up the additional task of rewarding actions aimed at developing and maintaining culture. I contend that gene-meme co-evolution could not have been sustained if such actions were not accompanied by the release of feelings of reward similar to those in purely biological evolution. The RP was therefore obliged to take up the dual task of rewarding biologically as well as culturally oriented constructive behavior.
Operant Learning Latent Learning
Extinction Extinction Effects of extinction include: Extinction burst Extinction curve shown.
Extinction Effects of extinction include (cont’d): Spontaneous Recovery Resurgence Behavior is Extinguished Graph shows resurgence of disk pecking behavior when wing flapping behavior undergoes extinction
Theories of Reinforcement Hull’s Drive-Reduction Theory
P(R) = (D × H × K × V) – I
P(R) = probability of response
D = Drive, or length of deprivation
H = Habit, or how often the response has been reinforced
K = Incentive (pull factor): quantity or quality of the goal
V = Stimulus intensity (push factor): greater is better
I = Inhibition: reduces the probability of response (fatigue)
“Forgetting” may be a function of a reduction in D, K, or V (note: H cannot decrease) or an increase in I.
Food and water reduce hunger and thirst, but how do secondary reinforcers reduce a drive? The problem with the theory is that many secondary reinforcers do not reduce drives and did not gain their reinforcing properties through association with primary reinforcers that do reduce drives.
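Worked example (not from the original slides): a minimal Python sketch of the formula as written on this slide, P(R) = (D × H × K × V) – I. The slide gives no units or scales, so the 0-to-1 values below are assumptions chosen only for illustration, and the function name `response_tendency` is hypothetical. The sketch shows two properties of the formula: the factors multiply, so a drop in drive alone lowers the result even though habit is unchanged, and inhibition is subtracted at the end.

```python
def response_tendency(drive, habit, incentive, intensity, inhibition):
    """Evaluate the slide's formula: (D x H x K x V) - I.

    All arguments are plain numbers. The 0-to-1 scale used in the
    examples below is an assumption; the slide does not specify one.
    """
    return drive * habit * incentive * intensity - inhibition


if __name__ == "__main__":
    # Hungry animal, moderately practiced response, some fatigue.
    baseline = response_tendency(1.0, 0.5, 1.0, 1.0, 0.25)

    # Same habit strength (H cannot decrease), but drive has been halved,
    # e.g. the animal was fed recently. This is the slide's "forgetting"
    # through a reduction in D.
    fed = response_tendency(0.5, 0.5, 1.0, 1.0, 0.25)

    print("baseline:", baseline)
    print("after feeding:", fed)
```

Because the first four terms are multiplied, setting any one of them to zero leaves only the (negative) inhibition term, whatever the values of the others.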
Theories of Reinforcement Premack Principle – aka – Relative Value Theory Figure 5-20. Relative value and reinforcement. Water reinforces running in rats deprived of water, but Premack showed that running reinforces drinking in exercise-deprived rats.
Theories of Reinforcement Response Deprivation Theory – aka Equilibrium Theory
Theories of Reinforcement Theories of Negative Reinforcement (Avoidance) Mowrer’s Two-Process Theory Avoidance is therefore merely escape from the CS. Problems with two-process theory: Fear of the CS lessens as the organism learns to avoid it, so if there is no longer a feared stimulus, what is reinforcing avoidance? If the behavior is no longer reinforced (no more fear to alleviate), why doesn’t the behavior extinguish completely?
Theories of Reinforcement Theories of Negative Reinforcement (Avoidance) One-Process Theory