Operant Conditioning: A form of learning in which behavior becomes more or less probable depending on its consequences. Associated with B.F. Skinner.
Four possible consequences of behavior: Something good could start or be presented. Something good could stop or be taken away. Something bad could start or be presented. Something bad could stop or be taken away. In operant conditioning, the word “positive” means that something starts or is added. The word “negative” means that something is taken away or stopped.
Reinforcement: An increase in the probability of a response caused by the stimulus (consequence) that follows it. The goal is to increase the behavior. Reinforcers make behavior occur more frequently, make it more likely to occur, or make it occur more strongly.
Punishment: A decrease in the probability of a response caused by the stimulus (consequence) that follows it. The goal is to decrease the behavior: to make it occur more weakly, less often, or not at all.
Two types of reinforcement. Positive reinforcement: a response is strengthened because it’s followed by the presentation of a desirable stimulus. Negative reinforcement: a response is strengthened because it’s followed by the removal of an aversive stimulus.
Two types of punishment. Positive punishment: an aversive event follows a response, which decreases the tendency to make the response. Negative punishment: a desirable stimulus is removed following a response, which decreases the tendency to make the response.
Reinforcement/Punishment Chart
Positive (something added), reinforcement (behavior increases): Positive reinforcement. Something good added increases behavior.
Positive (something added), punishment (behavior decreases): Positive punishment. Something unpleasant added decreases behavior.
Negative (something taken away), reinforcement (behavior increases): Negative reinforcement. Something bad taken away increases behavior.
Negative (something taken away), punishment (behavior decreases): Negative punishment. Something good taken away decreases behavior.
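The chart reduces to two yes/no questions: is the stimulus pleasant, and is it added or taken away? A minimal Python sketch of that lookup follows. The function name `classify_consequence` and its two boolean parameters are made up for illustration and are not part of the original slides.

```python
def classify_consequence(stimulus_is_pleasant: bool, stimulus_is_added: bool) -> str:
    """Name the cell of the reinforcement/punishment chart for one consequence."""
    # "Positive" means something is added; "negative" means something is taken away.
    sign = "positive" if stimulus_is_added else "negative"
    # Behavior increases when something pleasant is added or something
    # unpleasant is removed; otherwise behavior decreases.
    behavior_increases = stimulus_is_pleasant == stimulus_is_added
    kind = "reinforcement" if behavior_increases else "punishment"
    return f"{sign} {kind}"
```

For example, taking away a toy (pleasant, removed) is classified as negative punishment, while turning off a loud alarm (unpleasant, removed) is classified as negative reinforcement.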
Is punishment effective? It is effective in decreasing responses, but punishment that is too severe can suppress all behavior. To make it more effective: apply it swiftly; be just severe enough to be effective; be consistent; explain the punishment; reinforce alternative behavior.
In general… Consequences, good or bad, must be either immediate (in the case of animals) or clearly explained to the individual, so that it is clear the consequence is linked to the behavior.
Shaping: A way to teach someone a new behavior that he’d never perform naturally. The person or animal is reinforced for successive approximations of a target behavior and, finally, for the behavior itself. That is, the learner receives a small reward for each small step toward the final goal, rather than a reward only for the target response.
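One way to picture successive approximations is as a criterion that rises each time the learner meets it. The Python sketch below is only an illustration under that assumption: the function `shape`, the numeric "attempt" scores, and the fixed `step` are all hypothetical, and real shaping procedures adjust the criterion by judgment rather than by a fixed rule.

```python
def shape(attempts, target, step):
    """Reinforce attempts that meet a criterion that rises toward the target.

    attempts: numeric scores for how close each attempt is to the target behavior
    target:   score that counts as the full target behavior
    step:     how much the criterion is raised after each reinforced attempt
    Returns (log, final_criterion), where log is a list of (attempt, reinforced).
    """
    criterion = min(step, target)  # start by rewarding a small first step
    log = []
    for attempt in attempts:
        reinforced = attempt >= criterion
        log.append((attempt, reinforced))
        if reinforced and criterion < target:
            # The learner met the current approximation, so ask for a bit more.
            criterion = min(target, criterion + step)
    return log, criterion
```

With `shape([1, 2, 2, 3, 5, 4, 6], target=6, step=2)`, the second attempt is reinforced at a criterion of 2, the fifth at a criterion of 4, and the last at the target itself.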
Schedules of reinforcement: How often are you reinforced for your behavior? Reinforcement can be continuous or partial. Continuous: you get a reinforcement/reward every time you perform the behavior. Partial: you get a reinforcement only after a certain amount of time has passed (interval schedule) or after a certain number of responses (ratio schedule).
Fixed ratio: Provides reinforcement after a specific number of desired responses. High, steady response rate. Brief post-reinforcement pause. Example: getting paid every time you sell another 5 magazine subscriptions.
Fixed Ratio Graph
Fixed interval: Provides reinforcement after a set period of time. Produces a drop in responding immediately after reinforcement (post-reinforcement pause) and a gradual increase in responding as the time for the next reinforcement approaches. Scalloping: a greatly exaggerated post-reinforcement pause. Worst schedule to use in education, but it’s often used (e.g., a test every second Friday).
Fixed Interval Graph Notice the scalloping pattern.
Variable ratio schedule: Provides reinforcement after an unpredictable number of desired responses. High, steady rates of responding. Highest burnout rate. Produces behaviors that are highly resistant to extinction and that persist long after the reinforcement is no longer available. This is called the partial reinforcement effect.
Graph of variable ratio schedule
Variable interval schedule: Provides reinforcement for the first desired response made after varying periods of time. Slow, steady rate of responding. Less chance of burnout. Best schedule, especially for education (e.g., unannounced pop quizzes). Example: fishing (waiting for fish to bite).
Variable Interval Schedule
Summary of Schedules. Ratio schedules: faster response rates than interval schedules, because the number of responses, not the passage of time, determines reinforcement. Variable schedules: steadier response rates than fixed schedules, because people don’t know when their payoff will come.
A comparison of the schedules of reinforcement: Notice that the ratio schedules produce faster, higher response rates than the interval schedules. Variable schedules produce steadier response rates than fixed schedules. There are no scalloping patterns or post-reinforcement pauses with variable schedules.
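The four partial schedules differ only in the rule that decides whether a given response is reinforced. The Python sketch below writes each rule as a small function under simplifying assumptions: the function names are made up for illustration, time is an arbitrary number passed in by the caller, and the "unpredictable" counts and waits are drawn from uniform distributions, which is a modeling choice rather than anything stated in the slides. It models the schedule rules only, not how a learner's response rate changes.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response."""
    count = 0

    def respond(t):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reinforce after an unpredictable number of responses averaging mean_n."""
    count = 0
    needed = rng.randint(1, 2 * mean_n - 1)  # assumed uniform spread around mean_n

    def respond(t):
        nonlocal count, needed
        count += 1
        if count >= needed:
            count = 0
            needed = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return respond


def fixed_interval(period):
    """Reinforce the first response made at least `period` after the last reinforcer."""
    last = 0.0

    def respond(t):
        nonlocal last
        if t - last >= period:
            last = t
            return True
        return False

    return respond


def variable_interval(mean_period, rng):
    """Reinforce the first response made after an unpredictable wait averaging mean_period."""
    last = 0.0
    wait = rng.uniform(0, 2 * mean_period)  # assumed uniform spread around mean_period

    def respond(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.uniform(0, 2 * mean_period)
            return True
        return False

    return respond
```

Each function returns a `respond(t)` callable that takes the time of a response and reports whether that response is reinforced. For instance, `fixed_ratio(5)` mirrors the magazine example by paying out on every fifth sale, while `fixed_interval(10)` ignores how many responses are made and pays out only for the first response after 10 time units have elapsed.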
Premack Principle: A more preferred activity can serve as a reinforcer for a less preferred one, so what’s an effective reinforcer for one person might not be an effective reinforcer for another. Example: giving kids M&Ms to turn knobs; sometimes turning knobs is more fun than getting M&Ms.
Observational Learning: Occurs when a behavior is learned by observing the consequences that others receive for performing it (modeling). Also called “social learning.” Bandura, Ross, & Ross (1963): Bobo doll study.
Footage of the Bobo doll study http://video.google.com/videoplay?docid=-2953790276071699877#docid=-4586465813762682933
Questions about observational learning: Why would a child imitate a behavior if it isn’t reinforced? Because imitation aids in survival and adaptation, and it’s favored by natural selection. Do other species use observational learning? Yes, it’s even been demonstrated in cockroaches.
Processes that must take place for observational learning to occur: attention to the model’s actions; retention of the model’s actions; reproduction of the model’s actions; motivation to perform the model’s actions. You may acquire a behavior but not perform it if you see that performing it will result in negative consequences.