Operant Conditioning The Learner is NOT passive. Learning based on consequence!!!

Presentation transcript:

Operant Conditioning
-Learning controlled by the consequences of one’s behavior
-Consequences of a behavior determine whether it will be repeated in the future
Vs. Classical Conditioning
-Behavior is… CC: elicited, automatic, reflexive; OC: emitted, voluntary, complex
-Reward is… CC: provided independent of actions; OC: dependent on behavior
Whereas classical conditioning involves reflexive behaviors, operant conditioning involves voluntary behaviors that are connected to a consequence. These behaviors can be complex, like folding laundry, washing dishes, or hailing a cab.

B.F. Skinner The most influential behaviorist and proponent of Operant Conditioning. Nurture guy through and through. Used a Skinner Box (Operant Conditioning Chamber) to prove his concepts.

Skinner Box (Operant Conditioning Chamber)
-Showed that non-reflexive behaviors could be altered by learning
-Developed by B. F. Skinner, innovator of radical behaviorism
-Recorded sustained periods of conditioning: allowed the researcher to continue the conditioning process for extended periods of time and record behaviors without having to be present
-Usually contains a bar that delivers food when pressed, and usually a light to signal when the food (reward) is coming
-A recording device prints out a cumulative record of the animal’s activity

Chaining Behaviors Subjects are taught a number of responses successively in order to get a reward. Click picture to see a rat chaining behaviors. Click to see a cool example of chaining behaviors.

Thorndike’s Puzzle Box and The Law of Effect
-Edward Thorndike locked cats in a cage (the puzzle box)
-Behavior changes because of its consequences
-Law of Effect (LOE): if a response is rewarded, that response is more likely to occur; if the consequences are unpleasant, the stimulus-response connection will weaken
-Called the whole process instrumental learning (instrumental behaviors)
As you saw in the video, the animals learned to escape through trial and error; they did not have insight. When the cat did the right thing, it was rewarded by escaping to the food. Over time, the cat learns which behaviors produce rewards and does these behaviors more quickly. Behaviors that do not produce rewards (i.e., do not lead to escape) are done less frequently. The chart shows the learning curve for the cat. At first it took several minutes for the cat to escape the box, but with successive trials, it took less and less time. Click picture to see a better explanation of the Law of Effect.

Thorndike Operant Conditioning: -X-Axis: number of trials in the puzzle box -Y-Axis: time it took cats to escape the box -Time went from about 4 minutes to 30 seconds

Operant Conditioning
Reinforcement: increases probability of response
-Positive: desirable stimulus is added
-Negative: undesirable stimulus is removed
Punishment: decreases probability of response
-Positive: adding something bad
-Negative: removing something good
Example: Child’s Behavior
-Behavior to increase: cleaning room
-Positive Reinforcement: give kid dessert
-Negative Reinforcement: take away vegetables
-Behavior to decrease: hitting his sibling
-Positive Punishment: spank the kid
-Negative Punishment: take away his toys
[“Positive” means adding something] [“Negative” means removing something]

Reinforcement
When an event increases the likelihood that a response will occur again. Designed to increase behavior.
-Positive: adding something good
-Negative: removing something bad
Math terms:
-Positive is a plus sign (+): adding something
-Negative is a minus sign (-): taking something away
Example: Studying
-Positive Reinforcement: adding something good (good grade)
-Negative Reinforcement: removing something bad (nagging)
-Both outcomes are satisfying and will increase the behavior

Types of reinforcers
Primary vs. secondary
-Primary: inherently satisfying to most people (e.g. food)
-Secondary: gain value from conditioning (e.g. money, poker chips: not inherently exciting for people, but they become associated with the ability to purchase stuff, so they can be rewarding)
Immediate & delayed
-Reinforcement usually needs to be immediate, but humans can handle delayed reinforcers
-Important for self-control

Rat basketball
What type of learning was this an example of? Can you explain what helped the rats learn to score a basket? Consider the type of learning as you watch this clip.

Punishment/Consequence
When an event decreases the likelihood that a response will occur again. Designed to decrease behavior.
Two types: Positive & Negative
-Positive ≠ Good. POSITIVE = ADD: adding something bad
-Negative ≠ Bad. NEGATIVE = SUBTRACT: removing something good
Punishment usually involves an aversive event that leads to a decreased probability that the response will occur again. Most people are familiar with punishment; yelling and spanking are both examples. The book doesn’t differentiate between positive and negative punishment, but I will.

Importance of reinforcement
-Punishment signals undesirable behavior but doesn’t inform of desired behavior
-Punished behavior is suppressed: suppressed but not forgotten, and the suppression negatively reinforces the parent’s punishing behavior
-Punishment teaches stimulus discrimination: kids learn not to swear in front of mom, not that swearing is wrong
-Punishment (esp. physical) teaches fear & aggression: shows kids that aggression is a way to solve problems
-Ignore behavior that one wants to punish; look for what to reinforce
Some advocate --

Punishment tends to be ineffective
-It tells the organism what not to do, rather than what to do
-Creates anxiety that can interfere with future learning
-Encourages subversive behavior (sneakiness)
-Provides a model for aggressive behavior
-Only true for some races/cultures
Review punishment cons.

Neg. reinforcement ≠ punishment
-Pos vs. neg reinforcement: both encourage continuation of behavior. Pos: add something good. Neg: remove something bad
-Pos punishment: give something bad; discourages behavior
-Neg reinforcement: take away something bad; encourages behavior

The Decision Tree
How to solve operant conditioning problems
-Should the behavior increase or decrease? Increase: Reinforcement. Decrease: Punishment.
-Is something being added or taken away? Added: Positive. Removed: Negative.
Review decision tree
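The two questions in the decision tree can be written as a small lookup. This is only an illustrative sketch: the function name `classify_consequence` and the answer strings ("increase", "decrease", "added", "removed") are made up for this example and do not come from the slides.

```python
def classify_consequence(behavior_change: str, stimulus_change: str) -> str:
    """Name the operant procedure from the two decision-tree answers.

    Hypothetical helper for illustration only.
    """
    # Question 1: should the behavior increase or decrease?
    kinds = {"increase": "reinforcement", "decrease": "punishment"}
    # Question 2: is something being added or taken away?
    signs = {"added": "positive", "removed": "negative"}

    if behavior_change not in kinds or stimulus_change not in signs:
        raise ValueError("expected increase/decrease and added/removed")
    return f"{signs[stimulus_change]} {kinds[behavior_change]}"


# The child examples from the earlier slide
print(classify_consequence("increase", "added"))    # dessert for cleaning room
print(classify_consequence("increase", "removed"))  # vegetables taken away
print(classify_consequence("decrease", "added"))    # spanking for hitting
print(classify_consequence("decrease", "removed"))  # toys taken away
```

Answering the "increase or decrease" question first keeps "negative" from being read as "bad": the sign only says whether something was added or removed.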

Review
-Punishment decreases behavior. Positive: ADD something unfavorable. Negative: SUBTRACT something desirable.
-Reinforcement increases behavior. Positive: ADD something desirable. Negative: SUBTRACT something unfavorable.
Review chart

Applications of Operant Conditioning

Behavior Modification
-Started with Thorndike
-Altering individual behavior (frequency) through positive and negative reinforcement and positive and negative punishment
-Adaptive behaviors
-Reduction of behavior through its extinction and punishment
-A.K.A. Applied Behavior Analysis or Positive Behavior Support (PBS)
A child is riding with an adult, and the child is thirsty. So, the child asks to stop and get a drink. The adult says no, the child asks again, and again, and again... Finally, the adult gives in, saying, "All right, just this once." Big mistake, right? Why? The adult has now put the child on a partial reinforcement schedule, guaranteeing a repetition of the same behavior later on. Instead, the adult should have said, "All right, I'll get you a drink IF you don't ask for one for the next 10 minutes" (the time may have to vary, depending on the child). Then, the adult is providing the child with positive reinforcement for being quiet.
Ending a Relationship?????

Behavior Modification
Reinforcement
-Provides a system of rewards and punishments to change negative behavior into positive responses
-Provides rewards when someone acts in a positive manner. Rewards can range from a compliment to granting a special privilege to the patient whose behavior becomes desirable
-A negative consequence might be the result of unwanted behavior, such as the removal of a favorite object or taking away a privilege
Cognitive behavior modification
-Techniques focus on thought patterns that affect behavior
-Involve teaching a patient to recognize thoughts that may be unrealistic or distort reality
-Keeping a journal, role-playing, and being asked to defend thoughts that defy reality
-Eating disorders, anxiety disorder, OCD, panic attacks
Aversion behavior modification
-Techniques center on the premise that all behavior is learned and can be unlearned (aka CC)
-Electrical shock treatment is one example of an aversive stimulus used to treat deviant behavior
-(Mild) medication given to alcoholics that might make them ill if they drink while using the drug
Token system
-Provides immediate rewards while setting goals for future conduct
-Distribute a token or similar object each time a patient or student exhibits positive behavior
-Tokens can be amassed and later exchanged for a prize or privilege, or lost due to unwanted behavior
-Commonly used in mental institutions and prisons to help control individuals who show violent tendencies

Premack principle A less frequently performed behavior can be increased by reinforcing it with a more frequent behavior Eat your vegetables before you can have dessert! Review premack principle. For those interested, this is a raspberry, white chocolate and coconut cupcake. Tastespotting.com. No tastespotting until your studying is done.

Operant Conditioning in Daily Life To train a dog to get your slippers, you would have to reinforce him in small steps. First, to find the slippers. Then to put them in his mouth. Then to bring them to you and so on…this is shaping behavior. Do we wait for the subject to deliver the desired behavior? Sometimes, we use a process called shaping. Shaping is reinforcing small steps on the way to the desired behavior. To get Barry to become a better student, you need to do more than give him a massage when he gets good grades. You have to give him massages when he studies for ten minutes, or for when he completes his homework. Small steps to get to the desired behavior.

Shaping
-Reinforcing responses that come successively closer to the desired response
-Successive approximations
-Used a lot in animal training: gradually reinforce the behavior you’re looking for
Shaping is a conditioning procedure of reinforcing successively closer approximations of the desired behavior, until the desired behavior happens. We use behaviors that increasingly resemble the desired behavior and reinforce each of these; these are called successive approximations. When a baby learns to walk, first the baby learns to roll over, then get up on his hands and knees, then crawl, then hold on to furniture, and finally to walk. Each of these behaviors is a successive approximation to walking, and is rewarded with cheers from his parents.

Shaping
-Reinforcers gradually move the organism’s actions toward the desired end behavior
-Successive approximations: behaviors closer & closer to the end learning goal get rewarded
1. Simply turning toward the lever will be reinforced
2. Only stepping toward the lever will be reinforced
3. Only moving to within a specified distance from the lever will be reinforced
4. Only touching the lever with a part of the body will be reinforced
5. Only touching the lever with a specified paw will be reinforced
6. Only depressing the lever partially with the specified paw will be reinforced
7. Only depressing the lever completely with the specified paw will be reinforced
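The lever steps above can be sketched as a rising criterion: a behavior is reinforced only if it is at least as advanced as the current step, and the bar then moves up. This is a toy sketch, not a training protocol. The step labels, the `shape` function, and the rule that a single success raises the criterion are all assumptions made for illustration.

```python
# Successive approximations, least to most advanced (labels paraphrase the slide)
SHAPING_STEPS = [
    "turn toward lever",
    "step toward lever",
    "move within set distance of lever",
    "touch lever with any body part",
    "touch lever with specified paw",
    "press lever partway with specified paw",
    "press lever fully with specified paw",
]


def shape(observed_behaviors):
    """Return (behavior, reinforced) pairs for a sequence of observed behaviors."""
    criterion = 0  # index of the least advanced behavior that still earns a reward
    log = []
    for behavior in observed_behaviors:
        level = SHAPING_STEPS.index(behavior)
        reinforced = level >= criterion
        if reinforced:
            # Assumption: one reinforced success is enough to raise the bar.
            criterion = min(level + 1, len(SHAPING_STEPS) - 1)
        log.append((behavior, reinforced))
    return log


session = [
    "turn toward lever",
    "turn toward lever",   # no longer good enough once the bar has moved
    "step toward lever",
    "touch lever with any body part",
]
for behavior, reinforced in shape(session):
    print(f"{behavior}: {'reinforced' if reinforced else 'not reinforced'}")
```

In practice a trainer would usually wait for several successes before raising the criterion; the one-success rule only keeps the sketch short.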

Schedules of reinforcement
How often do you give the reinforcer? Every time, or just some of the times you see the behavior?
-Ratio schedules lead to a higher response rate. This makes sense: people/animals have control (to some extent) over when the rewards happen
-Variable schedules produce more consistent responding (straighter lines)
-Fixed-interval schedules have a scalloped pattern (responding increases just before the interval ends)
-Variable-ratio schedules have the highest response rates of all the schedules

Schedules of Reinforcement
Continuous reinforcement schedule: reinforcing a response every time
-Learning occurs rapidly, extinction occurs rapidly
Partial reinforcement schedule: reinforcing a response only some of the time
-Slower acquisition, but resistant to extinction
Fixed vs. Variable, Ratio vs. Interval
-Fixed ratio: after a set # of responses
-Variable ratio: after an unpredictable # of responses
-Fixed interval: after a set amount of time has passed
-Variable interval: after an unpredictable amount of time has passed
Review schedules, create examples

Continuous v. Partial Reinforcement
Continuous: reinforce the behavior EVERY TIME the behavior is exhibited. Usually done when the subject is first learning to make the association. Acquisition comes really fast. But so does extinction.
Partial: reinforce the behavior only SOME of the times it is exhibited. Acquisition comes more slowly. But it is more resistant to extinction. There are FOUR types of partial reinforcement schedules.

Schedules of reinforcement: Continuous vs. partial
-Shows that partial reinforcement is harder to extinguish than continuous. Why? If you have only been reinforced sometimes, you can’t tell as quickly that you aren’t being rewarded
-Slowest extinction is for variable ratio

Ratio schedules
-Fixed-ratio (FR) schedules: reinforcement after a fixed (predictable) number of responses. Ex: paid $1 for every 20 apples you pick
-Variable-ratio (VR) schedules: reinforcement after a varying (unpredictable) number of responses. Induces a very high rate of responding. Ex: scratch & win lottery tickets
Ratio schedules depend on the number of responses given. Let’s consider the class as an organism. In a fixed-ratio schedule, that number is set. So every 5th student that answers a question gets a piece of candy. The faster you respond, the more rewards you get, so it produces a high rate of responding. In a variable-ratio schedule, the number of responses required before a reinforcement is given changes each time. So, maybe the first time, the 5th student who answers a question gets a piece of candy, then the 3rd student after that, then the 7th student, etc. This unpredictability produces a very high rate of responding and makes the behavior difficult to extinguish.
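A rough sketch of the two ratio schedules as response counters. The function names `fixed_ratio` and `variable_ratio` are hypothetical, and drawing each VR requirement uniformly from 1 to 2 × mean − 1 is just one assumed way to make the requirement unpredictable around a chosen average.

```python
import random


def fixed_ratio(n):
    """FR-n: reinforce every n-th response. Call the returned function once per response."""
    count = 0

    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True   # reinforcer delivered
        return False

    return respond


def variable_ratio(mean, rng):
    """VR-mean: reinforce after an unpredictable number of responses averaging `mean`."""
    # Assumption: requirement is uniform on 1 .. 2*mean - 1, so its average is `mean`.
    required = rng.randint(1, 2 * mean - 1)
    count = 0

    def respond():
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean - 1)
            return True
        return False

    return respond


# Apple-picking example from the slide: $1 for every 20 apples
picker = fixed_ratio(20)
dollars = sum(picker() for _ in range(100))
print("dollars earned for 100 apples:", dollars)

# Candy example: on average every 5th answer is rewarded, but unpredictably
candy = variable_ratio(5, random.Random(0))
print("candies after 100 answers:", sum(candy() for _ in range(100)))
```

On both schedules the reward count depends only on how many responses are made, which is why responding faster pays off.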

Interval Schedules
-Fixed-interval (FI) schedule: reinforcement after a fixed (predictable) amount of time
-Variable-interval (VI) schedule: reinforcement after varying (unpredictable) amounts of time
Interval schedules depend on the amount of time that elapses between rewards. In a fixed-interval schedule, that amount of time is set. So after every 30 seconds a student is given a piece of candy for answering a question. Responses occur more frequently as the anticipated time of the reward draws near, so this schedule is not so great if I want consistent performance. In a variable-interval schedule, the amount of time required before a reinforcement is given changes each time. So, maybe the first time the interval is 30 seconds, the next time 2 minutes, the next time 1 minute, etc. This produces slow and steady responding.
Example: with random pop quizzes, students will study more than if they knew exactly when the quiz would be
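A comparable sketch for interval schedules, using simulated clock times (in seconds) instead of a real timer. The names `fixed_interval` and `variable_interval` are hypothetical, and the uniform draw for the VI wait is an assumption made only to get unpredictable intervals around a chosen mean.

```python
import random


def fixed_interval(seconds):
    """FI: the first response made at least `seconds` after the last reward is reinforced."""
    next_available = seconds

    def respond(now):
        nonlocal next_available
        if now >= next_available:
            next_available = now + seconds
            return True
        return False

    return respond


def variable_interval(mean_seconds, rng):
    """VI: like FI, but the wait before each reward becomes available is unpredictable."""
    # Assumption: waits are uniform on 0 .. 2*mean, so they average `mean_seconds`.
    next_available = rng.uniform(0, 2 * mean_seconds)

    def respond(now):
        nonlocal next_available
        if now >= next_available:
            next_available = now + rng.uniform(0, 2 * mean_seconds)
            return True
        return False

    return respond


# On FI-30, answering every second earns no more candy than answering every 10 seconds
fast = fixed_interval(30)
slow = fixed_interval(30)
fast_rewards = sum(fast(t) for t in range(1, 301))       # 300 responses in 5 minutes
slow_rewards = sum(slow(t) for t in range(10, 301, 10))  # 30 responses in 5 minutes
print(fast_rewards, slow_rewards)

# On VI-30 the learner cannot tell when the next reward becomes available
quiz = variable_interval(30, random.Random(1))
print(sum(quiz(t) for t in range(1, 301)))
```

Because the clock, not the response count, gates the reward, extra responses between rewards are wasted effort, which fits the lower response rates described for interval schedules.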

Reinforcement Schedules
-Fixed Ratio: after a set number of responses
-Fixed Interval: after a set amount of time
-Variable Ratio: after a random number of responses
-Variable Interval: after a random amount of time
Variable means that it is impossible to predict, random. If you have something that is VI 30 or VR 30, that means that the average across all reinforcements is one every 30 responses (VR) or every 30 time units (VI). So one time you might get it right away, another time it might take you 59 tries.
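To make the "VR 30" arithmetic concrete, here is a small check that requirements ranging from 1 try up to 59 tries can still average about 30. The uniform distribution and the helper name `draw_requirements` are assumptions for illustration; a real schedule could use any distribution with that mean.

```python
import random
from statistics import mean


def draw_requirements(schedule_mean, n_rewards, rng):
    """Draw the number of responses needed before each of n_rewards reinforcers."""
    # Assumption: uniform on 1 .. 2*mean - 1 (for VR 30: anywhere from 1 to 59 tries)
    return [rng.randint(1, 2 * schedule_mean - 1) for _ in range(n_rewards)]


requirements = draw_requirements(30, 10_000, random.Random(42))
print("fewest tries needed:", min(requirements))
print("most tries needed:", max(requirements))
print("average tries needed:", round(mean(requirements), 1))
```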

-Fixed Ratio (Factories): reward (finished) after completing a specific number of products, and the same number is required each time
-Variable Ratio (Slot Machine): reward (chips) after a specific number of pulls on average, but the number of pulls needed between wins changes
-Fixed Interval (Office Work): reward (leaving) after a specific time interval, and the time interval stays the same each day
-Variable Interval (Surfing): reward (big wave) after an unknown amount of time, and the time between waves changes

Name that Schedule!
A. Variable Ratio
B. Fixed Ratio
C. Variable Interval
D. Fixed Interval
-Winning at the slot machines: A
-Getting a free flight after accumulating 10,000 flight miles: B
-Receiving an allowance every Saturday regardless of chores, as long as you’ve done one chore: D
-Random drug testing at your job: C
Have students write down what they think the answers are for a minute, then review as a class.