Operant Conditioning Unit 4 - AoS 2 - Learning.

Trial and Error Learning: An organism’s attempts to learn or solve a problem by trying alternative possibilities until a correct solution or desired outcome is achieved. It often involves many attempts (trials) and incorrect choices (errors). It was originally called instrumental learning and is now called operant conditioning, because the learner ‘operates’ on the environment.

Thorndike’s Puzzle Boxes: Thorndike put hungry cats into a ‘puzzle box’ with food placed outside the box, out of reach. The cat had to get out of the box to get the food. The more times a cat was put in the box, the faster it got out (fewer errors). After 7 trials, the cat would go straight for the lever and get out immediately. Lever pushing was now learnt, not random.

Thorndike’s Law of Effect: A behaviour that is followed by ‘satisfying’ consequences is strengthened (more likely to occur), and a behaviour that is followed by ‘annoying’ consequences is weakened (less likely to occur). Thorndike called this instrumental learning because the cat is instrumental in obtaining its release.
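The law of effect can be illustrated with a toy update rule. This is only a sketch: the function name `law_of_effect`, the 0-to-1 "strength" scale and the `step` size are assumptions made for illustration, not a formula from Thorndike or from the slides.

```python
def law_of_effect(strength, satisfying, step=0.2):
    """Return the updated strength (0..1) of a behaviour after one consequence.

    Assumption: strength moves a fixed fraction of the way toward 1 after a
    satisfying consequence and toward 0 after an annoying one.
    """
    if satisfying:
        # Behaviour is strengthened (more likely to occur).
        return strength + step * (1.0 - strength)
    # Behaviour is weakened (less likely to occur).
    return strength - step * strength


strength = 0.1  # lever pressing starts out as a rare, near-random response
history = [strength]
for _ in range(7):  # seven rewarded trials, echoing the puzzle-box slide
    strength = law_of_effect(strength, satisfying=True)
    history.append(strength)

print([round(s, 2) for s in history])
```

Each rewarded trial raises the strength, and the gains shrink as the behaviour becomes well established.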

Operant Conditioning: The term was first used by B. F. (Burrhus Frederic) Skinner. An operant is a response (or set of responses) that occurs and acts (operates) on the environment to produce some kind of effect; in other words, behaviour that has consequences. Skinner argued that all behaviour can be explained this way.

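Operant vs Respondent: Respondents are behaviours that are elicited by known or recognised stimuli. Pavlov’s dogs responded by salivating to meat powder, and then to a bell. Thorndike’s cats, by contrast, made responses that were not prompted by a specific stimulus. In classical conditioning (CC), the behaviour has no effect on the consequences; in operant conditioning, the consequences depend on the behaviour.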

Skinner Boxes: A small chamber in which an animal learns to make a response for which the consequences can be controlled by the experimenter. It typically contains a lever that delivers food or water into a dish. Some have lights or buzzers. Some have a floor that can deliver a shock.

Reinforcement: Reinforcement is applying a positive stimulus OR removing a negative stimulus to strengthen or increase the likelihood of the response that it follows. Reinforcer - any object or event that increases the probability that an operant behaviour will occur again. ‘Reinforcer’ is often used interchangeably with ‘reward’, but the two are different: a reward only counts as a reinforcer if it actually strengthens the behaviour.

Reinforcement: Initially, learning is most successful if the behaviour is continuously reinforced. Continuous reinforcement - reinforcing every correct response after it occurs. Partial reinforcement - reinforcing some correct responses but not all of them. Partial reinforcement may be delivered on different schedules.

Fixed-Ratio Schedules: The reinforcer is given after a set (fixed) and unvarying number (ratio) of desired responses have been made, e.g. after every third response, or one reinforcer for every 10 correct responses (1:10). During the acquisition phase, reinforcement must be frequent. Example: workers who are paid ‘piecework’, e.g. commission, or an amount per basket picked.
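A fixed-ratio schedule is simple enough to state as a counting rule. The sketch below is illustrative only; the function name `fixed_ratio` and its parameters are assumptions, not part of the slides.

```python
def fixed_ratio(responses, ratio):
    """Return one boolean per correct response: True where it earns the reinforcer.

    The reinforcer is given on every `ratio`-th correct response.
    """
    return [n % ratio == 0 for n in range(1, responses + 1)]


# Reinforce every third response, over nine correct responses.
rewards = fixed_ratio(responses=9, ratio=3)
print(rewards)
```

With `ratio=10` the same function models the 1:10 schedule: thirty correct responses earn three reinforcers.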

Variable-Ratio Schedules: The reinforcer is given after an unpredictable number of correct responses. There is a mean number of correct responses per reinforcer, but the actual number varies from one reinforcer to the next. Very effective: acquisition is fast and the response doesn’t cease easily. Example: poker machines have an expected payout, but the player doesn’t know when it will come.
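A variable-ratio schedule can be sketched by redrawing the required number of responses after each reinforcer. The names below are hypothetical, and drawing the requirement uniformly from 1 to `2 * mean_ratio - 1` is an assumption chosen only so that the mean equals `mean_ratio`; real schedules (and poker machines) need not use this distribution.

```python
import random


def variable_ratio(responses, mean_ratio, seed=0):
    """Return one boolean per correct response: True where it earns the reinforcer.

    Assumption: the number of responses required for each reinforcer is drawn
    uniformly from 1..(2 * mean_ratio - 1), which averages to mean_ratio.
    """
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    required = rng.randint(1, 2 * mean_ratio - 1)
    count = 0
    outcomes = []
    for _ in range(responses):
        count += 1
        if count >= required:
            outcomes.append(True)
            count = 0
            # The learner cannot predict the next requirement.
            required = rng.randint(1, 2 * mean_ratio - 1)
        else:
            outcomes.append(False)
    return outcomes


outcomes = variable_ratio(responses=1000, mean_ratio=5, seed=1)
print(sum(outcomes), "reinforcers in", len(outcomes), "responses")
```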

Fixed-Interval Schedule: The reinforcer is delivered after a specific period of time has elapsed since the previous reinforcer, provided the correct response has been made. One correct response is all that is needed, like pressing the button at a pedestrian crossing. Responding is often erratic: once we realise that time, not the number of responses, is the deciding factor, we tend to wait until the time is nearly up before responding.
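The fixed-interval rule depends on elapsed time rather than on a response count. The following is a minimal sketch under stated assumptions: the function name, the use of plain numbers as timestamps, and starting the clock at time 0 are all illustrative choices.

```python
def fixed_interval(response_times, interval):
    """Return the times of the responses that earn the reinforcer.

    A response is reinforced only if at least `interval` time units have
    passed since the previous reinforcer (assumption: the clock starts at 0).
    """
    reinforced = []
    last_reinforcer = 0
    for t in response_times:
        if t - last_reinforcer >= interval:
            reinforced.append(t)
            last_reinforcer = t
        # Earlier responses earn nothing, however many are made.
    return reinforced


print(fixed_interval([1, 5, 9, 10, 12, 21, 22, 35], interval=10))
# prints [10, 21, 35]
```

The responses at times 1, 5 and 9 are wasted, which is why responding tends to bunch up just before the interval ends.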

Variable-Interval Schedule: The reinforcer is delivered after an irregular period of time has elapsed, provided the correct response has been made. There is a mean period of time, but each interval is unpredictable. Responses made before the delivery time are not reinforced, even if correct. Examples: fishing, speed cameras, booze buses.
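A variable-interval schedule can be sketched the same way as a fixed-interval one, except that the waiting time is redrawn after every reinforcer. The function name and the uniform draw between 0 and `2 * mean_interval` are assumptions for illustration only.

```python
import random


def variable_interval(response_times, mean_interval, seed=0):
    """Return the times of the responses that earn the reinforcer.

    Assumption: each waiting time is drawn uniformly from
    0..(2 * mean_interval), so it averages to mean_interval.
    """
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    wait = rng.uniform(0, 2 * mean_interval)
    last_reinforcer = 0
    reinforced = []
    for t in response_times:
        if t - last_reinforcer >= wait:
            reinforced.append(t)
            last_reinforcer = t
            # The next waiting time is unpredictable to the learner.
            wait = rng.uniform(0, 2 * mean_interval)
    return reinforced


# One response per second for 1000 seconds, mean wait of 10 seconds.
times = list(range(1, 1001))
reinforced = variable_interval(times, mean_interval=10, seed=1)
print(len(reinforced), "reinforcers earned")
```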

Positive Reinforcement: Giving or applying a positive reinforcer after the desired response has been made. Positive reinforcer - a stimulus that provides a satisfying consequence (reward), and so strengthens the likelihood of a response.

Negative Reinforcement: Removal or avoidance of an unpleasant stimulus. Negative reinforcer - any unpleasant stimulus that, when removed or avoided, strengthens the likelihood of a desired response occurring. In negative reinforcement the stimulus is removed or avoided; in positive reinforcement it is given.

Examples: Getting an A on your exam (positive reinforcer) can be achieved by studying, so studying will be repeated (increased behaviour). Failing your exam (negative reinforcer) is avoided by studying, so studying will be repeated (increased behaviour). Both lead to a desirable consequence, and both strengthen the behaviour.

Punishment: Delivery of an unpleasant stimulus following a response, or removal of a pleasant stimulus. The consequence of punishment is a weakening of the response, or a decrease in the probability of the response occurring again.
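The definitions of reinforcement and punishment reduce to a two-by-two lookup: is the stimulus pleasant or unpleasant, and is it given or removed? The sketch below encodes only what the slides state; the function name and the string labels are assumptions for illustration.

```python
def classify_consequence(stimulus, change):
    """Classify a consequence and its effect on the preceding response.

    stimulus: "pleasant" or "unpleasant"
    change:   "given" or "removed"
    Returns a (type of consequence, effect on the response) pair.
    """
    table = {
        ("pleasant", "given"): ("positive reinforcement", "strengthens"),
        ("unpleasant", "removed"): ("negative reinforcement", "strengthens"),
        ("unpleasant", "given"): ("punishment", "weakens"),
        ("pleasant", "removed"): ("punishment", "weakens"),
    }
    return table[(stimulus, change)]


# Studying removes the unpleasant prospect of failing the exam.
print(classify_consequence("unpleasant", "removed"))
```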

Order of Presentation: For both reinforcement and punishment, the consequence must be presented after the response, not before. For example, the rat needs to press the lever before getting the positive reinforcer.

Timing: Reinforcement and punishment are most effective when given immediately after the response, so that the two are associated directly. Delay will cause learning to be slow or unsuccessful. Immediate delivery is easier in the lab than in real life, e.g. student reports arrive long after the work they respond to.

Appropriateness: Reinforcers must provide pleasing consequences; punishers must provide unpleasant consequences. But how do you know what will please each person? Not all reinforcers will work in all situations. Inappropriate punishers can become reinforcers, e.g. for attention seekers, for whom being told off still delivers attention.

Key Processes - Acquisition: In operant conditioning (OC), acquisition is the establishment of a response through reinforcement. Its speed depends on whether reinforcement is continuous or partial. For complex behaviours, successive approximations can be reinforced, building up to the target behaviour.

Acquisition - Shaping: A procedure in which reinforcement is given for any response that successively approximates a final target response. Also known as the method of successive approximations. E.g. Skinner’s pigeon has to turn more and more to get the same reward.
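Shaping can be sketched as a reinforcement criterion that is raised each time the learner meets it. The function name `shape`, the turn angles and the step size are hypothetical values chosen for illustration; they are not data from Skinner’s pigeon studies.

```python
def shape(responses, target, start_criterion, step):
    """Return (response, criterion, reinforced) for each response in order.

    A response is reinforced if it meets the current criterion. After each
    reinforced response the criterion moves `step` closer to `target`, so the
    learner must come closer to the target behaviour for the same reward.
    """
    criterion = start_criterion
    log = []
    for response in responses:
        reinforced = response >= criterion
        log.append((response, criterion, reinforced))
        if reinforced and criterion < target:
            criterion = min(target, criterion + step)
    return log


# Hypothetical turn angles in degrees; the target is a full 360-degree turn.
log = shape([30, 50, 80, 100, 200, 360], target=360, start_criterion=45, step=90)
for response, criterion, reinforced in log:
    print(response, criterion, reinforced)
```

Here a 50-degree turn is reinforced early on, but an 80-degree turn is not, because by then the criterion has already been raised.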

Extinction: The gradual decrease in the strength or rate of a conditioned response following consistent non-reinforcement of the response. E.g. how long does the pigeon keep turning once it is no longer being fed? Responding may actually increase at first, as the animal tries to get the reinforcement back and doesn’t want to stop.

Spontaneous Recovery: Can also occur in operant conditioning, when the response reappears in the absence of reinforcement after extinction has occurred. The recovered response is likely to be weaker and temporary.

Stimulus Generalisation: When the correct response is made to another stimulus that is similar to the stimulus that was present when the conditioned response (CR) was reinforced. The response usually occurs at a reduced level (weaker or less often).

Stimulus Discrimination: When an organism makes the correct response to a stimulus and is reinforced, but doesn’t respond to other stimuli, even when they are similar. E.g. if reinforced for responding to red lights but not green lights, it will only respond to red.