
Instrumental Learning



  1. Instrumental Learning All learning in which an animal operates on its environment to obtain reinforcement; also known as operant (Skinnerian) conditioning

  2. 1. Thorndike and the law of effect • The animal has an increased probability of repeating the behavior that was emitted just before the reward. • As Thorndike would say, the memory becomes “stamped in.”

  3. Many different types of learning apparatus were tried between 1900 and 1945 • Escape learning for cats (Thorndike, Guthrie) • Rat jumping stand (Myers) • Complex maze learning (Tolman, Lashley) • T-maze (Hull, Spence) • Morris water maze (a modern-day development)

  4. The Common Element • All used a fixed-trial presentation method. • Some gave subjects a fixed experience across animals, with the amount of learning varying per animal. • Others gave subjects a variable experience but required a fixed criterion of learning to be reached.

  5. One measured either the time it took to reach the goal (mazes) or the number of trials needed to reach the criterion.

  6. Skinner and Operant Behavior The unique feature of operant training is that the experimenter waits until the animal performs the specified response before a reward is given. This is called free operant behavior.

  7. Reward vs. Reinforcement • A reward is a global state of affairs given to the whole animal (e.g., food, electric shock). • A reinforcement is for the specific discrete response the animal makes to obtain the reward.

  8. Primary reinforcers • Eating, drinking & sex • Addictive drugs • For animals, the equivalent of money for humans, e.g., poker chips, marbles. • When used in this way they are called conditioned reinforcers

  9. Positive and Negative Reinforcement • Both appetitive and aversive stimuli can be used to guide behavior. Both are used to increase a desired response. The reinforcement is delivered close in time after the desired response is emitted.

  10. Reinforcement & Punishment • Concept – Positive Reinforcement

  11. Description • Increasing the frequency of a behavior by following it with the presentation of a positive reinforcer – a pleasant, positive stimulus or experience

  12. Example • Saying “Good job” after someone works hard to perform a task.

  13. Types of reinforcers • Appetitive – usually food • Aversive – shock, air puff; stimuli that deliver pain or discomfort.

  14. Positive Reinforcement

  15. Concept: • Negative reinforcer

  16. Negative Reinforcement

  17. Note the following • The removal of a negative stimulus is positively reinforcing – the animal will tend to perform the behavior that removes it from the cues associated with the aversive state of affairs.

  18. Reinforcement/Punishment

  19. Shaping • Shaping is the method by which one gets the animal to accomplish the desired response in the first place.

  20. The desired final behavior is broken down into small steps or increments. The accomplishment of each step leads directly to the next step in the chain.

  21. How to train a monkey to hit a key.
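The step-by-step shaping procedure of slides 19–21 can be sketched in code. This is a minimal sketch, not from the lecture: the step names, the `mastery` criterion, and the `emits` callback are illustrative assumptions.

```python
def shape(emits, steps, mastery=3):
    """Shaping by successive approximation (a sketch; names are illustrative).

    `emits(step)` reports whether the subject performed `step` on this trial.
    Each success is reinforced immediately; we advance to the next, closer
    approximation only after `mastery` consecutive successes.
    """
    reinforcements = 0
    for step in steps:
        streak = 0
        while streak < mastery:
            if emits(step):
                reinforcements += 1  # reinforce right after the response
                streak += 1
            else:
                streak = 0  # the approximation is not yet reliable

    return reinforcements

# Hypothetical chain for training a monkey to hit a key:
key_steps = ["orient toward key", "approach key", "touch key", "press key"]
```

The design point is that reinforcement follows each small increment immediately, never only the final behavior.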

  22. Continuous reinforcement • A reinforcement is given for every desired response. If the reinforcement stops, the animal stops responding.

  23. Intermittent reinforcement • Intermittent reinforcement is more resistant to extinction than continuous reinforcement.

  24. Appetitive Schedules of Reinforcement • Schedules of reinforcement are based on two criteria: the number of responses or the passage of time.

  25. Ratio Schedules (FR) • A fixed-ratio schedule delivers a reinforcement after a given number of responses has been performed.

  26. Variable Ratio (VR) • Here the number of responses required varies around a mean • The slope of the cumulative record is not quite as steep as under fixed ratio

  27. Fixed Interval (FI) • Here, a reinforcement is delivered for the first response after the passage of a fixed amount of time. • Note the scalloping of the cumulative record.

  28. Variable Interval (VI) • A variable-interval schedule is similar to an FI schedule except that the time lapse between the availability of successive reinforcements is varied – for example, 1, 3, 2, etc. The interval is named after the mean amount of time passed. Again, the reinforcement is delivered for the first response after the interval has passed.

  29. VI • Note that in variable-interval schedules one does not see the scalloping seen in FI schedules. The slope is not as steep as in VR or FR schedules.
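The four basic schedules above reduce to simple decision rules: given the response history, should this response be reinforced? A minimal sketch, not from the lecture; class names and parameter choices are my own.

```python
import random

class FixedRatio:
    """FR n: reinforce every n-th response."""
    def __init__(self, n):
        self.n, self.count = n, 0
    def respond(self):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True  # deliver reinforcement
        return False

class VariableRatio:
    """VR n: the required number of responses varies around a mean of n."""
    def __init__(self, n):
        self.mean, self.count = n, 0
        self.required = random.randint(1, 2 * n - 1)
    def respond(self):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = random.randint(1, 2 * self.mean - 1)
            return True
        return False

class FixedInterval:
    """FI t: reinforce the first response after t seconds have elapsed."""
    def __init__(self, t):
        self.t, self.available_at = t, t
    def respond(self, now):
        if now >= self.available_at:
            self.available_at = now + self.t  # timer restarts at reinforcement
            return True
        return False

class VariableInterval:
    """VI t: like FI, but each interval is drawn around a mean of t."""
    def __init__(self, t):
        self.mean = t
        self.available_at = random.uniform(0, 2 * t)
    def respond(self, now):
        if now >= self.available_at:
            self.available_at = now + random.uniform(0, 2 * self.mean)
            return True
        return False
```

Note that in the ratio classes every response advances the count (the rate contingency), while in the interval classes responses between availabilities are simply wasted – which is why interval schedules support lower response rates.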

  30. Differential Reinforcement for Rate • In ratio schedules there is a contingency between the rate of responding and the rate of reinforcement: the faster the animal responds, the faster it gets reinforced. The contingency is weaker for interval schedules, but still present.

  31. Setting up a Differential Rate • One sets up a contingency between the number of responses within a given time interval and reinforcement. The key is to control the rate of response per unit time, i.e., to control the inter-response time (IRT).

  32. Differential Reinforcement for High Rates (DRH) • Here the animal must respond, say, 10 times in 5 seconds. Each time this criterion is met, the animal gets reinforced after the last response.

  33. Differential Reinforcement for Low Rates of Responding (DRL) • Here the animal must inhibit early responses to meet a criterion of, say, 10 sec. If the animal responds before the 10 sec have elapsed, the clock is reset and the animal must start the wait period over.

  34. Current theory postulates two underlying processes • The animal forms a temporal discrimination. • The animal actively inhibits responding (using ancillary responses, directed away from the requisite key or bar, to pass the time).

  35. DRH/DRL • Respond within a window of time: the animal must respond after a specific time has passed, but must not let an upper time limit be exceeded.
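The DRH and DRL rules above are timing checks on the animal's response stream. A minimal sketch, assuming time is passed in as seconds; the class names and the session-onset convention for the first DRL response are my own.

```python
from collections import deque

class DRL:
    """DRL t: reinforce a response only if at least t seconds have passed
    since the previous response; a premature response resets the clock."""
    def __init__(self, t):
        self.t = t
        self.last_response = 0.0  # assume the clock starts at session onset
    def respond(self, now):
        reinforced = (now - self.last_response) >= self.t
        self.last_response = now  # every response restarts the waiting period
        return reinforced

class DRH:
    """DRH n-in-w: reinforce when the last n responses fall within w seconds."""
    def __init__(self, n, w):
        self.n, self.w = n, w
        self.times = deque(maxlen=n)  # keeps only the n most recent responses
    def respond(self, now):
        self.times.append(now)
        return len(self.times) == self.n and (now - self.times[0]) <= self.w
```

The DRL sketch makes the reset rule explicit: every response, reinforced or not, becomes the new reference point for the next waiting period.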

  36. Wyler/Prim study using single neuron

  37. Negative Control of Behavior • Behavior emitted that removes an aversive state of affairs.

  38. Negative reinforcer Description: Increasing the frequency of a behavior by following it with the removal of an unpleasant stimulus or experience

  39. Concept • Avoidance conditioning

  40. Avoidance conditioning • Description: Learning to make a response that avoids an unpleasant stimulus.

  41. Example • You slow your car to the speed limit when you spot a police car, thus avoiding being stopped and reducing the fear of a fine; avoidance responses are very resistant to extinction.

  42. 1. Escape and Avoidance

  43. The control of Intrinsic behavior • In avoidance tasks, the animal removes itself from an environment that has previously been associated with a negative reinforcement.

  44. Sidman Avoidance • Shock–shock (S–S) interval: with no responding, a shock is delivered every 5 sec; each response postpones the next shock by the response–shock (R–S) interval.
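Sidman (free-operant) avoidance can be sketched as two timers. The 5-sec shock–shock interval is from the slide; the response–shock interval is a standard feature of the procedure, but its value here is illustrative, as are the class and method names.

```python
class SidmanAvoidance:
    """Free-operant (Sidman) avoidance schedule (a sketch).

    With no responding, shocks arrive every `ss` seconds (shock-shock
    interval). Each response postpones the next shock to `rs` seconds
    from now (response-shock interval), so steady responding avoids
    shock entirely, with no external warning signal.
    """
    def __init__(self, ss=5.0, rs=20.0):  # rs value is illustrative
        self.ss, self.rs = ss, rs
        self.next_shock = ss
    def respond(self, now):
        self.next_shock = now + self.rs  # a response postpones the shock
    def tick(self, now):
        """Returns True if a shock is delivered at time `now`."""
        if now >= self.next_shock:
            self.next_shock = now + self.ss  # S-S interval until next shock
            return True
        return False
```

The sketch shows why the behavior is hard to extinguish: a well-timed responder never contacts the shock, so nothing in its experience signals that the contingency has changed.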
