
PRINCIPLES OF APPETITIVE CONDITIONING



  1. PRINCIPLES OF APPETITIVE CONDITIONING Chapter 6

  2. Early Contributors • Thorndike’s Contribution • Emphasized Laws of Behavior • Demonstrated trial-by-trial learning • S-R learning • Skinner’s contribution • Emphasized Contingency • A specified relationship between behavior and reinforcement in a given situation • The environment “sets” the contingencies: S(R → O)

  3. A “Faux” Distinction • Instrumental conditioning • A conditioning procedure in which the environment constrains the opportunity for reward (discrete trial) • Operant conditioning • When a specific response produces reinforcement, and the frequency of the response determines the amount of reinforcement obtained (continuous responding, schedules of reinforcement)

  4. Thorndike’s Law of Effect • S-R associations are “stamped in” by reward (satisfiers) [Diagram: Stimulus → Response]

  5. Thorndike: “What is learned?” [Diagram: S → R — reinforcement “stamps in” this connection (habit learning)]

  6. Is that it? [Diagram: the instrumental S → R association, with “?” links asking whether S-O (Pavlovian) and R-O associations are also learned]

  7. “O” matters • The Importance of Past Experience • Depression/Negative Contrast • The effect in which a shift from high to low reward magnitude produces a lower level of responding than if the reward magnitude had always been low • Elation/Positive Contrast • The effect in which a shift from low to high reward magnitude produces a greater level of responding than if the reward magnitude had always been high

  8. Negative and Positive Contrast

  9. Logic of the Devaluation Experiment • R-O or goal-directed: responding controlled by the current value of the reinforcer, so it should be reduced to zero after devaluation • S-R or habit: responding not controlled by the current value of the reward, so it is insensitive to reinforcer devaluation [Figure: responding (Max to Min) under Normal vs. Devalued conditions]

  10. R-O Association (aka the instrumental association) • Phase 1: Push Left → Pellet; Push Right → Sucrose • Devaluation: Pellet + LiCl (or Sucrose + LiCl) • Test: after pellet devaluation, push Right?; after sucrose devaluation, push Left? [Figure: number of pushes on each lever with the pellet vs. the sucrose devalued]

  11. Summary of Devaluation • Neutered male rats lower but do not eliminate responding previously associated with access to a “ripe” female rat • Rats satiated on reward #1 lower responding for reward #1 more than for reward #2 • Goal-devaluation effects tend to shrink with continued training, as goal-directed responding is replaced by habit learning

  12. S-O Association (aka the Pavlovian association) • Stage 1: Right → Pellet; Left → Sucrose • Stage 2: Tone → Pellet; Light → Sucrose • Test: Tone — Left? Right?; Light — Left? Right? [Figure: number of left and right presses during the Tone vs. the Light]

  13. Skinner’s Contributions • Automatic • Easy measurements that can be compared across species

  14. Three Terms Define the Contingency • Three-term contingency • Discriminative stimulus (S+ or S-) • Operant (R) • Consequence (O)

  15. Operant Strengthened [Diagram: Light-On (S+) in the Skinner box; of the responses bite, groom, lick, rear, and push lever, pushing the lever (R) produces the reinforcer (O)]

  16. Techniques and Concepts • Shaping: successive approximations • Require closer and closer approximations to the target behavior • Secondary reinforcers: stimuli accompanying reinforcer delivery • Marking: feedback that a response has occurred

  17. Shaping • Shaping (or the successive-approximation procedure) • Select a frequently occurring operant behavior, then slowly change the contingency until the desired behavior is learned

  18. Training a Rat to Bar Press • Step 1: reinforce for eating out of the food dispenser • Step 2: reinforce for moving away from the food dispenser • Step 3: reinforce for moving in direction of bar • Step 4: reinforce for pressing bar
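The four training steps above amount to a successive-approximation loop: reinforce whatever meets the current criterion, then tighten the criterion. A minimal sketch of that loop — the Gaussian response model, criterion values, and learning rate are illustrative assumptions, not from the chapter:

```python
import random

def shape(criteria=(0.25, 0.5, 0.75, 1.0), trials=2000, seed=0):
    """Successive approximation: reinforce any response meeting the current
    criterion, then require a closer approximation to the target.
    Responses are modeled as Gaussian noise around a drifting mean, and
    each reinforcement nudges the mean toward the reinforced response."""
    rng = random.Random(seed)
    mean = 0.0                      # typical response, e.g. progress toward the bar
    for criterion in criteria:      # steps 1-4: stricter and stricter requirement
        for _ in range(trials):
            response = rng.gauss(mean, 0.2)
            if response >= criterion:              # close enough -> reinforce
                mean += 0.1 * (response - mean)    # behavior shifts toward reward
    return mean

trained = shape()   # typical behavior ends at or beyond the final criterion
```

With no trials the behavior never moves; with enough trials per stage, the typical response drifts past each criterion in turn, which is the point of raising the criterion only gradually.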

  19. Appetitive Reinforcers • Primary reinforcer • An activity whose reinforcing properties are innate • Secondary reinforcer • An event that has developed its reinforcing properties through its association with primary reinforcers

  20. Primary Reward Magnitude • The Acquisition of an Instrumental or Operant Response • The greater the magnitude of the reward, the faster the task is learned • The differences in performance may reflect motivational differences

  21. Magnitude

  22. Primary Reward and Degraded Contingency [Diagram: with a perfect contingency (bar press → food), responding is strong; with a degraded contingency, responding is weak]

  23. Strength of Secondary Reinforcers 23 • Several variables affect the strength of secondary reinforcers • The magnitude of the primary reinforcer • The greater the number of primary-secondary pairings, the stronger the reinforcing power of the secondary reinforcer • The time elapsing between the presentation of the secondary reinforcer and the primary reinforcer affects the strength of the secondary reinforcer

  24. Primary-Secondary Pairings

  25. Schedules of Reinforcement • Schedules of reinforcement • A contingency that specifies how often or when we must act to receive reinforcement

  26. Schedules of Reinforcement • Fixed Ratio • Reinforcement is given after a fixed number of responses • Short post-reinforcement pauses • Variable Ratio • Reinforcement is given after a varying number of responses

  27. Schedules of Reinforcement • Fixed Interval • The first response after a given interval is rewarded • FI scallop • Variable Interval • Like FI, but the interval varies around a given average • The scallop disappears

  28. Fixed Interval Schedule • Fixed interval schedule • Reinforcement is available only after a specified period of time, and the first response emitted after the interval has elapsed is reinforced • Scallop effect • With experience, the ability to withhold the response until close to the end of the interval increases • The pause is longer with longer FI schedules

  29. Variable Interval Schedules • Variable interval schedule • There is an average interval of time between available reinforcers, but the interval varies from one reinforcement to the next • Characterized by steady rates of responding • The longer the interval, the lower the response rate • The scallop effect does not occur on VI schedules • Encourages S-R habit learning
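The four basic schedules can be contrasted in a toy simulation. Everything below beyond the schedule definitions — a subject that presses once per second, ratio requirements drawn uniformly, intervals drawn uniformly around the mean — is an illustrative assumption:

```python
import random

def run_schedule(kind, value, presses, seed=0):
    """Count reinforcers earned by a subject pressing once per second.

    kind  : 'FR', 'VR', 'FI', or 'VI'
    value : ratio requirement (FR/VR) or mean interval in seconds (FI/VI)
    """
    rng = random.Random(seed)
    rewards = 0
    count = 0                                  # presses since last reward
    next_time = value                          # when reward becomes available
    if kind == 'VI':
        next_time = rng.uniform(0, 2 * value)  # varying interval, same mean
    req = value if kind == 'FR' else (rng.randint(1, 2 * value) if kind == 'VR' else None)
    for t in range(1, presses + 1):            # one press at t = 1, 2, 3, ... s
        if kind in ('FR', 'VR'):
            count += 1
            if count >= req:                   # ratio requirement met
                rewards += 1
                count = 0
                if kind == 'VR':
                    req = rng.randint(1, 2 * value)   # new varying requirement
        else:  # FI / VI: first press after the interval elapses is rewarded
            if t >= next_time:
                rewards += 1
                gap = value if kind == 'FI' else rng.uniform(0, 2 * value)
                next_time = t + gap
    return rewards
```

For example, `run_schedule('FR', 5, 100)` yields exactly 20 reinforcers, while `run_schedule('FI', 10, 100)` yields 10 — on interval schedules, extra presses between reinforcers earn nothing, which is why they produce steadier, lower rates than ratio schedules.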

  30. Some Other Schedules • DRL: differential reinforcement of low rates of responding • DRH: differential reinforcement of high rates of responding • DRO: differential reinforcement of other behavior (anything but the target behavior)

  31. Compound Schedules • Compound schedule • A complex contingency where two or more schedules of reinforcement are combined

  32. Schedule this… • Concurrent schedules permit the subject to alternate between different schedules, or to repeatedly choose between working on different schedules [Diagram: option A, VI-30 ($5 today) vs. option B, VI-60 ($50 after a wait)]

  33. Matching Law • B1/(B1+B2) = R1/(R1+R2) • B stands for the number of responses of a given behavior • R stands for the number of reinforcers earned
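The matching law is easy to check numerically. A short sketch — the reinforcer counts are hypothetical, chosen only to make the prediction concrete:

```python
def matching_share(r1, r2):
    """Matching law: B1/(B1+B2) = R1/(R1+R2), so the predicted share of
    behavior on option 1 equals its share of the reinforcers earned."""
    return r1 / (r1 + r2)

# Hypothetical counts: key 1 earned 120 reinforcers per hour, key 2 earned 60,
# so matching predicts two-thirds of the responses go to key 1.
share = matching_share(120, 60)   # 120 / 180 = 2/3
```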

  34. Sniffy the Rat

  35. Typical Result

  36. Deviations From Matching • Bias: a preference for one response over the other that has nothing to do with the schedules programmed • e.g., one pigeon key requires more force to close its contact than the other, so the pigeon has to peck harder • e.g., one food hopper delivers food more quickly than another

  37. Sensitivity • Overmatching: the relative rate of responding is more extreme than predicted by matching; the subject appears to be “too sensitive” to the schedule differences • Undermatching: the relative rate of responding on a key is less extreme than predicted by matching; the subject appears to be “insensitive” to the schedule differences
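Bias and sensitivity are usually folded into the matching law by adding an exponent and a multiplier (Baum's generalized matching law). That form is standard in the operant literature but is not spelled out on the slide, so treat it as background:

```python
def behavior_ratio(r1, r2, sensitivity=1.0, bias=1.0):
    """Generalized matching law: B1/B2 = bias * (R1/R2) ** sensitivity.
    sensitivity > 1 -> overmatching ("too sensitive" to schedule differences)
    sensitivity < 1 -> undermatching ("insensitive" to schedule differences)
    bias != 1       -> a constant preference unrelated to the schedules,
                       e.g. a stiffer key or a slower food hopper."""
    return bias * (r1 / r2) ** sensitivity

exact = behavior_ratio(2, 1)                   # strict matching: ratio 2.0
under = behavior_ratio(2, 1, sensitivity=0.8)  # < 2.0, less extreme than matching
over  = behavior_ratio(2, 1, sensitivity=1.2)  # > 2.0, more extreme than matching
```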

  38. Overmatching

  39. Poor Self-Control [Diagram: direct choice (concurrent schedule) between A, a small reward, and B, a LARGE reward]

  40. Self-Control and Overmatching • Concurrent choice: humans and nonhumans often choose an immediate small reward over a larger delayed reward (delayed rewards are “discounted”)

  41. Another Example of Impulsivity • “Free” reinforcers are given every 20 s (at 20, 40, and 60 s) • A lever press advances delivery of the first pellet and deletes the second pellet • So, if you press at 2 seconds, you get a pellet immediately, but you get no other pellet until the 60-second pellet is available
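The arithmetic of this procedure can be made explicit. A small sketch — the trial length and pellet times come from the slide's example; anything else is assumed:

```python
def pellets_in_trial(press_time=None, free_times=(20, 40, 60)):
    """Pellets earned in one trial. Without a press, free pellets arrive
    every 20 s. A press delivers the first pellet immediately but deletes
    the second, so the next pellet after a press is the third free one."""
    if press_time is None:
        return len(free_times)        # wait it out: pellets at 20, 40, and 60 s
    return 1 + (len(free_times) - 2)  # press: pellet now, then nothing until 60 s

waiting   = pellets_in_trial()    # 3 pellets for doing nothing
impulsive = pellets_in_trial(2)   # 2 pellets: pressing at 2 s costs one pellet
```

Pressing is impulsive in exactly the slide's sense: it trades total reward (3 pellets) for immediacy (a pellet at 2 s instead of 20 s).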

  42. Delay of Reinforcement • Delayed reinforcers are steeply discounted • Loss of self-control and impulsivity [Figure: reinforcer potency declining with delay]
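Steep discounting of delayed reinforcers is commonly modeled with the hyperbolic form V = A / (1 + kD) (after Mazur). The slide does not commit to a model, so the equation and the k value below are assumptions; the sketch shows why pre-commitment (the concurrent-chain procedure) helps:

```python
def discounted_value(amount, delay, k=0.2):
    """Hyperbolic delay discounting: value falls steeply with delay."""
    return amount / (1 + k * delay)

# Direct choice: the small immediate reward beats the larger delayed one.
now   = discounted_value(1, 0)     # 1.0
later = discounted_value(3, 15)    # 3 / 4 = 0.75 -> impulsive choice wins

# Add a common 10 s front-end delay (as in pre-commitment) and the
# preference reverses toward the larger-later reward.
small = discounted_value(1, 10)    # 1 / 3
large = discounted_value(3, 25)    # 1 / 2 -> larger-later reward now preferred
```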

  43. Concurrent Chain (Pre-commitment) [Diagram: an initial choice between chains A and B commits the subject to either the small or the LARGE reward later]

  44. Behavioral Methods for Self-Control • Pre-commitment • Self-exclusion contracts • Distraction • Modeling • Shaping waiting: reduce the delay for the small reward, increase the delay for the large reward

  45. The Discontinuance of Reinforcement • Extinction • The elimination or suppression of a response caused by the discontinuation of reinforcement or the removal of the unconditioned stimulus • When reinforcement is first discontinued, the rate of responding remains high • Under some conditions, it even increases

  46. Stronger Learning ≠ Slower Extinction • Partial Reinforcement Extinction Effect (PREE) • The extinction paradox

  47. Importance of Consistency of Reward • Extinction is slower following partial rather than continuous reinforcement • Partial reinforcement extinction effect (PREE): the greater resistance to extinction of an instrumental or operant response following intermittent rather than continuous reinforcement during acquisition • One of the most reliable phenomena in psychology

  48. Acquisition with Differing Percentages [Figure: running speed across days for 100% vs. 80/50/30% reinforcement groups]

  49. Extinction with Differing Percentages [Figure: running speed across days in extinction for 100%, 80%, 50%, and 30% groups]

  50. Explanations • Mowrer–Bitterman discrimination hypothesis • Amsel’s frustration theory (emotional) • Capaldi’s sequential theory (cognitive)
