Contemporary Learning Theory PowerPoint Presentation

Presentation Transcript

    1. Contemporary Learning Theory Dr Pam Blundell Lecture Five

    2. Today: Introduce instrumental conditioning. How it differs from Pavlovian conditioning. Describe theoretical accounts of instrumental conditioning.

    3. Objectives: At the end of this lecture, students should be able to describe instrumental conditioning procedures, discuss what is learnt in instrumental conditioning, and evaluate formal models of instrumental conditioning.

    4. Reading: Dickinson, pp. 102-122; Pearce, Ch. 4.

    5. Instrumental conditioning: Both prediction and control are required for successful adaptation in a changing environment.

    6. Instrumental conditioning: Instrumental behaviour refers to those actions whose acquisition and maintenance depend upon the fact that the action is instrumental in causing some outcome. It allows us (and animals) to control our environment in the service of our needs and desires.

    7. Instrumental conditioning: Consider approach to a food source. A hungry chick will learn to approach a bowl. An instrumental analysis suggests the animal is sensitive to the contingency between its own behaviour and access to food. A Pavlovian account suggests that the predictive relationship between bowl and food is what matters.

    8. Instrumental conditioning: In a stable environment, we cannot discriminate between these two accounts of the behaviour. We need to change the causal structure of the environment to determine what is governing behaviour.

    9. Hershberger (1986): Arranged a "looking glass" world, in which approaching a food bowl actually increased the distance to it. A Pavlovian animal (insensitive to the consequences of its actions) would never be able to adapt; an instrumental animal would learn to withdraw. Chicks showed little evidence of learning to run away across 100 minutes of training, so their approach was not sensitive to the instrumental contingency.

    10. Miller & Konorski (1969): Passive flexion of a dog's leg, in the presence of a stimulus, was paired with food. After a number of pairings, the dog began to flex its leg in the presence of the stimulus. This is at odds with the notion of stimulus substitution, and was termed type II conditioning. But it doesn't demonstrate the instrumental character of type II conditioning.

    11. Grindley (1932): Guinea pigs were trained to turn their head to the left or right when a buzzer sounded to receive food. The contingency was then reversed (i.e. they had to turn the head the other way). The animals could do this. So the S-O (Pavlovian) relationship was kept constant (buzzer-food), yet behaviour changed.

    12. Instrumental conditioning: Many instrumental tasks have a strong Pavlovian component: pigeon key peck, runways, mazes. Free operant lever pressing in rats is a fairly pure instrumental task.

    13. Free operant lever pressing: Is sensitive to contingency reversal (Davis & Bitterman, 1971). Rats were trained to lever press, then the contingency was changed: either no contingency, or each press postponed delivery of food. The postponed group reduced responding more.

    14. Bolles, Holtz, Dunn & Hill (1980): Trained rats to press a lever down and push a lever up for food; which action was rewarded was randomly determined, so rats tended to alternate. One category of response was then punished with shock. Only the punished response was suppressed, showing sensitivity to the consequences of their actions.

    16. What is learned in instrumental conditioning? The earliest explanation of instrumental conditioning is the Law of Effect (Thorndike): an association between stimulus and response, strengthened by presentation of a reinforcer.

    17. What is learned in instrumental conditioning? As in Pavlovian conditioning, this suggests no knowledge of the consequences of the action. Instrumental action is simply a habitual response triggered by the training stimuli. Drive will potentiate habits.

    18. Tolman (1932, 1959): Cognitive theory of instrumental action. Belief about the consequences of an action (means-end readiness). The value assigned to the outcome interacts with this expectancy to produce the behaviour.

    19. How to distinguish between the S-R and the cognitive account? As in Pavlovian conditioning, examine the effects of changing reward value!

    20. Adams and Dickinson (1981) (see also Colwill & Rescorla, 1986): LP1 → food1; LP2 → 0. Food 2 delivered non-contingently. 4 groups: PF, PS, UF, US (paired, unpaired, food, sucrose). Test lever pressing in extinction.
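
The logic of the devaluation test can be sketched in a few lines. This is only an illustration, not a model from the lecture: the function names and all numbers are hypothetical, and the multiplicative combination of expectancy and value is an assumption (the slides say only that the two "interact").

```python
# Illustrative sketch: what an S-R habit account and a Tolman-style
# expectancy account each predict for an extinction test after the
# outcome has been devalued. All names and numbers are hypothetical.

def sr_response_strength(habit_strength: float, outcome_value: float) -> float:
    """S-R account: performance reflects only the learned S-R bond.

    The current value of the outcome is deliberately ignored.
    """
    return habit_strength


def expectancy_response_strength(expectancy: float, outcome_value: float) -> float:
    """Cognitive account: the belief that the action produces the outcome
    is combined (here, by assumption, multiplied) with the outcome's value.
    """
    return expectancy * outcome_value


HABIT = 0.8            # strength acquired during training (hypothetical)
EXPECTANCY = 0.8       # learned action-outcome belief (hypothetical)
VALUE_INTACT = 1.0     # outcome not devalued (unpaired groups)
VALUE_DEVALUED = 0.1   # outcome devalued (paired groups)

for label, value in [("unpaired", VALUE_INTACT), ("paired", VALUE_DEVALUED)]:
    print(label,
          "S-R:", sr_response_strength(HABIT, value),
          "cognitive:", expectancy_response_strength(EXPECTANCY, value))
```

Under these assumptions the S-R account predicts identical responding in paired and unpaired groups, whereas the cognitive account predicts less responding after devaluation.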

    22. Adams & Dickinson (1981): Supports the cognitive theory of instrumental action, BUT there was some residual responding in the paired groups. When training was reintroduced, the devalued foods were ineffective reinforcers, so the residual responding was not due to ineffective devaluation. Perhaps both S-R and cognitive processes are going on?

    24. What is learned in instrumental conditioning? The problem with Tolman's account: no specification of the psychological mechanism by which expectancies, beliefs and values interact to cause instrumental action.

    25. Bidirectional theory Pairing two events not only causes a forward connection between them, but also a backwards connection

    26. Bidirectional theory If E1 is instrumental action and E2 is the reinforcer

    27. Bidirectional theory: Gormezano & Tait (1976). E1: airpuff; E2: water delivery. If backward associations form, then water delivery should elicit an eyeblink. But it doesn't.

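    28. Bidirectional theory: Can't explain punishment. Action is reduced by punishment, whereas a backward connection from the punisher to the action predicts that the action should be elicited, not suppressed.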

    29. Associative-cybernetic model: Associative: involves the formation of a connection between representations of action and outcome. Cybernetic: activation of the outcome representation feeds back to modulate performance.

    31. Habit memory: An array of stimulus-detecting units linked to an array of response units. Corresponds to URs or pretrained responses.

    33. Associative memory: Contains representations of actions and outcomes. Performance of an action activates the representation of that action. Contiguous activation of the action in habit and associative memory allows growth of a connection between the habit and associative representations of the action. Making a response becomes associated with the outcome of that response.

    35. Incentive system: Any event in associative memory that has motivational significance has associations with units in the incentive system. Innate? But learnable! (see next lecture). Activation of reward units exerts a general and indiscriminate excitation on units in the motor system. Similarly, activation of punishment units inhibits all units in the motor system.

    37. Associative-cybernetic model: Important associations: Habit response --- associative action (ability to detect and represent the animal's own behaviour). Associative action --- associative outcome (ability to detect and represent the contingency between action and outcome). Associative outcome --- reward incentive system (represents the desires of the animal).
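
The feedback loop described above can be caricatured in code. This is a loose sketch under stated assumptions, not the published model: the additive combination of habit input and incentive feedback, the multiplicative feedback term, the threshold, and every name and number are hypothetical.

```python
# Loose sketch of the associative-cybernetic loop. Assumptions (not from
# the lecture): habit input and incentive feedback add at the motor unit,
# feedback = action-outcome strength * incentive value of the outcome,
# and a response is emitted when activation reaches a fixed threshold.

def motor_activation(habit_input: float,
                     action_outcome_strength: float,
                     outcome_incentive: float) -> float:
    """Activation of the motor unit for one candidate response.

    habit_input: excitation arriving from habit memory (S-R link).
    action_outcome_strength: associative-memory link from action to outcome.
    outcome_incentive: positive for reward units, negative for punishment.
    """
    feedback = action_outcome_strength * outcome_incentive
    return habit_input + feedback


def response_emitted(activation: float, threshold: float = 1.0) -> bool:
    """The response is performed only if activation reaches threshold."""
    return activation >= threshold


valued = motor_activation(0.6, 0.8, 1.0)     # outcome currently desired
devalued = motor_activation(0.6, 0.8, 0.0)   # outcome has no incentive value
punished = motor_activation(0.6, 0.8, -1.0)  # outcome activates punishment units

print(response_emitted(valued), response_emitted(devalued), response_emitted(punished))
```

With these hypothetical values the habit input alone is too weak to drive the response, so performance depends on the incentive feedback, and a punishing outcome pushes activation below the habit baseline.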

    38. Representations of instrumental actions: Habit response --- associative action (ability to detect and represent the animal's own behaviour). Shettleworth (1975) compared the sensitivity of a variety of behaviours to food conditioning in hamsters. Rearing could be conditioned; face washing and scratching couldn't.

    39. Morgan & Nicholas (1979): Two levers were presented into the operant chamber following the rat either face washing or rearing. The animal had to make one response if it had just reared, or the other if it had just washed. They could learn this. For a second group the behaviours were scratching and face washing, and they couldn't learn this. Scratching and face washing are poorly represented (no associative memory of scratching in the associative-cybernetic model).

    41. Heyes: Observational learning. Seeing a conspecific carrying out an action also activates the associative representation of the action. Observer rats pushed a pole in the same direction as demonstrator rats.

    42. Instrumental learning: Associative action --- associative outcome (ability to detect and represent the contingency between action and outcome). In instrumental conditioning, it is the causal relationship that determines behaviour. BUT can instrumental behaviour be explained simply by a sensitivity to the temporal contiguity of events?

    43. Contiguity

    44. Contiguity: Animals are certainly sensitive to the contiguity between action and outcome, BUT learning is still maintained even with a 30 s gap between action and outcome. Perhaps the sensitivity to contiguity is due to a difficulty in discriminating a causal relationship in which A → O from a non-contingent schedule in which the outcome occurs frequently but independently of behaviour.

    45. Contingency Hammond (1980) Varied P(O/A) and P(O/-A)

    47. Contingency: With P(O/-A) = 0, pressing is higher with higher P(O/A). As P(O/-A) increases, responding decreases. So outcomes that follow no response don't act as delayed reinforcers (which should increase responding), and animals appear sensitive to the causal relationship.
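
One common way to put a number on the action-outcome contingency is the difference between the two conditional probabilities. The slides do not name this index, so treat it as an assumption brought in for illustration; the probability values below are hypothetical and are not Hammond's actual parameters.

```python
# Sketch: contingency as the difference P(O/A) - P(O/-A).
# The index and the example values are assumptions for illustration.

def contingency(p_o_given_a: float, p_o_given_not_a: float) -> float:
    """Return P(O/A) - P(O/-A).

    Positive: the action makes the outcome more likely.
    Zero: the outcome is equally likely with or without the action.
    Negative: the action makes the outcome less likely.
    """
    for p in (p_o_given_a, p_o_given_not_a):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    return p_o_given_a - p_o_given_not_a


# Holding P(O/A) fixed while raising P(O/-A) degrades the contingency,
# even though the action is followed by the outcome just as often.
for p_not_a in (0.0, 0.25, 0.5):
    print(f"P(O/A)=0.5, P(O/-A)={p_not_a}: contingency={contingency(0.5, p_not_a)}")
```

The point of the sketch is that contiguity between action and outcome is identical in all three rows; only the rate of "free" outcomes changes.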

    48. Alternatively: Perhaps non-contingent reinforcers enhanced Pavlovian approach behaviours at the expense of the instrumental responses?

    49. Dickinson & Mulatero (1989) (see also Dickinson, Campos, Varga, & Balleine, 1996): L1 → food; L2 → sucrose. Present non-contingent food. Only L1 responding reduced. Rats are sensitive to the contingencies.

    50. However: We can still claim contiguity as a crucial element of conditioning, as we can explain the results of Hammond very easily. Context-food associations will be stronger in the non-contingent groups, which will block learning of the action-food association (recall blocking).
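
The blocking argument can be illustrated with an error-correcting learning rule in which the context and the action compete for a shared prediction error. The choice of a Rescorla-Wagner style rule is an assumption here (the slide says only "recall blocking"), and the sketch applies the expected (averaged) update on each step rather than sampling individual trials. All parameter values are hypothetical.

```python
# Sketch: context and action compete for associative strength under a
# Rescorla-Wagner style rule. Assumptions: the context is present in every
# time bin, the action occurs in a fraction p_action of bins, and the
# update applied on each step is the expected one (no random sampling).

def simulate_competition(p_o_given_a: float,
                         p_o_given_not_a: float,
                         p_action: float = 0.5,
                         alpha: float = 0.1,
                         steps: int = 5000,
                         lam: float = 1.0) -> tuple[float, float]:
    """Return (v_context, v_action) after training."""
    v_context = 0.0
    v_action = 0.0
    for _ in range(steps):
        # Bins with the action: context and action are both present.
        err_with_action = p_o_given_a * lam - (v_context + v_action)
        # Bins without the action: only the context is present.
        err_without_action = p_o_given_not_a * lam - v_context
        d_action = alpha * p_action * err_with_action
        d_context = alpha * (p_action * err_with_action
                             + (1.0 - p_action) * err_without_action)
        v_action += d_action
        v_context += d_context
    return v_context, v_action


contingent = simulate_competition(0.5, 0.0)      # outcomes only after the action
non_contingent = simulate_competition(0.5, 0.5)  # outcomes regardless of action
print("contingent     (context, action):", contingent)
print("non-contingent (context, action):", non_contingent)
```

Under these assumptions the action ends up with substantial strength only when outcomes depend on it; when the same outcomes also arrive without the action, the context absorbs the strength and the action gains almost none, despite identical action-outcome contiguity.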

    51. Signalling the non-contingent outcomes increases the rate of instrumental lever pressing (the context is overshadowed by the signal; Dickinson & Charnock, 1985).

    52. In summary: A simple contiguity-based learning process provides an account of the sensitivity of instrumental performance to variations in the causal effectiveness of an action, BUT consider the effect of schedules.

    53. Ratio vs interval schedules: Ratio schedules: the more you press, the more you earn! FR15, VR20, RR20. Interval schedules: how much reinforcement is received is largely independent of the rate of responding. FI15, VI25, RI30.
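
The difference between the two schedule types shows up clearly in their feedback functions, that is, reward rate as a function of response rate. The sketch below assumes responses occur at random moments at a constant average rate, and that on the random-interval schedule a reward becomes available after a random wait with the stated mean and is then held until the next response. The function names are hypothetical.

```python
# Sketch: feedback functions for random-ratio (RR) and random-interval (RI)
# schedules. Assumes responding is random in time at a constant mean rate,
# and that an RI reward, once set up, waits for the next response.

def rr_reward_rate(responses_per_min: float, ratio: float) -> float:
    """RR schedule: each response pays off with probability 1/ratio,
    so rewards per minute grow in proportion to responses per minute."""
    return responses_per_min / ratio


def ri_reward_rate(responses_per_min: float, mean_interval_s: float) -> float:
    """RI schedule: the expected time between rewards is the mean set-up
    interval plus the mean wait for the next response."""
    mean_wait_for_response_s = 60.0 / responses_per_min
    return 60.0 / (mean_interval_s + mean_wait_for_response_s)


for rate in (10.0, 20.0, 40.0):
    print(f"{rate:>5.0f} resp/min   RR20: {rr_reward_rate(rate, 20):.2f}/min"
          f"   RI30: {ri_reward_rate(rate, 30):.2f}/min")
```

Doubling the response rate doubles the payoff on RR20, but on RI30 the payoff creeps towards a ceiling of two rewards per minute however fast the animal responds.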

    54. Dawson & Dickinson (1990): Trained rats to chain pull on either RR20 or RI schedules. The inter-reinforcement interval (IRI) on the RI schedule was determined by a yoked animal on an RR20 schedule, so the temporal distribution of reinforcers was matched.

    56. P(O/A) is higher on the RI schedule, but the RR schedule produces more responding. So animals are sensitive to the causal relationship between performance and reward rates. Further evidence comes from Dickinson (1983), which found ratio schedules more sensitive to reward devaluation.
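
A worked example may help show why the yoked comparison is informative. The response counts below are invented for illustration and are not data from the study; the only constraint taken from the design is that the yoked RI animal earns the same number of outcomes as its RR20 partner.

```python
# Sketch: outcome probability per response in a yoked RR/RI pair.
# Response counts are hypothetical; only the matching of outcomes
# between the two animals comes from the yoking procedure.

def p_outcome_given_action(outcomes: int, responses: int) -> float:
    """Proportion of responses that were followed by an outcome."""
    return outcomes / responses


rr_responses = 1000               # hypothetical session total on RR20
rr_outcomes = rr_responses // 20  # about one outcome per 20 responses
ri_outcomes = rr_outcomes         # yoking matches the outcomes earned
ri_responses = 600                # hypothetical: the RI animal responds less

print("RR20 P(O/A):", p_outcome_given_action(rr_outcomes, rr_responses))
print("RI   P(O/A):", p_outcome_given_action(ri_outcomes, ri_responses))
```

If response strength simply tracked the probability that a response is followed by the outcome, the RI animal should respond more, which is the opposite of what is reported.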

    58. Summary: Instrumental action is mediated by two systems: habit, and the associative-cybernetic system. Next time we'll discuss incentive learning and Pavlovian-instrumental interactions!