What do reinforcers and punishers do? “Learning provides the knowledge, and reinforcers provide the goals to cause the organism to act on that knowledge” (Anderson, 2000, p. 119).
Rationality and optimality • Individuals choose rationally within the limits of their knowledge. • Choices that are less than optimal may result from biological or learned preparedness that is inappropriate for a particular situation: Choosing sweet foods over nutritious foods, for example.
Rational analysis • Individuals combine the probabilities of particular outcomes with the value of those outcomes to determine the best course of action. For example: • If p(food) is high and the value of food is also high, respond to get food. • If p(food) combined with value of food is lower than p(warmth) combined with value of warmth, respond to get warmth. • If the products are equal, alternate.
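The choice rule above can be sketched in a few lines of Python. This is only an illustration: the function name `choose`, the option names, and all probabilities and values are made up for the example, and "combined with" is read here as simple multiplication.

```python
import math

def choose(options):
    """Return the option(s) with the highest p(outcome) x value.

    `options` maps a response name to a (probability, value) pair.
    A single name means "make that response"; several names mean the
    products are equal, so the rule says to alternate among them.
    All names and numbers are hypothetical.
    """
    scores = {name: p * value for name, (p, value) in options.items()}
    best = max(scores.values())
    # Compare with a tolerance so that floating-point noise does not hide a tie.
    return sorted(name for name, s in scores.items() if math.isclose(s, best))

# Food is both likely and valuable, so respond to get food.
print(choose({"food": (0.9, 10), "warmth": (0.5, 4)}))   # ['food']

# Warmth wins when its product is larger, even though food is worth more.
print(choose({"food": (0.2, 10), "warmth": (0.8, 5)}))   # ['warmth']

# Equal products (4.0 each): alternate between the two responses.
print(choose({"food": (0.5, 8), "warmth": (0.8, 5)}))    # ['food', 'warmth']
```

Returning a list rather than a single name keeps the "alternate" case explicit instead of hiding it behind an arbitrary tie-break.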
Comparing rational analysis and Hull • Hull: E = (H x D x K) - I (recall that H is a product of reinforcement history) • Rational analysis: E = H x (D x K), that is, p(reward) x value of reward • If two responses are mutually exclusive, both Hull and rational analysis predict the same choice behavior.
But what if the same response produces good and bad consequences? • If a bar press produces food 67% of the time and shock 33%, what will the subject do? • It depends on the value of each consequence. • The likelihood of a response is the sum, over its consequences, of p(consequence) multiplied by value. • With food valued at 10 and shock at -25: E = (.67 x 10) + (.33 x -25) = -1.55 • Because E is negative, the bar press will not happen until the value of food increases sufficiently.
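The arithmetic on this slide can be checked with a short Python sketch, which also shows how far the value of food would have to rise. The function names are hypothetical, and the break-even calculation assumes that a response occurs once E rises above zero, which the slide implies but does not state.

```python
def response_strength(outcomes):
    """Sum of p(consequence) x value over all consequences of one response."""
    return sum(p * value for p, value in outcomes)

def break_even_value(p_reward, p_aversive, aversive_value):
    """Reward value at which E is exactly zero (assumed response threshold)."""
    return -(p_aversive * aversive_value) / p_reward

# Slide values: food 67% of the time (value 10), shock 33% (value -25).
e = response_strength([(0.67, 10), (0.33, -25)])
print(round(e, 2))  # -1.55

# Food must be worth more than this before E turns positive.
print(round(break_even_value(0.67, 0.33, -25), 2))  # 12.31
```

With these numbers, raising the value of food from 10 to anything above roughly 12.3 (for example, through greater deprivation) makes E positive.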
Does reward affect learning? • Rational analysis suggests that reward only influences choice, not learning probabilities. • The surprising conclusion: Learning does not depend on reward.
Human research: Within groups • Within-group studies show that differential reward does affect learning (e.g., Harley, 1965). Learn these words, for one cookie each: interpolate, lexicon, musical, domicile, cyberspace. And learn these words, for two cookies each: extrapolate, dictionary, lyrical, dwelling, hyperlink.
Human research: Between groups • Between-group studies show that differential reward does not affect learning. GROUP 1: Learn these words, for one cookie each: interpolate, lexicon, musical, domicile, cyberspace. GROUP 2: Learn these words, for two cookies each: extrapolate, dictionary, lyrical, dwelling, hyperlink.
Can contingencies of reward ever affect learning? • Loftus, 1972: When participants know about differential reward, it affects learning. • More time is spent learning material (more fixations) which promises a higher reward. • Thus, reinforcement can affect attention and effort in learning, if and only if the learner knows about the different consequences while learning. • What do review sheets do to students?
An interesting study to do • Begin a study session with no knowledge of differential reinforcement. • At some point in the session, inform participants of differential reinforcement. • E.g., tell them which items are more likely to be on the exam. • Measure study time (fixation time and/or rehearsal time) for the two sets of items before and after the information is given.
Differential reward effects • Differential reward will affect learning only if it enables the learner to allocate attention differentially. • Different levels of reward have no direct effect on learning. You will learn just as much for a dime as for a dollar, if you do not know the difference in reward.
Another angle on differential reward • Capuchin monkey fairness protests (de Waal & Brosnan, 2003) • Monkeys were trained with different foods as rewards for the work of exchanging differently valued tokens. • E.g., a blue token could be exchanged for a piece of cucumber (low value), a red token for a grape (high value). • They readily learned the exchange rules.
But then… • de Waal & Brosnan put the monkeys in pairs, so they could observe each other’s exchanges. • The researchers began giving one monkey a better reward than was justified by the token offered, while rewarding the other monkey by the exchange rules. • The monkeys rewarded by the exchange rules stopped playing the game, or refused the lesser reward. • The effect was greater if the other monkey was given handouts. • The monkey given the greater reward never protested.
A naive view of contingency consequences
A better view of contingency
Application question • What is happening when parents ground a child? • Positive reinforcement? • Omission training? • Punishment? • Negative reinforcement?
Applying reinforcement • Behavior modification 1: Contingency management • The classroom: Catch them being good • Hall, Lund & Jackson (1968): Robbie, 6/7 classes • But it doesn’t always work • Difficult to be consistent • Different strokes for different folks • Systematic reinforcement: The token economy • Immediate secondary reinforcement • Enhanced subjective reward value through choice • Extra reinforcement for group meeting standards recruits peer pressure
Maintaining behavior • Non-reinforcement leads to extinction • But the extinction effect is mitigated by • Partial reinforcement • Reinforcement in multiple settings • Fading
But is reinforcement all good? • It uses bribery to play to human greed • It undermines other motivators • Less powerful reinforcers—the negative contrast effect • Intrinsic motivation (Lepper, Greene & Nisbett, 1973) • Internalized control and self-concept/ego ideal
What factors influence whether reinforcement will undermine behavior? • High intrinsic motivation is more readily undermined • Perceived coercion undermines behavior • Ryan (1982) • Brehm’s (1966) reactance theory • Reinforcing task completion vs. task competence (Enzle & Ross, 1978)
What else can we do to minimize undermining? • Use reinforcement for behaviors with low intrinsic motivation • Use praise rather than material rewards • Reward competence rather than compliance
More methods to minimize undermining • Match reinforcement to the optimal motivation level according to the Yerkes-Dodson law • Use the minimum reinforcement necessary • Start with social reinforcers • Use behavioral contracts
Aversive control of behavior • Punishment • Observational and correlational research vs. experimental research • Punishment is not necessarily corporal. • Avoidance • How can the non-occurrence of a behavior be reinforced?
Principles of punishment • Severity or intensity: Skinner (1938) and Thorndike vs. Boe & Church (1967) and Bucher & Lovaas (1968) • Systematic desensitization (Azrin, Holz & Hake, 1963) • Systematic sensitization • Secondary punishment • Immediacy vs. delay (Solomon, Turner & Lessac, 1968) • Delay has no effect on learning to suppress a response. • Delay does reduce perseverance of response suppression. • With humans, delay effects are found in a matter of seconds.
More principles of punishment • Consistency/high differential contingency: FR1 vs. FR1000 (Azrin et al., 1963) • Child observation (Larzelere, Schneider, Larson, & Pike, 1996) • Criminal activity (Brennan & Mednick, 1994) • Punishment inoculation by non-contingent application (gratuitous punishment: “That was for nothing. Don’t try anything.”) • Reasoning or verbal explanation improves punishment effects, perhaps by assisting discrimination learning.
Still more principles of punishment • Since many inappropriate responses persist because they are being reinforced, punishment is more effective if alternate routes to the reinforcer are available. • Poorly done punishment can produce suppression of desirable responses (CER), increased anger and aggression, lying as an avoidance behavior, and imitation of punishing with peers and weaker people.
A punishment ideal • Punishment works best when it signals • non-availability of reward for the punished behavior, and • availability of the desired reward if a different behavior is chosen. • Can you think of examples of this approach?
Other aversive approaches • Response cost • Randy and the smiley-face chart