Reinforcement & Punishment: What is an SR?

Reinforcement & Punishment: What is an SR? Lesson 9

What is an SR? • Thorndike’s Law of Effect • Satisfiers & annoyers • Skinner • determined by how B changes • reinforcer: ↑ B • punisher: ↓ B • Primary reinforcers & punishers • biologically important stimuli ~

What is an SR? (continued) • Secondary reinforcers & punishers • money • praise • How do they become an SR? • Classical Conditioning • Higher order learning ~

Drive Reduction View (50s & 60s) • Similar to Law of Readiness • Relative state of deprivation required • for a basic drive • thought to always be true • Drive → motivation → B → reduction of drive state (SR) ~

But... • Sometimes hard to identify drive • What drive is this? ~

Sensory reinforcement • Sensory stimulus unrelated to biological drive • monkeys learn response • reward is watching toy train • rats learn to bar press • reward = turning on a light • or turning off light ~

Premack Principle • Commonly used in educational setting • impractical or unethical to use food • Thought of reinforcers as responses • press bar → eating response • wider application of I/O conditioning • Differential probability principle • High probability responses reinforce low probability responses ~
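The differential probability principle can be read as a simple comparison of baseline response probabilities. The sketch below is only an illustration of that reading: the function name `can_reinforce` and the baseline numbers are hypothetical, not taken from the lecture or from Premack's data.

```python
def can_reinforce(p_reinforcer: float, p_target: float) -> bool:
    """Differential probability principle, as stated on the slide:
    a response can reinforce another only if it is the more
    probable of the two at baseline."""
    return p_reinforcer > p_target


# Hypothetical baseline probabilities (illustrative numbers only).
baseline = {"eating": 0.30, "bar pressing": 0.02}

# Eating (high probability) can reinforce bar pressing (low probability)...
print(can_reinforce(baseline["eating"], baseline["bar pressing"]))  # True
# ...but not the other way around, on this principle alone.
print(can_reinforce(baseline["bar pressing"], baseline["eating"]))  # False
```

On this reading, what counts as a reinforcer depends on the individual's own baseline, which is why different responses can be effective for different learners.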

Premack Principle • Homme et al (1963) • Unruly 3 year olds • High probability behaviors • ignored teacher • screaming • pushing furniture • Low probability behavior • sitting quietly ~

Premack Principle: Homme et al • Rewarded sitting quietly with... • 3 min of running around screaming • Results: sitting quietly increased • Particular behaviors differed from kid to kid • different responses were effective reinforcers for different kids ~

Premack Principle • Charlop, Kurtz, & Casey (1990) • autistic children • High probability behaviors • echolalia • perseveration • Low probability behaviors • adding up coins • judging objects: same or different ~

Premack Principle: Charlop et al • [Figure: % correct responses (y-axis) vs. # of sessions (x-axis), comparing food RFT and echolalia RFT] ~

Premack Principle: Problems • Fluctuation of response probabilities • e.g., sometimes kid would rather play outside than play video games • Solution: token economies • Does not explain how reinforcer increases response probability ~

Behavioral Regulation Approach • Response deprivation • limit access to a response • does not require high vs. low probability • Behavioral homeostasis • preferred distribution of activities • operant conditioning imposes limits • behavioral bliss point • e.g., time spent studying vs. video games ~

Behavioral Regulation Approach • A behavior is limited below bliss point • disturbance of behavioral homeostasis • analogous to increased biological drive • Contingency set during I/O procedure • establish relationship between responses • B → move toward bliss point (baseline) ~

Behavioral Regulation Approach • Low probability behaviors as reinforcers • observe baseline rate of behavior • limit activity below baseline • Require a response to engage in deprived behavior • contingency • Increase toward bliss point • cost vs. benefits determines how much ~
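The steps above (observe baseline, restrict an activity below baseline, make access contingent on a response) can be sketched as a single check. This is a minimal sketch under stated assumptions: the function `is_response_deprived`, its parameter names, and all the minute values are hypothetical, chosen only to illustrate the studying vs. video games example.

```python
def is_response_deprived(required_instrumental: float,
                         allowed_contingent: float,
                         baseline_instrumental: float,
                         baseline_contingent: float) -> bool:
    """Return True if the schedule holds the contingent activity below its
    baseline when the instrumental activity is performed at its own
    baseline level. Assumes required_instrumental > 0."""
    # Amount of the contingent activity the schedule would permit if the
    # learner did only the baseline amount of the instrumental activity.
    permitted = baseline_instrumental * allowed_contingent / required_instrumental
    return permitted < baseline_contingent


# Hypothetical baseline (bliss point): 60 min studying, 120 min video games.
# Schedule A: 30 min studying earns 15 min of games -> games are restricted.
print(is_response_deprived(30, 15, 60, 120))   # True
# Schedule B: 10 min studying earns 60 min of games -> no restriction.
print(is_response_deprived(10, 60, 60, 120))   # False
# Schedule C: 60 min of games earns 5 min of studying -> the low
# probability behavior (studying) is now the deprived one.
print(is_response_deprived(60, 5, 120, 60))    # True
```

Schedule C shows why this approach does not need a high vs. low probability difference: any activity held below its baseline can, on this view, serve as the reinforcer.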

What Becomes Connected? • Skinner? • refused to consider associations • Thorndike: S-R view (SD-B) • association b/n stimulus context and response • NOT the outcome (SR) • no representation of reinforcer ~

S-R-O (SD-B-SR) view: Tinklepaugh (1928) • Goal-oriented responding • respond with idea of getting reward • The monkey and the hidden banana • 2 cups, put banana under 1 • task: choose cup with banana • Secretly substituted rotten lettuce • monkey became agitated • Expected banana reward (outcome) ~

S-R vs. S-R-O • Adams & Dickinson (1981) • Taste aversion paradigm • Associate sucrose (sweetener) • w/ lithium chloride (LiCl) → illness • Will rats press bar to get something that makes them sick? ~

S-R vs. S-R-O • Phase 1: • Trained rats to bar press for sucrose • Phase 2: • associate sucrose w/ illness • Phase 3: • Will rats press bar now? • No sucrose delivered ~

S-R vs. S-R-O : Results • Predictions? • If S-R-O • If S-R • Results • Rats did not press bar • Supports S-R-O ~

S-R vs. S-R-O • Use different levels of training • Phase 1: Same procedure but… • some get 100 RFTs • some get 500 RFTs ~

Results & Conclusions • Less training → low response rate • Little training → outcome important • S-R-O • Extensive training → high response rate • outcome less important • response is well established • S-R ~
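The logic of the devaluation test can be written out as a small lookup. This is only a sketch of the reasoning on these slides: the names `Association`, `predicted_pressing`, and `CONCLUSION_BY_TRAINING` are hypothetical, and "low"/"high" are qualitative labels, not measured response rates.

```python
from enum import Enum


class Association(Enum):
    S_R = "S-R"        # stimulus-response only, no representation of outcome
    S_R_O = "S-R-O"    # response is guided by the expected outcome


def predicted_pressing(association: Association, outcome_devalued: bool) -> str:
    """Qualitative prediction for bar pressing in the test phase."""
    # Only an association that represents the outcome can be affected
    # by pairing that outcome with illness.
    if association is Association.S_R_O and outcome_devalued:
        return "low"
    return "high"


# Conclusions drawn on the slide for the two training conditions (RFTs).
CONCLUSION_BY_TRAINING = {100: Association.S_R_O, 500: Association.S_R}

for rfts, association in CONCLUSION_BY_TRAINING.items():
    print(rfts, association.value, predicted_pressing(association, True))
# 100 S-R-O low
# 500 S-R high
```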

Parallel learning in humans • Learning a skill • e.g., to drive a car • Early trials • consider consequences • must think about what you are doing • After extensive experience • becomes automatic • after many trials ~

Extrinsic Reward vs Intrinsic Motivation • Early trials • expectation of reinforcer • extrinsic reward • CER = positive affect • Well-established behavior • no expectation of reward • intrinsic motivation • CER = positive affect ~
