# Schedules of reinforcement

Simple schedules of reinforcement

• CRF
• FR
• VR
• FI
• VI

Response-rate schedules of reinforcement

• DRL
• DRH

Why do ratio schedules produce higher rates of responding than interval schedules?

Inter-response time (IRT)

Francis sells jewelry to a local gift shop. Each time he completes 10 pairs of earrings, the shopkeeper pays him for them. This is an example of a ______ schedule of reinforcement.

A. Fixed ratio

B. Variable ratio

C. Fixed interval

D. Variable interval

Vernon is practicing his golf putting. On average, it takes him four tries before the ball goes in the hole. This is an example of a ______ schedule of reinforcement.

A. Fixed ratio

B. Variable ratio

C. Fixed interval

D. Variable interval

Sandra’s mail is delivered every day at 10:00. She checks her mailbox several times each morning, but only finds mail the first time she checks after 10:00. This is an example of a ______ schedule of reinforcement.

A. Fixed ratio

B. Variable ratio

C. Fixed interval

D. Variable interval

Paula is an eager third-grader and loves to be called on by her teacher. Her teacher calls on her approximately once each period, although Paula is never sure when her turn will come. This is an example of a ______ schedule of reinforcement.

A. Fixed ratio

B. Variable ratio

C. Fixed interval

D. Variable interval
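
The four simple schedules offered as answer choices differ only in the rule that decides whether a given response is reinforced. The following is a minimal Python sketch of those rules, not part of the original slides. The class names, the `respond(now)` method, and the helper `count_reinforcers` are hypothetical, and two modelling assumptions are made: the variable-ratio schedule pays off each response with probability 1/n, and the variable-interval schedule draws its waits from an exponential distribution.

```python
import random


class FixedRatio:
    """FR n: reinforce every n-th response."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, now):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """VR n: the number of responses required varies but averages n.

    Assumption: each response is reinforced with probability 1/n.
    """

    def __init__(self, n, rng=None):
        self.p = 1.0 / n
        self.rng = rng or random.Random()

    def respond(self, now):
        return self.rng.random() < self.p


class FixedInterval:
    """FI t: reinforce the first response made at least t time units
    after the previous reinforcer (or after the start of the session)."""

    def __init__(self, interval):
        self.interval = interval
        self.available_at = interval

    def respond(self, now):
        if now >= self.available_at:
            self.available_at = now + self.interval
            return True
        return False


class VariableInterval:
    """VI t: like FI, but the required wait varies around a mean of t.

    Assumption: waits are drawn from an exponential distribution.
    """

    def __init__(self, mean_interval, rng=None):
        self.mean = mean_interval
        self.rng = rng or random.Random()
        self.available_at = self.rng.expovariate(1.0 / self.mean)

    def respond(self, now):
        if now >= self.available_at:
            self.available_at = now + self.rng.expovariate(1.0 / self.mean)
            return True
        return False


def count_reinforcers(schedule, response_times):
    """Feed a series of responses (given by their times) to a schedule
    and count how many of them are reinforced."""
    return sum(schedule.respond(t) for t in response_times)
```

On the ratio schedules the `now` argument is ignored, because only the number of responses matters; on the interval schedules responding faster does not bring the next reinforcer any sooner.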

Concurrent schedules of reinforcement

Two schedules are in effect at the same time, and the subject is free to switch from one response alternative to the other

Choice Behavior and the Matching Law

The Matching Law is a mathematical statement describing the relationship between the rate of responding and the rate of reward

• developed by Herrnstein
• the relative rate of responding on a particular lever equals the relative rate of reinforcement on that lever

The Matching Law

Formula: $\frac{R_a}{R_a + R_b} = \frac{F_a}{F_a + F_b}$

$R_a$ and $R_b$ = number of responses on schedules a and b

$F_a$ and $F_b$ = number (frequency) of reinforcers received as a consequence of responding on schedules a and b
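
As a rough illustration of how the two sides of the formula are compared, the sketch below computes both relative rates from session totals. The function name `relative_rate` is hypothetical, and the counts are invented for illustration; they are not data from Herrnstein or from the slides.

```python
def relative_rate(a, b):
    """Share of the total that belongs to alternative a: a / (a + b)."""
    total = a + b
    if total == 0:
        raise ValueError("at least one count must be nonzero")
    return a / total


# Hypothetical session totals, invented for illustration only.
responses_a, responses_b = 1500, 500      # Ra, Rb
reinforcers_a, reinforcers_b = 45, 15     # Fa, Fb

response_share = relative_rate(responses_a, responses_b)        # Ra / (Ra + Rb)
reinforcer_share = relative_rate(reinforcers_a, reinforcers_b)  # Fa / (Fa + Fb)

# Matching holds when the two shares are (approximately) equal.
print(response_share, reinforcer_share)
```

With these made-up totals both shares come out to 0.75, which is what the matching law describes: three quarters of the responses go to the alternative that yields three quarters of the reinforcers.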

The Matching Law

Herrnstein found that pigeons matched their responses on a given key to the relative frequency of reinforcement for that key

That is, the number of pecks on Key A relative to the number of pecks on Key B matched the number of rewards earned on schedule A relative to the number of rewards earned on schedule B

Similar formulas, with similar results, hold for:

- magnitude of reward
- immediacy/delay of reward

Evaluation of the Matching Law

The matching law provides an accurate description of choice behavior in many situations, but there are exceptions and problems

• overmatching
• undermatching
• bias
• ratio versus interval schedules

Overmatching

• higher rate of responding for the better of the two schedules than the matching law predicts
• overmatching occurs when it is costly for a subject to switch to the less preferred response alternative (e.g., when the two levers are far apart)

Undermatching

• occurs when the subject responds less on the better of the two schedules than the matching law predicts
• absolute versus relative value of the amount or frequency of reward
• for example, the matching law predicts that subjects should make the same choice when reward magnitudes are 5 versus 3 as when the magnitudes are 10 versus 6, or 100 versus 60
• however, when absolute values are increased, the matching law is not always accurate
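
The prediction in the example above follows directly from the formula, because multiplying both magnitudes by the same factor leaves the ratio unchanged. A small sketch, with `predicted_share` as a hypothetical helper name:

```python
def predicted_share(magnitude_a, magnitude_b):
    """Matching-law prediction for the share of choices going to a."""
    return magnitude_a / (magnitude_a + magnitude_b)


# The three magnitude pairs mentioned in the slide.
pairs = [(5, 3), (10, 6), (100, 60)]
shares = [predicted_share(a, b) for a, b in pairs]
print(shares)
```

All three pairs give the same predicted share, 0.625, so any change in choice as the absolute magnitudes grow is a departure from the matching law.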

Experiment by Logue & Chavarro (1987)

• varied absolute reward magnitude but kept the ratio at 3:1 for the left key and the right key
• the authors found that the proportion of responses devoted to the better choice declined as the absolute values of the reward increased
• response on left key = 3 grains/pellets of food
• response on right key = 1 grain/pellet of food
• the matching law worked in this example, but then they increased the absolute value of reward
• response on left key = 30 grains/pellets of food
• response on right key = 10 grains/pellets of food
• in this example the animals responded more on the right key than the matching law would predict

Bias

• subject may have a special affinity or preference for one of the choices
• a rat may prefer the R lever over the L lever or a pigeon may prefer a red key over a green key
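
The slides give no formula for overmatching, undermatching, or bias. One common way to express all three, which is an addition here and not taken from the presentation, is the generalized matching law usually attributed to Baum: $R_a / R_b = b \, (F_a / F_b)^s$, where the sensitivity $s$ and the bias $b$ are fitted parameters. The sketch below assumes that form; the function name and the parameter values are illustrative only.

```python
def predicted_share(f_a, f_b, sensitivity=1.0, bias=1.0):
    """Share of responses on alternative a under the generalized
    matching law: Ra / Rb = bias * (Fa / Fb) ** sensitivity."""
    ratio = bias * (f_a / f_b) ** sensitivity
    return ratio / (1.0 + ratio)


# Reinforcers are earned 3:1 in favour of alternative a.
strict = predicted_share(3, 1)                   # sensitivity 1, bias 1
over = predicted_share(3, 1, sensitivity=1.5)    # overmatching
under = predicted_share(3, 1, sensitivity=0.5)   # undermatching

# Equal reinforcement, but a built-in preference for alternative a.
biased = predicted_share(1, 1, bias=2.0)

print(strict, over, under, biased)
```

With sensitivity 1 and bias 1 the expression reduces to the strict matching law. A sensitivity above 1 pushes the share on the better alternative above the matching prediction, a sensitivity below 1 pulls it toward indifference, and a bias other than 1 shifts choice even when reinforcement is equal.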

Ratio versus interval schedules

• animals do not match when given concurrent ratio schedules

Theories of Matching

• the matching law is merely a description of behavior
• it does not say why a subject behaves the way it does
• there are two main explanations of why animals match
• maximization
• melioration

Maximization

• subjects attempt to maximize the rate of reinforcement
• animals have evolved to perform in a manner that yields the greatest rate of reinforcement
• can explain why subjects match with concurrent VI-VI schedules but not with concurrent ratio schedules
• molecular and molar maximizing theories
• according to molecular theories, animals choose whichever response alternative is most likely to be reinforced at that time
• according to molar theories, animals distribute their choices to maximize reward over the long run
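
A simple molar calculation suggests why maximizing does not produce a graded split on concurrent ratio schedules. The sketch below assumes that, on a ratio schedule, the reinforcers earned are just the responses made divided by the ratio requirement; the function name, the ratio values (10 and 30), and the response total are hypothetical.

```python
def reinforcers_earned(share_a, ratio_a, ratio_b, total_responses=1000):
    """Expected reinforcers when a fraction share_a of all responses
    goes to ratio schedule a and the rest to ratio schedule b."""
    on_a = share_a * total_responses
    on_b = total_responses - on_a
    return on_a / ratio_a + on_b / ratio_b


# Try every split from 0% to 100% of responses on alternative a.
shares = [i / 10 for i in range(11)]
payoffs = {p: reinforcers_earned(p, ratio_a=10, ratio_b=30) for p in shares}
best_share = max(payoffs, key=payoffs.get)
print(best_share, payoffs[best_share])
```

Because the payoff is a straight-line function of the split, it is highest at one extreme: under these assumptions every response shifted to the leaner schedule lowers the total, so a maximizing subject would respond only on the richer ratio schedule.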

Melioration

• ‘make better’
• melioration mechanisms work on a time scale that is neither molecular nor molar
• matching behavior occurs because the subject is continuously choosing the more promising option, that is, the schedule with the momentarily higher rate of reinforcement
• subjects are continuously attempting to better their current chances of receiving reward by switching to the other choice
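
The idea that repeatedly shifting toward the momentarily better option ends in matching can be sketched with a toy model. It rests on a strong assumption that is not in the slides: each schedule delivers a fixed number of reinforcers per hour no matter how time is divided, so the local rate on an option is its reinforcers divided by the share of time spent there. The function name, the step size, and the rates (60 and 20 per hour) are hypothetical.

```python
def meliorate(rate_a, rate_b, share_a=0.5, step=0.05, iterations=2000):
    """Shift time toward whichever option has the higher local rate of
    reinforcement, and return the final share of time spent on a."""
    for _ in range(iterations):
        local_a = rate_a / share_a            # reinforcers per hour spent on a
        local_b = rate_b / (1.0 - share_a)    # reinforcers per hour spent on b
        # Move toward the option that currently looks better.
        share_a += step * (local_a - local_b) / (local_a + local_b)
        # Keep the share strictly between 0 and 1.
        share_a = min(max(share_a, 0.01), 0.99)
    return share_a


final_share = meliorate(rate_a=60, rate_b=20)
matching_share = 60 / (60 + 20)
print(final_share, matching_share)
```

The adjustment stops only when the two local rates are equal, and in this toy model that happens at the share the matching law describes (0.75 for these rates).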

Choice with Commitment

In a standard concurrent schedule of reinforcement, two (or more) response alternatives are available at the same time and the subject is free to switch from one to the other at any time

However, in some (real-life) situations, choosing one alternative makes other alternatives unavailable

In these cases, the choice may involve assessing complex, long-range goals

These types of situations can be studied in the lab using a concurrent-chain schedule of reinforcement

Concurrent-chain schedule

Pecking the left key in the choice link puts into effect reinforcement schedule A in the terminal link. Pecking the right key in the choice link puts into effect reinforcement schedule B in the terminal link.

Self-Control

Concurrent-chain schedules have been used to study ‘self-control’ in the lab

e.g., choosing a large delayed reward over an immediate small reward

With direct-choice procedures, animals often lack self-control. That is, they choose the immediate but smaller reward

With concurrent-chain procedures, animals do show self-control. That is, they choose the larger but delayed reward

Direct-choice procedure

Pigeon chooses the immediate, small reward

Concurrent-chain procedure

Pigeon chooses the schedule with the delayed, larger reward

Chapter 7

The Associative Structure of Instrumental Conditioning

Instrumental conditioning permits the development of several types of associations

The instrumental response (R) occurs in the presence of distinctive stimuli (S) and results in the delivery of the outcome (O)

• S-R
• S-O
• R-O

The S-R Association and the Law of Effect

According to Thorndike, animals form an S-R association

• an association between the stimuli present in the experimental situation and the instrumental response

Law of Effect

• according to the law of effect, the role of the reinforcer (or response outcome) is to ‘stamp in’ an association between the contextual cues (S) and the instrumental response (R)
• an important implication of the Law of Effect is that instrumental conditioning does not involve learning about the reinforcer (or response outcome)

Expectancy of Reward and the S-O Association

It seems intuitive to think that instrumental conditioning would involve the subject learning to expect the reinforcer

However, Thorndike and Skinner did not talk about the cognitive notion of an expectancy

The idea that reward expectancy may motivate instrumental behavior came from developments in Pavlovian conditioning

In Pavlovian conditioning, animals learn about stimuli that signal some important event

One way to look for reward expectancy is to consider how Pavlovian processes might be involved in instrumental conditioning

Modern Two-Process Theory

The instrumental response is motivated by two factors

• first, the presence of S comes to evoke the response directly, through a Thorndikian S-R association
• second, the instrumental response comes to be made in response to the expectancy of reward because of an S-O association
• through the S-O association, S comes to motivate the instrumental behavior by activating a central emotional state
• the implication is that the rate of an instrumental response will be modified by the presentation of a classically conditioned stimulus

Modern Two-Process Theory

Studies that evaluate modern two-process theory employ a transfer-of-control experimental design

• phase 1 = operant conditioning
• phase 2 = Pavlovian conditioning
• phase 3 = transfer phase
• the subjects are allowed to engage in the instrumental response, and the CS from phase 2 is periodically presented to observe its effect on the rate of the instrumental response

• where have we seen this before???
• CER (Conditioned Emotional Response) procedure

Evidence of R-O Associations

Neither the S-R nor the S-O association involves a direct link between the response (R) and the outcome (O), but an R-O association intuitively makes sense

A common technique for assessing R-O associations involves devaluing the reinforcer after conditioning to see if this decreases the instrumental response

Read the experiment by Colwill & Rescorla (1986) described on pp. 197-98 of the textbook