NOTES FOR EXAM 3 BEGIN HERE… Chapter 10: Conditioned Reinforcement
A scenario… • Imagine you are lost… • You finally stumble upon a landmark that is familiar to you • You become happy because you know how to get home from this spot • This “spot” is both a CS that elicits happiness as well as an SD for the behavior of “getting home.” • There is also a THIRD function of this stimulus • It has also served as a reinforcer for the “stumbling around” behavior that led you to it • In fact, if we consider any series of linked behaviors (like following directions or recipes, etc.), the consequence of completing each step is both a reinforcer for completing that step as well as an SD for completing the NEXT step
Conditioned Reinforcement • Conditioned reinforcement is when behavior is strengthened by consequence events that have an effect because of a learning history. • The critical aspect of this history involves a pairing between an arbitrary event and an already established reinforcer. • Once the arbitrary event increases the frequency of an operant behavior, it is called a conditioned reinforcer.
Chain Schedules and Conditioned Reinforcement • One way to investigate conditioned reinforcement is to construct sequences of behavior. • A chain schedule of reinforcement involves two or more simple schedules (CRF, FI, VI, FR, etc.) each of which is presented sequentially and is signaled by an arbitrary stimulus (each has its own SD). • Only the final or terminal link in this chain results in primary reinforcement.
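The structure of a chain schedule can be sketched in a few lines of Python. This is only an illustration under simple assumptions: every link is a fixed-ratio schedule, and the names `Link`, `run_chain`, and the stimulus labels are hypothetical, not part of the notes. It shows that finishing a non-terminal link only turns on the next SD, while finishing the terminal link delivers the primary reinforcer.

```python
from dataclasses import dataclass


@dataclass
class Link:
    sd: str      # arbitrary stimulus that signals this link (hypothetical label)
    ratio: int   # fixed-ratio requirement for this link (assumed FR for simplicity)


def run_chain(links, total_responses):
    """Log what each completed link produces as responses are emitted."""
    events = []
    position = 0   # index of the link currently in effect
    count = 0      # responses emitted so far in the current link
    for _ in range(total_responses):
        count += 1
        if count < links[position].ratio:
            continue               # requirement not yet met
        count = 0
        if position == len(links) - 1:
            # Only the terminal link ends in primary reinforcement.
            events.append("primary reinforcer")
            position = 0           # the chain starts over
        else:
            # Completing an earlier link produces the next link's SD.
            position += 1
            events.append(f"SD on: {links[position].sd}")
    return events


chain = [Link("red light", 3), Link("green light", 2)]
log = run_chain(chain, 10)
print(log)
```

In this sketch the onset of the green light is the only consequence of completing the first link, which is the event the notes treat as a conditioned reinforcer.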
Multiple Stimulus Functions • An unsignalled chain (or tandem schedule) is a sequence of two schedules (such as an FR150 -> FI 120 seconds) in which distinct SDs do not signal the different components • In equivalent tandem vs. chain schedules, performances will be BETTER on the chain than the tandem • This shows that distinct signals serve as both SDs and conditioned reinforcers.
Homogeneous and Heterogeneous Chains • Operant chains are classified as homogeneous when the topography or form of response is similar in each component, i.e., a similar response requirement is in effect in all components. • A heterogeneous chain requires different responses in each link.
Teaching a backwards Chain • For complex tasks with many steps, it is often better to teach the final step FIRST and reinforce its completion • After practicing this final unit many times and reinforcing its completion many times, ACCESS to this unit of SD -> R -> SR will now serve as an effective conditioned reinforcer for the second-to-last unit in the chain of behavior • More…
Teaching a backwards Chain • After practicing the second-to-last and final units many times, ACCESS to the SECOND-TO-LAST unit of SD -> R -> SR will now serve as an effective conditioned reinforcer for the THIRD-to-last unit in the chain of behavior • And so on! • Note that we are not doing the behavior in reverse! We are simply completing the final step first in our teaching procedure
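The teaching order described above can be listed mechanically. The sketch below is illustrative only; the function name and the sandwich steps are made-up examples. Each stage adds one earlier step, and the steps inside a stage are still performed in their normal forward order.

```python
def backward_chaining_stages(steps):
    """Return the units practiced at each teaching stage, final step first."""
    # Stage 1 is the last step alone; each later stage adds one earlier step.
    return [steps[i:] for i in range(len(steps) - 1, -1, -1)]


steps = ["get bread", "spread peanut butter", "close sandwich"]  # hypothetical task
stages = backward_chaining_stages(steps)
for number, stage in enumerate(stages, start=1):
    print(f"Stage {number}: " + " -> ".join(stage))
```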
Determinants of Conditioned Reinforcement Strength • Frequency of Primary Reinforcement paired with the conditioned reinforcer • Variability of Primary Reinforcement paired with the conditioned reinforcer • Establishing Operations • Delay to Primary Reinforcement
Delay Reduction and Conditioned Reinforcement • Delay-reduction hypothesis • Stimuli closer in time to positive reinforcement, or further in time from an aversive event, are more effective conditioned reinforcers. • Stimuli that signal no reduction in time to reinforcement (SΔ) or no period of safety from an aversive event (Save) do not function as conditioned reinforcers.
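One common way to put a number on delay reduction (usually credited to Fantino, and assumed here rather than taken from these notes) is the fraction of the total wait that a stimulus removes: (T - t) / T, where T is the average time to reinforcement from the start of the trial and t is the time still remaining when the stimulus appears. The function name and the 60-second values below are illustrative assumptions.

```python
def delay_reduction(total_delay, remaining_delay):
    """Fraction of the total wait to reinforcement that a stimulus signals away."""
    if total_delay <= 0:
        raise ValueError("total_delay must be positive")
    return (total_delay - remaining_delay) / total_delay


# Hypothetical values: reinforcement arrives 60 s after trial onset on average.
near = delay_reduction(60.0, 15.0)   # stimulus appears with 15 s left
none = delay_reduction(60.0, 60.0)   # stimulus signals no reduction at all
print(near, none)
```

Under this sketch, a stimulus with a larger value would be expected to be the more effective conditioned reinforcer, and a value of zero corresponds to a stimulus that signals no reduction in time to reinforcement.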
Concurrent-Chain Schedules of Reinforcement • Previously we talked about choice where the organism is free to switch back and forth between different response alternatives (called CONCURRENT SCHEDULES OF REINFORCEMENT) • But often in the real world, once you choose one response alternative, you lock out the opportunity to do some other behavior for a period of time • that is, choosing one response COMMITS you to that particular response for at least some period of time
Concurrent-Chain Schedules of Reinforcement • How would we study such an idea in the lab? • we could ask: which does a person prefer, working on an FR10 or a VI60s each for some set period of time? • this is a CONCURRENT CHAIN SCHEDULE • It involves two different components (an initial LINK, or menu, and a terminal LINK)
Concurrent-Chain Schedules of Reinforcement • subject is given a "menu" in which it must press a particular key to TURN ON a particular schedule of reinforcement. • There is no reinforcer given for making the initial link choice itself and the subject is given immediate access to whatever reinforcement schedule he chose • Subject must stay on that schedule for some specified time. • Then he can make a choice again. • What is our measure of choice in a concurrent chain schedule? • the proportion of times subject chooses one schedule over another
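The choice measure named above is simple to compute from a record of initial-link choices. A minimal sketch, assuming the choices are logged as a list of schedule labels (the function name and sample data are made up):

```python
from collections import Counter


def choice_proportions(initial_link_choices):
    """Proportion of initial-link choices that went to each terminal schedule."""
    counts = Counter(initial_link_choices)
    total = sum(counts.values())
    return {schedule: n / total for schedule, n in counts.items()}


choices = ["FR10", "VI60s", "FR10", "FR10"]   # hypothetical session record
proportions = choice_proportions(choices)
print(proportions)
```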
Concurrent-Chain Schedules of Reinforcement • IF we put in a delay before access to the terminal links, however, then a subject is LESS likely to choose that initial link because there is now an increased delay to reinforcement • For example, in a two-key concurrent-chain procedure with equivalent initial links but different lengths of delay to get to the terminal links, the subject should choose the key with the shorter delay more often.
Generalized Conditioned Reinforcement • A generalized conditioned reinforcer is any event or stimulus paired with, or exchangeable for, many sources of primary reinforcement. • Generalized reinforcement does not depend on deprivation or satiation for any specific reinforcer. • Generalized social reinforcement for human behavior = approval, attention, affection, praise
Tokens, Money and Generalized Reinforcement • Other conditioned reinforcers are economic since they are exchangeable for goods and services. Probably the most important such reinforcement is money. • A token economy is a set of contingencies based on token reinforcement; the contingencies specify when and under what conditions, particular forms of behavior are reinforced with tokens. Tokens are exchangeable for a variety of backup reinforcers.
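The parts of a token economy can be laid out as a small data structure. This is a rough sketch, not a clinical procedure: the class name, the behaviors, and the prices are all hypothetical. It shows contingencies that specify which behaviors earn tokens, and an exchange step that trades tokens for backup reinforcers.

```python
class TokenEconomy:
    """Toy model: behaviors earn tokens, tokens buy backup reinforcers."""

    def __init__(self, earn_rules, prices):
        self.earn_rules = earn_rules   # behavior -> tokens earned
        self.prices = prices           # backup reinforcer -> token cost
        self.balance = 0

    def record(self, behavior):
        # Behaviors that the contingencies do not name earn nothing.
        self.balance += self.earn_rules.get(behavior, 0)

    def exchange(self, item):
        # Tokens are only useful because they can be exchanged.
        cost = self.prices[item]
        if cost > self.balance:
            return False
        self.balance -= cost
        return True


economy = TokenEconomy(
    earn_rules={"made bed": 2, "finished homework": 3},
    prices={"extra recess": 4},
)
economy.record("made bed")
too_early = economy.exchange("extra recess")   # only 2 tokens so far
economy.record("finished homework")
bought = economy.exchange("extra recess")      # 5 tokens, costs 4
print(too_early, bought, economy.balance)
```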
Chapter 11: Correspondence Relations: Imitation and Rule-Governed Behavior
Correspondence Relations • People often do what others do. A child who observes an older sibling raid the cookie jar may engage in similar behavior. • This is a correspondence between the modeled behavior and the replicated behavior. • Technically, behavior of one person sets the occasion for (is an SD for) an equivalent response by the other.
Correspondence Relations Continued • There are other correspondence relations established by our culture. We often receive reinforcement if there is a correspondence between “saying” and “doing”. • A large part of socialization involves reinforcement for “correspondence between what is said and what is done.”
Correspondence Relations Continued • Other people reinforce our behavior if there is consistency (“correspondence”) between spoken words and later performance. • A minister who preaches moral conduct and lives a moral life is valued; when moral words and moral deeds do not match, people become upset and act to correct the inconsistency. (They deliver punishment!)
Imitation • Learning by observation involves doing what others do • The behavior of an observer or learner is regulated by the actions of a model. • imitation requires that the learner emit a response that could only occur by observing a model emit a similar response.
Spontaneous Imitation • Innate or spontaneous imitation is based on evolution and natural selection rather than learning experiences • Implies imitation of others may be an important adaptive behavior.
Immediate vs. Delayed Imitation • Imitation may occur only when the model is present or it may be delayed for some time after the model has been removed. • delayed imitation is more complex since it involves remembering the modeled stimulus (SD), rather than direct stimulus control.
Operant and Generalized Imitation • It is possible to teach “imitation” as an operant behavior • discriminative stimulus is behavior of the model (SDmodel), • operant is a response that matches the modeled stimulus (Rmatch), and reinforcement is verbal praise (Srsocial). • “Matching the model” is reinforced, while “non-correspondent responses” are extinguished.
Operant and Generalized Imitation • If imitation is reinforced and nonimitation is extinguished, imitation of the model will increase. • On the other hand, nonimitation will occur if imitation is extinguished and nonimitation is reinforced. • Learner learns to “do as the model does” regardless of what the form of the model is!
Operant and Generalized Imitation • Donald Baer and his associates provided a behavior analysis of imitation called generalized imitation • involves several modeled stimuli (SDs) and multiple operants (Rmatch). • In each case, what the model does sets the occasion for reinforcement of a similar response by the child; all other responses are extinguished. • This training results in a stimulus class of models and an imitative response class. The child now imitates whichever response that the model performs.
Generalized Imitation • The next step is to test for generalization of the stimulus and response class. • Baer and Sherman (1964) showed that a newly modeled stimulus would set the occasion for a novel imitative response, without any further reinforcement. • Generalized imitation accounts for the appearance of novel imitative acts in children, even when these specific responses were never reinforced.
Rules, Observational Learning, and Self-Efficacy • For Skinner, “following the rules” is behavior under the control of verbal stimuli (SDs). • That is, statements of rules, advice, maxims, or laws are discriminative stimuli that set the occasion for behavior. • Rules, as verbal descriptions, may affect observational learning.
Rule-Governed Behavior • A large part of human behavior is regulated by verbal stimuli. • The common property of these kinds of stimuli is that they describe the operating contingencies of reinforcement. • Formally, rules, instructions, advice, and laws are contingency-specifying stimuli, (they describe the SD:R→ Sr relations of everyday life.) • The term rule-governed behavior is used when the listener’s (reader’s) performance is regulated by contingency-specifying stimuli.
Rule-Governed and Contingency-Shaped Behavior • People are said to solve problems either by discovery or by instruction. • From a behavioral perspective the difference is between the direct effects of contingencies (discovery) and the indirect effects of rules (instruction). • When performance is attributed to direct exposure to reinforcement contingencies, behavior is said to be contingency-shaped. • As previously noted, performance set up by constructing and following instructions (and other verbal stimuli) is termed rule-governed behavior.
Rule-Governed and Contingency-Shaped Behavior • The importance of reinforcement contingencies in establishing and maintaining rule-following is clearly seen with ineffective rules and instructions. • When rules describe delayed and improbable events, it is necessary to find other reasons to follow them.
Instructions and Contingencies • In his discussion of rule-governed and contingency-shaped behavior, Skinner (1969) speculated that instructions may affect performance differently than the actual contingencies of reinforcement. • One way to test this idea is to expose humans to reinforcement procedures that are accurately or inaccurately described by the experimenter’s instructions. • If behavior varies with the instructions while the actual contingencies remain the same, this would be evidence for Skinner’s assertion.
Instructions and Contingencies • Instructions are complex discriminative stimuli. • Instructional control is a form of rule-governed behavior.
Language and Verbal Behavior • In contrast with the term language, verbal behavior deals with the performance of a speaker and the environmental conditions that establish and maintain such performance • Verbal behavior refers to the vocal, written and gestural performance of a speaker, writer or communicator. This behavior operates on the listener, reader or observer, who then arranges reinforcement of the verbal performance.
Speaking, Listening and the Verbal Community • Verbal behavior refers to the behavior of the speaker, writer or gesturer. • The verbal community: the practices and customary ways a given culture reinforces the behavior of a speaker
Operant Functions of Verbal Behavior: Mands • A mand is a response class of verbal operants whose form (what is said or written) is regulated by specific establishing operations (deprivation, satiation, etc.) • In lay terms, mands involve asking for something you need to happen • It is commonly said that a mand specifies its own reinforcer as in “Give me a cookie…” but such commands are only a small part of mands.
Operant Functions of Verbal Behavior: Tacts • A tact is a response class of verbal operants whose form (what is said or written) is regulated by specific nonverbal discriminative stimuli • “tact” is derived from “contact” in that tacts are verbal operants that make contact with the environment. • In lay terms, tacts involve pointing something out, commenting about something, labeling or identifying something
Does the form of the Verbal Behavior identify the type? NOPE • Behavior: “Honey, you sure look sexy tonight!” • Is this a tact or a mand? • Identifying the type of verbal behavior depends on the FUNCTION of the behavior! • What function does this statement have?
Training Verbal Operants: Mands • To teach manding, the most direct procedure is to manipulate an establishing operation (“remove the toy”), and then reinforce the verbal response (“can I have the toy?”) with the specified consequence (guess what it is!). • Sometimes called “teaching requesting”
Training Verbal Operants: Tacts • To teach tacting, a speaker must emit a verbal operant whose form (what is said) is a function of a nonverbal discriminative stimulus; reinforcement is non-specific to that stimulus. • A child comes home from preschool and when seeing her mother the child says, “Let me tell you what I learned today…” and the child names several parts of the body and points to where they are. These would be tacts that would likely be reinforced by praise and hugs from the proud parent. (Mother may need to PROMPT that tacting by the child “What did you do in school today?”)
Additional Verbal Relations: Intraverbals • An intraverbal is a verbal operant (what the listener says) controlled by a verbal discriminative stimulus (what the speaker says) but there is no one-to-one relation between the intraverbal and its SD. • If you overhear me saying, “I’ll be damned!” to which you covertly reply “I sure hope so…” your response is an intraverbal • Teaching a child ABCs: You say “ABCDEFG” and the child says “HIJK-ellamennopee” • “Free association” therapy demonstrates this when the therapist says “Mother” and you say “dominatrix” (haha!)
Additional Verbal Relations: Echoics • An echoic is a verbal operant in response to a verbal SD but with a point-to-point correspondence between the SD and operant. If you swear after hitting your thumb with a hammer (“Damn!”) and your four-year-old son subsequently repeats your expletive, his response is an echoic.
Additional Verbal Relations: Textuals • A textual is a verbal operant in which the verbal SD (written or spoken words made by another) and the response the listener makes correspond to each other but not with a formal PHYSICAL similarity. • In lay terms, you are READING aloud (or to yourself) or TAKING NOTES
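The five verbal operants covered so far differ in what controls them, not in what is said. The lookup below is a study-aid sketch of those definitions; the function name and argument labels are made up, and treating "formal similarity" as the feature that separates echoics from textuals is an assumption drawn from the contrast in the definitions above.

```python
def classify_verbal_operant(antecedent, point_to_point=False, formal_similarity=False):
    """Name the verbal operant from its controlling variables (simplified)."""
    if antecedent == "establishing operation":
        return "mand"
    if antecedent == "nonverbal SD":
        return "tact"
    if antecedent == "verbal SD":
        if not point_to_point:
            return "intraverbal"     # no one-to-one relation with the SD
        # Point-to-point correspondence: same form (echoic) or not (textual).
        return "echoic" if formal_similarity else "textual"
    raise ValueError(f"unknown antecedent: {antecedent!r}")


print(classify_verbal_operant("establishing operation"))
print(classify_verbal_operant("verbal SD", point_to_point=True, formal_similarity=True))
```

Note that the words spoken never enter the function: the same sentence could be a mand or a tact depending on the antecedent that controls it.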
Symbolic Behavior and Stimulus Equivalence • Stimulus equivalence occurs when presentation of one class of stimuli occasions responses made to other stimulus classes. • Example: Most Americans will have a specific response to the written or spoken word or image of “Osama Bin Laden.” • The word in any recognizable form or media, or the image of the person whether in cartoon caricature, photograph or video footage, will occasion the same response. • Stimulus equivalence is said to exist when reflexivity, symmetry and transitivity can be shown to be in effect between distinct stimuli.
Basic Equivalence Relations • Reflexivity (also referred to as identity matching or matching to sample): a picture of Bin Laden is matched up with an identical picture of Bin Laden (A=A). • Symmetry: stimulus A is interchangeable with stimulus B, or A=B and B=A; a picture of Bin Laden is matched up with the phrase “head of Al Qaeda” and vice versa. • Transitivity: after learning that stimulus A=B and stimulus B=C, the learner responds to A as interchangeable with or equivalent to C. If a picture of Bin Laden (stimulus A) is equivalent to “head of Al Qaeda” (stimulus B), and B is equivalent to the written words OSAMA BIN LADEN (stimulus C), then matching the picture (A) with the written words (C) shows transitivity.
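The three tests can be written out as checks on a set of observed matches. In this sketch a pair such as ("A", "B") means "the learner picks B when shown A as the sample"; the function names and the generic labels A, B, C are illustrative assumptions.

```python
from itertools import product


def is_reflexive(stimuli, matches):
    """Every stimulus is matched to itself (A=A)."""
    return all((s, s) in matches for s in stimuli)


def is_symmetric(matches):
    """Whenever A is matched to B, B is also matched to A."""
    return all((b, a) in matches for a, b in matches)


def is_transitive(matches):
    """Whenever A is matched to B and B to C, A is also matched to C."""
    return all(
        (a, d) in matches
        for (a, b), (c, d) in product(matches, repeat=2)
        if b == c
    )


def is_equivalence(stimuli, matches):
    return is_reflexive(stimuli, matches) and is_symmetric(matches) and is_transitive(matches)


stimuli = {"A", "B", "C"}
trained_only = {("A", "B"), ("B", "C")}      # just the two trained relations
full_class = set(product(stimuli, repeat=2))  # every stimulus matched to every other
print(is_equivalence(stimuli, trained_only), is_equivalence(stimuli, full_class))
```

The `trained_only` set fails all three checks, which is why equivalence has to be demonstrated with untrained test matches rather than assumed from the trained relations alone.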