Shaping (psychology)

Shaping is a conditioning paradigm used primarily in the experimental analysis of behavior. The method used is differential reinforcement of successive approximations. It was introduced by B. F. Skinner [1] with pigeons and extended to dogs, dolphins, humans, and other species. In shaping, the form of an existing response is gradually changed across successive trials toward a desired target behavior by reinforcing ever-closer approximations of that behavior. Skinner described shaping as follows:

We first give the bird food when it turns slightly in the direction of the spot from any part of the cage. This increases the frequency of such behavior. We then withhold reinforcement until a slight movement is made toward the spot. This again alters the general distribution of behavior without producing a new unit. We continue by reinforcing positions successively closer to the spot, then by reinforcing only when the head is moved slightly forward, and finally only when the beak actually makes contact with the spot. ... The original probability of the response in its final form is very low; in some cases it may even be zero. In this way we can build complicated operants which would never appear in the repertoire of the organism otherwise. By reinforcing a series of successive approximations, we bring a rare response to a very high probability in a short time. ... The total act of turning toward the spot from any point in the box, walking toward it, raising the head, and striking the spot may seem to be a functionally coherent unit of behavior; but it is constructed by a continual process of differential reinforcement from undifferentiated behavior, just as the sculptor shapes his figure from a lump of clay. [2]

Successive approximations

The successive approximations reinforced are increasingly closer approximations of the target behavior set by the trainer. As training progresses, the trainer stops reinforcing the less accurate approximations. When reinforcement for the current behavior is withheld, the learner typically goes through an extinction burst, emitting many variant behaviors in an attempt to regain reinforcement. The trainer then picks whichever of those behaviors is a closer approximation to the target and reinforces it. [3] The trainer repeats this process, with each reinforced approximation closer to the target response, until the learner achieves the intended behavior.

For example, to train a rat to press a lever, the following successive approximations might be applied:

  1. simply turning toward the lever will be reinforced
  2. only moving toward the lever will be reinforced
  3. only moving to within a specified distance from the lever will be reinforced
  4. only touching the lever with any part of the body, such as the nose, will be reinforced
  5. only touching the lever with a specified paw will be reinforced
  6. only depressing the lever partially with the specified paw will be reinforced
  7. only depressing the lever completely with the specified paw will be reinforced

The trainer starts by reinforcing all behaviors in the first category, here turning toward the lever. When the animal regularly performs that response (turning), the trainer restricts reinforcement to responses in the second category (moving toward), then the third, and so on, progressing to each more accurate approximation as the animal learns the one currently reinforced. Thus, the response gradually approximates the desired behavior until finally the target response (lever pressing) is established. At first the rat is not likely to press the lever; in the end it presses rapidly.
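The criterion-raising loop in the rat example can be sketched as a toy simulation. This is an illustrative model only, not a procedure from the cited sources: the response levels, initial weights, and learning parameters are all hypothetical. Reinforcement strengthens the emitted response, and withheld reinforcement (extinction) weakens sub-criterion responses, so the criterion can be raised step by step.

```python
import random

def shape(num_levels=7, boost=1.5, extinction=0.9, mastery=0.8, seed=0):
    """Toy model of differential reinforcement of successive approximations.

    Levels 1..num_levels stand for the approximations listed above
    (1 = turning toward the lever, ..., 7 = fully depressing it);
    level 0 is unrelated behavior.
    """
    rng = random.Random(seed)
    # At the start, crude behavior is common and the target response is rare.
    weights = [2.0 ** (num_levels - i) for i in range(num_levels + 1)]
    criterion, trials = 1, 0
    while criterion <= num_levels and trials < 100_000:
        total = sum(weights)
        if sum(weights[criterion:]) / total >= mastery:
            criterion += 1                # raise the bar: reinforce only closer forms
            continue
        level = rng.choices(range(num_levels + 1), weights=weights)[0]
        trials += 1
        if level >= criterion:
            weights[level] *= boost       # reinforcement strengthens this response
        else:
            weights[level] *= extinction  # withheld reinforcement extinguishes it
    return trials

trials_needed = shape()
print(f"target response established after {trials_needed} trials")
```

Raising `mastery` or lowering `boost` lengthens training, loosely mirroring how gradually a trainer must tighten the criterion before the current approximation is well established.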

Shaping sometimes fails. An oft-cited example is an attempt by Marian and Keller Breland (students of B. F. Skinner) to shape a pig and a raccoon to deposit a coin in a piggy bank, using food as the reinforcer. Instead of learning to deposit the coin, the pig began to root it into the ground, and the raccoon "washed" and rubbed the coins together. That is, the animals treated the coin the same way they treated food items they were preparing to eat, a pattern referred to as "food-getting" behavior. The raccoon was able to learn to deposit one coin into the box to gain a food reward, but when the contingencies were changed so that two coins were required, it could not learn the new, more complex rule. After what could be characterized as expressions of frustration, the raccoon resorted to the basic "food-getting" behaviors common to its species. These results suggest a limit on the raccoon's capacity to treat two coins as exchangeable for food, irrespective of the shaping contingencies in effect. Since the Brelands' observations were reported, many other examples of untrained responses to natural stimuli have been described; in many contexts, the stimuli are called "sign stimuli" and the related behaviors "sign tracking". [4] [5]

Practical applications

Shaping is used in training operant responses in lab animals, and in applied behavior analysis to change human or animal behaviors considered to be maladaptive or dysfunctional. It can also be used to teach behaviors to learners who refuse to do the target behavior or struggle with achieving it. This procedure plays an important role in commercial animal training. Shaping assists in "discrimination", which is the ability to tell the difference between stimuli that are and are not reinforced, and in "generalization", which is the application of a response learned in one situation to a different but similar situation. [6]

Shaping can also be used in rehabilitation settings. For example, training on parallel bars can approximate walking with a walker, [7] and shaping can teach patients to increase the time between bathroom visits.

Autoshaping

Autoshaping (sometimes called sign tracking) is any of a variety of experimental procedures used to study classical conditioning. In autoshaping, in contrast to shaping, the reward comes irrespective of the behavior of the animal. In its simplest form, autoshaping is very similar to Pavlov's salivary conditioning procedure with dogs. In Pavlov's best-known procedure, a short audible tone reliably preceded the presentation of food. The dogs naturally, unconditionally, salivated (unconditioned response) to the food (unconditioned stimulus), but through learning, conditionally, came to salivate (conditioned response) to the tone (conditioned stimulus) that predicted food. In autoshaping, a light is reliably turned on shortly before animals are given food. The animals naturally, unconditionally, display consummatory reactions to the food, but through learning, conditionally, come to direct those same consummatory actions at the conditioned stimulus that predicts food.

Autoshaping poses a conundrum for B. F. Skinner's assertion that one must employ shaping to teach a pigeon to peck a key: if an animal can shape itself, why use the laborious process of shaping? Autoshaping also appears to contradict Skinner's principle of reinforcement. During autoshaping, food comes irrespective of the animal's behavior. If reinforcement were responsible, random behaviors should increase in frequency, because they would by chance be followed by food. Nonetheless, key-pecking reliably develops in pigeons, [8] even though this behavior has never been required to produce the reward.

The clearest evidence that autoshaping is under Pavlovian rather than Skinnerian control, however, comes from the omission procedure. In that procedure, [9] food is normally scheduled for delivery following each presentation of a stimulus (often a flash of light), except when the animal actually performs a consummatory response to the stimulus, in which case food is withheld. If the behavior were under instrumental control, the animal would stop attempting to consume the stimulus, since that behavior is followed by the withholding of food. Yet animals persist in attempting to consume the conditioned stimulus for thousands of trials [10] (a phenomenon known as negative automaintenance), unable to cease responding to the conditioned stimulus even when doing so prevents them from obtaining the reward.
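One way to see why persistent responding under omission is puzzling for a reinforcement account is a small simulation. The sketch below is a hypothetical Rescorla-Wagner-style model, not a procedure from the cited studies: associative strength `v` grows on trials where the light is followed by food, the probability of pecking tracks `v`, and a peck cancels the food (the omission contingency). Responding settles at a stable nonzero rate even though every peck costs the animal its reward.

```python
import random

def omission_trials(n_trials=2000, alpha=0.1, seed=0):
    """Hypothetical Rescorla-Wagner sketch of the omission procedure."""
    rng = random.Random(seed)
    v = 0.0                               # associative strength of the light (CS)
    pecks = 0
    for _ in range(n_trials):
        p_peck = max(0.0, min(1.0, v))    # responding tracks the CS-food association
        pecked = rng.random() < p_peck
        food = not pecked                 # omission: a peck cancels the food delivery
        v += alpha * ((1.0 if food else 0.0) - v)
        pecks += pecked
    return pecks

total_pecks = omission_trials()
print(f"pecked on {total_pecks} of 2000 trials, losing food on every pecked trial")
```

Because food on non-peck trials keeps restoring `v` while pecks erode it, `v` hovers near an equilibrium (here about 0.5), so the model keeps responding indefinitely rather than learning the instrumental rule "don't peck", qualitatively matching negative automaintenance.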


References

  1. Peterson, G. B. (2004). A day of great illumination: B. F. Skinner's discovery of shaping. Journal of the Experimental Analysis of Behavior, 82: 317–328.
  2. Skinner, B. F. (1953). Science and Human Behavior. Oxford, England: Macmillan. pp. 92–93.
  3. Miltenberger, Raymond G. (2018). Behavior Modification: Principles and Procedures (6th ed.). Florida: Cengage Learning Solutions. pp. 163–180. ISBN 978-1-305-10939-1.
  4. Shettleworth, Sara J. (2010). Cognition, Evolution, and Behavior (2nd ed.). New York: Oxford.
  5. Powell, R.; Symbaluk, D.; Honey, P. (2008). Introduction to Learning and Behavior. Cengage Learning. p. 430. ISBN 9780495595281. Retrieved 2015-10-23.
  6. Engler, Barbara. Personality Theories.
  7. Miltenberger, R. (2012). Behavior Modification: Principles and Procedures (5th ed.). Wadsworth Publishing Company.
  8. Brown, P. & Jenkins, H. M. (1968). Auto-shaping of the pigeon's key peck. Journal of the Experimental Analysis of Behavior, 11: 1–8.
  9. See Sheffield, 1965; Williams & Williams, 1969.
  10. Killeen, Peter R. (2003). Complex dynamic processes in sign tracking with an omission contingency (negative automaintenance). Journal of Experimental Psychology, 29(1): 49–61.