Premack's principle

Last updated September 03, 2024

The Premack principle, or the relativity theory of reinforcement, states that more probable behaviors will reinforce less probable behaviors.^[1]^[2]

Origin and description

The Premack principle was derived from David Premack's study of Cebus monkeys, which identified parameters that govern how the monkeys behave.^[3] The principle also has explanatory and predictive power when applied to humans, and it has been used by therapists practicing applied behavior analysis.

The Premack principle suggests that if a person wants to perform a given activity, the person will perform a less desirable activity to gain access to the more desirable one; that is, activities may themselves be reinforcers. An individual will be more motivated to perform a particular activity if they know that they will partake in a more desirable activity as a consequence. Stated objectively, if high-probability behaviors (more desirable behaviors) are made contingent upon lower-probability behaviors (less desirable behaviors), then the lower-probability behaviors are more likely to occur. More desirable behaviors are those that individuals spend more time doing when free to act; less desirable behaviors are those that they spend less time doing.

Just as "reward" was commonly used to alter behavior long before "reinforcement" was studied experimentally, the Premack principle has long been informally understood and used in a wide variety of circumstances. An example is a mother who says, "You have to finish your vegetables (low frequency) before you can eat any ice cream (high frequency)."

Experimental evidence

David Premack, his colleagues, and others have conducted several experiments to test the effectiveness of the Premack principle in humans. One of the earliest studies was conducted with young children. Premack gave the children two response alternatives, eating candy or playing a pinball machine, and determined which of these behaviors was more probable for each child. Some of the children preferred one activity, some the other.

In the second phase of the experiment, the children were tested with one of two procedures. In one procedure, eating was the reinforcing response and playing pinball served as the instrumental response; that is, the children had to play pinball in order to eat candy. The results were consistent with the Premack principle: only the children who preferred eating candy over playing pinball showed a reinforcement effect. In the second procedure the roles of the responses were reversed, with corresponding results: only the children who preferred playing pinball over eating candy showed a reinforcement effect. This study, among others, helps to confirm the Premack principle by showing that a high-probability activity can be an effective reinforcer for an activity that the subject is less likely to perform.^[4]
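
The prediction tested in this experiment can be sketched in a few lines of Python. This is only an illustration of the logic, not a reconstruction of Premack's procedure: the child names, the baseline times, and the function names are all hypothetical, and "more probable" is assumed to mean "more time spent on the activity during free choice".

```python
# Hypothetical free-choice baseline: seconds each child spent on each activity.
# The names and numbers are invented for illustration only.
baseline = {
    "child_a": {"candy": 300, "pinball": 120},  # prefers eating candy
    "child_b": {"candy": 90, "pinball": 330},   # prefers playing pinball
}


def predicts_reinforcement(times, reinforcing, instrumental):
    """Premack prediction: access to the reinforcing activity strengthens the
    instrumental activity only if the reinforcing one is more probable at baseline."""
    return times[reinforcing] > times[instrumental]


def children_expected_to_respond(data, reinforcing, instrumental):
    """Return the children for whom a reinforcement effect is predicted."""
    return sorted(
        child
        for child, times in data.items()
        if predicts_reinforcement(times, reinforcing, instrumental)
    )


# Procedure 1: the child must play pinball to eat candy.
print(children_expected_to_respond(baseline, "candy", "pinball"))
# Procedure 2: the child must eat candy to play pinball.
print(children_expected_to_respond(baseline, "pinball", "candy"))
```

With these made-up numbers, only the child who prefers candy is predicted to respond in the first procedure, and only the child who prefers pinball in the second, which mirrors the pattern of results described above.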

An alternative: response deprivation theory

The Premack principle may be violated if a situation or schedule of reinforcement provides much more of the high-probability behavior than of the low-probability behavior. Such observations led the team of Timberlake and Allison (1974) to propose the response deprivation hypothesis.^[5] Like the Premack principle, this hypothesis bases reinforcement of one behavior on access to another. Experimenters observe the extent to which an individual is deprived of or prevented from performing the behavior that is later made contingent on the second behavior. Reinforcement occurs only when the situation is set up so that access to the contingent response has been reduced relative to its baseline level. In effect, the subject must subsequently increase responding to make up for the "deprivation" of the contingent response. Several subsequent experiments have supported this alternative to the Premack principle.^[5]
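
The deprivation condition can be made concrete with a small Python sketch. It assumes a simple ratio schedule in which a fixed amount of the instrumental behavior earns a fixed amount of the contingent behavior, and it checks whether a subject who keeps performing the instrumental behavior at its baseline level would end up with less of the contingent behavior than at baseline. The function name, parameter names, and numbers are hypothetical and are not taken from Timberlake and Allison's paper.

```python
def is_response_deprived(baseline_instrumental, baseline_contingent,
                         required_instrumental, earned_contingent):
    """Return True if the schedule restricts the contingent response below baseline.

    Assumed schedule: every `required_instrumental` units of the instrumental
    behavior earn `earned_contingent` units of the contingent behavior.
    All quantities share one unit, e.g. minutes per session.
    """
    # Contingent access earned if instrumental responding stays at its baseline level.
    earned_at_baseline = baseline_instrumental * earned_contingent / required_instrumental
    return earned_at_baseline < baseline_contingent


# Hypothetical baseline: 10 minutes of the instrumental behavior,
# 20 minutes of the contingent behavior per session.

# Generous schedule: 1 minute of work earns 5 minutes of access.
print(is_response_deprived(10, 20, required_instrumental=1, earned_contingent=5))

# Lean schedule: 5 minutes of work earn 1 minute of access.
print(is_response_deprived(10, 20, required_instrumental=5, earned_contingent=1))
```

Under the generous schedule, baseline responding already earns more contingent access than the subject would take freely, so no deprivation arises and no reinforcement effect is predicted, even though the contingent behavior is the more probable one. Under the lean schedule, access falls below baseline, and the hypothesis predicts an increase in the instrumental behavior.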

Application to applied behavior analysis

In applied behavior analysis, the Premack principle is sometimes known as "grandma's rule", which states that making the opportunity to engage in high-frequency behavior contingent upon the occurrence of low-frequency behavior will function as a reinforcer for the low-frequency behavior.^[6] In other words, an individual must "first" engage in the desired target behavior; "then" they get to engage in something reinforcing in return. For example, to encourage a child who prefers chocolate candy to eat vegetables (low-frequency behavior), the behaviorist would make access to chocolate candy (high-frequency behavior) contingent upon eating the vegetables. In this example, the statement would be, "First eat all of your vegetables; then you can have one chocolate candy." This statement or "rule" uses a highly probable behavior or preferred event contingently to strengthen a less probable or non-preferred behavior. It is applied in many different settings, from early intervention services and the home to educational systems.

Citations

[1] Jon E. Roeckelein (1998). Dictionary of Theories, Laws, and Concepts in Psychology. Greenwood. p. 384. ISBN 0-313-30460-2.
[2] Tony Ward; D. Richard Laws & Stephen M. Hudson (2003). Sexual deviance: issues and controversies. SAGE. p. 66. ISBN 978-0-7619-2732-7.
[3] Premack, D. (1959). Toward empirical behavior laws: I. Positive reinforcement. Psychological Review, 66(4), 219-233.
[4] Michael Domjan (2010). The principles of learning and behavior (6th ed.). Belmont, CA: Wadsworth.
[5] Timberlake and Allison (1974). Response deprivation: an empirical approach to instrumental performance. Psychological Review, 81, 146-164.
[6] Cooper, J. O., Heron, T. E., & Heward, W. L. (2014). Applied behavior analysis. Hoboken, NJ: Pearson Education, Inc.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
