# Perfect Bayesian equilibrium

- A solution concept in game theory
- Relationship: subset of Bayesian Nash equilibrium
- Proposed by: Cho and Kreps [citation needed]
- Used for: dynamic Bayesian games
- Example: signaling game

In game theory, a Perfect Bayesian Equilibrium (PBE) is an equilibrium concept relevant for dynamic games with incomplete information (sequential Bayesian games). It is a refinement of Bayesian Nash equilibrium (BNE). A PBE has two components, strategies and beliefs:


• The strategy of a player in a given information set determines how that player acts in that information set. The action may depend on the history, as in a sequential game.
• The belief of a player in a given information set determines which node in that information set the player believes they are at. The belief is a probability distribution over the nodes in the information set (in particular, it may be a probability distribution over the possible types of the other players). Formally, a belief system is an assignment of probabilities to every node in the game such that the probabilities in each information set sum to 1.

The strategies and beliefs should satisfy the following conditions:

• Sequential rationality: each strategy should be optimal in expectation, given the beliefs.
• Consistency: each belief should be updated according to the strategies and Bayes' rule on every path of positive probability (on paths of zero probability, known as off-equilibrium paths, the beliefs can be arbitrary).
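
The consistency condition can be sketched in code. Below is a minimal Python sketch of a Bayes-consistent belief update; the function name, the two-type prior, and the mixed strategy are illustrative assumptions, not taken from the text:

```python
# Bayes-consistent belief at the information set reached by `action`,
# given a prior over types and each type's (possibly mixed) strategy.
def bayes_posterior(prior, strategy, action):
    """Return P(type | action), or None on a zero-probability path,
    where PBE allows the belief to be arbitrary."""
    reach = sum(prior[t] * strategy[t].get(action, 0.0) for t in prior)
    if reach == 0:
        return None  # off-equilibrium path: Bayes' rule does not apply
    return {t: prior[t] * strategy[t].get(action, 0.0) / reach for t in prior}

# Illustrative numbers: a friend always gives, an enemy gives half the time.
prior = {"friend": 0.3, "enemy": 0.7}
strategy = {"friend": {"gift": 1.0}, "enemy": {"gift": 0.5, "no_gift": 0.5}}
post = bayes_posterior(prior, strategy, "gift")
# P(friend | gift) = 0.3 / (0.3 + 0.35) ≈ 0.462
```

Note how the function returns `None` off the equilibrium path: that is exactly the case where the definition allows any belief.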

A PBE is always an NE, but may not be a subgame perfect equilibrium (SPE).

## PBE in signaling games

A signaling game is the simplest kind of dynamic Bayesian game. There are two players: one of them (the "receiver") has only one possible type, and the other (the "sender") has several possible types. The sender plays first, then the receiver.

To calculate a PBE in a signaling game, we consider two kinds of equilibria: a separating equilibrium and a pooling equilibrium. In a separating equilibrium each sender-type plays a different action, so the sender's action gives information to the receiver; in a pooling equilibrium, all sender-types play the same action, so the sender's action gives no information to the receiver.

Consider the following game: [1]

• The sender has two possible types: either a "friend" (with a priori probability ${\displaystyle p}$) or an "enemy" (with a priori probability ${\displaystyle 1-p}$). Each type has two strategies: either give a gift, or not give.
• The receiver has only one type, and two strategies: either accept the gift, or reject it.
• The sender's utility is 1 if their gift is accepted, -1 if their gift is rejected, and 0 if they do not give any gift.
• If the sender is a friend, then the receiver's utility is 1 (if they accept) or 0 (if they reject).
• If the sender is an enemy, then the receiver's utility is -1 (if they accept) or 0 (if they reject).

To analyze PBE in this game, let's look first at the following potential separating equilibria:

1. The sender's strategy is: a friend gives and an enemy does not give. The receiver's beliefs are updated accordingly: if they receive a gift, they know the sender is a friend; otherwise, they know the sender is an enemy. So the receiver's strategy is: accept. This is NOT an equilibrium, since the sender's strategy is not optimal: an enemy sender can increase their payoff from 0 to 1 by sending a gift.
2. The sender's strategy is: a friend does not give and an enemy gives. The receiver's beliefs are updated accordingly: if they receive a gift, they know the sender is an enemy; otherwise, they know the sender is a friend. The receiver's strategy is: reject. Again, this is NOT an equilibrium, since the sender's strategy is not optimal: an enemy sender can increase their payoff from -1 to 0 by not sending a gift.

We conclude that in this game, there is no separating equilibrium.
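
The two failed profiles can be checked mechanically with a deviation test. A small Python sketch, using the payoff numbers from the game above (the function and variable names are illustrative):

```python
# Sender payoff in the gift game: 1 if a gift is accepted, -1 if rejected,
# 0 if no gift is given.
def sender_payoff(gives, receiver_accepts):
    if not gives:
        return 0
    return 1 if receiver_accepts else -1

# Profile 1: friend gives, enemy does not; the receiver accepts any gift.
# The enemy can deviate from "not give" (payoff 0) to "give" (payoff 1).
enemy_stays = sender_payoff(gives=False, receiver_accepts=True)
enemy_deviates = sender_payoff(gives=True, receiver_accepts=True)
profile1_ok = enemy_stays >= enemy_deviates  # False: deviation is profitable

# Profile 2: friend does not give, enemy gives; the receiver rejects any gift.
# The enemy can deviate from "give" (payoff -1) to "not give" (payoff 0).
enemy_stays2 = sender_payoff(gives=True, receiver_accepts=False)
enemy_deviates2 = sender_payoff(gives=False, receiver_accepts=False)
profile2_ok = enemy_stays2 >= enemy_deviates2  # False: deviation is profitable
```

In both candidate profiles the enemy type has a profitable deviation, which is exactly why no separating equilibrium exists.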

Now, let's look at the following potential pooling equilibria:

1. The sender's strategy is: always give. The receiver's beliefs are not updated: they still hold the a priori belief that the sender is a friend with probability ${\displaystyle p}$ and an enemy with probability ${\displaystyle 1-p}$. Their expected payoff from accepting is ${\displaystyle 2p-1}$, so they accept if-and-only-if ${\displaystyle p\geq 1/2}$. So this is a PBE (a best response for both sender and receiver) if-and-only-if the a priori probability of being a friend satisfies ${\displaystyle p\geq 1/2}$.
2. The sender's strategy is: never give. Here, the receiver's belief upon receiving a gift can be arbitrary, since receiving a gift is an event with probability 0, so Bayes' rule does not apply. For example, suppose the receiver's belief upon receiving a gift is that the sender is a friend with probability 0.2 (or any other number less than 0.5). Then the receiver's strategy is: reject. This is a PBE regardless of the a priori probability. Both the sender and the receiver get an expected payoff of 0, and neither can improve their expected payoff by deviating.

To summarize:

• If ${\displaystyle p\geq 1/2}$, then there are two PBEs: either the sender always gives and the receiver accepts, or the sender never gives and the receiver rejects.
• If ${\displaystyle p<1/2}$, then there is only one PBE: the sender never gives and the receiver rejects. This PBE is not Pareto efficient, but that is inevitable, since the sender cannot reliably signal their type.
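
The summary follows directly from the receiver's expected payoff. A short Python sketch of the pooling condition (the function name is illustrative):

```python
# In the "always give" pooling profile, the receiver keeps the prior belief,
# so accepting yields p*1 + (1-p)*(-1) = 2p - 1, versus 0 from rejecting.
def pooling_always_give_is_pbe(p):
    accept = p * 1 + (1 - p) * (-1)  # expected payoff from accepting
    return accept >= 0               # accept iff 2p - 1 >= 0, i.e. p >= 1/2

# The "never give" pooling profile is a PBE for every prior p, supported by
# an off-path belief that a gift likely comes from an enemy.
```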

In the following example, the set of PBEs is strictly smaller than the set of SPEs and BNEs. It is a variant of the above gift-game, with the following change to the receiver's utility:

• If the sender is a friend, then the receiver's utility is 1 (if they accept) or 0 (if they reject).
• If the sender is an enemy, then the receiver's utility is 0 (if they accept) or -1 (if they reject).

Note that in this variant, accepting is a dominant strategy for the receiver.

As in the first example, there is no separating equilibrium. Let's look at the following potential pooling equilibria:

1. The sender's strategy is: always give. The receiver's beliefs are not updated: they still hold the a priori belief that the sender is a friend with probability ${\displaystyle p}$ and an enemy with probability ${\displaystyle 1-p}$. Their payoff from accepting is always higher than from rejecting, so they accept (regardless of the value of ${\displaystyle p}$). This is a PBE: it is a best response for both sender and receiver.
2. The sender's strategy is: never give. Suppose the receiver's belief upon receiving a gift is that the sender is a friend with probability ${\displaystyle q}$, where ${\displaystyle q}$ is any number in ${\displaystyle [0,1]}$. Regardless of ${\displaystyle q}$, the receiver's optimal strategy is: accept. This is NOT a PBE, since the sender can improve their payoff from 0 to 1 by giving a gift.
3. The sender's strategy is: never give, and the receiver's strategy is: reject. This is NOT a PBE, since for any belief of the receiver, rejecting is not a best-response.

Note that option 3 is a Nash equilibrium! If we ignore beliefs, then rejecting can be considered a best response for the receiver, since it does not affect their payoff (there is no gift anyway). Moreover, option 3 is even an SPE, since the only subgame here is the entire game! Such implausible equilibria can arise also in games with complete information, where they can be eliminated by applying subgame perfect Nash equilibrium. However, Bayesian games often contain non-singleton information sets, and since subgames must contain complete information sets, sometimes the only subgame is the entire game, so every Nash equilibrium is trivially subgame perfect. Even if a game does have more than one subgame, the inability of subgame perfection to cut through information sets can result in implausible equilibria not being eliminated.

To summarize: in this variant of the gift game, there are two SPEs: either the sender always gives and the receiver always accepts, or the sender never gives and the receiver always rejects. Of these, only the first is a PBE; the second is not a PBE, since it cannot be supported by any belief system.
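
The claim that "reject" cannot be supported by any belief can be checked numerically. A minimal Python sketch for the variant's receiver payoffs, where ${\displaystyle q}$ is the receiver's belief that the sender is a friend upon receiving a gift:

```python
# Variant payoffs: vs a friend the receiver gets 1 (accept) or 0 (reject);
# vs an enemy, 0 (accept) or -1 (reject).
def receiver_payoffs(q):
    accept = q * 1 + (1 - q) * 0     # = q
    reject = q * 0 + (1 - q) * (-1)  # = q - 1
    return accept, reject

# Accepting beats rejecting for every belief q in [0, 1], so "reject"
# is never sequentially rational at the gift information set.
accept_dominates = all(
    receiver_payoffs(q / 100)[0] > receiver_payoffs(q / 100)[1]
    for q in range(101)
)
```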

### More examples

For further examples, see signaling game#Examples. See also [2] for more examples.

## PBE in multi-stage games

A multi-stage game is a sequence of simultaneous games played one after the other. These games may be identical (as in repeated games) or different.

### Repeated public-good game

**Public good game**

|                 | Build | Don't build |
|-----------------|-------|-------------|
| **Build**       | ${\displaystyle 1-C_{1}}$, ${\displaystyle 1-C_{2}}$ | ${\displaystyle 1-C_{1}}$, 1 |
| **Don't build** | 1, ${\displaystyle 1-C_{2}}$ | 0, 0 |

The following game [3]:section 6.2 is a simple representation of the free-rider problem. There are two players, each of whom can either build a public good or not build it. Each player gains 1 if the public good is built and 0 if not; in addition, if player ${\displaystyle i}$ builds the public good, they have to pay a cost of ${\displaystyle C_{i}}$. The costs are private information: each player knows their own cost but not the other's. It is only known that each cost is drawn independently at random from some probability distribution. This makes the game a Bayesian game.

In the one-stage game, each player builds if-and-only-if their cost is smaller than their expected gain from building. The expected gain from building is exactly 1 times the probability that the other player does NOT build. In equilibrium, for every player ${\displaystyle i}$, there is a threshold cost ${\displaystyle C_{i}^{*}}$, such that the player contributes if-and-only-if their cost is less than ${\displaystyle C_{i}^{*}}$. This threshold cost can be calculated based on the probability distribution of the players' costs. For example, if the costs are distributed uniformly on ${\displaystyle [0,2]}$, then there is a symmetric equilibrium in which the threshold cost of both players is 2/3. This means that a player whose cost is between 2/3 and 1 will not contribute, even though their cost is below the benefit, because of the possibility that the other player will contribute.
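
The 2/3 threshold can be recovered numerically. A small Python sketch under the text's assumption of costs uniform on ${\displaystyle [0,2]}$; the fixed-point iteration is one standard way to solve ${\displaystyle C^{*}=1-C^{*}/2}$, and the code names are illustrative:

```python
# Symmetric one-stage equilibrium: build iff cost < c*, where c* equals the
# expected gain from building, i.e. 1 * P(other player's cost > c*).
def expected_gain(threshold, high=2.0):
    return 1.0 * (1.0 - threshold / high)  # P(cost > threshold), Uniform[0, high]

c = 1.0  # initial guess
for _ in range(100):
    c = expected_gain(c)  # iterate c <- 1 - c/2
# the iteration is a contraction and converges to the fixed point c* = 2/3
```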

Now, suppose that this game is repeated twice. [3]:section 8.2.3 The two plays are independent: each day the players decide simultaneously whether to build the public good that day, get a payoff of 1 if the good is built that day, and pay their cost if they built that day. The only connection between the games is that, by playing on the first day, the players may reveal some information about their costs, and this information might affect play on the second day.

We are looking for a symmetric PBE. Denote by ${\displaystyle {\hat {c}}}$ the threshold cost of both players in day 1 (so in day 1, each player builds if-and-only-if their cost is at most ${\displaystyle {\hat {c}}}$). To calculate ${\displaystyle {\hat {c}}}$, we work backwards and analyze the players' actions in day 2. Their actions depend on the history (= the two actions in day 1), and there are three options:

1. In day 1, no player built. So now both players know that their opponent's cost is above ${\displaystyle {\hat {c}}}$. They update their belief accordingly, and conclude that there is a smaller chance that their opponent will build in day 2. Therefore, they increase their threshold cost, and the threshold cost in day 2 is ${\displaystyle c^{00}>{\hat {c}}}$.
2. In day 1, both players built. So now both players know that their opponent's cost is below ${\displaystyle {\hat {c}}}$. They update their belief accordingly, and conclude that there is a larger chance that their opponent will build in day 2. Therefore, they decrease their threshold cost, and the threshold cost in day 2 is ${\displaystyle c^{11}<{\hat {c}}}$.
3. In day 1, exactly one player built; suppose it is player 1. So now, it is known that the cost of player 1 is below ${\displaystyle {\hat {c}}}$ and the cost of player 2 is above ${\displaystyle {\hat {c}}}$. There is an equilibrium in which the actions in day 2 are identical to the actions in day 1 - player 1 builds and player 2 does not build.

It is possible to calculate the expected payoff of the "threshold player" (a player with cost exactly ${\displaystyle {\hat {c}}}$) in each of these situations. Since the threshold player should be indifferent between contributing and not contributing, it is possible to calculate the day-1 threshold cost ${\displaystyle {\hat {c}}}$. It turns out that this threshold is lower than ${\displaystyle C_{i}^{*}}$, the threshold in the one-stage game. This means that, in the two-stage game, the players are less willing to build than in the one-stage game. Intuitively, the reason is that, when a player does not contribute on the first day, they make the other player believe their cost is high, and this makes the other player more willing to contribute on the second day.

### Jump-bidding

In an open-outcry English auction, the bidders can raise the current price in small steps (e.g. by \$1 each time). However, jump bidding often occurs: some bidders raise the current price by much more than the minimal increment. One explanation for this is that it serves as a signal to the other bidders. There is a PBE in which each bidder jumps if-and-only-if their value is above a certain threshold. See Jump bidding#signaling.


## References

1. James Peck. "Perfect Bayesian Equilibrium" (PDF). Ohio State University. Retrieved 2 September 2016.
2. Zack Grossman. "Perfect Bayesian Equilibrium" (PDF). University of California. Retrieved 2 September 2016.
3. Fudenberg, Drew; Tirole, Jean (1991). Game Theory. Cambridge, Massachusetts: MIT Press. ISBN 9780262061414.