Superrationality

In economics and game theory, a participant is considered to have superrationality (or renormalized rationality) if they have perfect rationality (and thus maximize their utility) but assume that all other players are superrational too and that a superrational individual will always come up with the same strategy as any other superrational thinker when facing the same problem. Applying this definition, a superrational player playing against a superrational opponent in a prisoner's dilemma will cooperate while a rationally self-interested player would defect.

This decision rule is not a mainstream model in game theory. It was suggested by Douglas Hofstadter in his article series and book Metamagical Themas [1] as an alternative type of rational decision making, different from the widely accepted game-theoretic one. Hofstadter provided this definition: "Superrational thinkers, by recursive definition, include in their calculations the fact that they are in a group of superrational thinkers." [1]

Unlike the supposed "reciprocating human", the superrational thinker will not always play the equilibrium that maximizes the total social utility and is thus not a philanthropist.

Prisoner's dilemma

The idea of superrationality is that two logical thinkers analyzing the same problem will think of the same correct answer. For example, if two people are both good at math and both have been given the same complicated problem to do, both will get the same right answer. In math, knowing that the two answers are going to be the same doesn't change the value of the problem, but in game theory, knowing that the answer will be the same might change the answer itself.

The prisoner's dilemma is usually framed in terms of jail sentences for criminals, but it can be stated equally well with cash prizes instead. Two players are each given the choice to cooperate (C) or to defect (D). The players choose without knowing what the other is going to do. If both cooperate, each will get $100. If they both defect, they each get $1. If one cooperates and the other defects, then the defecting player gets $150, while the cooperating player gets nothing.

The four outcomes and the payoff to each player are listed below.

                    | Player B cooperates          | Player B defects
Player A cooperates | Both get $100                | Player A: $0, Player B: $150
Player A defects    | Player A: $150, Player B: $0 | Both get $1

One valid way for the players to reason is as follows:

  1. Assuming the other player defects, if I cooperate I get nothing and if I defect I get a dollar.
  2. Assuming the other player cooperates, I get $100 if I cooperate and $150 if I defect.
  3. So whatever the other player does, my payoff is increased by defecting, if only by one dollar.

The conclusion is that the rational thing to do is to defect. This type of reasoning defines game-theoretic rationality, and two game-theoretically rational players playing this game both defect and receive a dollar each.
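The dominance argument above can be checked mechanically. The following is a minimal sketch using the cash payoffs from the table; the names `PAYOFF`, `best_reply`, and `strictly_dominant_move` are illustrative and do not come from the source.

```python
# PAYOFF[(my_move, other_move)] = my payoff in dollars, taken from the table above.
PAYOFF = {
    ("C", "C"): 100,
    ("C", "D"): 0,
    ("D", "C"): 150,
    ("D", "D"): 1,
}
MOVES = ("C", "D")


def best_reply(other_move):
    """Return the move that maximizes my payoff against a fixed opponent move."""
    return max(MOVES, key=lambda mine: PAYOFF[(mine, other_move)])


def strictly_dominant_move():
    """Return a move that beats every alternative against every opponent move, or None."""
    for mine in MOVES:
        alternatives = [m for m in MOVES if m != mine]
        if all(
            PAYOFF[(mine, other)] > PAYOFF[(alt, other)]
            for other in MOVES
            for alt in alternatives
        ):
            return mine
    return None


# Steps 1 and 2 of the reasoning, followed by the conclusion in step 3.
print(best_reply("D"), best_reply("C"), strictly_dominant_move())
```

With these payoffs, defecting is the best reply to either move, which is what makes it the dominant choice in the conventional analysis.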

Superrationality is an alternative method of reasoning. First, it is assumed that the answer to a symmetric problem will be the same for all the superrational players. Thus the sameness is taken into account before knowing what the strategy will be. The strategy is found by maximizing the payoff to each player, assuming that they all use the same strategy. Since the superrational player knows that the other superrational player will do the same thing, whatever that might be, there are only two choices for two superrational players. Both will cooperate or both will defect depending on the value of the superrational answer. Thus the two superrational players will both cooperate since this answer maximizes their payoff. Two superrational players playing this game will each walk away with $100.
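As a rough sketch of this procedure (restricted to pure strategies, and with illustrative names that are not from the source), the symmetry constraint is imposed first, so only the outcomes where both players make the same move are compared:

```python
# PAYOFF[(my_move, other_move)] = my payoff in dollars, taken from the table above.
PAYOFF = {
    ("C", "C"): 100,
    ("C", "D"): 0,
    ("D", "C"): 150,
    ("D", "D"): 1,
}
MOVES = ("C", "D")


def superrational_pure_move():
    """Pick the move that maximizes each player's payoff when everyone plays that same move."""
    return max(MOVES, key=lambda move: PAYOFF[(move, move)])


move = superrational_pure_move()
print(move, PAYOFF[(move, move)])
```

The off-diagonal outcomes (one player cooperating, the other defecting) never enter the comparison, because the assumption of identical strategies rules them out before the maximization is carried out.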

Note that a superrational player playing against a game-theoretic rational player will defect, since the strategy only assumes that the superrational players will agree. A superrational player playing against a player of uncertain superrationality will sometimes defect and sometimes cooperate, based on the probability of the other player being superrational.
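One way to make the dependence on that probability concrete is the toy model below. It is only a sketch under assumptions that are not in the source: with probability `q` the other player is superrational and ends up making the same move as the reasoner, and otherwise the other player is game-theoretically rational and defects. The function names are hypothetical.

```python
def expected_payoffs(q, reward=100, punishment=1, sucker=0):
    """Expected payoffs of cooperating and of defecting in the toy model.

    Assumption (not from the source): with probability q the other player
    mirrors my move; with probability 1 - q the other player defects.
    """
    cooperate = q * reward + (1 - q) * sucker
    defect = q * punishment + (1 - q) * punishment
    return cooperate, defect


def choose(q):
    """Cooperate only if it has the strictly higher expected payoff."""
    cooperate, defect = expected_payoffs(q)
    return "C" if cooperate > defect else "D"


for q in (0.0, 0.005, 0.5, 1.0):
    print(q, choose(q))
```

Under these assumptions, cooperation is chosen once `q` is large enough for the expected payoff of mutual cooperation to exceed the sure dollar from defecting; with `q = 0` the model reduces to the case of facing a game-theoretic rational player.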

Although standard game theory assumes common knowledge of rationality, it does so in a different way. The game-theoretic analysis maximizes payoffs by allowing each player to change strategies independently of the others, even though in the end, it assumes that the answer in a symmetric game will be the same for all. This is the definition of a game-theoretic Nash equilibrium, which defines a stable strategy as one where no player can improve the payoffs by unilaterally changing course. The superrational equilibrium in a symmetric game is one where all the players' strategies are forced to be the same before the maximization step. (There is no agreed-upon extension of the concept of superrationality to asymmetric games.)
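The unilateral-deviation test behind the Nash equilibrium can be sketched as follows for the payoffs used earlier. This is an illustrative enumeration of pure-strategy profiles only, and the helper name `is_nash` is an assumption for this example.

```python
from itertools import product

# PAYOFF[(my_move, other_move)] = my payoff in dollars; the game is symmetric,
# so the same table serves both players.
PAYOFF = {
    ("C", "C"): 100,
    ("C", "D"): 0,
    ("D", "C"): 150,
    ("D", "D"): 1,
}
MOVES = ("C", "D")


def is_nash(a, b):
    """True if neither player can gain by changing only their own move."""
    a_cannot_improve = all(PAYOFF[(a, b)] >= PAYOFF[(alt, b)] for alt in MOVES)
    b_cannot_improve = all(PAYOFF[(b, a)] >= PAYOFF[(alt, a)] for alt in MOVES)
    return a_cannot_improve and b_cannot_improve


nash_profiles = [profile for profile in product(MOVES, repeat=2) if is_nash(*profile)]
print(nash_profiles)
```

Here each player's move is varied while the other's is held fixed, which is exactly the step that the superrational analysis disallows by forcing the strategies to be equal before maximizing.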

Some argue that superrationality implies a kind of magical thinking in which each player supposes that their decision to cooperate will cause the other player to cooperate, even though there is no communication. Hofstadter points out that the concept of "choice" doesn't apply when the player's goal is to figure something out, and that the decision does not cause the other player to cooperate; rather, the same logic leads to the same answer independent of communication or cause and effect. This debate is over whether it is reasonable for human beings to act in a superrational manner, not over what superrationality means. It is similar to arguments about whether it is reasonable for humans to act in a "rational" manner as described by game theory, wherein they can figure out what other players will do or have done by asking themselves "what would I do if I were them?" and applying backward induction and iterated elimination of dominated strategies.

Probabilistic strategies

For simplicity, the foregoing account of superrationality ignored mixed strategies: the possibility that the best choice could be to flip a coin, or more generally to choose different outcomes with some probability. In the prisoner's dilemma above, it is superrational to cooperate with probability 1 even when mixed strategies are admitted, because the average payoff when one player cooperates and the other defects ($75 each) is no greater than the payoff when both cooperate ($100 each). Introducing any probability of defecting therefore only lowers the expected payout, not least by creating a risk that both defect. But in some cases, the superrational strategy is mixed.

For example, suppose the payoffs are as follows:

CC – $100/$100
CD – $0/$1,000,000
DC – $1,000,000/$0
DD – $1/$1

Here defecting has a huge reward, and the superrational strategy is to defect with a probability of 499,900/999,899, or a little over 49.995%. As the reward increases to infinity, the probability approaches 1/2, and the losses from adopting the simpler strategy of defecting with probability 1/2 (which are already minimal) approach 0. In a less extreme example, if the lone defector received $400 and the lone cooperator $0, the superrational mixed strategy would be to defect with probability 100/299, or about 1/3.
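These probabilities can be reproduced with exact arithmetic. The sketch below assumes both players defect independently with the same probability p and maximizes the shared expected payoff E(p) = R(1-p)^2 + (T+S)p(1-p) + Pp^2, where R, T, S, and P are labels introduced here (not used in the source) for the mutual-cooperation, lone-defector, lone-cooperator, and mutual-defection payoffs.

```python
from fractions import Fraction


def superrational_defect_probability(R, T, S, P):
    """Defection probability in [0, 1] that maximizes the expected payoff
    when both players independently defect with the same probability."""
    R, T, S, P = (Fraction(x) for x in (R, T, S, P))

    def expected(p):
        return R * (1 - p) ** 2 + (T + S) * p * (1 - p) + P * p ** 2

    # E(p) is quadratic in p, so the maximum is at an endpoint or at the vertex.
    candidates = [Fraction(0), Fraction(1)]
    curvature = R + P - (T + S)  # coefficient of p^2 in E(p)
    if curvature != 0:
        vertex = (2 * R - (T + S)) / (2 * curvature)
        if 0 < vertex < 1:
            candidates.append(vertex)
    return max(candidates, key=expected)


print(superrational_defect_probability(100, 1_000_000, 0, 1))
print(superrational_defect_probability(100, 400, 0, 1))
print(superrational_defect_probability(100, 150, 0, 1))
```

Using `Fraction` avoids rounding, so the results can be compared directly with the fractions quoted in the text; with the original $150 payoff the maximum falls at the endpoint p = 0, meaning pure cooperation.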

In similar situations with more players, using a randomising device can be essential. One example discussed by Hofstadter is the platonia dilemma: an eccentric trillionaire contacts 20 people and tells them that if one and only one of them sends him or her a telegram (assumed to cost nothing) by noon the next day, that person will receive a billion dollars. If the trillionaire receives more than one telegram, or none at all, no one will get any money, and communication between players is forbidden. In this situation, the superrational thing to do (if it is known that all 20 are superrational) is to send a telegram with probability p = 1/20; that is, each recipient essentially rolls a 20-sided die and sends a telegram only if it comes up "1". This maximizes the probability that exactly one telegram is received.
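The claim that p = 1/20 maximizes the chance of exactly one telegram can be checked numerically. The sketch below assumes the 20 players decide independently with the same probability p, so the chance of exactly one telegram is 20·p·(1−p)^19; the grid search and the function name are illustrative choices, not part of Hofstadter's account.

```python
def prob_exactly_one(p, n=20):
    """Probability that exactly one of n independent players sends a telegram."""
    return n * p * (1 - p) ** (n - 1)


# Search a grid of probabilities 0.000, 0.001, ..., 1.000 for the best value of p.
grid = [k / 1000 for k in range(1001)]
best_p = max(grid, key=prob_exactly_one)

print(best_p, prob_exactly_one(best_p))
```

Even at the optimum, the probability that the prize is awarded at all is only a little under 38%, so most of the time nobody wins.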

Notice, though, that this is not the solution in the conventional game-theoretical analysis. Twenty game-theoretically rational players would each send in telegrams and therefore receive nothing. This is because sending telegrams is the dominant strategy: a player who sends telegrams has a chance of receiving money, but a player who sends none cannot get anything. (If all telegrams were guaranteed to arrive, they would only send one, and no one would expect to get any money.)


The question of whether to cooperate in a one-shot Prisoner's Dilemma in some circumstances has also come up in the decision theory literature sparked by Newcomb's problem. Causal decision theory suggests that superrationality is irrational, while evidential decision theory endorses lines of reasoning similar to superrationality and recommends cooperation in a Prisoner's Dilemma against a similar opponent. [2] [3]

Program equilibrium has been proposed as a mechanistic model of superrationality. [4] [5] [6]

References

  1. Hofstadter, Douglas (June 1983). "Dilemmas for Superrational Thinkers, Leading Up to a Luring Lottery". Scientific American. 248 (6). Reprinted in: Hofstadter, Douglas (1985). Metamagical Themas. Basic Books. pp. 737–755. ISBN 0-465-04566-9.
  2. Lewis, David (1979). "Prisoners' Dilemma is a Newcomb Problem". Philosophy and Public Affairs. 8 (3): 235–240. doi:10.1093/0195036468.003.0011. JSTOR 2265034.
  3. Brams, Steven J. (1975). "Newcomb's Problem and Prisoners' Dilemma". The Journal of Conflict Resolution. 19 (4): 596–612.
  4. Howard, J.V. (May 1988). "Cooperation in the Prisoner's Dilemma". Theory and Decision. 24 (3): 203–213. doi:10.1007/BF00148954.
  5. Barasz, M.; Christiano, P.; Fallenstein, B.; Herreshoff, M.; LaVictoire, P.; Yudkowsky, E. (2014). "Robust Cooperation in the Prisoner's Dilemma: Program Equilibrium via Provability Logic". arXiv:1401.5577 [cs.GT].
  6. Oesterheld, Caspar; Treutlein, Johannes; Grosse, Roger; Conitzer, Vincent; Foerster, Jakob (2023). "Similarity-based Cooperative Equilibrium". Proceedings of the Neural Information Processing Systems (NeurIPS).