Alpha–beta pruning

Class: Search algorithm
Worst-case performance: O(b^d)
Best-case performance: O(b^(d/2))

Alpha–beta pruning is a search algorithm that seeks to decrease the number of nodes that are evaluated by the minimax algorithm in its search tree. It is an adversarial search algorithm used commonly for machine playing of two-player games (Tic-tac-toe, Chess, Go, etc.). It stops evaluating a move when at least one possibility has been found that proves the move to be worse than a previously examined move. Such moves need not be evaluated further. When applied to a standard minimax tree, it returns the same move as minimax would, but prunes away branches that cannot possibly influence the final decision. [1]


History

Allen Newell and Herbert A. Simon, who used what John McCarthy calls an "approximation" [2] in 1958, wrote that alpha–beta "appears to have been reinvented a number of times". [3] Arthur Samuel had an early version for a checkers simulation. Richards, Timothy Hart, Michael Levin and/or Daniel Edwards also invented alpha–beta independently in the United States. [4] McCarthy proposed similar ideas during the Dartmouth workshop in 1956 and suggested it to a group of his students, including Alan Kotok at MIT in 1961. [5] Alexander Brudno independently conceived the alpha–beta algorithm, publishing his results in 1963. [6] Donald Knuth and Ronald W. Moore refined the algorithm in 1975. [7] [8] Judea Pearl proved its optimality in terms of expected running time for trees with randomly assigned leaf values in two papers. [9] [10] The optimality of the randomized version of alpha–beta was shown by Michael Saks and Avi Wigderson in 1986. [11]

Core idea

The algorithm maintains two values, alpha and beta, which respectively represent the minimum score that the maximizing player is assured of and the maximum score that the minimizing player is assured of. Initially, alpha is negative infinity and beta is positive infinity, i.e. both players start with their worst possible score. Whenever the maximum score that the minimizing player (i.e. the "beta" player) is assured of becomes less than the minimum score that the maximizing player (i.e., the "alpha" player) is assured of (i.e. beta < alpha), the maximizing player need not consider further descendants of this node, as they will never be reached in the actual play.

To illustrate this with a real-life example, suppose somebody is playing chess, and it is their turn. Move "A" will improve the player's position. The player continues to look for moves to make sure a better one hasn't been missed. Move "B" is also a good move, but the player then realizes that it will allow the opponent to force checkmate in two moves. Thus, other outcomes from playing move B no longer need to be considered, since the opponent can force a win. The maximum score that the opponent could force after move "B" is negative infinity: a loss for the player. This is less than the minimum score that was previously found, since move "A" does not result in a forced loss in two moves.

Improvements over naive minimax

An illustration of alpha–beta pruning. The grayed-out subtrees don't need to be explored (when moves are evaluated from left to right), since it is known that the group of subtrees as a whole yields the value of an equivalent subtree or worse, and as such cannot influence the final result. The max and min levels represent the turn of the player and the adversary, respectively.

The benefit of alpha–beta pruning lies in the fact that branches of the search tree can be eliminated. This way, the search time can be limited to the 'more promising' subtree, and a deeper search can be performed in the same time. Like its predecessor, it belongs to the branch and bound class of algorithms. The optimization reduces the effective depth to slightly more than half that of simple minimax if the nodes are evaluated in an optimal or near optimal order (best choice for side on move ordered first at each node).

With an (average or constant) branching factor of b, and a search depth of d plies, the maximum number of leaf node positions evaluated (when the move ordering is pessimal) is O(b×b×...×b) = O(b^d), the same as a simple minimax search. If the move ordering for the search is optimal (meaning the best moves are always searched first), the number of leaf node positions evaluated is about O(b×1×b×1×...×b) for odd depth and O(b×1×b×1×...×1) for even depth, or O(b^(d/2)) = O(√(b^d)). In the latter case, where the ply of a search is even, the effective branching factor is reduced to its square root; equivalently, the search can go twice as deep with the same amount of computation. [12] The explanation of b×1×b×1×... is that all the first player's moves must be studied to find the best one, but for each, only the second player's best move is needed to refute all but the first (and best) first player move; alpha–beta ensures no other second player moves need be considered. When nodes are considered in a random order (i.e., the algorithm randomizes), asymptotically, the expected number of nodes evaluated in uniform trees with binary leaf values is Θ(((b−1+√(b²+14b+1))/4)^d). [11] For the same trees, when the leaf values are assigned independently of each other, with zero and one equally probable, the expected number of nodes evaluated is much smaller than the work done by the randomized algorithm mentioned above, and alpha–beta is again optimal for such random trees. [9] When the leaf values are chosen independently of each other, uniformly at random from an interval, the expected number of nodes evaluated grows like Θ((b/log b)^d) in the limit, [10] which is again optimal for these kinds of random trees; the actual work for "small" values of d is better approximated by b^(0.925·d^0.747). [10] [9]
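The best-case arithmetic can be checked directly. Knuth and Moore showed that with perfect move ordering, alpha–beta examines b^⌈d/2⌉ + b^⌊d/2⌋ − 1 leaf positions. The following is a quick Python sketch comparing this count with plain minimax (illustrative only):

import math

def leaves_minimax(b, d):
    # Plain minimax visits every leaf of the uniform b-ary, depth-d tree.
    return b ** d

def leaves_alphabeta_best_case(b, d):
    # Knuth & Moore: minimal number of leaves examined with perfect ordering.
    return b ** math.ceil(d / 2) + b ** math.floor(d / 2) - 1

for b, d in [(40, 4), (40, 8)]:
    print(b, d, leaves_minimax(b, d), leaves_alphabeta_best_case(b, d))
# For b = 40, d = 8: about 6.6e12 leaves shrink to 5,119,999 (about 5.1e6),
# roughly the cost of a depth-4 minimax search, matching the square-root claim.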

An animated pedagogical example that attempts to be human-friendly by substituting initial infinite (or arbitrarily large) values for emptiness and by avoiding using the negamax coding simplifications.

Normally during alpha–beta, the subtrees are temporarily dominated by either a first player advantage (when many first player moves are good, and at each search depth the first move checked by the first player is adequate, but all second player responses are required to try to find a refutation), or vice versa. This advantage can switch sides many times during the search if the move ordering is incorrect, each time leading to inefficiency. As the number of positions searched decreases exponentially each move nearer the current position, it is worth spending considerable effort on sorting early moves. An improved sort at any depth will exponentially reduce the total number of positions searched, but sorting all positions at depths near the root node is relatively cheap as there are so few of them. In practice, the move ordering is often determined by the results of earlier, smaller searches, such as through iterative deepening.
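As an illustration, here is a minimal Python sketch of ordering moves by the results of an earlier, shallower search. The prev_scores table mapping moves to their scores from the previous iteration is an assumed data layout, not part of any standard interface:

def order_by_previous_search(moves, prev_scores):
    # Moves the earlier, shallower search liked best are tried first;
    # moves it never scored sort last.
    return sorted(moves, key=lambda m: prev_scores.get(m, float("-inf")), reverse=True)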

Additionally, this algorithm can be trivially modified to return an entire principal variation in addition to the score. Some more aggressive algorithms such as MTD(f) do not easily permit such a modification.
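A minimal Python sketch of such a modification follows, assuming a hypothetical node interface (children() yielding (move, child) pairs, is_terminal(), heuristic_value()). Each call returns the score together with the line of play that produced it:

def alphabeta_pv(node, depth, alpha, beta, maximizing):
    # Returns (value, line), where `line` is the principal variation:
    # the list of moves along the best path found from this node.
    if depth == 0 or node.is_terminal():
        return node.heuristic_value(), []
    best_line = []
    if maximizing:
        value = float("-inf")
        for move, child in node.children():
            score, line = alphabeta_pv(child, depth - 1, alpha, beta, False)
            if score > value:
                value, best_line = score, [move] + line
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # beta cutoff
        return value, best_line
    else:
        value = float("inf")
        for move, child in node.children():
            score, line = alphabeta_pv(child, depth - 1, alpha, beta, True)
            if score < value:
                value, best_line = score, [move] + line
            beta = min(beta, value)
            if beta <= alpha:
                break  # alpha cutoff
        return value, best_line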

Pseudocode

The pseudocode for depth-limited minimax with alpha–beta pruning is as follows: [12]

function alphabeta(node, depth, α, β, maximizingPlayer) is
    if depth = 0 or node is a terminal node then
        return the heuristic value of node
    if maximizingPlayer then
        value := −∞
        for each child of node do
            value := max(value, alphabeta(child, depth − 1, α, β, FALSE))
            α := max(α, value)
            if α ≥ β then
                break (* β cutoff *)
        return value
    else
        value := +∞
        for each child of node do
            value := min(value, alphabeta(child, depth − 1, α, β, TRUE))
            β := min(β, value)
            if β ≤ α then
                break (* α cutoff *)
        return value

(* Initial call *)
alphabeta(origin, depth, −∞, +∞, TRUE)
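For concreteness, here is a direct Python transcription of the pseudocode (a sketch, not a reference implementation). The list-based tree and the evaluate function are assumptions for illustration: an inner list is a node whose elements are its children, and a bare number is a leaf carrying its own heuristic value.

import math

def evaluate(node):
    # Hypothetical static evaluation: a leaf carries its own value; an
    # internal node cut off by the depth limit is scored 0 here.
    return node if not isinstance(node, list) else 0

def alphabeta(node, depth, alpha, beta, maximizing_player):
    if depth == 0 or not isinstance(node, list):  # depth limit or terminal node
        return evaluate(node)
    if maximizing_player:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, depth - 1, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # beta cutoff
        return value
    else:
        value = math.inf
        for child in node:
            value = min(value, alphabeta(child, depth - 1, alpha, beta, True))
            beta = min(beta, value)
            if beta <= alpha:
                break  # alpha cutoff
        return value

# A small arbitrary tree: alpha-beta returns the same value minimax would,
# while pruning some subtrees along the way.
tree = [[[5, 6], [7, 4, 5]], [[3], [6, 6, 9]], [[1, 2], [0, 1]]]
print(alphabeta(tree, 3, -math.inf, math.inf, True))  # prints 6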

Implementations of alpha–beta pruning can often be delineated by whether they are "fail-soft" or "fail-hard". The pseudocode above illustrates the fail-soft variation. With fail-soft alpha–beta, the alphabeta function may return values (v) that exceed (v < α or v > β) the α and β bounds set by its function call arguments. In comparison, fail-hard alpha–beta limits its function return value to the inclusive range of α and β.
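To make the distinction concrete, here is a minimal sketch of the fail-hard variant, reusing the assumed tree representation and evaluate function from the Python example above. On a cutoff it reports the bound itself rather than the "soft" value that may lie outside it:

def alphabeta_fail_hard(node, depth, alpha, beta, maximizing_player):
    if depth == 0 or not isinstance(node, list):
        return evaluate(node)
    if maximizing_player:
        for child in node:
            score = alphabeta_fail_hard(child, depth - 1, alpha, beta, False)
            if score >= beta:
                return beta  # fail-hard: report the bound, not the true score
            alpha = max(alpha, score)
        return alpha  # never below the alpha the caller passed in
    else:
        for child in node:
            score = alphabeta_fail_hard(child, depth - 1, alpha, beta, True)
            if score <= alpha:
                return alpha  # fail-hard: report the bound, not the true score
            beta = min(beta, score)
        return beta  # never above the beta the caller passed in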

Heuristic improvements

Further improvement can be achieved without sacrificing accuracy by using ordering heuristics to search earlier parts of the tree that are likely to force alpha–beta cutoffs. For example, in chess, moves that capture pieces may be examined before moves that do not, and moves that have scored highly in earlier passes through the game-tree analysis may be evaluated before others. Another common, and very cheap, heuristic is the killer heuristic, where the last move that caused a beta-cutoff at the same tree level in the tree search is always examined first. This idea can also be generalized into a set of refutation tables.
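As an illustration, here is a minimal Python sketch of one simple priority scheme; the killers table (keyed by ply) and the is_capture predicate are assumed helpers, not a standard API:

def order_with_killers(moves, ply, killers, is_capture):
    # killers[ply] holds the move(s) that most recently caused a beta cutoff
    # at this ply; they are tried first, then captures, then quiet moves.
    def key(move):
        if move in killers.get(ply, ()):
            return 0
        if is_capture(move):
            return 1
        return 2
    return sorted(moves, key=key)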

Alpha–beta search can be made even faster by considering only a narrow search window (generally determined by guesswork based on experience). This is known as aspiration search. In the extreme case, the search is performed with alpha and beta equal, a technique known as zero-window search, null-window search, or scout search. This is particularly useful for win/loss searches near the end of a game, where the extra depth gained from the narrow window and a simple win/loss evaluation function may lead to a conclusive result. If an aspiration search fails, it is straightforward to detect whether it failed high (the high edge of the window was too low) or low (the low edge of the window was too high). This gives information about what window values might be useful in a re-search of the position.
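Here is a minimal Python sketch of aspiration search around a guessed score, typically the previous iteration's result. The window size and widening policy are illustrative choices, and alphabeta is the fail-soft sketch above:

def aspiration_search(root, depth, guess, window=50):
    alpha, beta = guess - window, guess + window
    while True:
        score = alphabeta(root, depth, alpha, beta, True)
        if score <= alpha:
            alpha -= 4 * window   # failed low: widen the window downward, re-search
        elif score >= beta:
            beta += 4 * window    # failed high: widen the window upward, re-search
        else:
            return score          # the true score fell inside the window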

Over time, other improvements have been suggested, and indeed the Falphabeta (fail-soft alpha-beta) idea of John Fishburn is nearly universal and is already incorporated above in a slightly modified form. Fishburn also suggested a combination of the killer heuristic and zero-window search under the name Lalphabeta ("last move with minimal window alpha-beta search").

Other algorithms

Since the minimax algorithm and its variants are inherently depth-first, a strategy such as iterative deepening is usually used in conjunction with alpha–beta so that a reasonably good move can be returned even if the algorithm is interrupted before it has finished execution. Another advantage of using iterative deepening is that searches at shallower depths give move-ordering hints, as well as shallow alpha and beta estimates, both of which can help produce cutoffs for higher-depth searches much earlier than would otherwise be possible.
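A minimal Python sketch of an iterative-deepening driver over the alphabeta sketch above follows. The time-budget handling is illustrative: a real engine would also abort mid-search and feed each pass's results into move ordering.

import time

def iterative_deepening(root, time_budget=1.0):
    deadline = time.monotonic() + time_budget
    best = None
    depth = 1
    while time.monotonic() < deadline:
        # Keep only fully completed iterations; interruption between
        # iterations still leaves the previous result usable. A real engine
        # would track the best move (or principal variation), not just the score.
        best = alphabeta(root, depth, float("-inf"), float("inf"), True)
        depth += 1
    return best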

Algorithms like SSS*, on the other hand, use the best-first strategy. This can potentially make them more time-efficient, but typically at a heavy cost in space-efficiency. [13]


Related Research Articles

Minimax is a decision rule used in artificial intelligence, decision theory, game theory, statistics, and philosophy for minimizing the possible loss for a worst case scenario. When dealing with gains, it is referred to as "maximin"—to maximize the minimum gain. Originally formulated for n-player zero-sum game theory, covering both the cases where players take alternate moves and those where they make simultaneous moves, it has also been extended to more complex games and to general decision-making in the presence of uncertainty.

A* is a graph traversal and path search algorithm, which is often used in many fields of computer science due to its completeness, optimality, and optimal efficiency. One major practical drawback is its space complexity, as it stores all generated nodes in memory. Thus, in practical travel-routing systems, it is generally outperformed by algorithms which can pre-process the graph to attain better performance, as well as memory-bounded approaches; however, A* is still the best solution in many cases.

In competitive two-player games, the killer heuristic is a technique for improving the efficiency of alpha-beta pruning, which in turn improves the efficiency of the minimax algorithm. This algorithm has an exponential search time to find the optimal next move, so general methods for speeding it up are very useful. Barbara Liskov created the general heuristic to speed tree search in a computer program to play chess endgames for her Ph.D. thesis.

In computer chess programs, the null-move heuristic is a heuristic technique used to enhance the speed of the alpha-beta pruning algorithm.

A transposition table is a cache of previously seen positions, and associated evaluations, in a game tree generated by a computer game playing program. If a position recurs via a different sequence of moves, the value of the position is retrieved from the table, avoiding re-searching the game tree below that position. Transposition tables are primarily useful in perfect-information games. The usage of transposition tables is essentially memoization applied to the tree search and is a form of dynamic programming.

Game tree

In game theory, a game tree is a directed graph whose nodes are positions in a game and whose edges are moves. The complete game tree for a game is the game tree starting at the initial position and containing all possible moves from each position; the complete tree is the same tree as that obtained from the extensive-form game representation.

The expectiminimax algorithm is a variation of the minimax algorithm, for use in artificial intelligence systems that play two-player zero-sum games, such as backgammon, in which the outcome depends on a combination of the player's skill and chance elements such as dice rolls. In addition to "min" and "max" nodes of the traditional minimax tree, this variant has "chance" nodes, which take the expected value of a random event occurring. In game theory terms, an expectiminimax tree is the game tree of an extensive-form game of perfect, but incomplete information.

In computer science, beam search is a heuristic search algorithm that explores a graph by expanding the most promising node in a limited set. Beam search is an optimization of best-first search that reduces its memory requirements. Best-first search is a graph search which orders all partial solutions (states) according to some heuristic. But in beam search, only a predetermined number of best partial solutions are kept as candidates. It is thus a greedy algorithm.

Negamax search is a variant form of minimax search that relies on the zero-sum property of a two-player game.

Principal variation search is a negamax algorithm that can be faster than alpha-beta pruning. Like alpha-beta pruning, NegaScout is a directional search algorithm for computing the minimax value of a node in a tree. It dominates alpha-beta pruning in the sense that it will never examine a node that can be pruned by alpha-beta; however, it relies on accurate node ordering to capitalize on this advantage.

MTD(f) is a minimax search algorithm developed in 1994 by Aske Plaat, Jonathan Schaeffer, Wim Pijls, and Arie de Bruin. Experiments with tournament-quality chess, checkers, and Othello programs show it to be a highly efficient minimax algorithm. The name MTD(f) is an abbreviation for MTD(n,f). It is an alternative to the alpha-beta pruning algorithm.

In game theory, a null move or pass is a decision by a player to not make a move when it is that player's turn to move. Even though null moves are against the rules of many games, they are often useful to consider when analyzing these games. Examples of this include the analysis of zugzwang, and the null-move heuristic in game tree analysis.

Decision tree pruning

In computer science, pruning means simplifying, compressing, and optimizing a decision tree by removing sections of the tree that are uncritical or redundant for classifying instances.

SSS* is a search algorithm, introduced by George Stockman in 1979, that conducts a state space search traversing a game tree in a best-first fashion similar to that of the A* search algorithm.

In computer science, artificial intelligence, and mathematical optimization, a heuristic is a technique designed for solving a problem more quickly when classic methods are too slow, or for finding an approximate solution when classic methods fail to find any exact solution. This is achieved by trading optimality, completeness, accuracy, or precision for speed. In a way, it can be considered a shortcut.

In computer science, B* is a best-first graph search algorithm that finds the least-cost path from a given initial node to any goal node. First published by Hans Berliner in 1979, it is related to the A* search algorithm.

Computer Othello

Computer Othello refers to computer hardware and software capable of playing the game of Othello.

In computer science, Monte Carlo tree search (MCTS) is a heuristic search algorithm for some kinds of decision processes, most notably those employed in software that plays board games. In that context MCTS is used to solve the game tree.

Best node search (BNS) is a minimax search algorithm, developed in 2011. Experiments with random trees show it to be the most efficient minimax algorithm. The algorithm identifies which move leads to the minimax outcome, but does not report the minimax evaluation itself.

References

  1. Russell, Stuart J.; Norvig, Peter (2010). Artificial Intelligence: A Modern Approach (3rd ed.). Upper Saddle River, New Jersey: Pearson Education, Inc. p. 167. ISBN 978-0-13-604259-4.
  2. McCarthy, John (27 November 2006). "Human Level AI Is Harder Than It Seemed in 1955". Retrieved 2006-12-20.
  3. Newell, Allen; Simon, Herbert A. (1 March 1976). "Computer science as empirical inquiry: symbols and search". Communications of the ACM. 19 (3): 113–126. doi:10.1145/360018.360022.
  4. Edwards, D.J.; Hart, T.P. (4 December 1961). "The Alpha–beta Heuristic (AIM-030)". Massachusetts Institute of Technology. hdl:1721.1/6098.
  5. Kotok, Alan (3 December 2004). "MIT Artificial Intelligence Memo 41". Retrieved 2006-07-01.
  6. Marsland, T.A. (May 1987). "Computer Chess Methods" (PDF). In S. Shapiro (ed.), Encyclopedia of Artificial Intelligence. J. Wiley & Sons. pp. 159–171. Archived from the original (PDF) on October 30, 2008. Retrieved 2006-12-21.
  7. Knuth, Donald E.; Moore, Ronald W. (1975). "An analysis of alpha-beta pruning". Artificial Intelligence. 6 (4): 293–326. doi:10.1016/0004-3702(75)90019-3. S2CID 7894372.
  8. Abramson, Bruce (1 June 1989). "Control strategies for two-player games". ACM Computing Surveys. 21 (2): 137–161. doi:10.1145/66443.66444. S2CID 11526154.
  9. Pearl, Judea (1980). "Asymptotic Properties of Minimax Trees and Game-Searching Procedures". Artificial Intelligence. 14 (2): 113–138. doi:10.1016/0004-3702(80)90037-5.
  10. Pearl, Judea (1982). "The Solution for the Branching Factor of the Alpha-Beta Pruning Algorithm and Its Optimality". Communications of the ACM. 25 (8): 559–64. doi:10.1145/358589.358616. S2CID 8296219.
  11. Saks, M.; Wigderson, A. (1986). "Probabilistic Boolean Decision Trees and the Complexity of Evaluating Game Trees". 27th Annual Symposium on Foundations of Computer Science. pp. 29–38. doi:10.1109/SFCS.1986.44. ISBN 0-8186-0740-8. S2CID 6130392.
  12. Russell, Stuart J.; Norvig, Peter (2003). Artificial Intelligence: A Modern Approach (2nd ed.). Upper Saddle River, New Jersey: Prentice Hall. ISBN 0-13-790395-2.
  13. Pearl, Judea; Korf, Richard (1987). "Search techniques". Annual Review of Computer Science. 2: 451–467. doi:10.1146/annurev.cs.02.060187.002315. "Like its A* counterpart for single-player games, SSS* is optimal in terms of the average number of nodes examined; but its superior pruning power is more than offset by the substantial storage space and bookkeeping required."
