Maven (Scrabble)

Last updated

Maven is an artificial intelligence Scrabble player, created by Brian Sheppard. It has been used in official licensed Hasbro Scrabble games.

Contents

Algorithms

Game phases

Maven's gameplay is sub-divided into three phases: The "mid-game" phase, the "pre-endgame" phase, and the "endgame" phase.

The "mid-game" phase lasts from the beginning of the game up until there are nine or fewer tiles left in the bag. The program uses a rapid algorithm to find all possible plays from the given rack, and then part of the program called the "kibitzer" uses simple heuristics to sort them into rough order of quality. The most promising moves are then evaluated by "simming", in which the program simulates the random drawing of tiles, plays forward a set number of plays, and compares the points spread of the moves' outcomes. By simulating thousands of random drawings, the program can give a very accurate quantitative evaluation of the different plays. (While a Monte Carlo search, Maven does not use Monte Carlo tree search because it evaluates game trees only 2-ply deep, rather than playing out to the end of the game, and does not reallocate rollouts to more promising branches for deeper exploration; in reinforcement learning terminology, the Maven search strategy might be considered "truncated Monte Carlo simulation". A true MCTS strategy is unnecessary because the endgame can be solved. The shallow search is because the Maven author argues [1] that, due to the fast turnover of letters in one's bag, it is typically not useful to look more than 2-ply deep, because if one instead looked deeper, e.g. 4-ply, the variance of rewards will be larger and the simulations will take several times longer, while only helping in a few exotic situations: "We maintain that if it requires an extreme situation like CACIQUE to see the value of a four-ply simulation then they are not worth doing." As the board value can be evaluated with very high accuracy in Scrabble, unlike games such as Go, deeper simulations are unlikely to change the initial evaluation.)

The "pre-endgame" phase works in almost the same way as the "mid-game" phase, except that it is designed to attempt to yield a good end-game situation.

The "endgame" phase takes over as soon as there are no tiles left in the bag. In two-player games, this means that the players can now deduce from the initial letter distribution the exact tiles on each other's racks. Maven uses the B-star search algorithm to analyze the game tree during the endgame phase.

Move generation

Maven has used several algorithms for move generation, but the one that has stuck is the DAWG algorithm. The GADDAG algorithm is faster, but a DAWG for North American English is only 0.5 MB, compared to about 2.5 MB for a GADDAG. That makes a significant difference for download games, whereas the speed advantage is not important. (Note that unimportant does not mean that the difference is small, merely that users cannot tell the difference. The GADDAG is perhaps twice as fast, but both algorithms are fast enough.)

Rack evaluation

The first (1986) version of Maven used a set of about 100 patterns to value racks. Every single tile had a value (27 patterns). Each duplicate had a value (22 patterns). There were patterns for triplicates and quads for letters that have enough representation in the bag. Finally, the QU combination was a pattern.

Soon after the first version, Maven acquired rack evaluation terms for vowel/consonant balance and Q/U distribution. Vowel/consonant balance was a table lookup based on the count of vowels and consonants left in the rack. Q/U distribution varied the values of Q and U using a table lookup indexed by how many of each remained in the bag.

Shortly thereafter, Maven acquired a tile duplication evaluator. The idea was to vary a rack depending on the chance of drawing duplicates. For example, A is generally better than I as a tile, but if there are 7 A's and only 2 I's left in the bag, then maybe we should prefer to keep the I.

Parameter fitting was accomplished by tuning the values to predict the total of future scores. Nowadays this would be called Temporal Difference Learning.

This rack evaluation design was original to Maven. It was very successful in competing with the human champions of the day.

The design was later extended by other researchers. Mark Watkins championed what he called "tile synergy patterns." These are combination like ADES which form the basis of many high-scoring words. This is a natural extension of the design, which does significantly improve play. Maven's pattern set gradually expanded from the base set of 100 to well over 400.

Maven has since switched to a completely different architecture, proposed by John O'Laughlin and implemented in Quackle. This is the "exhaustive" architecture, where the program has a different rack evaluation parameter for each of the 3 million possible combinations of 0 to 7 tiles. With the advances in computer power over the last decade, it has become possible to tune such large parameter sets.

The downside of using an exhaustive approach is that Maven lost the ability to vary evaluations as a function of the tiles that remained in the bag. The point is that the exhaustive rack evaluator does not have terms that relate a rack's value to the possible draws from the bag.

Maven's version of exhaustive rack evaluation has added that ability. In Maven, each rack has its own liner evaluator, where the value of that rack varies as a function of the chance of drawing a duplicate, the chance of drawing a vowel, and the chance of drawing Q and U. This system has 5 parameters per rack, for about 15 million parameters in all.

Simulation

The great human champion Ron Tiekert had studied Scrabble by playing out individual positions dozens of times, and tabulating results. He suggested that with Maven's speed, it should be possible to automate that process in overnight runs. Brian Sheppard named this process "simulation", though it goes by the name "rollout" in backgammon, and "playout" in Go.

The process was to select N candidate moves using the score+rack heuristic. Then play out those moves hundreds or thousands of times to see which candidate performs best. You can vary the depth of the playout to suit your purpose; play two or four moves ahead to get a good idea about point differential, or play to the end of the game to measure winning chances.

By the mid-1990s, computers had become fast enough that Maven used simulation to choose moves in competitive games under tournament time controls. Algorithmic improvements were important to scaling simulation for this purpose. The most important innovation was to vary the number of trials given to candidates so that more successful candidates receive more effort. It was also helpful to control the racks so that all candidate moves were sampled against the same, unbiased distribution.

Analysis of games played by Maven's simulation engine suggest that Maven has surpassed the skill level of human champions.

Endgame

Strong play in Scrabble endgames is much harder than it looks. In theory, endgames are a game of perfect information, so the Alpha-beta pruning algorithm should work. But in practice Alpha Beta works badly on Scrabble.

The problem with Alpha Beta is that some Scrabble endgames require 14 moves to play out, and it is not possible to search that deeply. This is not merely a theoretical possibility. When one player is "stuck" with a tile, then playing out all remaining tiles is impossible. In that situation the optimal strategy for both sides is usually to play one tile on each turn.

Maven uses a different approach. The B* search algorithm is a selective-depth, progressive-widening algorithm that guarantees to find optimal solutions to two-player games when one can compute upper and lower bounds on the values of each position.

It turns out that it is possible to estimate upper and lower bounds on endgame positions. These bounds are correct (that is, the true value lies within the interval) for the overwhelming majority of positions. Since B* is reasonably robust in the presence of a small percentage of error in the bounds, Maven can solve endgames that other approaches cannot.

A further refinement makes Maven's endgame solutions asymptotically optimal even in the presence of errors. When the B* search terminates with a proof that one move is best, and there is still time remaining, then Maven widens its estimates by 1 point and searches again. These re-searches are usually very quick, because the tree from the previous search is still largely valid. Repeated use of this policy will progressively identify errors, starting with the smallest (and presumably most likely) errors.

Exhaustive pre-endgame

When only 1 or 2 tiles remain in the bag ("PEG-1" or "PEG-2"), it is possible to perform exhaustive searches of the state space.

The case of a PEG-1 is important, because almost half of all games pass through that state. Maven can play out such states exhaustively in almost all cases. That is, for all legal moves Maven can play out the resulting endgames (up to 8 for each legal move), and calculate which side will win the game in each case. Because there are some situations (e.g., two blanks, stuck-with-Q) that require extra effort, the calculation is performed progressively. That is, Maven expands its analysis first where the decision is close and relevant.

In a PEG-2 it is normally not possible to exhaustively examine all move sequences, so Maven goes as far as it can in the time available.

One feature of these low-tile situations is that it is very hard to safely prune the list of legal moves. For example, the optimal play is ranked behind more than 50 other moves by the score+rack heuristic more than 1% of the time.

This policy does not produce play that is theoretically perfect, because it is impossible to know what the true initial distribution of unseen tiles should be. Assuming a uniform distribution does well, and it is possible to calculate inferences about unseen tiles that marginally improves on that assumption.

Another limitation is that Maven is not addressing the "hidden information" aspect of such situations. That is, in theory there are situations where players maximize expectation by randomly choosing moves according to a probability distribution. Maven is choosing pure strategies at each node.

Related Research Articles

Minmax is a decision rule used in artificial intelligence, decision theory, game theory, statistics, and philosophy for minimizing the possible loss for a worst case scenario. When dealing with gains, it is referred to as "maximin" – to maximize the minimum gain. Originally formulated for several-player zero-sum game theory, covering both the cases where players take alternate moves and those where they make simultaneous moves, it has also been extended to more complex games and to general decision-making in the presence of uncertainty.

<i>Scrabble</i> Board game with words

Scrabble is a word game in which two to four players score points by placing tiles, each bearing a single letter, onto a game board divided into a 15×15 grid of squares. The tiles must form words that, in crossword fashion, read left to right in rows or downward in columns and are included in a standard dictionary or lexicon.

<span class="mw-page-title-main">Search algorithm</span> Any algorithm which solves the search problem

In computer science, a search algorithm is an algorithm designed to solve a search problem. Search algorithms work to retrieve information stored within particular data structure, or calculated in the search space of a problem domain, with either discrete or continuous values.

<span class="mw-page-title-main">Genetic algorithm</span> Competitive algorithm for searching a problem space

In computer science and operations research, a genetic algorithm (GA) is a metaheuristic inspired by the process of natural selection that belongs to the larger class of evolutionary algorithms (EA). Genetic algorithms are commonly used to generate high-quality solutions to optimization and search problems by relying on biologically inspired operators such as mutation, crossover and selection. Some examples of GA applications include optimizing decision trees for better performance, solving sudoku puzzles, hyperparameter optimization, causal inference, etc.

A solved game is a game whose outcome can be correctly predicted from any position, assuming that both players play perfectly. This concept is usually applied to abstract strategy games, and especially to games with full information and no element of chance; solving such a game may use combinatorial game theory and/or computer assistance.

<span class="mw-page-title-main">Computer chess</span> Computer hardware and software capable of playing chess

Computer chess includes both hardware and software capable of playing chess. Computer chess provides opportunities for players to practice even in the absence of human opponents, and also provides opportunities for analysis, entertainment and training. Computer chess applications that play at the level of a chess master or higher are available on hardware from supercomputers to smart phones. Standalone chess-playing machines are also available. Stockfish, GNU Chess, Fruit, and other free open source applications are available for various platforms.

Alpha–beta pruning is a search algorithm that seeks to decrease the number of nodes that are evaluated by the minimax algorithm in its search tree. It is an adversarial search algorithm used commonly for machine playing of two-player games. It stops evaluating a move when at least one possibility has been found that proves the move to be worse than a previously examined move. Such moves need not be evaluated further. When applied to a standard minimax tree, it returns the same move as minimax would, but prunes away branches that cannot possibly influence the final decision.

<span class="mw-page-title-main">Evaluation function</span> Function in a computer game-playing program that evaluates a game position

An evaluation function, also known as a heuristic evaluation function or static evaluation function, is a function used by game-playing computer programs to estimate the value or goodness of a position in a game tree. Most of the time, the value is either a real number or a quantized integer, often in nths of the value of a playing piece such as a stone in go or a pawn in chess, where n may be tenths, hundredths or other convenient fraction, but sometimes, the value is an array of three values in the unit interval, representing the win, draw, and loss percentages of the position.

<span class="mw-page-title-main">Super Scrabble</span>

Super Scrabble is a board game introduced in 2004 and a variant of Scrabble. It is played on a 21×21 grid board instead of Scrabble's usual 15×15, and uses twice as many letter tiles.

<span class="mw-page-title-main">Endgame tablebase</span> Database of precalculated chess analysis

An endgame tablebase is a computerized database that contains precalculated exhaustive analysis of chess endgame positions. It is typically used by a computer chess engine during play, or by a human or computer that is retrospectively analysing a game that has already been played.

Scrabble variants are games created by changing the normal Scrabble rules or equipment.

<span class="mw-page-title-main">Francophone Scrabble</span>

Francophone Scrabble, or French-language Scrabble, is played by many thousands of amateurs throughout the world and the Fédération internationale de Scrabble francophone has more than 20,000 members. Just as in English, points are scored by playing valid words from the lettered tiles. In French there are 102 tiles - 100 lettered tiles and two blanks known as jokers. The official word list for Francophone Scrabble is L'Officiel du jeu Scrabble.

mi×ma+h is a Canadian board game developed by Wrebbit and published in 1987. It resembles a variant of Scrabble in that tiles are placed on a crossword-style grid, with special premiums such as squares that double or triple the value of a tile and a 50-point bonus for playing all seven tiles on the player's rack in one turn. Unlike Scrabble, Mixmath uses numbered tiles to generate short equations using simple arithmetic. Wrebbit, maker of Puzz-3D jigsaw puzzles, has since been taken over by Hasbro, and it appears that Mixmath has been discontinued.

<span class="mw-page-title-main">Tile tracking</span> Gaming strategy

Tile tracking is a technique most commonly associated with the game of Scrabble and similar word games. It refers to the practice of keeping track of letters played on the game board, typically by crossing letters off a score sheet or tracking grid as the tiles are played. Tracking tiles can be an important aid to strategy, especially during the endgame when there are no tiles left to draw, where careful tracking allows each player to deduce the remaining unseen letters on the opponent's final rack. The marking off of each letter from a pre-printed tracking grid as the tiles are played is a standard feature of tournament play.

In computer science, B* is a best-first graph search algorithm that finds the least-cost path from a given initial node to any goal node. First published by Hans Berliner in 1979, it is related to the A* search algorithm.

<span class="mw-page-title-main">Computer Othello</span> Abstract strategy game

Computer Othello refers to computer architecture encompassing computer hardware and computer software capable of playing the game of Othello. It is also known as Reversi for Microsoft Windows and Classic Mac OS.

A GADDAG is a data structure presented by Steven Gordon in 1994, for use in generating moves for Scrabble and other word-generation games where such moves require words that "hook into" existing words. It is often in contrast to move-generation algorithms using a directed acyclic word graph (DAWG) such as the one used by Maven. It is generally twice as fast as the traditional DAWG algorithms, but take about 5 times as much space for regulation Scrabble dictionaries.

<i>Words with Friends</i> Multiplayer crossword style video game

Words with Friends is a multiplayer computer word game developed by Newtoy. Players take turns building words crossword-puzzle style in a manner similar to the classic board game Scrabble. The rules of the two games are similar, but Words with Friends is not associated with the Scrabble brand. Up to 40 games can be played simultaneously using push notifications to alert players when it is their turn. Players may look up friends either by username or through Facebook, or be randomly assigned an opponent through "Smart Match". Players can also find potential opponents using Community Match.

In computer science, Monte Carlo tree search (MCTS) is a heuristic search algorithm for some kinds of decision processes, most notably those employed in software that plays board games. In that context MCTS is used to solve the game tree.

References

  1. pg14, Sheppard 2002