GADDAG

Last updated

A GADDAG is a data structure presented by Steven Gordon in 1994, for use in generating moves for Scrabble and other word-generation games where such moves require words that "hook into" existing words. It is often in contrast to move-generation algorithms using a directed acyclic word graph (DAWG) such as the one used by Maven. It is generally twice as fast as the traditional DAWG algorithms, but take about 5 times as much space for regulation Scrabble dictionaries. [1]

Contents

Quackle, an open-source Scrabble program, uses a GADDAG to generate moves. [2]

Description

The name GADDAG comes from DAG for directed acyclic graph, prefixed by its own reverse. [1]

A GADDAG is a specialization of a Trie, containing states and branches to other GADDAGs. It is distinct for its storage of every reversed prefix of every word in a dictionary. This means every word has as many representations as it does letters; since the average word in most Scrabble regulation dictionaries is 5 letters long, this makes the GADDAG about 5 times as big as a simple DAWG.

Definition

For any word in a dictionary that is formed by a non-empty prefix x and a suffix y, a GADDAG contains a direct, deterministic path for any string REV(x)+y, where + is a concatenation operator.

For example, for the word "explain," a GADDAG will contain direct paths to the strings

e+xplain xe+plain pxe+lain lpxe+ain alpxe+in ialpxe+n nialpxe

This setup enables searching for a word given any letter that occurs in it.

Use in move generation

Any move-generation algorithm must adhere to three types of constraints:

DAWG-based algorithms take advantage of the second and third constraint: the DAWG is built around the dictionary, and is traversed using tiles in the rack. It fails, however, to address the first constraint: supposing one want to 'hook into' the letter P in HARPY, and the board has 2 spaces before the P, one must search the dictionary for all words containing letters from the rack where the third letter is P. This is inefficient when searching through the DAWG, as many searches through the trie will be fruitless.

This is addressed by the GADDAG's storage of prefixes: by traversing the P branch of a GADDAG, one sees all words that have a P somewhere in their composition, and can "travel up" the prefix to form the word with tiles in the rack. To use the example from the § Definition section, searching for P turns up "pxe+lain". The letters between P and the + can be placed above the P on the board, and the rest below it (if space on the board permits).

See also

Related Research Articles

Scrabble is a word game in which two to four players score points by placing tiles, each bearing a single letter, onto a game board divided into a 15×15 grid of squares. The tiles must form words that, in crossword fashion, read left to right in rows or downward in columns, and be included in a standard dictionary or lexicon.

Trie A type of search tree data structure

In computer science, a trie, also called digital tree or prefix tree, is a kind of search tree—an ordered tree data structure used to store a dynamic set or associative array where the keys are usually strings. Unlike a binary search tree, no node in the tree stores the key associated with that node; instead, its position in the tree defines the key with which it is associated; i.e., the value of the key is distributed across the structure. All the descendants of a node have a common prefix of the string associated with that node, and the root is associated with the empty string. Keys tend to be associated with leaves, though some inner nodes may correspond to keys of interest. Hence, keys are not necessarily associated with every node. For the space-optimized presentation of prefix tree, see compact prefix tree.

Upwords Board game

Upwords is a board game invented by Elliot Rudell and originally published by the Milton Bradley Company, now a division of Hasbro. Worldwide marketing rights to UPWORDS have been licensed to Spin Master Inc. by Rudell Design, LLC as of 2018. Upwords is similar to Scrabble, or Words With Friends, in that players build words using letter tiles on a gridded gameboard. The point of difference is that in Upwords letters can be stacked on top of other letters already on the gameboard to create new words. The higher the stack of letters, the more points are scored. This typically makes words built in later turns of the game more valuable than earlier words, increasing play intensity and adding a level of strategy unique to Upwords. The memorization of two-letter words is considered a useful skill in this game.

<i>Boggle</i>

Boggle is a word game invented by Allan Turoff and originally distributed by Parker Brothers. The game is played using a plastic grid of lettered dice, in which players attempt to find words in sequences of adjacent letters.

In computer science, the Aho–Corasick algorithm is a string-searching algorithm invented by Alfred V. Aho and Margaret J. Corasick. It is a kind of dictionary-matching algorithm that locates elements of a finite set of strings within an input text. It matches all strings simultaneously. The complexity of the algorithm is linear in the length of the strings plus the length of the searched text plus the number of output matches. Note that because all matches are found, there can be a quadratic number of matches if every substring matches.

Maven is an artificial intelligence Scrabble player, created by Brian Sheppard. It has been used in official licensed Hasbro Scrabble games.

Suffix tree Tree containing all suffixes of a given text

In computer science, a suffix tree is a compressed trie containing all the suffixes of the given text as their keys and positions in the text as their values. Suffix trees allow particularly fast implementations of many important string operations.

Super Scrabble

Super Scrabble is a board game introduced in 2004 and a variant of Scrabble. It is played on a 21×21 grid board instead of Scrabble's usual 15×15, and uses twice as many letter tiles.

A blanagram is a word which is an anagram of another but for the substitution of a single letter. The term has its origin in competitive Scrabble, where a blank tile on a player's rack may be used to form any of several possible words in conjunction with the player's other tiles.

Scrabble variants are games created by changing the normal Scrabble rules or equipment.

In computer science, a ternary search tree is a type of trie where nodes are arranged in a manner similar to a binary search tree, but with up to three children rather than the binary tree's limit of two. Like other prefix trees, a ternary search tree can be used as an associative map structure with the ability for incremental string search. However, ternary search trees are more space efficient compared to standard prefix trees, at the cost of speed. Common applications for ternary search trees include spell-checking and auto-completion.

Francophone Scrabble

Francophone Scrabble, or French-language Scrabble, is played by many thousands of amateurs throughout the world and the Fédération internationale de Scrabble francophone has more than 20,000 members. Just as in English, points are scored by playing valid words from the lettered tiles. In French there are 102 tiles - 100 lettered tiles and two blanks known as jokers. The official word list for Francophone Scrabble is L'Officiel du jeu Scrabble.

Deterministic acyclic finite state automaton

In computer science, a deterministic acyclic finite state automaton (DAFSA), also called a directed acyclic word graph is a data structure that represents a set of strings, and allows for a query operation that tests whether a given string belongs to the set in time proportional to its length. Algorithms exist to construct and maintain such automata, while keeping them minimal.

Tile tracking

Tile tracking is a technique most commonly associated with the game of Scrabble and similar word games. It refers to the practice of keeping track of letters played on the game board, typically by crossing letters off a score sheet or tracking grid as the tiles are played. Tracking tiles can be an important aid to strategy, especially during the endgame when there are no tiles left to draw, where careful tracking allows each player to deduce the remaining unseen letters on the opponent's final rack. The marking off of each letter from a pre-printed tracking grid as the tiles are played is a standard feature of tournament play.

<i>The Computer Edition of Scrabble</i>

The Computer Edition of Scrabble is a computer game developed by Leisure Genius for the Macintosh in 1988, and was an official computerized version of the board game Scrabble.

<i>Words with Friends</i> Multiplayer crossword style video game

Words with Friends is a multiplayer word game developed by Newtoy. Players take turns building words crossword-puzzle style in a manner similar to the classic board game Scrabble. The rules of the two games are similar, but Words with Friends is not associated with the Scrabble brand. Up to 40 games can be played simultaneously using push notifications to alert players when it is their turn. Players may look up friends either by username or through Facebook, or be randomly assigned an opponent through "Smart Match". Players can also find potential opponents using Community Match.

<i>Scrabble Showdown</i>

Scrabble Showdown was an American game show created for the American cable network The Hub. The program was based on the board game Scrabble and was hosted by Justin Willman. It ran from September 3, 2011, to April 15, 2012.

Suffix automaton Deterministic finite automaton accepting set of all suffixes of particular string

In computer science, a suffix automaton is an efficient data structure for representing the substring index of a given string which allows the storage, processing, and retrieval of compressed information about all its substrings. The suffix automaton of a string is the smallest directed acyclic graph with a dedicated initial vertex and a set of "final" vertices, such that paths from the initial vertex to final vertices represent the suffixes of the string. Formally speaking, a suffix automaton is defined by the following set of properties:

  1. Its arcs are tagged with letters;
  2. none of its nodes have two outgoing arcs tagged with the same letter;
  3. for every suffix of there exists a path from initial vertex to some final vertex such that the concatenation of letters on the path forms this suffix;
  4. it has the least number of vertices among all graphs defined by the properties above.

Scrabble is an official computerized version of the board game of the same name.

References

  1. 1 2 Gordon, Steven A. (1994). "A faster Scrabble move generation algorithm" (PDF). Software: Practice and Experience. 24 (2): 219–232. doi:10.1002/spe.4380240205.
  2. Jason Katz-Brown; John O'Laughlin. "How Quackle Plays Scrabble" . Retrieved 2018-07-02.