GADDAG

A GADDAG is a data structure presented by Steven Gordon in 1994, for use in generating moves for Scrabble and other word-generation games where such moves require words that "hook into" existing words. It is often in contrast to move-generation algorithms using a directed acyclic word graph (DAWG) such as the one used by Maven. It is generally twice as fast as the traditional DAWG algorithms, but take about 5 times as much space for regulation Scrabble dictionaries.[1]

Quackle, an open-source Scrabble program, uses a GADDAG to generate moves.[2]

Description

The name GADDAG comes from DAG for directed acyclic graph, prefixed by its own reverse.[1]

A GADDAG is a specialization of a Trie, containing states and branches to other GADDAGs. It is distinct for its storage of every reversed prefix of every word in a dictionary. This means every word has as many representations as it does letters; since the average word in most Scrabble regulation dictionaries is 5 letters long, this makes the GADDAG about 5 times as big as a simple DAWG.

Definition

For any word in a dictionary that is formed by a non-empty prefix x and a suffix y, a GADDAG contains a direct, deterministic path for any string REV(x)+y, where + is a concatenation operator.

For example, for the word "explain," a GADDAG will contain direct paths to the strings

e+xplain
xe+plain
pxe+lain
lpxe+ain
alpxe+in
ialpxe+n
nialpxe

This setup enables searching for a word given any letter that occurs in it.

Use in move generation

Any move-generation algorithm must adhere to three types of constraints:

  • Board constraints: one may only build by 'hooking' onto existing letters of the board, and one may only place tiles on empty squares.
  • Rack constraints: one may only place tiles with letters on one's rack.
  • Dictionary constraint: all words resulting from the placement of tiles exist in the game's dictionary.

DAWG-based algorithms take advantage of the second and third constraint: the DAWG is built around the dictionary, and is traverse using tiles in the rack. It fails, however, to address the first constraint: supposing one want to 'hook into' the letter P in HARPY, and the board has 2 spaces before the P, one must search the dictionary for all words containing letters from the rack where the third letter is P. This is non-deterministic when searching through the DAWG, as many searches through the trie will be fruitless.

This is addressed by the GADDAG's storage of prefixes: by traversing the P branch of a GADDAG, one sees all words that have a P somewhere in their composition, and can "travel up" the prefix to form the word with tiles in the rack. To use the example from the § Definition section, searching for P turns up "pxe+lain". The letters between P and the + can be placed above the P on the board, and the rest below it (if space on the board permits).

See also

References

  1. 1 2 Gordon, Steven A. (1994). "A faster Scrabble move generation algorithm" (PDF). Software: Practice and Experience. 24 (2): 219–232.
  2. Jason Katz-Brown; John O'Laughlin. "How Quackle Plays Scrabble". Retrieved 2018-07-02.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.