Hash tree (persistent data structure)

Last updated

In computer science, a hash tree (or hash trie ) is a persistent data structure that can be used to implement sets and maps, intended to replace hash tables in purely functional programming. In its basic form, a hash tree stores the hashes of its keys, regarded as strings of bits, in a trie, with the actual keys and (optional) values stored at the trie's "final" nodes. [1]

Trie in computer science: ordered tree data structure used to store a dynamic set or associative array where the keys are usually strings

In computer science, a trie, also called digital tree, radix tree or prefix tree, is a kind of search tree—an ordered tree data structure used to store a dynamic set or associative array where the keys are usually strings. Unlike a binary search tree, no node in the tree stores the key associated with that node; instead, its position in the tree defines the key with which it is associated. All the descendants of a node have a common prefix of the string associated with that node, and the root is associated with the empty string. Keys tend to be associated with leaves, though some inner nodes may correspond to keys of interest. Hence, keys are not necessarily associated with every node. For the space-optimized presentation of prefix tree, see compact prefix tree.

In computing, a persistent data structure is a data structure that always preserves the previous version of itself when it is modified. Such data structures are effectively immutable, as their operations do not (visibly) update the structure in-place, but instead always yield a new updated structure. The term was introduced in Driscoll, Sarnak, Sleator, and Tarjans' 1986 article.

In computer science, a set is an abstract data type that can store unique values, without any particular order. It is a computer implementation of the mathematical concept of a finite set. Unlike most other collection types, rather than retrieving a specific element from a set, one typically tests a value for membership in a set.

Hash array mapped tries and Ctries are refined versions of this data structure, using particular type of trie implementations. [1]

A hash array mapped trie (HAMT) is an implementation of an associative array that combines the characteristics of a hash table and an array mapped trie. It is a refined version of the more general notion of a hash tree.

A concurrent hash-trie or Ctrie is a concurrent thread-safe lock-free implementation of a hash array mapped trie. It is used to implement the concurrent map abstraction. It has particularly scalable concurrent insert and remove operations and is memory-efficient. It is the first known concurrent data-structure that supports O(1), atomic, lock-free snapshots.

Related Research Articles

Hash table Associates data values with key values - a lookup table

In computing, a hash table is a data structure that implements an associative array abstract data type, a structure that can map keys to values. A hash table uses a hash function to compute an index into an array of buckets or slots, from which the desired value can be found.

In computer science, an associative array, map, symbol table, or dictionary is an abstract data type composed of a collection of pairs, such that each possible key appears at most once in the collection.

Radix tree data structure that represents a space-optimized trie in which each node that is the only child is merged with its parent

In computer science, a radix tree is a data structure that represents a space-optimized trie in which each node that is the only child is merged with its parent. The result is that the number of children of every internal node is at most the radix r of the radix tree, where r is a positive integer and a power x of 2, having x ≥ 1. Unlike regular tries, edges can be labeled with sequences of elements as well as single elements. This makes radix trees much more efficient for small sets and for sets of strings that share long prefixes.

Merkle tree tree in which every non-leaf node is labelled with the hash of the labels or values (in case of leaves) of its child nodes

In cryptography and computer science, a hash tree or Merkle tree is a tree in which every leaf node is labelled with the hash of a data block, and every non-leaf node is labelled with the cryptographic hash of the labels of its child nodes. Hash trees allow efficient and secure verification of the contents of large data structures. Hash trees are a generalization of hash lists and hash chains.

In computer science, a ternary search tree is a type of trie where nodes are arranged in a manner similar to a binary search tree, but with up to three children rather than the binary tree's limit of two. Like other prefix trees, a ternary search tree can be used as an associative map structure with the ability for incremental string search. However, ternary search trees are more space efficient compared to standard prefix trees, at the cost of speed. Common applications for ternary search trees include spell-checking and auto-completion.

In computer science, hash trie can refer to:

Java collections framework Collections in Java

The Java collections framework is a set of classes and interfaces that implement commonly reusable collection data structures.

In computer science, a container is a class, a data structure, or an abstract data type (ADT) whose instances are collections of other objects. In other words, they store objects in an organized way that follows specific access rules. The size of the container depends on the number of objects (elements) it contains. Underlying (inherited) implementations of various container types may vary in size and complexity, and provide flexibility in choosing the right implementation for any given scenario.

A prefix hash tree (PHT) is a distributed data structure that enables more sophisticated queries over a distributed hash table (DHT). The prefix hash tree uses the lookup interface of a DHT to construct a trie-based data structure that is both efficient, and resilient.

This Comparison of programming languages compares the features of associative array data structures or array-lookup processing for over 39 various computer programming languages.

In computer science, an x-fast trie is a data structure for storing integers from a bounded domain. It supports exact and predecessor or successor queries in time O(log log M), using O(n log M) space, where n is the number of stored values and M is the maximum value in the domain. The structure was proposed by Dan Willard in 1982, along with the more complicated y-fast trie, as a way to improve the space usage of van Emde Boas trees, while retaining the O(log log M) query time.

In computer science, a y-fast trie is a data structure for storing integers from a bounded domain. It supports exact and predecessor or successor queries in time O(log log M), using O(n) space, where n is the number of stored values and M is the maximum value in the domain. The structure was proposed by Dan Willard in 1982 to decrease the O(n log M) space used by an x-fast trie.

In computer science, hash tree may refer to:

The HAT-Trie is a type of radix trie that uses array nodes to collect individual key–value pairs under radix nodes and hash buckets into an associative array. Unlike a simple hash table, HAT-tries store key–value in an ordered collection. The original inventors are Nikolas Askitis and Ranjan Sinha. Dr. Askitis shows that building and accessing the HAT-trie key/value collection is considerably faster than other sorted access methods and is comparable to the Array Hash which is an unsorted collection. This is due to the cache-friendly nature of the data structure which attempts to group access to data in time and space into the 64 byte cache line size of the modern CPU.

References

  1. 1 2 Phil Bagwell (2000). Ideal Hash Trees (PDF) (Report). Infoscience Department, École Polytechnique Fédérale de Lausanne.