Extendible hashing

Last updated

Extendible hashing is a type of hash system which treats a hash as a bit string and uses a trie for bucket lookup. [1] Because of the hierarchical nature of the system, re-hashing is an incremental operation (done one bucket at a time, as needed). This means that time-sensitive applications are less affected by table growth than by standard full-table rehashes.

Contents

Extendible hashing was described by Ronald Fagin in 1979. Practically all modern filesystems use either extendible hashing or B-trees. In particular, the Global File System, ZFS, and the SpadFS filesystem use extendible hashing. [2]

Example

Assume that the hash function returns a string of bits. The first bits of each string will be used as indices to figure out where they will go in the "directory" (hash table), where is the smallest number such that the index of every item in the table is unique.

Keys to be used:

Let's assume that for this particular example, the bucket size is 1. The first two keys to be inserted, k1 and k2, can be distinguished by the most significant bit, and would be inserted into the table as follows:

Extendible hashing 1.svg

Now, if k3 were to be hashed to the table, it wouldn't be enough to distinguish all three keys by one bit (because both k3 and k1 have 1 as their leftmost bit). Also, because the bucket size is one, the table would overflow. Because comparing the first two most significant bits would give each key a unique location, the directory size is doubled as follows:

Extendible hashing 2.svg

And so now k1 and k3 have a unique location, being distinguished by the first two leftmost bits. Because k2 is in the top half of the table, both 00 and 01 point to it because there is no other key to compare to that begins with a 0.

The above example is from Fagin et al. (1979).

Further detail

Now, k4 needs to be inserted, and it has the first two bits as 01..(1110), and using a 2 bit depth in the directory, this maps from 01 to Bucket A. Bucket A is full (max size 1), so it must be split; because there is more than one pointer to Bucket A, there is no need to increase the directory size.

What is needed is information about:

  1. The key size that maps the directory (the global depth), and
  2. The key size that has previously mapped the bucket (the local depth)

In order to distinguish the two action cases:

  1. Doubling the directory when a bucket becomes full
  2. Creating a new bucket, and re-distributing the entries between the old and the new bucket

Examining the initial case of an extendible hash structure, if each directory entry points to one bucket, then the local depth should be equal to the global depth.

The number of directory entries is equal to 2global depth, and the initial number of buckets is equal to 2local depth.

Thus if global depth = local depth = 0, then 20 = 1, so an initial directory of one pointer to one bucket.

Back to the two action cases; if the bucket is full:

  1. If the local depth is equal to the global depth, then there is only one pointer to the bucket, and there is no other directory pointers that can map to the bucket, so the directory must be doubled.
  2. If the local depth is less than the global depth, then there exists more than one pointer from the directory to the bucket, and the bucket can be split.
Extendible hashing 3.svg

Key 01 points to Bucket A, and Bucket A's local depth of 1 is less than the directory's global depth of 2, which means keys hashed to Bucket A have only used a 1 bit prefix (i.e. 0), and the bucket needs to have its contents split using keys 1 + 1 = 2 bits in length; in general, for any local depth d where d is less than D, the global depth, then d must be incremented after a bucket split, and the new d used as the number of bits of each entry's key to redistribute the entries of the former bucket into the new buckets.

Extendible hashing 4.svg

Now,

is tried again, with 2 bits 01.., and now key 01 points to a new bucket but there is still in it ( and also begins with 01).

If had been 000110, with key 00, there would have been no problem, because would have remained in the new bucket A' and bucket D would have been empty.

(This would have been the most likely case by far when buckets are of greater size than 1 and the newly split buckets would be exceedingly unlikely to overflow, unless all the entries were all rehashed to one bucket again. But just to emphasize the role of the depth information, the example will be pursued logically to the end.)

So Bucket D needs to be split, but a check of its local depth, which is 2, is the same as the global depth, which is 2, so the directory must be split again, in order to hold keys of sufficient detail, e.g. 3 bits.

Extendible hashing 5.svg
  1. Bucket D needs to split due to being full.
  2. As D's local depth = the global depth, the directory must double to increase bit detail of keys.
  3. Global depth has incremented after directory split to 3.
  4. The new entry is rekeyed with global depth 3 bits and ends up in D which has local depth 2, which can now be incremented to 3 and D can be split to D' and E.
  5. The contents of the split bucket D, , has been re-keyed with 3 bits, and it ends up in D'.
  6. K4 is retried and it ends up in E which has a spare slot.
Extendible hashing 6.svg

Now, is in D and is tried again, with 3 bits 011.., and it points to bucket D which already contains so is full; D's local depth is 2 but now the global depth is 3 after the directory doubling, so now D can be split into bucket's D' and E, the contents of D, has its retried with a new global depth bitmask of 3 and ends up in D', then the new entry is retried with bitmasked using the new global depth bit count of 3 and this gives 011 which now points to a new bucket E which is empty. So goes in Bucket E.

Example implementation

Below is the extendible hashing algorithm in Python, with the disc block / memory page association, caching and consistency issues removed. Note a problem exists if the depth exceeds the bit size of an integer, because then doubling of the directory or splitting of a bucket won't allow entries to be rehashed to different buckets.

The code uses the least significant bits , which makes it more efficient to expand the table, as the entire directory can be copied as one block (Ramakrishnan & Gehrke (2003)).

Python example

PAGE_SZ=10classPage:def__init__(self)->None:self.map=[]self.local_depth=0deffull(self)->bool:returnlen(self.map)>=PAGE_SZdefput(self,k,v)->None:fori,(key,value)inenumerate(self.map):ifkey==k:delself.map[i]breakself.map.append((k,v))defget(self,k):forkey,valueinself.map:ifkey==k:returnvaluedefget_local_high_bit(self):return1<<self.local_depthclassExtendibleHashing:def__init__(self)->None:self.global_depth=0self.directory=[Page()]defget_page(self,k):h=hash(k)returnself.directory[h&((1<<self.global_depth)-1)]defput(self,k,v)->None:p=self.get_page(k)full=p.full()p.put(k,v)iffull:ifp.local_depth==self.global_depth:self.directory*=2self.global_depth+=1p0=Page()p1=Page()p0.local_depth=p1.local_depth=p.local_depth+1high_bit=p.get_local_high_bit()fork2,v2inp.map:h=hash(k2)new_p=p1ifh&high_bitelsep0new_p.put(k2,v2)foriinrange(hash(k)&(high_bit-1),len(self.directory),high_bit):self.directory[i]=p1ifi&high_bitelsep0defget(self,k):returnself.get_page(k).get(k)if__name__=="__main__":eh=ExtendibleHashing()N=10088l=list(range(N))importrandomrandom.shuffle(l)forxinl:eh.put(x,x)print(l)foriinrange(N):print(eh.get(i))

Notes

  1. Fagin et al. (1979).
  2. Mikuláš Patocka (2006). Design and Implementation of the Spad Filesystem (PDF) (Thesis). Archived from the original (PDF) on 2016-03-15. Retrieved 2016-02-27. "Section 4.1.6 Extendible hashing: ZFS and GFS" and "Table 4.1: Directory organization in filesystems"

See also

Related Research Articles

<span class="mw-page-title-main">Hash function</span> Mapping arbitrary data to fixed-size values

A hash function is any function that can be used to map data of arbitrary size to fixed-size values, though there are some hash functions that support variable length output. The values returned by a hash function are called hash values, hash codes, hash digests, digests, or simply hashes. The values are usually used to index a fixed-size table called a hash table. Use of a hash function to index a hash table is called hashing or scatter storage addressing.

<span class="mw-page-title-main">Hash table</span> Associative array for storing key-value pairs

In computing, a hash table, also known as a hash map or a hash set, is a data structure that implements an associative array, also called a dictionary, which is an abstract data type that maps keys to values. A hash table uses a hash function to compute an index, also called a hash code, into an array of buckets or slots, from which the desired value can be found. During lookup, the key is hashed and the resulting hash indicates where the corresponding value is stored.

<span class="mw-page-title-main">Trie</span> K-ary search tree data structure

In computer science, a trie, also called digital tree or prefix tree, is a type of k-ary search tree, a tree data structure used for locating specific keys from within a set. These keys are most often strings, with links between nodes defined not by the entire key, but by individual characters. In order to access a key, the trie is traversed depth-first, following the links between nodes, which represent each character in the key.

In computer science, an associative array, map, symbol table, or dictionary is an abstract data type that stores a collection of pairs, such that each possible key appears at most once in the collection. In mathematical terms, an associative array is a function with finite domain. It supports 'lookup', 'remove', and 'insert' operations.

<span class="mw-page-title-main">Distributed hash table</span> Decentralized distributed system with lookup service

A distributed hash table (DHT) is a distributed system that provides a lookup service similar to a hash table. Key–value pairs are stored in a DHT, and any participating node can efficiently retrieve the value associated with a given key. The main advantage of a DHT is that nodes can be added or removed with minimum work around re-distributing keys. Keys are unique identifiers which map to particular values, which in turn can be anything from addresses, to documents, to arbitrary data. Responsibility for maintaining the mapping from keys to values is distributed among the nodes, in such a way that a change in the set of participants causes a minimal amount of disruption. This allows a DHT to scale to extremely large numbers of nodes and to handle continual node arrivals, departures, and failures.

<span class="mw-page-title-main">Perfect hash function</span> Hash function without any collisions

In computer science, a perfect hash functionh for a set S is a hash function that maps distinct elements in S to a set of m integers, with no collisions. In mathematical terms, it is an injective function.

Kademlia is a distributed hash table for decentralized peer-to-peer computer networks designed by Petar Maymounkov and David Mazières in 2002. It specifies the structure of the network and the exchange of information through node lookups. Kademlia nodes communicate among themselves using UDP. A virtual or overlay network is formed by the participant nodes. Each node is identified by a number or node ID. The node ID serves not only as identification, but the Kademlia algorithm uses the node ID to locate values.

A Bloom filter is a space-efficient probabilistic data structure, conceived by Burton Howard Bloom in 1970, that is used to test whether an element is a member of a set. False positive matches are possible, but false negatives are not – in other words, a query returns either "possibly in set" or "definitely not in set". Elements can be added to the set, but not removed ; the more items added, the larger the probability of false positives.

On the Amiga, the Old File System was the filesystem for AmigaOS before the Amiga Fast File System. Even though it used 512-byte blocks, it reserved the first small portion of each block for metadata, leaving an actual data block capacity of 488 bytes per block. It wasn't very suitable for anything except floppy disks, and it was soon replaced.

A B+ tree is an m-ary tree with a variable but often large number of children per node. A B+ tree consists of a root, internal nodes and leaves. The root may be either a leaf or a node with two or more children.

Linear hashing (LH) is a dynamic data structure which implements a hash table and grows or shrinks one bucket at a time. It was invented by Witold Litwin in 1980. It has been analyzed by Baeza-Yates and Soza-Pollman. It is the first in a number of schemes known as dynamic hashing such as Larson's Linear Hashing with Partial Extensions, Linear Hashing with Priority Splitting, Linear Hashing with Partial Expansions and Priority Splitting, or Recursive Linear Hashing.

<span class="mw-page-title-main">Cuckoo hashing</span> Data structure hashing scheme

Cuckoo hashing is a scheme in computer programming for resolving hash collisions of values of hash functions in a table, with worst-case constant lookup time. The name derives from the behavior of some species of cuckoo, where the cuckoo chick pushes the other eggs or young out of the nest when it hatches in a variation of the behavior referred to as brood parasitism; analogously, inserting a new key into a cuckoo hashing table may push an older key to a different location in the table.

A rolling hash is a hash function where the input is hashed in a window that moves through the input.

In computer science, dynamic perfect hashing is a programming technique for resolving collisions in a hash table data structure. While more memory-intensive than its hash table counterparts, this technique is useful for situations where fast queries, insertions, and deletions must be made on a large set of elements.

<span class="mw-page-title-main">Hopscotch hashing</span>

Hopscotch hashing is a scheme in computer programming for resolving hash collisions of values of hash functions in a table using open addressing. It is also well suited for implementing a concurrent hash table. Hopscotch hashing was introduced by Maurice Herlihy, Nir Shavit and Moran Tzafrir in 2008. The name is derived from the sequence of hops that characterize the table's insertion algorithm.

In computer science, tabulation hashing is a method for constructing universal families of hash functions by combining table lookup with exclusive or operations. It was first studied in the form of Zobrist hashing for computer games; later work by Carter and Wegman extended this method to arbitrary fixed-length keys. Generalizations of tabulation hashing have also been developed that can handle variable-length keys such as text strings.

A concurrent hash-trie or Ctrie is a concurrent thread-safe lock-free implementation of a hash array mapped trie. It is used to implement the concurrent map abstraction. It has particularly scalable concurrent insert and remove operations and is memory-efficient. It is the first known concurrent data-structure that supports O(1), atomic, lock-free snapshots.

F2FS is a flash file system initially developed by Samsung Electronics for the Linux kernel.

In computer science, a retrieval data structure, also known as static function, is a space-efficient dictionary-like data type composed of a collection of pairs that allows the following operations:

Spiral hashing, also known as Spiral Storage is an extensible hashing algorithm. As in all hashing schemes, spiral hashing stores records in a varying number of buckets, using a record key for addressing. In an expanding Linear hashing file, buckets are split in a predefined order. This results in adding a new bucket at the end of the file. While this allows gradual reorganization of the file, the expected number of records in the newly created bucket and the bucket from what it splits falls to half the previous number. Several attempts were made to alleviate this sudden drop in space utilization. Martin's spiral storage uses a different approach. The file consists of a number of continuously numbered buckets. The lower-numbered (left) buckets have a higher expected number of records. When the file expands, the left-most bucket is replaced by two buckets on the right. Some variants of this idea exist.

References