Quotient filter

Last updated October 27, 2023 • 11 min readFrom Wikipedia, The Free Encyclopedia

A quotient filter is a space-efficient probabilistic data structure used to test whether an element is a member of a set (an approximate membership query filter, AMQ). A query will elicit a reply specifying either that the element is definitely not in the set or that the element is probably in the set. The former result is definitive; i.e., the test does not generate false negatives. But with the latter result there is some probability, ε, of the test returning "element is in the set" when in fact the element is not present in the set (i.e., a false positive). There is a tradeoff between ε, the false positive rate, and storage size; increasing the filter's storage size reduces ε. Other AMQ operations include "insert" and "optionally delete". The more elements are added to the set, the larger the probability of false positives.

A typical application for quotient filters, and other AMQ filters, is to serve as a proxy for the keys in a database on disk. As keys are added to or removed from the database, the filter is updated to reflect this. Any lookup will first consult the fast quotient filter, then look in the (presumably much slower) database only if the quotient filter reported the presence of the key. If the filter returns absence, the key is known not to be in the database without any disk accesses having been performed.

A quotient filter has the usual AMQ operations of insert and query. In addition it can also be merged and re-sized without having to re-hash the original keys (thereby avoiding the need to access those keys from secondary storage). This property benefits certain kinds of log-structured merge-trees.

History

The compact hash table underlying a quotient filter was described by Cleary in 1984.^[1] First known reference to using the structure as an AMQ filter is by Pagh et al. in 2005.^[2] In 2009, Dillinger and Manolios optimized the structure's metadata, added in-place accommodation of more elements, and applied the structure to explicit-state model checking.^[3] In 2011, Bender et al. penned the name "quotient filter", described several variants with different metadata encoding trade-offs, showed how to merge and resize quotient filters, presented a write-optimized version of the quotient filter for use on disk, and applied the structure to database storage problems.^[4]^[5] In 2017, Pandey et al. described a version that uses hardware bit-manipulation instructions to improve performance, supports concurrent updates, and adds support for associating a variable-sized counter to each element.^[6]

Algorithm description

The quotient filter is based on a kind of hash table in which entries contain only a portion of the key plus some additional meta-data bits. These bits are used to deal with the case when distinct keys happen to hash to the same table entry. By way of contrast, other types of hash tables that deal with such collisions by linking to overflow areas are not compact because the overhead due to linkage can exceed the storage used to store the key.^[1] In a quotient filter a hash function generates a p-bit fingerprint. The r least significant bits is called the remainder while the q = p - r most significant bits is called the quotient, hence the name quotienting (coined by Knuth.^[7]) The hash table has 2^q slots.

For some key d which hashes to the fingerprint d_H, let its quotient be d_Q and the remainder be d_R. QF will try to store the remainder in slot d_Q, which is known as the canonical slot. However the canonical slot might already be occupied because multiple keys can hash to the same fingerprint—a hard collision—or because even when the keys' fingerprints are distinct they can have the same quotient—a soft collision. If the canonical slot is occupied then the remainder is stored in some slot to the right.

As described below, the insertion algorithm ensures that all fingerprints having the same quotient are stored in contiguous slots. Such a set of fingerprints is defined as a run.^[4] Note that a run's first fingerprint might not occupy its canonical slot if the run has been forced right by some run to the left.

However a run whose first fingerprint occupies its canonical slot indicates the start of a cluster.^[4] The initial run and all subsequent runs comprise the cluster, which terminates at an unoccupied slot or the start of another cluster.

The three additional bits are used to reconstruct a slot's fingerprint. They have the following function:

is_occupied: is set when a slot is the canonical slot for some key stored (somewhere) in the filter (but not necessarily in this slot).
is_continuation: is set when a slot is occupied but not by the first remainder in a run.
is_shifted: is set when the remainder in a slot is not in its canonical slot.

The various combinations have the following meaning:

is_occupied	is_continuation	is_shifted	meaning
0	0	0	Empty Slot
0	0	1	Slot is holding start of run that has been shifted from its canonical slot.
0	1	0	not used.
0	1	1	Slot is holding continuation of run that has been shifted from its canonical slot.
1	0	0	Slot is holding start of run that is in its canonical slot. This is also the start of the cluster.
1	0	1	Slot is holding start of run that has been shifted from its canonical slot. Also the run for which this is the canonical slot exists but is shifted right.
1	1	0	not used.
1	1	1	Slot is holding continuation of run that has been shifted from its canonical slot. Also the run for which this is the canonical slot exists but is shifted right.

Lookup

We can test if a quotient filter contains some key, d, as follows.^[4]

We hash the key to produce its fingerprint, d_H, which we then partition into its high-order q bits, d_Q, which comprise its quotient, and its low-order r bits, d_R, which comprise its remainder. Slot d_Q is the key's canonical slot. That slot is empty if its three meta-data bits are false. In that case the filter does not contain the key.

If the canonical slot is occupied then we must locate the quotient's run. The set of slots that hold remainders belonging to the same quotient are stored contiguously and these comprise the quotient's run. The first slot in the run might be the canonical slot but it is also possible that the entire run has been shifted to the right by the encroachment from the left of another run.

To locate the quotient's run we must first locate the start of the cluster. The cluster consists of a contiguous set of runs. Starting with the quotient's canonical slot we can scan left to locate the start of the cluster, then scan right to locate the quotient's run.

We scan left looking for a slot with is_shifted is false. This indicates the start of the cluster. Then we scan right keeping a running count of the number of runs we must skip over. Each slot to the left of the canonical slot having is_occupiedset indicates another run to be skipped, so we increment the running count. Each slot having is_continuationclear indicates the start of another run, thus the end of the previous run, so we decrement the running count. When the running count reaches zero, we are scanning the quotient's run. We can compare the remainder in each slot in the run with d_R. If found, we report that the key is (probably) in the filter otherwise we report that the key is definitely not in the filter.

Lookup example

Take, for example, looking up element e. See state 3 in the figure. We would compute hash(e), partition it into its remainder, e_R and its quotient e_Q, which is 4. Scanning left from slot 4 we encounter three is_occupied slots, at indexes 4, 2 and 1, indicating e_Q's run is the 3rd run in the cluster. The scan stops at slot 1, which we detect as the cluster's start because it is not empty and not shifted. Now we must scan right to the 3rd run. The start of a run is indicated by is_continuation being false. The 1st run is found at index 1, the 2nd at 4 and the 3rd at 5. We compare the remainder held in each slot in the run that starts at index 5. There is only one slot in that run but, in our example, its remainder equals e_R, indicating that e is indeed a member of the filter, with probability 1 - ε.

Insertion

Insertion follows a path similar to lookup until we ascertain that the key is definitely not in the filter.^[4] At that point we insert the remainder in a slot in the current run, a slot chosen to keep the run in sorted order. We shift forward the remainders in any slots in the cluster at or after the chosen slot and update the slot bits.

Shifting a slot's remainder does not affect the slot's is_occupied bit because it pertains to the slot, not the remainder contained in the slot.
If we insert a remainder at the start of an existing run, the previous remainder is shifted and becomes a continuation slot, so we set its is_continuation bit.
We set the is_shifted bit of any remainder that we shift.

Insertion example

The figure shows a quotient filter proceeding through a series of states as elements are added. In state 1 three elements have been added. The slot each one occupies forms a one-slot run which is also a distinct cluster.

In state 2 elements c and d have been added. Element c has a quotient of 1, the same as b. We assume b_R < c_R so c_R is shifted into slot 2, and is marked as both a continuation and shifted. Element d has a quotient of 2. Since its canonical slot is in use, it is shifted into slot 3, and is marked as shifted. In addition its canonical slot is marked as occupied. The runs for quotients 1 and 2 now comprise a cluster.

In state 3 element a has been added. Its quotient is 1. We assume a_R < b_R so the remainders in slots 1 through 4 must be shifted. Slot 2 receives b_R and is marked as a continuation and shifted. Slot 5 receives e_R and is marked as shifted. The runs for quotients 1, 2 and 4 now comprise a cluster, and the presence of those three runs in the cluster is indicated by having slots 1, 2 and 4 being marked as occupied.

Cost/performance

Cluster length

Bender^[4] argues that clusters are small. This is important because lookups and inserts require locating the start and length of an entire cluster. If the hash function generates uniformly distributed fingerprints then the length of most runs is O(1) and it is highly likely that all runs have length O(log m) where m is the number of slots in the table.^[4]

Probability of false positives

Bender^[4] calculates the probability of a false positive (i.e. when the hash of two keys results in the same fingerprint) in terms of the hash table's remainder size and load factor. Recall that a p bit fingerprint is partitioned into a q bit quotient, which determines the table size of m = 2^q slots, and a r bit remainder. The load factor $\alpha$ is the proportion of occupied slots n to total slots m: $\alpha =n/m$ . Then, for a good hash function, $1-e^{-\alpha /2^{r}}\leq 2^{-r}$ is approximately the probability of a hard collision.

Space/performance

Pandey's version of the quotient filter requires less space than a comparable Bloom filter when the target false-positive rate is less than 1/64.^[6]

Application

Quotient filters are AMQs and, as such, provide many of the same benefits as Bloom filters. A large database, such as Webtable^[8] may be composed of smaller sub-tables each of which has an associated filter. Each query is distributed concurrently to all sub-tables. If a sub-table does not contain the requested element, its filter can quickly complete the request without incurring any I/O.

Quotient filters offer two benefits in some applications.

Two quotient filters can be efficiently merged without affecting their false positive rates. This is not possible with Bloom filters.
A few duplicates can be tolerated efficiently and can be deleted.

The space used by quotient filters is comparable to that of Bloom filters. However quotient filters can be efficiently merged within memory without having to re-insert the original keys.

This is particularly important in some log structured storage systems that use the log-structured merge-tree or LSM-tree.^[9] The LSM-tree is actually a collection of trees but which is treated as a single key-value store. One variation of the LSM-Tree is the Sorted Array Merge Tree or SAMT.^[10] In this variation, a SAMT's component trees are called Wanna-B-trees. Each Wanna-B-tree has an associated quotient filter. A query on the SAMT is directed at only select Wanna-B-trees as evidenced by their quotient filters.

The storage system in its normal operation compacts the SAMT's Wanna-B-trees, merging smaller Wanna-B-trees into larger ones and merging their quotient filters. An essential property of quotient filters is that they can be efficiently merged without having to re-insert the original keys. Given that for large data sets the Wanna-B-trees may not be in memory, accessing them to retrieve the original keys would incur many I/Os.

By construction the values in a quotient filter are stored in sorted order. Each run is associated with a specific quotient value, which provides the most significant portion of the fingerprint, the runs are stored in order and each slot in the run provides the least significant portion of the fingerprint.

So, by working from left to right, one can reconstruct all the fingerprints and the resulting list of integers will be in sorted order. Merging two quotient filters is then a simple matter of converting each quotient filter into such a list, merging the two lists and using it to populate a new larger quotient filter. Similarly, we can halve or double the size of a quotient filter without rehashing the keys since the fingerprints can be recomputed using just the quotients and remainders.^[4]

Notes

1 2 Cleary, John G. (September 1984). "Compact hash tables using bidirectional linear probing". IEEE Transactions on Computers. 33 (9): 828–834. doi:10.1109/TC.1984.1676499. S2CID 195908955.
↑ Pagh, Anna; Pagh, Rasmus; Rao, S. Srinivasa (2005). "An optimal Bloom filter replacement" (PDF). Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms. pp. 823–829. Archived from the original (PDF) on 2012-02-04. Retrieved 2019-01-22.
↑ Dillinger, Peter C.; Manolios, Panagiotis (2009). "Fast, All-Purpose State Storage". 16th International SPIN Workshop on Model Checking Software. Springer, Lecture Notes in Computer Science 5578.
1 2 3 4 5 6 7 8 9 Bender, Michael A.; Farach-Colton, Martin; Johnson, Rob; Kuszmaul, Bradley C.; Medjedovic, Dzejla; Montes, Pablo; Shetty, Pradeep; Spillane, Richard P.; Zadok, Erez (June 2011). "Don't thrash: how to cache your hash on flash" (PDF). Proceedings of the 3rd USENIX conference on Hot topics in storage and file systems (HotStorage'11). Retrieved 21 July 2012.
↑ Bender, Michael A.; Farach-Colton, Martin; Johnson, Rob; Kraner, Russell; Kuszmaul, Bradley C.; Medjedovic, Dzejla; Montes, Pablo; Shetty, Pradeep; Spillane, Richard P.; Zadok, Erez (27–31 August 2012). "Don't thrash: how to cache your hash on flash" (PDF). Proceedings of the VLDB Endowment. 5 (11): 1627–1637. arXiv: 1208.0290 . Bibcode:2012arXiv1208.0290B. doi:10.14778/2350229.2350275. S2CID 47180056.
1 2 Pandey, Prashant; Bender, Michael A.; Johnson, Rob; Patro, Rob (May 2017). "A General-Purpose Counting Filter: Making Every Bit Count". Proceedings of the 2017 ACM International Conference on Management of Data (SIGMOD '17). doi: 10.1145/3035918.3035963 .
↑ Knuth, Donald (1973). The Art of Computer Programming:Searching and Sorting, volume 3. Section 6.4, exercise 13: Addison Wesley.{{cite book}}: CS1 maint: location (link)
↑ Chang, Fay; et al. (2006). "Bigtable: A Distributed Storage System for Structured Data" (PDF). OSDI '06: Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation: 15. Retrieved 21 July 2012.
↑ O'Neil, Patrick; et al. (1996). "The log-structured merge-tree (LSM-tree)". Acta Informatica. 33 (4): 351–385. doi:10.1007/s002360050048. S2CID 12627452.
↑ Spillane, Richard (May 2012). "Efficient, Scalable, and Versatile Application and System Transaction Management for Direct Storage Layers" (PDF). Retrieved 21 July 2012.{{cite journal}}: Cite journal requires |journal= (help)

External links

Media related to Quotient filter at Wikimedia Commons

Related Research Articles

In computer science, an array is a data structure consisting of a collection of elements, of same memory size, each identified by at least one array index or key. An array is stored such that the position of each element can be computed from its index tuple by a mathematical formula. The simplest type of data structure is a linear array, also called one-dimensional array.

In computer science, binary search, also known as half-interval search, logarithmic search, or binary chop, is a search algorithm that finds the position of a target value within a sorted array. Binary search compares the target value to the middle element of the array. If they are not equal, the half in which the target cannot lie is eliminated and the search continues on the remaining half, again taking the middle element to compare to the target value, and repeating this until the target value is found. If the search ends with the remaining half being empty, the target is not in the array.

A hash function is any function that can be used to map data of arbitrary size to fixed-size values, though there are some hash functions that support variable length output. The values returned by a hash function are called hash values, hash codes, digests, or simply hashes. The values are usually used to index a fixed-size table called a hash table. Use of a hash function to index a hash table is called hashing or scatter storage addressing.

In computing, a hash table, also known as a hash map, is a data structure that implements an associative array, also called a dictionary, which is an abstract data type that maps keys to values. A hash table uses a hash function to compute an index, also called a hash code, into an array of buckets or slots, from which the desired value can be found. During lookup, the key is hashed and the resulting hash indicates where the corresponding value is stored.

In computer science, a trie, also called digital tree or prefix tree, is a type of k-ary search tree, a tree data structure used for locating specific keys from within a set. These keys are most often strings, with links between nodes defined not by the entire key, but by individual characters. In order to access a key, the trie is traversed depth-first, following the links between nodes, which represent each character in the key.

In computer science, a set is an abstract data type that can store unique values, without any particular order. It is a computer implementation of the mathematical concept of a finite set. Unlike most other collection types, rather than retrieving a specific element from a set, one typically tests a value for membership in a set.

A Bloom filter is a space-efficient probabilistic data structure, conceived by Burton Howard Bloom in 1970, that is used to test whether an element is a member of a set. False positive matches are possible, but false negatives are not – in other words, a query returns either "possibly in set" or "definitely not in set". Elements can be added to the set, but not removed ; the more items added, the larger the probability of false positives.

A van Emde Boas tree, also known as a vEB tree or van Emde Boas priority queue, is a tree data structure which implements an associative array with $m$ -bit integer keys. It was invented by a team led by Dutch computer scientist Peter van Emde Boas in 1975. It performs all operations in $O (log m)$ time, or equivalently in $O (log log M)$ time, where $M = 2 m$ is the largest element that can be stored in the tree. The parameter $M$ is not to be confused with the actual number of elements stored in the tree, by which the performance of other tree data-structures is often measured.

A bit array is an array data structure that compactly stores bits. It can be used to implement a simple set data structure. A bit array is effective at exploiting bit-level parallelism in hardware to perform operations quickly. A typical bit array stores kw bits, where w is the number of bits in the unit of storage, such as a byte or word, and k is some nonnegative integer. If w does not divide the number of bits to be stored, some space is wasted due to internal fragmentation.

<span class="mw-page-title-main">Open addressing</span> Hash collision resolution technique

Open addressing, or closed hashing, is a method of collision resolution in hash tables. With this method a hash collision is resolved by probing, or searching through alternative locations in the array until either the target record is found, or an unused array slot is found, which indicates that there is no such key in the table. Well-known probe sequences include:

A database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of additional writes and storage space to maintain the index data structure. Indexes are used to quickly locate data without having to search every row in a database table every time said table is accessed. Indexes can be created using one or more columns of a database table, providing the basis for both rapid random lookups and efficient access of ordered records.

A rainbow table is a precomputed table for caching the outputs of a cryptographic hash function, usually for cracking password hashes. Passwords are typically stored not in plain text form, but as hash values. If such a database of hashed passwords falls into the hands of an attacker, they can use a precomputed rainbow table to recover the plaintext passwords. A common defense against this attack is to compute the hashes using a key derivation function that adds a "salt" to each password before hashing it, with different passwords receiving different salts, which are stored in plain text along with the hash.

In computer science, consistent hashing is a special kind of hashing technique such that when a hash table is resized, only $keys need to be remapped on average where is the number of keys and is the number of slots. In contrast, in most traditional hash tables, a change in the number of array slots causes nearly all keys to be remapped because the mapping between the keys and the slots is defined by a modular operation.$

Cuckoo hashing is a scheme in computer programming for resolving hash collisions of values of hash functions in a table, with worst-case constant lookup time. The name derives from the behavior of some species of cuckoo, where the cuckoo chick pushes the other eggs or young out of the nest when it hatches in a variation of the behavior referred to as brood parasitism; analogously, inserting a new key into a cuckoo hashing table may push an older key to a different location in the table.

F2FS is a flash file system initially developed by Samsung Electronics for the Linux kernel.

A cuckoo filter is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set, like a Bloom filter does. False positive matches are possible, but false negatives are not – in other words, a query returns either "possibly in set" or "definitely not in set". A cuckoo filter can also delete existing items, which is not supported by Bloom filters. In addition, for applications that store many items and target moderately low false positive rates, cuckoo filters can achieve lower space overhead than space-optimized Bloom filters.

In computer science, a retrieval data structure, also known as static function, is a space-efficient dictionary-like data type composed of a collection of pairs that allows the following operations:

Approximate Membership Query Filter (AMQ-Filter) is a group of space-efficient probabilistic data structures that supports approximate membership queries. An approximate membership query answers if an element is in a set or not with a false positive rate of $.$

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[Cleary-1] 1 2 Cleary, John G. (September 1984). "Compact hash tables using bidirectional linear probing". IEEE Transactions on Computers. 33 (9): 828–834. doi:10.1109/TC.1984.1676499. S2CID 195908955.

[Pagh-2] Pagh, Anna; Pagh, Rasmus; Rao, S. Srinivasa (2005). "An optimal Bloom filter replacement" (PDF). Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms. pp. 823–829. Archived from the original (PDF) on 2012-02-04. Retrieved 2019-01-22.

[Dillinger-3] Dillinger, Peter C.; Manolios, Panagiotis (2009). "Fast, All-Purpose State Storage". 16th International SPIN Workshop on Model Checking Software. Springer, Lecture Notes in Computer Science 5578.

[Bender-4] 1 2 3 4 5 6 7 8 9 Bender, Michael A.; Farach-Colton, Martin; Johnson, Rob; Kuszmaul, Bradley C.; Medjedovic, Dzejla; Montes, Pablo; Shetty, Pradeep; Spillane, Richard P.; Zadok, Erez (June 2011). "Don't thrash: how to cache your hash on flash" (PDF). Proceedings of the 3rd USENIX conference on Hot topics in storage and file systems (HotStorage'11). Retrieved 21 July 2012.

[5] Bender, Michael A.; Farach-Colton, Martin; Johnson, Rob; Kraner, Russell; Kuszmaul, Bradley C.; Medjedovic, Dzejla; Montes, Pablo; Shetty, Pradeep; Spillane, Richard P.; Zadok, Erez (27–31 August 2012). "Don't thrash: how to cache your hash on flash" (PDF). Proceedings of the VLDB Endowment. 5 (11): 1627–1637. arXiv: 1208.0290 . Bibcode:2012arXiv1208.0290B. doi:10.14778/2350229.2350275. S2CID 47180056.

[Pandey-6] 1 2 Pandey, Prashant; Bender, Michael A.; Johnson, Rob; Patro, Rob (May 2017). "A General-Purpose Counting Filter: Making Every Bit Count". Proceedings of the 2017 ACM International Conference on Management of Data (SIGMOD '17). doi: 10.1145/3035918.3035963 .

[7] Knuth, Donald (1973). The Art of Computer Programming:Searching and Sorting, volume 3. Section 6.4, exercise 13: Addison Wesley.{{cite book}}: CS1 maint: location (link)

[8] Chang, Fay; et al. (2006). "Bigtable: A Distributed Storage System for Structured Data" (PDF). OSDI '06: Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation: 15. Retrieved 21 July 2012.

[9] O'Neil, Patrick; et al. (1996). "The log-structured merge-tree (LSM-tree)". Acta Informatica. 33 (4): 351–385. doi:10.1007/s002360050048. S2CID 12627452.

[Spillane-10] Spillane, Richard (May 2012). "Efficient, Scalable, and Versatile Application and System Transaction Management for Direct Storage Layers" (PDF). Retrieved 21 July 2012.{{cite journal}}: Cite journal requires |journal= (help)

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]