Non-cryptographic hash function

Last updated December 14, 2024

Non-cryptographic hash functions (NCHFs^[1]) are hash functions intended for applications that do not need the rigorous security guarantees of cryptographic hash functions (e.g., preimage resistance) and can therefore be faster and less resource-intensive.^[2] Typical examples of CPU-optimized non-cryptographic hashes include FNV-1a and Murmur3.^[3] Some non-cryptographic hash functions are used in cryptographic applications, usually in combination with other cryptographic primitives; in this case they are described as universal hash functions.^[4]

Applications and requirements

Typical uses of non-cryptographic hash functions include Bloom filters, hash tables, and count sketches. In addition to speed, these applications require uniform distribution and good avalanche properties.^[3] Collision resistance is an additional feature that helps against hash flooding attacks; simple NCHFs, like the cyclic redundancy check (CRC), have essentially no collision resistance^[5] and thus cannot be used with input that is open to manipulation by an attacker.
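The weakness of CRC against a manipulating attacker can be traced to its linearity. The following is a minimal sketch, assuming Python's standard `zlib.crc32` as the CRC under test; the helper `xor_bytes` and the three sample messages are illustrative and do not come from the cited sources. It checks that, for three messages of equal length, the CRC of their XOR equals the XOR of their CRCs.

```python
import zlib


def xor_bytes(*chunks: bytes) -> bytes:
    """XOR equal-length byte strings together, position by position."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        if len(chunk) != len(out):
            raise ValueError("all inputs must have the same length")
        for i, byte in enumerate(chunk):
            out[i] ^= byte
    return bytes(out)


# Three arbitrary messages of the same length (10 bytes each).
a, b, c = b"GET /index", b"POST /form", b"HEAD /ping"

# CRC-32 is affine over GF(2) for a fixed message length, so the
# length-dependent constant cancels when an odd number of CRCs is XORed.
lhs = zlib.crc32(xor_bytes(a, b, c))
rhs = zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)
print(lhs == rhs)
```

Because the effect of flipping any input bit on the checksum is predictable in this way, inputs with a chosen CRC can be constructed by solving linear equations rather than by searching.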

NCHFs are used in diverse systems: lexical analyzers, compilers, databases, communication networks, video games, DNS servers, and filesystems; in short, anywhere in computing where information must be found very quickly (preferably in O(1) time, which also achieves perfect scalability).^[6]

Estébanez et al. list the "most important" NCHFs:^[7]

The Fowler–Noll–Vo hash function (FNV) was created by Glenn Fowler and Phong Vo in 1991 with contributions from Landon Curt Noll. FNV, with its two variants FNV-1 and FNV-1a, is very widely used: in the Linux and FreeBSD operating systems, DNS servers, NFS, Twitter, PlayStation 2, and Xbox, among others.
lookup3 was created by Robert Jenkins. This hash is also widely used and can be found in PostgreSQL, Linux, Perl, Ruby, and Infoseek.
SuperFastHash was created by Paul Hsieh using ideas from FNV and lookup3, with one of the goals being a high degree of avalanche effect. The hash is used in WebKit (part of Safari and Google Chrome).
MurmurHash2 was created by Austin Appleby in 2008 and is used in libmemcached, Maatkit, and Apache Hadoop.
DJBX33A ("Daniel J. Bernstein, Times 33 with Addition"). This very simple multiplication-and-addition function was proposed by Daniel J. Bernstein. It is fast and efficient during initialization. Many programming environments based on PHP 5, Python, and ASP.NET use variants of this hash. The hash is easy to flood, which exposes servers that use it to hash flooding attacks.
BuzHash was created by Robert Uzgalis in 1992. It is designed around a substitution table and can tolerate extremely skewed distributions on the input.
DEK is an early multiplicative hash based on a proposal by Donald Knuth and is one of the oldest hashes that is still in use.
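Two of the simplest functions in the list above, FNV-1a and DJBX33A, can be sketched in a few lines. The version below is a minimal illustration rather than a reference implementation: it uses the commonly published 32-bit FNV offset basis and prime, and it assumes the conventional starting value 5381 and a 32-bit result width for DJBX33A, neither of which is stated in the text above.

```python
MASK32 = 0xFFFFFFFF              # keep results within 32 bits

FNV32_OFFSET_BASIS = 0x811C9DC5  # commonly published 32-bit FNV offset basis
FNV32_PRIME = 0x01000193         # commonly published 32-bit FNV prime


def fnv1a_32(data: bytes) -> int:
    """FNV-1a: XOR each byte into the state, then multiply by the prime."""
    h = FNV32_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV32_PRIME) & MASK32
    return h


def djbx33a(data: bytes, seed: int = 5381) -> int:
    """DJBX33A: multiply the state by 33, then add the next byte.

    The seed 5381 is an assumption (the value conventionally used).
    """
    h = seed
    for byte in data:
        h = (h * 33 + byte) & MASK32
    return h


for key in (b"", b"a", b"foobar"):
    print(key, hex(fnv1a_32(key)), hex(djbx33a(key)))
```

Both loops do one multiplication per input byte, which is what makes these functions cheap in software. Neither uses a secret, so anyone who knows the function can precompute colliding keys.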

Design

Non-cryptographic hash functions optimized for software frequently rely on multiplication. Since multiplication in hardware is resource-intensive and limits the clock frequency, ASIC-friendlier designs have been proposed, including SipHash (which has the additional benefit of accepting a secret key for message authentication), NSGAhash, and XORhash. Although lightweight cryptography can technically be used for the same applications, the latency of its algorithms is usually too high because of the large number of rounds.^[3] Sateesan et al. propose using reduced-round versions of lightweight hashes and ciphers as non-cryptographic hash functions.^[2]
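To make the contrast with multiplication-based designs concrete, the sketch below shows a hash built only from additions, constant shifts, and XORs. It uses Bob Jenkins's one-at-a-time hash, which is not one of the designs cited above and is chosen here purely as a familiar multiplication-free example; no claim is made about its suitability for ASICs.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit unsigned arithmetic


def one_at_a_time(data: bytes) -> int:
    """Jenkins one-at-a-time hash: add, shift, and XOR only."""
    h = 0
    for byte in data:
        h = (h + byte) & MASK32
        h = (h + (h << 10)) & MASK32  # shift-and-add instead of a multiply
        h ^= h >> 6
    # Final mixing so that the last bytes affect all output bits.
    h = (h + (h << 3)) & MASK32
    h ^= h >> 11
    h = (h + (h << 15)) & MASK32
    return h


print(hex(one_at_a_time(b"a")))
```

Shifts by a constant cost nothing in hardware beyond wiring, so the per-byte work in a function of this shape reduces to adders and XOR gates.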

Many NCHFs have a relatively small result size (e.g., 64 bits for SipHash, or even less): a larger result does not improve the performance of the target applications but slows down the calculation, as more bits need to be generated.^[8]
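A small Bloom filter illustrates why a 64-bit result is already more than such an application consumes. The sketch below is an assumption-laden example, not taken from the cited sources: it uses 64-bit FNV-1a with its commonly published constants, and it derives all probe positions from the two 32-bit halves of one hash value (a double-hashing scheme swapped in here for simplicity). The class name `BloomFilter` and its parameters are hypothetical.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

FNV64_OFFSET_BASIS = 0xCBF29CE484222325  # commonly published 64-bit constants
FNV64_PRIME = 0x100000001B3


def fnv1a_64(data: bytes) -> int:
    """64-bit FNV-1a: XOR each byte in, then multiply by the prime."""
    h = FNV64_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV64_PRIME) & MASK64
    return h


class BloomFilter:
    """Hypothetical minimal Bloom filter backed by a byte array."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes):
        h = fnv1a_64(item)
        h1 = h & 0xFFFFFFFF   # low half: starting position
        h2 = (h >> 32) | 1    # high half: stride, forced odd so it is never zero
        for i in range(self.num_hashes):
            # With 1024 bits, each probe needs only 10 bits of position.
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


bf = BloomFilter()
bf.add(b"alice")
print(b"alice" in bf)
```

A single 64-bit hash value feeds every probe here; a wider result would cost more to compute without changing which bits the filter sets.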

References

[1] Estébanez et al. 2013.
[2] Sateesan et al. 2023, p. 1.
[3] Sateesan et al. 2023, p. 2.
[4] Mittelbach & Fischlin 2021, p. 303.
[5] Stamp 2011.
[6] Estébanez et al. 2013, p. 1.
[7] Estébanez et al. 2013, pp. 3–4.
[8] Patgiri, Nayak & Muppalaneni 2023, pp. 37–38.

Sources

Sateesan, Arish; Biesmans, Jelle; Claesen, Thomas; Vliegen, Jo; Mentens, Nele (April 2023). "Optimized algorithms and architectures for fast non-cryptographic hash functions in hardware". Microprocessors and Microsystems. 98: 104782. doi:10.1016/j.micpro.2023.104782. ISSN 0141-9331.
Estébanez, César; Saez, Yago; Recio, Gustavo; Isasi, Pedro (28 January 2013). "Performance of the most common non-cryptographic hash functions" (PDF). Software: Practice and Experience. 44 (6): 681–698. doi:10.1002/spe.2179. ISSN 0038-0644.
Stamp, Mark (8 November 2011). "Non-Cryptographic Hashes". Information Security: Principles and Practice (2 ed.). John Wiley & Sons. ISBN 978-1-118-02796-7. OCLC 1039294381.
Patgiri, Ripon; Nayak, Sabuzima; Muppalaneni, Naresh Babu (25 April 2023). Bloom Filter: A Data Structure for Computer Networking, Big Data, Cloud Computing, Internet of Things, Bioinformatics and Beyond. Academic Press. pp. 37–38. ISBN 978-0-12-823646-8. OCLC 1377693258.
Mittelbach, Arno; Fischlin, Marc (2021). "Non-cryptographic Hashing". The Theory of Hash Functions and Random Oracles. Cham: Springer International Publishing. pp. 303–334. doi:10.1007/978-3-030-63287-8_7. ISBN 978-3-030-63286-1.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.