Non-cryptographic hash function

Last updated

The non-cryptographic hash functions (NCHFs [1] ) are hash functions intended for applications that do not need the rigorous security requirements of the cryptographic hash functions (e.g., preimage resistance) and therefore can be faster and less resource-intensive. [2] Typical examples of CPU-optimized non-cryptographic hashes include FNV-1a, Murmur3. [3] Some of the non-cryptographic hash functions are used in the cryptographic applications (usually in combination with other cryptographic primitives), in this case they are described as universal hash functions. [4]

Contents

Applications and requirements

Among the typical uses of the non-cryptographic hash functions are bloom filters, hash tables, and count sketches. These applications require, in addition to speed, uniform distribution and avalanche properties. [3] Collision resistance is an additional feature that can be useful against hash flooding attacks; simple NCHFs, like the Cyclic redundancy check (CRC), have essentially no collision resistance [5] and thus cannot be used with an input open to manipulation by an attacker.

NCHFs are used in diverse systems: lexical analyzers, compilers, databases, communication networks, videogames, DNS servers, filesystems - anywhere in computing where there is a need to find the information very quickly (preferably in the O(1) time, which will also achieve perfect scalability). [6]

Estébanez et al. list the "most important" NCHFs: [7]

Design

Non-cryptographic hash functions optimized for software frequently involve the multiplication operation. Since in hardware multiplication is resource-intensive and frequency-limiting, ASIC-friendlier designs had been proposed, including SipHash (that has an additional benefit of being able to use a secret key for message authentication), NSGAhash and XORhash. Although technically the lightweight cryptography can be used for the same applications, the latency of its algorithms is usually way too high due to a large number of rounds. [3] Sateesan et al. propose using the reduced-round versions of the lightweight hashes and ciphers as non-cryptographic hash functions. [2]

Many NCHFs have a relatively small result size (e.g., 64 bits for SipHash or even less): large result size does not increase the performance of the target applications, but slows down the calculation, as more bits need to be generated. [8]

See also

Related Research Articles

<span class="mw-page-title-main">Hash function</span> Mapping arbitrary data to fixed-size values

A hash function is any function that can be used to map data of arbitrary size to fixed-size values, though there are some hash functions that support variable length output. The values returned by a hash function are called hash values, hash codes, hash digests, digests, or simply hashes. The values are usually used to index a fixed-size table called a hash table. Use of a hash function to index a hash table is called hashing or scatter storage addressing.

The MD5 message-digest algorithm is a widely used hash function producing a 128-bit hash value. MD5 was designed by Ronald Rivest in 1991 to replace an earlier hash function MD4, and was specified in 1992 as RFC 1321.

<span class="mw-page-title-main">Daniel J. Bernstein</span> American mathematician, cryptologist and computer scientist

Daniel Julius Bernstein is an American mathematician, cryptologist, and computer scientist. He is a visiting professor at CASA at Ruhr University Bochum, as well as a research professor of Computer Science at the University of Illinois at Chicago. Before this, he was a visiting professor in the department of mathematics and computer science at the Eindhoven University of Technology.

<span class="mw-page-title-main">Cryptographic hash function</span> Hash function that is suitable for use in cryptography

A cryptographic hash function (CHF) is a hash algorithm that has special properties desirable for a cryptographic application:

<span class="mw-page-title-main">Key derivation function</span> Function that derives secret keys from a secret value

In cryptography, a key derivation function (KDF) is a cryptographic algorithm that derives one or more secret keys from a secret value such as a master key, a password, or a passphrase using a pseudorandom function. KDFs can be used to stretch keys into longer keys or to obtain keys of a required format, such as converting a group element that is the result of a Diffie–Hellman key exchange into a symmetric key for use with AES. Keyed cryptographic hash functions are popular examples of pseudorandom functions used for key derivation.

In cryptography, a timing attack is a side-channel attack in which the attacker attempts to compromise a cryptosystem by analyzing the time taken to execute cryptographic algorithms. Every logical operation in a computer takes time to execute, and the time can differ based on the input; with precise measurements of the time for each operation, an attacker can work backwards to the input. Finding secrets through timing information may be significantly easier than using cryptanalysis of known plaintext, ciphertext pairs. Sometimes timing information is combined with cryptanalysis to increase the rate of information leakage.

In cryptography, a message authentication code (MAC), sometimes known as an authentication tag, is a short piece of information used for authenticating and integrity-checking a message. In other words, to confirm that the message came from the stated sender and has not been changed. The MAC value allows verifiers to detect any changes to the message content.

In cryptography, a collision attack on a cryptographic hash tries to find two inputs producing the same hash value, i.e. a hash collision. This is in contrast to a preimage attack where a specific target hash value is specified.

Fowler–Noll–Vo is a non-cryptographic hash function created by Glenn Fowler, Landon Curt Noll, and Kiem-Phong Vo.

In cryptography, PBKDF1 and PBKDF2 are key derivation functions with a sliding computational cost, used to reduce vulnerability to brute-force attacks.

In cryptography, Galois/Counter Mode (GCM) is a mode of operation for symmetric-key cryptographic block ciphers which is widely adopted for its performance. GCM throughput rates for state-of-the-art, high-speed communication channels can be achieved with inexpensive hardware resources.

In cryptography, a T-function is a bijective mapping that updates every bit of the state in a way that can be described as , or in simple words an update function in which each bit of the state is updated by a linear combination of the same bit and a function of a subset of its less significant bits. If every single less significant bit is included in the update of every bit in the state, such a T-function is called triangular. Thanks to their bijectivity regardless of the used Boolean functions and regardless of the selection of inputs, T-functions are now widely used in cryptography to construct block ciphers, stream ciphers, PRNGs and hash functions. T-functions were first proposed in 2002 by A. Klimov and A. Shamir in their paper "A New Class of Invertible Mappings". Ciphers such as TSC-1, TSC-3, TSC-4, ABC, Mir-1 and VEST are built with different types of T-functions.

In computing, data deduplication is a technique for eliminating duplicate copies of repeating data. Successful implementation of the technique can improve storage utilization, which may in turn lower capital expenditure by reducing the overall amount of storage media required to meet storage capacity needs. It can also be applied to network data transfers to reduce the number of bytes that must be sent.

SHA-3 is the latest member of the Secure Hash Algorithm family of standards, released by NIST on August 5, 2015. Although part of the same series of standards, SHA-3 is internally different from the MD5-like structure of SHA-1 and SHA-2.

The Jenkins hash functions are a family of non-cryptographic hash functions for multi-byte keys designed by Bob Jenkins. The first one was formally published in 1997.

MurmurHash is a non-cryptographic hash function suitable for general hash-based lookup. It was created by Austin Appleby in 2008 and is currently hosted on GitHub along with its test suite named 'SMHasher'. It also exists in a number of variants, all of which have been released into the public domain. The name comes from two basic operations, multiply (MU) and rotate (R), used in its inner loop.

In cryptography, SWIFFT is a collection of provably secure hash functions. It is based on the concept of the fast Fourier transform (FFT). SWIFFT is not the first hash function based on FFT, but it sets itself apart by providing a mathematical proof of its security. It also uses the LLL basis reduction algorithm. It can be shown that finding collisions in SWIFFT is at least as difficult as finding short vectors in cyclic/ideal lattices in the worst case. By giving a security reduction to the worst-case scenario of a difficult mathematical problem, SWIFFT gives a much stronger security guarantee than most other cryptographic hash functions.

SipHash is an add–rotate–xor (ARX) based family of pseudorandom functions created by Jean-Philippe Aumasson and Daniel J. Bernstein in 2012, in response to a spate of "hash flooding" denial-of-service attacks (HashDoS) in late 2011.

In cryptography, a round or round function is a basic transformation that is repeated (iterated) multiple times inside the algorithm. Splitting a large algorithmic function into rounds simplifies both implementation and cryptanalysis.

Extendable-output function (XOF) is an extension of the cryptographic hash that allows its output to be arbitrarily long. In particular, the sponge construction makes any sponge hash a natural XOF: the squeeze operation can be repeated, and the regular hash functions with a fixed-size result are obtained from a sponge mechanism by stopping the squeezing phase after obtaining the fixed number of bits).

References

Sources