One-way compression function

In cryptography, a one-way compression function is a function that transforms two fixed-length inputs into a fixed-length output. [1] The transformation is "one-way", meaning that it is difficult, given a particular output, to compute inputs that compress to that output. One-way compression functions are not related to conventional data compression algorithms, which instead can be inverted exactly (lossless compression) or approximately (lossy compression) to recover the original data.

[Figure: A one-way compression function]

One-way compression functions are for instance used in the Merkle–Damgård construction inside cryptographic hash functions.

One-way compression functions are often built from block ciphers. Some methods to turn any normal block cipher into a one-way compression function are Davies–Meyer, Matyas–Meyer–Oseas, Miyaguchi–Preneel (single-block-length compression functions) and MDC-2/Meyer–Schilling, MDC-4, Hirose (double-block-length compression functions). These methods are described in detail further down. (MDC-2 is also the name of a hash function patented by IBM.)

Another method is 2BOW (or NBOW in general), which is a "high-rate multi-block-length hash function based on block ciphers" [1] and typically achieves (asymptotic) rates between 1 and 2 independent of the hash size (only with small constant overhead). This method has not yet seen any serious security analysis, so should be handled with care.

Compression

A compression function mixes two fixed-length inputs and produces a single fixed-length output of the same size as one of the inputs. Equivalently, the compression function can be seen as transforming one large fixed-length input into a shorter, fixed-length output.

For instance, input A might be 128 bits, input B 128 bits and they are compressed together to a single output of 128 bits. This is equivalent to having a single 256-bit input compressed to a single output of 128 bits.

Some compression functions do not compress by half, but instead by some other factor. For example, input A might be 256 bits, and input B 128 bits, which are compressed to a single output of 128 bits. That is, a total of 384 input bits are compressed together to 128 output bits.

The mixing is done in such a way that a full avalanche effect is achieved: every output bit depends on every input bit.
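As an illustration of the interface only (not a recommended construction), the sketch below shows a compression function that mixes two 128-bit inputs into one 128-bit output by truncating SHA-256 over their concatenation; the name and sizes are chosen purely for illustration.

```python
import hashlib

def toy_compress(h: bytes, m: bytes) -> bytes:
    """Toy compression function: two 128-bit inputs -> one 128-bit output."""
    assert len(h) == 16 and len(m) == 16           # two fixed-length inputs
    return hashlib.sha256(h + m).digest()[:16]     # one fixed-length output of the same size

out = toy_compress(b"\x00" * 16, b"\x01" * 16)
print(len(out) * 8, "output bits")                 # 128
```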

One-way

A one-way function is a function that is easy to compute but hard to invert. A one-way compression function (also called hash function) should have the following properties:

Easy to compute: given the input(s), the output is easy to compute.
Preimage-resistance: given an output, it should be infeasible to compute an input that compresses to that output.
Second preimage-resistance: given one input, it should be infeasible to find a different input that compresses to the same output.
Collision-resistance: it should be infeasible to find any two different inputs that compress to the same output.

Ideally one would like the "infeasibility" in preimage-resistance and second preimage-resistance to mean a work of about $2^n$, where $n$ is the number of bits in the hash function's output. However, particularly for second preimage-resistance this is a difficult problem. [citation needed]

The Merkle–Damgård construction

[Figure: The Merkle–Damgård hash construction. The boxes labeled f are a one-way compression function.]

A common use of one-way compression functions is in the Merkle–Damgård construction inside cryptographic hash functions. Most widely used hash functions, including MD5, SHA-1 (which is deprecated [2]) and SHA-2, use this construction.

A hash function must be able to process an arbitrary-length message into a fixed-length output. This can be achieved by breaking the input up into a series of equal-sized blocks, and operating on them in sequence using a one-way compression function. The compression function can either be specially designed for hashing or be built from a block cipher. The last block processed should also be length padded, which is crucial to the security of this construction.

When length padding (also called MD-strengthening) is applied, attacks cannot find collisions faster than the birthday paradox ($2^{n/2}$, $n$ being the block size in bits) if the compression function $f$ used is collision-resistant. [3] [4] Hence, the Merkle–Damgård hash construction reduces the problem of finding a proper hash function to finding a proper compression function.
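The iteration itself is simple to sketch. The following code, a minimal illustration rather than a specification of any real hash, drives a stand-in compression function over equal-sized blocks of a message padded in the MD-strengthening style (a 1 bit, zero bits, then the 64-bit message length); the block size, padding layout and compression function are all assumptions.

```python
import hashlib
import struct

BLOCK = 16  # assumed block size in bytes (128 bits)

def f(h: bytes, m: bytes) -> bytes:
    # stand-in one-way compression function (illustrative only)
    return hashlib.sha256(h + m).digest()[:BLOCK]

def md_pad(msg: bytes) -> bytes:
    # MD-strengthening: a 1 bit, zero bits, then the 64-bit bit-length of the message
    length = struct.pack(">Q", len(msg) * 8)
    zeros = (-(len(msg) + 1 + 8)) % BLOCK
    return msg + b"\x80" + b"\x00" * zeros + length

def md_hash(msg: bytes, iv: bytes = b"\x00" * BLOCK) -> bytes:
    h = iv
    padded = md_pad(msg)
    for i in range(0, len(padded), BLOCK):
        h = f(h, padded[i:i + BLOCK])   # chain the compression function over the blocks
    return h

print(md_hash(b"abc").hex())
```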

A second preimage attack (given a message $m_1$, an attacker finds another message $m_2$ to satisfy $\operatorname{hash}(m_1) = \operatorname{hash}(m_2)$) can be done according to Kelsey and Schneier [5] for a $2^k$-message-block message in time $k \times 2^{n/2+1} + 2^{n-k+1}$. Note that the complexity of this attack reaches its minimum of roughly $n \cdot 2^{n/2}$ for long messages (when $k \approx n/2$) and approaches $2^n$ when messages are short.
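As a quick numeric check of the formula, assuming a 128-bit output, the attack cost can be tabulated for several message lengths; the minimum for long messages indeed lies near $n \cdot 2^{n/2}$.

```python
import math

def second_preimage_cost(n: int, k: int) -> float:
    """Kelsey-Schneier cost k * 2^(n/2 + 1) + 2^(n - k + 1) for a 2^k-block message."""
    return k * 2 ** (n / 2 + 1) + 2 ** (n - k + 1)

n = 128  # assumed hash output size in bits
for k in (10, 32, 55, 64):
    print(f"2^{k}-block message: about 2^{math.log2(second_preimage_cost(n, k)):.1f} work")
```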

Construction from block ciphers

[Figure: A typical modern block cipher]

One-way compression functions are often built from block ciphers.

Like one-way compression functions, block ciphers take two fixed-size inputs (the key and the plaintext) and return one single output (the ciphertext), which is the same size as the input plaintext.

However, modern block ciphers are only partially one-way. That is, given a plaintext and a ciphertext it is infeasible to find a key that encrypts the plaintext to the ciphertext. But, given a ciphertext and a key a matching plaintext can be found simply by using the block cipher's decryption function. Thus, to turn a block cipher into a one-way compression function some extra operations have to be added.

Some methods to turn any normal block cipher into a one-way compression function are Davies–Meyer, Matyas–Meyer–Oseas, Miyaguchi–Preneel (single-block-length compression functions) and MDC-2, MDC-4, Hirose (double-block-length compressions functions).

Single-block-length compression functions output the same number of bits as processed by the underlying block cipher; double-block-length compression functions output twice that number of bits.

If a block cipher has a block size of, say, 128 bits, single-block-length methods create a hash function that has a block size of 128 bits and produces a 128-bit hash. Double-block-length methods make hashes with double the hash size compared to the block size of the block cipher used. So a 128-bit block cipher can be turned into a 256-bit hash function.

These methods are then used inside the Merkle–Damgård construction to build the actual hash function. These methods are described in detail further down.

Using a block cipher to build the one-way compression function for a hash function is usually somewhat slower than using a specially designed one-way compression function in the hash function. This is because all known secure constructions do the key scheduling for each block of the message. Black, Cochran and Shrimpton have shown that it is impossible to construct a one-way compression function that makes only one call to a block cipher with a fixed key. [6] In practice reasonable speeds are achieved, provided the key scheduling of the selected block cipher is not too heavy an operation.

In some cases, however, it is easier because a single implementation of a block cipher can be used for both a block cipher and a hash function. It can also save code space in very tiny embedded systems such as smart cards or nodes in cars or other machines.

Therefore, the hash-rate or rate gives a glimpse of the efficiency of a hash function based on a certain compression function. The rate of an iterated hash function describes the ratio between the number of block cipher operations and the output. More precisely, the rate represents the ratio between the number of processed bits of input $|m_i|$, the output bit-length $n$ of the block cipher, and the number $s$ of block cipher operations necessary to produce these $n$ output bits. Generally, the usage of fewer block cipher operations results in a better overall performance of the entire hash function, but it also leads to a smaller hash value, which could be undesirable. The rate is expressed by the formula:

$R_h = \frac{|m_i|}{s \cdot n}$
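As a worked example, the snippet below evaluates this rate formula for the schemes discussed further down, with an assumed 128-bit block size and 256-bit key size.

```python
def rate(input_bits: int, calls: int, n: int) -> float:
    """Rate R_h = |m_i| / (s * n): message bits absorbed per cipher call and per n output bits."""
    return input_bits / (calls * n)

n, k = 128, 256  # assumed block size and key size of the underlying cipher

print(rate(k, 1, n))      # Davies-Meyer:            k / n        = 2.0
print(rate(n, 1, n))      # Matyas-Meyer-Oseas:      n / n        = 1.0
print(rate(k - n, 2, n))  # Hirose (double length):  (k - n) / 2n = 0.5
```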

The hash function can only be considered secure if at least the following conditions are met:

The block cipher has no special properties that distinguish it from ideal ciphers, such as weak keys or keys that lead to identical or related encryptions (fixed points or key collisions).
The resulting hash size is big enough: according to the birthday attack, a security level of $2^{80}$ (generally assumed to be infeasible to compute today) is desirable, so the hash size should be at least 160 bits.
The last block is properly length-padded prior to the hashing (see the Merkle–Damgård construction above).

The constructions presented below, Davies–Meyer, Matyas–Meyer–Oseas, Miyaguchi–Preneel and Hirose, have been shown to be secure under black-box analysis. [7] [8] The goal is to show that any attack that can be found is at most as efficient as the birthday attack under certain assumptions. The black-box model assumes that a block cipher is used that is randomly chosen from a set containing all appropriate block ciphers. In this model an attacker may freely encrypt and decrypt any blocks, but does not have access to an implementation of the block cipher. The encryption and decryption functions are represented by oracles that receive a pair of either a plaintext and a key or a ciphertext and a key. The oracles then respond with a randomly chosen plaintext or ciphertext if the pair is queried for the first time. Both oracles share a table of these triplets (the pair from the query and the corresponding response) and return the recorded response if a query is received a second time. For the proof there is a collision-finding algorithm that makes randomly chosen queries to the oracles. The algorithm returns 1 if two responses result in a collision involving the hash function that is built from a compression function applying this block cipher (and 0 otherwise). The probability that the algorithm returns 1 depends on the number of queries, which determines the security level.
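A minimal sketch of such oracles, assuming lazy sampling over a 128-bit block (the class and names are illustrative): each (key, plaintext) pair receives a fresh random ciphertext the first time it is queried, consistency with the permutation property is enforced per key, and repeated queries return the recorded value.

```python
import os

N_BYTES = 16  # assumed block size (128 bits)

class IdealCipherOracle:
    """Lazily sampled ideal cipher: an independent random permutation for every key."""

    def __init__(self):
        self.enc = {}  # (key, plaintext)  -> ciphertext
        self.dec = {}  # (key, ciphertext) -> plaintext

    def encrypt(self, key: bytes, pt: bytes) -> bytes:
        if (key, pt) not in self.enc:
            ct = os.urandom(N_BYTES)
            while (key, ct) in self.dec:      # keep the per-key mapping a permutation
                ct = os.urandom(N_BYTES)
            self.enc[(key, pt)] = ct
            self.dec[(key, ct)] = pt
        return self.enc[(key, pt)]            # repeated queries return the stored value

    def decrypt(self, key: bytes, ct: bytes) -> bytes:
        if (key, ct) not in self.dec:
            pt = os.urandom(N_BYTES)
            while (key, pt) in self.enc:
                pt = os.urandom(N_BYTES)
            self.dec[(key, ct)] = pt
            self.enc[(key, pt)] = ct
        return self.dec[(key, ct)]

oracle = IdealCipherOracle()
c = oracle.encrypt(b"k" * 16, b"p" * 16)
assert oracle.decrypt(b"k" * 16, c) == b"p" * 16   # the shared table keeps both views consistent
```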

Davies–Meyer

[Figure: The Davies–Meyer one-way compression function]

The Davies–Meyer single-block-length compression function feeds each block of the message ($m_i$) as the key to a block cipher. It feeds the previous hash value ($H_{i-1}$) as the plaintext to be encrypted. The output ciphertext is then also XORed (⊕) with the previous hash value ($H_{i-1}$) to produce the next hash value ($H_i$). In the first round, when there is no previous hash value, it uses a constant pre-specified initial value ($H_0$).

In mathematical notation Davies–Meyer can be described as:

$H_i = E_{m_i}(H_{i-1}) \oplus H_{i-1}$

The scheme has the rate ($k$ is the key size):

$R_{\mathrm{DM}} = \frac{k}{n}$

If the block cipher uses, for instance, 256-bit keys, then each message block ($m_i$) is a 256-bit chunk of the message. If the same block cipher uses a block size of 128 bits, then the input and output hash values in each round are 128 bits.

Variations of this method replace XOR with any other group operation, such as addition on 32-bit unsigned integers.
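A sketch of a single Davies–Meyer step is given below. It assumes the third-party PyCryptodome package (Crypto.Cipher.AES) as the block cipher, with a 256-bit key and 128-bit block, so each 32-byte message block keys the cipher and the 16-byte chaining value is encrypted and XORed back in.

```python
from Crypto.Cipher import AES  # assumes the PyCryptodome package is installed

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def davies_meyer(h_prev: bytes, m_i: bytes) -> bytes:
    """H_i = E_{m_i}(H_{i-1}) XOR H_{i-1}, here with AES-256 (32-byte message blocks)."""
    assert len(h_prev) == 16 and len(m_i) == 32
    return xor(AES.new(m_i, AES.MODE_ECB).encrypt(h_prev), h_prev)

h0 = b"\x00" * 16                      # pre-specified initial value (illustrative)
h1 = davies_meyer(h0, b"\x11" * 32)    # absorb one 256-bit message block
print(h1.hex())
```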

A notable property of the Davies–Meyer construction is that even if the underlying block cipher is totally secure, it is possible to compute fixed points for the construction: for any $m$, one can find a value of $h$ such that $E_m(h) \oplus h = h$: one just has to set $h = E_m^{-1}(0)$. [9] This is a property that random functions certainly do not have. So far, no practical attack has been based on this property, but one should be aware of this "feature". The fixed points can be used in a second preimage attack (given a message $m_1$, an attacker finds another message $m_2$ to satisfy $\operatorname{hash}(m_1) = \operatorname{hash}(m_2)$) of Kelsey and Schneier [5] for a $2^k$-message-block message in time $3 \times 2^{n/2+1} + 2^{n-k+1}$. If the construction does not allow easy creation of fixed points (like Matyas–Meyer–Oseas or Miyaguchi–Preneel) then this attack can be done in $k \times 2^{n/2+1} + 2^{n-k+1}$ time. Note that in both cases the complexity is above $2^{n/2}$ but below $2^n$ when messages are long, and that when messages get shorter the complexity of the attack approaches $2^n$.
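The fixed-point property is easy to verify concretely. The check below again assumes PyCryptodome's AES as the block cipher: decrypting the all-zero block under an arbitrary message block yields a chaining value that Davies–Meyer maps back to itself.

```python
from Crypto.Cipher import AES  # assumes PyCryptodome

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

m = b"\x42" * 32                          # any 256-bit message block
cipher = AES.new(m, AES.MODE_ECB)
h = cipher.decrypt(b"\x00" * 16)          # h = E_m^{-1}(0)
assert xor(cipher.encrypt(h), h) == h     # Davies-Meyer maps h to itself
print("fixed point:", h.hex())
```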

The security of the Davies–Meyer construction in the Ideal Cipher Model was first proven by R. Winternitz. [10]

Matyas–Meyer–Oseas

[Figure: The Matyas–Meyer–Oseas one-way compression function]

The Matyas–Meyer–Oseas single-block-length one-way compression function can be considered the dual (the opposite) of Davies–Meyer.

It feeds each block of the message ($m_i$) as the plaintext to be encrypted. The output ciphertext is then also XORed (⊕) with the same message block ($m_i$) to produce the next hash value ($H_i$). The previous hash value ($H_{i-1}$) is fed as the key to the block cipher. In the first round, when there is no previous hash value, it uses a constant pre-specified initial value ($H_0$).

If the block cipher has different block and key sizes, the hash value ($H_{i-1}$) will have the wrong size for use as the key. The cipher might also have other special requirements on the key. Then the hash value is first fed through the function $g(\cdot)$ to be converted/padded to fit as a key for the cipher.

In mathematical notation Matyas–Meyer–Oseas can be described as:

$H_i = E_{g(H_{i-1})}(m_i) \oplus m_i$

The scheme has the rate:

$R_{\mathrm{MMO}} = \frac{n}{n} = 1$
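A sketch of a single Matyas–Meyer–Oseas step, again assuming PyCryptodome's AES, here AES-128 so that the 16-byte chaining value can serve directly as the key (i.e. $g$ is the identity):

```python
from Crypto.Cipher import AES  # assumes PyCryptodome

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def mmo(h_prev: bytes, m_i: bytes) -> bytes:
    """H_i = E_{g(H_{i-1})}(m_i) XOR m_i, with AES-128 and g = identity (both 16 bytes)."""
    return xor(AES.new(h_prev, AES.MODE_ECB).encrypt(m_i), m_i)

h0 = b"\x00" * 16
print(mmo(h0, b"\x22" * 16).hex())
```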

A second preimage attack (given a message $m_1$, an attacker finds another message $m_2$ to satisfy $\operatorname{hash}(m_1) = \operatorname{hash}(m_2)$) can be done according to Kelsey and Schneier [5] for a $2^k$-message-block message in time $k \times 2^{n/2+1} + 2^{n-k+1}$. Note that the complexity is above $2^{n/2}$ but below $2^n$ when messages are long, and that when messages get shorter the complexity of the attack approaches $2^n$.

Miyaguchi–Preneel

[Figure: The Miyaguchi–Preneel one-way compression function]

The Miyaguchi–Preneel single-block-length one-way compression function is an extended variant of Matyas–Meyer–Oseas. It was independently proposed by Shoji Miyaguchi and Bart Preneel.

It feeds each block of the message ($m_i$) as the plaintext to be encrypted. The output ciphertext is then XORed (⊕) with the same message block ($m_i$) and then also XORed with the previous hash value ($H_{i-1}$) to produce the next hash value ($H_i$). The previous hash value ($H_{i-1}$) is fed as the key to the block cipher. In the first round, when there is no previous hash value, it uses a constant pre-specified initial value ($H_0$).

If the block cipher has different block and key sizes, the hash value ($H_{i-1}$) will have the wrong size for use as the key. The cipher might also have other special requirements on the key. Then the hash value is first fed through the function $g(\cdot)$ to be converted/padded to fit as a key for the cipher.

In mathematical notation Miyaguchi–Preneel can be described as:

$H_i = E_{g(H_{i-1})}(m_i) \oplus H_{i-1} \oplus m_i$

The scheme has the rate:

$R_{\mathrm{MP}} = \frac{n}{n} = 1$

The roles of $m_i$ and $H_{i-1}$ may be switched, so that $H_{i-1}$ is encrypted under the key $m_i$, thus making this method an extension of Davies–Meyer instead.
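Under the same assumptions (PyCryptodome's AES-128 with $g$ the identity), a Miyaguchi–Preneel step differs from Matyas–Meyer–Oseas only by the extra XOR with the previous hash value:

```python
from Crypto.Cipher import AES  # assumes PyCryptodome

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def miyaguchi_preneel(h_prev: bytes, m_i: bytes) -> bytes:
    """H_i = E_{g(H_{i-1})}(m_i) XOR H_{i-1} XOR m_i, with AES-128 and g = identity."""
    c = AES.new(h_prev, AES.MODE_ECB).encrypt(m_i)
    return xor(xor(c, h_prev), m_i)

h0 = b"\x00" * 16
print(miyaguchi_preneel(h0, b"\x33" * 16).hex())
```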

A second preimage attack (given a message $m_1$, an attacker finds another message $m_2$ to satisfy $\operatorname{hash}(m_1) = \operatorname{hash}(m_2)$) can be done according to Kelsey and Schneier [5] for a $2^k$-message-block message in time $k \times 2^{n/2+1} + 2^{n-k+1}$. Note that the complexity is above $2^{n/2}$ but below $2^n$ when messages are long, and that when messages get shorter the complexity of the attack approaches $2^n$.

Hirose

[Figure: The Hirose double-block-length compression function]

The Hirose [8] double-block-length one-way compression function consists of a block cipher plus a permutation $p$. It was proposed by Shoichi Hirose in 2006 and is based on a work [11] by Mridul Nandi.

It uses a block cipher whose key length $k$ is larger than the block length $n$, and produces a hash of size $2n$ bits. For example, any of the AES candidates with a 192- or 256-bit key (and 128-bit block).

Each round accepts a portion of the message ($m_i$) that is $k - n$ bits long and uses it to update two $n$-bit state values, $G$ and $H$.

First, $m_i$ is concatenated with $H_{i-1}$ to produce a key $K_i$. Then the two feedback values are updated according to:

$G_i = E_{K_i}(G_{i-1}) \oplus G_{i-1}$
$H_i = E_{K_i}(p(G_{i-1})) \oplus p(G_{i-1})$

$p(\cdot)$ is an arbitrary fixed-point-free permutation on an $n$-bit value, typically defined as $p(x) = x \oplus c$ for an arbitrary non-zero constant $c$ (the all-ones word may be a convenient choice).

Each encryption resembles the standard Davies–Meyer construction. The advantage of this scheme over other proposed double-block-length schemes is that both encryptions use the same key, and thus key scheduling effort may be shared.

The final output is $H_t \| G_t$. The scheme has the rate $R_{\mathrm{Hirose}} = \frac{k-n}{2n}$ relative to encrypting the message with the cipher.
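A sketch of a single Hirose round is given below. It assumes PyCryptodome's AES-256 as the cipher ($k = 256$, $n = 128$, so each round absorbs $k - n = 128$ message bits) and $p(x) = x \oplus c$ with $c$ the all-ones word.

```python
from Crypto.Cipher import AES  # assumes PyCryptodome

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

C = b"\xff" * 16  # non-zero constant defining the fixed-point-free permutation p(x) = x XOR C

def hirose_round(g_prev: bytes, h_prev: bytes, m_i: bytes):
    """One Hirose round: both encryptions share the key K_i = H_{i-1} || m_i (AES-256)."""
    cipher = AES.new(h_prev + m_i, AES.MODE_ECB)    # one key schedule, used for both calls
    g_new = xor(cipher.encrypt(g_prev), g_prev)     # Davies-Meyer-like update of G
    p = xor(g_prev, C)                              # p(G_{i-1})
    h_new = xor(cipher.encrypt(p), p)               # Davies-Meyer-like update of H
    return g_new, h_new

g, h = b"\x00" * 16, b"\x01" * 16                   # illustrative initial values
g, h = hirose_round(g, h, b"\x44" * 16)             # absorb one 128-bit message portion
print((h + g).hex())                                # the final output would be H_t || G_t
```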

Hirose also provides a proof in the Ideal Cipher Model.

Sponge construction

The sponge construction can be used to build one-way compression functions.


References

Citations

  1. Handbook of Applied Cryptography by Alfred J. Menezes, Paul C. van Oorschot, Scott A. Vanstone. Fifth Printing (August 2001), page 328.
  2. "Announcing the first SHA1 collision". Google Online Security Blog. Retrieved 2020-01-12.
  3. Ivan Damgård. A design principle for hash functions. In Gilles Brassard, editor, CRYPTO, volume 435 of LNCS, pages 416–427. Springer, 1989.
  4. Ralph Merkle. One way hash functions and DES. In Gilles Brassard, editor, CRYPTO, volume 435 of LNCS, pages 428–446. Springer, 1989.
  5. John Kelsey and Bruce Schneier. Second preimages on n-bit hash functions for much less than 2^n work. In Ronald Cramer, editor, EUROCRYPT, volume 3494 of LNCS, pages 474–490. Springer, 2005.
  6. John Black, Martin Cochran, and Thomas Shrimpton. On the Impossibility of Highly-Efficient Blockcipher-Based Hash Functions. Advances in Cryptology – EUROCRYPT '05, Aarhus, Denmark, 2005. The authors define a hash function as "highly efficient" if its compression function uses exactly one call to a block cipher whose key is fixed.
  7. John Black, Phillip Rogaway, and Tom Shrimpton. Black-Box Analysis of the Block-Cipher-Based Hash-Function Constructions from PGV. Advances in Cryptology – CRYPTO '02, Lecture Notes in Computer Science, vol. 2442, pp. 320–335, Springer, 2002. See the table on page 3: Davies–Meyer, Matyas–Meyer–Oseas and Miyaguchi–Preneel are numbered in the first column as hash functions 5, 1 and 3.
  8. S. Hirose, Some Plausible Constructions of Double-Block-Length Hash Functions. In: Robshaw, M. J. B. (ed.) FSE 2006, LNCS, vol. 4047, pp. 210–225, Springer, Heidelberg, 2006.
  9. Handbook of Applied Cryptography by Alfred J. Menezes, Paul C. van Oorschot, Scott A. Vanstone. Fifth Printing (August 2001) page 375.
  10. R. Winternitz. A secure one-way hash function built from DES. In Proceedings of the IEEE Symposium on Information Security and Privacy, pp. 88–90. IEEE Press, 1984.
  11. M. Nandi, Towards optimal double-length hash functions, In: Proceedings of the 6th International Conference on Cryptology in India (INDOCRYPT 2005), Lecture Notes in Computer Science 3797, pages 77–89, 2005.
