Security of cryptographic hash functions

Last updated

In cryptography, cryptographic hash functions can be divided into two main categories. In the first category are those functions whose designs are based on mathematical problems, and whose security thus follows from rigorous mathematical proofs, complexity theory and formal reduction. These functions are called provably secure cryptographic hash functions. To construct these is very difficult, and few examples have been introduced. Their practical use is limited.

Contents

In the second category are functions which are not based on mathematical problems, but on an ad-hoc constructions, in which the bits of the message are mixed to produce the hash. These are then believed to be hard to break, but no formal proof is given. Almost all hash functions in widespread use reside in this category. Some of these functions are already broken, and are no longer in use. See Hash function security summary.

Types of security of hash functions

Generally, the basic security of cryptographic hash functions can be seen from different angles: pre-image resistance, second pre-image resistance, collision resistance, and pseudo-randomness.

The meaning of hard

The basic question is the meaning of hard. There are two approaches to answer this question. First is the intuitive/practical approach: "hard means that it is almost certainly beyond the reach of any adversary who must be prevented from breaking the system for as long as the security of the system is deemed important." The second approach is theoretical and is based on the computational complexity theory: if problem A is hard, then there exists a formal security reduction from a problem which is widely considered unsolvable in polynomial time, such as integer factorization or the discrete logarithm problem.

However, non-existence of a polynomial time algorithm does not automatically ensure that the system is secure. The difficulty of a problem also depends on its size. For example, RSA public-key cryptography (which relies on the difficulty of integer factorization) is considered secure only with keys that are at least 2048 bits long, whereas keys for the ElGamal cryptosystem (which relies on the difficulty of the discrete logarithm problem) are commonly in the range of 256–512 bits.

Password case

If the set of inputs to the hash is relatively small or is ordered by likelihood in some way, then a brute force search may be practical, regardless of theoretical security. The likelihood of recovering the preimage depends on the input set size and the speed or cost of computing the hash function. A common example is the use of hashes to store password validation data. Rather than store the plaintext of user passwords, an access control system typically stores a hash of the password. When a person requests access, the password they submit is hashed and compared with the stored value. If the stored validation data is stolen, then the thief will only have the hash values, not the passwords. However, most users choose passwords in predictable ways, and passwords are often short enough so that all possible combinations can be tested if fast hashes are used. [1] Special hashes called key derivation functions have been created to slow searches. See Password cracking.

Cryptographic hash functions

Most hash functions are built on an ad-hoc basis, where the bits of the message are nicely mixed to produce the hash. Various bitwise operations (e.g. rotations), modular additions, and compression functions are used in iterative mode to ensure high complexity and pseudo-randomness of the output. In this way, the security is very hard to prove and the proof is usually not done. Only a few years ago[ when? ], one of the most popular hash functions, SHA-1, was shown to be less secure than its length suggested: collisions could be found in only 251 [2] tests, rather than the brute-force number of 280.

In other words, most of the hash functions in use nowadays are not provably collision-resistant. These hashes are not based on purely mathematical functions. This approach results generally in more effective hashing functions, but with the risk that a weakness of such a function will be eventually used to find collisions. One famous case is MD5.

Provably secure hash functions

In this approach, the security of a hash function is based on some hard mathematical problem, and it is proved that finding collisions of the hash function is as hard as breaking the underlying problem. This gives a somewhat stronger notion of security than just relying on complex mixing of bits as in the classical approach.

A cryptographic hash function has provable security against collision attacks if finding collisions is provably polynomial-time reducible from a problem P which is supposed to be unsolvable in polynomial time. The function is then called provably secure, or just provable.

It means that if finding collisions would be feasible in polynomial time by algorithm A, then one could find and use polynomial time algorithm R (reduction algorithm) that would use algorithm A to solve problem P, which is widely supposed to be unsolvable in polynomial time. That is a contradiction. This means that finding collisions cannot be easier than solving P.

However, this only indicates that finding collisions is difficult in some cases, as not all instances of a computationally hard problem are typically hard. Indeed, very large instances of NP-hard problems are routinely solved, while only the hardest are practically impossible to solve.

Hard problems

Examples of problems that are assumed to be not solvable in polynomial time include

Downsides of the provable approach

SWIFFT is an example of a hash function that circumvents these security problems. It can be shown that, for any algorithm that can break SWIFFT with probability p within an estimated time t, one can find an algorithm that solves the worst-case scenario of a certain difficult mathematical problem within time t depending on t and p.[ citation needed ]

Example of (impractical) provably secure hash function

Let hash(m) = xm mod n, where n is a hard-to-factor composite number, and x is some prespecified base value. A collision xm1xm2 (mod n) reveals a multiple m1m2 of the multiplicative order of x modulo n. This information can be used to factor n in polynomial time, assuming certain properties of x.

But the algorithm is quite inefficient because it requires on average 1.5 multiplications modulo n per message-bit.

More practical provably secure hash functions

Related Research Articles

In cryptography, SHA-1 is a hash function which takes an input and produces a 160-bit (20-byte) hash value known as a message digest – typically rendered as 40 hexadecimal digits. It was designed by the United States National Security Agency, and is a U.S. Federal Information Processing Standard. The algorithm has been cryptographically broken but is still widely used.

In mathematics, for given real numbers a and b, the logarithm logba is a number x such that bx = a. Analogously, in any group G, powers bk can be defined for all integers k, and the discrete logarithm logba is an integer k such that bk = a. In number theory, the more commonly used term is index: we can write x = indra (mod m) (read "the index of a to the base r modulo m") for rxa (mod m) if r is a primitive root of m and gcd(a,m) = 1.

Articles related to cryptography include:

In computer science, a one-way function is a function that is easy to compute on every input, but hard to invert given the image of a random input. Here, "easy" and "hard" are to be understood in the sense of computational complexity theory, specifically the theory of polynomial time problems. Not being one-to-one is not considered sufficient for a function to be called one-way.

<span class="mw-page-title-main">Cryptographic hash function</span> Hash function that is suitable for use in cryptography

A cryptographic hash function (CHF) is a hash algorithm that has special properties desirable for a cryptographic application:

In cryptography, a Schnorr signature is a digital signature produced by the Schnorr signature algorithm that was described by Claus Schnorr. It is a digital signature scheme known for its simplicity, among the first whose security is based on the intractability of certain discrete logarithm problems. It is efficient and generates short signatures. It was covered by U.S. patent 4,995,082 which expired in February 2010.

In cryptography, a random oracle is an oracle that responds to every unique query with a (truly) random response chosen uniformly from its output domain. If a query is repeated, it responds the same way every time that query is submitted.

In cryptography, a preimage attack on cryptographic hash functions tries to find a message that has a specific hash value. A cryptographic hash function should resist attacks on its preimage.

In cryptography, a collision attack on a cryptographic hash tries to find two inputs producing the same hash value, i.e. a hash collision. This is in contrast to a preimage attack where a specific target hash value is specified.

In cryptography, collision resistance is a property of cryptographic hash functions: a hash function H is collision-resistant if it is hard to find two inputs that hash to the same output; that is, two inputs a and b where ab but H(a) = H(b). The pigeonhole principle means that any hash function with more inputs than outputs will necessarily have such collisions; the harder they are to find, the more cryptographically secure the hash function is.

In computational complexity theory, the average-case complexity of an algorithm is the amount of some computational resource used by the algorithm, averaged over all possible inputs. It is frequently contrasted with worst-case complexity which considers the maximal complexity of the algorithm over all possible inputs.

Lattice-based cryptography is the generic term for constructions of cryptographic primitives that involve lattices, either in the construction itself or in the security proof. Lattice-based constructions support important standards of post-quantum cryptography. Unlike more widely used and known public-key schemes such as the RSA, Diffie-Hellman or elliptic-curve cryptosystems — which could, theoretically, be defeated using Shor's algorithm on a quantum computer — some lattice-based constructions appear to be resistant to attack by both classical and quantum computers. Furthermore, many lattice-based constructions are considered to be secure under the assumption that certain well-studied computational lattice problems cannot be solved efficiently.

<span class="mw-page-title-main">Cryptography</span> Practice and study of secure communication techniques

Cryptography, or cryptology, is the practice and study of techniques for secure communication in the presence of adversarial behavior. More generally, cryptography is about constructing and analyzing protocols that prevent third parties or the public from reading private messages. Modern cryptography exists at the intersection of the disciplines of mathematics, computer science, information security, electrical engineering, digital signal processing, physics, and others. Core concepts related to information security are also central to cryptography. Practical applications of cryptography include electronic commerce, chip-based payment cards, digital currencies, computer passwords, and military communications.

In cryptography, Very Smooth Hash (VSH) is a provably secure cryptographic hash function invented in 2005 by Scott Contini, Arjen Lenstra, and Ron Steinfeld. Provably secure means that finding collisions is as difficult as some known hard mathematical problem. Unlike other provably secure collision-resistant hashes, VSH is efficient and usable in practice. Asymptotically, it only requires a single multiplication per log(n) message-bits and uses RSA-type arithmetic. Therefore, VSH can be useful in embedded environments where code space is limited.

In cryptography, the fast syndrome-based hash functions (FSB) are a family of cryptographic hash functions introduced in 2003 by Daniel Augot, Matthieu Finiasz, and Nicolas Sendrier. Unlike most other cryptographic hash functions in use today, FSB can to a certain extent be proven to be secure. More exactly, it can be proven that breaking FSB is at least as difficult as solving a certain NP-complete problem known as regular syndrome decoding so FSB is provably secure. Though it is not known whether NP-complete problems are solvable in polynomial time, it is often assumed that they are not.

The elliptic curve only hash (ECOH) algorithm was submitted as a candidate for SHA-3 in the NIST hash function competition. However, it was rejected in the beginning of the competition since a second pre-image attack was found.

In cryptography, SWIFFT is a collection of provably secure hash functions. It is based on the concept of the fast Fourier transform (FFT). SWIFFT is not the first hash function based on the FFT, but it sets itself apart by providing a mathematical proof of its security. It also uses the LLL basis reduction algorithm. It can be shown that finding collisions in SWIFFT is at least as difficult as finding short vectors in cyclic/ideal lattices in the worst case. By giving a security reduction to the worst-case scenario of a difficult mathematical problem, SWIFFT gives a much stronger security guarantee than most other cryptographic hash functions.

Post-quantum cryptography (PQC), sometimes referred to as quantum-proof, quantum-safe, or quantum-resistant, is the development of cryptographic algorithms that are thought to be secure against a cryptanalytic attack by a quantum computer. Most widely-used public-key algorithms rely on the difficulty of one of three mathematical problems: the integer factorization problem, the discrete logarithm problem or the elliptic-curve discrete logarithm problem. All of these problems could be easily solved on a sufficiently powerful quantum computer running Shor's algorithm or even faster and less demanding alternatives.

In discrete mathematics, ideal lattices are a special class of lattices and a generalization of cyclic lattices. Ideal lattices naturally occur in many parts of number theory, but also in other areas. In particular, they have a significant place in cryptography. Micciancio defined a generalization of cyclic lattices as ideal lattices. They can be used in cryptosystems to decrease by a square root the number of parameters necessary to describe a lattice, making them more efficient. Ideal lattices are a new concept, but similar lattice classes have been used for a long time. For example, cyclic lattices, a special case of ideal lattices, are used in NTRUEncrypt and NTRUSign.

In post-quantum cryptography, ring learning with errors (RLWE) is a computational problem which serves as the foundation of new cryptographic algorithms, such as NewHope, designed to protect against cryptanalysis by quantum computers and also to provide the basis for homomorphic encryption. Public-key cryptography relies on construction of mathematical problems that are believed to be hard to solve if no further information is available, but are easy to solve if some information used in the problem construction is known. Some problems of this sort that are currently used in cryptography are at risk of attack if sufficiently large quantum computers can ever be built, so resistant problems are sought. Homomorphic encryption is a form of encryption that allows computation on ciphertext, such as arithmetic on numeric values stored in an encrypted database.

References

  1. Goodin, Dan (2012-12-10). "25-GPU cluster cracks every standard Windows password in <6 hours". Ars Technica . Retrieved 2020-11-23.
  2. http://eprint.iacr.org/2008/469.pdf [ bare URL PDF ]
  3. Petit, C.; Quisquater, J.-J.; Tillich, J.-P., "Hard and easy Components of Collision Search in the Zémor-Tillich hash function:new Attacks and Reduced Variants with Equivalent Security" (PDF), {{citation}}: Missing or empty |title= (help)