Fast syndrome-based hash

Fast syndrome-based hash function (FSB)

General
Designers: Daniel Augot, Matthieu Finiasz, Nicolas Sendrier
First published: 2003
Derived from: McEliece cryptosystem and Niederreiter cryptosystem
Successors: Improved fast syndrome-based hash function
Related to: Syndrome-based hash function

Detail
Digest sizes: Scalable

In cryptography, the fast syndrome-based hash functions (FSB) are a family of cryptographic hash functions introduced in 2003 by Daniel Augot, Matthieu Finiasz, and Nicolas Sendrier.[1] Unlike most other cryptographic hash functions in use today, FSB can to a certain extent be proven secure. More exactly, it can be proven that breaking FSB is at least as difficult as solving a certain NP-complete problem known as regular syndrome decoding, so FSB is provably secure. Though it is not known whether NP-complete problems are solvable in polynomial time, it is generally assumed that they are not.


Several versions of FSB have been proposed, the latest of which was submitted to the SHA-3 competition but was rejected in the first round. Though all versions of FSB claim provable security, some preliminary versions were eventually broken.[2] The design of the latest version of FSB has, however, taken this attack into account, and it remains secure against all currently known attacks.

As usual, provable security comes at a cost. FSB is slower than traditional hash functions and uses quite a lot of memory, which makes it impractical in memory-constrained environments. Furthermore, the compression function used in FSB needs a large output size to guarantee security. This last problem has been solved in recent versions by simply compressing the output with another compression function, called Whirlpool. However, though the authors argue that adding this last compression does not reduce security, it makes a formal security proof impossible.[3]

Description of the hash function

We start with a compression function φ with parameters n, r and w, such that n > w and w·log2(n/w) > r. This function will only work on messages of length s = w·log2(n/w); r will be the size of the output. Furthermore, we want n, r, w, s and log2(n/w) to be natural numbers, where log2 denotes the binary logarithm. The reason for w·log2(n/w) > r is that we want φ to be a compression function, so the input must be larger than the output. We will later use the Merkle–Damgård construction to extend the domain to inputs of arbitrary length.

The basis of this function is a (randomly chosen) binary r × n matrix H, which acts on a message of n bits by matrix multiplication. Here we encode the n-bit message as a vector in F_2^n, the n-dimensional vector space over the field of two elements, so the output will be a message of r bits.

For security purposes, as well as to get a faster hash speed, we want to use only "regular words of weight w" as input for our matrix.

Definitions

A word of length n and weight w is a string of n bits of which exactly w are equal to 1. Such a word is called regular if it contains exactly one nonzero entry in each of the w intervals [(i-1)·n/w, i·n/w); equivalently, if the word is split into w blocks of n/w consecutive bits, each block contains exactly one 1.

The compression function

There are exactly (n/w)^w different regular words of weight w and length n, so we need exactly s = w·log2(n/w) bits of data to encode these regular words. We fix a bijection χ from the set of bit strings of length s to the set of regular words of weight w and length n, and then the FSB compression function is defined as follows:

  1. Input: a message m of size s
  2. Convert m to the regular word χ(m) of length n and weight w
  3. Multiply χ(m) by the matrix H
  4. Output: the hash H·χ(m) of size r

This version is usually called syndrome-based compression. It is very slow, and in practice it is done in a different and faster way, resulting in fast syndrome-based compression. We split H into w sub-matrices H_1, ..., H_w of size r × (n/w), and we fix a bijection from the bit strings of length s to the set of sequences of w numbers between 1 and n/w. This is equivalent to a bijection to the set of regular words of length n and weight w, since we can see such a word as a sequence of w numbers between 1 and n/w. The compression function looks as follows:

  1. Input: a message m of size s
  2. Convert m to a sequence s_1, ..., s_w of w numbers between 1 and n/w
  3. Add column s_i of each sub-matrix H_i (modulo 2) to obtain a binary string of length r
  4. Output: the hash of size r

We can now use the Merkle–Damgård construction to generalize the compression function to accept inputs of arbitrary lengths.
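The two variants of the compression function can be sketched in a few lines of Python. The parameters, the random matrix and the bit-string encoding below are illustrative choices for a toy instance, not official FSB parameters:

```python
import random

random.seed(1)

n, w, r = 12, 3, 4               # toy parameters: 12 columns, 3 sub-blocks, 4-bit output
block = n // w                   # columns per sub-block (n/w = 4)
bits = (block - 1).bit_length()  # log2(n/w) = 2 input bits per sub-block
s = w * bits                     # input size of the compression function

# Random binary r x n matrix H, stored column-wise as r-bit integers.
H_cols = [random.getrandbits(r) for _ in range(n)]

def to_regular_word(m):
    """Encode an s-bit string as a regular word: one set bit per sub-block.
    (Indices here are 0-based; the text counts columns from 1.)"""
    word = [0] * n
    for i in range(w):
        idx = int(m[i * bits:(i + 1) * bits], 2)
        word[i * block + idx] = 1
    return word

def slow_compress(m):
    """Syndrome-based compression: multiply the regular word by H over F_2."""
    out = 0
    for j, bit in enumerate(to_regular_word(m)):
        if bit:
            out ^= H_cols[j]     # adding columns modulo 2 is XOR
    return out

def fast_compress(m):
    """Fast syndrome-based compression: XOR one selected column per sub-block."""
    out = 0
    for i in range(w):
        idx = int(m[i * bits:(i + 1) * bits], 2)
        out ^= H_cols[i * block + idx]
    return out

msg = "010011"                   # an s = 6 bit input
assert slow_compress(msg) == fast_compress(msg)
```

Both functions compute the same value; the fast variant never materializes the n-bit regular word, which is where the speed-up comes from.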

Example of the compression

Situation and initialization: hash a message m = 010011 using a 4 × 12 matrix H (so n = 12, w = 3, r = 4 and s = w·log2(n/w) = 6)

that is separated into sub-blocks H_1, H_2, H_3 of size 4 × 4 each.

Algorithm:

  1. We split the input m into w = 3 parts of length log2(n/w) = 2 and we get m_1 = 01, m_2 = 00, m_3 = 11.
  2. We convert each m_i into an integer between 1 and 4 and get s_1 = 2, s_2 = 1, s_3 = 4.
  3. From the first sub-matrix H_1, we pick column 2, from the second sub-matrix H_2 column 1, and from the third sub-matrix H_3 column 4.
  4. We add the chosen columns modulo 2 and obtain the resulting 4-bit hash.
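The four steps can be traced in code. The entries of the example matrix are not given here, so the sub-block values below are arbitrary stand-ins; the message 010011 matches the column choices 2, 1, 4 in the steps above:

```python
m = "010011"
w, block_bits = 3, 2                      # three chunks of log2(n/w) = 2 bits

# Step 1: split the input into w parts of length log2(n/w).
parts = [m[i * block_bits:(i + 1) * block_bits] for i in range(w)]  # ['01', '00', '11']

# Step 2: convert each part to an integer between 1 and n/w = 4.
indices = [int(p, 2) + 1 for p in parts]                            # [2, 1, 4]

# Step 3: pick the indicated column from each 4x4 sub-block of a stand-in H
# (columns stored as 4-bit integers; these entries are arbitrary).
H = [[0b1011, 0b0111, 0b0010, 0b1100],    # H_1
     [0b0001, 0b1010, 0b0110, 0b1111],    # H_2
     [0b1000, 0b0101, 0b0011, 0b1001]]    # H_3
chosen = [H[i][indices[i] - 1] for i in range(w)]

# Step 4: add (XOR) the chosen columns to obtain the 4-bit hash.
result = 0
for col in chosen:
    result ^= col
print(format(result, "04b"))              # -> 1111 for this stand-in matrix
```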

Security proof of FSB

The Merkle–Damgård construction is proven to base its security only on the security of the underlying compression function, so we only need to show that the compression function is secure.

A cryptographic hash function needs to be secure in three different aspects:

  1. Pre-image resistance: given a hash value h, it should be hard to find a message m such that Hash(m) = h
  2. Second pre-image resistance: given a message m1, it should be hard to find a message m2 ≠ m1 such that Hash(m1) = Hash(m2)
  3. Collision resistance: it should be hard to find two different messages m1 and m2 such that Hash(m1) = Hash(m2)

Note that if an adversary can find a second pre-image, then it can certainly find a collision. This means that if we can prove our system to be collision resistant, it will certainly be second-pre-image resistant.

Usually in cryptography, hard means something like "almost certainly beyond the reach of any adversary who must be prevented from breaking the system". We will, however, need a more exact meaning of the word hard. We will take hard to mean "the runtime of any algorithm that finds a collision or pre-image will depend exponentially on the size of the hash value". This means that by relatively small additions to the hash size, we can quickly reach high security.

Pre-image resistance and regular syndrome decoding (RSD)

As said before, the security of FSB depends on a problem called regular syndrome decoding (RSD). Syndrome decoding is originally a problem from coding theory but its NP-completeness makes it a nice application for cryptography. Regular syndrome decoding is a special case of syndrome decoding and is defined as follows:

Definition of RSD: given w matrices H_i of dimension r × (n/w) and a bit string S of length r such that there exists a set of w columns, one in each H_i, summing to S, find such a set of columns.

This problem has been proven NP-complete by a reduction from 3-dimensional matching. Although it is not known whether polynomial-time algorithms for NP-complete problems exist, none are known, and finding one would be a major discovery.

It is easy to see that finding a pre-image of a given hash is exactly equivalent to this problem, so the problem of finding pre-images in FSB must also be NP-complete.

We still need to prove collision resistance. For this we need another NP-complete variation of RSD: 2-regular null syndrome decoding.

Collision resistance and 2-regular null syndrome decoding (2-RNSD)

Definition of 2-RNSD: given w matrices H_i of dimension r × (n/w) and a bit string S of length r such that there exists a set of columns, two or zero in each H_i, summing to zero, find such a (non-empty) set of columns.

2-RNSD has also been proven to be NP-complete by a reduction from 3-dimensional matching.

Just as RSD is in essence equivalent to finding a regular word x such that H·x = S, 2-RNSD is equivalent to finding a 2-regular word x such that H·x = 0. A 2-regular word of length n (and weight at most 2w) is a bit string of length n such that in every interval [(i-1)·n/w, i·n/w) exactly two or zero entries are equal to 1. Note that a 2-regular word is just a sum of two regular words.

Suppose that we have found a collision, so we have Hash(m1) = Hash(m2) with m1 ≠ m2. Then we can find two distinct regular words x1 and x2 such that H·x1 = H·x2. We then have H·(x1 + x2) = H·x1 + H·x2 = 0; x1 + x2 is a sum of two different regular words and so must be a 2-regular word whose hash is zero, so we have solved an instance of 2-RNSD. We conclude that finding collisions in FSB is at least as difficult as solving 2-RNSD and so must be NP-hard.
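The argument above can be checked mechanically at toy size: because there are far more regular words than hash values, a brute-force search finds collisions by the pigeonhole principle, and each collision yields a 2-regular word hashing to zero. Parameters and matrix below are illustrative stand-ins, far smaller than real FSB sizes:

```python
import itertools
import random

random.seed(7)
n, w, r = 12, 3, 4
block = n // w
# Columns of a random r x n matrix H, stored as r-bit integers.
H_cols = [random.getrandbits(r) for _ in range(n)]

def hash_word(word):
    """XOR the columns of H selected by the set bits of the word."""
    out = 0
    for j in range(n):
        if word[j]:
            out ^= H_cols[j]
    return out

def regular_words():
    """All words with exactly one set bit in each of the w sub-blocks."""
    for picks in itertools.product(range(block), repeat=w):
        word = [0] * n
        for i, p in enumerate(picks):
            word[i * block + p] = 1
        yield tuple(word)

def is_2_regular(word):
    """Every sub-block contains exactly two or zero set bits."""
    return all(sum(word[i * block:(i + 1) * block]) in (0, 2) for i in range(w))

words = list(regular_words())                  # (n/w)^w = 64 regular words
collisions = [(a, b) for a, b in itertools.combinations(words, 2)
              if hash_word(a) == hash_word(b)]  # pigeonhole: 64 words, 16 hashes
for a, b in collisions:
    diff = tuple(x ^ y for x, y in zip(a, b))  # sum of two regular words
    assert is_2_regular(diff) and hash_word(diff) == 0
print(len(collisions), "collisions, each yielding a 2-regular word with hash 0")
```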

The latest versions of FSB use the compression function Whirlpool to further compress the hash output. Though this step cannot be proven secure, the authors argue that it does not reduce security. Note that even if one were able to find collisions in Whirlpool, one would still need to find pre-images of the colliding values under the original FSB compression function in order to obtain a collision in FSB.

Examples

When solving RSD, we are in the opposite situation as when hashing. Using the same values as in the previous example, we are given H separated into w = 3 sub-blocks and a string S of length 4. We are asked to find in each sub-block exactly one column such that all the chosen columns sum to S. The expected answer is thus s_1 = 2, s_2 = 1, s_3 = 4. This is known to be hard to compute for large matrices.

In 2-RNSD we want to find in each sub-block not one column, but two or zero columns, such that they sum up to 0000 (and not to S). In the example, we might use columns 2 and 3 (counting from 0) from H_1, no column from H_2, and columns 0 and 2 from H_3. More solutions are possible; for example, one might use no columns from H_1.
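At toy size, RSD can be solved by exhaustively trying all (n/w)^w column choices, which is exactly the search that becomes infeasible at realistic FSB parameters. The matrix below is an arbitrary stand-in:

```python
import itertools
import random

random.seed(3)
n, w, r = 12, 3, 4
block = n // w
# Columns as r-bit integers, grouped into w sub-blocks (arbitrary values).
H = [[random.getrandbits(r) for _ in range(block)] for _ in range(w)]

def solve_rsd(S):
    """Exhaustive RSD search: return 1-based column indices, one per
    sub-block, whose XOR equals the target syndrome S (or None)."""
    for picks in itertools.product(range(block), repeat=w):
        acc = 0
        for i, p in enumerate(picks):
            acc ^= H[i][p]
        if acc == S:
            return [p + 1 for p in picks]
    return None

# Plant a syndrome from a known choice of columns, then recover a solution.
target = H[0][1] ^ H[1][0] ^ H[2][3]
solution = solve_rsd(target)       # some valid choice (not necessarily the planted one)
```

The loop visits (n/w)^w = 64 candidates here; with real parameters this count is astronomically large, which is what the hardness of RSD relies on.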

Linear cryptanalysis

The provable security of FSB means that finding collisions is NP-complete. But the proof is a reduction to a problem with asymptotically hard worst-case complexity. This offers only limited security assurance, as there can still be an algorithm that easily solves the problem for a subset of the problem space. For example, there exists a linearization method that was used to produce collisions in a matter of seconds on a desktop PC for early variants of FSB with claimed 2^128 security. It has been shown that the hash function offers minimal pre-image or collision resistance when the message space is chosen in a specific way.

Practical security results

The following table shows the complexity of the best known attacks against FSB.

Output size (bits) | Complexity of collision search | Complexity of inversion
160                | 2^100.3                        | 2^163.6
224                | 2^135.3                        | 2^229.0
256                | 2^190.0                        | 2^261.0
384                | 2^215.5                        | 2^391.5
512                | 2^285.6                        | 2^527.4

Genesis

FSB is a sped-up version of the syndrome-based hash function (SB). In the case of SB, the compression function is very similar to the encoding function of Niederreiter's version of the McEliece cryptosystem. Instead of using the parity check matrix of a permuted Goppa code, SB uses a random matrix H. From a security point of view this can only strengthen the system.

Other properties

Variants

In 2007, IFSB was published.[3] In 2010, S-FSB was published, which is 30% faster than the original.[4]

In 2011, D. J. Bernstein and Tanja Lange published RFSB, which is ten times faster than the original FSB-256.[5] RFSB was shown to run very fast on the Spartan 6 FPGA, reaching throughputs of around 5 Gbit/s.[6]


References

  1. Augot, D.; Finiasz, M.; Sendrier, N. (2003), A Fast Provably Secure Cryptographic Hash Function.
  2. Saarinen, Markku-Juhani O. (2007), "Linearization Attacks Against Syndrome Based Hashes", Progress in Cryptology – INDOCRYPT 2007, Lecture Notes in Computer Science, vol. 4859, pp. 1–9, doi:10.1007/978-3-540-77026-8_1, ISBN 978-3-540-77025-1.
  3. Finiasz, M.; Gaborit, P.; Sendrier, N. (2007), Improved Fast Syndrome Based Cryptographic Hash Functions, ECRYPT Hash Workshop 2007.
  4. Meziani, Mohammed; Dagdelen, Özgür; Cayrel, Pierre-Louis; El Yousfi Alaoui, Sidi Mohamed (2011), "S-FSB: An Improved Variant of the FSB Hash Family", Information Security and Assurance, Communications in Computer and Information Science, vol. 200, pp. 132–145, doi:10.1007/978-3-642-23141-4_13, ISBN 978-3-642-23140-7.
  5. Bernstein, Daniel J.; Lange, Tanja; Peters, Christiane; Schwabe, Peter (2011), "Really Fast Syndrome-Based Hashing", Progress in Cryptology – AFRICACRYPT 2011, Lecture Notes in Computer Science, vol. 6737, pp. 134–152, doi:10.1007/978-3-642-21969-6_9, ISBN 978-3-642-21968-9.
  6. von Maurich, Ingo; Güneysu, Tim (2012), "Embedded Syndrome-Based Hashing", Progress in Cryptology – INDOCRYPT 2012, Lecture Notes in Computer Science, vol. 7668, pp. 339–357, doi:10.1007/978-3-642-34931-7_20, ISBN 978-3-642-34930-0.