Interpolation attack

Last updated July 31, 2024

In cryptography, an interpolation attack is a type of cryptanalytic attack against block ciphers.

After differential cryptanalysis and linear cryptanalysis were introduced, several new block ciphers were proposed that were proven secure against both attacks. Among them were iterated block ciphers such as the KN-Cipher and the SHARK cipher. However, in the late 1990s Thomas Jakobsen and Lars Knudsen showed that these ciphers were easy to break with a new attack, the interpolation attack.

In the attack, an algebraic function is used to represent an S-box. This may be a simple quadratic, or a polynomial or rational function over a Galois field. Its coefficients can be determined by standard Lagrange interpolation techniques, using known plaintexts as data points. Alternatively, chosen plaintexts can be used to simplify the equations and optimize the attack.

In its simplest version an interpolation attack expresses the ciphertext as a polynomial of the plaintext. If the polynomial has a relatively low number of unknown coefficients, it can be reconstructed from a collection of plaintext/ciphertext (p/c) pairs. With the polynomial reconstructed, the attacker has a representation of the encryption without exact knowledge of the secret key.

The interpolation attack can also be used to recover the secret key.

It is easiest to describe the method with an example.

Example

Let an iterated cipher be given by

c_{i}=(c_{i-1}\oplus k_{i})^{3},

where $c_{0}$ is the plaintext, $c_{i}$ is the output of the $i$-th round, $k_{i}$ is the secret $i$-th round key (derived from the secret key $K$ by some key schedule), and, for an $r$-round iterated cipher, $c_{r}$ is the ciphertext.

Consider the 2-round cipher over $GF(2^{m})$, where addition is the same operation as $\oplus$ and $(a+b)^{2}=a^{2}+b^{2}$. Let $x$ denote the message, and $c$ denote the ciphertext.

Then the output of round 1 becomes

c_{1}=(x+k_{1})^{3}=(x^{2}+k_{1}^{2})(x+k_{1})=x^{3}+k_{1}^{2}x+x^{2}k_{1}+k_{1}^{3},

and the output of round 2 becomes

c_{2}=c=(c_{1}+k_{2})^{3}=(x^{3}+k_{1}^{2}x+x^{2}k_{1}+k_{1}^{3}+k_{2})^{3}

=x^{9}+x^{8}k_{1}+x^{6}k_{2}+x^{4}k_{1}^{2}k_{2}+x^{3}k_{2}^{2}+x^{2}(k_{1}k_{2}^{2}+k_{1}^{4}k_{2})+x(k_{1}^{2}k_{2}^{2}+k_{1}^{8})+k_{1}^{3}k_{2}^{2}+k_{1}^{6}k_{2}+k_{1}^{9}+k_{2}^{3}.

Expressing the ciphertext as a polynomial of the plaintext yields

p(x)=a_{1}x^{9}+a_{2}x^{8}+a_{3}x^{6}+a_{4}x^{4}+a_{5}x^{3}+a_{6}x^{2}+a_{7}x+a_{8},

where the $a_{i}$ are key-dependent constants.

Using as many plaintext/ciphertext pairs as there are unknown coefficients in $p(x)$, we can construct the polynomial, for example by Lagrange interpolation (see Lagrange polynomial). Once the unknown coefficients have been determined, we have a representation $p(x)$ of the encryption without knowledge of the secret key $K$.
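As a rough illustration of the reconstruction step, the following Python sketch interpolates the 2-round example over a small field. The field $GF(2^{7})$ with modulus $x^{7}+x+1$, the round keys, and all function names are assumptions chosen for the demo, not part of the original description. It uses plain Lagrange interpolation for a polynomial of degree at most 9, which takes 10 pairs rather than the 8 that a solver restricted to the monomials of $p(x)$ would need.

```python
M = 7                 # assumed toy block size in bits
MOD = 0b10000011      # assumed field modulus x^7 + x + 1


def gf_mul(a, b):
    """Multiply two elements of GF(2^M) (shift-and-add with reduction)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:
            a ^= MOD
    return r


def gf_pow(a, e):
    """Square-and-multiply exponentiation in GF(2^M)."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def gf_inv(a):
    """Inverse of a nonzero element, computed as a^(2^M - 2)."""
    return gf_pow(a, (1 << M) - 2)


def encrypt(x, round_keys):
    """Toy iterated cipher c_i = (c_{i-1} XOR k_i)^3."""
    c = x
    for k in round_keys:
        c = gf_pow(c ^ k, 3)
    return c


def lagrange_eval(points, t):
    """Evaluate at t the unique polynomial of degree < len(points)
    passing through the given (x, y) points; subtraction is XOR."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                num = gf_mul(num, t ^ xj)
                den = gf_mul(den, xi ^ xj)
        total ^= gf_mul(yi, gf_mul(num, gf_inv(den)))
    return total


secret_keys = [0x35, 0x5A]  # hypothetical round keys, used only to produce pairs
known = [(x, encrypt(x, secret_keys)) for x in range(10)]  # 10 known p/c pairs

# Compare the interpolated polynomial with the real cipher on every plaintext.
forged_ok = all(
    lagrange_eval(known, x) == encrypt(x, secret_keys) for x in range(1 << M)
)
```

Here `forged_ok` is true: the polynomial rebuilt from the ten pairs agrees with the cipher on every plaintext, although the interpolation itself never touches the round keys.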

Existence

An $m$-bit block cipher has $2^{m}$ possible plaintexts, and therefore $2^{m}$ distinct $p/c$ pairs. Let there be $n$ unknown coefficients in $p(x)$. Since we require as many $p/c$ pairs as there are unknown coefficients in the polynomial, an interpolation attack exists only if $n\leq 2^{m}$.
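To get a feel for the bound, the sketch below assumes the cubic toy cipher from the example and the dense worst case $n=3^{r}+1$ (degree $3^{r}$ after $r$ rounds, with every coefficient present). Both the function name and this counting rule are assumptions for illustration; the real coefficient count can be smaller, as the 8-coefficient 2-round example shows.

```python
def max_rounds_dense_bound(m, sbox_degree=3):
    """Largest r such that sbox_degree**r + 1 <= 2**m.

    Assumes the dense worst case: a polynomial of degree sbox_degree**r
    with all coefficients unknown. This is a conservative estimate only.
    """
    r = 0
    while sbox_degree ** (r + 1) + 1 <= 2 ** m:
        r += 1
    return r


rounds_for_7_bits = max_rounds_dense_bound(7)    # 3^4 + 1 = 82 <= 128 < 3^5 + 1
rounds_for_64_bits = max_rounds_dense_bound(64)  # 3^40 + 1 <= 2^64 < 3^41 + 1
```

Under this assumption the condition $n\leq 2^{m}$ holds for up to 4 rounds when $m=7$ and up to 40 rounds when $m=64$.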

Time complexity

Assume that the time to construct the polynomial $p(x)$ from $p/c$ pairs is small compared with the time to encrypt the required plaintexts. Let there be $n$ unknown coefficients in $p(x)$. Then the time complexity of this attack is $n$, requiring $n$ known distinct $p/c$ pairs.

Interpolation attack by Meet-In-The-Middle

This variant is often more efficient. It works as follows.

Given an $r$-round iterated cipher with block length $m$, let $z$ be the output of the cipher after $s$ rounds with $s<r$. We will express the value of $z$ both as a polynomial of the plaintext $x$ and as a polynomial of the ciphertext $c$. Let $g(x)\in GF(2^{m})[x]$ be the expression of $z$ via $x$, and let $h(c)\in GF(2^{m})[c]$ be the expression of $z$ via $c$. The polynomial $g(x)$ is obtained by computing forward with the iterated formula of the cipher until round $s$, and the polynomial $h(c)$ is obtained by computing backwards with the iterated formula of the cipher, starting from round $r$ and going down to round $s+1$.

So it should hold that

g(x)=h(c),

and if both $g$ and $h$ are polynomials with a low number of coefficients, then we can solve the equation for the unknown coefficients.

Time complexity

Assume that $g(x)$ can be expressed by $p$ coefficients, and $h(c)$ can be expressed by $q$ coefficients. Then we would need $p+q$ known distinct $p/c$ pairs to solve the equation by setting it up as a matrix equation. However, the solution of this matrix equation is only determined up to a multiplication and an addition. So to make sure that we get a unique and non-zero solution, we set the coefficient corresponding to the highest degree to one, and the constant term to zero. Therefore, $p+q-2$ known distinct $p/c$ pairs are required, and the time complexity of this attack is $p+q-2$.

With the Meet-In-The-Middle approach the total number of coefficients is usually smaller than with the normal method. This makes the method more efficient, since fewer $p/c$ pairs are required.
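For a concrete feel, consider the 2-round example cipher with $s=1$, under the added assumption that $m$ is odd, so that cubing is invertible in $GF(2^{m})$ and there is a $d$ with $3d\equiv 1{\pmod {2^{m}-1}}$. The symbols $d$, $a_{i}$ and $b_{i}$ are introduced only for this sketch.

```latex
\begin{aligned}
g(x) &= (x+k_{1})^{3} = x^{3}+k_{1}x^{2}+k_{1}^{2}x+k_{1}^{3}, & p &= 4,\\
h(c) &= c^{d}+k_{2}, & q &= 2,\\
x^{3}+a_{2}x^{2}+a_{1}x &= b_{1}c^{d}+b_{0} & &\text{(leading coefficient of } g \text{ set to one, its constant term moved into } b_{0}\text{)}.
\end{aligned}
```

Each $p/c$ pair gives one linear equation in the four unknowns $a_{2},a_{1},b_{1},b_{0}$, so $p+q-2=4$ pairs are needed in this sketch, compared with the 8 unknown coefficients of the direct polynomial $p(x)$ in the example.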

Key-recovery

We can also use the interpolation attack to recover the secret key $K$ .

If we remove the last round of an $r$-round iterated cipher with block length $m$, the output of the cipher becomes ${\tilde {y}}=c_{r-1}$. Call this cipher the reduced cipher. The idea is to guess the last round key $k_{r}$, so that we can decrypt one round to obtain the output ${\tilde {y}}$ of the reduced cipher. To verify the guess we then use the interpolation attack on the reduced cipher, either by the normal method or by the Meet-In-The-Middle method, as follows.

By the normal method we express the output ${\tilde {y}}$ of the reduced cipher as a polynomial of the plaintext $x$. Call the polynomial $p(x)\in GF(2^{m})[x]$. If we can express $p(x)$ with $n$ coefficients, then using $n$ known distinct $p/c$ pairs we can construct the polynomial. To verify the guess of the last round key, check with one extra $p/c$ pair whether it holds that

p(x)={\tilde {y}}.

If yes, then with high probability the guess of the last round key was correct. If no, then make another guess of the key.

By the Meet-In-The-Middle method we express the output $z$ from round $s<r$ as a polynomial of the plaintext $x$ and as a polynomial of the output of the reduced cipher ${\tilde {y}}$. Call the polynomials $g(x)$ and $h({\tilde {y}})$, and let them be expressed by $p$ and $q$ coefficients, respectively. Then with $p+q-2$ known distinct $p/c$ pairs we can find the coefficients. To verify the guess of the last round key, check with one extra $p/c$ pair whether it holds that

g(x)=h({\tilde {y}}).

If yes, then with high probability the guess of the last round key was correct. If no, then make another guess of the key.

Once we have found the correct last round key, we can continue in a similar fashion on the remaining round keys.
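The guess-and-verify loop of the normal method can be sketched in Python as follows. The sketch assumes a hypothetical variant of the toy cipher in which the round key is added after cubing, $c_{i}=c_{i-1}^{3}\oplus k_{i}$; in the example cipher as written, a wrong guess of $k_{r}$ would only shift ${\tilde {y}}$ by a constant and could not be told apart this way. The field $GF(2^{7})$ with modulus $x^{7}+x+1$, the keys, and all function names are likewise assumptions. Because the field is tiny, the sketch checks each guess against several extra pairs instead of one.

```python
M = 7                 # assumed toy block size in bits
MOD = 0b10000011      # assumed field modulus x^7 + x + 1
CUBE_ROOT = 85        # 3 * 85 = 255 = 2 * 127 + 1, so (a^3)^85 = a in GF(2^7)


def gf_mul(a, b):
    """Multiply two elements of GF(2^M)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:
            a ^= MOD
    return r


def gf_pow(a, e):
    """Square-and-multiply exponentiation in GF(2^M)."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def encrypt(x, round_keys):
    """Hypothetical variant cipher: c_i = c_{i-1}^3 XOR k_i."""
    c = x
    for k in round_keys:
        c = gf_pow(c, 3) ^ k
    return c


def peel_last_round(c, guess):
    """Undo the last round under a guessed last round key."""
    return gf_pow(c ^ guess, CUBE_ROOT)


def lagrange_eval(points, t):
    """Evaluate at t the polynomial of degree < len(points) through points."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                num = gf_mul(num, t ^ xj)
                den = gf_mul(den, xi ^ xj)
        total ^= gf_mul(yi, gf_mul(num, gf_pow(den, (1 << M) - 2)))
    return total


def candidate_last_round_keys(pairs, n_interp):
    """Return every guess whose reduced-cipher outputs fit one polynomial."""
    fit, check = pairs[:n_interp], pairs[n_interp:]
    survivors = []
    for guess in range(1 << M):
        reduced = [(x, peel_last_round(c, guess)) for x, c in fit]
        if all(lagrange_eval(reduced, x) == peel_last_round(c, guess)
               for x, c in check):
            survivors.append(guess)
    return survivors


secret_keys = [0x35, 0x5A, 0x17]   # hypothetical 3-round key material
pairs = [(x, encrypt(x, secret_keys)) for x in range(16)]
# The reduced (2-round) cipher has degree 9, so 10 pairs fix the polynomial
# and the remaining 6 pairs are used to test each guess.
survivors = candidate_last_round_keys(pairs, n_interp=10)
```

The true last round key `0x17` always survives, since with the right guess the reduced cipher output is a degree-9 polynomial in $x$. A wrong guess survives only if its peeled outputs happen to fit such a polynomial on all the extra pairs.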

Time complexity

With a secret round key of length $m$ there are $2^{m}$ different keys, each correct with probability $1/2^{m}$ if chosen at random. Therefore, we will on average have to make $1/2\cdot 2^{m}=2^{m-1}$ guesses before finding the correct key.

Hence, the normal method has average time complexity $2^{m-1}(n+1)$, requiring $n+1$ known distinct $p/c$ pairs, and the Meet-In-The-Middle method has average time complexity $2^{m-1}(p+q-1)$, requiring $p+q-1$ known distinct $p/c$ pairs.

Real world application

A variant of the Meet-In-The-Middle attack can be used against S-boxes that use the inverse function, because for an $m$-bit S-box $S:f(x)=x^{-1}=x^{2^{m}-2}$ in $GF(2^{m})$.
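The identity $x^{-1}=x^{2^{m}-2}$ can be checked numerically. The sketch below does so in $GF(2^{8})$ with the reduction polynomial $x^{8}+x^{4}+x^{3}+x+1$; this field representation and the helper names are assumptions for illustration and are not taken from the SHARK specification.

```python
M = 8
MOD = 0x11B  # assumed reduction polynomial x^8 + x^4 + x^3 + x + 1


def gf_mul(a, b):
    """Multiply two elements of GF(2^8)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:
            a ^= MOD
    return r


def gf_pow(a, e):
    """Square-and-multiply exponentiation in GF(2^8)."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


def inverse_sbox(x):
    """The map x -> x^(2^M - 2), which sends 0 to 0 and x != 0 to 1/x."""
    return gf_pow(x, (1 << M) - 2)


# For every nonzero x, multiplying by inverse_sbox(x) must give 1.
is_inverse = all(gf_mul(x, inverse_sbox(x)) == 1 for x in range(1, 1 << M))
```

Written as a polynomial, this map has degree $2^{m}-2$ (254 here), whereas as the rational function $1/x$ it is simple, which is the kind of low-complexity description the attack relies on.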

The block cipher SHARK uses an SP-network with S-box $S:f(x)=x^{-1}$. The cipher is resistant against differential and linear cryptanalysis after a small number of rounds. However, it was broken in 1996 by Thomas Jakobsen and Lars Knudsen using an interpolation attack. Denote by SHARK$(n,m,r)$ a version of SHARK with block size $nm$ bits using $n$ parallel $m$-bit S-boxes in $r$ rounds. Jakobsen and Knudsen found that there exists an interpolation attack on SHARK$(8,8,4)$ (a 64-bit block cipher) using about $2^{21}$ chosen plaintexts, and an interpolation attack on SHARK$(8,16,7)$ (a 128-bit block cipher) using about $2^{61}$ chosen plaintexts.

Thomas Jakobsen also introduced a probabilistic version of the interpolation attack using Madhu Sudan's algorithm for improved decoding of Reed-Solomon codes. This attack can work even when an algebraic relationship between plaintexts and ciphertexts holds for only a fraction of values.


References

Thomas Jakobsen, Lars Knudsen (January 1997). The Interpolation Attack on Block Ciphers (PDF/PostScript). 4th International Workshop on Fast Software Encryption (FSE '97), LNCS 1267. Haifa: Springer-Verlag. pp. 28–40. Retrieved 2007-07-03.
Thomas Jakobsen (August 25, 1998). Cryptanalysis of Block Ciphers with Probabilistic Non-linear Relations of Low Degree (PDF/PostScript). Advances in Cryptology - CRYPTO '98. Santa Barbara, California: Springer-Verlag. pp. 212–222. Retrieved 2007-07-06.
Shiho Moriai; Takeshi Shimoyama; Toshinobu Kaneko (March 1999). Interpolation Attacks of the Block Cipher: SNAKE (PDF). FSE '99. Rome: Springer-Verlag. pp. 275–289. doi:10.1007/3-540-48519-8_20 . Retrieved 2022-11-06.
Amr M. Youssef; Guang Gong (April 2000). On the Interpolation Attacks on Block Ciphers (PDF). FSE 2000. New York City: Springer-Verlag. pp. 109–120. Retrieved 2007-07-06.
Kaoru Kurosawa; Tetsu Iwata; Viet Duong Quang (August 2000). Root Finding Interpolation Attack (PDF/PostScript). Proceedings of the 7th Annual International Workshop on Selected Areas in Cryptography (SAC 2000). Waterloo, Ontario: Springer-Verlag. pp. 303–314. Retrieved 2007-07-06.

