Confusion and diffusion

In cryptography, confusion and diffusion are two properties of the operation of a secure cipher identified by Claude Shannon in his 1945 classified report A Mathematical Theory of Cryptography. [1] These properties, when present, work together to thwart the application of statistics and other methods of cryptanalysis.

Confusion in a symmetric cipher obscures the local correlation between the input (plaintext) and the output (ciphertext) by varying how the key is applied to the data, while diffusion hides the plaintext statistics by spreading them over a larger area of the ciphertext. [2] Although ciphers can be confusion-only (substitution cipher, one-time pad) or diffusion-only (transposition cipher), any "reasonable" block cipher uses both confusion and diffusion. [2] These concepts are also important in the design of cryptographic hash functions and pseudorandom number generators, where decorrelation of the generated values is the main goal. Diffusion (and its avalanche effect) also applies to non-cryptographic hash functions.

Definition

Confusion

Confusion means that each binary digit (bit) of the ciphertext should depend on several parts of the key, obscuring the connections between the two. [3]

The property of confusion hides the relationship between the ciphertext and the key.

This property makes it difficult to find the key from the ciphertext: if a single bit of the key is changed, the calculation of most or all of the bits in the ciphertext is affected.

Confusion increases the ambiguity of the ciphertext, and it is used by both block and stream ciphers.

In substitution–permutation networks, confusion is provided by substitution boxes. [4]
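As a rough illustration of what makes a substitution box useful for confusion, the sketch below checks a small table for non-affine behaviour. The 4-bit table is the S-box published for the PRESENT cipher, borrowed here only as a convenient example; the helper name `affine_violations` and the comparison table are made up for this illustration. An affine map over GF(2) always satisfies S(a XOR b) = S(a) XOR S(b) XOR S(0), so any violation of that identity shows the table is nonlinear.

```python
# 4-bit S-box of the PRESENT block cipher (used only as a small example).
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

# Affine comparison table: rotate the 4-bit input left by one, then XOR 0b0101.
AFFINE_TABLE = [(((x << 1) | (x >> 3)) & 0xF) ^ 0x5 for x in range(16)]


def affine_violations(table):
    """Count pairs (a, b) with S(a ^ b) != S(a) ^ S(b) ^ S(0).

    An affine map over GF(2) produces zero violations.
    """
    n = len(table)
    return sum(
        1
        for a in range(n)
        for b in range(n)
        if table[a ^ b] != table[a] ^ table[b] ^ table[0]
    )


if __name__ == "__main__":
    print("PRESENT S-box violations:", affine_violations(PRESENT_SBOX))
    print("affine table violations: ", affine_violations(AFFINE_TABLE))
```

A count of zero means the table could be undone with linear algebra alone. This check is only a coarse indicator and does not by itself measure resistance to linear or differential cryptanalysis.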

Diffusion

Diffusion means that if we change a single bit of the plaintext, then about half of the bits in the ciphertext should change, and similarly, if we change one bit of the ciphertext, then about half of the plaintext bits should change. [5] This is equivalent to the expectation that encryption schemes exhibit an avalanche effect.
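The "about half of the bits" expectation can be measured empirically. The sketch below is a minimal experiment, not a test of any particular cipher: because the Python standard library ships no block cipher, it uses SHA-256 from `hashlib` as a stand-in, so it only exercises the input-to-output direction. The helper names (`flip_bit`, `hamming_distance`, `mean_avalanche`) and the sample message are assumptions introduced for the example.

```python
import hashlib


def flip_bit(data: bytes, index: int) -> bytes:
    """Return a copy of data with bit number `index` inverted."""
    out = bytearray(data)
    out[index // 8] ^= 1 << (index % 8)
    return bytes(out)


def hamming_distance(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def mean_avalanche(message: bytes) -> float:
    """Average number of digest bits that change per single-bit input flip."""
    base = hashlib.sha256(message).digest()
    n_bits = len(message) * 8
    total = sum(
        hamming_distance(base, hashlib.sha256(flip_bit(message, i)).digest())
        for i in range(n_bits)
    )
    return total / n_bits


if __name__ == "__main__":
    msg = b"confusion and diffusion"
    print(f"mean changed bits: {mean_avalanche(msg):.1f} of 256")
```

For a function with good diffusion the reported mean should sit close to 128, that is, half of the 256 output bits.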

The purpose of diffusion is to hide the statistical relationship between the ciphertext and the plaintext. For example, diffusion ensures that any patterns in the plaintext, such as redundant bits, are not apparent in the ciphertext. [3] Block ciphers achieve this by "diffusing" the information about the plaintext's structure across the rows and columns of the cipher.

In substitution–permutation networks, diffusion is provided by permutation boxes (also known as the permutation layer [4]). At the beginning of the 21st century, a consensus emerged among designers that the permutation layer should consist of linear Boolean functions, although nonlinear functions can be used too. [4]

Theory

In Shannon's original definitions, confusion refers to making the relationship between the ciphertext and the symmetric key as complex and involved as possible; diffusion refers to dissipating the statistical structure of plaintext over the bulk of ciphertext. This complexity is generally implemented through a well-defined and repeatable series of substitutions and permutations. Substitution refers to the replacement of certain components (usually bits) with other components, following certain rules. Permutation refers to manipulation of the order of bits according to some algorithm. To be effective, any non-uniformity of plaintext bits needs to be redistributed across much larger structures in the ciphertext, making that non-uniformity much harder to detect.
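The interplay of substitution and permutation described above can be sketched with a toy 16-bit network. Everything in this example is an assumption made for illustration: the round keys are arbitrary constants, the bit permutation is a PRESENT-style pattern scaled down to 16 bits, and the S-box is the one published for PRESENT. It is far too small to be secure and only shows how a one-bit difference spreads as rounds are added.

```python
# PRESENT's 4-bit S-box, borrowed as a convenient nonlinear table.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

# Bit i moves to position 4*i mod 15 (bit 15 stays in place), so the four
# output bits of each S-box feed four different S-boxes in the next round.
PERM = [(4 * i) % 15 if i < 15 else 15 for i in range(16)]

ROUND_KEYS = [0x3A94, 0xC1E7, 0x5B02, 0x96DF]  # arbitrary illustrative values


def substitute(block: int) -> int:
    """Substitution: replace each of the four nibbles via the S-box."""
    out = 0
    for nibble in range(4):
        out |= SBOX[(block >> (4 * nibble)) & 0xF] << (4 * nibble)
    return out


def permute(block: int) -> int:
    """Permutation: reorder the 16 bits according to PERM."""
    out = 0
    for i in range(16):
        out |= ((block >> i) & 1) << PERM[i]
    return out


def encrypt(block: int, round_keys) -> int:
    """One round = key mixing (XOR), then substitution, then permutation."""
    for key in round_keys:
        block = permute(substitute(block ^ key))
    return block


def mean_changed_bits(rounds: int) -> float:
    """Average number of output bits changed by flipping plaintext bit 0."""
    keys = ROUND_KEYS[:rounds]
    samples = range(0, 1 << 16, 64)
    total = sum(
        bin(encrypt(x, keys) ^ encrypt(x ^ 1, keys)).count("1") for x in samples
    )
    return total / len(samples)


if __name__ == "__main__":
    for r in range(1, 5):
        print(r, "round(s):", round(mean_changed_bits(r), 2), "of 16 bits changed")
```

After a single round the difference is confined to the output bits of one S-box (at most four bits). Further rounds let the permutation carry those bits into other S-boxes, which is the redistribution over larger structures that the paragraph above describes.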

In particular, for a randomly chosen input, if one flips the i-th bit, then the probability that the j-th output bit will change should be one half, for any i and j—this is termed the strict avalanche criterion. More generally, one may require that flipping a fixed set of bits should change each output bit with probability one half.
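The strict avalanche criterion can be estimated numerically for a small function by tabulating, for every input bit i and output bit j, how often flipping bit i changes bit j. The sketch below is only an illustration under stated assumptions: `toy_function` is a hypothetical 12-bit function built by truncating SHA-256, chosen so that all inputs can be enumerated, and the names `sac_matrix` and `max_deviation` are invented for this example.

```python
import hashlib

N_BITS = 12  # toy width so that all 2**12 inputs can be enumerated


def toy_function(x: int) -> int:
    """Hypothetical 12-bit to 12-bit function: SHA-256 truncated to 12 bits."""
    digest = hashlib.sha256(x.to_bytes(2, "big")).digest()
    return int.from_bytes(digest[:2], "big") & ((1 << N_BITS) - 1)


def sac_matrix(f, n_bits: int):
    """matrix[i][j] = fraction of inputs for which flipping input bit i
    changes output bit j."""
    size = 1 << n_bits
    outputs = [f(x) for x in range(size)]
    matrix = []
    for i in range(n_bits):
        counts = [0] * n_bits
        for x in range(size):
            diff = outputs[x] ^ outputs[x ^ (1 << i)]
            for j in range(n_bits):
                counts[j] += (diff >> j) & 1
        matrix.append([c / size for c in counts])
    return matrix


def max_deviation(matrix) -> float:
    """Largest distance of any entry from the ideal value 1/2."""
    return max(abs(p - 0.5) for row in matrix for p in row)


if __name__ == "__main__":
    print("truncated hash:", round(max_deviation(sac_matrix(toy_function, N_BITS)), 3))
    print("identity map:  ", max_deviation(sac_matrix(lambda x: x, N_BITS)))
```

A function close to satisfying the criterion yields entries near 0.5 everywhere. The identity map is the opposite extreme: each input bit changes exactly one output bit with probability 1 and every other bit with probability 0.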

One aim of confusion is to make it very hard to find the key even if one has a large number of plaintext-ciphertext pairs produced with the same key. Therefore, each bit of the ciphertext should depend on the entire key, and in different ways on different bits of the key. In particular, changing one bit of the key should change the ciphertext completely.

Practical applications

The design of a modern block cipher uses both confusion and diffusion, [2] with confusion changing the data between the input and the output by applying a key-dependent non-linear transformation (linear calculations are easier to reverse and thus easier to break).

Confusion inevitably involves some diffusion, [6] so a design with a very wide-input S-box can provide the necessary diffusion properties, [citation needed] but will be very costly to implement. Therefore, practical ciphers use relatively small S-boxes that operate on small groups of bits ("bundles" [7]). For example, AES uses 8-bit S-boxes, Serpent uses 4-bit ones, and BaseKing and 3-Way use 3-bit ones. [8] Small S-boxes provide almost no diffusion, so the resources are spent on simpler diffusion transformations. [6] For example, the wide trail strategy popularized by the Rijndael design involves a linear mixing transformation that provides high diffusion, [9] although the security proofs do not depend on the diffusion layer being linear. [10]

One of the most researched cipher structures is the substitution-permutation network (SPN), in which each round includes a layer of local nonlinear permutations (S-boxes) for confusion and a linear diffusion transformation (usually a multiplication by a matrix over a finite field). [11] Modern block ciphers mostly follow the confusion layer/diffusion layer model, with the efficiency of the diffusion layer estimated using the so-called branch number, a numerical parameter that can reach the value s + 1 for s input bundles for a perfect diffusion transformation. [12] Since transformations with high branch numbers (which thus require many bundles as inputs) are costly to implement, the diffusion layer is sometimes (for example, in AES) composed of two sublayers: "local diffusion", which processes subsets of the bundles in a bricklayer fashion (each subset is transformed independently), and "dispersion", which makes bits that were "close" (within one subset of bundles) become "distant" (spread to different subsets, and thus locally diffused within these new subsets in the next round). [13]
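The branch number of a linear map over bundles is the minimum, over all nonzero inputs, of the number of nonzero input bundles plus the number of nonzero output bundles; for s bundles it cannot exceed s + 1. The sketch below computes it by brute force for 2×2 matrices over GF(2^8), using the reduction polynomial from AES. The two matrices and the function names are assumptions chosen for illustration and are not taken from any real cipher; exhaustive search like this is only feasible for very small s.

```python
from itertools import product


def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (the AES polynomial)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def branch_number(matrix) -> int:
    """Minimum over nonzero inputs of (nonzero input bytes + nonzero output bytes)."""
    s = len(matrix)
    # Precompute one multiplication table per distinct matrix entry.
    tables = {m: [gf_mul(m, v) for v in range(256)] for row in matrix for m in row}
    best = 2 * s  # the sum of the two weights can never exceed 2*s
    for vec in product(range(256), repeat=s):
        if not any(vec):
            continue
        out_weight = 0
        for row in matrix:
            acc = 0
            for m, v in zip(row, vec):
                acc ^= tables[m][v]
            out_weight += acc != 0
        in_weight = sum(v != 0 for v in vec)
        best = min(best, in_weight + out_weight)
    return best


STRONG = [[1, 2], [2, 1]]  # illustrative matrix: all entries and the determinant are nonzero
WEAK = [[1, 1], [0, 1]]    # invertible, but one input byte can reach only one output byte

if __name__ == "__main__":
    print("strong matrix:", branch_number(STRONG))
    print("weak matrix:  ", branch_number(WEAK))
```

With s = 2 the first matrix reaches the maximum of 3, while the second only achieves 2 because an input with a single active byte can map to an output with a single active byte.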

Analysis of AES

The Advanced Encryption Standard (AES) has excellent confusion and diffusion. Its confusion look-up tables are highly non-linear and good at destroying patterns. [14] Its diffusion stage spreads every part of the input to every part of the output: changing one bit of the input changes half of the output bits on average. Both confusion and diffusion are repeated multiple times for each input to increase the amount of scrambling. The secret key is mixed in at every stage so that an attacker cannot precalculate what the cipher does.

None of this happens in a simple one-stage scramble based on a key. Input patterns flow straight through to the output. The result might look random to the eye, but analysis would find obvious patterns and the cipher could be broken.
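One way to picture such a one-stage scramble is a repeating-key XOR. This is an assumption made for the example (the text does not name a specific scheme), and the key, the message block and the function names are likewise invented. The sketch shows two kinds of pattern that pass straight through: repeated plaintext blocks produce repeated ciphertext blocks, and a one-bit plaintext change alters exactly one ciphertext bit.

```python
def xor_scramble(data: bytes, key: bytes) -> bytes:
    """Hypothetical one-stage scramble: XOR each byte with a repeating key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def differing_bits(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


KEY = bytes.fromhex("8f3a51c2e4079bd6620d7e19a5b3f04c")  # arbitrary 16-byte key
BLOCK = b"ATTACK AT DAWN!!"                                # 16-byte plaintext block

if __name__ == "__main__":
    plaintext = BLOCK * 2
    ciphertext = xor_scramble(plaintext, KEY)
    print("repeated blocks stay identical:", ciphertext[:16] == ciphertext[16:])

    tweaked = bytes([plaintext[0] ^ 0x01]) + plaintext[1:]
    changed = differing_bits(ciphertext, xor_scramble(tweaked, KEY))
    print("ciphertext bits changed by a one-bit flip:", changed)
```

Neither effect depends on the key value, which is why statistical analysis of the output can expose the structure of the input.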

References

  1. "Information Theory and Entropy". Model Based Inference in the Life Sciences: A Primer on Evidence. Springer New York. 2008-01-01. pp. 51–82. doi:10.1007/978-0-387-74075-1_3. ISBN 9780387740737.
  2. Stamp & Low 2007, p. 182.
  3. Shannon, C. E. (October 1949). "Communication Theory of Secrecy Systems". Bell System Technical Journal. 28 (4): 656–715. doi:10.1002/j.1538-7305.1949.tb00928.x.
  4. Liu, Rijmen & Leander 2018, p. 1.
  5. Stallings, William (2014). Cryptography and Network Security (6th ed.). Upper Saddle River, N.J.: Prentice Hall. pp. 67–68. ISBN 978-0133354690.
  6. Daemen & Rijmen 2013, p. 130.
  7. Daemen & Rijmen 2013, p. 20.
  8. Daemen & Rijmen 2013, p. 21.
  9. Daemen & Rijmen 2013, p. 126.
  10. Liu, Rijmen & Leander 2018, p. 2.
  11. Li & Wang 2017.
  12. Sajadieh et al. 2012.
  13. Daemen & Rijmen 2013, p. 131.
  14. Stallings, William (2017). Cryptography and Network Security: Principles and Practice, Global Edition. Pearson. p. 177. ISBN 978-1292158587.

Sources