Confusion and diffusion

Last updated July 30, 2024

In cryptography, confusion and diffusion are two properties of a secure cipher identified by Claude Shannon in his 1945 classified report A Mathematical Theory of Cryptography.^[1] When present, these properties work together to thwart statistical attacks and other methods of cryptanalysis.

Confusion in a symmetric cipher obscures the local correlation between the input (plaintext) and the output (ciphertext) by varying the application of the key to the data, while diffusion hides the plaintext statistics by spreading them over a larger area of the ciphertext.^[2] Although ciphers can be confusion-only (substitution cipher, one-time pad) or diffusion-only (transposition cipher), any "reasonable" block cipher uses both confusion and diffusion.^[2] These concepts are also important in the design of cryptographic hash functions and pseudorandom number generators, where decorrelation of the generated values is the main feature. Diffusion (and its avalanche effect) is also applicable to non-cryptographic hash functions.

Definition

Confusion

Confusion means that each binary digit (bit) of the ciphertext should depend on several parts of the key, obscuring the connections between the two.^[3]

The property of confusion hides the relationship between the ciphertext and the key.

This property makes it difficult to find the key from the ciphertext: if a single bit of the key is changed, most or all of the bits of the ciphertext should be affected.

Confusion increases the ambiguity of the ciphertext, and it is used by both block and stream ciphers.

In substitution–permutation networks, confusion is provided by substitution boxes.^[4]
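
As a rough illustration of why S-boxes are chosen to be nonlinear, the sketch below tests whether a small lookup table is an affine function over GF(2). The names `TOY_SBOX` and `is_affine` are hypothetical and introduced only for this example, and the table values are illustrative rather than taken from the text above.

```python
# Illustrative 4-bit S-box (an assumption for this demo, not a recommendation).
TOY_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
            0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def is_affine(sbox):
    """Return True if sbox[a ^ b] == sbox[a] ^ sbox[b] ^ sbox[0] for all a, b.

    A table with this property is an affine map over GF(2); such a map can be
    described by a few linear equations and provides no confusion.
    """
    size = len(sbox)
    for a in range(size):
        for b in range(size):
            if sbox[a ^ b] != sbox[a] ^ sbox[b] ^ sbox[0]:
                return False
    return True


# XOR with a constant is affine, so it fails as a confusion layer on its own.
XOR_CONSTANT = [x ^ 0x9 for x in range(16)]

if __name__ == "__main__":
    print("toy S-box affine?   ", is_affine(TOY_SBOX))
    print("XOR constant affine?", is_affine(XOR_CONSTANT))
```

The affine test is only the coarsest possible check. Real S-box design also considers resistance to linear and differential cryptanalysis, which this sketch does not measure.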

Diffusion

Diffusion means that if we change a single bit of the plaintext, then about half of the bits in the ciphertext should change, and similarly, if we change one bit of the ciphertext, then about half of the plaintext bits should change.^[5] This is equivalent to the expectation that encryption schemes exhibit an avalanche effect.
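
The "about half of the bits" expectation can be observed empirically. The sketch below uses SHA-256 from Python's standard `hashlib` module as a stand-in, since the standard library ships no block cipher; the helper names `bit_difference` and `avalanche_sha256` are hypothetical. It flips one input bit and counts how many of the 256 output bits differ.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def avalanche_sha256(message: bytes, bit_index: int) -> int:
    """Flip one bit of the message and count the changed SHA-256 output bits."""
    flipped = bytearray(message)
    flipped[bit_index // 8] ^= 1 << (bit_index % 8)
    original_digest = hashlib.sha256(message).digest()
    flipped_digest = hashlib.sha256(bytes(flipped)).digest()
    return bit_difference(original_digest, flipped_digest)


if __name__ == "__main__":
    message = b"confusion and diffusion"
    for index in range(4):
        changed = avalanche_sha256(message, index)
        print(f"flip input bit {index}: {changed} of 256 output bits changed")
```

For a function with good diffusion, the counts are expected to cluster around 128. A single run is only an observation, not a proof of the property.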

The purpose of diffusion is to hide the statistical relationship between the ciphertext and the plaintext. For example, diffusion ensures that any patterns in the plaintext, such as redundant bits, are not apparent in the ciphertext.^[3] Block ciphers achieve this by "diffusing" the information about the plaintext's structure across the rows and columns of the cipher.

In substitution–permutation networks, diffusion is provided by permutation boxes (also known as the permutation layer^[4]). By the beginning of the 21st century, a consensus had emerged among designers that the permutation layer should consist of linear Boolean functions, although nonlinear functions can be used, too.^[4]

Theory

In Shannon's original definitions, confusion refers to making the relationship between the ciphertext and the symmetric key as complex and involved as possible; diffusion refers to dissipating the statistical structure of plaintext over the bulk of ciphertext. This complexity is generally implemented through a well-defined and repeatable series of substitutions and permutations. Substitution refers to the replacement of certain components (usually bits) with other components, following certain rules. Permutation refers to manipulation of the order of bits according to some algorithm. To be effective, any non-uniformity of plaintext bits needs to be redistributed across much larger structures in the ciphertext, making that non-uniformity much harder to detect.
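
The interplay of substitution and permutation can be sketched as a single toy round on a 16-bit block. Everything here is an assumption made for illustration: the S-box values, the bit permutation `PERM`, the function names, and the 16-bit block size. It is not a secure cipher and does not reproduce any cipher named in this article.

```python
# Illustrative 4-bit S-box and 16-bit bit permutation (both are assumptions).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
# Bit i moves to position PERM[i]; the four output bits of each S-box land in
# four different 4-bit groups, so the next round's S-boxes mix them further.
PERM = [(4 * i) % 15 for i in range(15)] + [15]


def substitute(state: int) -> int:
    """Replace each 4-bit group of the 16-bit state using SBOX."""
    out = 0
    for nibble in range(4):
        value = (state >> (4 * nibble)) & 0xF
        out |= SBOX[value] << (4 * nibble)
    return out


def permute(state: int) -> int:
    """Reorder the 16 bits of the state according to PERM."""
    out = 0
    for src, dst in enumerate(PERM):
        out |= ((state >> src) & 1) << dst
    return out


def spn_round(state: int, round_key: int) -> int:
    """One toy round: mix in the key, substitute, then permute."""
    return permute(substitute(state ^ round_key))


if __name__ == "__main__":
    print(hex(spn_round(0x0001, 0xBEEF)))
```

Because each step is a bijection, the round can be inverted by anyone who knows the round key, which is what makes the series of steps "well-defined and repeatable".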

In particular, for a randomly chosen input, if one flips the i-th bit, then the probability that the j-th output bit will change should be one half, for any i and j. This is termed the strict avalanche criterion. More generally, one may require that flipping a fixed set of bits should change each output bit with probability one half.
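
For a function small enough to enumerate, these probabilities can be computed exactly. The sketch below builds the matrix of flip probabilities for an n-bit lookup table; `sac_matrix` and `TOY_SBOX` are hypothetical names, and the table values are illustrative. Entry `[i][j]` is the fraction of inputs for which flipping input bit i changes output bit j, so the strict avalanche criterion asks for every entry to equal 0.5.

```python
# Illustrative 4-bit S-box (an assumption for this demo).
TOY_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
            0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def sac_matrix(table, n_bits):
    """Return an n_bits x n_bits matrix of output-bit flip probabilities."""
    size = 1 << n_bits
    matrix = []
    for i in range(n_bits):
        counts = [0] * n_bits
        for x in range(size):
            # Output difference caused by flipping input bit i.
            diff = table[x] ^ table[x ^ (1 << i)]
            for j in range(n_bits):
                counts[j] += (diff >> j) & 1
        matrix.append([c / size for c in counts])
    return matrix


if __name__ == "__main__":
    for i, row in enumerate(sac_matrix(TOY_SBOX, 4)):
        print(f"flip input bit {i}:", row)
```

A very small S-box will generally not meet the criterion exactly in every entry, which is one reason several rounds are combined in practice.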

One aim of confusion is to make it very hard to find the key even if one has a large number of plaintext-ciphertext pairs produced with the same key. Therefore, each bit of the ciphertext should depend on the entire key, and in different ways on different bits of the key. In particular, changing one bit of the key should change the ciphertext completely.

Practical applications

The design of a modern block cipher uses both confusion and diffusion,^[2] with confusion changing the data between the input and the output by applying a key-dependent nonlinear transformation (linear calculations are easier to reverse and thus easier to break).

Confusion inevitably involves some diffusion,^[6] so a design with a very wide-input S-box can provide the necessary diffusion properties,^[citation needed] but will be very costly to implement. Practical ciphers therefore use relatively small S-boxes that operate on small groups of bits ("bundles"^[7]). For example, AES uses 8-bit S-boxes, Serpent uses 4-bit S-boxes, and BaseKing and 3-Way use 3-bit S-boxes.^[8] Small S-boxes provide almost no diffusion, so the resources are spent on simpler diffusion transformations.^[6] For example, the wide trail strategy popularized by the Rijndael design involves a linear mixing transformation that provides high diffusion,^[9] although the security proofs do not depend on the diffusion layer being linear.^[10]

One of the most researched cipher structures is the substitution–permutation network (SPN), in which each round includes a layer of local nonlinear permutations (S-boxes) for confusion and a linear diffusion transformation (usually a multiplication by a matrix over a finite field).^[11] Modern block ciphers mostly follow this confusion layer/diffusion layer model. The efficiency of the diffusion layer is estimated using the so-called branch number, a numerical parameter that can reach the value $s+1$ for $s$ input bundles for a perfect diffusion transformation.^[12] Since transformations with high branch numbers (which therefore require many bundles as inputs) are costly to implement, the diffusion layer is sometimes (for example, in AES) composed of two sublayers: "local diffusion", which processes subsets of the bundles in a bricklayer fashion (each subset is transformed independently), and "dispersion", which makes bits that were "close" (within one subset of bundles) become "distant" (spread to different subsets, and thus locally diffused within these new subsets in the next round).^[13]
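
For a very small diffusion layer, the branch number can be found by brute force: take the minimum, over all nonzero inputs, of the number of nonzero input bundles plus the number of nonzero output bundles. The sketch below assumes 4-bit bundles treated as elements of GF(2^4) with reduction polynomial x^4 + x + 1, and a hypothetical 2x2 matrix `MIXING`; neither choice comes from a specific cipher, and all function names are introduced for this example.

```python
from itertools import product


def gf16_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(2^4), reducing by x^4 + x + 1 (0x13)."""
    result = 0
    for _ in range(4):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= 0x13
    return result


def apply_matrix(matrix, vector):
    """Multiply a matrix over GF(2^4) by a vector of 4-bit bundles."""
    out = []
    for row in matrix:
        acc = 0
        for coeff, value in zip(row, vector):
            acc ^= gf16_mul(coeff, value)
        out.append(acc)
    return out


def bundle_weight(vector) -> int:
    """Number of nonzero bundles in the vector."""
    return sum(1 for v in vector if v != 0)


def branch_number(matrix) -> int:
    """Minimum of input weight plus output weight over all nonzero inputs."""
    s = len(matrix[0])
    best = None
    for vector in product(range(16), repeat=s):
        if not any(vector):
            continue
        total = bundle_weight(vector) + bundle_weight(apply_matrix(matrix, vector))
        if best is None or total < best:
            best = total
    return best


MIXING = [[1, 1], [1, 2]]     # hypothetical mixing matrix, s = 2 bundles
IDENTITY = [[1, 0], [0, 1]]   # no mixing at all

if __name__ == "__main__":
    print(branch_number(MIXING))    # 3, i.e. s + 1
    print(branch_number(IDENTITY))  # 2
```

Exhaustive search is only feasible for toy sizes like this one. For realistic layers the branch number is derived from the algebraic structure of the matrix instead.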

Analysis of AES

The Advanced Encryption Standard (AES) has excellent confusion and diffusion. Its confusion look-up tables are highly nonlinear and good at destroying patterns.^[14] Its diffusion stage spreads every part of the input to every part of the output: changing one bit of the input changes half of the output bits on average. Both confusion and diffusion are repeated multiple times for each input to increase the amount of scrambling. The secret key is mixed in at every stage so that an attacker cannot precalculate what the cipher does.

None of this happens when a simple one-stage scramble is based on a key. Input patterns would flow straight through to the output. The result might look random to the eye, but analysis would find obvious patterns and the cipher could be broken.
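
A minimal sketch of such a one-stage scramble, assuming a repeating-key XOR (the scheme, the key, and the function names are all chosen here for illustration and are not described in the text above): a repeated plaintext pattern reappears in the output, and flipping one input bit flips exactly one output bit.

```python
def xor_scramble(data: bytes, key: bytes) -> bytes:
    """Single-stage keyed scramble: XOR each byte with a repeating key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


KEY = b"k3y!"              # hypothetical 4-byte key
PLAINTEXT = b"ABCDABCD"    # the same 4-byte pattern twice

if __name__ == "__main__":
    ciphertext = xor_scramble(PLAINTEXT, KEY)
    # The repeated plaintext pattern survives as a repeated ciphertext pattern.
    print(ciphertext[:4] == ciphertext[4:])
    # Flipping one plaintext bit changes one ciphertext bit: no diffusion.
    flipped = bytes([PLAINTEXT[0] ^ 1]) + PLAINTEXT[1:]
    print(differing_bits(ciphertext, xor_scramble(flipped, KEY)))
```
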

References

[1] "Information Theory and Entropy". Model Based Inference in the Life Sciences: A Primer on Evidence. Springer New York. 2008-01-01. pp. 51–82. doi:10.1007/978-0-387-74075-1_3. ISBN 9780387740737.
[2] Stamp & Low 2007, p. 182.
[3] Shannon, C. E. (October 1949). "Communication Theory of Secrecy Systems*". Bell System Technical Journal. 28 (4): 656–715. doi:10.1002/j.1538-7305.1949.tb00928.x.
[4] Liu, Rijmen & Leander 2018, p. 1.
[5] Stallings, William (2014). Cryptography and Network Security (6th ed.). Upper Saddle River, N.J.: Prentice Hall. pp. 67–68. ISBN 978-0133354690.
[6] Daemen & Rijmen 2013, p. 130.
[7] Daemen & Rijmen 2013, p. 20.
[8] Daemen & Rijmen 2013, p. 21.
[9] Daemen & Rijmen 2013, p. 126.
[10] Liu, Rijmen & Leander 2018, p. 2.
[11] Li & Wang 2017.
[12] Sajadieh et al. 2012.
[13] Daemen & Rijmen 2013, p. 131.
[14] Stallings, William (2017). Cryptography and Network Security: Principles and Practice, Global Edition. Pearson. p. 177. ISBN 978-1292158587.

Sources

Claude E. Shannon, "A Mathematical Theory of Cryptography", Bell System Technical Memo MM 45-110-02, September 1, 1945.
Claude E. Shannon, "Communication Theory of Secrecy Systems", Bell System Technical Journal, vol. 28–4, pages 656–715, 1949.
Wade Trappe and Lawrence C. Washington, Introduction to Cryptography with Coding Theory. Second edition. Pearson Prentice Hall, 2006.
Li, Chaoyun; Wang, Qingju (2017). "Design of Lightweight Linear Diffusion Layers from Near-MDS Matrices" (PDF). IACR Transactions on Symmetric Cryptology. 1: 129–155. doi:10.13154/tosc.v2017.i1.129-155.
Sajadieh, Mahdi; Dakhilalian, Mohammad; Mala, Hamid; Sepehrdad, Pouyan (2012). "Recursive Diffusion Layers for Block Ciphers and Hash Functions". Fast Software Encryption (PDF). Springer Berlin Heidelberg. pp. 385–401. doi:10.1007/978-3-642-34047-5_22. eISSN 1611-3349. ISSN 0302-9743.
Daemen, Joan; Rijmen, Vincent (9 March 2013). The Design of Rijndael: AES - The Advanced Encryption Standard (PDF). Springer Science & Business Media. ISBN 978-3-662-04722-4. OCLC 1259405449.
Stamp, Mark; Low, Richard M. (15 June 2007). Applied Cryptanalysis: Breaking Ciphers in the Real World. John Wiley & Sons. ISBN 978-0-470-14876-1. OCLC 1044324461.
Liu, Yunwen; Rijmen, Vincent; Leander, Gregor (20 January 2018). "Nonlinear diffusion layers" (PDF). Designs, Codes and Cryptography. 86 (11): 2469–2484. doi:10.1007/s10623-018-0458-5. eISSN 1573-7586. ISSN 0925-1022.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
