Chosen-plaintext attack

Last updated September 06, 2024

A chosen-plaintext attack (CPA) is an attack model for cryptanalysis which presumes that the attacker can obtain the ciphertexts for arbitrary plaintexts.^[1] The goal of the attack is to gain information that reduces the security of the encryption scheme.^[2]

Introduction

In a chosen-plaintext attack the adversary can (possibly adaptively) ask for the ciphertexts of arbitrary plaintext messages. This is formalized by allowing the adversary to interact with an encryption oracle, viewed as a black box. The attacker’s goal is to reveal all or a part of the secret encryption key.

It may seem infeasible in practice that an attacker could obtain ciphertexts for given plaintexts. However, modern cryptography is implemented in software or hardware and is used for a diverse range of applications; for many cases, a chosen-plaintext attack is often very feasible (see also In practice). Chosen-plaintext attacks become extremely important in the context of public key cryptography where the encryption key is public and so attackers can encrypt any plaintext they choose.

Different forms

There are two forms of chosen-plaintext attacks:

Batch chosen-plaintext attack, where the adversary chooses all of the plaintexts before seeing any of the corresponding ciphertexts. This is often the meaning intended by "chosen-plaintext attack" when this is not qualified.
Adaptive chosen-plaintext attack (CPA2), where the adversary can request the ciphertexts of additional plaintexts after seeing the ciphertexts for some plaintexts.

General method of an attack

A general batch chosen-plaintext attack is carried out as follows ^{[ failed verification ]}:

The attacker may choose n plaintexts. (This parameter n is specified as part of the attack model, it may or may not be bounded.)
The attacker then sends these n plaintexts to the encryption oracle.
The encryption oracle will then encrypt the attacker's plaintexts and send them back to the attacker.
The attacker receives n ciphertexts back from the oracle, in such a way that the attacker knows which ciphertext corresponds to each plaintext.
Based on the plaintext–ciphertext pairs, the attacker can attempt to extract the key used by the oracle to encode the plaintexts. Since the attacker in this type of attack is free to craft the plaintext to match his needs, the attack complexity may be reduced.

Consider the following extension of the above situation. After the last step,

The adversary outputs two plaintexts m₀ and m₁.
A bit b is chosen uniformly at random $b\leftarrow \{0,1\}$ .
The adversary receives the encryption of m_b, and attempts to "guess" which plaintext it received, and outputs a bit b'.

A cipher has indistinguishable encryptions under a chosen-plaintext attack if after running the above experiment with n=1^{[ failed verification ]} the adversary can't guess correctly (b=b') with probability non-negligibly better than 1/2.^[3]

Examples

The following examples demonstrate how some ciphers that meet other security definitions may be broken with a chosen-plaintext attack.

Caesar cipher

The following attack on the Caesar cipher allows full recovery of the secret key:

Suppose the adversary sends the message: Attack at dawn,
and the oracle returns Nggnpx ng qnja.
The adversary can then work through to recover the key in the same way as a Caesar cipher. The adversary could deduce the substitutions A → N, T → G and so on. This would lead the adversary to determine that 13 was the key used in the Caesar cipher.

With more intricate or complex encryption methodologies the decryption method becomes more resource-intensive, however, the core concept is still relatively the same.

One-time pads

The following attack on a one-time pad allows full recovery of the secret key. Suppose the message length and key length are equal to n.

The adversary sends a string consisting of n zeroes to the oracle.
The oracle returns the bitwise exclusive-or of the key with the string of zeroes.
The string returned by the oracle is the secret key.

While the one-time pad is used as an example of an information-theoretically secure cryptosystem, this security only holds under security definitions weaker than CPA security. This is because under the formal definition of CPA security the encryption oracle has no state. This vulnerability may not be applicable to all practical implementations – the one-time pad can still be made secure if key reuse is avoided (hence the name "one-time" pad).

In practice

In World War II US Navy cryptanalysts discovered that Japan was planning to attack a location referred to as "AF". They believed that "AF" might be Midway Island, because other locations in the Hawaiian Islands had codewords that began with "A". To prove their hypothesis that "AF" corresponded to "Midway Island" they asked the US forces at Midway to send a plaintext message about low supplies. The Japanese intercepted the message and immediately reported to their superiors that "AF" was low on water, confirming the Navy's hypothesis and allowing them to position their force to win the battle.^[3]^[4]

Also during World War II, Allied codebreakers at Bletchley Park would sometimes ask the Royal Air Force to lay mines at a position that didn't have any abbreviations or alternatives in the German naval system's grid reference. The hope was that the Germans, seeing the mines, would use an Enigma machine to encrypt a warning message about the mines and an "all clear" message after they were removed, giving the allies enough information about the message to break the German naval Enigma. This process of planting a known-plaintext was called gardening .^[5] Allied codebreakers also helped craft messages sent by double agent Juan Pujol García, whose encrypted radio reports were received in Madrid, manually decrypted, and then re-encrypted with an Enigma machine for transmission to Berlin.^[6] This helped the codebreakers decrypt the code used on the second leg, having supplied the original text.^[7]

In modern day, chosen-plaintext attacks (CPAs) are often used to break symmetric ciphers. To be considered CPA-secure, the symmetric cipher must not be vulnerable to chosen-plaintext attacks. Thus, it is important for symmetric cipher implementors to understand how an attacker would attempt to break their cipher and make relevant improvements.

For some chosen-plaintext attacks, only a small part of the plaintext may need to be chosen by the attacker; such attacks are known as plaintext injection attacks.

Relation to other attacks

A chosen-plaintext attack is more powerful than known-plaintext attack, because the attacker can directly target specific terms or patterns without having to wait for these to appear naturally, allowing faster gathering of data relevant to cryptanalysis. Therefore, any cipher that prevents chosen-plaintext attacks is also secure against known-plaintext and ciphertext-only attacks.

However, a chosen-plaintext attack is less powerful than a chosen-ciphertext attack, where the attacker can obtain the plaintexts of arbitrary ciphertexts. A CCA-attacker can sometimes break a CPA-secure system.^[3] For example, the El Gamal cipher is secure against chosen plaintext attacks, but vulnerable to chosen ciphertext attacks because it is unconditionally malleable.

Related Research Articles

In cryptography, a block cipher is a deterministic algorithm that operates on fixed-length groups of bits, called blocks. Block ciphers are the elementary building blocks of many cryptographic protocols. They are ubiquitous in the storage and exchange of data, where such data is secured and authenticated via encryption.

In cryptography, a cipher is an algorithm for performing encryption or decryption—a series of well-defined steps that can be followed as a procedure. An alternative, less common term is encipherment. To encipher or encode is to convert information into cipher or code. In common parlance, "cipher" is synonymous with "code", as they are both a set of steps that encrypt a message; however, the concepts are distinct in cryptography, especially classical cryptography.

<span class="mw-page-title-main">Cryptanalysis</span> Study of analyzing information systems in order to discover their hidden aspects

Cryptanalysis refers to the process of analyzing information systems in order to understand hidden aspects of the systems. Cryptanalysis is used to breach cryptographic security systems and gain access to the contents of encrypted messages, even if the cryptographic key is unknown.

In cryptography, encryption is the process of transforming information in a way that, ideally, only authorized parties can decode. This process converts the original representation of the information, known as plaintext, into an alternative form known as ciphertext. Despite its goal, encryption does not itself prevent interference but denies the intelligible content to a would-be interceptor.

In cryptography, the one-time pad (OTP) is an encryption technique that cannot be cracked, but requires the use of a single-use pre-shared key that is larger than or equal to the size of the message being sent. In this technique, a plaintext is paired with a random secret key. Then, each bit or character of the plaintext is encrypted by combining it with the corresponding bit or character from the pad using modular addition.

In cryptography, a Caesar cipher, also known as Caesar's cipher, the shift cipher, Caesar's code, or Caesar shift, is one of the simplest and most widely known encryption techniques. It is a type of substitution cipher in which each letter in the plaintext is replaced by a letter some fixed number of positions down the alphabet. For example, with a left shift of 3, D would be replaced by A, E would become B, and so on. The method is named after Julius Caesar, who used it in his private correspondence.

Symmetric-key algorithms are algorithms for cryptography that use the same cryptographic keys for both the encryption of plaintext and the decryption of ciphertext. The keys may be identical, or there may be a simple transformation to go between the two keys. The keys, in practice, represent a shared secret between two or more parties that can be used to maintain a private information link. The requirement that both parties have access to the secret key is one of the main drawbacks of symmetric-key encryption, in comparison to public-key encryption. However, symmetric-key encryption algorithms are usually better for bulk encryption. With exception of the one-time pad they have a smaller key size, which means less storage space and faster transmission. Due to this, asymmetric-key encryption is often used to exchange the secret key for symmetric-key encryption.

In cryptography, the ElGamal encryption system is an asymmetric key encryption algorithm for public-key cryptography which is based on the Diffie–Hellman key exchange. It was described by Taher Elgamal in 1985. ElGamal encryption is used in the free GNU Privacy Guard software, recent versions of PGP, and other cryptosystems. The Digital Signature Algorithm (DSA) is a variant of the ElGamal signature scheme, which should not be confused with ElGamal encryption.

A chosen-ciphertext attack (CCA) is an attack model for cryptanalysis where the cryptanalyst can gather information by obtaining the decryptions of chosen ciphertexts. From these pieces of information the adversary can attempt to recover the secret key used for decryption.

In cryptography, an initialization vector (IV) or starting variable is an input to a cryptographic primitive being used to provide the initial state. The IV is typically required to be random or pseudorandom, but sometimes an IV only needs to be unpredictable or unique. Randomization is crucial for some encryption schemes to achieve semantic security, a property whereby repeated usage of the scheme under the same key does not allow an attacker to infer relationships between segments of the encrypted message. For block ciphers, the use of an IV is described by the modes of operation.

In cryptography, ciphertext or cyphertext is the result of encryption performed on plaintext using an algorithm, called a cipher. Ciphertext is also known as encrypted or encoded information because it contains a form of the original plaintext that is unreadable by a human or computer without the proper cipher to decrypt it. This process prevents the loss of sensitive information via hacking. Decryption, the inverse of encryption, is the process of turning ciphertext into readable plaintext. Ciphertext is not to be confused with codetext because the latter is a result of a code, not a cipher.

In cryptography, a semantically secure cryptosystem is one where only negligible information about the plaintext can be feasibly extracted from the ciphertext. Specifically, any probabilistic, polynomial-time algorithm (PPTA) that is given the ciphertext of a certain message $, and the message's length, cannot determine any partial information on the message with probability non-negligibly higher than all other PPTA's that only have access to the message length. This concept is the computational complexity analogue to Shannon's concept of perfect secrecy. Perfect secrecy means that the ciphertext reveals no information at all about the plaintext, whereas semantic security implies that any information revealed cannot be feasibly extracted.$

Probabilistic encryption is the use of randomness in an encryption algorithm, so that when encrypting the same message several times it will, in general, yield different ciphertexts. The term "probabilistic encryption" is typically used in reference to public key encryption algorithms; however various symmetric key encryption algorithms achieve a similar property, and stream ciphers such as Freestyle which are inherently random. To be semantically secure, that is, to hide even partial information about the plaintext, an encryption algorithm must be probabilistic.

Multiple encryption is the process of encrypting an already encrypted message one or more times, either using the same or a different algorithm. It is also known as cascade encryption, cascade ciphering, multiple encryption, and superencipherment. Superencryption refers to the outer-level encryption of a multiple encryption.

Ciphertext indistinguishability is a property of many encryption schemes. Intuitively, if a cryptosystem possesses the property of indistinguishability, then an adversary will be unable to distinguish pairs of ciphertexts based on the message they encrypt. The property of indistinguishability under chosen plaintext attack is considered a basic requirement for most provably secure public key cryptosystems, though some schemes also provide indistinguishability under chosen ciphertext attack and adaptive chosen ciphertext attack. Indistinguishability under chosen plaintext attack is equivalent to the property of semantic security, and many cryptographic proofs use these definitions interchangeably.

Authenticated Encryption (AE) is an encryption scheme which simultaneously assures the data confidentiality and authenticity. Examples of encryption modes that provide AE are GCM, CCM.

A deterministic encryption scheme is a cryptosystem which always produces the same ciphertext for a given plaintext and key, even over separate executions of the encryption algorithm. Examples of deterministic encryption algorithms include RSA cryptosystem, and many block ciphers when used in ECB mode or with a constant initialization vector.

In cryptanalysis, attack models or attack types are a classification of cryptographic attacks specifying the kind of access a cryptanalyst has to a system under attack when attempting to "break" an encrypted message generated by the system. The greater the access the cryptanalyst has to the system, the more useful information they can get to utilize for breaking the cypher.

<span class="mw-page-title-main">Cryptography</span> Practice and study of secure communication techniques

Cryptography, or cryptology, is the practice and study of techniques for secure communication in the presence of adversarial behavior. More generally, cryptography is about constructing and analyzing protocols that prevent third parties or the public from reading private messages. Modern cryptography exists at the intersection of the disciplines of mathematics, computer science, information security, electrical engineering, digital signal processing, physics, and others. Core concepts related to information security are also central to cryptography. Practical applications of cryptography include electronic commerce, chip-based payment cards, digital currencies, computer passwords, and military communications.

In cryptography, a padding oracle attack is an attack which uses the padding validation of a cryptographic message to decrypt the ciphertext. In cryptography, variable-length plaintext messages often have to be padded (expanded) to be compatible with the underlying cryptographic primitive. The attack relies on having a "padding oracle" who freely responds to queries about whether a message is correctly padded or not. The information could be directly given, or leaked through a side-channel.

References

↑ Ross Anderson, Security Engineering: A Guide to Building Dependable Distributed Systems. The first edition (2001): http://www.cl.cam.ac.uk/~rja14/book.html
↑ Barrera, John Fredy; Vargas, Carlos; Tebaldi, Myrian; Torroba, Roberto (2010-10-15). "Chosen-plaintext attack on a joint transform correlator encrypting system". Optics Communications. 283 (20): 3917–3921. Bibcode:2010OptCo.283.3917B. doi:10.1016/j.optcom.2010.06.009. ISSN 0030-4018.
1 2 3 Katz, Jonathan; Lindell, Yehuda (2007). Introduction to Modern Cryptography: Principles and Protocols. Boca Raton: Chapman and Hall/CRC. ISBN 978-1584885511. OCLC 893721520.
↑ Weadon, Patrick D. "How Cryptology enabled the United States to turn the tide in the Pacific War". www.navy.mil. US Navy. Archived from the original on 2015-01-31. Retrieved 2015-02-19.
↑ Morris, Christopher (1993), "Navy Ultra's Poor Relations", in Hinsley, F.H.; Stripp, Alan (eds.), Codebreakers: The inside story of Bletchley Park, Oxford: Oxford University Press, p. 235, ISBN 978-0-19-280132-6
↑ Kelly, Jon (27 January 2011). "The piece of paper that fooled Hitler". BBC. Retrieved 1 January 2012. The Nazis believed Pujol, whom they code named Alaric Arabel, was one of their prize assets
↑ Seaman (2004). "The first code which Garbo was given by the Germans for his wireless communications turned out to be the identical code which was currently in use in the German circuits"

Listen to this article (11 minutes)

This audio file was created from a revision of this article dated 28 December 2023, and does not reflect subsequent edits.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[RASE-1] Ross Anderson, Security Engineering: A Guide to Building Dependable Distributed Systems. The first edition (2001): http://www.cl.cam.ac.uk/~rja14/book.html

[2] Barrera, John Fredy; Vargas, Carlos; Tebaldi, Myrian; Torroba, Roberto (2010-10-15). "Chosen-plaintext attack on a joint transform correlator encrypting system". Optics Communications. 283 (20): 3917–3921. Bibcode:2010OptCo.283.3917B. doi:10.1016/j.optcom.2010.06.009. ISSN 0030-4018.

[modern-3] 1 2 3 Katz, Jonathan; Lindell, Yehuda (2007). Introduction to Modern Cryptography: Principles and Protocols. Boca Raton: Chapman and Hall/CRC. ISBN 978-1584885511. OCLC 893721520.

[Navy_Midway-4] Weadon, Patrick D. "How Cryptology enabled the United States to turn the tide in the Pacific War". www.navy.mil. US Navy. Archived from the original on 2015-01-31. Retrieved 2015-02-19.

[5] Morris, Christopher (1993), "Navy Ultra's Poor Relations", in Hinsley, F.H.; Stripp, Alan (eds.), Codebreakers: The inside story of Bletchley Park, Oxford: Oxford University Press, p. 235, ISBN 978-0-19-280132-6

[BBC_News_Magazine-6] Kelly, Jon (27 January 2011). "The piece of paper that fooled Hitler". BBC. Retrieved 1 January 2012. The Nazis believed Pujol, whom they code named Alaric Arabel, was one of their prize assets

[MarkSeaman73-7] Seaman (2004). "The first code which Garbo was given by the Germans for his wireless communications turned out to be the identical code which was currently in use in the German circuits"

[1]

[2]

[3]

[4]

[5]

[6]

[7]