Optimal asymmetric encryption padding

Last updated June 02, 2024

In cryptography, Optimal Asymmetric Encryption Padding (OAEP) is a padding scheme often used together with RSA encryption. OAEP was introduced by Bellare and Rogaway,^[1] and subsequently standardized in PKCS#1 v2 and RFC 2437.

The OAEP algorithm is a form of Feistel network which uses a pair of random oracles G and H to process the plaintext prior to asymmetric encryption. When combined with any secure trapdoor one-way permutation $f$ , this processing is proved in the random oracle model to result in a combined scheme which is semantically secure under chosen plaintext attack (IND-CPA). When implemented with certain trapdoor permutations (e.g., RSA), OAEP is also proven to be secure against chosen ciphertext attack. OAEP can be used to build an all-or-nothing transform.

OAEP satisfies the following two goals:

1. Add an element of randomness which can be used to convert a deterministic encryption scheme (e.g., traditional RSA) into a probabilistic scheme.
2. Prevent partial decryption of ciphertexts (or other information leakage) by ensuring that an adversary cannot recover any portion of the plaintext without being able to invert the trapdoor one-way permutation $f$.

The original version of OAEP (Bellare/Rogaway, 1994) showed a form of "plaintext awareness" (which they claimed implies security against chosen ciphertext attack) in the random oracle model when OAEP is used with any trapdoor permutation. Subsequent results contradicted this claim, showing that OAEP was only IND-CCA1 secure. However, the original scheme was proved in the random oracle model to be IND-CCA2 secure when OAEP is used with the RSA permutation using standard encryption exponents, as in the case of RSA-OAEP.^[2] An improved scheme (called OAEP+) that works with any trapdoor one-way permutation was offered by Victor Shoup to solve this problem.^[3] More recent work has shown that in the standard model (that is, when hash functions are not modeled as random oracles) it is impossible to prove the IND-CCA2 security of RSA-OAEP under the assumed hardness of the RSA problem.^[4]^[5]

Algorithm

The algorithm uses the following notation:

MGF is the mask generating function, usually MGF1,
Hash is the chosen hash function,
hLen is the length of the output of the hash function in bytes,
k is the length of the RSA modulus n in bytes,
M is the message to be padded, with length mLen (at most $k-2\cdot \mathrm {hLen} -2$ bytes),
L is an optional label to be associated with the message (the label is the empty string by default and can be used to authenticate data without requiring encryption),
PS is a byte string of $k-\mathrm {mLen} -2\cdot \mathrm {hLen} -2$ null bytes,
⊕ denotes the bitwise XOR operation.
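As a quick check of the length bounds above, the following sketch computes the maximum message length and the padding length from k and hLen. The function names are illustrative only and do not come from any standard or library.

```python
def max_message_length(k: int, h_len: int) -> int:
    """Largest mLen (in bytes) that fits: k - 2*hLen - 2."""
    m_max = k - 2 * h_len - 2
    if m_max < 0:
        raise ValueError("modulus too short for this hash function")
    return m_max


def padding_length(k: int, m_len: int, h_len: int) -> int:
    """Number of 0x00 bytes in PS: k - mLen - 2*hLen - 2."""
    if m_len > max_message_length(k, h_len):
        raise ValueError("message too long")
    return k - m_len - 2 * h_len - 2


# 2048-bit modulus (k = 256 bytes) with SHA-256 (hLen = 32 bytes)
print(max_message_length(256, 32))   # 190
# A message of maximal length leaves no room for padding bytes
print(padding_length(256, 190, 32))  # 0
```

With a 2048-bit modulus and SHA-256, at most 190 bytes can be encoded in one block, which is one reason OAEP is typically applied to short values such as symmetric keys.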

Encoding

RFC 8017^[6] for PKCS#1 v2.2 specifies the OAEP scheme as follows for encoding:

1. Hash the label L using the chosen hash function: $\mathrm {lHash} =\mathrm {Hash} (L)$
2. Generate a padding string PS consisting of $k-\mathrm {mLen} -2\cdot \mathrm {hLen} -2$ bytes with the value 0x00.
3. Concatenate lHash, PS, the single byte 0x01, and the message M to form a data block DB: $\mathrm {DB} =\mathrm {lHash} ||\mathrm {PS} ||\mathrm {0x01} ||\mathrm {M}$ . This data block has length $k-\mathrm {hLen} -1$ bytes.
4. Generate a random seed of length hLen.
5. Use the mask generating function to generate a mask of the appropriate length for the data block: $\mathrm {dbMask} =\mathrm {MGF} (\mathrm {seed} ,k-\mathrm {hLen} -1)$
6. Mask the data block with the generated mask: $\mathrm {maskedDB} =\mathrm {DB} \oplus \mathrm {dbMask}$
7. Use the mask generating function to generate a mask of length hLen for the seed: $\mathrm {seedMask} =\mathrm {MGF} (\mathrm {maskedDB} ,\mathrm {hLen} )$
8. Mask the seed with the generated mask: $\mathrm {maskedSeed} =\mathrm {seed} \oplus \mathrm {seedMask}$
9. The encoded (padded) message is the byte 0x00 concatenated with the maskedSeed and maskedDB: $\mathrm {EM} =\mathrm {0x00} ||\mathrm {maskedSeed} ||\mathrm {maskedDB}$
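The encoding steps above can be sketched in Python as follows. This is a minimal illustration, not a vetted implementation: it assumes MGF1 as the mask generating function and SHA-256 as the default hash, and the names `mgf1`, `xor` and `oaep_encode` are chosen here for readability.

```python
import hashlib
import secrets


def mgf1(seed: bytes, length: int, hash_name: str = "sha256") -> bytes:
    """MGF1: hash seed || 4-byte big-endian counter until enough output exists."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.new(hash_name, seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def oaep_encode(message: bytes, k: int, label: bytes = b"",
                hash_name: str = "sha256") -> bytes:
    """Return the k-byte encoded message EM for a modulus of k bytes."""
    h_len = hashlib.new(hash_name).digest_size
    m_len = len(message)
    if m_len > k - 2 * h_len - 2:
        raise ValueError("message too long")

    l_hash = hashlib.new(hash_name, label).digest()   # hash the label
    ps = b"\x00" * (k - m_len - 2 * h_len - 2)        # zero padding PS
    db = l_hash + ps + b"\x01" + message              # DB, k - hLen - 1 bytes
    seed = secrets.token_bytes(h_len)                 # fresh random seed
    db_mask = mgf1(seed, k - h_len - 1, hash_name)
    masked_db = xor(db, db_mask)
    seed_mask = mgf1(masked_db, h_len, hash_name)
    masked_seed = xor(seed, seed_mask)
    return b"\x00" + masked_seed + masked_db          # EM


em1 = oaep_encode(b"hello", k=128)
em2 = oaep_encode(b"hello", k=128)
print(len(em1), em1[0])  # 128 0
print(em1 != em2)        # a fresh seed makes repeated encodings differ
```

The length of DB follows from its parts: hLen + (k − mLen − 2·hLen − 2) + 1 + mLen = k − hLen − 1, so EM is exactly 1 + hLen + (k − hLen − 1) = k bytes long.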

Decoding

Decoding works by reversing the steps taken in the encoding algorithm:

1. Hash the label L using the chosen hash function: $\mathrm {lHash} =\mathrm {Hash} (L)$
2. To reverse step 9, split the encoded message EM into the byte 0x00, the maskedSeed (with length hLen) and the maskedDB: $\mathrm {EM} =\mathrm {0x00} ||\mathrm {maskedSeed} ||\mathrm {maskedDB}$
3. Generate the seedMask which was used to mask the seed: $\mathrm {seedMask} =\mathrm {MGF} (\mathrm {maskedDB} ,\mathrm {hLen} )$
4. To reverse step 8, recover the seed with the seedMask: $\mathrm {seed} =\mathrm {maskedSeed} \oplus \mathrm {seedMask}$
5. Generate the dbMask which was used to mask the data block: $\mathrm {dbMask} =\mathrm {MGF} (\mathrm {seed} ,k-\mathrm {hLen} -1)$
6. To reverse step 6, recover the data block DB: $\mathrm {DB} =\mathrm {maskedDB} \oplus \mathrm {dbMask}$
7. To reverse step 3, split the data block into its parts: $\mathrm {DB} =\mathrm {lHash'} ||\mathrm {PS} ||\mathrm {0x01} ||\mathrm {M}$ .
8. Verify that:
  - lHash' is equal to the computed lHash,
  - PS only consists of bytes 0x00,
  - PS and M are separated by the 0x01 byte, and
  - the first byte of EM is the byte 0x00.
9. If any of these conditions is not met, then the padding is invalid.
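The decoding procedure can be sketched in the same style. As before, MGF1 with SHA-256 is an assumption and the function names are illustrative. The sketch reports a single generic error for every failed check, but it is not written to run in constant time, which a production implementation would need to consider.

```python
import hashlib
import hmac


def mgf1(seed: bytes, length: int, hash_name: str = "sha256") -> bytes:
    """MGF1: hash seed || 4-byte big-endian counter until enough output exists."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.new(hash_name, seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def oaep_decode(em: bytes, k: int, label: bytes = b"",
                hash_name: str = "sha256") -> bytes:
    """Recover M from a k-byte encoded message EM, or raise ValueError."""
    h_len = hashlib.new(hash_name).digest_size
    if len(em) != k or k < 2 * h_len + 2:
        raise ValueError("decoding error")

    l_hash = hashlib.new(hash_name, label).digest()
    # Split EM = 0x00 || maskedSeed || maskedDB
    first, masked_seed, masked_db = em[0], em[1:1 + h_len], em[1 + h_len:]
    seed = xor(masked_seed, mgf1(masked_db, h_len, hash_name))
    db = xor(masked_db, mgf1(seed, k - h_len - 1, hash_name))

    # Split DB = lHash' || PS || 0x01 || M
    l_hash_prime, rest = db[:h_len], db[h_len:]
    sep = rest.find(b"\x01")  # first 0x01 byte; everything before it is PS

    valid = (
        first == 0
        and hmac.compare_digest(l_hash_prime, l_hash)
        and sep != -1
        and not any(rest[:sep])  # PS must consist of 0x00 bytes only
    )
    if not valid:
        raise ValueError("decoding error")
    return rest[sep + 1:]
```

Because the first 0x01 byte is taken as the separator and all bytes before it must be zero, a data block whose first non-zero byte after lHash' is anything other than 0x01 is rejected.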

Usage in RSA: The encoded message can then be encrypted with RSA. Because the seed is randomly generated and influences the entire encoded message, encrypting the same plaintext twice yields different ciphertexts, so the deterministic property of plain RSA is avoided.
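To illustrate how an encoded message is passed to the RSA primitive, the following sketch converts EM to an integer, raises it to the public exponent, and reverses the process. The key is a toy built from two publicly known Mersenne primes and offers no security; the helper names and the random stand-in for EM are assumptions made for this example.

```python
import secrets

# Toy key for illustration only: both primes are well-known Mersenne primes,
# so this modulus is trivially factorable.
p = 2**521 - 1
q = 2**607 - 1
n = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent (needs Python 3.8+)
k = (n.bit_length() + 7) // 8      # modulus length in bytes


def rsa_encrypt_em(em: bytes) -> bytes:
    """Apply c = m^e mod n to a k-byte encoded message."""
    if len(em) != k or em[0] != 0:
        raise ValueError("EM must be k bytes long and start with 0x00")
    m = int.from_bytes(em, "big")  # leading 0x00 guarantees m < n
    return pow(m, e, n).to_bytes(k, "big")


def rsa_decrypt_to_em(ciphertext: bytes) -> bytes:
    """Apply m = c^d mod n and return the k-byte encoded message."""
    c = int.from_bytes(ciphertext, "big")
    if len(ciphertext) != k or c >= n:
        raise ValueError("ciphertext out of range")
    return pow(c, d, n).to_bytes(k, "big")


# Stand-in for an OAEP-encoded message: 0x00 followed by k - 1 random bytes.
em = b"\x00" + secrets.token_bytes(k - 1)
ciphertext = rsa_encrypt_em(em)
print(rsa_decrypt_to_em(ciphertext) == em)  # True
```

The leading 0x00 byte of EM ensures that the integer representation of the encoded message is smaller than the modulus, so the RSA operation can be inverted without ambiguity.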

Security

The "all-or-nothing" security comes from the fact that to recover M, one must recover the entire maskedDB and the entire maskedSeed: maskedDB is required to recover the seed from the maskedSeed, and the seed is required to recover the data block DB from maskedDB. Since changing any bit of the input to a cryptographic hash completely changes the result, both maskedDB and maskedSeed must be recovered in full.

Implementation

In the PKCS#1 standard, the two random oracles G and H are identical. The standard further requires that they be instantiated with MGF1 using an appropriate hash function.^[7]

References

[1] M. Bellare, P. Rogaway. Optimal Asymmetric Encryption -- How to encrypt with RSA. Extended abstract in Advances in Cryptology – Eurocrypt '94 Proceedings, Lecture Notes in Computer Science Vol. 950, A. De Santis ed., Springer-Verlag, 1995.
[2] Eiichiro Fujisaki, Tatsuaki Okamoto, David Pointcheval, and Jacques Stern. RSA-OAEP is secure under the RSA assumption. In J. Kilian, ed., Advances in Cryptology – CRYPTO 2001, vol. 2139 of Lecture Notes in Computer Science, Springer-Verlag, 2001.
[3] Victor Shoup. OAEP Reconsidered. IBM Zurich Research Lab, Saumerstr. 4, 8803 Ruschlikon, Switzerland. September 18, 2001.
[4] P. Paillier and J. Villar. Trading One-Wayness against Chosen-Ciphertext Security in Factoring-Based Encryption. Advances in Cryptology – Asiacrypt 2006.
[5] D. Brown. What Hashes Make RSA-OAEP Secure? IACR ePrint 2006/233.
[6] "Encryption Operation". PKCS #1: RSA Cryptography Specifications Version 2.2. IETF. November 2016. p. 22. sec. 7.1.1. doi: 10.17487/RFC8017. RFC 8017. Retrieved 2022-06-04.
[7] Brown, Daniel R. L. (2006). "What Hashes Make RSA-OAEP Secure?". IACR Cryptology ePrint Archive. Retrieved 2019-04-03.

