Argon2

Last updated

Argon2 is a key derivation function that was selected as the winner of the 2015 Password Hashing Competition. [1] [2] It was designed by Alex Biryukov, Daniel Dinu, and Dmitry Khovratovich from the University of Luxembourg. [3] The reference implementation of Argon2 is released under a Creative Commons CC0 license (i.e. public domain) or the Apache License 2.0, and provides three related versions:

Contents

All three modes allow specification by three parameters that control:

Cryptanalysis

While there is no public cryptanalysis applicable to Argon2d, there are two published attacks on the Argon2i function. The first attack is applicable only to the old version of Argon2i, while the second has been extended to the latest version (1.3). [5]

The first attack shows that it is possible to compute a single-pass Argon2i function using between a quarter and a fifth of the desired space with no time penalty, and compute a multiple-pass Argon2i using only N/e (≈ N/2.72) space with no time penalty. [6] According to the Argon2 authors, this attack vector was fixed in version 1.3. [7]

The second attack shows that Argon2i can be computed by an algorithm which has complexity O(n7/4 log(n)) for all choices of parameters σ (space cost), τ (time cost), and thread-count such that n=στ. [8] The Argon2 authors claim that this attack is not efficient if Argon2i is used with three or more passes. [7] However, Joël Alwen and Jeremiah Blocki improved the attack and showed that in order for the attack to fail, Argon2i v1.3 needs more than 10 passes over memory. [5]

To address these concerns, RFC9106 recommends using Argon2id to largely mitigate such attacks. [9]

Algorithm

Source: [4]

Function Argon2    Inputs:       password (P):       Bytes (0..232-1)    Password (or message) to be hashed       salt (S):           Bytes (8..232-1)    Salt (16 bytes recommended for password hashing)       parallelism (p):    Number (1..224-1)   Degree of parallelism (i.e. number of threads)       tagLength (T):      Number (4..232-1)   Desired number of returned bytes       memorySizeKB (m):   Number (8p..232-1)  Amount of memory (in kibibytes) to use       iterations (t):     Number (1..232-1)   Number of iterations to perform       version (v):        Number (0x13)The current version is 0x13 (19 decimal)       key (K):            Bytes (0..232-1)    Optional key (Errata: PDF says 0..32 bytes, RFC says 0..232 bytes)       associatedData (X): Bytes (0..232-1)    Optional arbitrary extra data       hashType (y):       Number (0=Argon2d, 1=Argon2i, 2=Argon2id)    Output:       tag:                Bytes (tagLength)The resulting generated bytes, tagLength bytes longGenerate initial 64-byte block H0.     All the input parameters are concatenated and input as a source of additional entropy.     Errata: RFC says H0 is 64-bits; PDF says H0 is 64-bytes.     Errata: RFC says the Hash is H^, the PDF says it's ℋ (but doesn't document what ℋ is). It's actually Blake2b.     Variable length items are prepended with their length as 32-bit little-endian integers.    buffer ← parallelism ∥ tagLength ∥ memorySizeKB ∥ iterations ∥ version ∥ hashType          ∥ Length(password)       ∥ Password          ∥ Length(salt)           ∥ salt          ∥ Length(key)            ∥ key          ∥ Length(associatedData) ∥ associatedData    H0 ← Blake2b(buffer, 64) //default hash size of Blake2b is 64-bytesCalculate number of 1 KB blocks by rounding down memorySizeKB to the nearest multiple of 4*parallelism kibibytes     blockCount ← Floor(memorySizeKB, 4*parallelism)     Allocate two-dimensional array of 1 KiB blocks (parallelism rows x columnCount columns)    columnCount ← blockCount / parallelism;   //In the RFC, columnCount is referred to as qCompute the first and second block (i.e. column zero and one ) of each lane (i.e. row)for i ← 0 to parallelism-1 dofor each row       Bi[0] ← Hash(H0 ∥ 0 ∥ i, 1024) //Generate a 1024-byte digest       Bi[1] ← Hash(H0 ∥ 1 ∥ i, 1024) //Generate a 1024-byte digestCompute remaining columns of each lanefor i ← 0 to parallelism-1 do//for each rowfor j ← 2 to columnCount-1 do//for each subsequent column//i' and j' indexes depend if it's Argon2i, Argon2d, or Argon2id (See section 3.4)          i′, j′ ← GetBlockIndexes(i, j)  //the GetBlockIndexes function is not defined          Bi[j] = G(Bi[j-1], Bi′[j′]) //the G hash function is not definedFurther passes when iterations > 1for nIteration ← 2 to iterations dofor i ← 0 to parallelism-1 dofor each rowfor j ← 0 to columnCount-1 do//for each subsequent column//i' and j' indexes depend if it's Argon2i, Argon2d, or Argon2id (See section 3.4)            i′, j′ ← GetBlockIndexes(i, j)            if j == 0 then               Bi[0] = Bi[0] xor G(Bi[columnCount-1], Bi′[j′])            else              Bi[j] = Bi[j] xor G(Bi[j-1], Bi′[j′])     Compute final block C as the XOR of the last column of each row    C ← B0[columnCount-1]    for i ← 1 to parallelism-1 do       C ← C xor Bi[columnCount-1]     Compute output tagreturn Hash(C, tagLength)

Variable-length hash function

Argon2 makes use of a hash function capable of producing digests up to 232 bytes long. This hash function is internally built upon Blake2.

Function Hash(message, digestSize)    Inputs:       message:         Bytes (0..232-1)     Message to be hashed       digestSize:      Integer (1..232)     Desired number of bytes to be returnedOutput:       digest:          Bytes (digestSize)The resulting generated bytes, digestSize bytes longHash is a variable-length hash function, built using Blake2b, capable of generating    digests up to 232 bytes.If the requested digestSize is 64-bytes or lower, then we use Blake2b directlyif (digestSize <= 64) thenreturn Blake2b(digestSize ∥ message, digestSize) //concatenate 32-bit little endian digestSize with the message bytesFor desired hashes over 64-bytes (e.g. 1024 bytes for Argon2 blocks),    we use Blake2b to generate twice the number of needed 64-byte blocks,    and then only use 32-bytes from each blockCalculate the number of whole blocks (knowing we're only going to use 32-bytes from each)    r ← Ceil(digestSize/32)-2;     Generate r whole blocks.Initial block is generated from message    V1 ← Blake2b(digestSize ∥ message, 64);    Subsequent blocks are generated from previous blocksfor i ← 2 to r do       Vi ← Blake2b(Vi-1, 64)    Generate the final (possibly partial) block    partialBytesNeeded ← digestSize – 32*r;    Vr+1 ← Blake2b(Vr, partialBytesNeeded)     Concatenate the first 32-bytes of each block Vi    (except the possibly partial last block, which we take the whole thing)Let Ai represent the lower 32-bytes of block Vireturn A1 ∥ A2 ∥ ... ∥ Ar ∥ Vr+1

As of May 2023, OWASP's Password Storage Cheat Sheet recommends that people "use Argon2id with a minimum configuration of 19 MiB of memory, an iteration count of 2, and 1 degree of parallelism." [10]

OWASP recommends that Argon2id should be preferred over Argon2d and Argon2i because it provides a balanced resistance to both GPU-based attacks and side-channel attacks. [10]

OWASP further notes that the following Argon2id options provide equivalent cryptographic strength and simply trade off memory usage for compute workload: [10]

Related Research Articles

Blowfish is a symmetric-key block cipher, designed in 1993 by Bruce Schneier and included in many cipher suites and encryption products. Blowfish provides a good encryption rate in software, and no effective cryptanalysis of it has been found to date. However, the Advanced Encryption Standard (AES) now receives more attention, and Schneier recommends Twofish for modern applications.

<span class="mw-page-title-main">HMAC</span> Computer communications hash algorithm

In cryptography, an HMAC is a specific type of message authentication code (MAC) involving a cryptographic hash function and a secret cryptographic key. As with any MAC, it may be used to simultaneously verify both the data integrity and authenticity of a message. An HMAC is a type of keyed hash function that can also be used in a key derivation scheme or a key stretching scheme.

In cryptography, SHA-1 is a hash function which takes an input and produces a 160-bit (20-byte) hash value known as a message digest – typically rendered as 40 hexadecimal digits. It was designed by the United States National Security Agency, and is a U.S. Federal Information Processing Standard. The algorithm has been cryptographically broken but is still widely used.

<span class="mw-page-title-main">Cryptographic hash function</span> Hash function that is suitable for use in cryptography

A cryptographic hash function (CHF) is a hash algorithm that has special properties desirable for a cryptographic application:

<span class="mw-page-title-main">Key derivation function</span> Function that derives secret keys from a secret value

In cryptography, a key derivation function (KDF) is a cryptographic algorithm that derives one or more secret keys from a secret value such as a master key, a password, or a passphrase using a pseudorandom function. KDFs can be used to stretch keys into longer keys or to obtain keys of a required format, such as converting a group element that is the result of a Diffie–Hellman key exchange into a symmetric key for use with AES. Keyed cryptographic hash functions are popular examples of pseudorandom functions used for key derivation.

In cryptography, PBKDF1 and PBKDF2 are key derivation functions with a sliding computational cost, used to reduce vulnerability to brute-force attacks.

SHA-2 is a set of cryptographic hash functions designed by the United States National Security Agency (NSA) and first published in 2001. They are built using the Merkle–Damgård construction, from a one-way compression function itself built using the Davies–Meyer structure from a specialized block cipher.

In cryptography, a message authentication code based on universal hashing, or UMAC, is a type of message authentication code (MAC) calculated choosing a hash function from a class of hash functions according to some secret (random) process and applying it to the message. The resulting digest or fingerprint is then encrypted to hide the identity of the hash function used. As with any MAC, it may be used to simultaneously verify both the data integrity and the authenticity of a message. In contrast to traditional MACs, which are serializable, UMAC can be executed in parallel. Thus as machines continue to offer more parallel processing capabilities, the speed of implementing UMAC will increase.

A rainbow table is a precomputed table for caching the outputs of a cryptographic hash function, usually for cracking password hashes. Passwords are typically stored not in plain text form, but as hash values. If such a database of hashed passwords falls into the hands of an attacker, they can use a precomputed rainbow table to recover the plaintext passwords. A common defense against this attack is to compute the hashes using a key derivation function that adds a "salt" to each password before hashing it, with different passwords receiving different salts, which are stored in plain text along with the hash.

In cryptography, CRAM-MD5 is a challenge–response authentication mechanism (CRAM) based on the HMAC-MD5 algorithm. As one of the mechanisms supported by the Simple Authentication and Security Layer (SASL), it is often used in email software as part of SMTP Authentication and for the authentication of POP and IMAP users, as well as in applications implementing LDAP, XMPP, BEEP, and other protocols.

In cryptography, key stretching techniques are used to make a possibly weak key, typically a password or passphrase, more secure against a brute-force attack by increasing the resources it takes to test each possible key. Passwords or passphrases created by humans are often short or predictable enough to allow password cracking, and key stretching is intended to make such attacks more difficult by complicating a basic step of trying a single password candidate. Key stretching also improves security in some real-world applications where the key length has been constrained, by mimicking a longer key length from the perspective of a brute-force attacker.

In cryptography, Galois/Counter Mode (GCM) is a mode of operation for symmetric-key cryptographic block ciphers which is widely adopted for its performance. GCM throughput rates for state-of-the-art, high-speed communication channels can be achieved with inexpensive hardware resources.

bcrypt is a password-hashing function designed by Niels Provos and David Mazières, based on the Blowfish cipher and presented at USENIX in 1999. Besides incorporating a salt to protect against rainbow table attacks, bcrypt is an adaptive function: over time, the iteration count can be increased to make it slower, so it remains resistant to brute-force search attacks even with increasing computation power.

BLAKE is a cryptographic hash function based on Daniel J. Bernstein's ChaCha stream cipher, but a permuted copy of the input block, XORed with round constants, is added before each ChaCha round. Like SHA-2, there are two variants differing in the word size. ChaCha operates on a 4×4 array of words. BLAKE repeatedly combines an 8-word hash value with 16 message words, truncating the ChaCha result to obtain the next hash value. BLAKE-256 and BLAKE-224 use 32-bit words and produce digest sizes of 256 bits and 224 bits, respectively, while BLAKE-512 and BLAKE-384 use 64-bit words and produce digest sizes of 512 bits and 384 bits, respectively.

In cryptography, scrypt is a password-based key derivation function created by Colin Percival in March 2009, originally for the Tarsnap online backup service. The algorithm was specifically designed to make it costly to perform large-scale custom hardware attacks by requiring large amounts of memory. In 2016, the scrypt algorithm was published by IETF as RFC 7914. A simplified version of scrypt is used as a proof-of-work scheme by a number of cryptocurrencies, first implemented by an anonymous programmer called ArtForz in Tenebrix and followed by Fairbrix and Litecoin soon after.

crypt is a POSIX C library function. It is typically used to compute the hash of user account passwords. The function outputs a text string which also encodes the salt, and identifies the hash algorithm used. This output string forms a password record, which is usually stored in a text file.

In cryptography, the Salted Challenge Response Authentication Mechanism (SCRAM) is a family of modern, password-based challenge–response authentication mechanisms providing authentication of a user to a server. As it is specified for Simple Authentication and Security Layer (SASL), it can be used for password-based logins to services like LDAP, HTTP, SMTP, POP3, IMAP and JMAP (e-mail), XMPP (chat), or MongoDB and PostgreSQL (databases). For XMPP, supporting it is mandatory.

Lyra2 is a password hashing scheme (PHS) that can also work as a key derivation function (KDF). It received a special recognition during the Password Hashing Competition in July 2015, which was won by Argon2. Besides being used for its original purposes, it is also in the core of proof-of-work algorithms such as Lyra2REv2, adopted by Vertcoin, MonaCoin, among other cryptocurrencies Lyra2 was designed by Marcos A. Simplicio Jr., Leonardo C. Almeida, Ewerton R. Andrade, Paulo C. F. dos Santos, and Paulo S. L. M. Barreto from Escola Politécnica da Universidade de São Paulo. It is an improvement over Lyra, previously proposed by the same authors. Lyra2 preserves the security, efficiency and flexibility of its predecessor, including: (1) the ability to configure the desired amount of memory, processing time and parallelism to be used by the algorithm; and (2) the capacity of providing a high memory usage with a processing time similar to that obtained with scrypt. In addition, it brings the following improvements when compared to its predecessor:

A mask generation function (MGF) is a cryptographic primitive similar to a cryptographic hash function except that while a hash function's output has a fixed size, a MGF supports output of a variable length. In this respect, a MGF can be viewed as a extendable-output function (XOF): it can accept input of any length and process it to produce output of any length. Mask generation functions are completely deterministic: for any given input and any desired output length the output is always the same.

Balloon hashing is a key derivation function presenting proven memory-hard password-hashing and modern design. It was created by Dan Boneh, Henry Corrigan-Gibbs and Stuart Schechter in 2016. It is a recommended function in NIST password guidelines.

References

  1. "Password Hashing Competition"
  2. Jos Wetzels (2016-02-08). "Open Sesame: The Password Hashing Competition and Argon2". arXiv: 1602.03097 [cs.CR].
  3. Argon2: the memory-hard function for password hashing and other applications, Alex Biryukov, et al, October 1, 2015
  4. 1 2 Biryukov, Alex; Dinu, Daniel; Khovratovich, Dmitry; Josefsson, Simon (September 2021). "Argon2 Memory-Hard Function for Password Hashing and Proof-of-Work Applications" . Retrieved September 9, 2021.
  5. 1 2 Joël Alwen; Jeremiah Blocki (2016-08-05). Towards Practical Attacks on Argon2i and Balloon Hashing (PDF) (Report).
  6. Henry; Corrigan-Gibbs; Dan Boneh; Stuart Schechter (2016-01-14). Balloon Hashing: Provably Space-Hard Hash Functions with Data-Independent Access Patterns (PDF) (Report).
  7. 1 2 "[Cfrg] Argon2 v.1.3". www.ietf.org. Retrieved 2016-10-30.
  8. Joël Alwen; Jeremiah Blocki (2016-02-19). Efficiently Computing Data-Independent Memory-Hard Functions (PDF) (Report).
  9. "Recommendations". Argon2 Memory-Hard Function for Password Hashing and Proof-of-Work Applications. IETF. September 2021. sec. 7.4. doi: 10.17487/RFC9106 . RFC 9106 . Retrieved 12 July 2023.
  10. 1 2 3 "Password Storage Cheat Sheet". OWASP Cheat Sheet Series. OWASP. Retrieved 2023-05-17.