HKDF

Last updated

HKDF is a simple key derivation function (KDF) based on the HMAC message authentication code. [1] [2] It was initially proposed by its authors as a building block in various protocols and applications, as well as to discourage the proliferation of multiple KDF mechanisms. [2] The main approach HKDF follows is the "extract-then-expand" paradigm, where the KDF logically consists of two modules: the first stage takes the input keying material and "extracts" from it a fixed-length pseudorandom key, and then the second stage "expands" this key into several additional pseudorandom keys (the output of the KDF). [2]

Contents

It can be used, for example, to convert shared secrets exchanged via Diffie–Hellman into key material suitable for use in encryption, integrity checking or authentication. [1]

It is formally described in RFC 5869. [2] One of its authors also described the algorithm in a companion paper in 2010. [1]

NIST SP800-56Cr2 [3] specifies a parameterizable extract-then-expand scheme, noting that RFC 5869 HKDF is a version of it and citing its paper [1] for the rationale for the recommendations' extract-and-expand mechanisms.

There are implementations of HKDF for C#, Go, [4] Java, [5] JavaScript, [6] Perl, PHP, [7] Python, [8] Ruby, Rust, [9] and other programming languages.

Mechanism

HKDF is the composition of two functions, HKDF-Extract and HKDF-Expand: HKDF(salt, IKM, info, length) = HKDF-Expand(HKDF-Extract(salt, IKM), info, length)

HKDF-Extract

HKDF-Extract takes "input key material" (IKM) such as a shared secret generated using Diffie-Hellman, and an optional salt, and generates a cryptographic key called the PRK ("pseudorandom key"). This acts as a "randomness extractor", taking a potentially non-uniform value of high min-entropy and generating a value indistinguishable from a uniform random value.

HKDF-Extract is the output of HMAC with the "salt" as the key and the "IKM" as the message.

HKDF-Expand

HKDF-Expand takes the PRK, some "info", and a length, and generates output of the desired length. HKDF-Expand acts as a pseudorandom function keyed on PRK. This means that multiple outputs can be generated from a single IKM value by using different values for the "info" field.

HKDF-Expand works by repeatedly calling HMAC using the PRK as the key and the "info" field as the message. The HMAC inputs are chained by prepending the previous hash block to the "info" field and appending with an incrementing 8-bit counter. [2]

Example: Python implementation

#!/usr/bin/env python3importhashlibimporthmachash_function=hashlib.sha256# RFC5869 also includes SHA-1 test vectorsdefhmac_digest(key:bytes,data:bytes)->bytes:returnhmac.new(key,data,hash_function).digest()defhkdf_extract(salt:bytes,ikm:bytes)->bytes:iflen(salt)==0:salt=bytes([0]*hash_function().digest_size)returnhmac_digest(salt,ikm)defhkdf_expand(prk:bytes,info:bytes,length:int)->bytes:t=b""okm=b""i=0whilelen(okm)<length:i+=1t=hmac_digest(prk,t+info+bytes([i]))okm+=treturnokm[:length]defhkdf(salt:bytes,ikm:bytes,info:bytes,length:int)->bytes:prk=hkdf_extract(salt,ikm)returnhkdf_expand(prk,info,length)okm=hkdf(salt=bytes.fromhex("000102030405060708090a0b0c"),ikm=bytes.fromhex("0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b"),info=bytes.fromhex("f0f1f2f3f4f5f6f7f8f9"),length=42,)assertokm==bytes.fromhex("3cb25f25faacd57a90434f64d0362f2a""2d2d0a90cf1a5a4c5db02d56ecc4c5bf""34007208d5b887185865")# Zero-length saltasserthkdf(salt=b"",ikm=bytes.fromhex("0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b"),info=b"",length=42,)==bytes.fromhex("8da4e775a563c18f715f802a063c5a31""b8a11f5c5ee1879ec3454e5f3c738d2d""9d201395faa4b61a96c8")

Related Research Articles

<span class="mw-page-title-main">Hash function</span> Mapping arbitrary data to fixed-size values

A hash function is any function that can be used to map data of arbitrary size to fixed-size values, though there are some hash functions that support variable-length output. The values returned by a hash function are called hash values, hash codes, hash digests, digests, or simply hashes. The values are usually used to index a fixed-size table called a hash table. Use of a hash function to index a hash table is called hashing or scatter-storage addressing.

<span class="mw-page-title-main">HMAC</span> Computer communications authentication algorithm

In cryptography, an HMAC is a specific type of message authentication code (MAC) involving a cryptographic hash function and a secret cryptographic key. As with any MAC, it may be used to simultaneously verify both the data integrity and authenticity of a message. An HMAC is a type of keyed hash function that can also be used in a key derivation scheme or a key stretching scheme.

The MD5 message-digest algorithm is a widely used hash function producing a 128-bit hash value. MD5 was designed by Ronald Rivest in 1991 to replace an earlier hash function MD4, and was specified in 1992 as RFC 1321.

<span class="mw-page-title-main">Block cipher mode of operation</span> Cryptography algorithm

In cryptography, a block cipher mode of operation is an algorithm that uses a block cipher to provide information security such as confidentiality or authenticity. A block cipher by itself is only suitable for the secure cryptographic transformation of one fixed-length group of bits called a block. A mode of operation describes how to repeatedly apply a cipher's single-block operation to securely transform amounts of data larger than a block.

A cryptographically secure pseudorandom number generator (CSPRNG) or cryptographic pseudorandom number generator (CPRNG) is a pseudorandom number generator (PRNG) with properties that make it suitable for use in cryptography. It is also referred to as a cryptographic random number generator (CRNG).

<span class="mw-page-title-main">Cryptographic hash function</span> Hash function that is suitable for use in cryptography

A cryptographic hash function (CHF) is a hash algorithm that has special properties desirable for a cryptographic application:

<span class="mw-page-title-main">Key derivation function</span> Function that derives secret keys from a secret value

In cryptography, a key derivation function (KDF) is a cryptographic algorithm that derives one or more secret keys from a secret value such as a master key, a password, or a passphrase using a pseudorandom function. KDFs can be used to stretch keys into longer keys or to obtain keys of a required format, such as converting a group element that is the result of a Diffie–Hellman key exchange into a symmetric key for use with AES. Keyed cryptographic hash functions are popular examples of pseudorandom functions used for key derivation.

In cryptography, a message authentication code (MAC), sometimes known as an authentication tag, is a short piece of information used for authenticating and integrity-checking a message. In other words, to confirm that the message came from the stated sender and has not been changed. The MAC value allows verifiers to detect any changes to the message content.

In cryptography, PBKDF1 and PBKDF2 are key derivation functions with a sliding computational cost, used to reduce vulnerability to brute-force attacks.

SHA-2 is a set of cryptographic hash functions designed by the United States National Security Agency (NSA) and first published in 2001. They are built using the Merkle–Damgård construction, from a one-way compression function itself built using the Davies–Meyer structure from a specialized block cipher.

bcrypt is a password-hashing function designed by Niels Provos and David Mazières, based on the Blowfish cipher and presented at USENIX in 1999. Besides incorporating a salt to protect against rainbow table attacks, bcrypt is an adaptive function: over time, the iteration count can be increased to make it slower, so it remains resistant to brute-force search attacks even with increasing computation power.

BLAKE is a cryptographic hash function based on Daniel J. Bernstein's ChaCha stream cipher, but a permuted copy of the input block, XORed with round constants, is added before each ChaCha round. Like SHA-2, there are two variants differing in the word size. ChaCha operates on a 4×4 array of words. BLAKE repeatedly combines an 8-word hash value with 16 message words, truncating the ChaCha result to obtain the next hash value. BLAKE-256 and BLAKE-224 use 32-bit words and produce digest sizes of 256 bits and 224 bits, respectively, while BLAKE-512 and BLAKE-384 use 64-bit words and produce digest sizes of 512 bits and 384 bits, respectively.

In cryptography, scrypt is a password-based key derivation function created by Colin Percival in March 2009, originally for the Tarsnap online backup service. The algorithm was specifically designed to make it costly to perform large-scale custom hardware attacks by requiring large amounts of memory. In 2016, the scrypt algorithm was published by IETF as RFC 7914. A simplified version of scrypt is used as a proof-of-work scheme by a number of cryptocurrencies, first implemented by an anonymous programmer called ArtForz in Tenebrix and followed by Fairbrix and Litecoin soon after.

In cryptography and computer security, a length extension attack is a type of attack where an attacker can use Hash(message1) and the length of message1 to calculate Hash(message1message2) for an attacker-controlled message2, without needing to know the content of message1. This is problematic when the hash is used as a message authentication code with construction Hash(secretmessage), and message and the length of secret is known, because an attacker can include extra information at the end of the message and produce a valid hash without knowing the secret. Algorithms like MD5, SHA-1 and most of SHA-2 that are based on the Merkle–Damgård construction are susceptible to this kind of attack. Truncated versions of SHA-2, including SHA-384 and SHA-512/256 are not susceptible, nor is the SHA-3 algorithm. HMAC also uses a different construction and so is not vulnerable to length extension attacks. Lastly, just performing Hash(messagesecret) is enough to not be affected.

NIST SP 800-90A is a publication by the National Institute of Standards and Technology with the title Recommendation for Random Number Generation Using Deterministic Random Bit Generators. The publication contains the specification for three allegedly cryptographically secure pseudorandom number generators for use in cryptography: Hash DRBG, HMAC DRBG, and CTR DRBG. Earlier versions included a fourth generator, Dual_EC_DRBG. Dual_EC_DRBG was later reported to probably contain a kleptographic backdoor inserted by the United States National Security Agency (NSA).

In cryptography, the Salted Challenge Response Authentication Mechanism (SCRAM) is a family of modern, password-based challenge–response authentication mechanisms providing authentication of a user to a server. As it is specified for Simple Authentication and Security Layer (SASL), it can be used for password-based logins to services like LDAP, HTTP, SMTP, POP3, IMAP and JMAP (e-mail), XMPP (chat), or MongoDB and PostgreSQL (databases). For XMPP, supporting it is mandatory.

In cryptography, a pepper is a secret added to an input such as a password during hashing with a cryptographic hash function. This value differs from a salt in that it is not stored alongside a password hash, but rather the pepper is kept separate in some other medium, such as a Hardware Security Module. Note that the National Institute of Standards and Technology refers to this value as a secret key rather than a pepper. A pepper is similar in concept to a salt or an encryption key. It is like a salt in that it is a randomized value that is added to a password hash, and it is similar to an encryption key in that it should be kept secret.

Argon2 is a key derivation function that was selected as the winner of the 2015 Password Hashing Competition. It was designed by Alex Biryukov, Daniel Dinu, and Dmitry Khovratovich from the University of Luxembourg. The reference implementation of Argon2 is released under a Creative Commons CC0 license or the Apache License 2.0, and provides three related versions:

Lyra2 is a password hashing scheme (PHS) that can also function as a key derivation function (KDF). It gained recognition during the Password Hashing Competition in July 2015, which was won by Argon2. It is also used in proof-of-work algorithms such as Lyra2REv2, adopted by Vertcoin and MonaCoin, among other cryptocurrencies.

A mask generation function (MGF) is a cryptographic primitive similar to a cryptographic hash function except that while a hash function's output has a fixed size, a MGF supports output of a variable length. In this respect, a MGF can be viewed as a extendable-output function (XOF): it can accept input of any length and process it to produce output of any length. Mask generation functions are completely deterministic: for any given input and any desired output length the output is always the same.

References

  1. 1 2 3 4 Krawczyk, Hugo (2010). "Cryptographic Extraction and Key Derivation: The HKDF Scheme". Cryptology ePrint Archive. International Association for Cryptologic Research.
  2. 1 2 3 4 5 Krawczyk, H.; Eronen, P. (May 2010). "RFC 5869". Internet Engineering Task Force. doi:10.17487/RFC5869.
  3. Elaine Barker; Lily Chen; Richard Davis (August 2020). "NIST Special Publication 800-56C: Recommendation for Key-Derivation Methods in Key-Establishment Schemes". doi:10.6028/NIST.SP.800-56Cr2.{{cite journal}}: Cite journal requires |journal= (help)
  4. "package hkdf". pkg.go.dev.
  5. "A standalone Java 7 implementation of HMAC-based key derivation function". github.com. 27 September 2022.
  6. "Node.js implementation of RFC5869: HMAC-based Extract-and-Expand Key Derivation Function". npmjs.com. 30 July 2023.
  7. "hash_hkdf — Generate a HKDF key derivation of a supplied key input". php.net.
  8. "HMAC-based Extract-and-Expand Key Derivation Function (HKDF) implemented in Python". github.com. 17 March 2022.
  9. "Module ring::hkdf". 19 October 2023. Retrieved 25 October 2023.