Mask generation function

Last updated

A mask generation function (MGF) is a cryptographic primitive similar to a cryptographic hash function except that while a hash function's output has a fixed size, a MGF supports output of a variable length. In this respect, a MGF can be viewed as a extendable-output function (XOF): it can accept input of any length and process it to produce output of any length. Mask generation functions are completely deterministic: for any given input and any desired output length the output is always the same.

Contents

Definition

A mask generation function takes an octet string of variable length and a desired output length as input, and outputs an octet string of the desired length. There may be restrictions on the length of the input and output octet strings, but such bounds are generally very large. Mask generation functions are deterministic; the octet string output is completely determined by the input octet string. The output of a mask generation function should be pseudorandom, that is, if the seed to the function is unknown, it should be infeasible to distinguish the output from a truly random string. [1]

Applications

Mask generation functions, as generalizations of hash functions, are useful wherever hash functions are. However, use of a MGF is desirable in cases where a fixed-size hash would be inadequate. Examples include generating padding, producing one-time pads or keystreams in symmetric-key encryption, and yielding outputs for pseudorandom number generators.

Padding schemes

Mask generation functions were first proposed as part of the specification for padding in the RSA-OAEP algorithm. The OAEP algorithm required a cryptographic hash function that could generate an output equal in size to a "data block" whose length was proportional to arbitrarily sized input message. [1]

Random number generators

NIST Special Publication 800-90A [2] defines a class of cryptographically secure random number generators, one of which is the "Hash DRBG", which uses a hash function with a counter to produce a requested sequence of random bits equal in size to the requested number of random bits.

Examples

Perhaps the most common and straightforward mechanism to build a MGF is to iteratively apply a hash function together with an incrementing counter value. The counter may be incremented indefinitely to yield new output blocks until a sufficient amount of output is collected. This is the approach used in MGF1.

MGF1

MGF1 is a mask generation function defined in the Public Key Cryptography Standard #1 published by RSA Laboratories: [1]

Options

hash function ( denotes the length in octets of the hash function output)

Input

seed from which mask is generated, an octet string

intended length in octets of the mask, at most

Output

mask, an octet string of length ; or "mask too long"

Steps

  1. If , output "mask too long" and stop.
  2. Let be the empty octet string.
  3. For from to , do the following:
    1. Convert to an octet string of length with the primitive :
    2. Concatenate the hash of the seed and to the octet string :
  4. Output the leading octets of as the octet string mask.

Example code

Below is Python code implementing MGF1:

importhashlibdefmgf1(seed:bytes,length:int,hash_func=hashlib.sha1)->bytes:"""Mask generation function."""hLen=hash_func().digest_size# https://www.ietf.org/rfc/rfc2437.txt# 1. If l > 2^32(hLen), output "mask too long" and stop.iflength>(hLen<<32):raiseValueError("mask too long")# 2. Let T be the empty octet string.T=b""# 3. For counter from 0 to \lceil{l / hLen}\rceil-1, do the following:# Note: \lceil{l / hLen}\rceil-1 is the number of iterations needed,#       but it's easier to check if we have reached the desired length.counter=0whilelen(T)<length:# a. Convert counter to an octet string C of length 4 with the primitive I2OSP: C = I2OSP (counter, 4)C=int.to_bytes(counter,4,"big")# b. Concatenate the hash of the seed Z and C to the octet string T: T = T || Hash (Z || C)T+=hash_func(seed+C).digest()counter+=1# 4. Output the leading l octets of T as the octet string mask.returnT[:length]

Example outputs of MGF1:

Python 3.10.4 (main, Apr 16 2022, 16:28:41) [GCC 8.3.0] on linuxType "help", "copyright", "credits" or "license" for more information.>>> frommgf1importmgf1>>> fromhashlibimportsha256>>> mgf1(b"foo",3).hex()'1ac907'>>> mgf1(b"foo",5).hex()'1ac9075cd4'>>> mgf1(b"bar",5).hex()'bc0c655e01'>>> mgf1(b"bar",50).hex()'bc0c655e016bc2931d85a2e675181adcef7f581f76df2739da74faac41627be2f7f415c89e983fd0ce80ced9878641cb4876'>>> mgf1(b"bar",50,sha256).hex()'382576a7841021cc28fc4c0948753fb8312090cea942ea4c4e735d10dc724b155f9f6069f289d61daca0cb814502ef04eae1'

Related Research Articles

<span class="mw-page-title-main">Hash function</span> Mapping arbitrary data to fixed-size values

A hash function is any function that can be used to map data of arbitrary size to fixed-size values, though there are some hash functions that support variable length output. The values returned by a hash function are called hash values, hash codes, hash digests, digests, or simply hashes. The values are usually used to index a fixed-size table called a hash table. Use of a hash function to index a hash table is called hashing or scatter storage addressing.

<span class="mw-page-title-main">HMAC</span> Computer communications hash algorithm

In cryptography, an HMAC is a specific type of message authentication code (MAC) involving a cryptographic hash function and a secret cryptographic key. As with any MAC, it may be used to simultaneously verify both the data integrity and authenticity of a message. An HMAC is a type of keyed hash function that can also be used in a key derivation scheme or a key stretching scheme.

In cryptography, PBKDF1 and PBKDF2 are key derivation functions with a sliding computational cost, used to reduce vulnerability to brute-force attacks.

In cryptography, Optimal Asymmetric Encryption Padding (OAEP) is a padding scheme often used together with RSA encryption. OAEP was introduced by Bellare and Rogaway, and subsequently standardized in PKCS#1 v2 and RFC 2437.

In cryptography, a message authentication code based on universal hashing, or UMAC, is a type of message authentication code (MAC) calculated choosing a hash function from a class of hash functions according to some secret (random) process and applying it to the message. The resulting digest or fingerprint is then encrypted to hide the identity of the hash function used. As with any MAC, it may be used to simultaneously verify both the data integrity and the authenticity of a message. In contrast to traditional MACs, which are serializable, UMAC can be executed in parallel. Thus as machines continue to offer more parallel processing capabilities, the speed of implementing UMAC will increase.

Poly1305 is a universal hash family designed by Daniel J. Bernstein for use in cryptography.

In mathematics and computing, universal hashing refers to selecting a hash function at random from a family of hash functions with a certain mathematical property. This guarantees a low number of collisions in expectation, even if the data is chosen by an adversary. Many universal families are known, and their evaluation is often very efficient. Universal hashing has numerous uses in computer science, for example in implementations of hash tables, randomized algorithms, and cryptography.

<span class="mw-page-title-main">Merkle–Damgård construction</span> Method of building collision-resistant cryptographic hash functions

In cryptography, the Merkle–Damgård construction or Merkle–Damgård hash function is a method of building collision-resistant cryptographic hash functions from collision-resistant one-way compression functions. This construction was used in the design of many popular hash algorithms such as MD5, SHA-1 and SHA-2.

In cryptography, Galois/Counter Mode (GCM) is a mode of operation for symmetric-key cryptographic block ciphers which is widely adopted for its performance. GCM throughput rates for state-of-the-art, high-speed communication channels can be achieved with inexpensive hardware resources.

<span class="mw-page-title-main">Z-channel (information theory)</span>

In coding theory and information theory, a Z-channel or binary asymmetric channel is a communications channel used to model the behaviour of some data storage systems.

SHA-3 is the latest member of the Secure Hash Algorithm family of standards, released by NIST on August 5, 2015. Although part of the same series of standards, SHA-3 is internally different from the MD5-like structure of SHA-1 and SHA-2.

BLAKE is a cryptographic hash function based on Daniel J. Bernstein's ChaCha stream cipher, but a permuted copy of the input block, XORed with round constants, is added before each ChaCha round. Like SHA-2, there are two variants differing in the word size. ChaCha operates on a 4×4 array of words. BLAKE repeatedly combines an 8-word hash value with 16 message words, truncating the ChaCha result to obtain the next hash value. BLAKE-256 and BLAKE-224 use 32-bit words and produce digest sizes of 256 bits and 224 bits, respectively, while BLAKE-512 and BLAKE-384 use 64-bit words and produce digest sizes of 512 bits and 384 bits, respectively.

In discrete mathematics, ideal lattices are a special class of lattices and a generalization of cyclic lattices. Ideal lattices naturally occur in many parts of number theory, but also in other areas. In particular, they have a significant place in cryptography. Micciancio defined a generalization of cyclic lattices as ideal lattices. They can be used in cryptosystems to decrease by a square root the number of parameters necessary to describe a lattice, making them more efficient. Ideal lattices are a new concept, but similar lattice classes have been used for a long time. For example, cyclic lattices, a special case of ideal lattices, are used in NTRUEncrypt and NTRUSign.

ACE is the collection of units, implementing both a public key encryption scheme and a digital signature scheme. Corresponding names for these schemes — «ACE Encrypt» and «ACE Sign». Schemes are based on Cramer-Shoup public key encryption scheme and Cramer-Shoup signature scheme. Introduced variants of these schemes are intended to achieve a good balance between performance and security of the whole encryption system.

HKDF is a simple key derivation function (KDF) based on the HMAC message authentication code. It was initially proposed by its authors as a building block in various protocols and applications, as well as to discourage the proliferation of multiple KDF mechanisms. The main approach HKDF follows is the "extract-then-expand" paradigm, where the KDF logically consists of two modules: the first stage takes the input keying material and "extracts" from it a fixed-length pseudorandom key, and then the second stage "expands" this key into several additional pseudorandom keys.

Fuzzy extractors are a method that allows biometric data to be used as inputs to standard cryptographic techniques, to enhance computer security. "Fuzzy", in this context, refers to the fact that the fixed values required for cryptography will be extracted from values close to but not identical to the original key, without compromising the security required. One application is to encrypt and authenticate users records, using the biometric inputs of the user as a key.

In cryptography, scrypt is a password-based key derivation function created by Colin Percival in March 2009, originally for the Tarsnap online backup service. The algorithm was specifically designed to make it costly to perform large-scale custom hardware attacks by requiring large amounts of memory. In 2016, the scrypt algorithm was published by IETF as RFC 7914. A simplified version of scrypt is used as a proof-of-work scheme by a number of cryptocurrencies, first implemented by an anonymous programmer called ArtForz in Tenebrix and followed by Fairbrix and Litecoin soon after.

<span class="mw-page-title-main">Sponge function</span> Theory of cryptography

In cryptography, a sponge function or sponge construction is any of a class of algorithms with finite internal state that take an input bit stream of any length and produce an output bit stream of any desired length. Sponge functions have both theoretical and practical uses. They can be used to model or implement many cryptographic primitives, including cryptographic hashes, message authentication codes, mask generation functions, stream ciphers, pseudo-random number generators, and authenticated encryption.

Lyra2 is a password hashing scheme (PHS) that can also work as a key derivation function (KDF). It received a special recognition during the Password Hashing Competition in July 2015, which was won by Argon2. Besides being used for its original purposes, it is also in the core of proof-of-work algorithms such as Lyra2REv2, adopted by Vertcoin, MonaCoin, among other cryptocurrencies Lyra2 was designed by Marcos A. Simplicio Jr., Leonardo C. Almeida, Ewerton R. Andrade, Paulo C. F. dos Santos, and Paulo S. L. M. Barreto from Escola Politécnica da Universidade de São Paulo. It is an improvement over Lyra, previously proposed by the same authors. Lyra2 preserves the security, efficiency and flexibility of its predecessor, including: (1) the ability to configure the desired amount of memory, processing time and parallelism to be used by the algorithm; and (2) the capacity of providing a high memory usage with a processing time similar to that obtained with scrypt. In addition, it brings the following improvements when compared to its predecessor:

LSH is a cryptographic hash function designed in 2014 by South Korea to provide integrity in general-purpose software environments such as PCs and smart devices. LSH is one of the cryptographic algorithms approved by the Korean Cryptographic Module Validation Program (KCMVP). And it is the national standard of South Korea.

References

  1. 1 2 3 RSA Laboratories. "RFC 2437 PKCS #1".
  2. National Institute of Standards and Technology (2012). "Recommendation for Random Number Generation Using Deterministic Random Bit Generators" (PDF). doi:10.6028/NIST.SP.800-90A.{{cite journal}}: Cite journal requires |journal= (help)