UMAC

In cryptography, a message authentication code based on universal hashing, or UMAC, is a type of message authentication code (MAC) calculated by choosing a hash function from a class of hash functions according to some secret (random) process and applying it to the message. The resulting digest or fingerprint is then encrypted to hide the identity of the hash function that was used. As with any MAC, it may be used to simultaneously verify both the data integrity and the authenticity of a message. In contrast to traditional MACs, which are computed serially, UMAC can be executed in parallel. Thus, as machines continue to offer more parallel processing capabilities, the speed of implementing UMAC will increase. [1]

A specific type of UMAC, also commonly referred to simply as UMAC, is specified in RFC 4418. It has provable cryptographic strength and is usually much less computationally intensive than other MACs. UMAC's design is optimized for 32-bit architectures with SIMD support, with a performance of 1 CPU cycle per byte (cpb) with SIMD and 2 cpb without SIMD. A closely related variant of UMAC that is optimized for 64-bit architectures is given by VMAC, which was submitted to the IETF as a draft (draft-krovetz-vmac-01) but never gathered enough attention to become a standardized RFC.

Background

Universal hashing

Say the hash function is chosen from a class of hash functions H, each of which maps messages into D, the set of possible message digests. This class is called universal if, for any distinct pair of messages, there are at most |H|/|D| functions that map them to the same member of D.

This means that if an attacker wants to replace one message with another and, from his point of view, the hash function was chosen completely randomly, the probability that the UMAC will not detect his modification is at most 1/|D|.

But this definition is not strong enough: if the possible messages are 0 and 1, D = {0, 1}, and H consists of the identity operation and NOT (negation), then H is universal. Yet even if the digest is encrypted by modular addition, an attacker can change the message and the digest at the same time, and the receiver would not notice the difference.

Strongly universal hashing

A class of hash functions H that is good to use will make it difficult for an attacker to guess the correct digest d of a fake message f after intercepting one message a with digest c. In other words,

Pr[h(f) = d | h(a) = c]

(over the random choice of h from H) needs to be very small, preferably 1/|D|.

It is easy to construct such a class of hash functions when D is a field. For example, if |D| is prime, all operations are taken modulo |D|. The message a is then encoded as an n-dimensional vector over D, (a1, a2, ..., an). H then has |D|^(n+1) members, each corresponding to an (n + 1)-dimensional vector over D, (h0, h1, ..., hn). If we let

h(a) = (h0 + h1·a1 + h2·a2 + ... + hn·an) mod |D|

we can use the rules of probability and combinatorics to prove that

Pr[h(f) = d | h(a) = c] = 1/|D|.
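
A minimal C sketch of this construction (the prime P, the key layout, and the name poly_hash are illustrative choices, not taken from any specification):

#include <stdint.h>
#include <stddef.h>

/* Strongly universal hash over D = {0, 1, ..., P-1} for a prime P.
   The key (h[0], h[1], ..., h[n]) is chosen uniformly at random from D. */
#define P 4294967291u   /* 2^32 - 5, a prime; an arbitrary illustrative choice */

uint32_t poly_hash(const uint32_t *a, size_t n, const uint32_t *h)
{
    uint64_t d = h[0] % P;                                   /* h0             */
    for (size_t i = 1; i <= n; i++)                          /* + h1*a1 + ...  */
        d = (d + (uint64_t)(h[i] % P) * (a[i - 1] % P)) % P; /*   + hn*an      */
    return (uint32_t)d;
}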

If we properly encrypt all the digests (e.g. with a one-time pad), an attacker cannot learn anything from them and the same hash function can be used for all communication between the two parties. This may not be true for ECB encryption, because two messages may quite easily produce the same hash value and hence the same ciphertext. In that case some kind of initialization vector should be used, which is often called the nonce. It has become common practice to set h0 = f(nonce), where f is also secret.
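
A sketch of that practice, reusing poly_hash from the example above; the function pointer f stands in for any secret pseudorandom function of the nonce (e.g. a block-cipher call), and the name make_tag is hypothetical:

/* Hypothetical illustration: the per-message term h[0] is refreshed from the
   nonce, so the remaining key words h[1..n] can be reused across messages. */
uint32_t make_tag(const uint32_t *msg, size_t n, uint32_t *h,
                  uint32_t (*f)(uint64_t), uint64_t nonce)
{
    h[0] = f(nonce) % P;           /* fresh h0 = f(nonce) for every message */
    return poly_hash(msg, n, h);   /* the digest is masked by this h0       */
}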

Notice that having massive amounts of computing power does not help the attacker at all. If the recipient limits the number of forgeries it accepts (by sleeping whenever it detects one), |D| can be 2^32 or even smaller.

Example

The following C function generates a 24-bit UMAC. It assumes that the length of secret is a multiple of 24 bits, that msg is not longer than secret, and that result already contains the 24 secret bits, e.g. f(nonce). The nonce does not need to be contained in msg. (The code illustrates the general construction above rather than the UMAC specified in RFC 4418.)

C language code (original)
#include <stdint.h>
#include <stddef.h>

#define uchar uint8_t

void UHash24(uchar *msg, uchar *secret, size_t len, uchar *result)
{
  uchar r1 = 0, r2 = 0, r3 = 0, s1, s2, s3, byteCnt = 0, byte;

  while (len-- > 0) {
    /* Fetch a new 3-byte secret for every three message bytes. */
    if (byteCnt-- == 0) {
      s1 = *secret++;
      s2 = *secret++;
      s3 = *secret++;
      byteCnt = 2;
    }
    byte = *msg++;
    /* Each bit of the message controls whether the current secret
       is XORed into the hash. */
    for (uchar bitCnt = 0; bitCnt < 8; bitCnt++) {
      if (byte & 1) {
        r1 ^= s1;
        r2 ^= s2;
        r3 ^= s3;
      }
      byte >>= 1; /* next bit */
      /* Multiply the secret by x (i.e. 2), subtracting (by XOR) the
         polynomial x^24 + x^4 + x^3 + x + 1 when necessary to keep
         its degree under 24. */
      uchar doSub = s3 & 0x80;
      s3 <<= 1;
      if (s2 & 0x80) s3 |= 1;
      s2 <<= 1;
      if (s1 & 0x80) s2 |= 1;
      s1 <<= 1;
      if (doSub) {
        s1 ^= 0x1B; /* low byte of x^24 + x^4 + x^3 + x + 1 */
      }
    } /* for each bit in the message */
  } /* for each byte in the message */
  *result++ ^= r1;
  *result++ ^= r2;
  *result++ ^= r3;
}
C language code (revised)
#define uchar uint8_t

/* The same hash as above, but grouped into three-byte chunks processed in
   32-bit registers, which typically generates better code. */
void UHash24Ex(uchar *msg, uchar *secret, size_t len, uchar *result)
{
  uchar read;
  uint32_t sec, ret = 0, content;

  while (len > 0) {
    /* Read up to three message bytes into a 24-bit chunk. */
    content = 0;
    switch (read = (len >= 3 ? 3 : len)) {
      case 3: content |= (uint32_t)msg[2] << 16; /* FALLTHRU */
      case 2: content |= (uint32_t)msg[1] << 8;  /* FALLTHRU */
      case 1: content |= (uint32_t)msg[0];
    }
    len -= read;
    msg += read;
    /* Fetch a new 24-bit secret for every three message bytes. */
    sec = (uint32_t)secret[2] << 16 | (uint32_t)secret[1] << 8 | (uint32_t)secret[0];
    secret += 3;
    /* The compressor: each message bit selects whether the current secret
       is XORed into the result; the secret is then multiplied by x modulo
       x^24 + x^4 + x^3 + x + 1. */
    for (uchar bitCnt = 0; bitCnt < 24; bitCnt++) {
      if (content & 1) {
        ret ^= sec;
      }
      content >>= 1; /* next bit */
      /* Shift register: multiply by x and reduce to keep the degree under 24. */
      sec <<= 1;
      if (sec & 0x01000000) sec ^= 0x0100001B;
      sec &= 0x00ffffff;
    } /* for each bit in the chunk */
  } /* for each three bytes of the message */
  result[0] ^= ret & 0xff;
  result[1] ^= (ret >> 8) & 0xff;
  result[2] ^= (ret >> 16) & 0xff;
}
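
The following usage sketch shows how UHash24 might be called; the key bytes and the value standing in for f(nonce) are made-up placeholders, not derived from any standard key schedule.

#include <stdio.h>

int main(void)
{
  uchar msg[]    = "abcdef";                /* 6 message bytes (a multiple of 3)  */
  uchar secret[] = { 0x13, 0x57, 0x9b,      /* one 24-bit secret per 3 bytes,     */
                     0x24, 0x68, 0xac };    /* chosen at random in a real system  */
  uchar tag[3]   = { 0xde, 0xad, 0xbe };    /* placeholder for f(nonce)           */

  UHash24(msg, secret, 6, tag);             /* tag ^= hash of msg                 */
  printf("%02x%02x%02x\n", tag[0], tag[1], tag[2]);
  return 0;
}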

NH and the RFC UMAC

NH

Functions in the above unnamed[ citation needed ] strongly universal hash-function family use n multiplications (one per message word) to compute a hash value.

The NH family halves the number of multiplications, which roughly translates to a two-fold speed-up in practice. [2] For speed, UMAC uses the NH hash-function family. NH is specifically designed to use SIMD instructions, and hence UMAC is the first MAC function optimized for SIMD. [1]

The following hash family is 2^(-w/2)-almost-Δ-universal: [1]

NH_K(M) = ( Σ_{i=1..ℓ/2} ((m_{2i−1} + k_{2i−1}) mod 2^(w/2)) · ((m_{2i} + k_{2i}) mod 2^(w/2)) ) mod 2^w

where the message M = (m_1, m_2, ..., m_ℓ) and the key K = (k_1, k_2, ..., k_ℓ) are vectors of ℓ half-words ((w/2)-bit unsigned integers) and ℓ is even.

Practically, NH is computed with ordinary unsigned integer arithmetic: all multiplications are taken mod 2^w, all additions mod 2^(w/2), and the inputs are vectors of half-words ((w/2)-bit integers). The algorithm then uses ℓ/2 multiplications, where ℓ is the number of half-words in the vector. Thus, the algorithm runs at a "rate" of one multiplication per w-bit word of input.
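
For example, with w = 64 the half-words are 32-bit integers, and the natural wrap-around of C unsigned arithmetic supplies both modular reductions for free. A minimal sketch (the function name and signature are illustrative, not from the UMAC specification):

#include <stdint.h>
#include <stddef.h>

/* NH over an even number l of 32-bit half-words.  m is the message, k the
   key; uint32_t addition wraps mod 2^32 and uint64_t accumulation wraps
   mod 2^64, as required. */
uint64_t nh64(const uint32_t *m, const uint32_t *k, size_t l)
{
    uint64_t y = 0;
    for (size_t i = 0; i < l; i += 2) {
        uint32_t a = m[i]     + k[i];      /* addition mod 2^32          */
        uint32_t b = m[i + 1] + k[i + 1];
        y += (uint64_t)a * b;              /* 32x32 -> 64-bit product,   */
    }                                      /* summed mod 2^64            */
    return y;
}

Each loop iteration consumes two half-words (one 64-bit word) of input and performs a single multiplication, matching the one-multiplication-per-word rate.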

RFC 4418

RFC 4418 does a lot of work to wrap NH into a complete UMAC. The overall UHASH ("Universal Hash Function") routine produces tags of a selectable length, and the chosen tag length determines the number of iterations (and the total length of key material) needed in all three layers of its hashing. Several calls to an AES-based key derivation function are used to provide keys for all three keyed hashes.

In RFC 4418, NH is rearranged to take the form:

Y = 0
for (i = 0; i < t; i += 8) do
    Y = Y + ((M[i+0] +_32 K[i+0]) ×_64 (M[i+4] +_32 K[i+4]))
    Y = Y + ((M[i+1] +_32 K[i+1]) ×_64 (M[i+5] +_32 K[i+5]))
    Y = Y + ((M[i+2] +_32 K[i+2]) ×_64 (M[i+6] +_32 K[i+6]))
    Y = Y + ((M[i+3] +_32 K[i+3]) ×_64 (M[i+7] +_32 K[i+7]))
end for

Here +_32 denotes addition modulo 2^32, ×_64 a 32×32 → 64-bit multiplication, and the additions into Y are taken modulo 2^64.

This arrangement is designed to encourage programmers to use SIMD instructions for the accumulation: each factor is paired with the element four indices away, so the operands of the four multiplications in one iteration form two contiguous groups of words that land in separate SIMD registers and can be multiplied in bulk. On a hypothetical machine, it could simply translate to:

Hypothetical assembly
movq       $0, regY            ; Y = 0
movq       $0, regI            ; i = 0
loop:
add        reg1, regM, regI    ; reg1 = M + i
add        reg2, reg1, $16     ; reg2 = M + i + 4 words (the paired operands)
vldr.4x32  vec1, reg1          ; load 4x32-bit values from memory at reg1 into vec1
vldr.4x32  vec2, reg2
vmul.4x64  vec3, vec1, vec2    ; vec3 = vec1 * vec2 (widening multiply)
uaddv.4x64 reg3, vec3          ; horizontally sum vec3 into reg3
add        regY, regY, reg3    ; regY = regY + reg3
add        regI, regI, $8
cmp        regI, regT
jlt        loop
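
The same rearranged loop can be written as a plain C sketch (a hedged illustration rather than the reference code); the stride-4 pairing mirrors the pseudocode above, and a vectorizing compiler can map each group of four products onto one SIMD multiply:

#include <stdint.h>
#include <stddef.h>

/* RFC 4418-style NH accumulation: t must be a multiple of 8 32-bit words.
   m[i+j] is paired with m[i+4+j], so the factors of the four products in
   one iteration lie in two contiguous 4-word groups. */
uint64_t nh_rfc(const uint32_t *m, const uint32_t *k, size_t t)
{
    uint64_t y = 0;
    for (size_t i = 0; i < t; i += 8)
        for (size_t j = 0; j < 4; j++)
            y += (uint64_t)(m[i + j] + k[i + j]) *       /* +_32, then x_64 */
                 (m[i + 4 + j] + k[i + 4 + j]);          /* summed mod 2^64 */
    return y;
}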


References

  1. Black, J.; Halevi, S.; Krawczyk, H.; Krovetz, T. (1999). UMAC: Fast and Secure Message Authentication (PDF). Advances in Cryptology (CRYPTO '99). Archived from the original (PDF) on 2012-03-10. Equation 1 and also section 4.2 "Definition of NH".
  2. Thorup, Mikkel (2009). String hashing for linear probing. Proc. 20th ACM-SIAM Symposium on Discrete Algorithms (SODA). pp. 655–664. CiteSeerX 10.1.1.215.4253. doi:10.1137/1.9781611973068.72. ISBN 978-0-89871-680-1. Archived (PDF) from the original on 2013-10-12. Section 5.3.