RDRAND

Last updated

RDRAND (for "read random"; known as Intel Secure Key Technology, [1] codename Bull Mountain [2] ) is an instruction for returning random numbers from an Intel on-chip hardware random number generator which has been seeded by an on-chip entropy source. [3] Intel introduced the feature around 2012, and AMD added support for the instruction in June 2015. (RDRAND is available in Ivy Bridge processors [lower-alpha 1] and is part of the Intel 64 and IA-32 instruction set architectures.) [5]

Contents

The random number generator is compliant with security and cryptographic standards such as NIST SP 800-90A, [6] FIPS 140-2, and ANSI X9.82. [3] Intel also requested Cryptography Research Inc. to review the random number generator in 2012, which resulted in the paper Analysis of Intel's Ivy Bridge Digital Random Number Generator. [7]

RDSEED is similar to RDRAND and provides lower-level access to the entropy-generating hardware. The RDSEED generator and processor instruction rdseed are available with Intel Broadwell CPUs [8] and AMD Zen CPUs. [9]

Overview

The CPUID instruction can be used on both AMD and Intel CPUs to check whether the RDRAND instruction is supported. If it is, bit 30 of the ECX register is set after calling CPUID standard function 01H. [10] AMD processors are checked for the feature using the same test. [11] RDSEED availability can be checked on Intel CPUs in a similar manner. If RDSEED is supported, the bit 18 of the EBX register is set after calling CPUID standard function 07H. [12]

The opcode for RDRAND is 0x0F 0xC7, followed by a ModRM byte that specifies the destination register and optionally combined with a REX prefix in 64-bit mode. [13]

Intel Secure Key is Intel's name for both the RDRAND instruction and the underlying random number generator (RNG) hardware implementation, [3] which was codenamed "Bull Mountain" during development. [14] Intel calls their RNG a "digital random number generator" or DRNG. The generator takes pairs of 256-bit raw entropy samples generated by the hardware entropy source and applies them to an Advanced Encryption Standard (AES) (in CBC-MAC mode) conditioner which reduces them to a single 256-bit conditioned entropy sample. A deterministic random-bit generator called CTR DRBG defined in NIST SP 800-90A is seeded by the output from the conditioner, providing cryptographically secure random numbers to applications requesting them via the RDRAND instruction. [3] [14] The hardware will issue a maximum of 511 128-bit samples before changing the seed value. Using the RDSEED operation provides access to the conditioned 256-bit samples from the AES-CBC-MAC.

The RDSEED instruction was added to Intel Secure Key for seeding another pseudorandom number generator, [15] available in Broadwell CPUs. The entropy source for the RDSEED instruction runs asynchronously on a self-timed circuit and uses thermal noise within the silicon to output a random stream of bits at the rate of 3 GHz, [16] slower than the effective 6.4 Gbit/s obtainable from RDRAND (both rates are shared between all cores and threads). [17] The RDSEED instruction is intended for seeding a software PRNG of arbitrary width, whereas the RDRAND is intended for applications that merely require high-quality random numbers. If cryptographic security is not required, a software PRNG such as Xorshift is usually faster. [18]

Performance

On an Intel Core i7-7700K, 4500 MHz (45 × 100 MHz) processor (Kaby Lake-S microarchitecture), a single RDRAND or RDSEED instruction takes 110 ns, or 463 clock cycles, regardless of the operand size (16/32/64 bits). This number of clock cycles applies to all processors with Skylake or Kaby Lake microarchitecture. On the Silvermont microarchitecture processors, each of the instructions take around 1472 clock cycles, regardless of the operand size; and on Ivy Bridge processors RDRAND takes up to 117 clock cycles. [19]

On an AMD Ryzen CPU, each of the instructions takes around 1200 clock cycles for 16-bit or 32-bit operand, and around 2500 clock cycles for a 64-bit operand. [19]

An astrophysical Monte Carlo simulator examined the time to generate 107 64-bit random numbers using RDRAND on a quad-core Intel i7-3740 QM processor. They found that a C implementation of RDRAND ran about 2× slower than the default random number generator in C, and about 20× slower than the Mersenne Twister. Although a Python module of RDRAND has been constructed, it was found to be 20× slower than the default random number generator in Python, [20] although a performance comparison between a PRNG and CSPRNG cannot be made.

A microcode update released by Intel in June 2020, designed to mitigate the CrossTalk vulnerability (see the security issues section below), negatively impacts the performance of RDRAND and RDSEED due to additional security controls. On processors with the mitigations applied, each affected instruction incurs additional latency and simultaneous execution of RDRAND or RDSEED across cores is effectively serialised. Intel introduced a mechanism to relax these security checks, thus reducing the performance impact in most scenarios, but Intel processors do not apply this security relaxation by default. [21]

Compilers

Visual C++ 2015 provides intrinsic wrapper support for the RDRAND and RDSEED functions. [22] GCC 4.6+ and Clang 3.2+ provide intrinsic functions for RDRAND when -mrdrnd is specified in the flags, [23] also setting __RDRND__ to allow conditional compilation. Newer versions additionally provide immintrin.h to wrap these built-ins into functions compatible with version 12.1+ of Intel's C Compiler. These functions write random data to the location pointed to by their parameter, and return 1 on success. [24]

Applications

It is an option to generate cryptographically secure random numbers using RDRAND and RDSEED in OpenSSL, to help secure communications.

Scientific application of RDRAND in a Monte Carlo simulator was evaluated, focusing on performance and reproducibility, compared to other random number generators. It led to the conclusion that using RDRAND as opposed to Mersenne Twister doesn't provide different results, but worse performance and reproducibility. [25] [20]

Reception

In September 2013, in response to a New York Times article revealing the NSA's effort to weaken encryption, [26] Theodore Ts'o publicly posted concerning the use of RDRAND for /dev/random in the Linux kernel: [27]

I am so glad I resisted pressure from Intel engineers to let /dev/random rely only on the RDRAND instruction. To quote from the [New York Times article [26] ]: "By this year, the Sigint Enabling Project had found ways inside some of the encryption chips that scramble information for businesses and governments, either by working with chipmakers to insert back doors..." Relying solely on the hardware random number generator which is using an implementation sealed inside a chip which is impossible to audit is a BAD idea.

Linus Torvalds dismissed concerns about the use of RDRAND in the Linux kernel and pointed out that it is not used as the only source of entropy for /dev/random, but rather used to improve the entropy by combining the values received from RDRAND with other sources of randomness. [28] [29] However, Taylor Hornby of Defuse Security demonstrated that the Linux random number generator could become insecure if a backdoor is introduced into the RDRAND instruction that specifically targets the code using it. Hornby's proof-of-concept implementation works on an unmodified Linux kernel prior to version 3.13. [30] [31] [32] The issue was mitigated in the Linux kernel in 2013. [33]

Developers changed the FreeBSD kernel away from using RDRAND and VIA PadLock directly with the comment "For FreeBSD 10, we are going to backtrack and remove RDRAND and Padlock backends and feed them into Yarrow instead of delivering their output directly to /dev/random. It will still be possible to access hardware random number generators, that is, RDRAND, Padlock etc., directly by inline assembly or by using OpenSSL from userland, if required, but we cannot trust them any more." [28] [34] FreeBSD /dev/random uses Fortuna and RDRAND started from FreeBSD 11. [35]

Security issues

On 9 June 2020, researchers from Vrije Universiteit Amsterdam published a side-channel attack named CrossTalk (CVE-2020-0543) that affected RDRAND on a number of Intel processors. [36] They discovered that outputs from the hardware digital random number generator (DRNG) were stored in a staging buffer that was shared across all cores. The vulnerability allowed malicious code running on an affected processor to read RDRAND and RDSEED instruction results from a victim application running on another core of that same processor, including applications running inside Intel SGX enclaves. [36] The researchers developed a proof-of-concept exploit [37] which extracted a complete ECDSA key from an SGX enclave running on a separate CPU core after only one signature operation. [36] The vulnerability affects scenarios where untrusted code runs alongside trusted code on the same processor, such as in a shared hosting environment.

Intel refers to the CrossTalk vulnerability as Special Register Buffer Data Sampling (SRBDS). In response to the research, Intel released microcode updates to mitigate the issue. The updated microcode ensures that off-core accesses are delayed until sensitive operations  specifically the RDRAND, RDSEED, and EGETKEY instructions  are completed and the staging buffer has been overwritten. [21] The SRBDS attack also affects other instructions, such as those that read MSRs, but Intel did not apply additional security protections to them due to performance concerns and the reduced need for confidentiality of those instructions' results. [21] A wide range of Intel processors released between 2012 and 2019 were affected, including desktop, mobile, and server processors. [38] The mitigations themselves resulted in negative performance impacts when using the affected instructions, particularly when executed in parallel by multi-threaded applications, due to increased latency introduced by the security checks and the effective serialisation of affected instructions across cores. Intel introduced an opt-out option, configurable via the IA32_MCU_OPT_CTRL MSR on each logical processor, which improves performance by disabling the additional security checks for instructions executing outside of an SGX enclave. [21]

See also

Notes

  1. In some Ivy Bridge versions, due to a bug, the RDRAND instruction causes an Illegal Instruction exception. [4]

Related Research Articles

Transmeta Corporation was an American fabless semiconductor company based in Santa Clara, California. It developed low power x86 compatible microprocessors based on a VLIW core and a software layer called Code Morphing Software.

A cryptographically secure pseudorandom number generator (CSPRNG) or cryptographic pseudorandom number generator (CPRNG) is a pseudorandom number generator (PRNG) with properties that make it suitable for use in cryptography. It is also loosely known as a cryptographic random number generator (CRNG).

x86-64 Type of instruction set which is a 64-bit version of the x86 instruction set

x86-64 is a 64-bit version of the x86 instruction set, first released in 1999. It introduced two new modes of operation, 64-bit mode and compatibility mode, along with a new 4-level paging mode.

In computing, Physical Address Extension (PAE), sometimes referred to as Page Address Extension, is a memory management feature for the x86 architecture. PAE was first introduced by Intel in the Pentium Pro, and later by AMD in the Athlon processor. It defines a page table hierarchy of three levels (instead of two), with table entries of 64 bits each instead of 32, allowing these CPUs to directly access a physical address space larger than 4 gigabytes (232 bytes).

The x86 instruction set refers to the set of instructions that x86-compatible microprocessors support. The instructions are usually part of an executable program, often stored as a computer file and executed on the processor.

<span class="mw-page-title-main">/dev/random</span> Pseudorandom number generator file in Unix-like operating systems

In Unix-like operating systems, /dev/random and /dev/urandom are special files that serve as cryptographically secure pseudorandom number generators. They allow access to environmental noise collected from device drivers and other sources. /dev/random typically blocks if there was less entropy available than requested; more recently it usually blocks at startup until sufficient entropy has been gathered, then unblocks permanently. The /dev/urandom device typically was never a blocking device, even if the pseudorandom number generator seed was not fully initialized with entropy since boot. Not all operating systems implement the same methods for /dev/random and /dev/urandom.

The security of cryptographic systems depends on some secret data that is known to authorized persons but unknown and unpredictable to others. To achieve this unpredictability, some randomization is typically employed. Modern cryptographic protocols often require frequent generation of random quantities. Cryptographic attacks that subvert or exploit weaknesses in this process are known as random number generator attacks.

<span class="mw-page-title-main">Random number generation</span> Producing a sequence that cannot be predicted better than by random chance

Random number generation is a process by which, often by means of a random number generator (RNG), a sequence of numbers or symbols that cannot be reasonably predicted better than by random chance is generated. This means that the particular outcome sequence will contain some patterns detectable in hindsight but impossible to foresee. True random number generators can be hardware random-number generators (HRNGs), wherein each generation is a function of the current value of a physical environment's attribute that is constantly changing in a manner that is practically impossible to model. This would be in contrast to so-called "random number generations" done by pseudorandom number generators (PRNGs), which generate numbers that only look random but are in fact pre-determined—these generations can be reproduced simply by knowing the state of the PRNG.

In computer security, executable-space protection marks memory regions as non-executable, such that an attempt to execute machine code in these regions will cause an exception. It makes use of hardware features such as the NX bit, or in some cases software emulation of those features. However, technologies that emulate or supply an NX bit will usually impose a measurable overhead while using a hardware-supplied NX bit imposes no measurable overhead.

In the x86 architecture, the CPUID instruction is a processor supplementary instruction allowing software to discover details of the processor. It was introduced by Intel in 1993 with the launch of the Pentium and SL-enhanced 486 processors.

The Time Stamp Counter (TSC) is a 64-bit register present on all x86 processors since the Pentium. It counts the number of CPU cycles since its reset. The instruction RDTSC returns the TSC in EDX:EAX. In x86-64 mode, RDTSC also clears the upper 32 bits of RAX and RDX. Its opcode is 0F 31. Pentium competitors such as the Cyrix 6x86 did not always have a TSC and may consider RDTSC an illegal instruction. Cyrix included a Time Stamp Counter in their MII.

CryptGenRandom is a deprecated cryptographically secure pseudorandom number generator function that is included in Microsoft CryptoAPI. In Win32 programs, Microsoft recommends its use anywhere random number generation is needed. A 2007 paper from Hebrew University suggested security problems in the Windows 2000 implementation of CryptGenRandom. Microsoft later acknowledged that the same problems exist in Windows XP, but not in Vista. Microsoft released a fix for the bug with Windows XP Service Pack 3 in mid-2008.

SSE4 is a SIMD CPU instruction set used in the Intel Core microarchitecture and AMD K10 (K8L). It was announced on September 27, 2006, at the Fall 2006 Intel Developer Forum, with vague details in a white paper; more precise details of 47 instructions became available at the Spring 2007 Intel Developer Forum in Beijing, in the presentation. SSE4 is fully compatible with software written for previous generations of Intel 64 and IA-32 architecture microprocessors. All existing software continues to run correctly without modification on microprocessors that incorporate SSE4, as well as in the presence of existing and new applications that incorporate SSE4.

In computing, entropy is the randomness collected by an operating system or application for use in cryptography or other uses that require random data. This randomness is often collected from hardware sources, either pre-existing ones such as mouse movements or specially provided randomness generators. A lack of entropy can have a negative impact on performance and security.

Advanced Vector Extensions (AVX) are extensions to the x86 instruction set architecture for microprocessors from Intel and Advanced Micro Devices (AMD). They were proposed by Intel in March 2008 and first supported by Intel with the Sandy Bridge processor shipping in Q1 2011 and later by AMD with the Bulldozer processor shipping in Q3 2011. AVX provides new features, new instructions, and a new coding scheme.

An Advanced Encryption Standard instruction set is now integrated into many processors. The purpose of the instruction set is to improve the speed and security of applications performing encryption and decryption using the Advanced Encryption Standard (AES).

Intel oneAPI Math Kernel Library is a library of optimized math routines for science, engineering, and financial applications. Core math functions include BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier transforms, and vector math.

VIA PadLock is a central processing unit (CPU) instruction set extension to the x86 microprocessor instruction set architecture (ISA) found on processors produced by VIA Technologies and Zhaoxin. Introduced in 2003 with the VIA Centaur CPUs, the additional instructions provide hardware-accelerated random number generation (RNG), Advanced Encryption Standard (AES), SHA-1, SHA256, and Montgomery modular multiplication.

Second Level Address Translation (SLAT), also known as nested paging, is a hardware-assisted virtualization technology which makes it possible to avoid the overhead associated with software-managed shadow page tables.

<span class="mw-page-title-main">Meltdown (security vulnerability)</span> Microprocessor security vulnerability

Meltdown is one of the two original transient execution CPU vulnerabilities. Meltdown affects Intel x86 microprocessors, IBM POWER processors, and some ARM-based microprocessors. It allows a rogue process to read all memory, even when it is not authorized to do so.

References

  1. "What is Intel® Secure Key Technology?". Intel. Retrieved 2020-09-23.
  2. Hofemeier, Gael (2011-06-22). "Find out about Intel's new RDRAND Instruction". Intel Developer Zone Blogs. Retrieved 30 December 2013.
  3. 1 2 3 4 "Intel Digital Random Number Generator (DRNG): Software Implementation Guide, Revision 1.1" (PDF). Intel Corporation. 2012-08-07. Retrieved 2012-11-25.
  4. Desktop 3rd Generation Intel Core Processor Family, Specification Update (PDF). Intel Corporation. January 2013.
  5. "AMD64 Architecture Programmer's Manual Volume 3: General-Purpose and System Instructions" (PDF). AMD Developer Guides, Manuals & ISA Documents. June 2015. Retrieved 16 October 2015.
  6. Barker, Elaine; Kelsey, John (January 2012). "Recommendation for Random Number Generation Using Deterministic Random Bit Generators" (PDF). National Institute of Standards and Technology. doi:10.6028/NIST.SP.800-90A . Retrieved September 16, 2013.{{cite journal}}: Cite journal requires |journal= (help)
  7. Hamburg, Mike; Kocher, Paul; Marson, Mark (2012-03-12). "Analysis of Intel's Ivy Bridge Digital Random Number Generator" (PDF). Cryptography Research, Inc. Archived from the original (PDF) on 2014-12-30. Retrieved 2015-08-21.
  8. Hofemeier, Gael (2012-07-26). "Introduction to Intel AES-NI and Intel SecureKey Instructions". Intel Developer Zone. Intel. Retrieved 2015-10-24.
  9. "AMD Starts Linux Enablement On Next-Gen "Zen" Architecture - Phoronix". www.phoronix.com. Retrieved 2015-10-25.
  10. "Volume 1, Section 7.3.17, 'Random Number Generator Instruction'" (PDF). Intel® 64 and IA-32 Architectures Software Developer’s Manual Combined Volumes: 1, 2A, 2B, 2C, 3A, 3B and 3C. Intel Corporation. June 2013. p. 177. Retrieved 24 June 2013. All Intel processors that support the RDRAND instruction indicate the availability of the RDRAND instruction via reporting CPUID.01H:ECX.RDRAND[bit 30] = 1
  11. "AMD64 Architecture Programmer's Manual Volume 3: General-Purpose and System Instructions" (PDF). AMD. June 2015. p. 278. Retrieved 15 October 2015. Support for the RDRAND instruction is optional. On processors that support the instruction, CPUID Fn0000_0001_ECX[RDRAND] = 1
  12. "Volume 1, Section 7.3.17, 'Random Number Generator Instruction'" (PDF). Intel® 64 and IA-32 Architectures Software Developer’s Manual Combined Volumes: 1, 2A, 2B, 2C, 3A, 3B and 3C. Intel Corporation. June 2013. p. 177. Retrieved 25 October 2015. All Intel processors that support the RDSEED instruction indicate the availability of the RDSEED instruction via reporting CPUID.(EAX=07H, ECX=0H):EBX.RDSEED[bit 18] = 1
  13. "Intel® Digital Random Number Generator (DRNG) Software Implementation Guide". Software.intel.com. Retrieved 2014-01-30.
  14. 1 2 Taylor, Greg; Cox, George (September 2011). "Behind Intel's New Random-Number Generator". IEEE Spectrum .
  15. John Mechalas (November 2012). "The Difference Between RDRAND and RDSEED". software.intel.com. Intel Corporation. Retrieved 1 January 2014.
  16. Mechalas, John. "Intel Digital Random Number Generator (DRNG) Software Implementation Guide, Section 3.2.1 Entropy Source (ES)". Intel Software. Intel. Retrieved 18 February 2015.
  17. https://software.intel.com/en-us/articles/intel-digital-random-number-generator-drng-software-implementation-guide says 800 megabytes, which is 6.4 gigabits per second.
  18. The simplest 64-bit implementation of Xorshift has 3 XORs and 3 shifts; if these are executed in a tight loop on 4 cores at 2 GHz, the throughput is 80 Gb/s. In practice it will be less due to load/store overheads etc, but is still likely to exceed the 6.4 Gb/s of RDRAND. On the other hand, the quality of RDRAND's numbers should be higher than that of a software PRNG like Xorshift.
  19. 1 2 http://www.agner.org/optimize/instruction_tables.pdf [ bare URL PDF ]
  20. 1 2 Route, Matthew (August 10, 2017). "Radio-flaring Ultracool Dwarf Population Synthesis". The Astrophysical Journal. 845 (1): 66. arXiv: 1707.02212 . Bibcode:2017ApJ...845...66R. doi: 10.3847/1538-4357/aa7ede . S2CID   118895524.
  21. 1 2 3 4 "Special Register Buffer Data Sampling". Intel. Retrieved 26 December 2020.
  22. "x86 intrinsics list". docs.microsoft.com. 2020-02-28. Retrieved 2020-05-07.
  23. "X86 Built-in Functions - Using the GNU Compiler Collection (GCC)".
  24. "Intel® C++ Compiler 19.1 Developer Guide and Reference". 2019-12-23.
  25. Route, Matthew (2019). "Intel Secure Key-Powered Radio-flaring Ultracool Dwarf Population Synthesis". American Astronomical Society Meeting Abstracts #234. American Astronomical Society Meeting #234, id. 207.01. Bulletin of the American Astronomical Society, Vol. 51, No. 4. 234. Bibcode:2019AAS...23420701R.
  26. 1 2 Perlroth, Nicole; Larson, Jeff; Shane, Scott (September 5, 2013). "N.S.A. Able to Foil Basic Safeguards of Privacy on Web". The New York Times. Retrieved November 15, 2017.
  27. Ts'o, Theodore (September 6, 2013). "I am so glad I resisted pressure from Intel engineers to let /dev/random rely..." Archived from the original on 2018-06-11.
  28. 1 2 Richard Chirgwin (2013-12-09). "FreeBSD abandoning hardware randomness". The Register.
  29. Gavin Clarke (10 September 2013). "Torvalds shoots down call to yank 'backdoored' Intel RDRAND in Linux crypto". theregister.co.uk. Retrieved 12 March 2014.
  30. Taylor Hornby (6 December 2013). "RDRAND backdoor proof of concept is working! Stock kernel (3.8.13), only the RDRAND instruction is modified" . Retrieved 9 April 2015.
  31. Taylor Hornby [@DefuseSec] (10 September 2013). "I wrote a short dialogue explaining why Linux's use of RDRAND is problematic. http://pastebin.com/A07q3nL3 /cc @kaepora @voodooKobra" (Tweet). Retrieved 11 January 2016 via Twitter.
  32. Daniel J. Bernstein; Tanja Lange (16 May 2014). "Randomness generation" (PDF). Retrieved 9 April 2015.
  33. Ts'o, Theodore (2013-10-10). "random: mix in architectural randomness earlier in extract_buf()". GitHub. Retrieved 30 July 2021.
  34. "FreeBSD Quarterly Status Report". Freebsd.org. Retrieved 2014-01-30.
  35. "random(4)". www.freebsd.org. Retrieved 2020-09-25.
  36. 1 2 3 Ragab, Hany; Milburn, Alyssa; Razavi, Kaveh; Bos, Herbert; Giuffrida, Cristiano. "CrossTalk: Speculative Data Leaks Across Cores Are Real" (PDF). Systems and Network Security Group, Vrije Universiteit Amsterdam (VUSec). Retrieved 26 December 2020.
  37. "VUSec RIDL cpuid_leak PoC, modified to leak rdrand output". GitHub. Retrieved 26 December 2020.
  38. "Processors Affected: Special Register Buffer Data Sampling". Intel Developer Zone. Retrieved 26 December 2020.