AES instruction set

An AES (Advanced Encryption Standard) instruction set is a set of instructions that are specifically designed to perform AES encryption and decryption operations efficiently. These instructions are typically found in modern processors and can greatly accelerate AES operations compared to software implementations. An AES instruction set includes instructions for key expansion, encryption, and decryption using various key sizes (128-bit, 192-bit, and 256-bit).

The instruction set is often implemented as a set of instructions that each perform a single round of AES, along with a special version for the last round, which differs slightly in that it omits the MixColumns step.
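The difference between a regular round and the last round can be illustrated with a small software model. The sketch below is plain Python and does not execute any hardware instruction; the function names (`aesenc_model`, `aesenclast_model`) are hypothetical, the state is assumed to be 16 bytes in FIPS-197 order (byte i in row i mod 4, column i div 4), and the S-box is derived at start-up rather than tabulated.

```python
def gmul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    product = 0
    for _ in range(8):
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return product


def build_sbox():
    """Derive the AES S-box: field inverse followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for shift in (1, 2, 3, 4):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()


def shift_rows(state):
    # Byte i sits in row i % 4, column i // 4; row r rotates left by r.
    return [state[(i % 4) + 4 * ((i // 4 + i % 4) % 4)] for i in range(16)]


def sub_bytes(state):
    return [SBOX[b] for b in state]


def mix_columns(state):
    # Each column is multiplied by the fixed AES matrix over GF(2^8).
    out = []
    for c in range(4):
        col = state[4 * c:4 * c + 4]
        for r in range(4):
            out.append(gmul(col[r], 2) ^ gmul(col[(r + 1) % 4], 3)
                       ^ col[(r + 2) % 4] ^ col[(r + 3) % 4])
    return out


def xor_bytes(a, b):
    return [x ^ y for x, y in zip(a, b)]


def aesenc_model(state, round_key):
    """One full round: ShiftRows, SubBytes, MixColumns, AddRoundKey."""
    return xor_bytes(mix_columns(sub_bytes(shift_rows(state))), round_key)


def aesenclast_model(state, round_key):
    """Last round: the same steps with MixColumns left out."""
    return xor_bytes(sub_bytes(shift_rows(state)), round_key)
```

Under these assumptions, an AES-128 encryption would XOR the plaintext with the first round key, apply `aesenc_model` nine times, and finish with one call to `aesenclast_model`.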

When AES is implemented as an instruction set instead of as software, it can have improved security, as its side-channel attack surface is reduced. [citation needed]

x86 architecture processors

AES-NI (Intel Advanced Encryption Standard New Instructions) was the first major implementation. It is an extension to the x86 instruction set architecture for microprocessors from Intel and AMD, proposed by Intel in March 2008. [1]

A wider version of AES-NI, AVX-512 Vector AES instructions (VAES), is found in AVX-512. [2]

Instructions

Instruction        Description [3]
AESENC             Perform one round of an AES encryption flow
AESENCLAST         Perform the last round of an AES encryption flow
AESDEC             Perform one round of an AES decryption flow
AESDECLAST         Perform the last round of an AES decryption flow
AESKEYGENASSIST    Assist in AES round key generation [note 1]
AESIMC             Assist in AES decryption round key generation. Applies Inverse Mix Columns to round keys.
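How AESKEYGENASSIST fits into round key generation can be sketched with a software model. The following is a plain-Python illustration, not hardware code: the function names are hypothetical, the 128-bit operand is assumed to be 16 bytes in memory order (so X1 is bytes 4..7 and X3 is bytes 12..15), and only the AES-128 schedule is shown.

```python
def gmul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    product = 0
    for _ in range(8):
        if b & 1:
            product ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return product


def build_sbox():
    """Derive the AES S-box: field inverse followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for shift in (1, 2, 3, 4):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()

# Round constants for the ten AES-128 round keys.
RCONS = (0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1B, 0x36)


def aeskeygenassist_model(src, rcon):
    """Model of AESKEYGENASSIST: four subexpressions built from X1 and X3."""
    out = []
    for word in (src[4:8], src[12:16]):      # X1, then X3
        sub = [SBOX[b] for b in word]        # SubWord
        rot = sub[1:] + sub[:1]              # RotWord
        rot[0] ^= rcon                       # XOR with the round constant
        out += sub + rot
    return out


def next_round_key_128(prev, rcon):
    """Derive the next AES-128 round key from the previous one."""
    # AES-128 only needs RotWord(SubWord(X3)) xor RCON, the top word.
    t = aeskeygenassist_model(prev, rcon)[12:16]
    key = []
    for i in range(4):
        word = [a ^ b for a, b in zip(prev[4 * i:4 * i + 4], t)]
        key += word
        t = word                             # each word chains into the next
    return key


def expand_key_128(key):
    """Return the 11 round keys of AES-128 as lists of 16 bytes."""
    keys = [list(key)]
    for rcon in RCONS:
        keys.append(next_round_key_128(keys[-1], rcon))
    return keys
```

Real AES-NI code typically combines the instruction with shuffle and XOR operations on registers; the word-by-word chaining loop here stands in for that step.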

Intel

The following Intel processors support the AES-NI instruction set: [4]

AMD

Several AMD processors support AES instructions:

Hardware acceleration in other architectures

AES support with unprivileged processor instructions is also available in the latest SPARC processors (T3, T4, T5, M5, and forward) and in the latest ARM processors. The SPARC T4 processor, introduced in 2011, has user-level instructions implementing AES rounds. [12] These instructions are in addition to higher-level encryption commands. The ARMv8-A processor architecture, announced in 2011 and implemented in cores including the ARM Cortex-A53 and A57 (but not previous v7 processors like the Cortex A5, 7, 8, 9, 11, 15 [citation needed]), also has user-level instructions which implement AES rounds. [13]

x86 CPUs offering non-AES-NI acceleration interfaces

VIA x86 CPUs and AMD Geode use driver-based accelerated AES handling instead. (See Crypto API (Linux).)

The following chips, while supporting AES hardware acceleration, do not support AES-NI:

ARM architecture

Programming information is available in ARM Architecture Reference Manual ARMv8, for ARMv8-A architecture profile (Section A2.3 "The Armv8 Cryptographic Extension"). [19]

The Marvell Kirkwood was the embedded core of a range of SoCs from Marvell Technology. These SoC CPUs (ARM, mv_cesa in Linux) use driver-based accelerated AES handling. (See Crypto API (Linux).)

RISC-V architecture

Whilst the RISC-V architecture does not include AES-specific instructions, a number of RISC-V chips include integrated AES co-processors. Examples include:

POWER architecture

Since the Power ISA v.2.07, the instructions vcipher and vcipherlast implement one round of AES directly. [29]

IBM z/Architecture

IBM z9 or later mainframe processors support AES as single-opcode (KM, KMC) AES ECB/CBC instructions via IBM's CryptoExpress hardware. [30] These single-instruction AES versions are therefore easier to use than the Intel AES-NI ones, but may not be extended to implement other algorithms based on AES round functions (such as the Whirlpool and Grøstl hash functions).

Other architectures

Performance

In "AES-NI Performance Analyzed", Patrick Schmid and Achim Roos found "impressive results from a handful of applications already optimized to take advantage of Intel's AES-NI capability". [33] A performance analysis using the Crypto++ security library showed AES/GCM throughput improving from approximately 28.0 cycles per byte on a Pentium 4 with no acceleration to 3.5 cycles per byte with AES-NI. [34] [35] [failed verification] [better source needed]
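The cycles-per-byte figures quoted above can be turned into a speedup factor and an approximate throughput with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the 3.0 GHz clock is an assumed value, not one taken from the cited benchmarks.

```python
def speedup(baseline_cpb, accelerated_cpb):
    """Ratio of cycles per byte: how many times fewer cycles are needed."""
    return baseline_cpb / accelerated_cpb


def throughput_mb_per_s(cycles_per_byte, clock_hz):
    """Bytes processed per second at a given clock, in units of 10^6 bytes."""
    return clock_hz / cycles_per_byte / 1e6


ASSUMED_CLOCK_HZ = 3.0e9  # hypothetical clock speed, for illustration only

print(speedup(28.0, 3.5))  # 8.0
print(round(throughput_mb_per_s(3.5, ASSUMED_CLOCK_HZ)))
print(round(throughput_mb_per_s(28.0, ASSUMED_CLOCK_HZ)))
```

Because the two figures come from different processors, the ratio compares cycle counts rather than wall-clock time.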

Supporting software

Most modern compilers can emit AES instructions.
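Software that uses these instructions generally has to confirm at run time that the CPU advertises them before selecting an accelerated code path. As a minimal sketch, assuming a Linux system where `/proc/cpuinfo` lists CPU capabilities on a `flags` line (x86) or a `Features` line (ARM), the check could look like this; the function names are hypothetical.

```python
def cpu_flags(cpuinfo_text):
    """Collect capability tokens from 'flags' or 'Features' lines."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() in ("flags", "features"):
            flags.update(value.split())
    return flags


def has_aes(cpuinfo_text):
    """True if the text advertises an 'aes' capability token."""
    # Tokens are compared whole, so 'vaes' alone does not count as 'aes'.
    return "aes" in cpu_flags(cpuinfo_text)


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the capability listing; the default path is Linux-specific."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return handle.read()
```

On such a system, `has_aes(read_cpuinfo())` reports whether the kernel lists the capability. Other operating systems expose this information differently.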

A lot of security and cryptography software supports the AES instruction set, including the following notable core infrastructure:

Application beyond AES

A fringe use of the AES instruction set involves applying it to block ciphers with a similarly structured S-box, using an affine isomorphism to convert between the two. SM4 and Camellia have been accelerated using AES-NI. [51] [52] The AVX-512 Galois Field New Instructions (GFNI) allow these S-boxes to be implemented more directly. [53]

New cryptographic algorithms have been constructed to specifically use parts of the AES algorithm, so that the AES instruction set can be used for speedups. The AEGIS family, which offers authenticated encryption, runs with at least twice the speed of AES. [54] AEGIS is an "additional finalist for high-performance applications" in the CAESAR Competition. [55]

Notes

  1. The instruction computes four parallel subexpressions of the AES key expansion on the 32-bit words of a double quadword (i.e., an SSE register), using only the words X3 = X[127:96] and X1 = X[63:32]. Two parallel AES S-box substitutions, SubWord(X3) and SubWord(X1), are used in AES-256, and two subexpressions, RotWord(SubWord(X3)) xor RCON and RotWord(SubWord(X1)) xor RCON, are used in AES-128, AES-192, and AES-256.

References

  1. "Intel Software Network". Intel. Archived from the original on 7 April 2008. Retrieved 2008-04-05.
  2. "Intel Architecture Instruction Set Extensions and Future Features Programming Reference". Intel. Retrieved October 16, 2017.
  3. Shay Gueron (2010). "Intel Advanced Encryption Standard (AES) Instruction Set White Paper" (PDF). Intel. Retrieved 2012-09-20.
  4. "Intel Product Specification Advanced Search". Intel ARK.
  5. Shimpi, Anand Lal. "The Sandy Bridge Review: Intel Core i7-2600K, i5-2500K and Core i3-2100 Tested".
  6. "Intel Product Specification Comparison".
  7. "AES-NI support in TrueCrypt (Sandy Bridge problem)". 27 January 2022.
  8. "Some products can support AES New Instructions with a Processor Configuration update, in particular, i7-2630QM/i7-2635QM, i7-2670QM/i7-2675QM, i5-2430M/i5-2435M, i5-2410M/i5-2415M. Please contact OEM for the BIOS that includes the latest Processor configuration update".
  9. "Intel Core i3-2115C Processor (3M Cache, 2.00 GHz) Product Specifications".
  10. "Intel Core i3-4000M Processor (3M Cache, 2.40 GHz) Product Specifications".
  11. "Following Instructions". AMD. November 22, 2010. Archived from the original on November 26, 2010. Retrieved 2011-01-04.
  12. Dan Anderson (2011). "SPARC T4 OpenSSL Engine". Oracle. Retrieved 2012-09-20.
  13. Richard Grisenthwaite (2011). "ARMv8-A Technology Preview" (PDF). ARM. Archived from the original (PDF) on 2018-06-10. Retrieved 2012-09-20.
  14. "AMD Geode LX Processor Family Technical Specifications". AMD.
  15. "VIA Padlock Security Engine". VIA. Archived from the original on 2011-05-15. Retrieved 2011-11-14.
  16. "Cryptographic Hardware Accelerators". OpenWRT.org.
  17. "VIA Eden-N Processors". VIA. Archived from the original on 2011-11-11. Retrieved 2011-11-14.
  18. "VIA C7 Processors". VIA. Archived from the original on 2007-04-19. Retrieved 2011-11-14.
  19. "Arm Architecture Reference Manual Armv8, for Armv8-A architecture profile". ARM. 22 January 2021.
  20. "Security System/Crypto Engine driver status". sunxi.montjoie.ovh.
  21. "Linux Cryptographic Acceleration on an i.MX6" (PDF). Linux Foundation. February 2017. Archived from the original (PDF) on 2019-08-26. Retrieved 2018-05-02.
  22. "Cryptographic module in Snapdragon 805 is FIPS 140-2 certified". Qualcomm.
  23. "RK3128 - Rockchip Wiki". Rockchip wiki. Archived from the original on 2019-01-28. Retrieved 2018-05-02.
  24. "The Samsung Exynos 7420 Deep Dive - Inside A Modern 14nm SoC". AnandTech.
  25. "Sipeed M1 Datasheet v1.1" (PDF). kamami.pl. 2019-03-06. Retrieved 2021-05-03.
  26. "ESP32 Series Datasheet" (PDF). www.espressif.com. 2021-03-19. Retrieved 2021-05-03.
  27. "ESP32-C3 WiFi & BLE RISC-V processor is pin-to-pin compatible with ESP8266". CNX-Software. Retrieved 2020-11-22.
  28. "BL602-Bouffalo Lab (Nanjing) Co., Ltd". www.bouffalolab.com. Archived from the original on 2021-06-18. Retrieved 2021-05-03.
  29. "Power ISA Version 2.07 B". Retrieved 2022-01-07.
  30. "IBM System z10 cryptography". IBM. Retrieved 2014-01-27.
  31. "Using the XMEGA built-in AES accelerator" (PDF). Retrieved 2014-12-03.
  32. "Cavium Networks Launches Industry's Broadest Line of Single and Dual Core MIPS64-based OCTEON Processors Targeting Intelligent Next Generation Networks". Archived from the original on 2017-12-07. Retrieved 2016-09-17.
  33. P. Schmid and A. Roos (2010). "AES-NI Performance Analyzed". Tom's Hardware. Retrieved 2010-08-10.
  34. T. Krovetz, W. Dai (2010). "How to get fast AES calls?". Crypto++ user group. Retrieved 2010-08-11.
  35. "Crypto++ 5.6.0 Pentium 4 Benchmarks". Crypto++ Website. 2009. Archived from the original on 19 September 2010. Retrieved 2010-08-10.
  36. "NonStop SSH Reference Manual". Retrieved 2020-04-09.
  37. "NonStop cF SSL Library Reference Manual". Retrieved 2020-04-09.
  38. "BackBox H4.08 Tape Encryption Option". Retrieved 2020-04-09.
  39. "Intel Advanced Encryption Standard Instructions (AES-NI)". Intel. March 2, 2010. Archived from the original on 7 July 2010. Retrieved 2010-07-11.
  40. "AES-NI enhancements to NSS on Sandy Bridge systems". 2012-05-02. Retrieved 2012-11-25.
  41. "System Administration Guide: Security Services, Chapter 13 Solaris Cryptographic Framework (Overview)". Oracle. September 2010. Retrieved 2012-11-27.
  42. "FreeBSD 8.2 Release Notes". FreeBSD.org. 2011-02-24. Archived from the original on 2011-04-12. Retrieved 2011-12-18.
  43. "OpenSSL: CVS Web Interface".
  44. "Cryptographic Backend (GnuTLS 3.6.14)". gnutls.org. Retrieved 2020-06-26.
  45. "AES-GCM in libsodium". libsodium.org.
  46. "Hardware Acceleration". www.veracrypt.fr.
  47. "aes - The Go Programming Language". golang.org. Retrieved 2020-06-26.
  48. Shimpi, Anand Lal. "The Clarkdale Review: Intel's Core i5 661, i3 540 & i3 530". www.anandtech.com. Retrieved 2020-06-26.
  49. "Bloombase StoreSafe Intelligent Storage Firewall".
  50. "Vormetric Encryption Adds Support for Intel AES-NI Acceleration Technology". 15 May 2012.
  51. Saarinen, Markku-Juhani O. (17 April 2020). "mjosaarinen/sm4ni: Demonstration that AES-NI instructions can be used to implement the Chinese Encryption Standard SM4". GitHub.
  52. Kivilinna, Jussi (2013). Block Ciphers: Fast Implementations on x86-64 Architecture (PDF) (M.Sc.). University of Oulu. pp. 33, 42. Retrieved 2017-06-22.
  53. Kivilinna, Jussi (19 April 2023). "camellia-simd-aesni". GitHub. "Newer x86-64 processors also support Galois Field New Instructions (GFNI) which allow implementing Camellia s-box more straightforward manner and yield even better performance."
  54. Wu, Hongjun; Preneel, Bart. "AEGIS: A Fast Authenticated Encryption Algorithm (v1.1)" (PDF).
  55. Denis, Frank. "The AEGIS Family of Authenticated Encryption Algorithms". cfrg.github.io.