EVEX prefix


The EVEX prefix (enhanced vector extension) and its corresponding coding scheme are an extension to the 32-bit x86 (IA-32) and 64-bit x86-64 (AMD64) instruction set architectures. EVEX is based on, but should not be confused with, the MVEX prefix [1] used by the Knights Corner processor.


The EVEX scheme is a 4-byte extension to the VEX scheme that supports the AVX-512 instruction set and allows addressing the new 512-bit ZMM registers and the new 64-bit operand mask registers.

With Advanced Performance Extensions, the Extended EVEX prefix redefines the semantics of several payload bits. [2]

Features

EVEX coding can address 8 operand mask registers, 16 general-purpose registers and 32 vector registers in 64-bit mode (otherwise, 8 general-purpose and 8 vector), and can support up to 4 operands.

Like the VEX coding scheme, the EVEX prefix unifies existing opcode prefixes and escape codes, memory addressing and operand length modifiers of the x86 instruction set.

The following features are carried over from the VEX scheme:

EVEX also extends VEX with additional capabilities:

For example, the EVEX encoding scheme allows conditional vector addition in the form of

VADDPS zmm1 {k1}{z}, zmm2, zmm3

where the {k1} modifier next to the destination operand encodes the use of opmask register k1 for conditional processing and conditional updates to the destination, and the {z} modifier (encoded by EVEX.z) selects between the two types of masking: zeroing when {z} is present, and merging, the default, when no modifier is attached.
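The difference between merging and zeroing can be sketched in a few lines of Python. This is only an illustrative model, not how hardware or any library implements it: the function name `masked_add` is hypothetical, and four lanes are used instead of the sixteen single-precision lanes of a ZMM register to keep the example short.

```python
def masked_add(dest, src1, src2, mask, zeroing=False):
    """Model a masked element-wise add over equal-length lists.

    Bit i of `mask` plays the role of bit i of the opmask register.
    Lanes whose mask bit is 0 keep the old destination value (merging)
    or are set to zero (zeroing, i.e. the {z} modifier).
    """
    result = []
    for i, (d, a, b) in enumerate(zip(dest, src1, src2)):
        if (mask >> i) & 1:
            result.append(a + b)      # lane enabled: write the sum
        elif zeroing:
            result.append(0.0)        # lane disabled, zeroing-masking
        else:
            result.append(d)          # lane disabled, merging-masking
    return result


dest = [9.0, 9.0, 9.0, 9.0]
a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
k1 = 0b0101                           # lanes 0 and 2 enabled

merged = masked_add(dest, a, b, k1)                # [11.0, 9.0, 33.0, 9.0]
zeroed = masked_add(dest, a, b, k1, zeroing=True)  # [11.0, 0.0, 33.0, 0.0]
```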

Technical description

The EVEX coding scheme uses a code prefix consisting of 4 bytes; the first byte is always 62h and derives from an unused opcode of the 32-bit BOUND instruction, which is not supported in 64-bit mode. [3]

EVEX prefix in the AVX-512 instruction format
Field        [Prefixes]  EVEX  Opcode  ModR/M  [SIB]  [Disp32] / [Disp8 × N]  [Immediate]
# of bytes               4     1       1       1      4 / 1                   1

The ModR/M byte specifies one operand (always a register) with its reg field; the second operand is encoded with the mod and r/m fields and is either a register or a location in memory. Base-plus-index and scale-plus-index addressing require the SIB byte, which encodes a 2-bit scale factor as well as a 3-bit index register and a 3-bit base register. Depending on the addressing mode, a Disp8/Disp16/Disp32 field may follow, holding a displacement that is added to the address.
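As a rough illustration of the field boundaries described above, the following Python sketch splits a ModR/M byte and a SIB byte into their fields. The helper names `split_modrm` and `split_sib` are made up for this example, and the sketch deliberately ignores the extra register bits supplied by the EVEX prefix.

```python
def split_modrm(byte: int) -> tuple[int, int, int]:
    """Return (mod, reg, rm): bits 7-6, bits 5-3 and bits 2-0."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111


def split_sib(byte: int) -> tuple[int, int, int]:
    """Return (scale, index, base); the 2-bit scale field encodes 1, 2, 4 or 8."""
    scale = 1 << ((byte >> 6) & 0b11)
    return scale, (byte >> 3) & 0b111, byte & 0b111


# mod=01 (an 8-bit displacement follows), reg=000, r/m=100 (a SIB byte follows)
modrm_fields = split_modrm(0x44)
# scale field 10 (multiply by 4), index=001, base=000
sib_fields = split_sib(0x88)
```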

The EVEX prefix retains fields introduced in the VEX prefix:

New functions of the existing fields:

There are several new bit fields:

The encoding of the EVEX prefix is as follows:

Bit            7    6    5    4    3    2    1    0
Byte 0 (62h)   0    1    1    0    0    0    1    0
Byte 1 (P0)    R̅    X̅    B̅    R̅’   0    0    m1   m0   P[7:0]
Byte 2 (P1)    W    v̅3   v̅2   v̅1   v̅0   1    p1   p0   P[15:8]
Byte 3 (P2)    z    L’   L    b    V̅’   a2   a1   a0   P[23:16]
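A minimal Python sketch of a decoder for this layout is shown below. It assumes the original AVX-512 layout: P0 holds R̅ X̅ B̅ R̅’ 0 0 m1 m0, P1 holds W v̅3..v̅0 1 p1 p0, and P2 holds z L’ L b V̅’ a2 a1 a0, with the overlined fields stored inverted. The function name `decode_evex` and the dictionary keys are illustrative only, and the example bytes were assembled by hand for `VADDPS zmm1 {k1}{z}, zmm2, zmm3`, so they should be checked against an assembler before being relied on.

```python
def decode_evex(prefix: bytes) -> dict:
    """Split a 4-byte EVEX prefix into logical (already un-inverted) fields."""
    if len(prefix) != 4 or prefix[0] != 0x62:
        raise ValueError("expected 4 bytes starting with 62h")
    p0, p1, p2 = prefix[1], prefix[2], prefix[3]

    def inverted(byte: int, bit: int) -> int:
        # Fields shown with an overline are stored in inverted form.
        return ((byte >> bit) & 1) ^ 1

    return {
        "R": inverted(p0, 7),
        "X": inverted(p0, 6),
        "B": inverted(p0, 5),
        "R'": inverted(p0, 4),
        "mm": p0 & 0b11,                 # opcode map selector
        "W": (p1 >> 7) & 1,
        "vvvv": (~p1 >> 3) & 0b1111,     # inverted 4-bit register specifier
        "pp": p1 & 0b11,                 # implied legacy prefix
        "z": (p2 >> 7) & 1,              # 1 = zeroing, 0 = merging
        "L'L": (p2 >> 5) & 0b11,         # vector length
        "b": (p2 >> 4) & 1,
        "V'": inverted(p2, 3),
        "aaa": p2 & 0b111,               # opmask register number
    }


# Hand-assembled prefix for: VADDPS zmm1 {k1}{z}, zmm2, zmm3
fields = decode_evex(bytes([0x62, 0xF1, 0x6C, 0xC9]))
# fields["vvvv"] is 2 (zmm2), fields["aaa"] is 1 (k1), fields["z"] is 1
```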

The following table lists possible register addressing combinations (bit 4 is always zero when encoding the 16 general purpose registers):

Register addressing in 64-bit mode using EVEX prefix
Addressing mode        Bit 4     Bit 3     Bits [2:0]    Register type            Common usage
REG                    EVEX.R’   EVEX.R    ModRM.reg     General purpose, vector  Register operand
RM (if ModRM.mod=11)   EVEX.X    EVEX.B    ModRM.r/m     GPR, vector              Register operand
RM                     0         EVEX.B    ModRM.r/m     GPR                      Register memory address
BASE                   0         EVEX.B    SIB.base      GPR                      Base + index × scale memory address
INDEX                  0         EVEX.X    SIB.index     GPR                      Base + index × scale memory address
VIDX                   EVEX.V’   EVEX.X    SIB.index     Vector                   Base + vector index × scale memory address
NDS/NDD                EVEX.V’   EVEX.v3   EVEX.v2v1v0   GPR, vector              Register operand
K                      0         0         EVEX.a2a1a0   Mask                     Mask register operand
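The REG, NDS/NDD and register-form RM rows of the table can be sketched as follows. This is an illustrative Python model, assuming the original AVX-512 payload layout in which R, X, B, R’, v3..v0 and V’ are stored inverted; the function name `operand_registers` is hypothetical, and the example bytes were derived by hand for `VADDPS zmm25, zmm17, zmm30` rather than taken from an assembler.

```python
def operand_registers(p0: int, p1: int, p2: int, modrm: int):
    """Return (reg, nds, rm) as 5-bit register numbers.

    rm is None when ModRM.mod != 11, i.e. when r/m names a memory operand.
    """
    def inverted(byte: int, bit: int) -> int:
        return ((byte >> bit) & 1) ^ 1

    r, x, b = inverted(p0, 7), inverted(p0, 6), inverted(p0, 5)
    r_hi = inverted(p0, 4)                 # EVEX.R'
    vvvv = (~p1 >> 3) & 0b1111             # EVEX.v3v2v1v0
    v_hi = inverted(p2, 3)                 # EVEX.V'

    reg = (r_hi << 4) | (r << 3) | ((modrm >> 3) & 0b111)   # REG row
    nds = (v_hi << 4) | vvvv                                # NDS/NDD row
    rm = None
    if (modrm >> 6) == 0b11:                                # RM row, register form
        rm = (x << 4) | (b << 3) | (modrm & 0b111)
    return reg, nds, rm


# Hand-derived bytes for: VADDPS zmm25, zmm17, zmm30 -> 62 01 74 40 58 CE
regs = operand_registers(0x01, 0x74, 0x40, 0xCE)   # (25, 17, 30)
```

Note how EVEX.X serves as bit 4 of the register-form RM operand, since no SIB index register is in use in that case.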

A few VEX-encoded AVX blending instructions have 4 operands. To accommodate this, VEX has the IS4 addressing mode, which encodes the 4th operand (a vector register) in bits Imm8[7:4] of the immediate constant. Similar EVEX-encoded blend instructions take their 4th operand in a mask register instead; no EVEX-encoded instruction uses the IS4 addressing mode.

Extended EVEX prefix

Intel Advanced Performance Extensions (APX) introduce several new variants of the 3-byte payload in the EVEX prefix, which are used to encode the extended general-purpose registers (EGPRs) R16-R31 and new conditional instructions.

EVEX extension of EVEX instructions:

Bit            7    6    5    4    3    2    1    0
Byte 0 (62h)   0    1    1    0    0    0    1    0
Byte 1 (P0)    R̅3   X̅3   B̅3   R̅4   B4   m2   m1   m0   P[7:0]
Byte 2 (P1)    W    v̅3   v̅2   v̅1   v̅0   X̅4   p1   p0   P[15:8]
Byte 3 (P2)    z    L’   L    b    V̅4   a2   a1   a0   P[23:16]

EVEX extension of VEX instructions:

Bit            7    6    5    4    3    2    1    0
Byte 0 (62h)   0    1    1    0    0    0    1    0
Byte 1 (P0)    R̅3   X̅3   B̅3   R̅4   B4   m2   m1   m0   P[7:0]
Byte 2 (P1)    W    v̅3   v̅2   v̅1   v̅0   X̅4   p1   p0   P[15:8]
Byte 3 (P2)    0    0    L    0    V̅4   NF   0    0    P[23:16]

EVEX extension for legacy instructions:

Bit            7    6    5    4    3    2    1    0
Byte 0 (62h)   0    1    1    0    0    0    1    0
Byte 1 (P0)    R̅3   X̅3   B̅3   R̅4   B4   1    0    0    P[7:0]
Byte 2 (P1)    W    v̅3   v̅2   v̅1   v̅0   X̅4   p1   p0   P[15:8]
Byte 3 (P2)    0    0    0    ND   V̅4   NF   0    0    P[23:16]

EVEX prefix for conditional CMP and TEST:

Bit            7    6    5    4      3     2     1     0
Byte 0 (62h)   0    1    1    0      0     0     1     0
Byte 1 (P0)    R̅3   X̅3   B̅3   R̅4     B4    1     0     0     P[7:0]
Byte 2 (P1)    W    OF   SF   ZF     CF    X̅4    p1    p0    P[15:8]
Byte 3 (P2)    0    0    0    ND=0   SC3   SC2   SC1   SC0   P[23:16]
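The fields specific to this last variant can be pulled out of the payload as in the Python sketch below. It assumes the layout shown above (OF, SF, ZF and CF in bits 6 to 3 of P1, and the four SC bits in bits 3 to 0 of P2) and also assumes that these fields are stored without inversion; the function name `decode_ccmp_payload` and the sample byte values are made up for illustration and do not correspond to any particular assembled instruction.

```python
def decode_ccmp_payload(p1: int, p2: int) -> dict:
    """Extract the conditional CMP/TEST fields from payload bytes P1 and P2."""
    return {
        "W": (p1 >> 7) & 1,
        "OF": (p1 >> 6) & 1,    # flag values occupy the former vvvv positions
        "SF": (p1 >> 5) & 1,
        "ZF": (p1 >> 4) & 1,
        "CF": (p1 >> 3) & 1,
        "pp": p1 & 0b11,
        "ND": (p2 >> 4) & 1,    # shown as ND=0 for this variant
        "SC": p2 & 0b1111,      # SC3..SC0 as one 4-bit value
    }


# Arbitrary sample payload bytes: P1 = 0101 0100b, P2 = 0000 0101b
ccmp_fields = decode_ccmp_payload(0x54, 0x05)
# OF=1, SF=0, ZF=1, CF=0, SC=5
```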

When the new EGPRs and operand destinations can be encoded by either the extended EVEX prefix or the REX2 prefix, the latter is preferred.


References

  1. Intel® Xeon Phi™ Coprocessor Instruction Set Architecture Reference Manual (PDF). Sep 7, 2012. p. 42. 327364-001. Archived (PDF) from the original on Aug 4, 2021.
  2. Intel® Advanced Performance Extensions (Intel® APX) Architecture Specification (PDF) (2 ed.). August 2023. p. 21. 355828-002US. Archived (PDF) from the original on Sep 10, 2023.
  3. Intel Corporation (July 2013). "Intel Architecture Instruction Set Extensions Programming Reference".