EVEX prefix
The EVEX prefix (enhanced vector extension) and its corresponding coding scheme are an extension to the 32-bit x86 (IA-32) and 64-bit x86-64 (AMD64) instruction set architecture. EVEX is based on, but should not be confused with, the MVEX prefix used by the Knights Corner processor.
The EVEX scheme is a 4-byte extension to the VEX scheme which supports the AVX-512 instruction set and allows addressing new 512-bit ZMM registers and new 64-bit operand mask registers.
Features
EVEX coding can address 8 operand mask registers, 16 general-purpose registers and 32 vector registers in 64-bit mode (otherwise, 8 general-purpose and 8 vector), and can support up to 4 operands.
Like the VEX coding scheme, the EVEX prefix unifies the existing opcode prefixes and escape codes, memory addressing, and operand length modifiers of the x86 instruction set.
The following features are carried over from the VEX scheme:
- Direct encoding of three SIMD registers (XMM, YMM, or ZMM) as source operands (MMX or x87 registers are not supported);
- Compacted REX prefix for 64-bit mode;
- Compacted SIMD prefix (66H, F2H, F3H), escape opcode (0FH) and two-byte escape (0F38H, 0F3AH);
- Less strict memory alignment requirements for memory operands.
EVEX also extends VEX with additional capabilities:
- Extended SIMD register encoding: a total of 32 new 512-bit SIMD registers ZMM0-ZMM31 in 64-bit mode;
- Operand mask encoding: 8 new 64-bit opmask registers k0-k7 for conditional execution and merging of destination operands;
- Broadcasting from source to destination for instructions that take a memory vector as a source operand: the second operand is broadcast before being used in the actual operation;
- Direct embedded rounding control for instructions that operate on floating-point SIMD registers with rounding semantics;
- Embedded exceptions control for floating-point instructions without rounding semantics;
- Compressed displacement (DISP8*N), new memory addressing mode to improve encoding density of instruction byte stream; the scale factor N depends on vector length and broadcast mode.
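The compressed displacement rule in the last item can be sketched as follows. This is a minimal illustration, not code from any assembler: the function names `encode_disp8n` and `decode_disp8n` are hypothetical, and the value N = 64 used in the usage note assumes a full 512-bit memory operand without broadcast.

```python
def encode_disp8n(displacement: int, n: int):
    """Return the signed 8-bit value to store if `displacement` fits the
    compressed form, or None if a full 32-bit displacement is needed."""
    if n <= 0:
        raise ValueError("scale factor N must be positive")
    if displacement % n != 0:
        return None  # not a multiple of N, so it cannot be compressed
    quotient = displacement // n
    if -128 <= quotient <= 127:
        return quotient  # fits in one signed byte
    return None


def decode_disp8n(stored: int, n: int) -> int:
    """Expand a stored signed 8-bit value back to the byte displacement."""
    return stored * n
```

With N = 64, a displacement of 128 bytes is stored as 2, while a displacement of 100 bytes is not a multiple of N and falls back to the 32-bit form.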
For example, the EVEX encoding scheme allows conditional vector addition in the form of
VADDPS zmm1 {k1}{z}, zmm2, zmm3
where the {k1} modifier next to the destination operand selects opmask register k1 for conditional processing and updates to the destination, and the {z} modifier (encoded by EVEX.z) selects zeroing instead of merging masking; merging is the default when no modifier is attached.
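The difference between merging and zeroing can be modelled element by element. The following is a rough sketch of the semantics only, using a hypothetical helper `masked_add` and four lanes instead of the sixteen single-precision lanes of a ZMM register:

```python
def masked_add(dest, src1, src2, mask, zeroing=False):
    """Model a masked vector add on Python lists.

    Bit i of `mask` controls lane i. Lanes whose mask bit is clear keep
    the old destination value (merging) or become 0.0 (zeroing).
    """
    result = []
    for i, (old, a, b) in enumerate(zip(dest, src1, src2)):
        if (mask >> i) & 1:
            result.append(a + b)   # lane enabled: store the sum
        elif zeroing:
            result.append(0.0)     # {z}: disabled lanes are cleared
        else:
            result.append(old)     # default: disabled lanes are preserved
    return result
```

For a destination of `[9.0, 9.0, 9.0, 9.0]`, sources `[1.0, 2.0, 3.0, 4.0]` and `[10.0, 20.0, 30.0, 40.0]`, and mask `0b0101`, merging gives `[11.0, 9.0, 33.0, 9.0]` and zeroing gives `[11.0, 0.0, 33.0, 0.0]`.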
Technical description
The EVEX coding scheme uses a code prefix consisting of 4 bytes; the first byte is always 62h and derives from an unused opcode of the 32-bit BOUND instruction, which is not supported in 64-bit mode. [1]
# of bytes | 4 | 1 | 1 | 1 | 4 / 1 | 1 |
---|---|---|---|---|---|---|
[Prefixes] | EVEX | Opcode | ModR/M | [SIB] | [Disp32] / [Disp8*N] | [Immediate] |
The ModR/M byte specifies the addressing of the source operand with its mod and r/m fields, which encode either 8 registers or 24 addressing modes; the destination register is encoded in the reg field. Base-plus-index and scale-plus-index addressing require the SIB byte, which encodes a 2-bit scale factor as well as a 3-bit index register and a 3-bit base register. In certain SIB encodings, Disp32 contains a displacement that is added to the base address.
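As an illustration, the two bytes can be split into their fields with a few shifts and masks. The helper names below are hypothetical; the bit positions (mod in bits 7:6, reg in bits 5:3, r/m in bits 2:0, and likewise scale, index, base for SIB) are the standard x86 layout and are not spelled out in the text above.

```python
def split_modrm(byte: int) -> dict:
    """Split a ModR/M byte into its mod, reg and r/m fields."""
    return {
        "mod": (byte >> 6) & 0b11,   # 0b11 selects a register operand
        "reg": (byte >> 3) & 0b111,
        "rm": byte & 0b111,
    }


def split_sib(byte: int) -> dict:
    """Split a SIB byte; the 2-bit scale field is returned as 1, 2, 4 or 8."""
    return {
        "scale": 1 << ((byte >> 6) & 0b11),
        "index": (byte >> 3) & 0b111,
        "base": byte & 0b111,
    }
```

For example, the ModR/M byte CBh splits into mod = 3, reg = 1 and r/m = 3. Each field is only 3 bits wide, which is why the prefix bits described below are needed to reach more than 8 registers.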
The EVEX prefix retains some fields introduced in the VEX prefix:
- Four bits R, X, B, and W from the REX prefix. W expands the operand size to 64 bits or serves as an additional opcode bit, R expands reg, B expands r/m or reg, and X and B expand index and base in the SIB byte. As in the VEX prefix, R, X, and B are stored in inverted form.
- Four bits named v (vvvv), specifying a second non-destructive source register operand. As in the VEX prefix, vvvv is stored in inverted form.
- Bit L specifying 256-bit vector length.
- Two bits named p to replace operand size prefixes and operand type prefixes (66, F2, F3).
- Two of the m bits for replacing existing escape codes (0F, 0F 38 and 0F 3A).
New functions of the existing fields:
- Bit X now expands r/m along with bit B when the SIB byte is not present, which allows 32 SIMD registers.
There are several new bit fields:
- Three bits named a, specifying the operand mask register (k0-k7) for vector instructions.
- Bit z for specifying the masking mode (merging or zeroing).
- Bit b for source broadcast, rounding control (combined with L’L), or exception suppression.
- Bit L’ for specifying 512-bit vector length, or rounding control mode when combined with L.
- Bit R’ for expanding reg, which allows for 32 SIMD registers. Like the R bit, R’ is provided in inverted form.
- Bit V’ is an additional source register index. Like the vvvv bits, V’ is provided in inverted form.
The encoding of the EVEX prefix is as follows:
Byte | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | Prefix bits |
---|---|---|---|---|---|---|---|---|---|
Byte 0 (62h) | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | |
Byte 1 (P0) | R | X | B | R’ | 0 | 0 | m1 | m0 | P[7:0] |
Byte 2 (P1) | W | v3 | v2 | v1 | v0 | 1 | p1 | p0 | P[15:8] |
Byte 3 (P2) | z | L’ | L | b | V’ | a2 | a1 | a0 | P[23:16] |
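The layout above can be turned into a small decoder. The following is a sketch that follows this table only; the function name `decode_evex` and the dictionary keys are hypothetical, and no attempt is made to interpret the fields beyond undoing the inversion of R, X, B, R’, V’ and vvvv.

```python
def decode_evex(prefix: bytes) -> dict:
    """Split a 4-byte EVEX prefix into named fields, per the table above."""
    if len(prefix) != 4 or prefix[0] != 0x62:
        raise ValueError("expected 4 bytes starting with 62h")
    p0, p1, p2 = prefix[1], prefix[2], prefix[3]
    # P0 bits 3:2 are fixed at 0 and P1 bit 2 is fixed at 1 in this layout.
    if (p0 & 0b0000_1100) != 0 or (p1 & 0b0000_0100) == 0:
        raise ValueError("fixed bits of P0/P1 do not match the table")

    def inverted(bit: int) -> int:
        # These bits are stored inverted; flip them back.
        return bit ^ 1

    return {
        "R": inverted((p0 >> 7) & 1),
        "X": inverted((p0 >> 6) & 1),
        "B": inverted((p0 >> 5) & 1),
        "R'": inverted((p0 >> 4) & 1),
        "mm": p0 & 0b11,
        "W": (p1 >> 7) & 1,
        "vvvv": (~(p1 >> 3)) & 0b1111,  # also stored inverted
        "pp": p1 & 0b11,
        "z": (p2 >> 7) & 1,
        "L'L": (p2 >> 5) & 0b11,
        "b": (p2 >> 4) & 1,
        "V'": inverted((p2 >> 3) & 1),
        "aaa": p2 & 0b111,
    }
```

As a usage example, the prefix bytes 62 F1 6C C9 decode to vvvv = 2, aaa = 1, z = 1 and L’L = 10b, with all register extension bits clear. These bytes were assembled by hand from the table for an instruction of the form VADDPS zmm1 {k1}{z}, zmm2, zmm3 and should be treated as illustrative rather than as assembler output.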
The following table lists allowed register addressing combinations (bit 4 is always zero when encoding the 16 general purpose registers):
Addressing mode | Bit 4 | Bit 3 | Bits [2:0] | Register type | Common usage |
---|---|---|---|---|---|
REG | EVEX.R’ | EVEX.R | ModRM.reg | GPR, Vector | Destination or Source |
NDS/NDD | EVEX.V’ | EVEX.v3 | EVEX.v2v1v0 | GPR, Vector | 2nd Source or Destination |
RM | EVEX.X | EVEX.B | ModRM.r/m | GPR, Vector | 1st Source or Destination |
BASE | 0 | EVEX.B | ModRM.r/m | GPR | Memory addressing |
INDEX | 0 | EVEX.X | SIB.index | GPR | Memory addressing |
VIDX | EVEX.V’ | EVEX.X | SIB.index | Vector | VSIB memory addressing |
IS4 | Imm8[3] | Imm8[7] | Imm8[6:4] | Vector | 3rd Source |
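The REG, RM and NDS/NDD rows of the table amount to concatenating one or two prefix bits with a 3-bit or 4-bit field. A minimal sketch, assuming the prefix bits are passed exactly as stored in the instruction (that is, still inverted) and using hypothetical helper names:

```python
def register_index(bit4: int, bit3: int, low3: int) -> int:
    """Combine bit 4, bit 3 and bits [2:0] into a 5-bit register number."""
    return ((bit4 & 1) << 4) | ((bit3 & 1) << 3) | (low3 & 0b111)


def reg_operand(stored_r_prime: int, stored_r: int, modrm_reg: int) -> int:
    """REG row: EVEX.R' and EVEX.R (stored inverted) extend ModRM.reg."""
    return register_index(stored_r_prime ^ 1, stored_r ^ 1, modrm_reg)


def rm_operand(stored_x: int, stored_b: int, modrm_rm: int) -> int:
    """RM row: EVEX.X and EVEX.B (stored inverted) extend ModRM.r/m."""
    return register_index(stored_x ^ 1, stored_b ^ 1, modrm_rm)


def vvvv_operand(stored_v_prime: int, stored_vvvv: int) -> int:
    """NDS/NDD row: EVEX.V' and EVEX.vvvv are both stored inverted."""
    low4 = (~stored_vvvv) & 0b1111
    return ((stored_v_prime ^ 1) << 4) | low4
```

For example, `reg_operand(0, 0, 7)` yields register 31, while `reg_operand(1, 1, 1)` yields register 1, because a stored 1 in an inverted bit contributes 0 to the index.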
References
- Intel Corporation (July 2013). "Intel Architecture Instruction Set Extensions Programming Reference".