EVEX prefix

The EVEX prefix (enhanced vector extension) and its corresponding coding scheme are an extension to the 32-bit x86 (IA-32) and 64-bit x86-64 (AMD64) instruction set architectures. EVEX is based on, but should not be confused with, the MVEX prefix used by the Knights Corner processor.

The EVEX scheme is a 4-byte extension to the VEX scheme which supports the AVX-512 instruction set and allows addressing new 512-bit ZMM registers and new 64-bit operand mask registers.

Features

EVEX coding can address 8 operand mask registers, 16 general-purpose registers and 32 vector registers in 64-bit mode (otherwise, 8 general-purpose and 8 vector), and can support up to 4 operands.

Like the VEX coding scheme, the EVEX prefix unifies the existing opcode prefixes and escape codes, memory addressing and operand length modifiers of the x86 instruction set.

The following features are carried over from the VEX scheme:

Direct encoding of three SIMD registers (XMM, YMM, or ZMM) as source operands (MMX or x87 registers are not supported);
Compacted REX prefix for 64-bit mode;
Compacted SIMD prefix (66H, F2H, F3H), escape opcode (0FH) and two-byte escape (0F38H, 0F3AH);
Less strict alignment requirements for memory operands.

EVEX also extends VEX with additional capabilities:

Extended SIMD register encoding: a total of 32 new 512-bit SIMD registers ZMM0-ZMM31 in 64-bit mode;
Operand mask encoding: 8 new 64-bit opmask registers k0-k7 for conditional execution and merging of destination operands;
Broadcasting from source to destination for instructions that take a memory vector as a source operand: the second source operand is broadcast before being used in the actual operation;
Direct embedded rounding control for instructions that operate on floating-point SIMD registers with rounding semantics;
Embedded exceptions control for floating-point instructions without rounding semantics;
Compressed displacement (DISP8*N), a new memory addressing mode that improves the encoding density of the instruction byte stream: the stored 8-bit displacement is multiplied by a scale factor N that depends on vector length and broadcast mode.
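As a rough illustration of compressed displacement, the following Python sketch chooses between the DISP8*N and Disp32 forms for a given byte offset. The function names are hypothetical. The rule used for N (element size under broadcast, full vector size otherwise) is assumed to hold for full-vector memory operands only; other operand layouts may use different factors.

```python
def disp8n_scale(vector_bytes, broadcast, element_bytes):
    """Scale factor N, assuming a full-vector memory operand."""
    # With broadcast a single element is read, so N is the element size;
    # otherwise the whole vector is read, so N is the vector size.
    return element_bytes if broadcast else vector_bytes


def encode_displacement(displacement, n):
    """Pick the compressed form when the offset is a multiple of N
    and the quotient fits in a signed byte; otherwise fall back to Disp32."""
    if displacement % n == 0 and -128 <= displacement // n <= 127:
        return ("disp8", displacement // n)
    return ("disp32", displacement)


# 512-bit load without broadcast: N = 64, so offset 256 is stored as 4.
n = disp8n_scale(vector_bytes=64, broadcast=False, element_bytes=4)
print(encode_displacement(256, n))
```

An offset that is not a multiple of N, or whose quotient falls outside the signed-byte range, needs the full 4-byte displacement in this sketch.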

For example, the EVEX encoding scheme allows conditional vector addition in the form of

VADDPS zmm1 {k1}{z}, zmm2, zmm3

where the {k1} modifier next to the destination operand selects opmask register k1 for conditional processing and updates to the destination, and the {z} modifier (encoded by EVEX.z) selects zeroing-masking instead of merging-masking, which is the default when no modifier is attached.
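The difference between the two masking types can be sketched in Python. This is only a behavioural model under simplifying assumptions: the function name is hypothetical, vectors are plain lists, four lanes stand in for the sixteen single-precision lanes of a ZMM register, and Python floats are double precision.

```python
def masked_add(dest, src1, src2, mask, zeroing):
    """Model of a masked element-wise add.

    Lane i is computed when bit i of mask is set. Otherwise the lane is
    zeroed (zeroing-masking) or keeps its old value (merging-masking).
    """
    result = []
    for i, (old, a, b) in enumerate(zip(dest, src1, src2)):
        if (mask >> i) & 1:
            result.append(a + b)
        elif zeroing:
            result.append(0.0)
        else:
            result.append(old)
    return result


dest = [9.0, 9.0, 9.0, 9.0]
src1 = [1.0, 2.0, 3.0, 4.0]
src2 = [10.0, 20.0, 30.0, 40.0]
print(masked_add(dest, src1, src2, mask=0b0101, zeroing=False))  # merging
print(masked_add(dest, src1, src2, mask=0b0101, zeroing=True))   # zeroing
```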

Technical description

The EVEX coding scheme uses a code prefix consisting of 4 bytes; the first byte is always 62h, which is the opcode of the 32-bit BOUND instruction. BOUND is not supported in 64-bit mode, and EVEX reuses an otherwise unused form of its encoding.[1]

EVEX Prefix in the AVX-512 Instruction Format
# of bytes	4	1	1	1	4 / 1	1
[Prefixes]	EVEX	Opcode	ModR/M	[SIB]	[Disp32] / [Disp8*N]	[Immediate]

The mod and r/m fields of the ModR/M byte specify one operand, usually a source, and together encode either one of 8 registers or one of 24 addressing modes; the reg field encodes another register operand, usually the destination. Base-plus-index and scale-plus-index addressing require the SIB byte, which encodes a 2-bit scale factor as well as 3-bit index and 3-bit base registers. In certain SIB encodings, Disp32 contains a displacement that is added to the base address.
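The field layout of these two bytes can be sketched as follows. The helper names are hypothetical; the bit positions are those of the standard x86 ModR/M and SIB bytes, and the SIB rule shown applies to 32-bit and 64-bit addressing.

```python
def split_modrm(byte):
    """Split a ModR/M byte into mod (bits 7:6), reg (5:3) and r/m (2:0)."""
    return {"mod": byte >> 6, "reg": (byte >> 3) & 7, "rm": byte & 7}


def split_sib(byte):
    """Split a SIB byte; the 2-bit scale field encodes a factor of 1, 2, 4 or 8."""
    return {"scale": 1 << (byte >> 6), "index": (byte >> 3) & 7, "base": byte & 7}


def needs_sib(modrm):
    """A SIB byte follows when the operand is in memory (mod != 3) and r/m is 4."""
    return modrm["mod"] != 3 and modrm["rm"] == 4


print(split_modrm(0xCB))  # register form: mod = 3 selects one of 8 registers
print(split_sib(0x88))    # scale factor 4, index 1, base 0
```

The three memory values of mod combined with the eight values of r/m give the 24 addressing modes mentioned above.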

The EVEX prefix retains some fields introduced in the VEX prefix:

Four bits R, X, B, and W from the REX prefix. W expands the operand size to 64 bits or serves as an additional opcode bit, R expands reg, B expands r/m or reg, and X and B expand index and base in the SIB byte. As in the VEX prefix, R, X and B are stored in inverted form.
Four bits named v, specifying a second non-destructive source register operand. As in the VEX prefix, vvvv is stored in inverted form.
Bit L specifying 256-bit vector length.
Two bits named p to replace operand size prefixes and operand type prefixes (66, F2, F3).
Two of the m bits for replacing existing escape codes (0F, 0F 38 and 0F 3A).

New functions of the existing fields:

Bit X now expands r/m along with bit B when the SIB byte is not present, which allows 32 SIMD registers.

There are several new bit fields:

Three bits named a, specifying the operand mask register (k0-k7) for vector instructions.
Bit z for specifying the masking mode (merging or zeroing).
Bit b for source broadcast, rounding control (combined with L’L), or exception suppression.
Bit L’ for specifying 512-bit vector length, or rounding control mode when combined with L.
Bit R’ for expanding reg, which allows for 32 SIMD registers. Like the R bit, R’ is stored in inverted form.
Bit V’ for expanding the source register index encoded by vvvv. Like the vvvv bits, V’ is stored in inverted form.
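How b and L’L combine can be sketched in Python. The function and table names are hypothetical, and whether an instruction has rounding semantics is passed in as a flag because it depends on the opcode. The rounding-mode table (round to nearest, down, up, toward zero for L’L = 00, 01, 10, 11) and the treatment of L’L = 11 as unsupported for vector length are stated here as assumptions of the sketch.

```python
VECTOR_BITS = {0b00: 128, 0b01: 256, 0b10: 512}
ROUNDING = {0b00: "rn-sae", 0b01: "rd-sae", 0b10: "ru-sae", 0b11: "rz-sae"}


def interpret_b_ll(b, ll, memory_operand, has_rounding):
    """Interpret the b bit and the 2-bit L'L field (L' is the high bit)."""
    if b and memory_operand:
        # b with a memory source requests broadcast; L'L is still the length.
        return {"broadcast": True, "vector_bits": VECTOR_BITS[ll]}
    if b and has_rounding:
        # b with register operands turns L'L into a static rounding mode.
        return {"rounding": ROUNDING[ll]}
    if b:
        # No rounding semantics: b only suppresses exceptions.
        return {"sae": True}
    return {"vector_bits": VECTOR_BITS[ll]}


print(interpret_b_ll(0, 0b10, memory_operand=False, has_rounding=True))
print(interpret_b_ll(1, 0b01, memory_operand=False, has_rounding=True))
```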

The encoding of the EVEX prefix is as follows:

	7	6	5	4	3	2	1	0
Byte 0 (62h)	0	1	1	0	0	0	1	0
Byte 1 (P0)	R	X	B	R’	0	0	m₁	m₀	P[7:0]
Byte 2 (P1)	W	v₃	v₂	v₁	v₀	1	p₁	p₀	P[15:8]
Byte 3 (P2)	z	L’	L	b	V’	a₂	a₁	a₀	P[23:16]
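The table can be turned into a small decoder. The Python sketch below uses a hypothetical function name and returns the inverted fields (R, X, B, R’, vvvv, V’) already flipped back to their logical values. The fixed-bit check follows the layout in the table only. The sample bytes were assembled by hand from the table for VADDPS zmm1 {k1}{z}, zmm2, zmm3 and should be treated as an illustration.

```python
def decode_evex(prefix):
    """Decode a 4-byte EVEX prefix (62h, P0, P1, P2) into its fields."""
    if len(prefix) != 4 or prefix[0] != 0x62:
        raise ValueError("not an EVEX prefix")
    p0, p1, p2 = prefix[1], prefix[2], prefix[3]
    # Per the table: P0 bits 3:2 are 0 and P1 bit 2 is 1.
    if p0 & 0x0C or not p1 & 0x04:
        raise ValueError("fixed bits do not match the EVEX layout")
    return {
        "R": ((p0 >> 7) & 1) ^ 1,          # stored inverted
        "X": ((p0 >> 6) & 1) ^ 1,          # stored inverted
        "B": ((p0 >> 5) & 1) ^ 1,          # stored inverted
        "R'": ((p0 >> 4) & 1) ^ 1,         # stored inverted
        "mm": p0 & 3,
        "W": p1 >> 7,
        "vvvv": ((p1 >> 3) & 0xF) ^ 0xF,   # stored inverted
        "pp": p1 & 3,
        "z": p2 >> 7,
        "L'L": (p2 >> 5) & 3,
        "b": (p2 >> 4) & 1,
        "V'": ((p2 >> 3) & 1) ^ 1,         # stored inverted
        "aaa": p2 & 7,
    }


fields = decode_evex(bytes([0x62, 0xF1, 0x6C, 0xC9]))
print(fields)
```

For these bytes the decoder reports vvvv = 2 (zmm2), aaa = 1 (k1), z = 1 (zeroing) and L’L = 2 (512-bit), matching the operands of the example instruction.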

The following table lists the allowed register addressing combinations (bit 4 is always zero when encoding the 16 general-purpose registers):

Register addressing in 64-bit mode using EVEX prefix
Addressing mode	Bit 4	Bit 3	Bits [2:0]	Register type	Common usage
REG	EVEX.R’	EVEX.R	ModRM.reg	GPR, Vector	Destination or Source
NDS/NDD	EVEX.V’	EVEX.v₃v₂v₁v₀		GPR, Vector	2nd Source or Destination
RM	EVEX.X	EVEX.B	ModRM.r/m	GPR, Vector	1st Source or Destination
BASE	0	EVEX.B	ModRM.r/m	GPR	Memory addressing
INDEX	0	EVEX.X	SIB.index	GPR	Memory addressing
VIDX	EVEX.V’	EVEX.X	SIB.index	Vector	VSIB memory addressing
IS4	Imm8[3]	Imm8[7:4]		Vector	3rd Source
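The first three rows of the table can be sketched as bit concatenations. The function names are hypothetical, and the arguments are assumed to be logical values, that is, with the inversion of R, R’, X, B, vvvv and V’ already undone.

```python
def reg_index(r_prime, r, modrm_reg):
    """REG row: bit 4 = R', bit 3 = R, bits 2:0 = ModRM.reg."""
    return (r_prime << 4) | (r << 3) | modrm_reg


def nds_index(v_prime, vvvv):
    """NDS/NDD row: bit 4 = V', bits 3:0 = vvvv."""
    return (v_prime << 4) | vvvv


def rm_index(x, b, modrm_rm):
    """RM row (register operand, no SIB byte): bit 4 = X, bit 3 = B, bits 2:0 = ModRM.r/m."""
    return (x << 4) | (b << 3) | modrm_rm


print(reg_index(1, 1, 7))   # highest vector register, zmm31
print(nds_index(0, 2))      # zmm2
print(rm_index(1, 0, 3))    # zmm19
```

With bit 4 held at zero, the same functions cover the 16 general-purpose registers.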

References

[1] Intel Corporation (July 2013). "Intel Architecture Instruction Set Extensions Programming Reference".

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.
