SIMD
Overview
Single Instruction, Multiple Data (SIMD) instructions apply one operation to several data elements at once (vectorized computation). The sections below cover the x86 SIMD instruction set extensions.
MMX
- Introduced in 1997.
- 57 instructions.
- 2-operand instructions.
- 8 registers: MM0–MM7.
- 64-bit registers.
A single instruction can be applied to:
- one 64-bit integer or
- two 32-bit integers or
- four 16-bit integers or
- eight 8-bit integers.
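The lane views listed above can be illustrated by reinterpreting the same 8 bytes at different widths. This is a plain-Python model using the standard `struct` module, not real MMX code. The helper name `lanes` is made up for this sketch, and little-endian byte order is assumed because x86 is little-endian.

```python
import struct

# struct format codes for unsigned integers of each lane width (in bits).
_FORMATS = {64: "Q", 32: "I", 16: "H", 8: "B"}

def lanes(value64, lane_bits):
    """Split a 64-bit value into unsigned lanes, lowest lane first."""
    raw = struct.pack("<Q", value64)   # the 8 bytes held by the register
    count = 64 // lane_bits            # number of lanes at this width
    return list(struct.unpack(f"<{count}{_FORMATS[lane_bits]}", raw))

reg = 0x0807060504030201
print(lanes(reg, 64))  # one 64-bit lane
print(lanes(reg, 32))  # two 32-bit lanes
print(lanes(reg, 16))  # four 16-bit lanes
print(lanes(reg, 8))   # eight 8-bit lanes: [1, 2, 3, 4, 5, 6, 7, 8]
```

The register contents never change; only the lane width chosen by the instruction decides how the bits are grouped.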
Issues:
- Re-used (aliased) the existing x87 floating point registers.
- As a result, unable to work on both x87 floating point and MMX data at the same time.
- Only supported operations on integers.
SSE
Streaming SIMD Extensions
- Introduced in 1999.
- 70 instructions.
- 8 new registers: XMM0–XMM7
- 128-bit registers.
A single instruction can be applied to:
- four 32-bit single-precision floating point numbers.
SSE2 later extended the XMM registers so that a single instruction can also be applied to:
- two 64-bit double-precision floating point numbers or
- two 64-bit integers or
- four 32-bit integers or
- eight 16-bit short integers or
- sixteen 8-bit bytes.
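A 128-bit register treated as four single-precision lanes can be modelled roughly as below. This is a sketch in plain Python, not SSE code: the helper names `to_f32` and `packed_add_f32` are hypothetical, and each result is rounded to 32-bit precision with `struct` to mimic single-precision lanes.

```python
import struct

def to_f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def packed_add_f32(a, b):
    """Model one packed add over four 32-bit float lanes (128 bits in total)."""
    assert len(a) == len(b) == 4, "a 128-bit register holds four 32-bit floats"
    # Every lane is added independently of the others.
    return [to_f32(to_f32(x) + to_f32(y)) for x, y in zip(a, b)]

print(packed_add_f32([1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5]))
# [1.5, 2.5, 3.5, 4.5]
```

In hardware the four additions are issued as one instruction; the list comprehension here only models the per-lane result.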
SSE2
- Introduced in 2000.
- 144 instructions.
- 8 registers: XMM0–XMM7
- 16 registers in x86-64 mode: XMM0–XMM15.
- 128-bit registers.
- Double-precision floating point operations.
- MMX integer operations on 128-bit XMM registers.
Advantages:
- XMM and x87 registers do not alias one another, so SIMD and x87 floating point code can be mixed (unlike MMX).
Issues:
- Slow access to data in memory not aligned to a 16-byte boundary.
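The 16-byte boundary mentioned above comes from the register width (128 bits / 8 = 16 bytes). A small sketch of the address arithmetic, with hypothetical helper names `is_aligned` and `align_up`:

```python
ALIGNMENT = 16  # bytes: one 128-bit XMM register

def is_aligned(address, alignment=ALIGNMENT):
    """True if address is a multiple of alignment."""
    return address % alignment == 0

def align_up(address, alignment=ALIGNMENT):
    """Round address up to the next multiple of alignment (a power of two)."""
    return (address + alignment - 1) & ~(alignment - 1)

print(is_aligned(0x1000))      # True
print(is_aligned(0x1008))      # False: 8 bytes past a 16-byte boundary
print(hex(align_up(0x1001)))   # 0x1010
```

In practice the alignment is usually requested from the allocator or the compiler rather than computed by hand; the arithmetic above only shows what "aligned" means.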
SSE3
- Introduced in 2004.
- 13 new instructions over SSE2.
Advantages:
- Unaligned load instructions are faster.
- Horizontal instructions to speed up several DSP and 3D operations.
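"Horizontal" means the operation combines neighbouring lanes inside one register, instead of matching lanes across two registers. The sketch below contrasts the two in plain Python; the function names are made up, and the lane order of `horizontal_add` follows my understanding of the SSE3 `HADDPS` instruction.

```python
def vertical_add(a, b):
    """Ordinary packed add: lane i of a is added to lane i of b."""
    return [x + y for x, y in zip(a, b)]

def horizontal_add(a, b):
    """Horizontal add: adjacent pairs within each input are summed."""
    return [a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]]

a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
print(vertical_add(a, b))    # [11.0, 22.0, 33.0, 44.0]
print(horizontal_add(a, b))  # [3.0, 7.0, 30.0, 70.0]
```

Applying a horizontal add twice reduces four lanes to a single sum, which is the kind of reduction DSP and 3D code needs.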
SSE4
- Announced in 2006; first shipped in 2007 (Penryn).
- 54 instructions: 47 in SSE4.1 and 7 in SSE4.2.
Advantages:
- Unaligned load instructions are as fast as aligned versions.
- Dot product instruction.
- Additional integer instructions.
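The dot product instruction can be modelled as below. This is a simplified sketch, not the exact hardware behaviour: the name `dot4` is hypothetical, the 8-bit mask layout follows my understanding of the SSE4.1 `DPPS` instruction, and real hardware may add the products in a different order, which can change the floating point rounding.

```python
def dot4(a, b, mask=0xFF):
    """Model a 4-lane dot product controlled by an 8-bit mask.

    High 4 bits: which lane products enter the sum.
    Low 4 bits: which result lanes receive the sum (the rest are 0.0).
    """
    total = 0.0
    for i in range(4):
        if mask & (1 << (4 + i)):        # is lane i included in the sum?
            total += a[i] * b[i]
    return [total if mask & (1 << i) else 0.0 for i in range(4)]

a = [1.0, 2.0, 3.0, 4.0]
b = [5.0, 6.0, 7.0, 8.0]
print(dot4(a, b))             # [70.0, 70.0, 70.0, 70.0]
print(dot4(a, b, mask=0x71))  # [38.0, 0.0, 0.0, 0.0]: 3-element dot product in lane 0
```

The mask makes it possible to treat the fourth lane as padding, which is convenient for 3-component vectors.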
AVX
Advanced Vector Extensions (AVX)
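- Proposed in 2008; first shipped in 2011 (Sandy Bridge).
- 8 registers: YMM0–YMM7.
- Widened XMM0–XMM7 to 256 bits as YMM0–YMM7; the XMM names still refer to the lower 128 bits.
- 16 registers in x86-64 mode: YMM0–YMM15.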
- 256-bit registers.
- 3-operand instructions.
A single instruction can be applied to:
- eight 32-bit single-precision floating point numbers or
- four 64-bit double-precision floating point numbers.
Advantages:
- Supports both 128-bit and 256-bit SIMD; using the AVX encoding on 128-bit operands is sometimes called AVX-128 mode.
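The difference between 2-operand and 3-operand instructions is whether the destination doubles as a source. A rough model in plain Python, using lists of eight values to stand in for 256-bit registers of 32-bit lanes; the function names are made up for this sketch.

```python
def add_2op(dst, src):
    """2-operand form: dst = dst + src, so the old value of dst is lost."""
    for i in range(len(dst)):
        dst[i] += src[i]

def add_3op(src1, src2):
    """3-operand form: the result goes to a separate destination."""
    return [x + y for x, y in zip(src1, src2)]

a = [1.0] * 8   # eight 32-bit lanes of a 256-bit register
b = [2.0] * 8

c = add_3op(a, b)   # a and b are both still available afterwards
add_2op(a, b)       # a is overwritten with the sum
print(a, b, c)
```

With the 2-operand form, keeping the original `a` would need an extra register copy first; the 3-operand form avoids that copy.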
AVX2
- Introduced in 2013.
- Fused Multiply-Accumulate (FMA) instructions (strictly a separate extension, introduced alongside AVX2).
- 3-operand FMA (FMA3) instructions.
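"Fused" means `a * b + c` is rounded once, rather than once after the multiply and again after the add. The sketch below models that difference for a single lane in plain Python, using exact rational arithmetic from the standard `fractions` module to stand in for the unrounded intermediate. The function names are hypothetical.

```python
from fractions import Fraction

def mul_then_add(a, b, c):
    """Unfused: the product is rounded, then the sum is rounded again."""
    return a * b + c

def fused_mul_add(a, b, c):
    """Fused model: compute a*b + c exactly, then round once to a double."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)

a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0
print(mul_then_add(a, b, c))   # 0.0: the tiny term is lost when a*b is rounded
print(fused_mul_add(a, b, c))  # equals -(2.0 ** -60)
```

Here the exact product is 1 - 2^-60, which rounds to 1.0 as a double, so the unfused version cancels to zero while the fused version keeps the small term.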
AVX-512 (or AVX3)
- Proposed in 2013; first implemented in 2016 (Xeon Phi Knights Landing).
- 512-bit registers.
- 32 registers in x86-64 mode: ZMM0–ZMM31.
- 4-operand instructions.
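The register widths in these notes translate directly into lane counts: divide the register width by the element width. A short sketch, where the dictionary and the helper name `lane_count` are only for illustration:

```python
# Register widths in bits, as listed in the sections above.
REGISTER_BITS = {
    "MMX (MM)": 64,
    "SSE (XMM)": 128,
    "AVX (YMM)": 256,
    "AVX-512 (ZMM)": 512,
}

def lane_count(register_bits, element_bits):
    """Number of elements of a given width that fit in one register."""
    return register_bits // element_bits

for name, bits in REGISTER_BITS.items():
    print(f"{name}: {lane_count(bits, 32)} x 32-bit, {lane_count(bits, 64)} x 64-bit")
```

For example, a 512-bit ZMM register holds sixteen 32-bit values or eight 64-bit values, twice as many as a 256-bit YMM register.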