<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>VFMADD132SD/VFMADD213SD/VFMADD231SD — Fused Multiply-Add of Scalar Double Precision Floating-Point Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>VFMADD132SD/VFMADD213SD/VFMADD231SD — Fused Multiply-Add of Scalar Double Precision Floating-Point Values</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32 Bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>VEX.LIG.66.0F38.W1 99 /r VFMADD132SD xmm1, xmm2, xmm3/m64</td>
<td>A</td>
<td>V/V</td>
<td>FMA</td>
<td>Multiply scalar double precision floating-point value from xmm1 and xmm3/m64, add to xmm2 and put result in xmm1.</td></tr>
<tr>
<td>VEX.LIG.66.0F38.W1 A9 /r VFMADD213SD xmm1, xmm2, xmm3/m64</td>
<td>A</td>
<td>V/V</td>
<td>FMA</td>
<td>Multiply scalar double precision floating-point value from xmm1 and xmm2, add to xmm3/m64 and put result in xmm1.</td></tr>
<tr>
<td>VEX.LIG.66.0F38.W1 B9 /r VFMADD231SD xmm1, xmm2, xmm3/m64</td>
<td>A</td>
<td>V/V</td>
<td>FMA</td>
<td>Multiply scalar double precision floating-point value from xmm2 and xmm3/m64, add to xmm1 and put result in xmm1.</td></tr>
<tr>
<td>EVEX.LLIG.66.0F38.W1 99 /r VFMADD132SD xmm1 {k1}{z}, xmm2, xmm3/m64{er}</td>
<td>B</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Multiply scalar double precision floating-point value from xmm1 and xmm3/m64, add to xmm2 and put result in xmm1.</td></tr>
<tr>
<td>EVEX.LLIG.66.0F38.W1 A9 /r VFMADD213SD xmm1 {k1}{z}, xmm2, xmm3/m64{er}</td>
<td>B</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Multiply scalar double precision floating-point value from xmm1 and xmm2, add to xmm3/m64 and put result in xmm1.</td></tr>
<tr>
<td>EVEX.LLIG.66.0F38.W1 B9 /r VFMADD231SD xmm1 {k1}{z}, xmm2, xmm3/m64{er}</td>
<td>B</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Multiply scalar double precision floating-point value from xmm2 and xmm3/m64, add to xmm1 and put result in xmm1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">¶</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (r, w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>Tuple1 Scalar</td>
<td>ModRM:reg (r, w)</td>
<td>EVEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr></table>
<h3 id="description">Description<a class="anchor" href="#description">¶</a></h3>
<p>Performs a SIMD multiply-add computation on the low double precision floating-point values using three source operands and writes the multiply-add result to the destination operand. The destination operand is also the first source operand. The first and second operands are XMM registers. The third source operand can be an XMM register or a 64-bit memory location.</p>
<p>VFMADD132SD: Multiplies the low double precision floating-point value from the first source operand by the low double precision floating-point value in the third source operand, adds the infinite precision intermediate result to the low double precision floating-point value in the second source operand, performs rounding and stores the resulting double precision floating-point value to the destination operand (first source operand).</p>
<p>VFMADD213SD: Multiplies the low double precision floating-point value from the second source operand by the low double precision floating-point value in the first source operand, adds the infinite precision intermediate result to the low double precision floating-point value in the third source operand, performs rounding and stores the resulting double precision floating-point value to the destination operand (first source operand).</p>
<p>VFMADD231SD: Multiplies the low double precision floating-point value from the second source operand by the low double precision floating-point value in the third source operand, adds the infinite precision intermediate result to the low double precision floating-point value in the first source operand, performs rounding and stores the resulting double precision floating-point value to the destination operand (first source operand).</p>
<p>VEX.128 and EVEX encoded version: The destination operand (also first source operand) is encoded in reg_field. The second source operand is encoded in VEX.vvvv/EVEX.vvvv. The third source operand is encoded in rm_field. Bits 127:64 of the destination are unchanged. Bits MAXVL-1:128 of the destination register are zeroed.</p>
<p>EVEX encoded version: The low quadword element of the destination is updated according to the writemask.</p>
<h3 id="operation">Operation<a class="anchor" href="#operation">¶</a></h3>
<pre>In the operations below, “*” and “+” symbols represent multiplication and addition with infinite precision inputs and outputs (no rounding).
</pre>
<h4 id="vfmadd132sd-dest--src2--src3--evex-encoded-version-">VFMADD132SD DEST, SRC2, SRC3 (EVEX encoded version)<a class="anchor" href="#vfmadd132sd-dest--src2--src3--evex-encoded-version-">¶</a></h4>
<pre>IF (EVEX.b = 1) and SRC3 *is a register*
    THEN
        SET_ROUNDING_MODE_FOR_THIS_INSTRUCTION(EVEX.RC);
    ELSE
        SET_ROUNDING_MODE_FOR_THIS_INSTRUCTION(MXCSR.RC);
FI;
IF k1[0] or *no writemask*
    THEN DEST[63:0] := RoundFPControl(DEST[63:0]*SRC3[63:0] + SRC2[63:0])
    ELSE
        IF *merging-masking* ; merging-masking
            THEN *DEST[63:0] remains unchanged*
            ELSE ; zeroing-masking
                DEST[63:0] := 0
        FI;
FI;
DEST[127:64] := DEST[127:64]
DEST[MAXVL-1:128] := 0
</pre>
<h4 id="vfmadd213sd-dest--src2--src3--evex-encoded-version-">VFMADD213SD DEST, SRC2, SRC3 (EVEX encoded version)<a class="anchor" href="#vfmadd213sd-dest--src2--src3--evex-encoded-version-">¶</a></h4>
<pre>IF (EVEX.b = 1) and SRC3 *is a register*
    THEN
        SET_ROUNDING_MODE_FOR_THIS_INSTRUCTION(EVEX.RC);
    ELSE
        SET_ROUNDING_MODE_FOR_THIS_INSTRUCTION(MXCSR.RC);
FI;
IF k1[0] or *no writemask*
    THEN DEST[63:0] := RoundFPControl(SRC2[63:0]*DEST[63:0] + SRC3[63:0])
    ELSE
        IF *merging-masking* ; merging-masking
            THEN *DEST[63:0] remains unchanged*
            ELSE ; zeroing-masking
                DEST[63:0] := 0
        FI;
FI;
DEST[127:64] := DEST[127:64]
DEST[MAXVL-1:128] := 0
</pre>
<h4 id="vfmadd231sd-dest--src2--src3--evex-encoded-version-">VFMADD231SD DEST, SRC2, SRC3 (EVEX encoded version)<a class="anchor" href="#vfmadd231sd-dest--src2--src3--evex-encoded-version-">¶</a></h4>
<pre>IF (EVEX.b = 1) and SRC3 *is a register*
    THEN
        SET_ROUNDING_MODE_FOR_THIS_INSTRUCTION(EVEX.RC);
    ELSE
        SET_ROUNDING_MODE_FOR_THIS_INSTRUCTION(MXCSR.RC);
FI;
IF k1[0] or *no writemask*
    THEN DEST[63:0] := RoundFPControl(SRC2[63:0]*SRC3[63:0] + DEST[63:0])
    ELSE
        IF *merging-masking* ; merging-masking
            THEN *DEST[63:0] remains unchanged*
            ELSE ; zeroing-masking
                DEST[63:0] := 0
        FI;
FI;
DEST[127:64] := DEST[127:64]
DEST[MAXVL-1:128] := 0
</pre>
<h4 id="vfmadd132sd-dest--src2--src3--vex-encoded-version-">VFMADD132SD DEST, SRC2, SRC3 (VEX encoded version)<a class="anchor" href="#vfmadd132sd-dest--src2--src3--vex-encoded-version-">¶</a></h4>
<pre>DEST[63:0] := RoundFPControl_MXCSR(DEST[63:0]*SRC3[63:0] + SRC2[63:0])
DEST[127:64] := DEST[127:64]
DEST[MAXVL-1:128] := 0
</pre>
<h4 id="vfmadd213sd-dest--src2--src3--vex-encoded-version-">VFMADD213SD DEST, SRC2, SRC3 (VEX encoded version)<a class="anchor" href="#vfmadd213sd-dest--src2--src3--vex-encoded-version-">¶</a></h4>
<pre>DEST[63:0] := RoundFPControl_MXCSR(SRC2[63:0]*DEST[63:0] + SRC3[63:0])
DEST[127:64] := DEST[127:64]
DEST[MAXVL-1:128] := 0
</pre>
<h4 id="vfmadd231sd-dest--src2--src3--vex-encoded-version-">VFMADD231SD DEST, SRC2, SRC3 (VEX encoded version)<a class="anchor" href="#vfmadd231sd-dest--src2--src3--vex-encoded-version-">¶</a></h4>
<pre>DEST[63:0] := RoundFPControl_MXCSR(SRC2[63:0]*SRC3[63:0] + DEST[63:0])
DEST[127:64] := DEST[127:64]
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">¶</a></h3>
<pre>VFMADDxxxSD __m128d _mm_fmadd_round_sd(__m128d a, __m128d b, __m128d c, int r);
</pre>
<pre>VFMADDxxxSD __m128d _mm_mask_fmadd_sd(__m128d a, __mmask8 k, __m128d b, __m128d c);
</pre>
<pre>VFMADDxxxSD __m128d _mm_maskz_fmadd_sd(__mmask8 k, __m128d a, __m128d b, __m128d c);
</pre>
<pre>VFMADDxxxSD __m128d _mm_mask3_fmadd_sd(__m128d a, __m128d b, __m128d c, __mmask8 k);
</pre>
<pre>VFMADDxxxSD __m128d _mm_mask_fmadd_round_sd(__m128d a, __mmask8 k, __m128d b, __m128d c, int r);
</pre>
<pre>VFMADDxxxSD __m128d _mm_maskz_fmadd_round_sd(__mmask8 k, __m128d a, __m128d b, __m128d c, int r);
</pre>
<pre>VFMADDxxxSD __m128d _mm_mask3_fmadd_round_sd(__m128d a, __m128d b, __m128d c, __mmask8 k, int r);
</pre>
<pre>VFMADDxxxSD __m128d _mm_fmadd_sd(__m128d a, __m128d b, __m128d c);
</pre>
<h3 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">¶</a></h3>
<p>Overflow, Underflow, Invalid, Precision, Denormal</p>
<h3 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">¶</a></h3>
<p>VEX-encoded instructions, see <span class="not-imported">Table 2-20</span>, “Type 3 Class Exception Conditions.”</p>
<p>EVEX-encoded instructions, see <span class="not-imported">Table 2-47</span>, “Type E3 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developer’s Manual</a> for anything serious.
</p></footer></body></html>