ia32-64/x86/vreducepd.html
2025-07-08 02:23:29 -03:00

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>VREDUCEPD
— Perform Reduction Transformation on Packed Float64 Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>VREDUCEPD
— Perform Reduction Transformation on Packed Float64 Values</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>EVEX.128.66.0F3A.W1 56 /r ib VREDUCEPD xmm1 {k1}{z}, xmm2/m128/m64bcst, imm8</td>
<td>A</td>
<td>V/V</td>
<td>AVX512VL AVX512DQ</td>
<td>Perform reduction transformation on packed double precision floating-point values in xmm2/m128/m64bcst by subtracting a number of fraction bits specified by the imm8 field. Stores the result in xmm1 register under writemask k1.</td></tr>
<tr>
<td>EVEX.256.66.0F3A.W1 56 /r ib VREDUCEPD ymm1 {k1}{z}, ymm2/m256/m64bcst, imm8</td>
<td>A</td>
<td>V/V</td>
<td>AVX512VL AVX512DQ</td>
<td>Perform reduction transformation on packed double precision floating-point values in ymm2/m256/m64bcst by subtracting a number of fraction bits specified by the imm8 field. Stores the result in ymm1 register under writemask k1.</td></tr>
<tr>
<td>EVEX.512.66.0F3A.W1 56 /r ib VREDUCEPD zmm1 {k1}{z}, zmm2/m512/m64bcst{sae}, imm8</td>
<td>A</td>
<td>V/V</td>
<td>AVX512DQ</td>
<td>Perform reduction transformation on packed double precision floating-point values in zmm2/m512/m64bcst by subtracting a number of fraction bits specified by the imm8 field. Stores the result in zmm1 register under writemask k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>Full</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>imm8</td>
<td>N/A</td></tr></table>
<h3 id="description">Description<a class="anchor" href="#description">
</a></h3>
<p>Performs a reduction transformation of the packed binary encoded double precision floating-point values in the source operand (the second operand) and stores the reduced results in binary floating-point format to the destination operand (the first operand) under the writemask k1.</p>
<p>The reduction transformation subtracts the integer part and the leading M fractional bits from the binary floating-point source value, where M is an unsigned integer specified by imm8[7:4] (see <a href='vreducepd.html#fig-5-28'>Figure 5-28</a>). Specifically, the reduction transformation can be expressed as:</p>
<p>dest = src - (ROUND(2<sup>M</sup>*src))*2<sup>-M</sup>;</p>
<p>where “ROUND()” treats “src”, “2<sup>M</sup>”, and their product as binary floating-point numbers with normalized significand and biased exponents.</p>
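<p>The formula above can be sketched as a scalar model. This is a minimal illustration, not the hardware algorithm: the function name <code>reduce_rne</code> is hypothetical, only round-to-nearest-even is modelled, and the input is assumed to be finite and small enough that 2<sup>M</sup>*src does not overflow.</p>

```python
import math


def reduce_rne(src: float, m: int) -> float:
    """Scalar sketch of dest = src - ROUND(2^M * src) * 2^-M (round-to-nearest-even).

    Assumptions (not from the reference text): src is finite, 0 <= m <= 15
    because imm8[7:4] is a 4-bit field, and 2^M * src does not overflow.
    """
    scaled = math.ldexp(src, m)      # 2^M * src
    rounded = round(scaled)          # Python's round() on a float rounds halves to even
    return src - math.ldexp(float(rounded), -m)


if __name__ == "__main__":
    print(reduce_rne(2.625, 2))   # 2.625 - 10 * 2^-2 -> 0.125
    print(reduce_rne(0.375, 1))   # 0.375 - 1 * 2^-1  -> -0.125
```

<p>In the first call, 2<sup>2</sup>*2.625 = 10.5 is a tie and rounds to the even integer 10, so the value 2.5 is subtracted and 0.125 remains.</p>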
<p>The magnitude of the reduced result can be expressed by considering src = 2<sup>p</sup>*man2,</p>
<p>where man2 is the normalized significand and p is the unbiased exponent.</p>
<p>If RC = RNE: 0&lt;=|Reduced Result|&lt;=2<sup>p-M-1</sup></p>
<p>If RC ≠ RNE: 0&lt;=|Reduced Result|&lt;2<sup>p-M</sup></p>
<p>This instruction may set the precision exception flag. However, when SPE (Suppress Precision Exception, imm8[3]) is set to 1, no precision exception is reported.</p>
<p>EVEX.vvvv is reserved and must be 1111b; otherwise the instruction will #UD.</p>
<figure id="fig-5-28">
<svg style="width: 619.0559999999999pt; height: 129.74399999999997pt" viewBox="41.54 0.0 520.88 113.11999999999998">
<g xmlns="http://www.w3.org/2000/svg" style="fill: none; stroke: none">
<rect height="107.10000000000001" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="44.04" y="0.5399999999999636"></rect>
<rect height="107.10000000000001" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="559.44" y="0.5399999999999636"></rect>
<rect height="0.48" style="fill: rgb(0%, 0%, 0%)" width="515.88" x="44.04" y="0.0"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="515.88" x="44.04" y="107.64001999999998"></rect>
<rect height="13.44" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="465.84000000000003" y="27.599999999999966"></rect>
<rect height="0.48" style="fill: rgb(0%, 0%, 0%)" width="373.26" x="92.82000000000001" y="40.559999999999974"></rect>
<rect height="13.44" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="92.82000000000001" y="27.359999999999985"></rect>
<rect height="0.23999" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="372.6" y="27.60000999999997"></rect>
<rect height="13.200000000000001" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="372.6" y="27.839999999999975"></rect>
<rect height="0.23999" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="279.3" y="27.60000999999997"></rect>
<rect height="13.200000000000001" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="279.3" y="27.839999999999975"></rect>
<path d="M 137.64000000000001 61.49999999999997 L 133.44000000000003 64.43999999999997" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 139.32000000000002 63.89999999999998 L 133.44000000000003 64.43999999999997 L 135.90000000000003 59.099999999999966" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 159.54000000000002 46.25999999999996 L 137.64000000000001 61.49999999999997" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 440.46000000000004 53.51999999999995 L 444.48 56.69999999999996" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 442.32000000000005 51.23999999999995 L 444.4800000000001 56.69999999999996 L 438.66 55.85999999999996" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 423.6 40.25999999999996 L 440.46000000000004 53.51999999999995" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<rect height="0.24001000000000003" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="327.3" y="28.49998999999997"></rect>
<rect height="11.76" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="327.3" y="28.73999999999998"></rect>
<path d="M 350.46 57.059999999999974 L 351.0 62.21999999999997" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 353.4 56.75999999999996 L 351.0 62.21999999999997 L 347.52 57.41999999999996" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 348.59999999999997 40.73999999999995 L 350.46 57.059999999999945" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 280.68 52.97999999999999 L 276.06 55.19999999999999" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 281.94 55.619999999999976 L 276.06 55.19999999999999 L 279.36 50.339999999999975" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 302.52 42.23999999999998 L 280.68 52.97999999999999" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000001pt; fill: #000" textLength="49.829999999999984" x="394.86" y="23.855999999999966">10</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000001pt; fill: #000" textLength="14.071200000000005" x="62.58000000000004" y="36.03599999999997">imm8</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000001pt; fill: #000" textLength="6.602400000000046" x="297.6" y="37.23599999999996">SP</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000001pt; fill: #000" textLength="41.36760000000004" x="307.35240000000005" y="37.23599999999996"> R</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.998000000000019pt; fill: #000" textLength="98.74019999999996" x="351.90600000000006" y="37.71599999999998"> Round Control Override</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000001pt; fill: #000" textLength="46.021199999999965" x="164.58000000000004" y="37.71599999999998">Fixed point length</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000001pt; fill: #000" textLength="98.29080000000008" x="224.04" y="62.61599999999996">Suppress Precision Exception: Imm8[3]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000001pt; fill: #000" textLength="97.96980000000008" x="439.08000000000004" y="65.19599999999997">Imm8[1:0] = 00b : Round nearest even</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000001pt; fill: #000" textLength="57.37379999999985" x="342.12" y="71.55599999999995">Round Select: Imm8[2]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000001pt; fill: #000" textLength="108.87119999999979" x="224.04" y="72.39599999999996">Imm8[3] = 0b : Use MXCSR exception mask</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000001pt; fill: #000" textLength="118.37339999999998" x="82.08000000000004" y="74.73599999999996">Imm8[7:4] : Number of fixed points to subtract</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000001pt; fill: #000" textLength="78.48359999999991" x="439.08000000000004" y="75.03599999999997">Imm8[1:0] = 01b : Round down</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.5179999999999865pt; fill: #000" textLength="75.62339999999995" x="342.12" y="81.39599999999997">Imm8[2] = 0b : Use Imm8[1:0]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.5179999999999865pt; fill: #000" textLength="61.77899999999991" x="224.04" y="82.23599999999998">Imm8[3] = 1b : Suppress</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.5179999999999865pt; fill: #000" textLength="70.73160000000013" x="439.08000000000004" y="84.87599999999999">Imm8[1:0] = 10b : Round up</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000001pt; fill: #000" textLength="67.66259999999977" x="342.12" y="91.23599999999998">Imm8[2] = 1b : Use MXCSR</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000001pt; fill: #000" textLength="69.68579999999992" x="439.08000000000004" y="94.716">Imm8[1:0] = 11b : Truncate</text></g></svg>
<figcaption><a href='vreducepd.html#fig-5-28'>Figure 5-28</a>. Imm8 Controls for VREDUCEPD/SD/PS/SS</figcaption></figure>
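<p>The imm8 layout in the figure can be decoded with a few shifts and masks. The sketch below is illustrative only: the names <code>ReduceImm8</code>, <code>decode_imm8</code>, and <code>ROUND_MODES</code> are hypothetical, and the field meanings are taken from the figure labels.</p>

```python
from typing import NamedTuple

# imm8[1:0] encodings as labelled in the figure: 00b, 01b, 10b, 11b.
ROUND_MODES = ("nearest-even", "down", "up", "truncate")


class ReduceImm8(NamedTuple):
    m: int               # imm8[7:4]: number of fraction bits to subtract
    spe: bool            # imm8[3]:   1 = suppress precision exception
    use_mxcsr_rc: bool   # imm8[2]:   1 = take rounding mode from MXCSR.RC
    rc: str              # imm8[1:0]: rounding mode used when imm8[2] = 0


def decode_imm8(imm8: int) -> ReduceImm8:
    """Split an 8-bit immediate into the control fields shown in the figure."""
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("imm8 must fit in 8 bits")
    return ReduceImm8(
        m=imm8 >> 4,
        spe=bool(imm8 & 0x08),
        use_mxcsr_rc=bool(imm8 & 0x04),
        rc=ROUND_MODES[imm8 & 0x03],
    )


if __name__ == "__main__":
    # 0x4B = 0100 1011b: M = 4, SPE = 1, round select = 0, RC = 11b (truncate)
    print(decode_imm8(0x4B))
```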
<p>The handling of special-case input values is listed in <a href='vreducepd.html#tbl-5-29'>Table 5-29</a>.</p>
<figure id="tbl-5-29">
<table>
<tr>
<th></th>
<th>Round Mode</th>
<th>Returned value</th></tr>
<tr>
<td>|Src1| &lt; 2<sup>-M-1</sup></td>
<td>RNE</td>
<td>Src1</td></tr>
<tr>
<td rowspan="4">|Src1| &lt; 2<sup>-M</sup></td>
<td>RPI, Src1 &gt; 0</td>
<td>Round (Src1-2<sup>-M</sup>) *</td></tr>
<tr>
<td>RPI, Src1 ≤ 0</td>
<td>Src1</td></tr>
<tr>
<td>RNI, Src1 ≥ 0</td>
<td>Src1</td></tr>
<tr>
<td>RNI, Src1 &lt; 0</td>
<td>Round (Src1+2<sup>-M</sup>) *</td></tr>
<tr>
<td rowspan="2">Src1 = ±0, or Dest = ±0 (Src1!=INF)</td>
<td>NOT RNI</td>
<td>+0.0</td></tr>
<tr>
<td>RNI</td>
<td>-0.0</td></tr>
<tr>
<td>Src1 = ±INF</td>
<td>any</td>
<td>+0.0</td></tr>
<tr>
<td>Src1= ±NAN</td>
<td>n/a</td>
<td>QNaN(Src1)</td></tr></table>
<figcaption><a href='vreducepd.html#tbl-5-29'>Table 5-29</a>. VREDUCEPD/SD/PS/SS Special Cases</figcaption></figure>
<p>* Round control = (imm8.MS1) ? MXCSR.RC : imm8.RC, where MS1 is the Round Select bit (imm8[2]) and imm8.RC is imm8[1:0].</p>
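<p>Several rows of the special-cases table can be checked against a small scalar model. This is a hedged sketch rather than a bit-exact emulation: <code>reduce_model</code> and its <code>rc</code> strings are hypothetical names, the rounding mode is passed in directly instead of being selected through imm8[2] and MXCSR, NaN payloads are not preserved, and no exception flags are modelled.</p>

```python
import math

# Hypothetical mapping from rounding-mode names to integer rounding functions.
_ROUNDERS = {
    "nearest-even": round,     # RNE
    "down": math.floor,        # RNI, toward negative infinity
    "up": math.ceil,           # RPI, toward positive infinity
    "truncate": math.trunc,    # toward zero
}


def reduce_model(src: float, m: int, rc: str = "nearest-even") -> float:
    """Scalar sketch of the reduction including the table's special cases."""
    if math.isnan(src):
        return math.nan        # table: QNaN(Src1); the payload is not modelled here
    if math.isinf(src):
        return 0.0             # table: Src1 = +/-INF returns +0.0 in any mode
    scaled = math.ldexp(src, m)
    result = src - math.ldexp(float(_ROUNDERS[rc](scaled)), -m)
    if result == 0.0:
        # table: a zero result is -0.0 under RNI and +0.0 otherwise
        return -0.0 if rc == "down" else 0.0
    return result


if __name__ == "__main__":
    print(reduce_model(0.25, 0, "up"))     # Src1 - 2^-M  -> -0.75
    print(reduce_model(-0.25, 0, "down"))  # Src1 + 2^-M  -> 0.75
    print(reduce_model(2.0, 0, "down"))    # zero result under RNI -> -0.0
```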
<h3 id="operation">Operation<a class="anchor" href="#operation">
</a></h3>
<pre>ReduceArgumentDP(SRC[63:0], imm8[7:0])
{
    // Check for NaN
    IF (SRC[63:0] = NAN) THEN
        RETURN (Convert SRC[63:0] to QNaN);
    FI;
    M := imm8[7:4]; // Number of fraction bits of the normalized significand to be subtracted
    RC := imm8[1:0]; // Round Control for ROUND() operation
    RC_source := imm8[2]; // Round Select: 0 = use imm8[1:0], 1 = use MXCSR.RC
    SPE := imm8[3]; // Suppress Precision Exception
    TMP[63:0] := 2<sup>-M</sup> * {ROUND(2<sup>M</sup> * SRC[63:0], SPE, RC_source, RC)}; // ROUND() treats SRC and 2<sup>M</sup> as standard binary FP values
    TMP[63:0] := SRC[63:0] - TMP[63:0]; // subtraction under the same RC, SPE controls
    RETURN TMP[63:0]; // binary encoded FP with biased exponent and normalized significand
}
</pre>
<h4 id="vreducepd">VREDUCEPD<a class="anchor" href="#vreducepd">
</a></h4>
<pre>(KL, VL) = (2, 128), (4, 256), (8, 512)
FOR j := 0 TO KL-1
    i := j * 64
    IF k1[j] OR *no writemask* THEN
        IF (EVEX.b == 1) AND (SRC *is memory*)
            THEN DEST[i+63:i] := ReduceArgumentDP(SRC[63:0], imm8[7:0]);
            ELSE DEST[i+63:i] := ReduceArgumentDP(SRC[i+63:i], imm8[7:0]);
        FI;
    ELSE
        IF *merging-masking* ; merging-masking
            THEN *DEST[i+63:i] remains unchanged*
            ELSE ; zeroing-masking
                DEST[i+63:i] := 0
        FI;
    FI;
ENDFOR;
DEST[MAXVL-1:VL] := 0
</pre>
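<p>The per-lane loop above, with its merging-masking, zeroing-masking, and broadcast branches, can be sketched over plain Python lists. The names <code>vreducepd_model</code> and <code>reduce_lane</code> are hypothetical; the scalar step is a round-to-nearest-even stand-in for ReduceArgumentDP that assumes finite inputs, and the zeroing of bits above VL is not modelled.</p>

```python
import math
from typing import List, Sequence


def reduce_lane(x: float, m: int) -> float:
    """RNE-only stand-in for ReduceArgumentDP; finite inputs assumed."""
    return x - math.ldexp(float(round(math.ldexp(x, m))), -m)


def vreducepd_model(dest: Sequence[float], src: Sequence[float], k1: int, m: int,
                    zeroing: bool = False, broadcast: bool = False) -> List[float]:
    """Apply the reduction lane by lane under writemask k1.

    len(dest) plays the role of KL (2, 4, or 8). Passing an all-ones k1
    corresponds to the *no writemask* case.
    """
    out = list(dest)
    for j in range(len(dest)):
        if (k1 >> j) & 1:
            # EVEX.b with a memory source: every lane reads element 0
            lane = src[0] if broadcast else src[j]
            out[j] = reduce_lane(lane, m)
        elif zeroing:
            out[j] = 0.0          # zeroing-masking
        # merging-masking: out[j] keeps the old destination value
    return out


if __name__ == "__main__":
    print(vreducepd_model([9.0, 9.0], [2.625, 0.375], 0b01, 2))                # [0.125, 9.0]
    print(vreducepd_model([9.0, 9.0], [2.625, 0.375], 0b01, 2, zeroing=True))  # [0.125, 0.0]
```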
<h3 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h3>
<pre>VREDUCEPD __m512d _mm512_reduce_round_pd(__m512d a, int imm, int sae)
</pre>
<pre>VREDUCEPD __m512d _mm512_mask_reduce_round_pd(__m512d s, __mmask8 k, __m512d a, int imm, int sae)
</pre>
<pre>VREDUCEPD __m512d _mm512_maskz_reduce_round_pd(__mmask8 k, __m512d a, int imm, int sae)
</pre>
<pre>VREDUCEPD __m256d _mm256_reduce_pd(__m256d a, int imm)
</pre>
<pre>VREDUCEPD __m256d _mm256_mask_reduce_pd(__m256d s, __mmask8 k, __m256d a, int imm)
</pre>
<pre>VREDUCEPD __m256d _mm256_maskz_reduce_pd(__mmask8 k, __m256d a, int imm)
</pre>
<pre>VREDUCEPD __m128d _mm_reduce_pd(__m128d a, int imm)
</pre>
<pre>VREDUCEPD __m128d _mm_mask_reduce_pd(__m128d s, __mmask8 k, __m128d a, int imm)
</pre>
<pre>VREDUCEPD __m128d _mm_maskz_reduce_pd(__mmask8 k, __m128d a, int imm)
</pre>
<h3 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h3>
<p>Invalid, Precision.</p>
<p>If SPE is enabled (imm8[3] = 1), the precision exception is not reported, regardless of the MXCSR exception mask.</p>
<h3 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h3>
<p>See <span class="not-imported">Table 2-46</span>, “Type E2 Class Exception Conditions.”</p>
<p>Additionally:</p>
<table>
<tr>
<td>#UD</td>
<td>If EVEX.vvvv != 1111B.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>