ia32-64/x86/vrndscalepd.html

222 lines
15 KiB
HTML
Raw Normal View History

2025-07-08 02:23:29 -03:00
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>VRNDSCALEPD
— Round Packed Float64 Values to Include a Given Number of Fraction Bits</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>VRNDSCALEPD
— Round Packed Float64 Values to Include a Given Number of Fraction Bits</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>EVEX.128.66.0F3A.W1 09 /r ib VRNDSCALEPD xmm1 {k1}{z}, xmm2/m128/m64bcst, imm8</td>
<td>A</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Rounds packed double precision floating-point values in xmm2/m128/m64bcst to a number of fraction bits specified by the imm8 field. Stores the result in xmm1 register. Under writemask.</td></tr>
<tr>
<td>EVEX.256.66.0F3A.W1 09 /r ib VRNDSCALEPD ymm1 {k1}{z}, ymm2/m256/m64bcst, imm8</td>
<td>A</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Rounds packed double precision floating-point values in ymm2/m256/m64bcst to a number of fraction bits specified by the imm8 field. Stores the result in ymm1 register. Under writemask.</td></tr>
<tr>
<td>EVEX.512.66.0F3A.W1 09 /r ib VRNDSCALEPD zmm1 {k1}{z}, zmm2/m512/m64bcst{sae}, imm8</td>
<td>A</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Rounds packed double precision floating-point values in zmm2/m512/m64bcst to a number of fraction bits specified by the imm8 field. Stores the result in zmm1 register using writemask k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>Full</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>imm8</td>
<td>N/A</td></tr></table>
<h3 id="description">Description<a class="anchor" href="#description">
</a></h3>
<p>Round the double precision floating-point values in the source operand by the rounding mode specified in the immediate operand (see <a href='vrndscalepd.html#fig-5-29'>Figure 5-29</a>) and places the result in the destination operand.</p>
<p>The destination operand (the first operand) is a ZMM/YMM/XMM register conditionally updated according to the writemask. The source operand (the second operand) can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location, or a 512/256/128-bit vector broadcasted from a 64-bit memory location.</p>
<p>The rounding process rounds the input to an integral value, plus number bits of fraction that are specified by imm8[7:4] (to be included in the result) and returns the result as a double precision floating-point value.</p>
<p>It should be noticed that no overflow is induced while executing this instruction (although the source is scaled by the imm8[7:4] value).</p>
<p>The immediate operand also specifies control fields for the rounding operation, three bit fields are defined and shown in the “Immediate Control Description” figure below. Bit 3 of the immediate byte controls the processor behavior for a precision exception, bit 2 selects the source of rounding mode control. Bits 1:0 specify a non-sticky rounding-mode value (immediate control table below lists the encoded values for rounding-mode field).</p>
<p>The Precision Floating-Point Exception is signaled according to the immediate operand. If any source operand is an SNaN then it will be converted to a QNaN. If DAZ is set to 1 then denormals will be converted to zero before rounding.</p>
<p>The sign of the result of this instruction is preserved, including the sign of zero.</p>
<p>The formula of the operation on each data element for VRNDSCALEPD is</p>
<p>ROUND(x) = 2<sup>-M</sup>*Round_to_INT(x*2<sup>M</sup>, round_ctrl),</p>
<p>round_ctrl = imm[3:0];</p>
<p>M=imm[7:4];</p>
<p>The operation of x*2<sup>M</sup> is computed as if the exponent range is unlimited (i.e., no overflow ever occurs).</p>
<p>VRNDSCALEPD is a more general form of the VEX-encoded VROUNDPD instruction. In VROUNDPD, the formula of the operation on each element is</p>
<p>ROUND(x) = Round_to_INT(x, round_ctrl),</p>
<p>round_ctrl = imm[3:0];</p>
<p>Note: EVEX.vvvv is reserved and must be 1111b, otherwise instructions will #UD.</p>
<figure id="fig-5-29">
<svg style="width: 618.9840000000002pt; height: 129.671976pt" viewBox="41.06 0.0 520.8200000000002 113.05998">
<g xmlns="http://www.w3.org/2000/svg" style="stroke: none; fill: none">
<rect height="107.10000000000001" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="43.56" y="0.479979999999955"></rect>
<rect height="107.10000000000001" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="558.9" y="0.479979999999955"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="515.82" x="43.56" y="0.0"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="515.82" x="43.56" y="107.58000000000004"></rect>
<rect height="13.5" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="465.3" y="27.5399799999999"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="373.26" x="92.28" y="40.559999999999945"></rect>
<rect height="13.5" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="92.28" y="27.299980000000005"></rect>
<rect height="0.23999" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="372.06" y="27.539989999999875"></rect>
<rect height="13.26" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="372.06" y="27.77997999999991"></rect>
<rect height="0.23999" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="278.82" y="27.539989999999875"></rect>
<rect height="13.26" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="278.82" y="27.77997999999991"></rect>
<path d="M 137.1 61.43997999999999 L 132.9 64.37998000000005" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 138.78 63.89998000000003 L 132.9 64.37998000000005 L 135.42 59.039980000000014" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 159.0 46.19997999999998 L 137.1 61.43997999999999" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 439.92 53.51998000000003 L 443.94 56.69997999999998" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 441.78000000000003 51.17998 L 443.94000000000005 56.69997999999998 L 438.12 55.799980000000005" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 423.06000000000006 40.19997999999998 L 439.9200000000001 53.51998000000003" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<rect height="0.24005" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="326.76" y="28.439930000000004"></rect>
<rect height="11.76" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="326.76" y="28.67998"></rect>
<path d="M 349.92 57.059979999999996 L 350.52000000000004 62.15998000000002" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 352.86 56.75998000000004 L 350.52000000000004 62.15998000000002 L 346.98 57.41998000000001" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 348.06 40.67998 L 349.92 57.059979999999996" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 280.14 52.919979999999896 L 275.52 55.19997999999987" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 281.46 55.55997999999988 L 275.52 55.19997999999987 L 278.82 50.27997999999991" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 301.97999999999996 42.17997999999989 L 280.14 52.919979999999896" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518pt; fill: #000" textLength="3.509999999999991" x="440.70000000000005" y="23.795979999999957">0</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="14.071200000000005" x="62.04000000000008" y="35.97598000000005">imm8</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="9.752400000000023" x="297.06" y="37.17597999999998">SPE</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="6.786000000000001" x="344.5800000000001" y="37.17597999999998">RS</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="59.56619999999998" x="390.6000000000001" y="37.65598">Round Control Override</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="46.021199999999965" x="164.10000000000008" y="37.65598">Fixed point length</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="98.2908000000001" x="223.56" y="62.55597999999998">Suppress Precision Exception: Imm8[3]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="97.96139999999986" x="438.6000000000001" y="65.19597999999996">Imm8[1:0] = 00b : Round nearest even</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="57.37379999999985" x="341.58" y="71.49598000000003">Round Select: Imm8[2]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="108.87119999999982" x="223.56" y="72.39598000000001">Imm8[3] = 0b : Use MXCSR exception mask</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="119.8158" x="81.54000000000008" y="74.67597999999998">Imm8[7:4] : Number of fixed points to preserve</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="78.48359999999985" x="438.6000000000001" y="75.03598">Imm8[1:0] = 01b : Round down</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="75.62339999999995" x="341.58" y="81.33598000000006">Imm8[2] = 0b : Use Imm8[1:0]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="61.77899999999994" x="223.56" y="82.23598000000004">Imm8[3] = 1b : Suppress</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="70.73160000000013" x="438.6000000000001" y="84.81597999999997">Imm8[1:0] = 10b : Round up</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="67.66859999999991" x="341.58" y="91.1759800000001">Imm8[2] = 1b : Use MXCSR</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="69.68639999999994" x="438.6000000000001" y="94.65598">Imm8[1:0] = 11b : Truncate</text></g></svg>
<figcaption><a href='vrndscalepd.html#fig-5-29'>Figure 5-29</a>. Imm8 Controls for VRNDSCALEPD/SD/PS/SS</figcaption></figure>
<p>Handling of special case of input values are listed in <a href='vrndscalepd.html#tbl-5-31'>Table 5-31</a>.</p>
<figure id="tbl-5-31">
<table>
<tr>
<th></th>
<th>Returned value</th></tr>
<tr>
<td><strong>Src1=±inf</strong></td>
<td>Src1</td></tr>
<tr>
<td><strong>Src1=±NAN</strong></td>
<td>Src1 converted to QNAN</td></tr>
<tr>
<td><strong>Src1=±0</strong></td>
<td>Src1</td></tr></table>
<figcaption><a href='vrndscalepd.html#tbl-5-31'>Table 5-31</a>. VRNDSCALEPD/SD/PS/SS Special Cases</figcaption></figure>
<h3 id="operation">Operation<a class="anchor" href="#operation">
</a></h3>
<pre>RoundToIntegerDP(SRC[63:0], imm8[7:0]) {
if (imm8[2] = 1)
rounding_direction := MXCSR:RC
; get round control from MXCSR
else
rounding_direction := imm8[1:0]
; get round control from imm8[1:0]
FI
M := imm8[7:4] ; get the scaling factor
case (rounding_direction)
00: TMP[63:0] := round_to_nearest_even_integer(2<sup>M</sup>*SRC[63:0])
01: TMP[63:0] := round_to_equal_or_smaller_integer(2<sup>M</sup>*SRC[63:0])
10: TMP[63:0] := round_to_equal_or_larger_integer(2<sup>M</sup>*SRC[63:0])
11: TMP[63:0] := round_to_nearest_smallest_magnitude_integer(2<sup>M</sup>*SRC[63:0])
ESAC
Dest[63:0] := 2<sup>-M</sup>* TMP[63:0]
; scale down back to 2<sup>-M</sup>
if (imm8[3] = 0) Then ; check SPE
if (SRC[63:0] != Dest[63:0]) Then
; check precision lost
set_precision()
; set #PE
FI;
FI;
return(Dest[63:0])
}
</pre>
<h4 id="vrndscalepd--evex-encoded-versions-">VRNDSCALEPD (EVEX encoded versions)<a class="anchor" href="#vrndscalepd--evex-encoded-versions-">
</a></h4>
<pre>(KL, VL) = (2, 128), (4, 256), (8, 512)
IF *src is a memory operand*
THEN TMP_SRC := BROADCAST64(SRC, VL, k1)
ELSE TMP_SRC := SRC
FI;
FOR j := 0 TO KL-1
i := j * 64
IF k1[j] OR *no writemask*
THEN DEST[i+63:i] := RoundToIntegerDP((TMP_SRC[i+63:i], imm8[7:0])
ELSE
IF *merging-masking* ; merging-masking
THEN *DEST[i+63:i] remains unchanged*
ELSE ; zeroing-masking
DEST[i+63:i] := 0
FI;
FI;
ENDFOR;
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h3>
<pre>VRNDSCALEPD __m512d _mm512_roundscale_pd( __m512d a, int imm);
</pre>
<pre>VRNDSCALEPD __m512d _mm512_roundscale_round_pd( __m512d a, int imm, int sae);
</pre>
<pre>VRNDSCALEPD __m512d _mm512_mask_roundscale_pd(__m512d s, __mmask8 k, __m512d a, int imm);
</pre>
<pre>VRNDSCALEPD __m512d _mm512_mask_roundscale_round_pd(__m512d s, __mmask8 k, __m512d a, int imm, int sae);
</pre>
<pre>VRNDSCALEPD __m512d _mm512_maskz_roundscale_pd( __mmask8 k, __m512d a, int imm);
</pre>
<pre>VRNDSCALEPD __m512d _mm512_maskz_roundscale_round_pd( __mmask8 k, __m512d a, int imm, int sae);
</pre>
<pre>VRNDSCALEPD __m256d _mm256_roundscale_pd( __m256d a, int imm);
</pre>
<pre>VRNDSCALEPD __m256d _mm256_mask_roundscale_pd(__m256d s, __mmask8 k, __m256d a, int imm);
</pre>
<pre>VRNDSCALEPD __m256d _mm256_maskz_roundscale_pd( __mmask8 k, __m256d a, int imm);
</pre>
<pre>VRNDSCALEPD __m128d _mm_roundscale_pd( __m128d a, int imm);
</pre>
<pre>VRNDSCALEPD __m128d _mm_mask_roundscale_pd(__m128d s, __mmask8 k, __m128d a, int imm);
</pre>
<pre>VRNDSCALEPD __m128d _mm_maskz_roundscale_pd( __mmask8 k, __m128d a, int imm);
</pre>
<h3 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h3>
<p>Invalid, Precision.</p>
<p>If SPE is enabled, precision exception is not reported (regardless of MXCSR exception mask).</p>
<h3 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h3>
<p>See <span class="not-imported">Table 2-46</span>, “Type E2 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>