ia32-64/x86/vrndscalepd.html
2025-07-08 02:23:29 -03:00

221 lines
15 KiB
HTML
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>VRNDSCALEPD
— Round Packed Float64 Values to Include a Given Number of Fraction Bits</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>VRNDSCALEPD
— Round Packed Float64 Values to Include a Given Number of Fraction Bits</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>EVEX.128.66.0F3A.W1 09 /r ib VRNDSCALEPD xmm1 {k1}{z}, xmm2/m128/m64bcst, imm8</td>
<td>A</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Rounds packed double precision floating-point values in xmm2/m128/m64bcst to a number of fraction bits specified by the imm8 field. Stores the result in xmm1 register. Under writemask.</td></tr>
<tr>
<td>EVEX.256.66.0F3A.W1 09 /r ib VRNDSCALEPD ymm1 {k1}{z}, ymm2/m256/m64bcst, imm8</td>
<td>A</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Rounds packed double precision floating-point values in ymm2/m256/m64bcst to a number of fraction bits specified by the imm8 field. Stores the result in ymm1 register. Under writemask.</td></tr>
<tr>
<td>EVEX.512.66.0F3A.W1 09 /r ib VRNDSCALEPD zmm1 {k1}{z}, zmm2/m512/m64bcst{sae}, imm8</td>
<td>A</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Rounds packed double precision floating-point values in zmm2/m512/m64bcst to a number of fraction bits specified by the imm8 field. Stores the result in zmm1 register using writemask k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>Full</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>imm8</td>
<td>N/A</td></tr></table>
<h3 id="description">Description<a class="anchor" href="#description">
</a></h3>
<p>Round the double precision floating-point values in the source operand by the rounding mode specified in the immediate operand (see <a href='vrndscalepd.html#fig-5-29'>Figure 5-29</a>) and places the result in the destination operand.</p>
<p>The destination operand (the first operand) is a ZMM/YMM/XMM register conditionally updated according to the writemask. The source operand (the second operand) can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location, or a 512/256/128-bit vector broadcasted from a 64-bit memory location.</p>
<p>The rounding process rounds the input to an integral value, plus number bits of fraction that are specified by imm8[7:4] (to be included in the result) and returns the result as a double precision floating-point value.</p>
<p>It should be noticed that no overflow is induced while executing this instruction (although the source is scaled by the imm8[7:4] value).</p>
<p>The immediate operand also specifies control fields for the rounding operation, three bit fields are defined and shown in the “Immediate Control Description” figure below. Bit 3 of the immediate byte controls the processor behavior for a precision exception, bit 2 selects the source of rounding mode control. Bits 1:0 specify a non-sticky rounding-mode value (immediate control table below lists the encoded values for rounding-mode field).</p>
<p>The Precision Floating-Point Exception is signaled according to the immediate operand. If any source operand is an SNaN then it will be converted to a QNaN. If DAZ is set to 1 then denormals will be converted to zero before rounding.</p>
<p>The sign of the result of this instruction is preserved, including the sign of zero.</p>
<p>The formula of the operation on each data element for VRNDSCALEPD is</p>
<p>ROUND(x) = 2<sup>-M</sup>*Round_to_INT(x*2<sup>M</sup>, round_ctrl),</p>
<p>round_ctrl = imm[3:0];</p>
<p>M=imm[7:4];</p>
<p>The operation of x*2<sup>M</sup> is computed as if the exponent range is unlimited (i.e., no overflow ever occurs).</p>
<p>VRNDSCALEPD is a more general form of the VEX-encoded VROUNDPD instruction. In VROUNDPD, the formula of the operation on each element is</p>
<p>ROUND(x) = Round_to_INT(x, round_ctrl),</p>
<p>round_ctrl = imm[3:0];</p>
<p>Note: EVEX.vvvv is reserved and must be 1111b, otherwise instructions will #UD.</p>
<figure id="fig-5-29">
<svg style="width: 618.9840000000002pt; height: 129.671976pt" viewBox="41.06 0.0 520.8200000000002 113.05998">
<g xmlns="http://www.w3.org/2000/svg" style="stroke: none; fill: none">
<rect height="107.10000000000001" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="43.56" y="0.479979999999955"></rect>
<rect height="107.10000000000001" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="558.9" y="0.479979999999955"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="515.82" x="43.56" y="0.0"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="515.82" x="43.56" y="107.58000000000004"></rect>
<rect height="13.5" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="465.3" y="27.5399799999999"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="373.26" x="92.28" y="40.559999999999945"></rect>
<rect height="13.5" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="92.28" y="27.299980000000005"></rect>
<rect height="0.23999" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="372.06" y="27.539989999999875"></rect>
<rect height="13.26" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="372.06" y="27.77997999999991"></rect>
<rect height="0.23999" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="278.82" y="27.539989999999875"></rect>
<rect height="13.26" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="278.82" y="27.77997999999991"></rect>
<path d="M 137.1 61.43997999999999 L 132.9 64.37998000000005" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 138.78 63.89998000000003 L 132.9 64.37998000000005 L 135.42 59.039980000000014" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 159.0 46.19997999999998 L 137.1 61.43997999999999" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 439.92 53.51998000000003 L 443.94 56.69997999999998" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 441.78000000000003 51.17998 L 443.94000000000005 56.69997999999998 L 438.12 55.799980000000005" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 423.06000000000006 40.19997999999998 L 439.9200000000001 53.51998000000003" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<rect height="0.24005" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="326.76" y="28.439930000000004"></rect>
<rect height="11.76" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="326.76" y="28.67998"></rect>
<path d="M 349.92 57.059979999999996 L 350.52000000000004 62.15998000000002" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 352.86 56.75998000000004 L 350.52000000000004 62.15998000000002 L 346.98 57.41998000000001" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 348.06 40.67998 L 349.92 57.059979999999996" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 280.14 52.919979999999896 L 275.52 55.19997999999987" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 281.46 55.55997999999988 L 275.52 55.19997999999987 L 278.82 50.27997999999991" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 301.97999999999996 42.17997999999989 L 280.14 52.919979999999896" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518pt; fill: #000" textLength="3.509999999999991" x="440.70000000000005" y="23.795979999999957">0</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="14.071200000000005" x="62.04000000000008" y="35.97598000000005">imm8</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="9.752400000000023" x="297.06" y="37.17597999999998">SPE</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="6.786000000000001" x="344.5800000000001" y="37.17597999999998">RS</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="59.56619999999998" x="390.6000000000001" y="37.65598">Round Control Override</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="46.021199999999965" x="164.10000000000008" y="37.65598">Fixed point length</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="98.2908000000001" x="223.56" y="62.55597999999998">Suppress Precision Exception: Imm8[3]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="97.96139999999986" x="438.6000000000001" y="65.19597999999996">Imm8[1:0] = 00b : Round nearest even</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="57.37379999999985" x="341.58" y="71.49598000000003">Round Select: Imm8[2]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="108.87119999999982" x="223.56" y="72.39598000000001">Imm8[3] = 0b : Use MXCSR exception mask</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="119.8158" x="81.54000000000008" y="74.67597999999998">Imm8[7:4] : Number of fixed points to preserve</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="78.48359999999985" x="438.6000000000001" y="75.03598">Imm8[1:0] = 01b : Round down</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="75.62339999999995" x="341.58" y="81.33598000000006">Imm8[2] = 0b : Use Imm8[1:0]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="61.77899999999994" x="223.56" y="82.23598000000004">Imm8[3] = 1b : Suppress</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="70.73160000000013" x="438.6000000000001" y="84.81597999999997">Imm8[1:0] = 10b : Round up</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="67.66859999999991" x="341.58" y="91.1759800000001">Imm8[2] = 1b : Use MXCSR</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.518000000000029pt; fill: #000" textLength="69.68639999999994" x="438.6000000000001" y="94.65598">Imm8[1:0] = 11b : Truncate</text></g></svg>
<figcaption><a href='vrndscalepd.html#fig-5-29'>Figure 5-29</a>. Imm8 Controls for VRNDSCALEPD/SD/PS/SS</figcaption></figure>
<p>Handling of special case of input values are listed in <a href='vrndscalepd.html#tbl-5-31'>Table 5-31</a>.</p>
<figure id="tbl-5-31">
<table>
<tr>
<th></th>
<th>Returned value</th></tr>
<tr>
<td><strong>Src1=±inf</strong></td>
<td>Src1</td></tr>
<tr>
<td><strong>Src1=±NAN</strong></td>
<td>Src1 converted to QNAN</td></tr>
<tr>
<td><strong>Src1=±0</strong></td>
<td>Src1</td></tr></table>
<figcaption><a href='vrndscalepd.html#tbl-5-31'>Table 5-31</a>. VRNDSCALEPD/SD/PS/SS Special Cases</figcaption></figure>
<h3 id="operation">Operation<a class="anchor" href="#operation">
</a></h3>
<pre>RoundToIntegerDP(SRC[63:0], imm8[7:0]) {
if (imm8[2] = 1)
rounding_direction := MXCSR:RC
; get round control from MXCSR
else
rounding_direction := imm8[1:0]
; get round control from imm8[1:0]
FI
M := imm8[7:4] ; get the scaling factor
case (rounding_direction)
00: TMP[63:0] := round_to_nearest_even_integer(2<sup>M</sup>*SRC[63:0])
01: TMP[63:0] := round_to_equal_or_smaller_integer(2<sup>M</sup>*SRC[63:0])
10: TMP[63:0] := round_to_equal_or_larger_integer(2<sup>M</sup>*SRC[63:0])
11: TMP[63:0] := round_to_nearest_smallest_magnitude_integer(2<sup>M</sup>*SRC[63:0])
ESAC
Dest[63:0] := 2<sup>-M</sup>* TMP[63:0]
; scale down back to 2<sup>-M</sup>
if (imm8[3] = 0) Then ; check SPE
if (SRC[63:0] != Dest[63:0]) Then
; check precision lost
set_precision()
; set #PE
FI;
FI;
return(Dest[63:0])
}
</pre>
<h4 id="vrndscalepd--evex-encoded-versions-">VRNDSCALEPD (EVEX encoded versions)<a class="anchor" href="#vrndscalepd--evex-encoded-versions-">
</a></h4>
<pre>(KL, VL) = (2, 128), (4, 256), (8, 512)
IF *src is a memory operand*
THEN TMP_SRC := BROADCAST64(SRC, VL, k1)
ELSE TMP_SRC := SRC
FI;
FOR j := 0 TO KL-1
i := j * 64
IF k1[j] OR *no writemask*
THEN DEST[i+63:i] := RoundToIntegerDP((TMP_SRC[i+63:i], imm8[7:0])
ELSE
IF *merging-masking* ; merging-masking
THEN *DEST[i+63:i] remains unchanged*
ELSE ; zeroing-masking
DEST[i+63:i] := 0
FI;
FI;
ENDFOR;
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h3>
<pre>VRNDSCALEPD __m512d _mm512_roundscale_pd( __m512d a, int imm);
</pre>
<pre>VRNDSCALEPD __m512d _mm512_roundscale_round_pd( __m512d a, int imm, int sae);
</pre>
<pre>VRNDSCALEPD __m512d _mm512_mask_roundscale_pd(__m512d s, __mmask8 k, __m512d a, int imm);
</pre>
<pre>VRNDSCALEPD __m512d _mm512_mask_roundscale_round_pd(__m512d s, __mmask8 k, __m512d a, int imm, int sae);
</pre>
<pre>VRNDSCALEPD __m512d _mm512_maskz_roundscale_pd( __mmask8 k, __m512d a, int imm);
</pre>
<pre>VRNDSCALEPD __m512d _mm512_maskz_roundscale_round_pd( __mmask8 k, __m512d a, int imm, int sae);
</pre>
<pre>VRNDSCALEPD __m256d _mm256_roundscale_pd( __m256d a, int imm);
</pre>
<pre>VRNDSCALEPD __m256d _mm256_mask_roundscale_pd(__m256d s, __mmask8 k, __m256d a, int imm);
</pre>
<pre>VRNDSCALEPD __m256d _mm256_maskz_roundscale_pd( __mmask8 k, __m256d a, int imm);
</pre>
<pre>VRNDSCALEPD __m128d _mm_roundscale_pd( __m128d a, int imm);
</pre>
<pre>VRNDSCALEPD __m128d _mm_mask_roundscale_pd(__m128d s, __mmask8 k, __m128d a, int imm);
</pre>
<pre>VRNDSCALEPD __m128d _mm_maskz_roundscale_pd( __mmask8 k, __m128d a, int imm);
</pre>
<h3 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h3>
<p>Invalid, Precision.</p>
<p>If SPE is enabled, precision exception is not reported (regardless of MXCSR exception mask).</p>
<h3 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h3>
<p>See <span class="not-imported">Table 2-46</span>, “Type E2 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>