ia32-64/x86/vrcpph.html
2025-07-08 02:23:29 -03:00

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>VRCPPH
— Compute Reciprocals of Packed FP16 Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>VRCPPH
— Compute Reciprocals of Packed FP16 Values</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32 Bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>EVEX.128.66.MAP6.W0 4C /r VRCPPH xmm1{k1}{z}, xmm2/m128/m16bcst</td>
<td>A</td>
<td>V/V</td>
<td>AVX512-FP16 AVX512VL</td>
<td>Compute the approximate reciprocals of packed FP16 values in xmm2/m128/m16bcst and store the result in xmm1 subject to writemask k1.</td></tr>
<tr>
<td>EVEX.256.66.MAP6.W0 4C /r VRCPPH ymm1{k1}{z}, ymm2/m256/m16bcst</td>
<td>A</td>
<td>V/V</td>
<td>AVX512-FP16 AVX512VL</td>
<td>Compute the approximate reciprocals of packed FP16 values in ymm2/m256/m16bcst and store the result in ymm1 subject to writemask k1.</td></tr>
<tr>
<td>EVEX.512.66.MAP6.W0 4C /r VRCPPH zmm1{k1}{z}, zmm2/m512/m16bcst</td>
<td>A</td>
<td>V/V</td>
<td>AVX512-FP16</td>
<td>Compute the approximate reciprocals of packed FP16 values in zmm2/m512/m16bcst and store the result in zmm1 subject to writemask k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>Full</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h3 id="description">Description<a class="anchor" href="#description">
</a></h3>
<p>This instruction performs a SIMD computation of the approximate reciprocals of 8/16/32 packed FP16 values in the source operand (the second operand) and stores the packed FP16 results in the destination operand. The maximum relative error for this approximation is less than 2<sup>-11</sup> + 2<sup>-14</sup>.</p>
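<p>As a rough illustration of the error bound, the following C sketch evaluates 2<sup>-11</sup> + 2<sup>-14</sup> and checks a candidate value against it. The helper names are hypothetical, and plain double arithmetic stands in for FP16; it says nothing about how the hardware produces the approximation.</p>

```c
#include <assert.h>
#include <stdbool.h>

/* Bound on the relative error of VRCPPH: 2^-11 + 2^-14.
   Both terms are exact in binary floating point. */
static double vrcpph_max_rel_error(void)
{
    return 1.0 / 2048.0 + 1.0 / 16384.0;
}

/* Hypothetical helper: true if `approx` lies within the bound of 1/x.
   Only meaningful for finite, nonzero x. */
static bool within_vrcpph_bound(double x, double approx)
{
    assert(x != 0.0);
    double exact = 1.0 / x;
    double diff = approx - exact;
    double mag = exact < 0.0 ? -exact : exact;
    if (diff < 0.0)
        diff = -diff;
    return diff < vrcpph_max_rel_error() * mag;
}
```

<p>For example, <code>within_vrcpph_bound(3.0, 0.3333)</code> holds, while <code>within_vrcpph_bound(3.0, 0.3330)</code> does not.</p>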
<p>For special cases, see <a href='vrcpph.html#tbl-5-28'>Table 5-28</a>.</p>
<figure id="tbl-5-28">
<table>
<tr>
<th>Input Value</th>
<th>Result Value</th>
<th>Comments</th></tr>
<tr>
<td>0 ≤ X ≤ 2<sup>-16</sup></td>
<td>INF</td>
<td>Very small denormal</td></tr>
<tr>
<td>−2<sup>-16</sup> ≤ X ≤ −0</td>
<td>−INF</td>
<td>Very small denormal</td></tr>
<tr>
<td>X &gt; +∞</td>
<td>+0</td>
<td></td></tr>
<tr>
<td>X &lt; −∞</td>
<td>−0</td>
<td></td></tr>
<tr>
<td>X = 2<sup>-n</sup></td>
<td>2<sup>n</sup></td>
<td></td></tr>
<tr>
<td>X = −2<sup>-n</sup></td>
<td>−2<sup>n</sup></td>
<td></td></tr></table>
<figcaption><a href='vrcpph.html#tbl-5-28'>Table 5-28</a>. VRCPPH/VRCPSH Special Cases</figcaption></figure>
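<p>The very-small-input and infinite-input rows can be modelled in scalar C roughly as follows. This is a sketch under several assumptions: <code>float</code> stands in for FP16, the helper name is hypothetical, negative inputs are assumed to mirror the positive rows with the sign carried through, and the ±∞ rows are read as applying to infinite inputs.</p>

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: if x hits one of the modelled special cases, store
   the result in *result and return true; otherwise return false. */
static bool vrcpph_special_case(float x, float *result)
{
    assert(result != NULL);
    const float tiny = 1.0f / 65536.0f;      /* 2^-16 */
    float mag = x < 0.0f ? -x : x;

    if (mag <= tiny) {                       /* zero or very small denormal */
        *result = signbit(x) ? -INFINITY : INFINITY;
        return true;
    }
    if (isinf(x)) {                          /* assumed reading of the ±∞ rows */
        *result = signbit(x) ? -0.0f : 0.0f;
        return true;
    }
    return false;                            /* ordinary input, or NaN */
}
```

<p>The 2<sup>-16</sup> threshold is plausible on its own: the reciprocal of such an input has magnitude 65536 or more, above the largest finite FP16 value of 65504.</p>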
<h3 id="operation">Operation<a class="anchor" href="#operation">
</a></h3>
<h4 id="vrcpph-dest-k1---src">VRCPPH dest{k1}, src<a class="anchor" href="#vrcpph-dest-k1---src">
</a></h4>
<pre>VL = 128, 256 or 512
KL := VL/16
FOR i := 0 to KL-1:
    IF k1[i] or *no writemask*:
        IF SRC is memory and (EVEX.b = 1):
            tsrc := src.fp16[0]
        ELSE:
            tsrc := src.fp16[i]
        DEST.fp16[i] := APPROXIMATE(1.0 / tsrc)
    ELSE IF *zeroing*:
        DEST.fp16[i] := 0
    //else DEST.fp16[i] remains unchanged
DEST[MAXVL-1:VL] := 0
</pre>
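<p>The pseudocode above can be mirrored by a scalar C model, which may help when reasoning about masking and broadcast. This is a minimal sketch, not a bit-accurate emulator: <code>float</code> stands in for FP16, an exact division stands in for <code>APPROXIMATE()</code>, MAXVL is taken as 512 bits, and the function name and parameters are hypothetical.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_LANES 32   /* MAXVL = 512 bits / 16 bits per FP16 element */

/* dest and src must each hold MAX_LANES elements.
   Pass k1 = 0xFFFFFFFF to model the *no writemask* case. */
static void vrcpph_model(float *dest, const float *src, uint32_t k1,
                         int vl, bool zeroing, bool broadcast)
{
    assert(vl == 128 || vl == 256 || vl == 512);
    int kl = vl / 16;                          /* KL := VL/16 */

    for (int i = 0; i < kl; i++) {
        if ((k1 >> i) & 1u) {
            /* memory source with EVEX.b = 1: every lane reads element 0 */
            float tsrc = broadcast ? src[0] : src[i];
            dest[i] = 1.0f / tsrc;             /* stand-in for APPROXIMATE */
        } else if (zeroing) {
            dest[i] = 0.0f;
        }
        /* else: merging-masking leaves dest[i] unchanged */
    }

    for (int i = kl; i < MAX_LANES; i++)       /* DEST[MAXVL-1:VL] := 0 */
        dest[i] = 0.0f;
}
```

<p>With <code>vl = 128</code> and <code>k1 = 0x05</code>, only lanes 0 and 2 are recomputed, lanes 1 and 3 to 7 keep their old contents under merging-masking, and lanes 8 to 31 are cleared.</p>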
<h3 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h3>
<pre>VRCPPH __m128h _mm_mask_rcp_ph (__m128h src, __mmask8 k, __m128h a);
</pre>
<pre>VRCPPH __m128h _mm_maskz_rcp_ph (__mmask8 k, __m128h a);
</pre>
<pre>VRCPPH __m128h _mm_rcp_ph (__m128h a);
</pre>
<pre>VRCPPH __m256h _mm256_mask_rcp_ph (__m256h src, __mmask16 k, __m256h a);
</pre>
<pre>VRCPPH __m256h _mm256_maskz_rcp_ph (__mmask16 k, __m256h a);
</pre>
<pre>VRCPPH __m256h _mm256_rcp_ph (__m256h a);
</pre>
<pre>VRCPPH __m512h _mm512_mask_rcp_ph (__m512h src, __mmask32 k, __m512h a);
</pre>
<pre>VRCPPH __m512h _mm512_maskz_rcp_ph (__mmask32 k, __m512h a);
</pre>
<pre>VRCPPH __m512h _mm512_rcp_ph (__m512h a);
</pre>
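<p>A short usage sketch of the zero-masked 512-bit form follows. It assumes a compiler that defines <code>__AVX512FP16__</code> and provides <code>_Float16</code> when the feature is enabled (for example via <code>-mavx512fp16</code>), and it falls back to scalar code otherwise so that it builds anywhere. The helper name <code>rcp_even_lanes</code> and the choice of mask are illustrative only.</p>

```c
#include <assert.h>
#include <stddef.h>

#if defined(__AVX512FP16__)
#include <immintrin.h>
#endif

#define LANES 32

/* Hypothetical helper: out[i] is approximately 1/in[i] for even i and 0 for
   odd i. Both arrays must hold LANES floats. */
static void rcp_even_lanes(const float *in, float *out)
{
    assert(in != NULL && out != NULL);
#if defined(__AVX512FP16__)
    _Float16 h_in[LANES], h_out[LANES];
    for (int i = 0; i < LANES; i++)
        h_in[i] = (_Float16)in[i];

    __m512h a = _mm512_loadu_ph(h_in);
    /* mask bits 0, 2, 4, ... select the even lanes; the rest are zeroed */
    __m512h r = _mm512_maskz_rcp_ph((__mmask32)0x55555555u, a);
    _mm512_storeu_ph(h_out, r);

    for (int i = 0; i < LANES; i++)
        out[i] = (float)h_out[i];
#else
    /* scalar fallback with the same masking behaviour, exact division */
    for (int i = 0; i < LANES; i++)
        out[i] = (i % 2 == 0) ? 1.0f / in[i] : 0.0f;
#endif
}
```

<p>On the intrinsic path the even lanes carry the hardware approximation, so callers should compare against a tolerance rather than expect exact reciprocals.</p>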
<h3 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h3>
<p>None.</p>
<h3 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h3>
<p>EVEX-encoded instruction, see <span class="not-imported">Table 2-49</span>, “Type E4 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developer's Manual</a> for anything serious.
</p></footer></body></html>