<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>VSUBPH — Subtract Packed FP16 Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>VSUBPH — Subtract Packed FP16 Values</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>EVEX.128.NP.MAP5.W0 5C /r VSUBPH xmm1{k1}{z}, xmm2, xmm3/m128/m16bcst</td>
<td>A</td>
<td>V/V</td>
<td>AVX512-FP16 AVX512VL</td>
<td>Subtract packed FP16 values in xmm3/m128/m16bcst from xmm2, and store the result in xmm1 subject to writemask k1.</td></tr>
<tr>
<td>EVEX.256.NP.MAP5.W0 5C /r VSUBPH ymm1{k1}{z}, ymm2, ymm3/m256/m16bcst</td>
<td>A</td>
<td>V/V</td>
<td>AVX512-FP16 AVX512VL</td>
<td>Subtract packed FP16 values in ymm3/m256/m16bcst from ymm2, and store the result in ymm1 subject to writemask k1.</td></tr>
<tr>
<td>EVEX.512.NP.MAP5.W0 5C /r VSUBPH zmm1{k1}{z}, zmm2, zmm3/m512/m16bcst {er}</td>
<td>A</td>
<td>V/V</td>
<td>AVX512-FP16</td>
<td>Subtract packed FP16 values in zmm3/m512/m16bcst from zmm2, and store the result in zmm1 subject to writemask k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">¶</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>Full</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr></table>
<h3 id="description">Description<a class="anchor" href="#description">¶</a></h3>
<p>This instruction subtracts the packed FP16 values in the second source operand from the corresponding elements of the first source operand and stores the packed FP16 results in the destination operand. The destination elements are updated according to the writemask.</p>
<h3 id="operation">Operation<a class="anchor" href="#operation">¶</a></h3>
<h4 id="vsubph--evex-encoded-versions--when-src2-operand-is-a-register">VSUBPH (EVEX encoded versions) when src2 operand is a register<a class="anchor" href="#vsubph--evex-encoded-versions--when-src2-operand-is-a-register">¶</a></h4>
<pre>VL = 128, 256 or 512
KL := VL/16
IF (VL = 512) AND (EVEX.b = 1):
    SET_RM(EVEX.RC)
ELSE
    SET_RM(MXCSR.RC)
FOR j := 0 TO KL-1:
    IF k1[j] OR *no writemask*:
        DEST.fp16[j] := SRC1.fp16[j] - SRC2.fp16[j]
    ELSE IF *zeroing*:
        DEST.fp16[j] := 0
    // else dest.fp16[j] remains unchanged
DEST[MAXVL-1:VL] := 0
</pre>
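<p>The register-form pseudocode above can be sketched as a small software model. This is an illustration only, not Intel's reference implementation: the names <code>to_fp16</code> and <code>vsubph_reg</code> are hypothetical, lanes are held as Python floats, the rounding mode is assumed to be round-to-nearest-even (other <code>MXCSR.RC</code> or <code>EVEX.RC</code> settings are not modeled), and the zeroing of <code>DEST[MAXVL-1:VL]</code> is left out.</p>

```python
import math
import struct


def to_fp16(x):
    """Round a Python float to the nearest binary16 value (ties to even)."""
    try:
        return struct.unpack("e", struct.pack("e", x))[0]
    except OverflowError:
        # struct refuses values that round past the largest finite FP16,
        # so the model returns a signed infinity in that case.
        return math.copysign(math.inf, x)


def vsubph_reg(dest, src1, src2, k1=None, zeroing=False):
    """Hypothetical model of the register form: bit j of k1 gates lane j."""
    out = list(dest)
    for j in range(len(src1)):
        if k1 is None or (k1 >> j) % 2 == 1:
            out[j] = to_fp16(src1[j] - src2[j])
        elif zeroing:
            out[j] = 0.0
        # otherwise merging-masking leaves out[j] as it was in dest
    return out


old = [9.0] * 8                     # previous contents of an 8-lane (128-bit) destination
a = [float(i) for i in range(8)]    # SRC1
b = [0.5] * 8                       # SRC2
merged = vsubph_reg(old, a, b, k1=0b00001111)
zeroed = vsubph_reg(old, a, b, k1=0b00001111, zeroing=True)
```

<p>With the mask <code>0b00001111</code>, the low four lanes of both results hold <code>SRC1 - SRC2</code>; the upper four lanes keep the old destination value under merging-masking and become zero under zeroing-masking.</p>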
<h4 id="vsubph--evex-encoded-versions--when-src2-operand-is-a-memory-source">VSUBPH (EVEX encoded versions) when src2 operand is a memory source<a class="anchor" href="#vsubph--evex-encoded-versions--when-src2-operand-is-a-memory-source">¶</a></h4>
<pre>VL = 128, 256 or 512
KL := VL/16
FOR j := 0 TO KL-1:
    IF k1[j] OR *no writemask*:
        IF EVEX.b = 1:
            DEST.fp16[j] := SRC1.fp16[j] - SRC2.fp16[0]
        ELSE:
            DEST.fp16[j] := SRC1.fp16[j] - SRC2.fp16[j]
    ELSE IF *zeroing*:
        DEST.fp16[j] := 0
    // else dest.fp16[j] remains unchanged
DEST[MAXVL-1:VL] := 0
</pre>
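<p>The effect of the <code>EVEX.b</code> broadcast bit in the memory form can be illustrated with a minimal sketch. The names <code>to_fp16</code> and <code>vsubph_mem</code> are hypothetical, the <code>bcst</code> flag stands in for <code>EVEX.b = 1</code>, masking is omitted for brevity, and round-to-nearest-even is assumed.</p>

```python
import math
import struct


def to_fp16(x):
    """Round a Python float to the nearest binary16 value (ties to even)."""
    try:
        return struct.unpack("e", struct.pack("e", x))[0]
    except OverflowError:
        # Values that round past the largest finite FP16 become infinity.
        return math.copysign(math.inf, x)


def vsubph_mem(src1, mem, bcst=False):
    """Hypothetical model of the memory form without a writemask.

    With bcst=True only mem[0] is read and subtracted from every lane,
    mirroring SRC2.fp16[0] in the pseudocode.
    """
    return [
        to_fp16(src1[j] - (mem[0] if bcst else mem[j]))
        for j in range(len(src1))
    ]


a = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
full = vsubph_mem(a, [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
bcast = vsubph_mem(a, [1.0], bcst=True)
```

<p>Here <code>full</code> subtracts element by element, while <code>bcast</code> subtracts the single 16-bit memory value <code>1.0</code> from all eight lanes.</p>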
<h3 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">¶</a></h3>
<pre>VSUBPH __m128h _mm_mask_sub_ph (__m128h src, __mmask8 k, __m128h a, __m128h b);
</pre>
<pre>VSUBPH __m128h _mm_maskz_sub_ph (__mmask8 k, __m128h a, __m128h b);
</pre>
<pre>VSUBPH __m128h _mm_sub_ph (__m128h a, __m128h b);
</pre>
<pre>VSUBPH __m256h _mm256_mask_sub_ph (__m256h src, __mmask16 k, __m256h a, __m256h b);
</pre>
<pre>VSUBPH __m256h _mm256_maskz_sub_ph (__mmask16 k, __m256h a, __m256h b);
</pre>
<pre>VSUBPH __m256h _mm256_sub_ph (__m256h a, __m256h b);
</pre>
<pre>VSUBPH __m512h _mm512_mask_sub_ph (__m512h src, __mmask32 k, __m512h a, __m512h b);
</pre>
<pre>VSUBPH __m512h _mm512_maskz_sub_ph (__mmask32 k, __m512h a, __m512h b);
</pre>
<pre>VSUBPH __m512h _mm512_sub_ph (__m512h a, __m512h b);
</pre>
<pre>VSUBPH __m512h _mm512_mask_sub_round_ph (__m512h src, __mmask32 k, __m512h a, __m512h b, int rounding);
</pre>
<pre>VSUBPH __m512h _mm512_maskz_sub_round_ph (__mmask32 k, __m512h a, __m512h b, int rounding);
</pre>
<pre>VSUBPH __m512h _mm512_sub_round_ph (__m512h a, __m512h b, int rounding);
</pre>
<h3 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">¶</a></h3>
<p>Invalid, Underflow, Overflow, Precision, Denormal.</p>
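<p>As a rough illustration of when some of these conditions arise for a single FP16 lane, the sketch below classifies one subtraction. It is a simplified model under stated assumptions, not the architectural definition: <code>classify_sub</code> and <code>to_fp16</code> are hypothetical names, rounding is assumed to be round-to-nearest-even, signaling NaNs and the underflow condition are not modeled, and whether a condition is delivered as an exception or only recorded as a flag depends on the MXCSR mask bits, which are ignored here.</p>

```python
import math
import struct

FP16_MIN_NORMAL = 2.0 ** -14   # smallest normal binary16 magnitude


def to_fp16(x):
    """Round a Python float to the nearest binary16 value (ties to even)."""
    try:
        return struct.unpack("e", struct.pack("e", x))[0]
    except OverflowError:
        # Values that round past the largest finite FP16 become infinity.
        return math.copysign(math.inf, x)


def classify_sub(a, b):
    """Return (result, conditions) for one lane; a and b must be FP16 values."""
    conditions = set()
    for v in (a, b):
        if FP16_MIN_NORMAL > abs(v) > 0.0:
            conditions.add("denormal")       # subnormal input operand
    if math.isnan(a) or math.isnan(b):
        return math.nan, conditions          # quiet-NaN propagation only
    if math.isinf(a) and a == b:
        conditions.add("invalid")            # infinity minus same-signed infinity
        return math.nan, conditions
    exact = a - b                            # exact in double precision for FP16 inputs
    result = to_fp16(exact)
    if math.isinf(result) and not math.isinf(exact):
        conditions.add("overflow")
    if result != exact:
        conditions.add("precision")          # result had to be rounded
    return result, conditions


overflowed = classify_sub(65504.0, -65504.0)
rounded = classify_sub(2048.0, -1.0)
invalid = classify_sub(math.inf, math.inf)
clean = classify_sub(1.5, 0.5)
```

<p>In this model, subtracting the most negative finite FP16 value from the largest one overflows to infinity, <code>2048 - (-1)</code> is inexact because 2049 is not representable in FP16 and rounds to 2048, and <code>1.5 - 0.5</code> raises no condition.</p>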
<h3 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">¶</a></h3>
<p>EVEX-encoded instruction, see <span class="not-imported">Table 2-46</span>, “Type E2 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developer’s Manual</a> for anything serious.
</p></footer></body></html>