ia32-64/x86/vaddph.html
2025-07-08 02:23:29 -03:00

131 lines
5.8 KiB
HTML
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>VADDPH
— Add Packed FP16 Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>VADDPH
— Add Packed FP16 Values</h1>
<table>
<tr>
<th> Instruction En Bit Mode Flag
Support Instruction En Bit Mode Flag
Support 64/32 CPUID Feature Instruction En Bit Mode Flag CPUID Feature Instruction En Bit Mode Flag Op/ 64/32 CPUID Feature Instruction En Bit Mode Flag 64/32 CPUID Feature Instruction En Bit Mode Flag CPUID Feature Instruction En Bit Mode Flag Op/ 64/32 CPUID Feature </th>
<th></th>
<th>Support</th>
<th></th>
<th>Description</th></tr>
<tr>
<td>EVEX.128.NP.MAP5.W0 58 /r VADDPH xmm1{k1}{z}, xmm2, xmm3/m128/m16bcst</td>
<td>A</td>
<td>V/V</td>
<td>AVX512-FP16 AVX512VL</td>
<td>Add packed FP16 value from xmm3/m128/m16bcst to xmm2, and store result in xmm1 subject to writemask k1.</td></tr>
<tr>
<td>EVEX.256.NP.MAP5.W0 58 /r VADDPH ymm1{k1}{z}, ymm2, ymm3/m256/m16bcst</td>
<td>A</td>
<td>V/V</td>
<td>AVX512-FP16 AVX512VL</td>
<td>Add packed FP16 value from ymm3/m256/m16bcst to ymm2, and store result in ymm1 subject to writemask k1.</td></tr>
<tr>
<td>EVEX.512.NP.MAP5.W0 58 /r VADDPH zmm1{k1}{z}, zmm2, zmm3/m512/m16bcst {er}</td>
<td>A</td>
<td>V/V</td>
<td>AVX512-FP16</td>
<td>Add packed FP16 value from zmm3/m512/m16bcst to zmm2, and store result in zmm1 subject to writemask k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>Full</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr></table>
<h3 id="description">Description<a class="anchor" href="#description">
</a></h3>
<p>This instruction adds packed FP16 values from source operands and stores the packed FP16 result in the destination operand. The destination elements are updated according to the writemask.</p>
<h3 id="operation">Operation<a class="anchor" href="#operation">
</a></h3>
<h4 id="vaddph--evex-encoded-versions--when-src2-operand-is-a-register">VADDPH (EVEX Encoded Versions) When SRC2 Operand is a Register<a class="anchor" href="#vaddph--evex-encoded-versions--when-src2-operand-is-a-register">
</a></h4>
<pre>VL = 128, 256 or 512
KL := VL/16
IF (VL = 512) AND (EVEX.b = 1):
SET_RM(EVEX.RC)
ELSE
SET_RM(MXCSR.RC)
FOR j := 0 TO KL-1:
IF k1[j] OR *no writemask*:
DEST.fp16[j] := SRC1.fp16[j] + SRC2.fp16[j]
ELSEIF *zeroing*:
DEST.fp16[j] := 0
// else dest.fp16[j] remains unchanged
DEST[MAXVL-1:VL] := 0
</pre>
<h4 id="vaddph--evex-encoded-versions--when-src2-operand-is-a-memory-source">VADDPH (EVEX Encoded Versions) When SRC2 Operand is a Memory Source<a class="anchor" href="#vaddph--evex-encoded-versions--when-src2-operand-is-a-memory-source">
</a></h4>
<pre>VL = 128, 256 or 512
KL := VL/16
FOR j := 0 TO KL-1:
IF k1[j] OR *no writemask*:
IF EVEX.b = 1:
DEST.fp16[j] := SRC1.fp16[j] + SRC2.fp16[0]
ELSE:
DEST.fp16[j] := SRC1.fp16[j] + SRC2.fp16[j]
ELSE IF *zeroing*:
DEST.fp16[j] := 0
// else dest.fp16[j] remains unchanged
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h3>
<pre>VADDPH __m128h _mm_add_ph (__m128h a, __m128h b);
</pre>
<pre>VADDPH __m128h _mm_mask_add_ph (__m128h src, __mmask8 k, __m128h a, __m128h b);
</pre>
<pre>VADDPH __m128h _mm_maskz_add_ph (__mmask8 k, __m128h a, __m128h b);
</pre>
<pre>VADDPH __m256h _mm256_add_ph (__m256h a, __m256h b);
</pre>
<pre>VADDPH __m256h _mm256_mask_add_ph (__m256h src, __mmask16 k, __m256h a, __m256h b);
</pre>
<pre>VADDPH __m256h _mm256_maskz_add_ph (__mmask16 k, __m256h a, __m256h b);
</pre>
<pre>VADDPH __m512h _mm512_add_ph (__m512h a, __m512h b);
</pre>
<pre>VADDPH __m512h _mm512_add_ph (__m512h a, __m512h b);
</pre>
<pre>VADDPH __m512h _mm512_mask_add_ph (__m512h src, __mmask32 k, __m512h a, __m512h b);
</pre>
<pre>VADDPH __m512h _mm512_maskz_add_ph (__mmask32 k, __m512h a, __m512h b);
</pre>
<pre>VADDPH __m512h _mm512_add_round_ph (__m512h a, __m512h b, int rounding);
</pre>
<pre>VADDPH __m512h _mm512_mask_add_round_ph (__m512h src, __mmask32 k, __m512h a, __m512h b, int rounding);
</pre>
<pre>VADDPH __m512h _mm512_maskz_add_round_ph (__mmask32 k, __m512h a, __m512h b, int rounding);
</pre>
<h3 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h3>
<p>Invalid, Underflow, Overflow, Precision, Denormal.</p>
<h3 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h3>
<p>EVEX-encoded instructions, see <span class="not-imported">Table 2-46</span>, “Type E2 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>