ia32-64/x86/gf2p8mulb.html
2025-07-08 02:23:29 -03:00

162 lines
6.6 KiB
HTML
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>GF2P8MULB
— Galois Field Multiply Bytes</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>GF2P8MULB
— Galois Field Multiply Bytes</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>66 0F38 CF /r GF2P8MULB xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>GFNI</td>
<td>Multiplies elements in the finite field GF(2^8).</td></tr>
<tr>
<td>VEX.128.66.0F38.W0 CF /r VGF2P8MULB xmm1, xmm2, xmm3/m128</td>
<td>B</td>
<td>V/V</td>
<td>AVX GFNI</td>
<td>Multiplies elements in the finite field GF(2^8).</td></tr>
<tr>
<td>VEX.256.66.0F38.W0 CF /r VGF2P8MULB ymm1, ymm2, ymm3/m256</td>
<td>B</td>
<td>V/V</td>
<td>AVX GFNI</td>
<td>Multiplies elements in the finite field GF(2^8).</td></tr>
<tr>
<td>EVEX.128.66.0F38.W0 CF /r VGF2P8MULB xmm1{k1}{z}, xmm2, xmm3/m128</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL GFNI</td>
<td>Multiplies elements in the finite field GF(2^8).</td></tr>
<tr>
<td>EVEX.256.66.0F38.W0 CF /r VGF2P8MULB ymm1{k1}{z}, ymm2, ymm3/m256</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL GFNI</td>
<td>Multiplies elements in the finite field GF(2^8).</td></tr>
<tr>
<td>EVEX.512.66.0F38.W0 CF /r VGF2P8MULB zmm1{k1}{z}, zmm2, zmm3/m512</td>
<td>C</td>
<td>V/V</td>
<td>AVX512F GFNI</td>
<td>Multiplies elements in the finite field GF(2^8).</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr>
<tr>
<td>C</td>
<td>Full Mem</td>
<td>ModRM:reg (w)</td>
<td>EVEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr></table>
<h3 id="description">Description<a class="anchor" href="#description">
</a></h3>
<p>The instruction multiplies elements in the finite field GF(2<sup>8</sup>), operating on a byte (field element) in the first source operand and the corresponding byte in a second source operand. The field GF(2<sup>8</sup>) is represented in polynomial representation with the reduction polynomial x<sup>8</sup> + x<sup>4</sup> + x<sup>3</sup> + x + 1.</p>
<p>This instruction does not support broadcasting.</p>
<p>The EVEX encoded form of this instruction supports memory fault suppression. The SSE encoded forms of the instruction require16B alignment on their memory operations.</p>
<h3 id="operation">Operation<a class="anchor" href="#operation">
</a></h3>
<pre>define gf2p8mul_byte(src1byte, src2byte):
tword := 0
FOR i := 0 to 7:
IF src2byte.bit[i]:
tword := tword XOR (src1byte&lt;&lt; i)
* carry out polynomial reduction by the characteristic polynomial p*
FOR i := 14 downto 8:
p := 0x11B &lt;&lt; (i-8)
*0x11B = 0000_0001_0001_1011 in binary*
IF tword.bit[i]:
tword := tword XOR p
return tword.byte[0]
</pre>
<h4 id="vgf2p8mulb-dest--src1--src2--evex-encoded-version-">VGF2P8MULB dest, src1, src2 (EVEX Encoded Version)<a class="anchor" href="#vgf2p8mulb-dest--src1--src2--evex-encoded-version-">
</a></h4>
<pre>(KL, VL) = (16, 128), (32, 256), (64, 512)
FOR j := 0 TO KL-1:
IF k1[j] OR *no writemask*:
DEST.byte[j] := gf2p8mul_byte(SRC1.byte[j], SRC2.byte[j])
ELSE iF *zeroing*:
DEST.byte[j] := 0
* ELSE DEST.byte[j] remains unchanged*
DEST[MAX_VL-1:VL] := 0
</pre>
<h4 id="vgf2p8mulb-dest--src1--src2--128b-and-256b-vex-encoded-versions-">VGF2P8MULB dest, src1, src2 (128b and 256b VEX Encoded Versions)<a class="anchor" href="#vgf2p8mulb-dest--src1--src2--128b-and-256b-vex-encoded-versions-">
</a></h4>
<pre>(KL, VL) = (16, 128), (32, 256)
FOR j := 0 TO KL-1:
DEST.byte[j] := gf2p8mul_byte(SRC1.byte[j], SRC2.byte[j])
DEST[MAX_VL-1:VL] := 0
</pre>
<h4 id="gf2p8mulb-srcdest--src1--128b-sse-encoded-version-">GF2P8MULB srcdest, src1 (128b SSE Encoded Version)<a class="anchor" href="#gf2p8mulb-srcdest--src1--128b-sse-encoded-version-">
</a></h4>
<pre>FOR j := 0 TO 15:
SRCDEST.byte[j] :=gf2p8mul_byte(SRCDEST.byte[j], SRC1.byte[j])
</pre>
<h3 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h3>
<pre>(V)GF2P8MULB __m128i _mm_gf2p8mul_epi8(__m128i, __m128i);
</pre>
<pre>(V)GF2P8MULB __m128i _mm_mask_gf2p8mul_epi8(__m128i, __mmask16, __m128i, __m128i);
</pre>
<pre>(V)GF2P8MULB __m128i _mm_maskz_gf2p8mul_epi8(__mmask16, __m128i, __m128i);
</pre>
<pre>VGF2P8MULB __m256i _mm256_gf2p8mul_epi8(__m256i, __m256i);
</pre>
<pre>VGF2P8MULB __m256i _mm256_mask_gf2p8mul_epi8(__m256i, __mmask32, __m256i, __m256i);
</pre>
<pre>VGF2P8MULB __m256i _mm256_maskz_gf2p8mul_epi8(__mmask32, __m256i, __m256i);
</pre>
<pre>VGF2P8MULB __m512i _mm512_gf2p8mul_epi8(__m512i, __m512i);
</pre>
<pre>VGF2P8MULB __m512i _mm512_mask_gf2p8mul_epi8(__m512i, __mmask64, __m512i, __m512i);
</pre>
<pre>VGF2P8MULB __m512i _mm512_maskz_gf2p8mul_epi8(__mmask64, __m512i, __m512i);
</pre>
<h3 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h3>
<p>None.</p>
<h3 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h3>
<p>Legacy-encoded and VEX-encoded: See <span class="not-imported">Table 2-21</span>, “Type 4 Class Exception Conditions.”</p>
<p>EVEX-encoded: See <span class="not-imported">Table 2-49</span>, “Type E4 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>