<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>VPCMPW/VPCMPUW — Compare Packed Word Values Into Mask</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>VPCMPW/VPCMPUW — Compare Packed Word Values Into Mask</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>EVEX.128.66.0F3A.W1 3F /r ib VPCMPW k1 {k2}, xmm2, xmm3/m128, imm8</td>
<td>A</td>
<td>V/V</td>
<td>AVX512VL AVX512BW</td>
<td>Compare packed signed word integers in xmm3/m128 and xmm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1.</td></tr>
<tr>
<td>EVEX.256.66.0F3A.W1 3F /r ib VPCMPW k1 {k2}, ymm2, ymm3/m256, imm8</td>
<td>A</td>
<td>V/V</td>
<td>AVX512VL AVX512BW</td>
<td>Compare packed signed word integers in ymm3/m256 and ymm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1.</td></tr>
<tr>
<td>EVEX.512.66.0F3A.W1 3F /r ib VPCMPW k1 {k2}, zmm2, zmm3/m512, imm8</td>
<td>A</td>
<td>V/V</td>
<td>AVX512BW</td>
<td>Compare packed signed word integers in zmm3/m512 and zmm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1.</td></tr>
<tr>
<td>EVEX.128.66.0F3A.W1 3E /r ib VPCMPUW k1 {k2}, xmm2, xmm3/m128, imm8</td>
<td>A</td>
<td>V/V</td>
<td>AVX512VL AVX512BW</td>
<td>Compare packed unsigned word integers in xmm3/m128 and xmm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1.</td></tr>
<tr>
<td>EVEX.256.66.0F3A.W1 3E /r ib VPCMPUW k1 {k2}, ymm2, ymm3/m256, imm8</td>
<td>A</td>
<td>V/V</td>
<td>AVX512VL AVX512BW</td>
<td>Compare packed unsigned word integers in ymm3/m256 and ymm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1.</td></tr>
<tr>
<td>EVEX.512.66.0F3A.W1 3E /r ib VPCMPUW k1 {k2}, zmm2, zmm3/m512, imm8</td>
<td>A</td>
<td>V/V</td>
<td>AVX512BW</td>
<td>Compare packed unsigned word integers in zmm3/m512 and zmm2 using bits 2:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
¶
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>Full Mem</td>
<td>ModRM:reg (w)</td>
<td>EVEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr></table>
<h3 id="description">Description<a class="anchor" href="#description">
¶
</a></h3>
<p>Performs a SIMD compare of the packed integer words in the second source operand and the first source operand and returns the results of the comparison to the mask destination operand. The comparison predicate operand (immediate byte) specifies the type of comparison performed on each pair of packed values in the two source operands. The result of each comparison is a single mask bit: 1 (comparison true) or 0 (comparison false).</p>
<p>VPCMPW performs a comparison between pairs of signed word values.</p>
<p>VPCMPUW performs a comparison between pairs of unsigned word values.</p>
<p>The first source operand (second operand) is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register or a 512/256/128-bit memory location. The destination operand (first operand) is a mask register k1. Up to 32/16/8 comparisons are performed with results written to the destination operand under the writemask k2.</p>
<p>The comparison predicate operand is an 8-bit immediate: bits 2:0 define the type of comparison to be performed. Bits 3 through 7 of the immediate are reserved. Compilers can implement the pseudo-op mnemonics listed in Table 5-21.</p>
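<p>The lane counts and the predicate field described above can be sketched in plain C. This is an illustrative model only: the helper names lanes_for_vl and predicate_from_imm8 are hypothetical, and ignoring the reserved bits 7:3 is an assumption of the sketch, not a statement about hardware behavior.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Number of word comparisons for a vector length given in bits:
   128 -> 8, 256 -> 16, 512 -> 32 (one comparison per 16-bit word). */
static unsigned lanes_for_vl(unsigned vl_bits)
{
    assert(vl_bits == 128 || vl_bits == 256 || vl_bits == 512);
    return vl_bits / 16;
}

/* Extract the comparison predicate from the immediate byte (bits 2:0).
   Assumption of this sketch: the reserved bits 7:3 are simply ignored. */
static unsigned predicate_from_imm8(uint8_t imm8)
{
    return imm8 & 0x7u;
}
```

<p>For example, lanes_for_vl(256) yields 16, matching the 16 mask bits produced by the YMM form.</p>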
<h3 id="operation">Operation<a class="anchor" href="#operation">
¶
</a></h3>
<pre>CASE (COMPARISON PREDICATE) OF
    0: OP := EQ;
    1: OP := LT;
    2: OP := LE;
    3: OP := FALSE;
    4: OP := NEQ;
    5: OP := NLT;
    6: OP := NLE;
    7: OP := TRUE;
ESAC;
</pre>
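<p>As a rough illustration, the predicate table above can be written as a C switch over one pair of values. The enum names and the function apply_pred are hypothetical. Operands are widened to int32_t so the same function can serve sign-extended (VPCMPW) and zero-extended (VPCMPUW) words.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names mirroring the CASE table (values are bits 2:0 of imm8). */
enum cmp_pred {
    PRED_EQ = 0, PRED_LT = 1, PRED_LE = 2, PRED_FALSE = 3,
    PRED_NEQ = 4, PRED_NLT = 5, PRED_NLE = 6, PRED_TRUE = 7
};

/* Apply one predicate to one pair of already-widened word values. */
static bool apply_pred(unsigned pred, int32_t a, int32_t b)
{
    assert(pred <= 7);
    switch (pred) {
    case PRED_EQ:    return a == b;
    case PRED_LT:    return a <  b;
    case PRED_LE:    return a <= b;
    case PRED_FALSE: return false;
    case PRED_NEQ:   return a != b;
    case PRED_NLT:   return a >= b;  /* not less-than */
    case PRED_NLE:   return a >  b;  /* not less-than-or-equal */
    default:         return true;    /* PRED_TRUE */
    }
}
```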
<h4 id="vpcmpw--evex-encoded-versions-">VPCMPW (EVEX encoded versions)<a class="anchor" href="#vpcmpw--evex-encoded-versions-">
¶
</a></h4>
<pre>(KL, VL) = (8, 128), (16, 256), (32, 512)
FOR j := 0 TO KL-1
    i := j * 16
    IF k2[j] OR *no writemask*
        THEN
            CMP := SRC1[i+15:i] OP SRC2[i+15:i];
            IF CMP = TRUE
                THEN DEST[j] := 1;
                ELSE DEST[j] := 0; FI;
        ELSE DEST[j] := 0 ; zeroing-masking only
    FI;
ENDFOR
DEST[MAX_KL-1:KL] := 0
</pre>
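<p>The loop above can be modeled in scalar C as follows. This is a sketch under stated assumptions, not a definitive implementation: the names eval_pred and vpcmpw_model are hypothetical, the sources are passed as arrays of int16_t, and an all-ones k2 stands in for the "no writemask" case.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Evaluate predicate bits 2:0 of imm8 on one pair of values. */
static int eval_pred(unsigned imm8, int32_t a, int32_t b)
{
    switch (imm8 & 7u) {
    case 0:  return a == b;   /* EQ    */
    case 1:  return a <  b;   /* LT    */
    case 2:  return a <= b;   /* LE    */
    case 3:  return 0;        /* FALSE */
    case 4:  return a != b;   /* NEQ   */
    case 5:  return a >= b;   /* NLT   */
    case 6:  return a >  b;   /* NLE   */
    default: return 1;        /* TRUE  */
    }
}

/* Scalar model of VPCMPW. kl is 8, 16 or 32 (word lanes).
   k2 is the writemask; pass 0xFFFFFFFF to model "no writemask". */
static uint32_t vpcmpw_model(unsigned kl, uint32_t k2,
                             const int16_t *src1, const int16_t *src2,
                             uint8_t imm8)
{
    assert(kl == 8 || kl == 16 || kl == 32);
    uint32_t dest = 0;  /* bits KL and above stay zero */
    for (unsigned j = 0; j < kl; j++) {
        /* A lane sets its bit only if enabled in k2 and the compare is
           true; masked-off lanes stay 0 (zeroing-masking). */
        if (((k2 >> j) & 1u) && eval_pred(imm8, src1[j], src2[j]))
            dest |= (uint32_t)1 << j;
    }
    return dest;
}
```

<p>With src1 = {1, -1, 3, 4, -5, 6, 7, 8}, src2 = {1, 1, 2, 4, 5, 7, 7, 0}, kl = 8 and no writemask, predicate 0 (EQ) sets bits 0, 3 and 6, and predicate 1 (LT) sets bits 1, 4 and 5.</p>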
<h4 id="vpcmpuw--evex-encoded-versions-">VPCMPUW (EVEX encoded versions)<a class="anchor" href="#vpcmpuw--evex-encoded-versions-">
¶
</a></h4>
<pre>(KL, VL) = (8, 128), (16, 256), (32, 512)
FOR j := 0 TO KL-1
    i := j * 16
    IF k2[j] OR *no writemask*
        THEN
            CMP := SRC1[i+15:i] OP SRC2[i+15:i];
            IF CMP = TRUE
                THEN DEST[j] := 1;
                ELSE DEST[j] := 0; FI;
        ELSE DEST[j] := 0 ; zeroing-masking only
    FI;
ENDFOR
DEST[MAX_KL-1:KL] := 0
</pre>
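<p>The VPCMPW and VPCMPUW loops are textually identical; they differ only in whether each 16-bit word is read as signed or unsigned. The following scalar sketch shows that difference for the LT predicate on identical bit patterns. The names word_as_signed and cmp_lt_words are hypothetical, and the writemask is omitted for brevity.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Reinterpret 16 raw bits as a signed word without relying on
   implementation-defined narrowing conversions. */
static int32_t word_as_signed(uint16_t bits)
{
    return bits >= 0x8000u ? (int32_t)bits - 0x10000 : (int32_t)bits;
}

/* LT predicate (imm8 = 1) over kl word lanes, no writemask.
   is_signed = 1 models VPCMPW, is_signed = 0 models VPCMPUW. */
static uint32_t cmp_lt_words(unsigned kl, const uint16_t *src1,
                             const uint16_t *src2, int is_signed)
{
    assert(kl == 8 || kl == 16 || kl == 32);
    uint32_t dest = 0;
    for (unsigned j = 0; j < kl; j++) {
        int32_t a = is_signed ? word_as_signed(src1[j]) : (int32_t)src1[j];
        int32_t b = is_signed ? word_as_signed(src2[j]) : (int32_t)src2[j];
        if (a < b)
            dest |= (uint32_t)1 << j;
    }
    return dest;
}
```

<p>For instance, the word 0xFFFF is less than 0x0001 when read as signed (-1 versus 1) but greater when read as unsigned (65535 versus 1), so the two instructions produce different mask bits for that lane.</p>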
<h3 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
¶
</a></h3>
<pre>VPCMPW __mmask32 _mm512_cmp_epi16_mask( __m512i a, __m512i b, int cmp);
</pre>
<pre>VPCMPW __mmask32 _mm512_mask_cmp_epi16_mask( __mmask32 m, __m512i a, __m512i b, int cmp);
</pre>
<pre>VPCMPW __mmask16 _mm256_cmp_epi16_mask( __m256i a, __m256i b, int cmp);
</pre>
<pre>VPCMPW __mmask16 _mm256_mask_cmp_epi16_mask( __mmask16 m, __m256i a, __m256i b, int cmp);
</pre>
<pre>VPCMPW __mmask8 _mm_cmp_epi16_mask( __m128i a, __m128i b, int cmp);
</pre>
<pre>VPCMPW __mmask8 _mm_mask_cmp_epi16_mask( __mmask8 m, __m128i a, __m128i b, int cmp);
</pre>
<pre>VPCMPW __mmask32 _mm512_cmp[eq|ge|gt|le|lt|neq]_epi16_mask( __m512i a, __m512i b);
</pre>
<pre>VPCMPW __mmask32 _mm512_mask_cmp[eq|ge|gt|le|lt|neq]_epi16_mask( __mmask32 m, __m512i a, __m512i b);
</pre>
<pre>VPCMPW __mmask16 _mm256_cmp[eq|ge|gt|le|lt|neq]_epi16_mask( __m256i a, __m256i b);
</pre>
<pre>VPCMPW __mmask16 _mm256_mask_cmp[eq|ge|gt|le|lt|neq]_epi16_mask( __mmask16 m, __m256i a, __m256i b);
</pre>
<pre>VPCMPW __mmask8 _mm_cmp[eq|ge|gt|le|lt|neq]_epi16_mask( __m128i a, __m128i b);
</pre>
<pre>VPCMPW __mmask8 _mm_mask_cmp[eq|ge|gt|le|lt|neq]_epi16_mask( __mmask8 m, __m128i a, __m128i b);
</pre>
<pre>VPCMPUW __mmask32 _mm512_cmp_epu16_mask( __m512i a, __m512i b, int cmp);
</pre>
<pre>VPCMPUW __mmask32 _mm512_mask_cmp_epu16_mask( __mmask32 m, __m512i a, __m512i b, int cmp);
</pre>
<pre>VPCMPUW __mmask16 _mm256_cmp_epu16_mask( __m256i a, __m256i b, int cmp);
</pre>
<pre>VPCMPUW __mmask16 _mm256_mask_cmp_epu16_mask( __mmask16 m, __m256i a, __m256i b, int cmp);
</pre>
<pre>VPCMPUW __mmask8 _mm_cmp_epu16_mask( __m128i a, __m128i b, int cmp);
</pre>
<pre>VPCMPUW __mmask8 _mm_mask_cmp_epu16_mask( __mmask8 m, __m128i a, __m128i b, int cmp);
</pre>
<pre>VPCMPUW __mmask32 _mm512_cmp[eq|ge|gt|le|lt|neq]_epu16_mask( __m512i a, __m512i b);
</pre>
<pre>VPCMPUW __mmask32 _mm512_mask_cmp[eq|ge|gt|le|lt|neq]_epu16_mask( __mmask32 m, __m512i a, __m512i b);
</pre>
<pre>VPCMPUW __mmask16 _mm256_cmp[eq|ge|gt|le|lt|neq]_epu16_mask( __m256i a, __m256i b);
</pre>
<pre>VPCMPUW __mmask16 _mm256_mask_cmp[eq|ge|gt|le|lt|neq]_epu16_mask( __mmask16 m, __m256i a, __m256i b);
</pre>
<pre>VPCMPUW __mmask8 _mm_cmp[eq|ge|gt|le|lt|neq]_epu16_mask( __m128i a, __m128i b);
</pre>
<pre>VPCMPUW __mmask8 _mm_mask_cmp[eq|ge|gt|le|lt|neq]_epu16_mask( __mmask8 m, __m128i a, __m128i b);
</pre>
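<p>The [eq|ge|gt|le|lt|neq] forms carry the predicate in the intrinsic name, while the generic forms take it as the cmp argument. The sketch below maps each suffix to a predicate value from the Operation section, on the assumption that each named form selects the predicate with the same meaning (ge as NLT, gt as NLE). The function predicate_for_suffix is hypothetical and is not part of any intrinsics header.</p>

```c
#include <assert.h>
#include <string.h>

/* Map an intrinsic-name suffix to the predicate value (bits 2:0 of imm8).
   Returns -1 for a suffix outside eq|ge|gt|le|lt|neq. */
static int predicate_for_suffix(const char *suffix)
{
    assert(suffix != NULL);
    if (strcmp(suffix, "eq")  == 0) return 0;  /* EQ  */
    if (strcmp(suffix, "lt")  == 0) return 1;  /* LT  */
    if (strcmp(suffix, "le")  == 0) return 2;  /* LE  */
    if (strcmp(suffix, "neq") == 0) return 4;  /* NEQ */
    if (strcmp(suffix, "ge")  == 0) return 5;  /* NLT: a >= b is not (a < b)  */
    if (strcmp(suffix, "gt")  == 0) return 6;  /* NLE: a > b is not (a <= b) */
    return -1;
}
```

<p>Predicates 3 (FALSE) and 7 (TRUE) have no named form in the list above, so under this reading they are reachable only through the generic forms that take cmp.</p>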
<h3 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
¶
</a></h3>
<p>None</p>
<h3 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
¶
</a></h3>
<p>EVEX-encoded instruction, see Exceptions Type E4.nb in <span class="not-imported">Table 2-49</span>, “Type E4 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developer’s Manual</a> for anything serious.
</p></footer></body></html>