<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>VPMOVQB/VPMOVSQB/VPMOVUSQB
— Down Convert QWord to Byte</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>VPMOVQB/VPMOVSQB/VPMOVUSQB
— Down Convert QWord to Byte</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op / En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>EVEX.128.F3.0F38.W0 32 /r VPMOVQB xmm1/m16 {k1}{z}, xmm2</td>
<td>A</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Converts 2 packed quad-word integers from xmm2 into 2 packed byte integers in xmm1/m16 with truncation under writemask k1.</td></tr>
<tr>
<td>EVEX.128.F3.0F38.W0 22 /r VPMOVSQB xmm1/m16 {k1}{z}, xmm2</td>
<td>A</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Converts 2 packed signed quad-word integers from xmm2 into 2 packed signed byte integers in xmm1/m16 using signed saturation under writemask k1.</td></tr>
<tr>
<td>EVEX.128.F3.0F38.W0 12 /r VPMOVUSQB xmm1/m16 {k1}{z}, xmm2</td>
<td>A</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Converts 2 packed unsigned quad-word integers from xmm2 into 2 packed unsigned byte integers in xmm1/m16 using unsigned saturation under writemask k1.</td></tr>
<tr>
<td>EVEX.256.F3.0F38.W0 32 /r VPMOVQB xmm1/m32 {k1}{z}, ymm2</td>
<td>A</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Converts 4 packed quad-word integers from ymm2 into 4 packed byte integers in xmm1/m32 with truncation under writemask k1.</td></tr>
<tr>
<td>EVEX.256.F3.0F38.W0 22 /r VPMOVSQB xmm1/m32 {k1}{z}, ymm2</td>
<td>A</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Converts 4 packed signed quad-word integers from ymm2 into 4 packed signed byte integers in xmm1/m32 using signed saturation under writemask k1.</td></tr>
<tr>
<td>EVEX.256.F3.0F38.W0 12 /r VPMOVUSQB xmm1/m32 {k1}{z}, ymm2</td>
<td>A</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Converts 4 packed unsigned quad-word integers from ymm2 into 4 packed unsigned byte integers in xmm1/m32 using unsigned saturation under writemask k1.</td></tr>
<tr>
<td>EVEX.512.F3.0F38.W0 32 /r VPMOVQB xmm1/m64 {k1}{z}, zmm2</td>
<td>A</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Converts 8 packed quad-word integers from zmm2 into 8 packed byte integers in xmm1/m64 with truncation under writemask k1.</td></tr>
<tr>
<td>EVEX.512.F3.0F38.W0 22 /r VPMOVSQB xmm1/m64 {k1}{z}, zmm2</td>
<td>A</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Converts 8 packed signed quad-word integers from zmm2 into 8 packed signed byte integers in xmm1/m64 using signed saturation under writemask k1.</td></tr>
<tr>
<td>EVEX.512.F3.0F38.W0 12 /r VPMOVUSQB xmm1/m64 {k1}{z}, zmm2</td>
<td>A</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Converts 8 packed unsigned quad-word integers from zmm2 into 8 packed unsigned byte integers in xmm1/m64 using unsigned saturation under writemask k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>Eighth Mem</td>
<td>ModRM:r/m (w)</td>
<td>ModRM:reg (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h3 id="description">Description<a class="anchor" href="#description">
</a></h3>
<p>VPMOVQB down converts 64-bit integer elements in the source operand (the second operand) into packed byte elements using truncation. VPMOVSQB converts signed 64-bit integers into packed signed bytes using signed saturation. VPMOVUSQB converts unsigned quad-word values into unsigned byte values using unsigned saturation. The source operand is an XMM, YMM, or ZMM register. The destination operand is an XMM register or a memory location.</p>
<p>Down-converted byte elements are written to the destination operand (the first operand) from the least-significant byte. Byte elements of the destination operand are updated according to the writemask. Bits (MAXVL-1:64) of the destination are zeroed.</p>
<p>EVEX.vvvv is reserved and must be 1111b; otherwise instructions will #UD.</p>
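<p>As a quick illustration (not SDM text), the following sketch uses the 128-bit intrinsics listed below to contrast the three behaviors on values that do not fit in a byte. It assumes a compiler targeting AVX512F and AVX512VL (e.g., gcc -O2 -mavx512f -mavx512vl); expected results are shown in the comments.</p>
<pre>#include &lt;immintrin.h&gt;
#include &lt;stdint.h&gt;
#include &lt;stdio.h&gt;

int main(void) {
    /* Two qwords: 511 (0x1FF, too wide for a byte) and -1. */
    __m128i q = _mm_set_epi64x(-1, 511);
    __m128i t = _mm_cvtepi64_epi8(q);   /* VPMOVQB: truncation */
    __m128i s = _mm_cvtsepi64_epi8(q);  /* VPMOVSQB: signed saturation */
    __m128i u = _mm_cvtusepi64_epi8(q); /* VPMOVUSQB: unsigned saturation */
    /* Byte 0 holds the conversion of 511, byte 1 that of -1. */
    printf("trunc: %d %d\n", (int8_t)_mm_extract_epi8(t, 0), (int8_t)_mm_extract_epi8(t, 1));   /* -1 -1 */
    printf("s-sat: %d %d\n", (int8_t)_mm_extract_epi8(s, 0), (int8_t)_mm_extract_epi8(s, 1));   /* 127 -1 */
    printf("u-sat: %u %u\n", (uint8_t)_mm_extract_epi8(u, 0), (uint8_t)_mm_extract_epi8(u, 1)); /* 255 255 */
    return 0;
}
</pre>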
<h3 id="operation">Operation<a class="anchor" href="#operation">
</a></h3>
<h4 id="vpmovqb-instruction--evex-encoded-versions--when-dest-is-a-register">VPMOVQB instruction (EVEX encoded versions) when dest is a register<a class="anchor" href="#vpmovqb-instruction--evex-encoded-versions--when-dest-is-a-register">
</a></h4>
<pre>(KL, VL) = (2, 128), (4, 256), (8, 512)
FOR j := 0 TO KL-1
    i := j * 8
    m := j * 64
    IF k1[j] OR *no writemask*
        THEN DEST[i+7:i] := TruncateQuadWordToByte (SRC[m+63:m])
        ELSE
            IF *merging-masking* ; merging-masking
                THEN *DEST[i+7:i] remains unchanged*
                ELSE *zeroing-masking* ; zeroing-masking
                    DEST[i+7:i] := 0
            FI
    FI;
ENDFOR
DEST[MAXVL-1:VL/8] := 0;
</pre>
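<p>The helper TruncateQuadWordToByte above keeps only the low 8 bits of each source qword. A minimal plain-C model (the name follows the pseudocode; the C signature is our assumption) is:</p>
<pre>#include &lt;stdint.h&gt;

/* Keep the low 8 bits of the 64-bit source; discard bits 63:8. */
static uint8_t TruncateQuadWordToByte(uint64_t src) {
    return (uint8_t)(src &amp; 0xFF);
}
</pre>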
<h4 id="vpmovqb-instruction--evex-encoded-versions--when-dest-is-memory">VPMOVQB instruction (EVEX encoded versions) when dest is memory<a class="anchor" href="#vpmovqb-instruction--evex-encoded-versions--when-dest-is-memory">
</a></h4>
<pre>(KL, VL) = (2, 128), (4, 256), (8, 512)
FOR j := 0 TO KL-1
    i := j * 8
    m := j * 64
    IF k1[j] OR *no writemask*
        THEN DEST[i+7:i] := TruncateQuadWordToByte (SRC[m+63:m])
        ELSE *DEST[i+7:i] remains unchanged* ; merging-masking
    FI;
ENDFOR
</pre>
<h4 id="vpmovsqb-instruction--evex-encoded-versions--when-dest-is-a-register">VPMOVSQB instruction (EVEX encoded versions) when dest is a register<a class="anchor" href="#vpmovsqb-instruction--evex-encoded-versions--when-dest-is-a-register">
</a></h4>
<pre>(KL, VL) = (2, 128), (4, 256), (8, 512)
FOR j := 0 TO KL-1
    i := j * 8
    m := j * 64
    IF k1[j] OR *no writemask*
        THEN DEST[i+7:i] := SaturateSignedQuadWordToByte (SRC[m+63:m])
        ELSE
            IF *merging-masking* ; merging-masking
                THEN *DEST[i+7:i] remains unchanged*
                ELSE *zeroing-masking* ; zeroing-masking
                    DEST[i+7:i] := 0
            FI
    FI;
ENDFOR
DEST[MAXVL-1:VL/8] := 0;
</pre>
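<p>SaturateSignedQuadWordToByte clamps each signed qword to the signed byte range [-128, 127]. A minimal plain-C model (signature assumed) is:</p>
<pre>#include &lt;stdint.h&gt;

/* Clamp a signed 64-bit value to the signed byte range. */
static int8_t SaturateSignedQuadWordToByte(int64_t src) {
    if (src &gt; INT8_MAX) return INT8_MAX;
    if (src &lt; INT8_MIN) return INT8_MIN;
    return (int8_t)src;
}
</pre>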
<h4 id="vpmovsqb-instruction--evex-encoded-versions--when-dest-is-memory">VPMOVSQB instruction (EVEX encoded versions) when dest is memory<a class="anchor" href="#vpmovsqb-instruction--evex-encoded-versions--when-dest-is-memory">
</a></h4>
<pre>(KL, VL) = (2, 128), (4, 256), (8, 512)
FOR j := 0 TO KL-1
    i := j * 8
    m := j * 64
    IF k1[j] OR *no writemask*
        THEN DEST[i+7:i] := SaturateSignedQuadWordToByte (SRC[m+63:m])
        ELSE *DEST[i+7:i] remains unchanged* ; merging-masking
    FI;
ENDFOR
</pre>
<h4 id="vpmovusqb-instruction--evex-encoded-versions--when-dest-is-a-register">VPMOVUSQB instruction (EVEX encoded versions) when dest is a register<a class="anchor" href="#vpmovusqb-instruction--evex-encoded-versions--when-dest-is-a-register">
</a></h4>
<pre>(KL, VL) = (2, 128), (4, 256), (8, 512)
FOR j := 0 TO KL-1
    i := j * 8
    m := j * 64
    IF k1[j] OR *no writemask*
        THEN DEST[i+7:i] := SaturateUnsignedQuadWordToByte (SRC[m+63:m])
        ELSE
            IF *merging-masking* ; merging-masking
                THEN *DEST[i+7:i] remains unchanged*
                ELSE *zeroing-masking* ; zeroing-masking
                    DEST[i+7:i] := 0
            FI
    FI;
ENDFOR
DEST[MAXVL-1:VL/8] := 0;
</pre>
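<p>SaturateUnsignedQuadWordToByte clamps each unsigned qword to the unsigned byte range [0, 255]. A minimal plain-C model (signature assumed) is:</p>
<pre>#include &lt;stdint.h&gt;

/* Clamp an unsigned 64-bit value to the unsigned byte range. */
static uint8_t SaturateUnsignedQuadWordToByte(uint64_t src) {
    return (src &gt; UINT8_MAX) ? UINT8_MAX : (uint8_t)src;
}
</pre>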
<h4 id="vpmovusqb-instruction--evex-encoded-versions--when-dest-is-memory">VPMOVUSQB instruction (EVEX encoded versions) when dest is memory<a class="anchor" href="#vpmovusqb-instruction--evex-encoded-versions--when-dest-is-memory">
</a></h4>
<pre>(KL, VL) = (2, 128), (4, 256), (8, 512)
FOR j := 0 TO KL-1
    i := j * 8
    m := j * 64
    IF k1[j] OR *no writemask*
        THEN DEST[i+7:i] := SaturateUnsignedQuadWordToByte (SRC[m+63:m])
        ELSE *DEST[i+7:i] remains unchanged* ; merging-masking
    FI;
ENDFOR
</pre>
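<p>Note that the memory forms store only VL/8 bytes (2, 4, or 8) and, as the pseudocode above shows, support merging-masking only: masked-off bytes in memory are left untouched. A sketch of the 512-bit masked store, assuming an AVX512F target and using the _mm512_mask_cvtepi64_storeu_epi8 intrinsic listed below:</p>
<pre>#include &lt;immintrin.h&gt;
#include &lt;stdint.h&gt;
#include &lt;stdio.h&gt;
#include &lt;string.h&gt;

int main(void) {
    uint8_t buf[8];
    memset(buf, 0xAA, sizeof buf);  /* pre-existing memory contents */
    __m512i v = _mm512_set_epi64(7, 6, 5, 4, 3, 2, 1, 0);
    /* k1 = 0b00001111: bytes 0..3 are stored, bytes 4..7 keep 0xAA. */
    _mm512_mask_cvtepi64_storeu_epi8(buf, 0x0F, v);
    for (int i = 0; i &lt; 8; i++)
        printf("%02X ", buf[i]);    /* 00 01 02 03 AA AA AA AA */
    printf("\n");
    return 0;
}
</pre>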
<h3 id="intel-c-c++-compiler-intrinsic-equivalents">Intel C/C++ Compiler Intrinsic Equivalents<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalents">
</a></h3>
<pre>VPMOVQB __m128i _mm512_cvtepi64_epi8( __m512i a);
</pre>
<pre>VPMOVQB __m128i _mm512_mask_cvtepi64_epi8(__m128i s, __mmask8 k, __m512i a);
</pre>
<pre>VPMOVQB __m128i _mm512_maskz_cvtepi64_epi8( __mmask8 k, __m512i a);
</pre>
<pre>VPMOVQB void _mm512_mask_cvtepi64_storeu_epi8(void * d, __mmask8 k, __m512i a);
</pre>
<pre>VPMOVSQB __m128i _mm512_cvtsepi64_epi8( __m512i a);
</pre>
<pre>VPMOVSQB __m128i _mm512_mask_cvtsepi64_epi8(__m128i s, __mmask8 k, __m512i a);
</pre>
<pre>VPMOVSQB __m128i _mm512_maskz_cvtsepi64_epi8( __mmask8 k, __m512i a);
</pre>
<pre>VPMOVSQB void _mm512_mask_cvtsepi64_storeu_epi8(void * d, __mmask8 k, __m512i a);
</pre>
<pre>VPMOVUSQB __m128i _mm512_cvtusepi64_epi8( __m512i a);
</pre>
<pre>VPMOVUSQB __m128i _mm512_mask_cvtusepi64_epi8(__m128i s, __mmask8 k, __m512i a);
</pre>
<pre>VPMOVUSQB __m128i _mm512_maskz_cvtusepi64_epi8( __mmask8 k, __m512i a);
</pre>
<pre>VPMOVUSQB void _mm512_mask_cvtusepi64_storeu_epi8(void * d, __mmask8 k, __m512i a);
</pre>
<pre>VPMOVUSQB __m128i _mm256_cvtusepi64_epi8(__m256i a);
</pre>
<pre>VPMOVUSQB __m128i _mm256_mask_cvtusepi64_epi8(__m128i a, __mmask8 k, __m256i b);
</pre>
<pre>VPMOVUSQB __m128i _mm256_maskz_cvtusepi64_epi8( __mmask8 k, __m256i b);
</pre>
<pre>VPMOVUSQB void _mm256_mask_cvtusepi64_storeu_epi8(void * d, __mmask8 k, __m256i b);
</pre>
<pre>VPMOVUSQB __m128i _mm_cvtusepi64_epi8(__m128i a);
</pre>
<pre>VPMOVUSQB __m128i _mm_mask_cvtusepi64_epi8(__m128i a, __mmask8 k, __m128i b);
</pre>
<pre>VPMOVUSQB __m128i _mm_maskz_cvtusepi64_epi8( __mmask8 k, __m128i b);
</pre>
<pre>VPMOVUSQB void _mm_mask_cvtusepi64_storeu_epi8(void * d, __mmask8 k, __m128i b);
</pre>
<pre>VPMOVSQB __m128i _mm256_cvtsepi64_epi8(__m256i a);
</pre>
<pre>VPMOVSQB __m128i _mm256_mask_cvtsepi64_epi8(__m128i a, __mmask8 k, __m256i b);
</pre>
<pre>VPMOVSQB __m128i _mm256_maskz_cvtsepi64_epi8( __mmask8 k, __m256i b);
</pre>
<pre>VPMOVSQB void _mm256_mask_cvtsepi64_storeu_epi8(void * d, __mmask8 k, __m256i b);
</pre>
<pre>VPMOVSQB __m128i _mm_cvtsepi64_epi8(__m128i a);
</pre>
<pre>VPMOVSQB __m128i _mm_mask_cvtsepi64_epi8(__m128i a, __mmask8 k, __m128i b);
</pre>
<pre>VPMOVSQB __m128i _mm_maskz_cvtsepi64_epi8( __mmask8 k, __m128i b);
</pre>
<pre>VPMOVSQB void _mm_mask_cvtsepi64_storeu_epi8(void * d, __mmask8 k, __m128i b);
</pre>
<pre>VPMOVQB __m128i _mm256_cvtepi64_epi8(__m256i a);
</pre>
<pre>VPMOVQB __m128i _mm256_mask_cvtepi64_epi8(__m128i a, __mmask8 k, __m256i b);
</pre>
<pre>VPMOVQB __m128i _mm256_maskz_cvtepi64_epi8( __mmask8 k, __m256i b);
</pre>
<pre>VPMOVQB void _mm256_mask_cvtepi64_storeu_epi8(void * d, __mmask8 k, __m256i b);
</pre>
<pre>VPMOVQB __m128i _mm_cvtepi64_epi8(__m128i a);
</pre>
<pre>VPMOVQB __m128i _mm_mask_cvtepi64_epi8(__m128i a, __mmask8 k, __m128i b);
</pre>
<pre>VPMOVQB __m128i _mm_maskz_cvtepi64_epi8( __mmask8 k, __m128i b);
</pre>
<pre>VPMOVQB void _mm_mask_cvtepi64_storeu_epi8(void * d, __mmask8 k, __m128i b);
</pre>
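<p>With a register destination, the _mm512_mask_ (merging) and _mm512_maskz_ (zeroing) variants above differ only in what happens to masked-off byte lanes. A short sketch (the dump helper is ours, not an Intel API):</p>
<pre>#include &lt;immintrin.h&gt;
#include &lt;stdint.h&gt;
#include &lt;stdio.h&gt;

/* Print the low 8 bytes of an XMM register. */
static void dump(const char *tag, __m128i v) {
    uint8_t b[16];
    _mm_storeu_si128((__m128i *)b, v);
    printf("%s:", tag);
    for (int i = 0; i &lt; 8; i++) printf(" %02X", b[i]);
    printf("\n");
}

int main(void) {
    __m512i a = _mm512_set1_epi64(300);  /* 300 &gt; 127: signed saturation gives 0x7F */
    __m128i old = _mm_set1_epi8(0x55);
    dump("merge", _mm512_mask_cvtsepi64_epi8(old, 0xF0, a)); /* 55 55 55 55 7F 7F 7F 7F */
    dump("zero ", _mm512_maskz_cvtsepi64_epi8(0xF0, a));     /* 00 00 00 00 7F 7F 7F 7F */
    return 0;
}
</pre>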
<h3 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h3>
<p>None.</p>
<h3 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h3>
<p>EVEX-encoded instruction, see <span class="not-imported">Table 2-53</span>, “Type E6 Class Exception Conditions.”</p>
<p>Additionally:</p>
<table>
<tr>
<td>#UD</td>
<td>If EVEX.vvvv != 1111B.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
incomplete or broken in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>