<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>MOVDQA/VMOVDQA32/VMOVDQA64
— Move Aligned Packed Integer Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>MOVDQA/VMOVDQA32/VMOVDQA64
— Move Aligned Packed Integer Values</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>66 0F 6F /r MOVDQA xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>SSE2</td>
<td>Move aligned packed integer values from xmm2/mem to xmm1.</td></tr>
<tr>
<td>66 0F 7F /r MOVDQA xmm2/m128, xmm1</td>
<td>B</td>
<td>V/V</td>
<td>SSE2</td>
<td>Move aligned packed integer values from xmm1 to xmm2/mem.</td></tr>
<tr>
<td>VEX.128.66.0F.WIG 6F /r VMOVDQA xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>AVX</td>
<td>Move aligned packed integer values from xmm2/mem to xmm1.</td></tr>
<tr>
<td>VEX.128.66.0F.WIG 7F /r VMOVDQA xmm2/m128, xmm1</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Move aligned packed integer values from xmm1 to xmm2/mem.</td></tr>
<tr>
<td>VEX.256.66.0F.WIG 6F /r VMOVDQA ymm1, ymm2/m256</td>
<td>A</td>
<td>V/V</td>
<td>AVX</td>
<td>Move aligned packed integer values from ymm2/mem to ymm1.</td></tr>
<tr>
<td>VEX.256.66.0F.WIG 7F /r VMOVDQA ymm2/m256, ymm1</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Move aligned packed integer values from ymm1 to ymm2/mem.</td></tr>
<tr>
<td>EVEX.128.66.0F.W0 6F /r VMOVDQA32 xmm1 {k1}{z}, xmm2/m128</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Move aligned packed doubleword integer values from xmm2/m128 to xmm1 using writemask k1.</td></tr>
<tr>
<td>EVEX.256.66.0F.W0 6F /r VMOVDQA32 ymm1 {k1}{z}, ymm2/m256</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Move aligned packed doubleword integer values from ymm2/m256 to ymm1 using writemask k1.</td></tr>
<tr>
<td>EVEX.512.66.0F.W0 6F /r VMOVDQA32 zmm1 {k1}{z}, zmm2/m512</td>
<td>C</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Move aligned packed doubleword integer values from zmm2/m512 to zmm1 using writemask k1.</td></tr>
<tr>
<td>EVEX.128.66.0F.W0 7F /r VMOVDQA32 xmm2/m128 {k1}{z}, xmm1</td>
<td>D</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Move aligned packed doubleword integer values from xmm1 to xmm2/m128 using writemask k1.</td></tr>
<tr>
<td>EVEX.256.66.0F.W0 7F /r VMOVDQA32 ymm2/m256 {k1}{z}, ymm1</td>
<td>D</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Move aligned packed doubleword integer values from ymm1 to ymm2/m256 using writemask k1.</td></tr>
<tr>
<td>EVEX.512.66.0F.W0 7F /r VMOVDQA32 zmm2/m512 {k1}{z}, zmm1</td>
<td>D</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Move aligned packed doubleword integer values from zmm1 to zmm2/m512 using writemask k1.</td></tr>
<tr>
<td>EVEX.128.66.0F.W1 6F /r VMOVDQA64 xmm1 {k1}{z}, xmm2/m128</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Move aligned packed quadword integer values from xmm2/m128 to xmm1 using writemask k1.</td></tr>
<tr>
<td>EVEX.256.66.0F.W1 6F /r VMOVDQA64 ymm1 {k1}{z}, ymm2/m256</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Move aligned packed quadword integer values from ymm2/m256 to ymm1 using writemask k1.</td></tr>
<tr>
<td>EVEX.512.66.0F.W1 6F /r VMOVDQA64 zmm1 {k1}{z}, zmm2/m512</td>
<td>C</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Move aligned packed quadword integer values from zmm2/m512 to zmm1 using writemask k1.</td></tr>
<tr>
<td>EVEX.128.66.0F.W1 7F /r VMOVDQA64 xmm2/m128 {k1}{z}, xmm1</td>
<td>D</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Move aligned packed quadword integer values from xmm1 to xmm2/m128 using writemask k1.</td></tr>
<tr>
<td>EVEX.256.66.0F.W1 7F /r VMOVDQA64 ymm2/m256 {k1}{z}, ymm1</td>
<td>D</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Move aligned packed quadword integer values from ymm1 to ymm2/m256 using writemask k1.</td></tr>
<tr>
<td>EVEX.512.66.0F.W1 7F /r VMOVDQA64 zmm2/m512 {k1}{z}, zmm1</td>
<td>D</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Move aligned packed quadword integer values from zmm1 to zmm2/m512 using writemask k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>N/A</td>
<td>ModRM:r/m (w)</td>
<td>ModRM:reg (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>C</td>
<td>Full Mem</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>D</td>
<td>Full Mem</td>
<td>ModRM:r/m (w)</td>
<td>ModRM:reg (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Note: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b; otherwise, the instruction will #UD.</p>
<p>EVEX encoded versions:</p>
<p>Moves 128, 256, or 512 bits of packed doubleword/quadword integer values from the source operand (the second operand) to the destination operand (the first operand). This instruction can be used to load a vector register from an int32/int64 memory location, to store the contents of a vector register into an int32/int64 memory location, or to move data between two ZMM registers. When the source or destination operand is a memory operand, the operand must be aligned on a 16-byte (EVEX.128), 32-byte (EVEX.256), or 64-byte (EVEX.512) boundary or a general-protection exception (#GP) will be generated. To move integer data to and from unaligned memory locations, use the VMOVDQU instruction.</p>
<p>The destination operand is updated at 32-bit (VMOVDQA32) or 64-bit (VMOVDQA64) granularity according to the writemask.</p>
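<p>As a brief illustration (not part of the SDM text), this writemask granularity maps directly onto the masked-load intrinsics listed below. A minimal sketch, assuming an AVX-512F-capable compiler and CPU (e.g., built with -mavx512f); the function name is hypothetical:</p>
<pre>#include &lt;immintrin.h&gt;
#include &lt;stdint.h&gt;

/* src must be 64-byte aligned, or VMOVDQA32 raises #GP. */
__m512i load_even_dwords(const int32_t *src, __m512i fallback)
{
    __mmask16 k = 0x5555;  /* select the even dword lanes */
    /* Merging-masking: unselected lanes keep their value from 'fallback'. */
    return _mm512_mask_load_epi32(fallback, k, src);
}
</pre>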
<p>VEX.256 encoded version:</p>
<p>Moves 256 bits of packed integer values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load a YMM register from a 256-bit memory location, to store the contents of a YMM register into a 256-bit memory location, or to move data between two YMM registers.</p>
<p>When the source or destination operand is a memory operand, the operand must be aligned on a 32-byte boundary or a general-protection exception (#GP) will be generated. To move integer data to and from unaligned memory locations, use the VMOVDQU instruction. Bits (MAXVL-1:256) of the destination register are zeroed.</p>
<p>128-bit versions:</p>
<p>Moves 128 bits of packed integer values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load an XMM register from a 128-bit memory location, to store the contents of an XMM register into a 128-bit memory location, or to move data between two XMM registers.</p>
<p>When the source or destination operand is a memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated. To move integer data to and from unaligned memory locations, use the VMOVDQU instruction.</p>
<p>128-bit Legacy SSE version: Bits (MAXVL-1:128) of the corresponding ZMM destination register remain unchanged.</p>
<p>VEX.128 encoded version: Bits (MAXVL-1:128) of the destination register are zeroed.</p>
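<p>In practice, the 16-byte alignment requirement is met at allocation time. A minimal sketch, assuming a C11 compiler with SSE2 support; buffer and function names are hypothetical:</p>
<pre>#include &lt;emmintrin.h&gt;   /* SSE2: _mm_load_si128 compiles to MOVDQA */
#include &lt;stdalign.h&gt;
#include &lt;stdint.h&gt;

alignas(16) static int32_t buf[4] = {1, 2, 3, 4};

__m128i load_aligned(void)
{
    /* buf is 16-byte aligned, so MOVDQA is safe here; a misaligned
       pointer would raise #GP. _mm_loadu_si128 (MOVDQU) has no
       alignment requirement. */
    return _mm_load_si128((const __m128i *)buf);
}
</pre>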
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="vmovdqa32--evex-encoded-versions--register-copy-form-">VMOVDQA32 (EVEX Encoded Versions, Register-Copy Form)<a class="anchor" href="#vmovdqa32--evex-encoded-versions--register-copy-form-">
</a></h3>
<pre>(KL, VL) = (4, 128), (8, 256), (16, 512)
FOR j := 0 TO KL-1
    i := j * 32
    IF k1[j] OR *no writemask*
        THEN DEST[i+31:i] := SRC[i+31:i]
        ELSE
            IF *merging-masking* ; merging-masking
                THEN *DEST[i+31:i] remains unchanged*
                ELSE DEST[i+31:i] := 0 ; zeroing-masking
            FI
    FI;
ENDFOR
DEST[MAXVL-1:VL] := 0
</pre>
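<p>The pseudocode above translates directly into a scalar reference model. A minimal sketch (not Intel's code); the zeroing/merging choice is passed as a flag, and VL is fixed at 512 for brevity:</p>
<pre>#include &lt;stdbool.h&gt;
#include &lt;stdint.h&gt;

/* Scalar model of VMOVDQA32 register-copy, VL = 512 (KL = 16 dword lanes). */
void vmovdqa32_model(uint32_t dst[16], const uint32_t src[16],
                     uint16_t k1, bool zeroing)
{
    for (int j = 0; j &lt; 16; j++) {
        if (k1 &amp; (1u &lt;&lt; j))
            dst[j] = src[j];      /* lane selected by writemask k1[j] */
        else if (zeroing)
            dst[j] = 0;           /* zeroing-masking */
        /* else: merging-masking, lane remains unchanged */
    }
    /* DEST[MAXVL-1:VL] := 0 is a no-op here since VL == MAXVL. */
}
</pre>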
<h3 id="vmovdqa32--evex-encoded-versions--store-form-">VMOVDQA32 (EVEX Encoded Versions, Store-Form)<a class="anchor" href="#vmovdqa32--evex-encoded-versions--store-form-">
</a></h3>
<pre>(KL, VL) = (4, 128), (8, 256), (16, 512)
FOR j := 0 TO KL-1
    i := j * 32
    IF k1[j] OR *no writemask*
        THEN DEST[i+31:i] := SRC[i+31:i]
        ELSE *DEST[i+31:i] remains unchanged* ; merging-masking
    FI;
ENDFOR;
</pre>
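<p>Note that the store form supports only merging-masking: memory elements whose mask bit is clear are left untouched. A minimal sketch with the corresponding intrinsic, assuming AVX-512F; the function name is hypothetical:</p>
<pre>#include &lt;immintrin.h&gt;
#include &lt;stdint.h&gt;

/* dst must be 64-byte aligned; only the low 8 dword lanes are written,
   the remaining memory locations are not touched. */
void store_low_half(int32_t *dst, __m512i v)
{
    _mm512_mask_store_epi32(dst, (__mmask16)0x00FF, v);
}
</pre>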
<h3 id="vmovdqa32--evex-encoded-versions--load-form-">VMOVDQA32 (EVEX Encoded Versions, Load-Form)<a class="anchor" href="#vmovdqa32--evex-encoded-versions--load-form-">
</a></h3>
<pre>(KL, VL) = (4, 128), (8, 256), (16, 512)
FOR j := 0 TO KL-1
    i := j * 32
    IF k1[j] OR *no writemask*
        THEN DEST[i+31:i] := SRC[i+31:i]
        ELSE
            IF *merging-masking*
                THEN *DEST[i+31:i] remains unchanged*
                ELSE DEST[i+31:i] := 0 ; zeroing-masking
            FI
    FI;
ENDFOR
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="vmovdqa64--evex-encoded-versions--register-copy-form-">VMOVDQA64 (EVEX Encoded Versions, Register-Copy Form)<a class="anchor" href="#vmovdqa64--evex-encoded-versions--register-copy-form-">
</a></h3>
<pre>(KL, VL) = (2, 128), (4, 256), (8, 512)
FOR j := 0 TO KL-1
    i := j * 64
    IF k1[j] OR *no writemask*
        THEN DEST[i+63:i] := SRC[i+63:i]
        ELSE
            IF *merging-masking*
                THEN *DEST[i+63:i] remains unchanged*
                ELSE DEST[i+63:i] := 0 ; zeroing-masking
            FI
    FI;
ENDFOR
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="vmovdqa64--evex-encoded-versions--store-form-">VMOVDQA64 (EVEX Encoded Versions, Store-Form)<a class="anchor" href="#vmovdqa64--evex-encoded-versions--store-form-">
</a></h3>
<pre>(KL, VL) = (2, 128), (4, 256), (8, 512)
FOR j := 0 TO KL-1
    i := j * 64
    IF k1[j] OR *no writemask*
        THEN DEST[i+63:i] := SRC[i+63:i]
        ELSE *DEST[i+63:i] remains unchanged* ; merging-masking
    FI;
ENDFOR;
</pre>
<h3 id="vmovdqa64--evex-encoded-versions--load-form-">VMOVDQA64 (EVEX Encoded Versions, Load-Form)<a class="anchor" href="#vmovdqa64--evex-encoded-versions--load-form-">
</a></h3>
<pre>(KL, VL) = (2, 128), (4, 256), (8, 512)
FOR j := 0 TO KL-1
    i := j * 64
    IF k1[j] OR *no writemask*
        THEN DEST[i+63:i] := SRC[i+63:i]
        ELSE
            IF *merging-masking*
                THEN *DEST[i+63:i] remains unchanged*
                ELSE DEST[i+63:i] := 0 ; zeroing-masking
            FI
    FI;
ENDFOR
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="vmovdqa--vex-256-encoded-version--load---and-register-copy-">VMOVDQA (VEX.256 Encoded Version, Load - and Register Copy)<a class="anchor" href="#vmovdqa--vex-256-encoded-version--load---and-register-copy-">
</a></h3>
<pre>DEST[255:0] := SRC[255:0]
DEST[MAXVL-1:256] := 0
</pre>
<h3 id="vmovdqa--vex-256-encoded-version--store-form-">VMOVDQA (VEX.256 Encoded Version, Store-Form)<a class="anchor" href="#vmovdqa--vex-256-encoded-version--store-form-">
</a></h3>
<pre>DEST[255:0] := SRC[255:0]
</pre>
<h3 id="vmovdqa--vex-128-encoded-version-">VMOVDQA (VEX.128 Encoded Version)<a class="anchor" href="#vmovdqa--vex-128-encoded-version-">
</a></h3>
<pre>DEST[127:0] := SRC[127:0]
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="vmovdqa--128-bit-load--and-register-copy--form-legacy-sse-version-">VMOVDQA (128-bit Load- and Register-Copy- Form Legacy SSE Version)<a class="anchor" href="#vmovdqa--128-bit-load--and-register-copy--form-legacy-sse-version-">
</a></h3>
<pre>DEST[127:0] := SRC[127:0]
DEST[MAXVL-1:128] (Unmodified)
</pre>
<h3 id="-v-movdqa--128-bit-store-form-version-">(V)MOVDQA (128-bit Store-Form Version)<a class="anchor" href="#-v-movdqa--128-bit-store-form-version-">
</a></h3>
<pre>DEST[127:0] := SRC[127:0]
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>VMOVDQA32 __m512i _mm512_load_epi32( void * sa);
</pre>
<pre>VMOVDQA32 __m512i _mm512_mask_load_epi32(__m512i s, __mmask16 k, void * sa);
</pre>
<pre>VMOVDQA32 __m512i _mm512_maskz_load_epi32( __mmask16 k, void * sa);
</pre>
<pre>VMOVDQA32 void _mm512_store_epi32(void * d, __m512i a);
</pre>
<pre>VMOVDQA32 void _mm512_mask_store_epi32(void * d, __mmask16 k, __m512i a);
</pre>
<pre>VMOVDQA32 __m256i _mm256_mask_load_epi32(__m256i s, __mmask8 k, void * sa);
</pre>
<pre>VMOVDQA32 __m256i _mm256_maskz_load_epi32( __mmask8 k, void * sa);
</pre>
<pre>VMOVDQA32 void _mm256_store_epi32(void * d, __m256i a);
</pre>
<pre>VMOVDQA32 void _mm256_mask_store_epi32(void * d, __mmask8 k, __m256i a);
</pre>
<pre>VMOVDQA32 __m128i _mm_mask_load_epi32(__m128i s, __mmask8 k, void * sa);
</pre>
<pre>VMOVDQA32 __m128i _mm_maskz_load_epi32( __mmask8 k, void * sa);
</pre>
<pre>VMOVDQA32 void _mm_store_epi32(void * d, __m128i a);
</pre>
<pre>VMOVDQA32 void _mm_mask_store_epi32(void * d, __mmask8 k, __m128i a);
</pre>
<pre>VMOVDQA64 __m512i _mm512_load_epi64( void * sa);
</pre>
<pre>VMOVDQA64 __m512i _mm512_mask_load_epi64(__m512i s, __mmask8 k, void * sa);
</pre>
<pre>VMOVDQA64 __m512i _mm512_maskz_load_epi64( __mmask8 k, void * sa);
</pre>
<pre>VMOVDQA64 void _mm512_store_epi64(void * d, __m512i a);
</pre>
<pre>VMOVDQA64 void _mm512_mask_store_epi64(void * d, __mmask8 k, __m512i a);
</pre>
<pre>VMOVDQA64 __m256i _mm256_mask_load_epi64(__m256i s, __mmask8 k, void * sa);
</pre>
<pre>VMOVDQA64 __m256i _mm256_maskz_load_epi64( __mmask8 k, void * sa);
</pre>
<pre>VMOVDQA64 void _mm256_store_epi64(void * d, __m256i a);
</pre>
<pre>VMOVDQA64 void _mm256_mask_store_epi64(void * d, __mmask8 k, __m256i a);
</pre>
<pre>VMOVDQA64 __m128i _mm_mask_load_epi64(__m128i s, __mmask8 k, void * sa);
</pre>
<pre>VMOVDQA64 __m128i _mm_maskz_load_epi64( __mmask8 k, void * sa);
</pre>
<pre>VMOVDQA64 void _mm_store_epi64(void * d, __m128i a);
</pre>
<pre>VMOVDQA64 void _mm_mask_store_epi64(void * d, __mmask8 k, __m128i a);
</pre>
<pre>VMOVDQA __m256i _mm256_load_si256 (__m256i * p);
</pre>
<pre>VMOVDQA void _mm256_store_si256(__m256i *p, __m256i a);
</pre>
<pre>MOVDQA __m128i _mm_load_si128 (__m128i * p);
</pre>
<pre>MOVDQA void _mm_store_si128(__m128i *p, __m128i a);
</pre>
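<p>A short end-to-end sketch tying several of the intrinsics above together (not from the SDM; assumes AVX-512F, e.g., built with -mavx512f):</p>
<pre>#include &lt;immintrin.h&gt;
#include &lt;stdalign.h&gt;
#include &lt;stdint.h&gt;
#include &lt;stdio.h&gt;

int main(void)
{
    alignas(64) int32_t in[16], out[16];
    for (int i = 0; i &lt; 16; i++) in[i] = i;

    __m512i v = _mm512_load_epi32(in);             /* VMOVDQA32 load  */
    /* Zeroing-masking register copy: odd lanes become 0. */
    __m512i z = _mm512_maskz_mov_epi32(0x5555, v);
    _mm512_store_epi32(out, z);                    /* VMOVDQA32 store */

    for (int i = 0; i &lt; 16; i++) printf("%d ", (int)out[i]);
    putchar('\n');
    return 0;
}
</pre>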
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>Non-EVEX-encoded instruction, see Exceptions Type1.SSE2 in <span class="not-imported">Table 2-18</span>, “Type 1 Class Exception Conditions.”</p>
<p>EVEX-encoded instruction, see <span class="not-imported">Table 2-44</span>, “Type E1 Class Exception Conditions.”</p>
<p>Additionally:</p>
<table>
<tr>
<td>#UD</td>
<td>If EVEX.vvvv != 1111B or VEX.vvvv != 1111B.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
incomplete or broken in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>