<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>MOVAPS
— Move Aligned Packed Single Precision Floating-Point Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>MOVAPS
— Move Aligned Packed Single Precision Floating-Point Values</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>NP 0F 28 /r MOVAPS xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>SSE</td>
<td>Move aligned packed single precision floating-point values from xmm2/mem to xmm1.</td></tr>
<tr>
<td>NP 0F 29 /r MOVAPS xmm2/m128, xmm1</td>
<td>B</td>
<td>V/V</td>
<td>SSE</td>
<td>Move aligned packed single precision floating-point values from xmm1 to xmm2/mem.</td></tr>
<tr>
<td>VEX.128.0F.WIG 28 /r VMOVAPS xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>AVX</td>
<td>Move aligned packed single precision floating-point values from xmm2/mem to xmm1.</td></tr>
<tr>
<td>VEX.128.0F.WIG 29 /r VMOVAPS xmm2/m128, xmm1</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Move aligned packed single precision floating-point values from xmm1 to xmm2/mem.</td></tr>
<tr>
<td>VEX.256.0F.WIG 28 /r VMOVAPS ymm1, ymm2/m256</td>
<td>A</td>
<td>V/V</td>
<td>AVX</td>
<td>Move aligned packed single precision floating-point values from ymm2/mem to ymm1.</td></tr>
<tr>
<td>VEX.256.0F.WIG 29 /r VMOVAPS ymm2/m256, ymm1</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Move aligned packed single precision floating-point values from ymm1 to ymm2/mem.</td></tr>
<tr>
<td>EVEX.128.0F.W0 28 /r VMOVAPS xmm1 {k1}{z}, xmm2/m128</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Move aligned packed single precision floating-point values from xmm2/m128 to xmm1 using writemask k1.</td></tr>
<tr>
<td>EVEX.256.0F.W0 28 /r VMOVAPS ymm1 {k1}{z}, ymm2/m256</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Move aligned packed single precision floating-point values from ymm2/m256 to ymm1 using writemask k1.</td></tr>
<tr>
<td>EVEX.512.0F.W0 28 /r VMOVAPS zmm1 {k1}{z}, zmm2/m512</td>
<td>C</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Move aligned packed single precision floating-point values from zmm2/m512 to zmm1 using writemask k1.</td></tr>
<tr>
<td>EVEX.128.0F.W0 29 /r VMOVAPS xmm2/m128 {k1}{z}, xmm1</td>
<td>D</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Move aligned packed single precision floating-point values from xmm1 to xmm2/m128 using writemask k1.</td></tr>
<tr>
<td>EVEX.256.0F.W0 29 /r VMOVAPS ymm2/m256 {k1}{z}, ymm1</td>
<td>D</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Move aligned packed single precision floating-point values from ymm1 to ymm2/m256 using writemask k1.</td></tr>
<tr>
<td>EVEX.512.0F.W0 29 /r VMOVAPS zmm2/m512 {k1}{z}, zmm1</td>
<td>D</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Move aligned packed single precision floating-point values from zmm1 to zmm2/m512 using writemask k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>N/A</td>
<td>ModRM:r/m (w)</td>
<td>ModRM:reg (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>C</td>
<td>Full Mem</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>D</td>
<td>Full Mem</td>
<td>ModRM:r/m (w)</td>
<td>ModRM:reg (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Moves 4, 8 or 16 single precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load an XMM, YMM or ZMM register from a 128-bit, 256-bit or 512-bit memory location, to store the contents of an XMM, YMM or ZMM register into a 128-bit, 256-bit or 512-bit memory location, or to move data between two XMM, two YMM or two ZMM registers.</p>
<p>When the source or destination operand is a memory operand, the operand must be aligned on a 16-byte (128-bit version), 32-byte (VEX.256 encoded version) or 64-byte (EVEX.512 encoded version) boundary or a general-protection exception (#GP) will be generated. For EVEX encoded versions, the operand must be aligned to the size of the memory operand. To move single precision floating-point values to and from unaligned memory locations, use the VMOVUPS instruction.</p>
<p>Note: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions will #UD.</p>
<p>EVEX.512 encoded version:</p>
<p>Moves 512 bits of packed single precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load a ZMM register from a 512-bit float32 memory location, to store the contents of a ZMM register into a float32 memory location, or to move data between two ZMM registers. When the source or destination operand is a memory operand, the operand must be aligned on a 64-byte boundary or a general-protection exception (#GP) will be generated. To move single precision floating-point values to and from unaligned memory locations, use the VMOVUPS instruction.</p>
<p>VEX.256 and EVEX.256 encoded version:</p>
<p>Moves 256 bits of packed single precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load a YMM register from a 256-bit memory location, to store the contents of a YMM register into a 256-bit memory location, or to move data between two YMM registers. When the source or destination operand is a memory operand, the operand must be aligned on a 32-byte boundary or a general-protection exception (#GP) will be generated.</p>
<p>128-bit versions:</p>
<p>Moves 128 bits of packed single precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load an XMM register from a 128-bit memory location, to store the contents of an XMM register into a 128-bit memory location, or to move data between two XMM registers. When the source or destination operand is a memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated. To move single precision floating-point values to and from unaligned memory locations, use the VMOVUPS instruction.</p>
<p>128-bit Legacy SSE version: Bits (MAXVL-1:128) of the corresponding ZMM destination register remain unchanged.</p>
<p>(E)VEX.128 encoded version: Bits (MAXVL-1:128) of the destination ZMM register are zeroed.</p>
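<p>As an informal illustration (not part of the SDM text), the alignment requirements above can be met in C by requesting 16-, 32-, or 64-byte-aligned storage before using the aligned load/store intrinsics listed later on this page. The following is a minimal sketch assuming a C11 compiler; the array names and sizes are illustrative.</p>
<pre>/* Sketch (illustrative, C11): storage aligned for each (V)MOVAPS width.
   The names and sizes here are examples, not part of the instruction definition. */
#include &lt;stdalign.h&gt;
#include &lt;stdlib.h&gt;

int main(void) {
    alignas(16) float x128[4];                           /* 16-byte aligned: MOVAPS/VMOVAPS xmm, m128 */
    alignas(32) float x256[8];                           /* 32-byte aligned: VMOVAPS ymm, m256        */
    float *x512 = aligned_alloc(64, 16 * sizeof(float)); /* 64-byte aligned: VMOVAPS zmm, m512        */
    if (!x512) return 1;
    /* Addresses that do not meet these alignments would cause the aligned
       forms to raise #GP; VMOVUPS imposes no alignment requirement. */
    (void)x128; (void)x256;
    free(x512);
    return 0;
}
</pre>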
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="vmovaps--evex-encoded-versions--register-copy-form-">VMOVAPS (EVEX Encoded Versions, Register-Copy Form)<a class="anchor" href="#vmovaps--evex-encoded-versions--register-copy-form-">
</a></h3>
<pre>(KL, VL) = (4, 128), (8, 256), (16, 512)
FOR j := 0 TO KL-1
    i := j * 32
    IF k1[j] OR *no writemask*
        THEN DEST[i+31:i] := SRC[i+31:i]
        ELSE
            IF *merging-masking*
                THEN *DEST[i+31:i] remains unchanged*
                ELSE DEST[i+31:i] := 0 ; zeroing-masking
            FI
    FI;
ENDFOR
DEST[MAXVL-1:VL] := 0
</pre>
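<p>The writemask behaviour above can be paraphrased in plain scalar C. The sketch below restates the pseudocode for the 512-bit case (KL = 16) and is not a description of any hardware implementation; the function name and the zeroing flag are hypothetical.</p>
<pre>/* Informal scalar paraphrase of the EVEX register-copy form for VL = 512 (KL = 16).
   dest/src model 16 packed floats; k1 is the 16-bit writemask. */
void vmovaps_regcopy_sketch(float dest[16], const float src[16],
                            unsigned short k1, int zeroing)
{
    for (int j = 0; j &lt; 16; j++) {
        if (k1 &amp; (1u &lt;&lt; j))
            dest[j] = src[j];      /* element selected by the writemask */
        else if (zeroing)
            dest[j] = 0.0f;        /* zeroing-masking: element cleared  */
        /* else: merging-masking, element left unchanged */
    }
    /* the instruction additionally zeroes bits MAXVL-1:VL of the destination register */
}
</pre>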
<h3 id="vmovaps--evex-encoded-versions--store-form-">VMOVAPS (EVEX Encoded Versions, Store Form)<a class="anchor" href="#vmovaps--evex-encoded-versions--store-form-">
</a></h3>
<pre>(KL, VL) = (4, 128), (8, 256), (16, 512)
FOR j := 0 TO KL-1
    i := j * 32
    IF k1[j] OR *no writemask*
        THEN DEST[i+31:i] := SRC[i+31:i]
        ELSE *DEST[i+31:i] remains unchanged* ; merging-masking
ENDFOR;
</pre>
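<p>Note that only merging-masking applies to the store form: memory elements not selected by k1 are left untouched. A hedged intrinsics example, assuming a compiler with AVX-512F support (for example, -mavx512f with GCC or Clang); the mask value and function name are illustrative:</p>
<pre>/* Sketch: masked 512-bit aligned store via _mm512_mask_store_ps
   (listed in the intrinsic section below). */
#include &lt;immintrin.h&gt;

void store_low_half(float *dst)            /* dst must be 64-byte aligned */
{
    __m512 ones = _mm512_set1_ps(1.0f);
    __mmask16 k = 0x00FF;                  /* select only the low 8 elements       */
    _mm512_mask_store_ps(dst, k, ones);    /* unselected memory elements unchanged */
}
</pre>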
<h3 id="vmovaps--evex-encoded-versions--load-form-">VMOVAPS (EVEX Encoded Versions, Load Form)<a class="anchor" href="#vmovaps--evex-encoded-versions--load-form-">
</a></h3>
<pre>(KL, VL) = (4, 128), (8, 256), (16, 512)
FOR j := 0 TO KL-1
    i := j * 32
    IF k1[j] OR *no writemask*
        THEN DEST[i+31:i] := SRC[i+31:i]
        ELSE
            IF *merging-masking*
                THEN *DEST[i+31:i] remains unchanged*
                ELSE DEST[i+31:i] := 0 ; zeroing-masking
            FI
    FI;
ENDFOR
DEST[MAXVL-1:VL] := 0
</pre>
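<p>The load form supports both merging- and zeroing-masking, which correspond to the _mm512_mask_load_ps and _mm512_maskz_load_ps intrinsics listed in the next section. A minimal sketch under the same AVX-512F assumption; the mask value and function name are illustrative:</p>
<pre>/* Sketch: masked 512-bit aligned loads. */
#include &lt;immintrin.h&gt;

__m512 load_even_elements(const float *src)           /* src must be 64-byte aligned */
{
    __mmask16 even = 0x5555;                           /* elements 0, 2, 4, ...       */
    __m512 merged = _mm512_mask_load_ps(_mm512_setzero_ps(), even, src); /* merging   */
    __m512 zeroed = _mm512_maskz_load_ps(even, src);   /* zeroing: others become 0.0f */
    (void)merged;                                      /* shown only for contrast     */
    return zeroed;
}
</pre>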
<h3 id="vmovaps--vex-256-encoded-version--load---and-register-copy-">VMOVAPS (VEX.256 Encoded Version, Load - and Register Copy)<a class="anchor" href="#vmovaps--vex-256-encoded-version--load---and-register-copy-">
</a></h3>
<pre>DEST[255:0] := SRC[255:0]
DEST[MAXVL-1:256] := 0
</pre>
<h3 id="vmovaps--vex-256-encoded-version--store-form-">VMOVAPS (VEX.256 Encoded Version, Store-Form)<a class="anchor" href="#vmovaps--vex-256-encoded-version--store-form-">
</a></h3>
<pre>DEST[255:0] := SRC[255:0]
</pre>
<h3 id="vmovaps--vex-128-encoded-version--load---and-register-copy-">VMOVAPS (VEX.128 Encoded Version, Load - and Register Copy)<a class="anchor" href="#vmovaps--vex-128-encoded-version--load---and-register-copy-">
</a></h3>
<pre>DEST[127:0] := SRC[127:0]
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="movaps--128-bit-load--and-register-copy--form-legacy-sse-version-">MOVAPS (128-bit Load- and Register-Copy- Form Legacy SSE Version)<a class="anchor" href="#movaps--128-bit-load--and-register-copy--form-legacy-sse-version-">
</a></h3>
<pre>DEST[127:0] := SRC[127:0]
DEST[MAXVL-1:128] (Unmodified)
</pre>
<h3 id="-v-movaps--128-bit-store-form-version-">(V)MOVAPS (128-bit Store-Form Version)<a class="anchor" href="#-v-movaps--128-bit-store-form-version-">
</a></h3>
<pre>DEST[127:0] := SRC[127:0]
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>VMOVAPS __m512 _mm512_load_ps( void * m);
</pre>
<pre>VMOVAPS __m512 _mm512_mask_load_ps(__m512 s, __mmask16 k, void * m);
</pre>
<pre>VMOVAPS __m512 _mm512_maskz_load_ps( __mmask16 k, void * m);
</pre>
<pre>VMOVAPS void _mm512_store_ps( void * d, __m512 a);
</pre>
<pre>VMOVAPS void _mm512_mask_store_ps( void * d, __mmask16 k, __m512 a);
</pre>
<pre>VMOVAPS __m256 _mm256_mask_load_ps(__m256 a, __mmask8 k, void * s);
</pre>
<pre>VMOVAPS __m256 _mm256_maskz_load_ps( __mmask8 k, void * s);
</pre>
<pre>VMOVAPS void _mm256_mask_store_ps( void * d, __mmask8 k, __m256 a);
</pre>
<pre>VMOVAPS __m128 _mm_mask_load_ps(__m128 a, __mmask8 k, void * s);
</pre>
<pre>VMOVAPS __m128 _mm_maskz_load_ps( __mmask8 k, void * s);
</pre>
<pre>VMOVAPS void _mm_mask_store_ps( void * d, __mmask8 k, __m128 a);
</pre>
<pre>MOVAPS __m256 _mm256_load_ps (float * p);
</pre>
<pre>MOVAPS void _mm256_store_ps(float * p, __m256 a);
</pre>
<pre>MOVAPS __m128 _mm_load_ps (float * p);
</pre>
<pre>MOVAPS void _mm_store_ps(float * p, __m128 a);
</pre>
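<p>As a usage note (not part of the SDM text), the unmasked 128-bit forms correspond to the ordinary aligned load/store intrinsics above; if a source or destination array is not 16-byte aligned, the aligned forms fault with #GP, as described earlier. A minimal self-contained example with illustrative array names and values:</p>
<pre>/* Self-contained example: aligned 128-bit load and store around an ordinary packed add. */
#include &lt;immintrin.h&gt;
#include &lt;stdalign.h&gt;
#include &lt;stdio.h&gt;

int main(void)
{
    alignas(16) float a[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    alignas(16) float b[4] = {10.0f, 20.0f, 30.0f, 40.0f};
    alignas(16) float r[4];

    __m128 va = _mm_load_ps(a);              /* aligned load  (MOVAPS xmm, m128) */
    __m128 vb = _mm_load_ps(b);
    _mm_store_ps(r, _mm_add_ps(va, vb));     /* aligned store (MOVAPS m128, xmm) */

    printf("%g %g %g %g\n", r[0], r[1], r[2], r[3]);   /* 11 22 33 44 */
    return 0;
}
</pre>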
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>Non-EVEX-encoded instruction, see Exceptions Type1.SSE in <span class="not-imported">Table 2-18</span>, “Type 1 Class Exception Conditions,” additionally:</p>
<table>
<tr>
<td>#UD</td>
<td>If VEX.vvvv != 1111B.</td></tr></table>
<p>EVEX-encoded instruction, see <span class="not-imported">Table 2-44</span>, “Type E1 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
incomplete or broken in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developer's Manual</a> for anything serious.
</p></footer></body></html>