<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>MOVUPS
— Move Unaligned Packed Single Precision Floating-Point Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>MOVUPS
— Move Unaligned Packed Single Precision Floating-Point Values</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op / En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>NP 0F 10 /r MOVUPS xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>SSE</td>
<td>Move unaligned packed single precision floating-point values from xmm2/mem to xmm1.</td></tr>
<tr>
<td>NP 0F 11 /r MOVUPS xmm2/m128, xmm1</td>
<td>B</td>
<td>V/V</td>
<td>SSE</td>
<td>Move unaligned packed single precision floating-point values from xmm1 to xmm2/mem.</td></tr>
<tr>
<td>VEX.128.0F.WIG 10 /r VMOVUPS xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>AVX</td>
<td>Move unaligned packed single precision floating-point values from xmm2/mem to xmm1.</td></tr>
<tr>
<td>VEX.128.0F.WIG 11 /r VMOVUPS xmm2/m128, xmm1</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Move unaligned packed single precision floating-point values from xmm1 to xmm2/mem.</td></tr>
<tr>
<td>VEX.256.0F.WIG 10 /r VMOVUPS ymm1, ymm2/m256</td>
<td>A</td>
<td>V/V</td>
<td>AVX</td>
<td>Move unaligned packed single precision floating-point values from ymm2/mem to ymm1.</td></tr>
<tr>
<td>VEX.256.0F.WIG 11 /r VMOVUPS ymm2/m256, ymm1</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Move unaligned packed single precision floating-point values from ymm1 to ymm2/mem.</td></tr>
<tr>
<td>EVEX.128.0F.W0 10 /r VMOVUPS xmm1 {k1}{z}, xmm2/m128</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Move unaligned packed single precision floating-point values from xmm2/m128 to xmm1 using writemask k1.</td></tr>
<tr>
<td>EVEX.256.0F.W0 10 /r VMOVUPS ymm1 {k1}{z}, ymm2/m256</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Move unaligned packed single precision floating-point values from ymm2/m256 to ymm1 using writemask k1.</td></tr>
<tr>
<td>EVEX.512.0F.W0 10 /r VMOVUPS zmm1 {k1}{z}, zmm2/m512</td>
<td>C</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Move unaligned packed single precision floating-point values from zmm2/m512 to zmm1 using writemask k1.</td></tr>
<tr>
<td>EVEX.128.0F.W0 11 /r VMOVUPS xmm2/m128 {k1}{z}, xmm1</td>
<td>D</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Move unaligned packed single precision floating-point values from xmm1 to xmm2/m128 using writemask k1.</td></tr>
<tr>
<td>EVEX.256.0F.W0 11 /r VMOVUPS ymm2/m256 {k1}{z}, ymm1</td>
<td>D</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Move unaligned packed single precision floating-point values from ymm1 to ymm2/m256 using writemask k1.</td></tr>
<tr>
<td>EVEX.512.0F.W0 11 /r VMOVUPS zmm2/m512 {k1}{z}, zmm1</td>
<td>D</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Move unaligned packed single precision floating-point values from zmm1 to zmm2/m512 using writemask k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>N/A</td>
<td>ModRM:r/m (w)</td>
<td>ModRM:reg (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>C</td>
<td>Full Mem</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>D</td>
<td>Full Mem</td>
<td>ModRM:r/m (w)</td>
<td>ModRM:reg (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Note: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b; otherwise, the instruction will #UD.</p>
<p>EVEX.512 encoded version:</p>
<p>Moves 512 bits of packed single precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load a ZMM register from a 512-bit float32 memory location, or to store the contents of a ZMM register into memory. The destination operand is updated according to the writemask.</p>
<p>VEX.256 and EVEX.256 encoded versions:</p>
<p>Moves 256 bits of packed single precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load a YMM register from a 256-bit memory location, to store the contents of a YMM register into a 256-bit memory location, or to move data between two YMM registers. Bits (MAXVL-1:256) of the destination register are zeroed.</p>
<p>128-bit versions:</p>
<p>Moves 128 bits of packed single precision floating-point values from the source operand (second operand) to the destination operand (first operand). This instruction can be used to load an XMM register from a 128-bit memory location, to store the contents of an XMM register into a 128-bit memory location, or to move data between two XMM registers.</p>
<p>128-bit Legacy SSE version: Bits (MAXVL-1:128) of the corresponding destination register remain unchanged.</p>
<p>When the source or destination operand is a memory operand, the operand may be unaligned without causing a general-protection exception (#GP) to be generated.</p>
<p>VEX.128 and EVEX.128 encoded versions: Bits (MAXVL-1:128) of the destination register are zeroed.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="vmovups--evex-encoded-versions--register-copy-form-">VMOVUPS (EVEX Encoded Versions, Register-Copy Form)<a class="anchor" href="#vmovups--evex-encoded-versions--register-copy-form-">
</a></h3>
<pre>(KL, VL) = (4, 128), (8, 256), (16, 512)
FOR j := 0 TO KL-1
i := j * 32
IF k1[j] OR *no writemask*
THEN DEST[i+31:i] := SRC[i+31:i]
ELSE
IF *merging-masking*
THEN *DEST[i+31:i] remains unchanged*
ELSE DEST[i+31:i] := 0 ; zeroing-masking
FI
FI;
ENDFOR
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="vmovups--evex-encoded-versions--store-form-">VMOVUPS (EVEX Encoded Versions, Store-Form)<a class="anchor" href="#vmovups--evex-encoded-versions--store-form-">
</a></h3>
<pre>(KL, VL) = (4, 128), (8, 256), (16, 512)
FOR j := 0 TO KL-1
i := j * 32
IF k1[j] OR *no writemask*
THEN DEST[i+31:i] := SRC[i+31:i]
ELSE *DEST[i+31:i] remains unchanged*
; merging-masking
FI;
ENDFOR;
</pre>
<h3 id="vmovups--evex-encoded-versions--load-form-">VMOVUPS (EVEX Encoded Versions, Load-Form)<a class="anchor" href="#vmovups--evex-encoded-versions--load-form-">
</a></h3>
<pre>(KL, VL) = (4, 128), (8, 256), (16, 512)
FOR j := 0 TO KL-1
i := j * 32
IF k1[j] OR *no writemask*
THEN DEST[i+31:i] := SRC[i+31:i]
ELSE
IF *merging-masking*
THEN *DEST[i+31:i] remains unchanged*
ELSE DEST[i+31:i] := 0 ; zeroing-masking
FI
FI;
ENDFOR
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="vmovups--vex-256-encoded-version--load---and-register-copy-">VMOVUPS (VEX.256 Encoded Version, Load - and Register Copy)<a class="anchor" href="#vmovups--vex-256-encoded-version--load---and-register-copy-">
</a></h3>
<pre>DEST[255:0] := SRC[255:0]
DEST[MAXVL-1:256] := 0
</pre>
<h3 id="vmovups--vex-256-encoded-version--store-form-">VMOVUPS (VEX.256 Encoded Version, Store-Form)<a class="anchor" href="#vmovups--vex-256-encoded-version--store-form-">
</a></h3>
<pre>DEST[255:0] := SRC[255:0]
</pre>
<h3 id="vmovups--vex-128-encoded-version-">VMOVUPS (VEX.128 Encoded Version)<a class="anchor" href="#vmovups--vex-128-encoded-version-">
</a></h3>
<pre>DEST[127:0] := SRC[127:0]
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="movups--128-bit-load--and-register-copy--form-legacy-sse-version-">MOVUPS (128-bit Load- and Register-Copy- Form Legacy SSE Version)<a class="anchor" href="#movups--128-bit-load--and-register-copy--form-legacy-sse-version-">
</a></h3>
<pre>DEST[127:0] := SRC[127:0]
DEST[MAXVL-1:128] (Unmodified)
</pre>
<h3 id="-v-movups--128-bit-store-form-version-">(V)MOVUPS (128-bit Store-Form Version)<a class="anchor" href="#-v-movups--128-bit-store-form-version-">
</a></h3>
<pre>DEST[127:0] := SRC[127:0]
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>VMOVUPS __m512 _mm512_loadu_ps( void * s);
</pre>
<pre>VMOVUPS __m512 _mm512_mask_loadu_ps(__m512 a, __mmask16 k, void * s);
</pre>
<pre>VMOVUPS __m512 _mm512_maskz_loadu_ps( __mmask16 k, void * s);
</pre>
<pre>VMOVUPS void _mm512_storeu_ps( void * d, __m512 a);
</pre>
<pre>VMOVUPS void _mm512_mask_storeu_ps( void * d, __mmask8 k, __m512 a);
</pre>
<pre>VMOVUPS __m256 _mm256_mask_loadu_ps(__m256 a, __mmask8 k, void * s);
</pre>
<pre>VMOVUPS __m256 _mm256_maskz_loadu_ps( __mmask8 k, void * s);
</pre>
<pre>VMOVUPS void _mm256_mask_storeu_ps( void * d, __mmask8 k, __m256 a);
</pre>
<pre>VMOVUPS __m128 _mm_mask_loadu_ps(__m128 a, __mmask8 k, void * s);
</pre>
<pre>VMOVUPS __m128 _mm_maskz_loadu_ps( __mmask8 k, void * s);
</pre>
<pre>VMOVUPS void _mm_mask_storeu_ps( void * d, __mmask8 k, __m128 a);
</pre>
<pre>MOVUPS __m256 _mm256_loadu_ps ( float * p);
</pre>
<pre>MOVUPS void _mm256_storeu_ps( float *p, __m256 a);
</pre>
<pre>MOVUPS __m128 _mm_loadu_ps ( float * p);
</pre>
<pre>MOVUPS void _mm_storeu_ps( float *p, __m128 a);
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>Non-EVEX-encoded instruction, see <span class="not-imported">Table 2-21</span>, “Type 4 Class Exception Conditions.”</p>
<p>Note that the treatment of #AC varies.</p>
<p>EVEX-encoded instruction, see Exceptions Type E4.nb in <span class="not-imported">Table 2-49</span>, “Type E4 Class Exception Conditions.”</p>
<p>Additionally:</p>
<table>
<tr>
<td>#UD</td>
<td>If EVEX.vvvv != 1111B or VEX.vvvv != 1111B.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
incomplete or broken in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developer's Manual</a> for anything serious.
</p></footer></body></html>