ia32-64/x86/movntdq.html
2025-07-08 02:23:29 -03:00

124 lines
6.1 KiB
HTML
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>MOVNTDQ
— Store Packed Integers Using Non-Temporal Hint</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>MOVNTDQ
— Store Packed Integers Using Non-Temporal Hint</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op / En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>66 0F E7 /r MOVNTDQ m128, xmm1</td>
<td>A</td>
<td>V/V</td>
<td>SSE2</td>
<td>Move packed integer values in xmm1 to m128 using non-temporal hint.</td></tr>
<tr>
<td>VEX.128.66.0F.WIG E7 /r VMOVNTDQ m128, xmm1</td>
<td>A</td>
<td>V/V</td>
<td>AVX</td>
<td>Move packed integer values in xmm1 to m128 using non-temporal hint.</td></tr>
<tr>
<td>VEX.256.66.0F.WIG E7 /r VMOVNTDQ m256, ymm1</td>
<td>A</td>
<td>V/V</td>
<td>AVX</td>
<td>Move packed integer values in ymm1 to m256 using non-temporal hint.</td></tr>
<tr>
<td>EVEX.128.66.0F.W0 E7 /r VMOVNTDQ m128, xmm1</td>
<td>B</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Move packed integer values in xmm1 to m128 using non-temporal hint.</td></tr>
<tr>
<td>EVEX.256.66.0F.W0 E7 /r VMOVNTDQ m256, ymm1</td>
<td>B</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Move packed integer values in zmm1 to m256 using non-temporal hint.</td></tr>
<tr>
<td>EVEX.512.66.0F.W0 E7 /r VMOVNTDQ m512, zmm1</td>
<td>B</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Move packed integer values in zmm1 to m512 using non-temporal hint.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<sup>1</sup><a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<blockquote>
<p>1. ModRM.MOD != 011B</p></blockquote>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:r/m (w)</td>
<td>ModRM:reg (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>Full Mem</td>
<td>ModRM:r/m (w)</td>
<td>ModRM:reg (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Moves the packed integers in the source operand (second operand) to the destination operand (first operand) using a non-temporal hint to prevent caching of the data during the write to memory. The source operand is an XMM register, YMM register or ZMM register, which is assumed to contain integer data (packed bytes, words, double-words, or quadwords). The destination operand is a 128-bit, 256-bit or 512-bit memory location. The memory operand must be aligned on a 16-byte (128-bit version), 32-byte (VEX.256 encoded version) or 64-byte (512-bit version) boundary otherwise a general-protection exception (#GP) will be generated.</p>
<p>The non-temporal hint is implemented by using a write combining (WC) memory type protocol when writing the data to memory. Using this protocol, the processor does not write the data into the cache hierarchy, nor does it fetch the corresponding cache line from memory into the cache hierarchy. The memory type of the region being written to can override the non-temporal hint, if the memory address specified for the non-temporal store is in an uncacheable (UC) or write protected (WP) memory region. For more information on non-temporal stores, see “Caching of Temporal vs. Non-Temporal Data” in Chapter 10 in the IA-32 Intel Architecture Software Developers Manual, Volume 1.</p>
<p>Because the WC protocol uses a weakly-ordered memory consistency model, a fencing operation implemented with the SFENCE or MFENCE instruction should be used in conjunction with VMOVNTDQ instructions if multiple processors might use different memory types to read/write the destination memory locations.</p>
<p>Note: VEX.vvvv and EVEX.vvvv are reserved and must be 1111b, VEX.L must be 0; otherwise instructions will #UD.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="vmovntdq-evex-encoded-versions-">VMOVNTDQ(EVEX Encoded Versions)<a class="anchor" href="#vmovntdq-evex-encoded-versions-">
</a></h3>
<pre>VL = 128, 256, 512
DEST[VL-1:0] := SRC[VL-1:0]
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="movntdq--legacy-and-vex-versions-">MOVNTDQ (Legacy and VEX Versions)<a class="anchor" href="#movntdq--legacy-and-vex-versions-">
</a></h3>
<pre>DEST := SRC
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>VMOVNTDQ void _mm512_stream_si512(void * p, __m512i a);
</pre>
<pre>VMOVNTDQ void _mm256_stream_si256 (__m256i * p, __m256i a);
</pre>
<pre>MOVNTDQ void _mm_stream_si128 (__m128i * p, __m128i a);
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>Non-EVEX-encoded instruction, see Exceptions Type1.SSE2 in <span class="not-imported">Table 2-18</span>, “Type 1 Class Exception Conditions.”</p>
<p>EVEX-encoded instruction, see <span class="not-imported">Table 2-45</span>, “Type E1NF Class Exception Conditions.”</p>
<p>Additionally:</p>
<table>
<tr>
<td>#UD</td>
<td>If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>