ia32-64/x86/vpshrd.html
2025-07-08 02:23:29 -03:00

212 lines
8.7 KiB
HTML
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>VPSHRD
— Concatenate and Shift Packed Data Right Logical</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>VPSHRD
— Concatenate and Shift Packed Data Right Logical</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>EVEX.128.66.0F3A.W1 72 /r /ib VPSHRDW xmm1{k1}{z}, xmm2, xmm3/m128, imm8</td>
<td>A</td>
<td>V/V</td>
<td>AVX512_VBMI2 AVX512VL</td>
<td>Concatenate destination and source operands, extract result shifted to the right by constant value in imm8 into xmm1.</td></tr>
<tr>
<td>EVEX.256.66.0F3A.W1 72 /r /ib VPSHRDW ymm1{k1}{z}, ymm2, ymm3/m256, imm8</td>
<td>A</td>
<td>V/V</td>
<td>AVX512_VBMI2 AVX512VL</td>
<td>Concatenate destination and source operands, extract result shifted to the right by constant value in imm8 into ymm1.</td></tr>
<tr>
<td>EVEX.512.66.0F3A.W1 72 /r /ib VPSHRDW zmm1{k1}{z}, zmm2, zmm3/m512, imm8</td>
<td>A</td>
<td>V/V</td>
<td>AVX512_VBMI2</td>
<td>Concatenate destination and source operands, extract result shifted to the right by constant value in imm8 into zmm1.</td></tr>
<tr>
<td>EVEX.128.66.0F3A.W0 73 /r /ib VPSHRDD xmm1{k1}{z}, xmm2, xmm3/m128/m32bcst, imm8</td>
<td>B</td>
<td>V/V</td>
<td>AVX512_VBMI2 AVX512VL</td>
<td>Concatenate destination and source operands, extract result shifted to the right by constant value in imm8 into xmm1.</td></tr>
<tr>
<td>EVEX.256.66.0F3A.W0 73 /r /ib VPSHRDD ymm1{k1}{z}, ymm2, ymm3/m256/m32bcst, imm8</td>
<td>B</td>
<td>V/V</td>
<td>AVX512_VBMI2 AVX512VL</td>
<td>Concatenate destination and source operands, extract result shifted to the right by constant value in imm8 into ymm1.</td></tr>
<tr>
<td>EVEX.512.66.0F3A.W0 73 /r /ib VPSHRDD zmm1{k1}{z}, zmm2, zmm3/m512/m32bcst, imm8</td>
<td>B</td>
<td>V/V</td>
<td>AVX512_VBMI2</td>
<td>Concatenate destination and source operands, extract result shifted to the right by constant value in imm8 into zmm1.</td></tr>
<tr>
<td>EVEX.128.66.0F3A.W1 73 /r /ib VPSHRDQ xmm1{k1}{z}, xmm2, xmm3/m128/m64bcst, imm8</td>
<td>B</td>
<td>V/V</td>
<td>AVX512_VBMI2 AVX512VL</td>
<td>Concatenate destination and source operands, extract result shifted to the right by constant value in imm8 into xmm1.</td></tr>
<tr>
<td>EVEX.256.66.0F3A.W1 73 /r /ib VPSHRDQ ymm1{k1}{z}, ymm2, ymm3/m256/m64bcst, imm8</td>
<td>B</td>
<td>V/V</td>
<td>AVX512_VBMI2 AVX512VL</td>
<td>Concatenate destination and source operands, extract result shifted to the right by constant value in imm8 into ymm1.</td></tr>
<tr>
<td>EVEX.512.66.0F3A.W1 73 /r /ib VPSHRDQ zmm1{k1}{z}, zmm2, zmm3/m512/m64bcst, imm8</td>
<td>B</td>
<td>V/V</td>
<td>AVX512_VBMI2</td>
<td>Concatenate destination and source operands, extract result shifted to the right by constant value in imm8 into zmm1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>Full Mem</td>
<td>ModRM:reg (w)</td>
<td>EVEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>imm8 (r)</td></tr>
<tr>
<td>B</td>
<td>Full</td>
<td>ModRM:reg (w)</td>
<td>EVEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>imm8 (r)</td></tr></table>
<h3 id="description">Description<a class="anchor" href="#description">
</a></h3>
<p>Concatenate packed data, extract result shifted to the right by constant value.</p>
<p>This instruction supports memory fault suppression.</p>
<h3 id="operation">Operation<a class="anchor" href="#operation">
</a></h3>
<h4 id="vpshrdw-dest--src2--src3--imm8">VPSHRDW DEST, SRC2, SRC3, imm8<a class="anchor" href="#vpshrdw-dest--src2--src3--imm8">
</a></h4>
<pre>(KL, VL) = (8, 128), (16, 256), (32, 512)
FOR j := 0 TO KL-1:
IF MaskBit(j) OR *no writemask*:
DEST.word[j] := concat(SRC3.word[j], SRC2.word[j]) &gt;&gt; (imm8 &amp; 15)
ELSE IF *zeroing*:
DEST.word[j] := 0
*ELSE DEST.word[j] remains unchanged*
DEST[MAX_VL-1:VL] := 0
</pre>
<h4 id="vpshrdd-dest--src2--src3--imm8">VPSHRDD DEST, SRC2, SRC3, imm8<a class="anchor" href="#vpshrdd-dest--src2--src3--imm8">
</a></h4>
<pre>(KL, VL) = (4, 128), (8, 256), (16, 512)
FOR j := 0 TO KL-1:
IF SRC3 is broadcast memop:
tsrc3 := SRC3.dword[0]
ELSE:
tsrc3 := SRC3.dword[j]
IF MaskBit(j) OR *no writemask*:
DEST.dword[j] := concat(tsrc3, SRC2.dword[j]) &gt;&gt; (imm8 &amp; 31)
ELSE IF *zeroing*:
DEST.dword[j] := 0
*ELSE DEST.dword[j] remains unchanged*
DEST[MAX_VL-1:VL] := 0
</pre>
<h4 id="vpshrdq-dest--src2--src3--imm8">VPSHRDQ DEST, SRC2, SRC3, imm8<a class="anchor" href="#vpshrdq-dest--src2--src3--imm8">
</a></h4>
<pre>(KL, VL) = (2, 128), (4, 256), (8, 512)
FOR j := 0 TO KL-1:
IF SRC3 is broadcast memop:
tsrc3 := SRC3.qword[0]
ELSE:
tsrc3 := SRC3.qword[j]
IF MaskBit(j) OR *no writemask*:
DEST.qword[j] := concat(tsrc3, SRC2.qword[j]) &gt;&gt; (imm8 &amp; 63)
ELSE IF *zeroing*:
DEST.qword[j] := 0
*ELSE DEST.qword[j] remains unchanged*
DEST[MAX_VL-1:VL] := 0
</pre>
<h3 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h3>
<pre>VPSHRDQ __m128i _mm_shrdi_epi64(__m128i, __m128i, int);
</pre>
<pre>VPSHRDQ __m128i _mm_mask_shrdi_epi64(__m128i, __mmask8, __m128i, __m128i, int);
</pre>
<pre>VPSHRDQ __m128i _mm_maskz_shrdi_epi64(__mmask8, __m128i, __m128i, int);
</pre>
<pre>VPSHRDQ __m256i _mm256_shrdi_epi64(__m256i, __m256i, int);
</pre>
<pre>VPSHRDQ __m256i _mm256_mask_shrdi_epi64(__m256i, __mmask8, __m256i, __m256i, int);
</pre>
<pre>VPSHRDQ __m256i _mm256_maskz_shrdi_epi64(__mmask8, __m256i, __m256i, int);
</pre>
<pre>VPSHRDQ __m512i _mm512_shrdi_epi64(__m512i, __m512i, int);
</pre>
<pre>VPSHRDQ __m512i _mm512_mask_shrdi_epi64(__m512i, __mmask8, __m512i, __m512i, int);
</pre>
<pre>VPSHRDQ __m512i _mm512_maskz_shrdi_epi64(__mmask8, __m512i, __m512i, int);
</pre>
<pre>VPSHRDD __m128i _mm_shrdi_epi32(__m128i, __m128i, int);
</pre>
<pre>VPSHRDD __m128i _mm_mask_shrdi_epi32(__m128i, __mmask8, __m128i, __m128i, int);
</pre>
<pre>VPSHRDD __m128i _mm_maskz_shrdi_epi32(__mmask8, __m128i, __m128i, int);
</pre>
<pre>VPSHRDD __m256i _mm256_shrdi_epi32(__m256i, __m256i, int);
</pre>
<pre>VPSHRDD __m256i _mm256_mask_shrdi_epi32(__m256i, __mmask8, __m256i, __m256i, int);
</pre>
<pre>VPSHRDD __m256i _mm256_maskz_shrdi_epi32(__mmask8, __m256i, __m256i, int);
</pre>
<pre>VPSHRDD __m512i _mm512_shrdi_epi32(__m512i, __m512i, int);
</pre>
<pre>VPSHRDD __m512i _mm512_mask_shrdi_epi32(__m512i, __mmask16, __m512i, __m512i, int);
</pre>
<pre>VPSHRDD __m512i _mm512_maskz_shrdi_epi32(__mmask16, __m512i, __m512i, int);
</pre>
<pre>VPSHRDW __m128i _mm_shrdi_epi16(__m128i, __m128i, int);
</pre>
<pre>VPSHRDW __m128i _mm_mask_shrdi_epi16(__m128i, __mmask8, __m128i, __m128i, int);
</pre>
<pre>VPSHRDW __m128i _mm_maskz_shrdi_epi16(__mmask8, __m128i, __m128i, int);
</pre>
<pre>VPSHRDW __m256i _mm256_shrdi_epi16(__m256i, __m256i, int);
</pre>
<pre>VPSHRDW __m256i _mm256_mask_shrdi_epi16(__m256i, __mmask16, __m256i, __m256i, int);
</pre>
<pre>VPSHRDW __m256i _mm256_maskz_shrdi_epi16(__mmask16, __m256i, __m256i, int);
</pre>
<pre>VPSHRDW __m512i _mm512_shrdi_epi16(__m512i, __m512i, int);
</pre>
<pre>VPSHRDW __m512i _mm512_mask_shrdi_epi16(__m512i, __mmask32, __m512i, __m512i, int);
</pre>
<pre>VPSHRDW __m512i _mm512_maskz_shrdi_epi16(__mmask32, __m512i, __m512i, int);
</pre>
<h3 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h3>
<p>None.</p>
<h3 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h3>
<p>See <span class="not-imported">Table 2-49</span>, “Type E4 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>