<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>PMULHUW
— Multiply Packed Unsigned Integers and Store High Result</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>PMULHUW
— Multiply Packed Unsigned Integers and Store High Result</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>NP 0F E4 /r<sup>1</sup> PMULHUW mm1, mm2/m64</td>
<td>A</td>
<td>V/V</td>
<td>SSE</td>
<td>Multiply the packed unsigned word integers in mm1 register and mm2/m64, and store the high 16 bits of the results in mm1.</td></tr>
<tr>
<td>66 0F E4 /r PMULHUW xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>SSE2</td>
<td>Multiply the packed unsigned word integers in xmm1 and xmm2/m128, and store the high 16 bits of the results in xmm1.</td></tr>
<tr>
<td>VEX.128.66.0F.WIG E4 /r VPMULHUW xmm1, xmm2, xmm3/m128</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Multiply the packed unsigned word integers in xmm2 and xmm3/m128, and store the high 16 bits of the results in xmm1.</td></tr>
<tr>
<td>VEX.256.66.0F.WIG E4 /r VPMULHUW ymm1, ymm2, ymm3/m256</td>
<td>B</td>
<td>V/V</td>
<td>AVX2</td>
<td>Multiply the packed unsigned word integers in ymm2 and ymm3/m256, and store the high 16 bits of the results in ymm1.</td></tr>
<tr>
<td>EVEX.128.66.0F.WIG E4 /r VPMULHUW xmm1 {k1}{z}, xmm2, xmm3/m128</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512BW</td>
<td>Multiply the packed unsigned word integers in xmm2 and xmm3/m128, and store the high 16 bits of the results in xmm1 under writemask k1.</td></tr>
<tr>
<td>EVEX.256.66.0F.WIG E4 /r VPMULHUW ymm1 {k1}{z}, ymm2, ymm3/m256</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512BW</td>
<td>Multiply the packed unsigned word integers in ymm2 and ymm3/m256, and store the high 16 bits of the results in ymm1 under writemask k1.</td></tr>
<tr>
<td>EVEX.512.66.0F.WIG E4 /r VPMULHUW zmm1 {k1}{z}, zmm2, zmm3/m512</td>
<td>C</td>
<td>V/V</td>
<td>AVX512BW</td>
<td>Multiply the packed unsigned word integers in zmm2 and zmm3/m512, and store the high 16 bits of the results in zmm1 under writemask k1.</td></tr></table>
<blockquote>
<p>1. See note in Section 2.5, “Intel® AVX and Intel® SSE Instruction Exception Classification,” in the Intel<sup>®</sup> 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A, and Section 23.25.3, “Exception Conditions of Legacy SIMD Instructions Operating on MMX Registers,” in the Intel<sup>®</sup> 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B.</p></blockquote>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
¶
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr>
<tr>
<td>C</td>
<td>Full Mem</td>
<td>ModRM:reg (w)</td>
<td>EVEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
¶
</a></h2>
<p>Performs a SIMD unsigned multiply of the packed unsigned word integers in the destination operand (first operand) and the source operand (second operand), and stores the high 16 bits of each 32-bit intermediate result in the destination operand. (<a href='pmulhuw.html#fig-4-12'>Figure 4-12</a> shows this operation when using 64-bit operands.)</p>
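<p>As a rough illustration of the per-word computation described above, a single lane can be modeled in scalar C. This is a sketch, not the instruction's definition; the helper name mulhi_u16 is an assumption made for this example.</p>

```c
#include <stdint.h>

/* Hypothetical helper: models one 16-bit lane of PMULHUW.
 * Both inputs are widened to 32 bits, multiplied as unsigned values,
 * and the upper 16 bits of the 32-bit product are returned. */
static uint16_t mulhi_u16(uint16_t a, uint16_t b)
{
    uint32_t product = (uint32_t)a * (uint32_t)b; /* unsigned 16x16 -> 32 */
    return (uint16_t)(product >> 16);             /* keep bits 31:16 */
}
```

<p>For instance, 0xFFFF multiplied by 0xFFFF gives 0xFFFE0001, so the word stored in the destination is 0xFFFE.</p>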
<p>In 64-bit mode, when not encoded with VEX/EVEX, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p>
<p>Legacy SSE version 64-bit operand: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand is an MMX technology register.</p>
<p>128-bit Legacy SSE version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p>
<p>VEX.128 encoded version: The first source and destination operands are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the destination YMM register are zeroed. VEX.L must be 0; otherwise the instruction causes a #UD exception.</p>
<p>VEX.256 encoded version: The second source operand can be a YMM register or a 256-bit memory location. The first source and destination operands are YMM registers.</p>
<p>EVEX encoded versions: The first source operand is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register or a 512/256/128-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with writemask k1.</p>
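<p>The difference in upper-bit handling between the 128-bit legacy SSE form and the VEX.128 form can be sketched with a simple register model. The vreg type, the MAX_WORDS constant (assuming MAXVL is 512 bits), and both function names are assumptions made for this illustration only.</p>

```c
#include <stdint.h>
#include <string.h>

#define MAX_WORDS 32 /* assumption: MAXVL = 512 bits = 32 words */

/* Hypothetical model of a full-width vector register as 16-bit words. */
typedef struct { uint16_t w[MAX_WORDS]; } vreg;

/* Legacy SSE form: dest is also the first source;
 * words above bit 127 are left unchanged. */
static void pmulhuw_sse128(vreg *dest, const vreg *src)
{
    for (int i = 0; i < 8; i++)
        dest->w[i] = (uint16_t)(((uint32_t)dest->w[i] * src->w[i]) >> 16);
}

/* VEX.128 form: separate destination; bits MAXVL-1:128 are zeroed. */
static void vpmulhuw_vex128(vreg *dest, const vreg *src1, const vreg *src2)
{
    vreg out;
    memset(&out, 0, sizeof out);            /* upper words become 0 */
    for (int i = 0; i < 8; i++)
        out.w[i] = (uint16_t)(((uint32_t)src1->w[i] * src2->w[i]) >> 16);
    *dest = out;                            /* safe even if dest aliases a source */
}
```

<p>In this model, the legacy form preserves words 8 through 31 of the destination, while the VEX.128 form clears them.</p>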
<figure id="fig-4-12">
<svg style="width: 455.616pt; height: 143.2799759999999pt" viewBox="109.10000000000001 0.0 384.68 124.39997999999991">
<g xmlns="http://www.w3.org/2000/svg" style="fill: none; stroke: none">
<rect height="118.44" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="111.60000000000001" y="0.479979999999955"></rect>
<rect height="118.44" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="490.8" y="0.479979999999955"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="379.68" x="111.60000000000001" y="0.0"></rect>
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="379.68" x="111.60000000000001" y="118.91996999999992"></rect>
<rect height="0.48004" style="fill: rgb(0%, 0%, 0%)" width="72.24" x="170.04" y="66.65993999999989"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="241.8" y="66.89997999999991"></rect>
<rect height="0.48004" style="fill: rgb(0%, 0%, 0%)" width="72.24" x="169.8" y="84.65993999999989"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="169.8" y="66.6599799999999"></rect>
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="72.0" x="242.04" y="66.89997999999991"></rect>
<rect height="0.48004" style="fill: rgb(0%, 0%, 0%)" width="72.24" x="242.04" y="66.65993999999989"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="313.8" y="66.89997999999991"></rect>
<rect height="0.48004" style="fill: rgb(0%, 0%, 0%)" width="72.24" x="241.8" y="84.65993999999989"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="241.8" y="66.6599799999999"></rect>
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="72.0" x="314.04" y="66.89997999999991"></rect>
<rect height="0.48004" style="fill: rgb(0%, 0%, 0%)" width="72.24" x="314.04" y="66.65993999999989"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="385.8" y="66.89997999999991"></rect>
<rect height="0.48004" style="fill: rgb(0%, 0%, 0%)" width="72.24" x="313.8" y="84.65993999999989"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="313.8" y="66.6599799999999"></rect>
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="72.0" x="386.04" y="66.89997999999991"></rect>
<rect height="0.48004" style="fill: rgb(0%, 0%, 0%)" width="72.24" x="386.04" y="66.65993999999989"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="457.8" y="66.89997999999991"></rect>
<rect height="0.48004" style="fill: rgb(0%, 0%, 0%)" width="72.24" x="385.8" y="84.65993999999989"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="385.8" y="66.6599799999999"></rect>
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="241.08" y="10.85997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="241.08" y="10.620000000000005"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="277.2" y="10.85997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="240.84" y="28.620000000000005"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="240.84" y="10.619979999999941"></rect>
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="277.44" y="10.85997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="277.44" y="10.620000000000005"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="313.56" y="10.85997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="277.2" y="28.620000000000005"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="277.2" y="10.619979999999941"></rect>
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="313.8" y="10.85997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="313.8" y="10.620000000000005"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="349.92" y="10.85997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="313.56" y="28.620000000000005"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="313.56" y="10.619979999999941"></rect>
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.42" x="350.16" y="10.85997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="36.660000000000004" x="350.16" y="10.620000000000005"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="386.34000000000003" y="10.85997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="36.660000000000004" x="349.92" y="28.620000000000005"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="349.92" y="10.619979999999941"></rect>
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="241.02" y="37.07997999999998"></rect>
<rect height="0.48004" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="241.02" y="36.839939999999956"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="277.14" y="37.07997999999998"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="240.78" y="54.84000000000003"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="240.78" y="36.83997999999997"></rect>
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.42" x="277.38" y="37.07997999999998"></rect>
<rect height="0.48004" style="fill: rgb(0%, 0%, 0%)" width="36.660000000000004" x="277.38" y="36.839939999999956"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="313.56" y="37.07997999999998"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="36.660000000000004" x="277.14" y="54.84000000000003"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="277.14" y="36.83997999999997"></rect>
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="313.8" y="37.07997999999998"></rect>
<rect height="0.48004" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="313.8" y="36.839939999999956"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="349.92" y="37.07997999999998"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="313.56" y="54.84000000000003"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="313.56" y="36.83997999999997"></rect>
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="350.16" y="37.07997999999998"></rect>
<rect height="0.48004" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="350.16" y="36.839939999999956"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="386.28000000000003" y="37.07997999999998"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="349.92" y="54.84000000000003"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="349.92" y="36.83997999999997"></rect>
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.42" x="240.96" y="93.29997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="36.660000000000004" x="240.96" y="93.05999999999989"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="277.14" y="93.29997999999989"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="36.660000000000004" x="240.72" y="111.05999999999989"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="240.72" y="93.05997999999994"></rect>
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="277.38" y="93.29997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="277.38" y="93.05999999999989"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="313.5" y="93.29997999999989"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="277.14" y="111.05999999999989"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="277.14" y="93.05997999999994"></rect>
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="313.74" y="93.29997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="313.74" y="93.05999999999989"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="349.86" y="93.29997999999989"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="313.5" y="111.05999999999989"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="313.5" y="93.05997999999994"></rect>
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.42" x="350.1" y="93.29997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="36.660000000000004" x="350.1" y="93.05999999999989"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="386.28000000000003" y="93.29997999999989"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="36.660000000000004" x="349.86" y="111.05999999999989"></rect>
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="349.86" y="93.05997999999994"></rect>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.917956000000004pt; fill: #000" textLength="16.86740279999998" x="202.3212714" y="23.22640209999986">SRC</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.917956000000004pt; fill: #000" textLength="9.783381299999974" x="252.60000000000002" y="23.946080999999936">X3</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.917956000000004pt; fill: #000" textLength="9.783381299999974" x="288.11919589999997" y="23.946080999999936">X2</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.917956000000004pt; fill: #000" textLength="9.783381299999974" x="325.13830249999995" y="23.946080999999936">X1</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.917956000000004pt; fill: #000" textLength="9.783381299999974" x="361.6172174999999" y="23.946080999999936">X0</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.917956000000004pt; fill: #000" textLength="21.32638079999998" x="202.3212714" y="50.04659539999989">DEST</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.917956000000004pt; fill: #000" textLength="9.783381300000002" x="255.78" y="51.24608099999989">Y3</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.917956000000004pt; fill: #000" textLength="9.783381299999974" x="291.2991959" y="51.24608099999989">Y2</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.917956000000004pt; fill: #000" textLength="9.783381299999974" x="328.31830249999996" y="51.24608099999989">Y1</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.917956000000004pt; fill: #000" textLength="9.783381299999974" x="364.79721749999993" y="51.24608099999989">Y0</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.918266199999948pt; fill: #000" textLength="46.01255619999998" x="181.8604716" y="79.62608099999989">Z3 = X3 ∗ Y3</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.917956000000004pt; fill: #000" textLength="46.09925509999999" x="254.9289556" y="79.62608099999989">Z2 = X2 ∗ Y2</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.917956000000004pt; fill: #000" textLength="45.94915930000002" x="328.263936" y="79.62608099999989">Z1 = X1 ∗ Y1</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.917956000000004pt; fill: #000" textLength="46.096263299999976" x="397.248751" y="79.62608099999989">Z0 = X0 ∗ Y0</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.917956000000004pt; fill: #000" textLength="22.194203399999992" x="144.1209579" y="80.46597159999988">TEMP</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.917956000000004pt; fill: #000" textLength="21.32638079999998" x="203.4016546" y="105.36669019999988">DEST</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.720880000000022pt; fill: #000" textLength="30.092256000000077" x="244.98000000000002" y="107.2233399999999">Z3[31:16]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.720880000000022pt; fill: #000" textLength="30.092256000000134" x="280.55464800000004" y="107.2233399999999">Z2[31:16]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.720880000000022pt; fill: #000" textLength="30.092256000000134" x="317.5693200000001" y="107.2233399999999">Z1[31:16]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.720880000000022pt; fill: #000" textLength="30.09016800000012" x="353.9840400000002" y="107.2233399999999">Z0[31:16]</text></g></svg>
<figcaption><a href='pmulhuw.html#fig-4-12'>Figure 4-12</a>. PMULHUW and PMULHW Instruction Operation Using 64-bit Operands</figcaption></figure>
<h2 id="operation">Operation<a class="anchor" href="#operation">
¶
</a></h2>
<h3 id="pmulhuw--with-64-bit-operands-">PMULHUW (With 64-bit Operands)<a class="anchor" href="#pmulhuw--with-64-bit-operands-">
¶
</a></h3>
<pre>TEMP0[31:0] := DEST[15:0] ∗ SRC[15:0]; (* Unsigned multiplication *)
TEMP1[31:0] := DEST[31:16] ∗ SRC[31:16];
TEMP2[31:0] := DEST[47:32] ∗ SRC[47:32];
TEMP3[31:0] := DEST[63:48] ∗ SRC[63:48];
DEST[15:0] := TEMP0[31:16];
DEST[31:16] := TEMP1[31:16];
DEST[47:32] := TEMP2[31:16];
DEST[63:48] := TEMP3[31:16];
</pre>
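<p>The 64-bit pseudocode above can be mirrored in portable C by treating the MMX operand as four words packed into a 64-bit integer. This is only a model for checking results by hand; the function name pmulhuw_mmx is an assumption, not part of any real API.</p>

```c
#include <stdint.h>

/* Hypothetical model of PMULHUW mm1, mm2/m64:
 * four unsigned words packed in a uint64_t, word 0 in bits 15:0. */
static uint64_t pmulhuw_mmx(uint64_t dest, uint64_t src)
{
    uint64_t result = 0;
    for (int lane = 0; lane < 4; lane++) {
        unsigned shift = 16u * (unsigned)lane;
        uint32_t x = (uint32_t)((dest >> shift) & 0xFFFFu);
        uint32_t y = (uint32_t)((src  >> shift) & 0xFFFFu);
        uint32_t temp = x * y;                      /* TEMPn[31:0] */
        result |= (uint64_t)(temp >> 16) << shift;  /* DEST word n := TEMPn[31:16] */
    }
    return result;
}
```

<p>All four products are computed from the original operand values before the result is assembled, matching the order of the TEMP and DEST assignments in the pseudocode.</p>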
<h3 id="pmulhuw--with-128-bit-operands-">PMULHUW (With 128-bit Operands)<a class="anchor" href="#pmulhuw--with-128-bit-operands-">
¶
</a></h3>
<pre>TEMP0[31:0] := DEST[15:0] ∗ SRC[15:0]; (* Unsigned multiplication *)
TEMP1[31:0] := DEST[31:16] ∗ SRC[31:16];
TEMP2[31:0] := DEST[47:32] ∗ SRC[47:32];
TEMP3[31:0] := DEST[63:48] ∗ SRC[63:48];
TEMP4[31:0] := DEST[79:64] ∗ SRC[79:64];
TEMP5[31:0] := DEST[95:80] ∗ SRC[95:80];
TEMP6[31:0] := DEST[111:96] ∗ SRC[111:96];
TEMP7[31:0] := DEST[127:112] ∗ SRC[127:112];
DEST[15:0] := TEMP0[31:16];
DEST[31:16] := TEMP1[31:16];
DEST[47:32] := TEMP2[31:16];
DEST[63:48] := TEMP3[31:16];
DEST[79:64] := TEMP4[31:16];
DEST[95:80] := TEMP5[31:16];
DEST[111:96] := TEMP6[31:16];
DEST[127:112] := TEMP7[31:16];
</pre>
<h3 id="vpmulhuw--vex-128-encoded-version-">VPMULHUW (VEX.128 Encoded Version)<a class="anchor" href="#vpmulhuw--vex-128-encoded-version-">
¶
</a></h3>
<pre>TEMP0[31:0] := SRC1[15:0] * SRC2[15:0]
TEMP1[31:0] := SRC1[31:16] * SRC2[31:16]
TEMP2[31:0] := SRC1[47:32] * SRC2[47:32]
TEMP3[31:0] := SRC1[63:48] * SRC2[63:48]
TEMP4[31:0] := SRC1[79:64] * SRC2[79:64]
TEMP5[31:0] := SRC1[95:80] * SRC2[95:80]
TEMP6[31:0] := SRC1[111:96] * SRC2[111:96]
TEMP7[31:0] := SRC1[127:112] * SRC2[127:112]
DEST[15:0] := TEMP0[31:16]
DEST[31:16] := TEMP1[31:16]
DEST[47:32] := TEMP2[31:16]
DEST[63:48] := TEMP3[31:16]
DEST[79:64] := TEMP4[31:16]
DEST[95:80] := TEMP5[31:16]
DEST[111:96] := TEMP6[31:16]
DEST[127:112] := TEMP7[31:16]
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="pmulhuw--vex-256-encoded-version-">VPMULHUW (VEX.256 Encoded Version)<a class="anchor" href="#pmulhuw--vex-256-encoded-version-">
¶
</a></h3>
<pre>TEMP0[31:0] := SRC1[15:0] * SRC2[15:0]
TEMP1[31:0] := SRC1[31:16] * SRC2[31:16]
TEMP2[31:0] := SRC1[47:32] * SRC2[47:32]
TEMP3[31:0] := SRC1[63:48] * SRC2[63:48]
TEMP4[31:0] := SRC1[79:64] * SRC2[79:64]
TEMP5[31:0] := SRC1[95:80] * SRC2[95:80]
TEMP6[31:0] := SRC1[111:96] * SRC2[111:96]
TEMP7[31:0] := SRC1[127:112] * SRC2[127:112]
TEMP8[31:0] := SRC1[143:128] * SRC2[143:128]
TEMP9[31:0] := SRC1[159:144] * SRC2[159:144]
TEMP10[31:0] := SRC1[175:160] * SRC2[175:160]
TEMP11[31:0] := SRC1[191:176] * SRC2[191:176]
TEMP12[31:0] := SRC1[207:192] * SRC2[207:192]
TEMP13[31:0] := SRC1[223:208] * SRC2[223:208]
TEMP14[31:0] := SRC1[239:224] * SRC2[239:224]
TEMP15[31:0] := SRC1[255:240] * SRC2[255:240]
DEST[15:0] := TEMP0[31:16]
DEST[31:16] := TEMP1[31:16]
DEST[47:32] := TEMP2[31:16]
DEST[63:48] := TEMP3[31:16]
DEST[79:64] := TEMP4[31:16]
DEST[95:80] := TEMP5[31:16]
DEST[111:96] := TEMP6[31:16]
DEST[127:112] := TEMP7[31:16]
DEST[143:128] := TEMP8[31:16]
DEST[159:144] := TEMP9[31:16]
DEST[175:160] := TEMP10[31:16]
DEST[191:176] := TEMP11[31:16]
DEST[207:192] := TEMP12[31:16]
DEST[223:208] := TEMP13[31:16]
DEST[239:224] := TEMP14[31:16]
DEST[255:240] := TEMP15[31:16]
DEST[MAXVL-1:256] := 0
</pre>
<h3 id="pmulhuw--evex-encoded-versions-">VPMULHUW (EVEX Encoded Versions)<a class="anchor" href="#pmulhuw--evex-encoded-versions-">
¶
</a></h3>
<pre>(KL, VL) = (8, 128), (16, 256), (32, 512)
FOR j := 0 TO KL-1
    i := j * 16
    IF k1[j] OR *no writemask*
        THEN
            temp[31:0] := SRC1[i+15:i] * SRC2[i+15:i]
            DEST[i+15:i] := temp[31:16]
        ELSE
            IF *merging-masking* ; merging-masking
                THEN *DEST[i+15:i] remains unchanged*
                ELSE *zeroing-masking* ; zeroing-masking
                    DEST[i+15:i] := 0
            FI
    FI;
ENDFOR
DEST[MAXVL-1:VL] := 0
</pre>
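<p>The masked loop above can be sketched in C as follows. The function name vpmulhuw_masked, the array-based operand layout, and the zeroing flag are assumptions chosen for illustration; the sketch covers only the KL active lanes and does not model the final DEST[MAXVL-1:VL] := 0 step.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the EVEX form.
 * kl      : number of word lanes (8, 16, or 32)
 * k1      : writemask, bit j controls lane j
 * zeroing : nonzero selects zeroing-masking, zero selects merging-masking */
static void vpmulhuw_masked(uint16_t *dest, const uint16_t *src1,
                            const uint16_t *src2, uint32_t k1,
                            size_t kl, int zeroing)
{
    assert(kl == 8 || kl == 16 || kl == 32);
    for (size_t j = 0; j < kl; j++) {
        if ((k1 >> j) & 1u) {
            uint32_t temp = (uint32_t)src1[j] * (uint32_t)src2[j];
            dest[j] = (uint16_t)(temp >> 16);   /* DEST lane := temp[31:16] */
        } else if (zeroing) {
            dest[j] = 0;                        /* zeroing-masking */
        }
        /* otherwise merging-masking: dest[j] remains unchanged */
    }
}
```

<p>Passing a mask with every bit set reproduces the unmasked behavior for the active lanes.</p>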
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
¶
</a></h2>
<pre>VPMULHUW __m512i _mm512_mulhi_epu16(__m512i a, __m512i b);
</pre>
<pre>VPMULHUW __m512i _mm512_mask_mulhi_epu16(__m512i s, __mmask32 k, __m512i a, __m512i b);
</pre>
<pre>VPMULHUW __m512i _mm512_maskz_mulhi_epu16(__mmask32 k, __m512i a, __m512i b);
</pre>
<pre>VPMULHUW __m256i _mm256_mask_mulhi_epu16(__m256i s, __mmask16 k, __m256i a, __m256i b);
</pre>
<pre>VPMULHUW __m256i _mm256_maskz_mulhi_epu16(__mmask16 k, __m256i a, __m256i b);
</pre>
<pre>VPMULHUW __m128i _mm_mask_mulhi_epu16(__m128i s, __mmask8 k, __m128i a, __m128i b);
</pre>
<pre>VPMULHUW __m128i _mm_maskz_mulhi_epu16(__mmask8 k, __m128i a, __m128i b);
</pre>
<pre>PMULHUW __m64 _mm_mulhi_pu16(__m64 a, __m64 b);
</pre>
<pre>(V)PMULHUW __m128i _mm_mulhi_epu16(__m128i a, __m128i b);
</pre>
<pre>VPMULHUW __m256i _mm256_mulhi_epu16(__m256i a, __m256i b);
</pre>
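<p>A minimal usage sketch for the 128-bit intrinsic listed above follows. The wrapper name mulhi_epu16_x8 is an assumption for this example; the code uses _mm_mulhi_epu16 when the compiler defines __SSE2__ and otherwise falls back to a scalar loop that is intended to give the same result.</p>

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2: _mm_loadu_si128, _mm_mulhi_epu16, _mm_storeu_si128 */
#endif

/* Hypothetical wrapper: out[i] = high 16 bits of a[i] * b[i], for 8 words. */
static void mulhi_epu16_x8(const uint16_t a[8], const uint16_t b[8],
                           uint16_t out[8])
{
#if defined(__SSE2__)
    __m128i va = _mm_loadu_si128((const __m128i *)a);   /* unaligned loads */
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_mulhi_epu16(va, vb));
#else
    for (int i = 0; i < 8; i++)   /* portable fallback, same per-word result */
        out[i] = (uint16_t)(((uint32_t)a[i] * b[i]) >> 16);
#endif
}
```

<p>The unaligned load and store intrinsics are used so that the arrays need no particular alignment.</p>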
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
¶
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="numeric-exceptions">Numeric Exceptions<a class="anchor" href="#numeric-exceptions">
¶
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
¶
</a></h2>
<p>Non-EVEX-encoded instruction, see <span class="not-imported">Table 2-21</span>, “Type 4 Class Exception Conditions.”</p>
<p>EVEX-encoded instruction, see Exceptions Type E4.nb in <span class="not-imported">Table 2-49</span>, “Type E4 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developer’s Manual</a> for anything serious.
</p></footer></body></html>