392 lines
33 KiB
HTML
392 lines
33 KiB
HTML
|
<!DOCTYPE html>
|
|||
|
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>PSADBW
|
|||
|
— Compute Sum of Absolute Differences</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>PSADBW
|
|||
|
— Compute Sum of Absolute Differences</h1>
|
|||
|
|
|||
|
<table>
|
|||
|
<tr>
|
|||
|
<th>Opcode/Instruction</th>
|
|||
|
<th>Op/En</th>
|
|||
|
<th>64/32 bit Mode Support</th>
|
|||
|
<th>CPUID Feature Flag</th>
|
|||
|
<th>Description</th></tr>
|
|||
|
<tr>
|
|||
|
<td>NP 0F F6 /r<sup>1</sup> PSADBW mm1, mm2/m64</td>
|
|||
|
<td>A</td>
|
|||
|
<td>V/V</td>
|
|||
|
<td>SSE</td>
|
|||
|
<td>Computes the absolute differences of the packed unsigned byte integers from mm2 /m64 and mm1; differences are then summed to produce an unsigned word integer result.</td></tr>
|
|||
|
<tr>
|
|||
|
<td>66 0F F6 /r PSADBW xmm1, xmm2/m128</td>
|
|||
|
<td>A</td>
|
|||
|
<td>V/V</td>
|
|||
|
<td>SSE2</td>
|
|||
|
<td>Computes the absolute differences of the packed unsigned byte integers from xmm2 /m128 and xmm1; the 8 low differences and 8 high differences are then summed separately to produce two unsigned word integer results.</td></tr>
|
|||
|
<tr>
|
|||
|
<td>VEX.128.66.0F.WIG F6 /r VPSADBW xmm1, xmm2, xmm3/m128</td>
|
|||
|
<td>B</td>
|
|||
|
<td>V/V</td>
|
|||
|
<td>AVX</td>
|
|||
|
<td>Computes the absolute differences of the packed unsigned byte integers from xmm3 /m128 and xmm2; the 8 low differences and 8 high differences are then summed separately to produce two unsigned word integer results.</td></tr>
|
|||
|
<tr>
|
|||
|
<td>VEX.256.66.0F.WIG F6 /r VPSADBW ymm1, ymm2, ymm3/m256</td>
|
|||
|
<td>B</td>
|
|||
|
<td>V/V</td>
|
|||
|
<td>AVX2</td>
|
|||
|
<td>Computes the absolute differences of the packed unsigned byte integers from ymm3 /m256 and ymm2; then each consecutive 8 differences are summed separately to produce four unsigned word integer results.</td></tr>
|
|||
|
<tr>
|
|||
|
<td>EVEX.128.66.0F.WIG F6 /r VPSADBW xmm1, xmm2, xmm3/m128</td>
|
|||
|
<td>C</td>
|
|||
|
<td>V/V</td>
|
|||
|
<td>AVX512VL AVX512BW</td>
|
|||
|
<td>Computes the absolute differences of the packed unsigned byte integers from xmm3 /m128 and xmm2; then each consecutive 8 differences are summed separately to produce two unsigned word integer results.</td></tr>
|
|||
|
<tr>
|
|||
|
<td>EVEX.256.66.0F.WIG F6 /r VPSADBW ymm1, ymm2, ymm3/m256</td>
|
|||
|
<td>C</td>
|
|||
|
<td>V/V</td>
|
|||
|
<td>AVX512VL AVX512BW</td>
|
|||
|
<td>Computes the absolute differences of the packed unsigned byte integers from ymm3 /m256 and ymm2; then each consecutive 8 differences are summed separately to produce four unsigned word integer results.</td></tr>
|
|||
|
<tr>
|
|||
|
<td>EVEX.512.66.0F.WIG F6 /r VPSADBW zmm1, zmm2, zmm3/m512</td>
|
|||
|
<td>C</td>
|
|||
|
<td>V/V</td>
|
|||
|
<td>AVX512BW</td>
|
|||
|
<td>Computes the absolute differences of the packed unsigned byte integers from zmm3 /m512 and zmm2; then each consecutive 8 differences are summed separately to produce eight unsigned word integer results.</td></tr></table>
|
|||
|
<blockquote>
|
|||
|
<p>1. See note in Section 2.5, “Intel® AVX and Intel® SSE Instruction Exception Classification,” in the Intel<sup>®</sup> 64 and IA-32 Architectures Software Developer’s Manual, Volume 2A, and Section 23.25.3, “Exception Conditions of Legacy SIMD Instructions Operating on MMX Registers,” in the Intel<sup>®</sup> 64 and IA-32 Architectures Software Developer’s Manual, Volume 3B.</p></blockquote>
|
|||
|
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
|
|||
|
¶
|
|||
|
</a></h2>
|
|||
|
<table>
|
|||
|
<tr>
|
|||
|
<th>Op/En</th>
|
|||
|
<th>Tuple Type</th>
|
|||
|
<th>Operand 1</th>
|
|||
|
<th>Operand 2</th>
|
|||
|
<th>Operand 3</th>
|
|||
|
<th>Operand 4</th></tr>
|
|||
|
<tr>
|
|||
|
<td>A</td>
|
|||
|
<td>N/A</td>
|
|||
|
<td>ModRM:reg (r, w)</td>
|
|||
|
<td>ModRM:r/m (r)</td>
|
|||
|
<td>N/A</td>
|
|||
|
<td>N/A</td></tr>
|
|||
|
<tr>
|
|||
|
<td>B</td>
|
|||
|
<td>N/A</td>
|
|||
|
<td>ModRM:reg (w)</td>
|
|||
|
<td>VEX.vvvv (r)</td>
|
|||
|
<td>ModRM:r/m (r)</td>
|
|||
|
<td>N/A</td></tr>
|
|||
|
<tr>
|
|||
|
<td>C</td>
|
|||
|
<td>Full Mem</td>
|
|||
|
<td>ModRM:reg (w)</td>
|
|||
|
<td>EVEX.vvvv</td>
|
|||
|
<td>ModRM:r/m (r)</td>
|
|||
|
<td>N/A</td></tr></table>
|
|||
|
<h2 id="description">Description<a class="anchor" href="#description">
|
|||
|
¶
|
|||
|
</a></h2>
|
|||
|
<p>Computes the absolute value of the difference of 8 unsigned byte integers from the source operand (second operand) and from the destination operand (first operand). These 8 differences are then summed to produce an unsigned word integer result that is stored in the destination operand. <a href='psadbw.html#fig-4-14'>Figure 4-14</a> shows the operation of the PSADBW instruction when using 64-bit operands.</p>
|
|||
|
<p>When operating on 64-bit operands, the word integer result is stored in the low word of the destination operand, and the remaining bytes in the destination operand are cleared to all 0s.</p>
|
|||
|
<p>When operating on 128-bit operands, two packed results are computed. Here, the 8 low-order bytes of the source and destination operands are operated on to produce a word result that is stored in the low word of the destination operand, and the 8 high-order bytes are operated on to produce a word result that is stored in bits 64 through 79 of the destination operand. The remaining bytes of the destination operand are cleared.</p>
|
|||
|
<p>For 256-bit version, the third group of 8 differences are summed to produce an unsigned word in bits[143:128] of the destination register and the fourth group of 8 differences are summed to produce an unsigned word in bits[207:192] of the destination register. The remaining words of the destination are set to 0.</p>
|
|||
|
<p>For 512-bit version, the fifth group result is stored in bits [271:256] of the destination. The result from the sixth group is stored in bits [335:320]. The results for the seventh and eighth group are stored respectively in bits [399:384] and bits [463:447], respectively. The remaining bits in the destination are set to 0.</p>
|
|||
|
<p>In 64-bit mode and not encoded by VEX/EVEX prefix, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p>
|
|||
|
<p>Legacy SSE version: The source operand can be an MMX technology register or a 64-bit memory location. The destination operand is an MMX technology register.</p>
|
|||
|
<p>128-bit Legacy SSE version: The first source operand and destination register are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding ZMM destination register remain unchanged.</p>
|
|||
|
<p>VEX.128 and EVEX.128 encoded versions: The first source operand and destination register are XMM registers. The second source operand is an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the corresponding ZMM register are zeroed.</p>
|
|||
|
<p>VEX.256 and EVEX.256 encoded versions: The first source operand and destination register are YMM registers. The second source operand is an YMM register or a 256-bit memory location. Bits (MAXVL-1:256) of the corresponding ZMM register are zeroed.</p>
|
|||
|
<p>EVEX.512 encoded version: The first source operand and destination register are ZMM registers. The second source operand is a ZMM register or a 512-bit memory location.</p>
|
|||
|
<figure id="fig-4-14">
|
|||
|
<svg style="width: 455.616pt; height: 158.040012pt" viewBox="109.64 0.0 384.68 136.70001">
|
|||
|
<g xmlns="http://www.w3.org/2000/svg" style="stroke: none; fill: none">
|
|||
|
<rect height="130.74" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="112.14" y="0.48000999999999294"></rect>
|
|||
|
<rect height="130.74" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="491.34000000000003" y="0.48000999999999294"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="379.68" x="112.14" y="0.0"></rect>
|
|||
|
<rect height="0.48" style="fill: rgb(0%, 0%, 0%)" width="379.68" x="112.14" y="131.22001"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="168.48" y="12.720000000000027"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="204.60000000000002" y="12.960010000000011"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="168.24" y="30.720000000000027"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="168.24" y="12.720010000000002"></rect>
|
|||
|
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="204.84" y="12.960010000000011"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="204.84" y="12.720000000000027"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="240.96" y="12.960010000000011"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="204.60000000000002" y="30.720000000000027"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="204.60000000000002" y="12.720010000000002"></rect>
|
|||
|
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="241.20000000000002" y="12.960010000000011"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="241.20000000000002" y="12.720000000000027"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="277.32" y="12.960010000000011"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="240.96" y="30.720000000000027"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="240.96" y="12.720010000000002"></rect>
|
|||
|
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.42" x="277.56" y="12.960010000000011"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.660000000000004" x="277.56" y="12.720000000000027"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="313.74" y="12.960010000000011"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.660000000000004" x="277.32" y="30.720000000000027"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="277.32" y="12.720010000000002"></rect>
|
|||
|
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="314.16" y="12.960010000000011"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="314.16" y="12.720000000000027"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="350.28000000000003" y="12.960010000000011"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="313.92" y="30.720000000000027"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="313.92" y="12.720010000000002"></rect>
|
|||
|
<path d="M 350.52 12.960010000000011 L 386.88 12.960010000000011 L 386.88 30.96001000000001 L 350.52 30.96001000000001" style="fill: rgb(100%, 100%, 100%); fill-rule: nonzero"></path>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="350.52" y="12.720000000000027"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="386.64" y="12.960010000000011"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="350.28000000000003" y="30.720000000000027"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="350.28000000000003" y="12.720010000000002"></rect>
|
|||
|
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="386.88" y="12.960010000000011"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="386.88" y="12.720000000000027"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="423.0" y="12.960010000000011"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="386.64" y="30.720000000000027"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="386.64" y="12.720010000000002"></rect>
|
|||
|
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.42" x="423.24" y="12.960010000000011"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.660000000000004" x="423.24" y="12.720000000000027"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="459.42" y="12.960010000000011"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.660000000000004" x="423.0" y="30.720000000000027"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="423.0" y="12.720010000000002"></rect>
|
|||
|
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="167.88" y="42.96001000000001"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="167.88" y="42.72000000000003"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="204.0" y="42.96001000000001"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="167.64000000000001" y="60.72"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="167.64000000000001" y="42.72001"></rect>
|
|||
|
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.42" x="204.24" y="42.96001000000001"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.660000000000004" x="204.24" y="42.72000000000003"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="240.42000000000002" y="42.96001000000001"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.660000000000004" x="204.0" y="60.72"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="204.0" y="42.72001"></rect>
|
|||
|
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="240.66" y="42.96001000000001"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="240.66" y="42.72000000000003"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="276.78000000000003" y="42.96001000000001"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="240.42000000000002" y="60.72"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="240.42000000000002" y="42.72001"></rect>
|
|||
|
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="277.02" y="42.96001000000001"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="277.02" y="42.72000000000003"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="313.14" y="42.96001000000001"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="276.78000000000003" y="60.72"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="276.78000000000003" y="42.72001"></rect>
|
|||
|
<rect height="17.400000000000002" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="313.56" y="43.50001000000003"></rect>
|
|||
|
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="313.56" y="43.26002999999997"></rect>
|
|||
|
<rect height="17.64" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="349.68" y="43.500009999999975"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="313.32" y="60.66"></rect>
|
|||
|
<rect height="17.64" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="313.32" y="43.26001000000002"></rect>
|
|||
|
<rect height="17.400000000000002" style="fill: rgb(100%, 100%, 100%)" width="36.42" x="349.92" y="43.50001000000003"></rect>
|
|||
|
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="36.660000000000004" x="349.92" y="43.26002999999997"></rect>
|
|||
|
<rect height="17.64" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="386.1" y="43.500009999999975"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.660000000000004" x="349.68" y="60.66"></rect>
|
|||
|
<rect height="17.64" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="349.68" y="43.26001000000002"></rect>
|
|||
|
<rect height="17.400000000000002" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="386.34000000000003" y="43.50001000000003"></rect>
|
|||
|
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="386.34000000000003" y="43.26002999999997"></rect>
|
|||
|
<rect height="17.64" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="422.46000000000004" y="43.500009999999975"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="386.1" y="60.66"></rect>
|
|||
|
<rect height="17.64" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="386.1" y="43.26001000000002"></rect>
|
|||
|
<rect height="17.400000000000002" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="422.70000000000005" y="43.50001000000003"></rect>
|
|||
|
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="422.70000000000005" y="43.26002999999997"></rect>
|
|||
|
<rect height="17.64" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="458.82" y="43.500009999999975"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="422.46000000000004" y="60.66"></rect>
|
|||
|
<rect height="17.64" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="422.46000000000004" y="43.26001000000002"></rect>
|
|||
|
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="168.42000000000002" y="71.88001"></rect>
|
|||
|
<rect height="0.48" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="168.42000000000002" y="71.64000999999999"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="204.54" y="71.88000999999997"></rect>
|
|||
|
<rect height="0.48" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="168.18" y="89.64000999999999"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="168.18" y="71.64000999999999"></rect>
|
|||
|
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="204.78" y="71.88001"></rect>
|
|||
|
<rect height="0.48" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="204.78" y="71.64000999999999"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="240.9" y="71.88000999999997"></rect>
|
|||
|
<rect height="0.48" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="204.54" y="89.64000999999999"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="204.54" y="71.64000999999999"></rect>
|
|||
|
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="241.20000000000002" y="71.88001"></rect>
|
|||
|
<rect height="0.48" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="241.20000000000002" y="71.64000999999999"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="277.32" y="71.88000999999997"></rect>
|
|||
|
<rect height="0.48" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="240.96" y="89.64000999999999"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="240.96" y="71.64000999999999"></rect>
|
|||
|
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="277.56" y="71.88001"></rect>
|
|||
|
<rect height="0.48" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="277.56" y="71.64000999999999"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="313.68" y="71.88000999999997"></rect>
|
|||
|
<rect height="0.48" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="277.32" y="89.64000999999999"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="277.32" y="71.64000999999999"></rect>
|
|||
|
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="314.1" y="71.82001"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="314.1" y="71.58000000000001"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="350.22" y="71.82001"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="313.86" y="89.58000000000001"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="313.86" y="71.58000999999999"></rect>
|
|||
|
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.42" x="350.46" y="71.82001"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.660000000000004" x="350.46" y="71.58000000000001"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="386.64" y="71.82001"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.660000000000004" x="350.22" y="89.58000000000001"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="350.22" y="71.58000999999999"></rect>
|
|||
|
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="386.88" y="71.82001"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="386.88" y="71.58000000000001"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="423.0" y="71.82001"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="386.64" y="89.58000000000001"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="386.64" y="71.58000999999999"></rect>
|
|||
|
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="423.24" y="71.82001"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="423.24" y="71.58000000000001"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="459.36" y="71.82001"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="423.0" y="89.58000000000001"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="423.0" y="71.58000999999999"></rect>
|
|||
|
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.42" x="168.72" y="105.60001"></rect>
|
|||
|
<rect height="0.48" style="fill: rgb(0%, 0%, 0%)" width="36.660000000000004" x="168.72" y="105.36001000000002"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="204.9" y="105.60001"></rect>
|
|||
|
<rect height="0.48" style="fill: rgb(0%, 0%, 0%)" width="36.660000000000004" x="168.48" y="123.36001000000002"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="168.48" y="105.36000999999999"></rect>
|
|||
|
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="205.14000000000001" y="105.60001"></rect>
|
|||
|
<rect height="0.48" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="205.14000000000001" y="105.36001000000002"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="241.26" y="105.60001"></rect>
|
|||
|
<rect height="0.48" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="204.9" y="123.36001000000002"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="204.9" y="105.36000999999999"></rect>
|
|||
|
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="241.5" y="105.60001"></rect>
|
|||
|
<rect height="0.48" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="241.5" y="105.36001000000002"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="277.62" y="105.60001"></rect>
|
|||
|
<rect height="0.48" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="241.26" y="123.36001000000002"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="241.26" y="105.36000999999999"></rect>
|
|||
|
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="277.86" y="105.60001"></rect>
|
|||
|
<rect height="0.48" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="277.86" y="105.36001000000002"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="313.98" y="105.60001"></rect>
|
|||
|
<rect height="0.48" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="277.62" y="123.36001000000002"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="277.62" y="105.36000999999999"></rect>
|
|||
|
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.42" x="314.40000000000003" y="105.54001"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.660000000000004" x="314.40000000000003" y="105.30000000000001"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="350.58" y="105.54001"></rect>
|
|||
|
<rect height="0.48" style="fill: rgb(0%, 0%, 0%)" width="36.660000000000004" x="314.16" y="123.30001000000001"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="314.16" y="105.30000999999999"></rect>
|
|||
|
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="36.36" x="350.82" y="105.54001"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="350.82" y="105.30000000000001"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="386.94" y="105.54001"></rect>
|
|||
|
<rect height="0.48" style="fill: rgb(0%, 0%, 0%)" width="36.6" x="350.58" y="123.30001000000001"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="350.58" y="105.30000999999999"></rect>
|
|||
|
<rect height="18.0" style="fill: rgb(100%, 100%, 100%)" width="71.22" x="387.18" y="105.54001"></rect>
|
|||
|
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="71.46000000000001" x="387.18" y="105.30000000000001"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="458.16" y="105.54001"></rect>
|
|||
|
<rect height="0.48" style="fill: rgb(0%, 0%, 0%)" width="71.46000000000001" x="386.94" y="123.30001000000001"></rect>
|
|||
|
<rect height="18.240000000000002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="386.94" y="105.30000999999999"></rect>
|
|||
|
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.917956000000004pt; fill: #000" textLength="16.86740279999998" x="148.4410266" y="25.925919399999998">SRC</text>
|
|||
|
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.917956000000004pt; fill: #000" textLength="119.70278269999997" x="182.52104340000002" y="26.466111000000012">X7 X6 X5 X4</text>
|
|||
|
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.917956000000004pt; fill: #000" textLength="120.42117359999986" x="326.22" y="26.466111000000012">X3 X2 X1 X0</text>
|
|||
|
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.917956000000004pt; fill: #000" textLength="21.32638079999998" x="144.001227" y="54.84614750000003">DEST</text>
|
|||
|
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.917956000000004pt; fill: #000" textLength="119.70278269999997" x="181.9211109" y="56.40611100000001">Y7 Y6 Y5 Y4</text>
|
|||
|
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.917956000000004pt; fill: #000" textLength="120.36044199999986" x="325.68" y="56.40611100000001">Y3 Y2 Y1 Y0</text>
|
|||
|
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.917956000000004pt; fill: #000" textLength="22.194203399999992" x="143.1014404" y="85.80567880000004">TEMP</text>
|
|||
|
<text lengthAdjust="spacingAndGlyphs" style="font-size: 6.695999999999998pt; fill: #000" textLength="69.08040000000011" x="170.76000000000002" y="84.42600999999999">ABS(X7:Y7) ABS(X6:Y6)</text>
|
|||
|
<text lengthAdjust="spacingAndGlyphs" style="font-size: 6.695999999999998pt; fill: #000" textLength="69.08579999999989" x="243.36240000000015" y="84.42600999999999">ABS(X5:Y5) ABS(X4:Y4)</text>
|
|||
|
<text lengthAdjust="spacingAndGlyphs" style="font-size: 6.695999999999998pt; fill: #000" textLength="140.70840000000015" x="316.51020000000005" y="84.42600999999999">ABS(X3:Y3) ABS(X2:Y2) ABS(X1:Y1) ABS(X0:Y0)</text>
|
|||
|
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.917956000000004pt; fill: #000" textLength="21.32638079999998" x="144.0" y="117.486111">DEST</text>
|
|||
|
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.917956000000004pt; fill: #000" textLength="124.56690440000003" x="181.38104339999998" y="119.046111">00H 00H 00H 00H</text>
|
|||
|
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.917956000000004pt; fill: #000" textLength="52.38420140000011" x="325.08" y="119.046111">00H 00H</text>
|
|||
|
<text lengthAdjust="spacingAndGlyphs" style="font-size: 6.695999999999998pt; fill: #000" textLength="62.27580000000006" x="391.32" y="117.00600999999997">SUM(TEMP7...TEMP0)</text></g></svg>
|
|||
|
<figcaption><a href='psadbw.html#fig-4-14'>Figure 4-14</a>. PSADBW Instruction Operation Using 64-bit Operands</figcaption></figure>
|
|||
|
<h2 id="operation">Operation<a class="anchor" href="#operation">
|
|||
|
¶
|
|||
|
</a></h2>
|
|||
|
<h3 id="vpsadbw--evex-encoded-versions-">VPSADBW (EVEX Encoded Versions)<a class="anchor" href="#vpsadbw--evex-encoded-versions-">
|
|||
|
¶
|
|||
|
</a></h3>
|
|||
|
<pre>VL = 128, 256, 512
|
|||
|
TEMP0 := ABS(SRC1[7:0] - SRC2[7:0])
|
|||
|
(* Repeat operation for bytes 1 through 15 *)
|
|||
|
TEMP15 := ABS(SRC1[127:120] - SRC2[127:120])
|
|||
|
DEST[15:0] := SUM(TEMP0:TEMP7)
|
|||
|
DEST[63:16] := 000000000000H
|
|||
|
DEST[79:64] := SUM(TEMP8:TEMP15)
|
|||
|
DEST[127:80] := 00000000000H
|
|||
|
IF VL >= 256
|
|||
|
(* Repeat operation for bytes 16 through 31*)
|
|||
|
TEMP31 := ABS(SRC1[255:248] - SRC2[255:248])
|
|||
|
DEST[143:128] := SUM(TEMP16:TEMP23)
|
|||
|
DEST[191:144] := 000000000000H
|
|||
|
DEST[207:192] := SUM(TEMP24:TEMP31)
|
|||
|
DEST[223:208] := 00000000000H
|
|||
|
FI;
|
|||
|
IF VL >= 512
|
|||
|
(* Repeat operation for bytes 32 through 63*)
|
|||
|
TEMP63 := ABS(SRC1[511:504] - SRC2[511:504])
|
|||
|
DEST[271:256] := SUM(TEMP0:TEMP7)
|
|||
|
DEST[319:272] := 000000000000H
|
|||
|
DEST[335:320] := SUM(TEMP8:TEMP15)
|
|||
|
DEST[383:336] := 00000000000H
|
|||
|
DEST[399:384] := SUM(TEMP16:TEMP23)
|
|||
|
DEST[447:400] := 000000000000H
|
|||
|
DEST[463:448] := SUM(TEMP24:TEMP31)
|
|||
|
DEST[511:464] := 00000000000H
|
|||
|
FI;
|
|||
|
DEST[MAXVL-1:VL] := 0
|
|||
|
</pre>
|
|||
|
<h3 id="vpsadbw--vex-256-encoded-version-">VPSADBW (VEX.256 Encoded Version)<a class="anchor" href="#vpsadbw--vex-256-encoded-version-">
|
|||
|
¶
|
|||
|
</a></h3>
|
|||
|
<pre>TEMP0 := ABS(SRC1[7:0] - SRC2[7:0])
|
|||
|
(* Repeat operation for bytes 2 through 30*)
|
|||
|
TEMP31 := ABS(SRC1[255:248] - SRC2[255:248])
|
|||
|
DEST[15:0] := SUM(TEMP0:TEMP7)
|
|||
|
DEST[63:16] := 000000000000H
|
|||
|
DEST[79:64] := SUM(TEMP8:TEMP15)
|
|||
|
DEST[127:80] := 00000000000H
|
|||
|
DEST[143:128] := SUM(TEMP16:TEMP23)
|
|||
|
DEST[191:144] := 000000000000H
|
|||
|
DEST[207:192] := SUM(TEMP24:TEMP31)
|
|||
|
DEST[223:208] := 00000000000H
|
|||
|
DEST[MAXVL-1:256] := 0
|
|||
|
</pre>
|
|||
|
<h3 id="vpsadbw--vex-128-encoded-version-">VPSADBW (VEX.128 Encoded Version)<a class="anchor" href="#vpsadbw--vex-128-encoded-version-">
|
|||
|
¶
|
|||
|
</a></h3>
|
|||
|
<pre>TEMP0 := ABS(SRC1[7:0] - SRC2[7:0])
|
|||
|
(* Repeat operation for bytes 2 through 14 *)
|
|||
|
TEMP15 := ABS(SRC1[127:120] - SRC2[127:120])
|
|||
|
DEST[15:0] := SUM(TEMP0:TEMP7)
|
|||
|
DEST[63:16] := 000000000000H
|
|||
|
DEST[79:64] := SUM(TEMP8:TEMP15)
|
|||
|
DEST[127:80] := 00000000000H
|
|||
|
DEST[MAXVL-1:128] := 0
|
|||
|
</pre>
|
|||
|
<h3 id="psadbw--128-bit-legacy-sse-version-">PSADBW (128-bit Legacy SSE Version)<a class="anchor" href="#psadbw--128-bit-legacy-sse-version-">
|
|||
|
¶
|
|||
|
</a></h3>
|
|||
|
<pre>TEMP0 := ABS(DEST[7:0] - SRC[7:0])
|
|||
|
(* Repeat operation for bytes 2 through 14 *)
|
|||
|
TEMP15 := ABS(DEST[127:120] - SRC[127:120])
|
|||
|
DEST[15:0] := SUM(TEMP0:TEMP7)
|
|||
|
DEST[63:16] := 000000000000H
|
|||
|
DEST[79:64] := SUM(TEMP8:TEMP15)
|
|||
|
DEST[127:80] := 00000000000
|
|||
|
DEST[MAXVL-1:128] (Unmodified)
|
|||
|
</pre>
|
|||
|
<h3 id="psadbw--64-bit-operand-">PSADBW (64-bit Operand)<a class="anchor" href="#psadbw--64-bit-operand-">
|
|||
|
¶
|
|||
|
</a></h3>
|
|||
|
<pre>TEMP0 := ABS(DEST[7:0] - SRC[7:0])
|
|||
|
(* Repeat operation for bytes 2 through 6 *)
|
|||
|
TEMP7 := ABS(DEST[63:56] - SRC[63:56])
|
|||
|
DEST[15:0] := SUM(TEMP0:TEMP7)
|
|||
|
DEST[63:16] := 000000000000H
|
|||
|
</pre>
|
|||
|
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
|
|||
|
¶
|
|||
|
</a></h2>
|
|||
|
<pre>VPSADBW __m512i _mm512_sad_epu8( __m512i a, __m512i b)
|
|||
|
</pre>
|
|||
|
<pre>PSADBW __m64 _mm_sad_pu8(__m64 a,__m64 b)
|
|||
|
</pre>
|
|||
|
<pre>(V)PSADBW __m128i _mm_sad_epu8(__m128i a, __m128i b)
|
|||
|
</pre>
|
|||
|
<pre>VPSADBW __m256i _mm256_sad_epu8( __m256i a, __m256i b)
|
|||
|
</pre>
|
|||
|
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
|
|||
|
¶
|
|||
|
</a></h2>
|
|||
|
<p>None.</p>
|
|||
|
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
|
|||
|
¶
|
|||
|
</a></h2>
|
|||
|
<p>None.</p>
|
|||
|
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
|
|||
|
¶
|
|||
|
</a></h2>
|
|||
|
<p>Non-EVEX-encoded instruction, see <span class="not-imported">Table 2-21</span>, “Type 4 Class Exception Conditions.”</p>
|
|||
|
<p>EVEX-encoded instruction, see Exceptions Type E4NF.nb in <span class="not-imported">Table 2-50</span>, “Type E4NF Class Exception Conditions.”</p><footer><p>
|
|||
|
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
|
|||
|
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
|
|||
|
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developer’s Manual</a> for anything serious.
|
|||
|
</p></footer></body></html>
|