<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>ADDSUBPS
— Packed Single Precision Floating-Point Add/Subtract</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>ADDSUBPS
— Packed Single Precision Floating-Point Add/Subtract</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>F2 0F D0 /r ADDSUBPS xmm1, xmm2/m128</td>
<td>RM</td>
<td>V/V</td>
<td>SSE3</td>
<td>Add/subtract single precision floating-point values from xmm2/m128 to xmm1.</td></tr>
<tr>
<td>VEX.128.F2.0F.WIG D0 /r VADDSUBPS xmm1, xmm2, xmm3/m128</td>
<td>RVM</td>
<td>V/V</td>
<td>AVX</td>
<td>Add/subtract single precision floating-point values from xmm3/mem to xmm2 and store the result in xmm1.</td></tr>
<tr>
<td>VEX.256.F2.0F.WIG D0 /r VADDSUBPS ymm1, ymm2, ymm3/m256</td>
<td>RVM</td>
<td>V/V</td>
<td>AVX</td>
<td>Add/subtract single precision floating-point values from ymm3/mem to ymm2 and store the result in ymm1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RM</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>RVM</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Adds odd-numbered single precision floating-point values of the first source operand (second operand) with the corresponding single precision floating-point values from the second source operand (third operand); stores the result in the odd-numbered values of the destination operand (first operand). Subtracts the even-numbered single precision floating-point values from the second source operand from the corresponding single precision floating-point values in the first source operand; stores the result into the even-numbered values of the destination operand. Values are numbered from 0, so value 0 occupies bits 31:0 and is a subtraction.</p>
<p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p>
<p>128-bit Legacy SSE version: The second source can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding YMM register destination are unmodified. See <a href='addsubps.html#fig-3-4'>Figure 3-4</a>.</p>
<p>VEX.128 encoded version: The first source operand is an XMM register. The second source operand can be an XMM register or a 128-bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding YMM register destination are zeroed.</p>
<p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>
<figure id="fig-3-4">
<svg style="width: 432.018pt; height: 167.40719999999996pt" viewBox="115.111 0.0 365.015 144.50599999999997">
<g xmlns="http://www.w3.org/2000/svg" style="fill: none; stroke: none">
<rect height="139.506" style="stroke: rgb(0%, 0%, 0%)" width="360.015" x="117.611" y="0.0"></rect>
<path d="M 394.372 69.09300000000007 L 394.372 52.88200000000006" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 397.132 68.73300000000006 L 394.372 74.25300000000004 L 391.612 68.73300000000006 L 397.132 68.73300000000006" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<rect height="38.252" style="fill: rgb(0%, 0%, 0%)" width="76.503" x="356.12" y="74.25300000000004"></rect>
<rect height="38.252" style="stroke: rgb(0%, 0%, 0%)" width="76.503" x="356.12" y="74.25300000000004"></rect>
<rect height="27.001" style="fill: rgb(0%, 0%, 0%)" width="76.503" x="356.12" y="25.875999999999976"></rect>
<rect height="27.001" style="stroke: rgb(0%, 0%, 0%)" width="76.503" x="356.12" y="25.875999999999976"></rect>
<path d="M 317.869 69.09300000000007 L 317.869 52.88200000000006" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 320.629 68.73300000000006 L 317.869 74.25300000000004 L 315.10900000000004 68.73300000000006 L 320.629 68.73300000000006" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<rect height="38.252" style="fill: rgb(0%, 0%, 0%)" width="76.503" x="279.617" y="74.25300000000004"></rect>
<rect height="38.252" style="stroke: rgb(0%, 0%, 0%)" width="76.503" x="279.617" y="74.25300000000004"></rect>
<rect height="27.001" style="fill: rgb(0%, 0%, 0%)" width="76.503" x="279.617" y="25.875999999999976"></rect>
<rect height="27.001" style="stroke: rgb(0%, 0%, 0%)" width="76.503" x="279.617" y="25.875999999999976"></rect>
<path d="M 241.366 69.09300000000007 L 241.366 52.88200000000006" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 244.126 68.73300000000006 L 241.366 74.25300000000004 L 238.606 68.73300000000006 L 244.126 68.73300000000006" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<rect height="38.252" style="fill: rgb(0%, 0%, 0%)" width="76.503" x="203.114" y="74.25300000000004"></rect>
<rect height="38.252" style="stroke: rgb(0%, 0%, 0%)" width="76.503" x="203.114" y="74.25300000000004"></rect>
<rect height="27.001" style="fill: rgb(0%, 0%, 0%)" width="76.503" x="203.114" y="25.875999999999976"></rect>
<rect height="27.001" style="stroke: rgb(0%, 0%, 0%)" width="76.503" x="203.114" y="25.875999999999976"></rect>
<path d="M 164.862 69.09300000000007 L 164.862 52.88200000000006" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 167.623 68.73300000000006 L 164.863 74.25300000000004 L 162.10299999999998 68.73300000000006 L 167.623 68.73300000000006" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<rect height="38.252" style="fill: rgb(0%, 0%, 0%)" width="76.503" x="126.611" y="74.25300000000004"></rect>
<rect height="38.252" style="stroke: rgb(0%, 0%, 0%)" width="76.503" x="126.611" y="74.25300000000004"></rect>
<rect height="27.001" style="fill: rgb(0%, 0%, 0%)" width="76.503" x="126.611" y="25.875999999999976"></rect>
<rect height="27.001" style="stroke: rgb(0%, 0%, 0%)" width="76.503" x="126.611" y="25.875999999999976"></rect>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="116.4683674000002" x="221.3897" y="18.14386300000001">ADDSUBPS xmm1, xmm2/m128</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="24.00090000000006" x="437.07378785000003" y="38.65743223000004">xmm2/</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="28.913084200000014" x="150.4017" y="43.456863">[127:96]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="24.46491740000002" x="229.135" y="43.456863">[95:64]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="24.464917400000047" x="305.6381" y="43.456863">[63:32]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="20.016750600000023" x="384.3613" y="43.456863">[31:0]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="20.008750300000088" x="437.07378785000003" y="48.257792230000064">m128</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="32.30521139999996" x="437.07378785000003" y="92.65945723000004">RESULT:</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.063515733038685pt; fill: #000" textLength="52.55836626360113" x="139.5214" y="92.58567721807572">xmm1[127:96] +</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.063515733038685pt; fill: #000" textLength="70.42132331726216" x="207.3341" y="92.58567721807572">xmm1[95:64] - xmm2/</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.063515733038685pt; fill: #000" textLength="48.51457826360098" x="294.4876" y="92.58567721807572">xmm1[63:32] +</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.063515733038685pt; fill: #000" textLength="42.55616558866285" x="373.8308" y="92.58567721807572">xmm1[31:0] -</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="21.776816600000075" x="437.07378785000003" y="102.25981723000007">xmm1</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.063515733038685pt; fill: #000" textLength="66.39208131726227" x="132.7713287" y="102.18603721807574">xmm2/m128[127:96]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.063515733038685pt; fill: #000" textLength="40.52929331726219" x="221.7950059" y="102.18603721807574">m128[95:64]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.063515733038685pt; fill: #000" textLength="62.34829331726206" x="287.7375287" y="102.18603721807574">xmm2/m128[63:32]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.063515733038685pt; fill: #000" textLength="58.304505317262056" x="366.2006957" y="102.18603721807574">xmm2/m128[31:0]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="28.913084200000014" x="150.41210039" y="125.53514082000004">[127:96]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="24.46491740000002" x="229.14545278000003" y="125.53514082000004">[95:64]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="24.464917400000047" x="305.64832153000003" y="125.53514082000004">[63:32]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="20.016750600000023" x="384.3712735300001" y="125.53514082000004">[31:0]</text></g></svg>
<figcaption><a href='addsubps.html#fig-3-4'>Figure 3-4</a>. ADDSUBPS—Packed Single Precision Floating-Point Add/Subtract</figcaption></figure>
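<p>The difference in upper-bit handling between the legacy SSE and VEX.128 forms can be sketched in portable C. This is only an illustrative model, not an Intel API: the type vreg_model and the functions addsub_low4, legacy_addsubps, and vex128_vaddsubps are hypothetical names, and the model assumes MAXVL = 256 (eight 32-bit lanes, lane 0 being bits 31:0).</p>

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of one vector register, assuming MAXVL = 256:
 * eight 32-bit lanes, lane 0 is bits [31:0]. */
typedef struct { float lane[8]; } vreg_model;

/* Low 128 bits: subtract in even lanes, add in odd lanes. */
static void addsub_low4(float out[4], const float a[4], const float b[4])
{
    for (int i = 0; i < 4; ++i)
        out[i] = (i % 2 == 0) ? a[i] - b[i] : a[i] + b[i];
}

/* Legacy SSE form: the destination is also the first source,
 * and lanes 4..7 (bits MAXVL-1:128) are left unmodified. */
static void legacy_addsubps(vreg_model *dest, const float src[4])
{
    assert(dest && src);
    addsub_low4(dest->lane, dest->lane, src);
}

/* VEX.128 form: separate destination, and lanes 4..7
 * (bits MAXVL-1:128) are zeroed. */
static void vex128_vaddsubps(vreg_model *dest, const vreg_model *src1,
                             const float src2[4])
{
    assert(dest && src1 && src2);
    addsub_low4(dest->lane, src1->lane, src2);
    memset(&dest->lane[4], 0, 4 * sizeof(float));
}
```

<p>Calling legacy_addsubps on a register whose upper lanes hold data leaves that data in place, while vex128_vaddsubps clears it. The VEX.256 form writes all eight lanes, so the question does not arise under this model's MAXVL assumption.</p>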
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="addsubps--128-bit-legacy-sse-version-">ADDSUBPS (128-bit Legacy SSE Version)<a class="anchor" href="#addsubps--128-bit-legacy-sse-version-">
</a></h3>
<pre>DEST[31:0] := DEST[31:0] - SRC[31:0]
DEST[63:32] := DEST[63:32] + SRC[63:32]
DEST[95:64] := DEST[95:64] - SRC[95:64]
DEST[127:96] := DEST[127:96] + SRC[127:96]
DEST[MAXVL-1:128] (Unmodified)
</pre>
<h3 id="vaddsubps--vex-128-encoded-version-">VADDSUBPS (VEX.128 Encoded Version)<a class="anchor" href="#vaddsubps--vex-128-encoded-version-">
</a></h3>
<pre>DEST[31:0] := SRC1[31:0] - SRC2[31:0]
DEST[63:32] := SRC1[63:32] + SRC2[63:32]
DEST[95:64] := SRC1[95:64] - SRC2[95:64]
DEST[127:96] := SRC1[127:96] + SRC2[127:96]
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="vaddsubps--vex-256-encoded-version-">VADDSUBPS (VEX.256 Encoded Version)<a class="anchor" href="#vaddsubps--vex-256-encoded-version-">
</a></h3>
<pre>DEST[31:0] := SRC1[31:0] - SRC2[31:0]
DEST[63:32] := SRC1[63:32] + SRC2[63:32]
DEST[95:64] := SRC1[95:64] - SRC2[95:64]
DEST[127:96] := SRC1[127:96] + SRC2[127:96]
DEST[159:128] := SRC1[159:128] - SRC2[159:128]
DEST[191:160] := SRC1[191:160] + SRC2[191:160]
DEST[223:192] := SRC1[223:192] - SRC2[223:192]
DEST[255:224] := SRC1[255:224] + SRC2[255:224]
</pre>
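<p>The pseudocode above follows one pattern: even lanes subtract, odd lanes add. A minimal scalar sketch in C is shown here as a reference model. The function name addsubps_model is hypothetical, lane 0 is taken to be bits 31:0, and plain C float arithmetic is assumed to be close enough to the hardware behavior for illustration (rounding control and exception flags are not modeled).</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of any Intel API): scalar model of the
 * Operation pseudocode. Use lanes = 4 for the 128-bit forms and
 * lanes = 8 for the VEX.256 form. dest may alias src1, which matches
 * the legacy SSE form where DEST is also the first source. */
static void addsubps_model(float *dest, const float *src1,
                           const float *src2, size_t lanes)
{
    assert(lanes == 4 || lanes == 8);
    for (size_t i = 0; i < lanes; ++i) {
        if (i % 2 == 0)
            dest[i] = src1[i] - src2[i];   /* even lane: subtract */
        else
            dest[i] = src1[i] + src2[i];   /* odd lane: add */
    }
}
```

<p>With src1 = {1, 2, 3, 4} and src2 = {10, 20, 30, 40}, the four-lane model yields {-9, 22, -27, 44}.</p>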
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>ADDSUBPS __m128 _mm_addsub_ps(__m128 a, __m128 b)
</pre>
<pre>VADDSUBPS __m256 _mm256_addsub_ps (__m256 a, __m256 b)
</pre>
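<p>A short usage sketch for the 128-bit intrinsic follows. The wrapper name addsub4 is hypothetical. The sketch assumes a GCC- or Clang-style compiler that predefines __SSE3__ when SSE3 code generation is enabled (for example with -msse3); where the macro is not defined, a scalar fallback with the same lane pattern is compiled instead.</p>

```c
#include <assert.h>

#if defined(__SSE3__)
#include <pmmintrin.h>   /* SSE3 intrinsics, including _mm_addsub_ps */
#endif

/* Hypothetical wrapper: out = {a0 - b0, a1 + b1, a2 - b2, a3 + b3}.
 * Unaligned loads and stores are used so the arrays need no
 * particular alignment. */
static void addsub4(float out[4], const float a[4], const float b[4])
{
    assert(out && a && b);
#if defined(__SSE3__)
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_addsub_ps(va, vb));
#else
    /* Scalar fallback when SSE3 is not enabled at compile time. */
    for (int i = 0; i < 4; ++i)
        out[i] = (i % 2 == 0) ? a[i] - b[i] : a[i] + b[i];
#endif
}
```

<p>The first argument of _mm_addsub_ps plays the role of the first source operand, and the second argument is the operand that is subtracted or added.</p>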
<h2 class="exceptions" id="exceptions">Exceptions<a class="anchor" href="#exceptions">
</a></h2>
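<p>For the legacy SSE version, when the source operand is a memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated. The VEX-encoded versions do not impose this alignment requirement on their memory operand.</p>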
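<p>The 16-byte alignment requirement can be checked or arranged in C before a pointer is used as a memory operand. This is a sketch assuming a C11 compiler; the names is_aligned16, require_aligned16, and aligned_lanes are hypothetical.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if p lies on a 16-byte boundary. */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p % 16u) == 0;
}

/* Hypothetical debug-time guard: asserts alignment, then hands the
 * pointer back for use with an aligned load. */
static const float *require_aligned16(const float *p)
{
    assert(is_aligned16(p));
    return p;
}

/* C11 _Alignas requests 16-byte alignment for this array. */
static _Alignas(16) float aligned_lanes[4] = {1.0f, 2.0f, 3.0f, 4.0f};
```

<p>A pointer four bytes past aligned_lanes fails the check, which is the situation that would raise #GP if it were used as the m128 operand of the legacy form.</p>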
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>Overflow, Underflow, Invalid, Precision, Denormal.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>See <span class="not-imported">Table 2-19</span>, “Type 2 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developer's Manual</a> for anything serious.
</p></footer></body></html>