Initial commit

This commit is contained in:
NRZ Code 2025-07-08 02:23:29 -03:00
commit 92622c37d6
845 changed files with 169239 additions and 0 deletions

674
LICENSE Normal file
View file

@ -0,0 +1,674 @@
GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007
Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The GNU General Public License is a free, copyleft license for
software and other kinds of works.
The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
the GNU General Public License is intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users. We, the Free Software Foundation, use the
GNU General Public License for most of our software; it applies also to
any other work released this way by its authors. You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.
To protect your rights, we need to prevent others from denying you
these rights or asking you to surrender the rights. Therefore, you have
certain responsibilities if you distribute copies of the software, or if
you modify it: responsibilities to respect the freedom of others.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must pass on to the recipients the same
freedoms that you received. You must make sure that they, too, receive
or can get the source code. And you must show them these terms so they
know their rights.
Developers that use the GNU GPL protect your rights with two steps:
(1) assert copyright on the software, and (2) offer you this License
giving you legal permission to copy, distribute and/or modify it.
For the developers' and authors' protection, the GPL clearly explains
that there is no warranty for this free software. For both users' and
authors' sake, the GPL requires that modified versions be marked as
changed, so that their problems will not be attributed erroneously to
authors of previous versions.
Some devices are designed to deny users access to install or run
modified versions of the software inside them, although the manufacturer
can do so. This is fundamentally incompatible with the aim of
protecting users' freedom to change the software. The systematic
pattern of such abuse occurs in the area of products for individuals to
use, which is precisely where it is most unacceptable. Therefore, we
have designed this version of the GPL to prohibit the practice for those
products. If such problems arise substantially in other domains, we
stand ready to extend this provision to those domains in future versions
of the GPL, as needed to protect the freedom of users.
Finally, every program is threatened constantly by software patents.
States should not allow patents to restrict development and use of
software on general-purpose computers, but in those that do, we wish to
avoid the special danger that patents applied to a free program could
make it effectively proprietary. To prevent this, the GPL assures that
patents cannot be used to render the program non-free.
The precise terms and conditions for copying, distribution and
modification follow.
TERMS AND CONDITIONS
0. Definitions.
"This License" refers to version 3 of the GNU General Public License.
"Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.
"The Program" refers to any copyrightable work licensed under this
License. Each licensee is addressed as "you". "Licensees" and
"recipients" may be individuals or organizations.
To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy. The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.
A "covered work" means either the unmodified Program or a work based
on the Program.
To "propagate" a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy. Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.
To "convey" a work means any kind of propagation that enables other
parties to make or receive copies. Mere interaction with a user through
a computer network, with no transfer of a copy, is not conveying.
An interactive user interface displays "Appropriate Legal Notices"
to the extent that it includes a convenient and prominently visible
feature that (1) displays an appropriate copyright notice, and (2)
tells the user that there is no warranty for the work (except to the
extent that warranties are provided), that licensees may convey the
work under this License, and how to view a copy of this License. If
the interface presents a list of user commands or options, such as a
menu, a prominent item in the list meets this criterion.
1. Source Code.
The "source code" for a work means the preferred form of the work
for making modifications to it. "Object code" means any non-source
form of a work.
A "Standard Interface" means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that
is widely used among developers working in that language.
The "System Libraries" of an executable work include anything, other
than the work as a whole, that (a) is included in the normal form of
packaging a Major Component, but which is not part of that Major
Component, and (b) serves only to enable use of the work with that
Major Component, or to implement a Standard Interface for which an
implementation is available to the public in source code form. A
"Major Component", in this context, means a major essential component
(kernel, window system, and so on) of the specific operating system
(if any) on which the executable work runs, or a compiler used to
produce the work, or an object code interpreter used to run it.
The "Corresponding Source" for a work in object code form means all
the source code needed to generate, install, and (for an executable
work) run the object code and to modify the work, including scripts to
control those activities. However, it does not include the work's
System Libraries, or general-purpose tools or generally available free
programs which are used unmodified in performing those activities but
which are not part of the work. For example, Corresponding Source
includes interface definition files associated with source files for
the work, and the source code for shared libraries and dynamically
linked subprograms that the work is specifically designed to require,
such as by intimate data communication or control flow between those
subprograms and other parts of the work.
The Corresponding Source need not include anything that users
can regenerate automatically from other parts of the Corresponding
Source.
The Corresponding Source for a work in source code form is that
same work.
2. Basic Permissions.
All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met. This License explicitly affirms your unlimited
permission to run the unmodified Program. The output from running a
covered work is covered by this License only if the output, given its
content, constitutes a covered work. This License acknowledges your
rights of fair use or other equivalent, as provided by copyright law.
You may make, run and propagate covered works that you do not
convey, without conditions so long as your license otherwise remains
in force. You may convey covered works to others for the sole purpose
of having them make modifications exclusively for you, or provide you
with facilities for running those works, provided that you comply with
the terms of this License in conveying all material for which you do
not control copyright. Those thus making or running the covered works
for you must do so exclusively on your behalf, under your direction
and control, on terms that prohibit them from making any copies of
your copyrighted material outside their relationship with you.
Conveying under any other circumstances is permitted solely under
the conditions stated below. Sublicensing is not allowed; section 10
makes it unnecessary.
3. Protecting Users' Legal Rights From Anti-Circumvention Law.
No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article
11 of the WIPO copyright treaty adopted on 20 December 1996, or
similar laws prohibiting or restricting circumvention of such
measures.
When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention
is effected by exercising rights under this License with respect to
the covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's
users, your or third parties' legal rights to forbid circumvention of
technological measures.
4. Conveying Verbatim Copies.
You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.
You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.
5. Conveying Modified Source Versions.
You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:
a) The work must carry prominent notices stating that you modified
it, and giving a relevant date.
b) The work must carry prominent notices stating that it is
released under this License and any conditions added under section
7. This requirement modifies the requirement in section 4 to
"keep intact all notices".
c) You must license the entire work, as a whole, under this
License to anyone who comes into possession of a copy. This
License will therefore apply, along with any applicable section 7
additional terms, to the whole of the work, and all its parts,
regardless of how they are packaged. This License gives no
permission to license the work in any other way, but it does not
invalidate such permission if you have separately received it.
d) If the work has interactive user interfaces, each must display
Appropriate Legal Notices; however, if the Program has interactive
interfaces that do not display Appropriate Legal Notices, your
work need not make them do so.
A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit. Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.
6. Conveying Non-Source Forms.
You may convey a covered work in object code form under the terms
of sections 4 and 5, provided that you also convey the
machine-readable Corresponding Source under the terms of this License,
in one of these ways:
a) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by the
Corresponding Source fixed on a durable physical medium
customarily used for software interchange.
b) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by a
written offer, valid for at least three years and valid for as
long as you offer spare parts or customer support for that product
model, to give anyone who possesses the object code either (1) a
copy of the Corresponding Source for all the software in the
product that is covered by this License, on a durable physical
medium customarily used for software interchange, for a price no
more than your reasonable cost of physically performing this
conveying of source, or (2) access to copy the
Corresponding Source from a network server at no charge.
c) Convey individual copies of the object code with a copy of the
written offer to provide the Corresponding Source. This
alternative is allowed only occasionally and noncommercially, and
only if you received the object code with such an offer, in accord
with subsection 6b.
d) Convey the object code by offering access from a designated
place (gratis or for a charge), and offer equivalent access to the
Corresponding Source in the same way through the same place at no
further charge. You need not require recipients to copy the
Corresponding Source along with the object code. If the place to
copy the object code is a network server, the Corresponding Source
may be on a different server (operated by you or a third party)
that supports equivalent copying facilities, provided you maintain
clear directions next to the object code saying where to find the
Corresponding Source. Regardless of what server hosts the
Corresponding Source, you remain obligated to ensure that it is
available for as long as needed to satisfy these requirements.
e) Convey the object code using peer-to-peer transmission, provided
you inform other peers where the object code and Corresponding
Source of the work are being offered to the general public at no
charge under subsection 6d.
A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be
included in conveying the object code work.
A "User Product" is either (1) a "consumer product", which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling. In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage. For a particular
product received by a particular user, "normally used" refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product. A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.
"Installation Information" for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source. The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.
If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as
part of a transaction in which the right of possession and use of the
User Product is transferred to the recipient in perpetuity or for a
fixed term (regardless of how the transaction is characterized), the
Corresponding Source conveyed under this section must be accompanied
by the Installation Information. But this requirement does not apply
if neither you nor any third party retains the ability to install
modified object code on the User Product (for example, the work has
been installed in ROM).
The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed. Access to a
network may be denied when the modification itself materially and
adversely affects the operation of the network or violates the rules and
protocols for communication across the network.
Corresponding Source conveyed, and Installation Information provided,
in accord with this section must be in a format that is publicly
documented (and with an implementation available to the public in
source code form), and must require no special password or key for
unpacking, reading or copying.
7. Additional Terms.
"Additional permissions" are terms that supplement the terms of this
License by making exceptions from one or more of its conditions.
Additional permissions that are applicable to the entire Program shall
be treated as though they were included in this License, to the extent
that they are valid under applicable law. If additional permissions
apply only to part of the Program, that part may be used separately
under those permissions, but the entire Program remains governed by
this License without regard to the additional permissions.
When you convey a copy of a covered work, you may at your option
remove any additional permissions from that copy, or from any part of
it. (Additional permissions may be written to require their own
removal in certain cases when you modify the work.) You may place
additional permissions on material, added by you to a covered work,
for which you have or can give appropriate copyright permission.
Notwithstanding any other provision of this License, for material you
add to a covered work, you may (if authorized by the copyright holders of
that material) supplement the terms of this License with terms:
a) Disclaiming warranty or limiting liability differently from the
terms of sections 15 and 16 of this License; or
b) Requiring preservation of specified reasonable legal notices or
author attributions in that material or in the Appropriate Legal
Notices displayed by works containing it; or
c) Prohibiting misrepresentation of the origin of that material, or
requiring that modified versions of such material be marked in
reasonable ways as different from the original version; or
d) Limiting the use for publicity purposes of names of licensors or
authors of the material; or
e) Declining to grant rights under trademark law for use of some
trade names, trademarks, or service marks; or
f) Requiring indemnification of licensors and authors of that
material by anyone who conveys the material (or modified versions of
it) with contractual assumptions of liability to the recipient, for
any liability that these contractual assumptions directly impose on
those licensors and authors.
All other non-permissive additional terms are considered "further
restrictions" within the meaning of section 10. If the Program as you
received it, or any part of it, contains a notice stating that it is
governed by this License along with a term that is a further
restriction, you may remove that term. If a license document contains
a further restriction but permits relicensing or conveying under this
License, you may add to a covered work material governed by the terms
of that license document, provided that the further restriction does
not survive such relicensing or conveying.
If you add terms to a covered work in accord with this section, you
must place, in the relevant source files, a statement of the
additional terms that apply to those files, or a notice indicating
where to find the applicable terms.
Additional terms, permissive or non-permissive, may be stated in the
form of a separately written license, or stated as exceptions;
the above requirements apply either way.
8. Termination.
You may not propagate or modify a covered work except as expressly
provided under this License. Any attempt otherwise to propagate or
modify it is void, and will automatically terminate your rights under
this License (including any patent licenses granted under the third
paragraph of section 11).
However, if you cease all violation of this License, then your
license from a particular copyright holder is reinstated (a)
provisionally, unless and until the copyright holder explicitly and
finally terminates your license, and (b) permanently, if the copyright
holder fails to notify you of the violation by some reasonable means
prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.
Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License. If your rights have been terminated and not permanently
reinstated, you do not qualify to receive new licenses for the same
material under section 10.
9. Acceptance Not Required for Having Copies.
You are not required to accept this License in order to receive or
run a copy of the Program. Ancillary propagation of a covered work
occurring solely as a consequence of using peer-to-peer transmission
to receive a copy likewise does not require acceptance. However,
nothing other than this License grants you permission to propagate or
modify any covered work. These actions infringe copyright if you do
not accept this License. Therefore, by modifying or propagating a
covered work, you indicate your acceptance of this License to do so.
10. Automatic Licensing of Downstream Recipients.
Each time you convey a covered work, the recipient automatically
receives a license from the original licensors, to run, modify and
propagate that work, subject to this License. You are not responsible
for enforcing compliance by third parties with this License.
An "entity transaction" is a transaction transferring control of an
organization, or substantially all assets of one, or subdividing an
organization, or merging organizations. If propagation of a covered
work results from an entity transaction, each party to that
transaction who receives a copy of the work also receives whatever
licenses to the work the party's predecessor in interest had or could
give under the previous paragraph, plus a right to possession of the
Corresponding Source of the work from the predecessor in interest, if
the predecessor has it or can get it with reasonable efforts.
You may not impose any further restrictions on the exercise of the
rights granted or affirmed under this License. For example, you may
not impose a license fee, royalty, or other charge for exercise of
rights granted under this License, and you may not initiate litigation
(including a cross-claim or counterclaim in a lawsuit) alleging that
any patent claim is infringed by making, using, selling, offering for
sale, or importing the Program or any portion of it.
11. Patents.
A "contributor" is a copyright holder who authorizes use under this
License of the Program or a work on which the Program is based. The
work thus licensed is called the contributor's "contributor version".
A contributor's "essential patent claims" are all patent claims
owned or controlled by the contributor, whether already acquired or
hereafter acquired, that would be infringed by some manner, permitted
by this License, of making, using, or selling its contributor version,
but do not include claims that would be infringed only as a
consequence of further modification of the contributor version. For
purposes of this definition, "control" includes the right to grant
patent sublicenses in a manner consistent with the requirements of
this License.
Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to
make, use, sell, offer for sale, import and otherwise run, modify and
propagate the contents of its contributor version.
In the following three paragraphs, a "patent license" is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement). To "grant" such a patent license to a
party means to make such an agreement or commitment not to enforce a
patent against the party.
If you convey a covered work, knowingly relying on a patent license,
and the Corresponding Source of the work is not available for anyone
to copy, free of charge and under the terms of this License, through a
publicly available network server or other readily accessible means,
then you must either (1) cause the Corresponding Source to be so
available, or (2) arrange to deprive yourself of the benefit of the
patent license for this particular work, or (3) arrange, in a manner
consistent with the requirements of this License, to extend the patent
license to downstream recipients. "Knowingly relying" means you have
actual knowledge that, but for the patent license, your conveying the
covered work in a country, or your recipient's use of the covered work
in a country, would infringe one or more identifiable patents in that
country that you have reason to believe are valid.
If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties
receiving the covered work authorizing them to use, propagate, modify
or convey a specific copy of the covered work, then the patent license
you grant is automatically extended to all recipients of the covered
work and works based on it.
A patent license is "discriminatory" if it does not include within
the scope of its coverage, prohibits the exercise of, or is
conditioned on the non-exercise of one or more of the rights that are
specifically granted under this License. You may not convey a covered
work if you are a party to an arrangement with a third party that is
in the business of distributing software, under which you make payment
to the third party based on the extent of your activity of conveying
the work, and under which the third party grants, to any of the
parties who would receive the covered work from you, a discriminatory
patent license (a) in connection with copies of the covered work
conveyed by you (or copies made from those copies), or (b) primarily
for and in connection with specific products or compilations that
contain the covered work, unless you entered into that arrangement,
or that patent license was granted, prior to 28 March 2007.
Nothing in this License shall be construed as excluding or limiting
any implied license or other defenses to infringement that may
otherwise be available to you under applicable patent law.
12. No Surrender of Others' Freedom.
If conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot convey a
covered work so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you may
not convey it at all. For example, if you agree to terms that obligate you
to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.
13. Use with the GNU Affero General Public License.
Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU Affero General Public License into a single
combined work, and to convey the resulting work. The terms of this
License will continue to apply to the part which is the covered work,
but the special requirements of the GNU Affero General Public License,
section 13, concerning interaction through a network will apply to the
combination as such.
14. Revised Versions of this License.
The Free Software Foundation may publish revised and/or new versions of
the GNU General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the
Program specifies that a certain numbered version of the GNU General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation. If the Program does not specify a version number of the
GNU General Public License, you may choose any version ever published
by the Free Software Foundation.
If the Program specifies that a proxy can decide which future
versions of the GNU General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.
Later license versions may give you additional or different
permissions. However, no additional obligations are imposed on any
author or copyright holder as a result of your choosing to follow a
later version.
15. Disclaimer of Warranty.
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. Limitation of Liability.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.
17. Interpretation of Sections 15 and 16.
If the disclaimer of warranty and limitation of liability provided
above cannot be given local legal effect according to their terms,
reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
Also add information on how to contact you by electronic and paper mail.
If the program does terminal interaction, make it output a short
notice like this when it starts in an interactive mode:
<program> Copyright (C) <year> <name of author>
This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, your program's commands
might be different; for a GUI interface, you would use an "about box".
You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU GPL, see
<https://www.gnu.org/licenses/>.
The GNU General Public License does not permit incorporating your program
into proprietary programs. If your program is a subroutine library, you
may consider it more useful to permit linking proprietary applications with
the library. If this is what you want to do, use the GNU Lesser General
Public License instead of this License. But first, please read
<https://www.gnu.org/licenses/why-not-lgpl.html>.

20
README.md Normal file
View file

@ -0,0 +1,20 @@
# Intel® 64 and IA-32 Instruction Set Reference
Intel® 64 and IA-32 Instruction Set Reference using FZF ([Fuzzy Finder](https://github.com/junegunn/fzf)) interface.
This UNOFFICIAL reference was generated from the official [Intel® 64 and IA-32 Architectures Software Developers Manual](https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html) by a dumb script.
There is no guarantee that some parts aren't mangled or broken and is distributed WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
[![ia32-64](./img/ia32-64.png "IA32-64")](./img/ia32-64.png)
Please report issues [here](https://bolha.dev/nrzcode/ia32-64) in case you run into any.
Read [here](https://github.com/junegunn/fzf) to install FZF or checkout the project and run this according shell in use.
```sh
git clone {https://github.com,~/.local/vendor}/junegunn/fzf
~/.local/vendor/junegunn/fzf/install --all
```
You could be [here](https://www.felixcloutier.com/x86) for the online x86 reference.

17
bin/build.sh Executable file
View file

@ -0,0 +1,17 @@
#!/bin/bash
workdir=$(mktemp -d)
cd $workdir
wget -r https://www.felixcloutier.com
workdir+=/www.felixcloutier.com/x86
for f in $workdir/*; do
if [[ $f == *:* ]]; then
mv "$f" "${f//:/.}"
: "${f##*/}"
sed -i "s|$_|${_//:/.}|g" $workdir/*
f=${f//:/.}
fi
[[ $f == *html ]] || mv "$f" "$f.html"
done
sed -i -E "/href='\/x86[^#']+#/s|href='/x86/([^'#]+)#|href='\1.html#|g" $workdir/*
sed -i -E "/href='\/x86[^']+'/s|href='/x86/([^']+)'|href='\1.html'|g" $workdir/*
sed -i "/href='\/x86\/'/s|href='/x86/'|href='index.html'|g" $workdir/*

90
ia32-64.sh Executable file
View file

@ -0,0 +1,90 @@
#!/usr/bin/env bash
# -------------------------------------------
# @script: ia32-64.sh
# @link: https://bolha.dev/nrzcode/ia32-64
# @description: Intel® 64 and IA-32 Instruction Set Reference using FZF interface.
# @license: GNU/GPL v3.0
# @version: 0.0.1
# @author: NRZ Code <nrzcode@protonmail.com>
# @created: 08/07/2025 01:24
#
# @requirements: ---
# @bugs: ---
# @notes: ---
# @revision: ---
# -------------------------------------------
# Copyright (C) 2025 NRZ Code <nrzcode@protonmail.com>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# -------------------------------------------
# USAGE
# ia32-64.sh [OPTIONS]
# -------------------------------------------
version='0.0.1'
usage='Usage: ia32-64.sh [OPTIONS]
DESCRIPTION
Intel® 64 and IA-32 Instruction Set Reference using FZF interface
OPTIONS
General options
-h, --help Print this help usage and exit
-v, --version Display version information and exit
'
copy='Copyright (C) 2025 NRZ Code <nrzcode@protonmail.com>.
License: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
'
# functions ---------------------------------
check_dependencies() {
local code=0
for dep; do
if ! type -p $dep &>/dev/null; then
echo "ERROR: ${BASH_SOURCE##*/}: $dep dependency is missing."
code=1
fi
done
return $code
}
main() {
fzf_opt=(
--info inline-right
--pointer '█'
--color hl+:red:bold,hl:red:bold
--color label:gray
--no-scrollbar
--border
--border-label $'┤ \e[1;34mINSTRUÇÕES INTEL 32 E 64 BITS\e[0m ├'
-e -i
--reverse
--prompt ': '
--tiebreak begin
--tabstop 24
-d '|'
--with-nth $'{1} \t- {2}'
--preview 'w3m -cols $(tput cols) -o display_border=1 {3}.html'
--preview-window bottom,70%,border-top
--bind 'enter:become(w3m -o display_border=1 {3}.html),esc:become(true)'
)
fzf "${fzf_opt[@]}" < ../inst.list
}
# main --------------------------------------
check_dependencies fzf w3m || exit 1
workdir=$(dirname $BASH_SOURCE)
cd $workdir/x86
[[ $BASH_SOURCE == $0 ]] && main "$@"

BIN
img/ia32-64.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 51 KiB

1222
inst.list Normal file

File diff suppressed because it is too large Load diff

95
x86/aaa.html Normal file
View file

@ -0,0 +1,95 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>AAA
— ASCII Adjust After Addition</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>AAA
— ASCII Adjust After Addition</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>37</td>
<td>AAA</td>
<td>ZO</td>
<td>Invalid</td>
<td>Valid</td>
<td>ASCII adjust AL after addition.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>ZO</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Adjusts the sum of two unpacked BCD values to create an unpacked BCD result. The AL register is the implied source and destination operand for this instruction. The AAA instruction is only useful when it follows an ADD instruction that adds (binary addition) two unpacked BCD values and stores a byte result in the AL register. The AAA instruction then adjusts the contents of the AL register to contain the correct 1-digit unpacked BCD result.</p>
<p>If the addition produces a decimal carry, the AH register increments by 1, and the CF and AF flags are set. If there was no decimal carry, the CF and AF flags are cleared and the AH register is unchanged. In either case, bits 4 through 7 of the AL register are set to 0.</p>
<p>This instruction executes as described in compatibility mode and legacy mode. It is not valid in 64-bit mode.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>IF 64-Bit Mode
THEN
#UD;
ELSE
IF ((AL AND 0FH) &gt; 9) or (AF = 1)
THEN
AX := AX + 106H;
AF := 1;
CF := 1;
ELSE
AF := 0;
CF := 0;
FI;
AL := AL AND 0FH;
FI;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>The AF and CF flags are set to 1 if the adjustment results in a decimal carry; otherwise they are set to 0. The OF, SF, ZF, and PF flags are undefined.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<p>Same exceptions as protected mode.</p>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<p>Same exceptions as protected mode.</p>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>If in 64-bit mode.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

99
x86/aad.html Normal file
View file

@ -0,0 +1,99 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>AAD
— ASCII Adjust AX Before Division</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>AAD
— ASCII Adjust AX Before Division</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>D5 0A</td>
<td>AAD</td>
<td>ZO</td>
<td>Invalid</td>
<td>Valid</td>
<td>ASCII adjust AX before division.</td></tr>
<tr>
<td>D5 ib</td>
<td>AAD imm8</td>
<td>ZO</td>
<td>Invalid</td>
<td>Valid</td>
<td>Adjust AX before division to number base imm8.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>ZO</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Adjusts two unpacked BCD digits (the least-significant digit in the AL register and the most-significant digit in the AH register) so that a division operation performed on the result will yield a correct unpacked BCD value. The AAD instruction is only useful when it precedes a DIV instruction that divides (binary division) the adjusted value in the AX register by an unpacked BCD value.</p>
<p>The AAD instruction sets the value in the AL register to (AL + (10 * AH)), and then clears the AH register to 00H. The value in the AX register is then equal to the binary equivalent of the original unpacked two-digit (base 10) number in registers AH and AL.</p>
<p>The generalized version of this instruction allows adjustment of two unpacked digits of any number base (see the “Operation” section below), by setting the <em>imm8</em> byte to the selected number base (for example, 08H for octal, 0AH for decimal, or 0CH for base 12 numbers). The AAD mnemonic is interpreted by all assemblers to mean adjust ASCII (base 10) values. To adjust values in another number base, the instruction must be hand coded in machine code (D5 <em>imm8</em>).</p>
<p>This instruction executes as described in compatibility mode and legacy mode. It is not valid in 64-bit mode.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>IF 64-Bit Mode
THEN
#UD;
ELSE
tempAL := AL;
tempAH := AH;
AL := (tempAL + (tempAH <em>imm8</em>)) AND FFH;
(* <em>imm8</em> is set to 0AH for the AAD mnemonic.*)
AH := 0;
FI;
The immediate value (<em>imm8</em>) is taken from the second byte of the instruction.
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>The SF, ZF, and PF flags are set according to the resulting binary value in the AL register; the OF, AF, and CF flags are undefined.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<p>Same exceptions as protected mode.</p>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<p>Same exceptions as protected mode.</p>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>If in 64-bit mode.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

99
x86/aam.html Normal file
View file

@ -0,0 +1,99 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>AAM
— ASCII Adjust AX After Multiply</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>AAM
— ASCII Adjust AX After Multiply</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>D4 0A</td>
<td>AAM</td>
<td>ZO</td>
<td>Invalid</td>
<td>Valid</td>
<td>ASCII adjust AX after multiply.</td></tr>
<tr>
<td>D4 ib</td>
<td>AAM imm8</td>
<td>ZO</td>
<td>Invalid</td>
<td>Valid</td>
<td>Adjust AX after multiply to number base imm8.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>ZO</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Adjusts the result of the multiplication of two unpacked BCD values to create a pair of unpacked (base 10) BCD values. The AX register is the implied source and destination operand for this instruction. The AAM instruction is only useful when it follows an MUL instruction that multiplies (binary multiplication) two unpacked BCD values and stores a word result in the AX register. The AAM instruction then adjusts the contents of the AX register to contain the correct 2-digit unpacked (base 10) BCD result.</p>
<p>The generalized version of this instruction allows adjustment of the contents of the AX to create two unpacked digits of any number base (see the “Operation” section below). Here, the <em>imm8</em> byte is set to the selected number base (for example, 08H for octal, 0AH for decimal, or 0CH for base 12 numbers). The AAM mnemonic is interpreted by all assemblers to mean adjust to ASCII (base 10) values. To adjust to values in another number base, the instruction must be hand coded in machine code (D4 <em>imm8</em>).</p>
<p>This instruction executes as described in compatibility mode and legacy mode. It is not valid in 64-bit mode.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>IF 64-Bit Mode
THEN
#UD;
ELSE
tempAL := AL;
AH := tempAL / <em>imm8</em>; (* <em>imm8</em> is set to 0AH for the AAM mnemonic *)
AL := tempAL MOD <em>imm8</em>;
FI;
The immediate value (<em>imm8</em>) is taken from the second byte of the instruction.
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>The SF, ZF, and PF flags are set according to the resulting binary value in the AL register. The OF, AF, and CF flags are undefined.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#DE</td>
<td>If an immediate value of 0 is used.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<p>Same exceptions as protected mode.</p>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<p>Same exceptions as protected mode.</p>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>If in 64-bit mode.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

97
x86/aas.html Normal file
View file

@ -0,0 +1,97 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>AAS
— ASCII Adjust AL After Subtraction</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>AAS
— ASCII Adjust AL After Subtraction</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>3F</td>
<td>AAS</td>
<td>ZO</td>
<td>Invalid</td>
<td>Valid</td>
<td>ASCII adjust AL after subtraction.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>ZO</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Adjusts the result of the subtraction of two unpacked BCD values to create a unpacked BCD result. The AL register is the implied source and destination operand for this instruction. The AAS instruction is only useful when it follows a SUB instruction that subtracts (binary subtraction) one unpacked BCD value from another and stores a byte result in the AL register. The AAA instruction then adjusts the contents of the AL register to contain the correct 1-digit unpacked BCD result.</p>
<p>If the subtraction produced a decimal carry, the AH register decrements by 1, and the CF and AF flags are set. If no decimal carry occurred, the CF and AF flags are cleared, and the AH register is unchanged. In either case, the AL register is left with its top four bits set to 0.</p>
<p>This instruction executes as described in compatibility mode and legacy mode. It is not valid in 64-bit mode.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>IF 64-bit mode
THEN
#UD;
ELSE
IF ((AL AND 0FH) &gt; 9) or (AF = 1)
THEN
AX := AX 6;
AH := AH 1;
AF := 1;
CF := 1;
AL := AL AND 0FH;
ELSE
CF := 0;
AF := 0;
AL := AL AND 0FH;
FI;
FI;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>The AF and CF flags are set to 1 if there is a decimal borrow; otherwise, they are cleared to 0. The OF, SF, ZF, and PF flags are undefined.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<p>Same exceptions as protected mode.</p>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<p>Same exceptions as protected mode.</p>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>If in 64-bit mode.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

313
x86/adc.html Normal file
View file

@ -0,0 +1,313 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>ADC
— Add With Carry</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>ADC
— Add With Carry</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>14 ib</td>
<td>ADC AL, imm8</td>
<td>I</td>
<td>Valid</td>
<td>Valid</td>
<td>Add with carry imm8 to AL.</td></tr>
<tr>
<td>15 iw</td>
<td>ADC AX, imm16</td>
<td>I</td>
<td>Valid</td>
<td>Valid</td>
<td>Add with carry imm16 to AX.</td></tr>
<tr>
<td>15 id</td>
<td>ADC EAX, imm32</td>
<td>I</td>
<td>Valid</td>
<td>Valid</td>
<td>Add with carry imm32 to EAX.</td></tr>
<tr>
<td>REX.W + 15 id</td>
<td>ADC RAX, imm32</td>
<td>I</td>
<td>Valid</td>
<td>N.E.</td>
<td>Add with carry imm32 sign extended to 64-bits to RAX.</td></tr>
<tr>
<td>80 /2 ib</td>
<td>ADC r/m8, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>Add with carry imm8 to r/m8.</td></tr>
<tr>
<td>REX + 80 /2 ib</td>
<td>ADC r/m8<sup>*</sup>, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>N.E.</td>
<td>Add with carry imm8 to r/m8.</td></tr>
<tr>
<td>81 /2 iw</td>
<td>ADC r/m16, imm16</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>Add with carry imm16 to r/m16.</td></tr>
<tr>
<td>81 /2 id</td>
<td>ADC r/m32, imm32</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>Add with CF imm32 to r/m32.</td></tr>
<tr>
<td>REX.W + 81 /2 id</td>
<td>ADC r/m64, imm32</td>
<td>MI</td>
<td>Valid</td>
<td>N.E.</td>
<td>Add with CF imm32 sign extended to 64-bits to r/m64.</td></tr>
<tr>
<td>83 /2 ib</td>
<td>ADC r/m16, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>Add with CF sign-extended imm8 to r/m16.</td></tr>
<tr>
<td>83 /2 ib</td>
<td>ADC r/m32, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>Add with CF sign-extended imm8 into r/m32.</td></tr>
<tr>
<td>REX.W + 83 /2 ib</td>
<td>ADC r/m64, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>N.E.</td>
<td>Add with CF sign-extended imm8 into r/m64.</td></tr>
<tr>
<td>10 /r</td>
<td>ADC r/m8, r8</td>
<td>MR</td>
<td>Valid</td>
<td>Valid</td>
<td>Add with carry byte register to r/m8.</td></tr>
<tr>
<td>REX + 10 /r</td>
<td>ADC r/m8<sup>*</sup>, r8<sup>*</sup></td>
<td>MR</td>
<td>Valid</td>
<td>N.E.</td>
<td>Add with carry byte register to r/m64.</td></tr>
<tr>
<td>11 /r</td>
<td>ADC r/m16, r16</td>
<td>MR</td>
<td>Valid</td>
<td>Valid</td>
<td>Add with carry r16 to r/m16.</td></tr>
<tr>
<td>11 /r</td>
<td>ADC r/m32, r32</td>
<td>MR</td>
<td>Valid</td>
<td>Valid</td>
<td>Add with CF r32 to r/m32.</td></tr>
<tr>
<td>REX.W + 11 /r</td>
<td>ADC r/m64, r64</td>
<td>MR</td>
<td>Valid</td>
<td>N.E.</td>
<td>Add with CF r64 to r/m64.</td></tr>
<tr>
<td>12 /r</td>
<td>ADC r8, r/m8</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Add with carry r/m8 to byte register.</td></tr>
<tr>
<td>REX + 12 /r</td>
<td>ADC r8<sup>*</sup>, r/m8<sup>*</sup></td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Add with carry r/m64 to byte register.</td></tr>
<tr>
<td>13 /r</td>
<td>ADC r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Add with carry r/m16 to r16.</td></tr>
<tr>
<td>13 /r</td>
<td>ADC r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Add with CF r/m32 to r32.</td></tr>
<tr>
<td>REX.W + 13 /r</td>
<td>ADC r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Add with CF r/m64 to r64.</td></tr></table>
<blockquote>
<p>*In 64-bit mode, r/m8 can not be encoded to access the following byte registers if a REX prefix is used: AH, BH, CH, DH.</p></blockquote>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RM</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>MR</td>
<td>ModRM:r/m (r, w)</td>
<td>ModRM:reg (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>MI</td>
<td>ModRM:r/m (r, w)</td>
<td>imm8/16/32</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>I</td>
<td>AL/AX/EAX/RAX</td>
<td>imm8/16/32</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Adds the destination operand (first operand), the source operand (second operand), and the carry (CF) flag and stores the result in the destination operand. The destination operand can be a register or a memory location; the source operand can be an immediate, a register, or a memory location. (However, two memory operands cannot be used in one instruction.) The state of the CF flag represents a carry from a previous addition. When an immediate value is used as an operand, it is sign-extended to the length of the destination operand format.</p>
<p>The ADC instruction does not distinguish between signed or unsigned operands. Instead, the processor evaluates the result for both data types and sets the OF and CF flags to indicate a carry in the signed or unsigned result, respectively. The SF flag indicates the sign of the signed result.</p>
<p>The ADC instruction is usually executed as part of a multibyte or multiword addition in which an ADD instruction is followed by an ADC instruction.</p>
<p>This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.</p>
<p>In 64-bit mode, the instructions default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>DEST := DEST + SRC + CF;
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>ADC extern unsigned char _addcarry_u8(unsigned char c_in, unsigned char src1, unsigned char src2, unsigned char *sum_out);
</pre>
<pre>ADC extern unsigned char _addcarry_u16(unsigned char c_in, unsigned short src1, unsigned short src2, unsigned short *sum_out);
</pre>
<pre>ADC extern unsigned char _addcarry_u32(unsigned char c_in, unsigned int src1, unsigned char int, unsigned int *sum_out);
</pre>
<pre>ADC extern unsigned char _addcarry_u64(unsigned char c_in, unsigned __int64 src1, unsigned __int64 src2, unsigned __int64 *sum_out);
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>The OF, SF, ZF, AF, CF, and PF flags are set according to the result.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="3">#GP(0)</td>
<td>If the destination is located in a non-writable segment.</td></tr>
<tr>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>If the DS, ES, FS, or GS register is used to access memory and it contains a NULL segment selector.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP(0)</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#SS(0)</td>
<td>If a memory address referencing the SS segment is in a non-canonical form.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If the memory address is in a non-canonical form.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

160
x86/adcx.html Normal file
View file

@ -0,0 +1,160 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>ADCX
— Unsigned Integer Addition of Two Operands With Carry Flag</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>ADCX
— Unsigned Integer Addition of Two Operands With Carry Flag</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>66 0F 38 F6 /r ADCX r32, r/m32</td>
<td>RM</td>
<td>V/V</td>
<td>ADX</td>
<td>Unsigned addition of r32 with CF, r/m32 to r32, writes CF.</td></tr>
<tr>
<td>66 REX.w 0F 38 F6 /r ADCX r64, r/m64</td>
<td>RM</td>
<td>V/N.E.</td>
<td>ADX</td>
<td>Unsigned addition of r64 with CF, r/m64 to r64, writes CF.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RM</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Performs an unsigned addition of the destination operand (first operand), the source operand (second operand) and the carry-flag (CF) and stores the result in the destination operand. The destination operand is a general-purpose register, whereas the source operand can be a general-purpose register or memory location. The state of CF can represent a carry from a previous addition. The instruction sets the CF flag with the carry generated by the unsigned addition of the operands.</p>
<p>The ADCX instruction is executed in the context of multi-precision addition, where we add a series of operands with a carry-chain. At the beginning of a chain of additions, we need to make sure the CF is in a desired initial state. Often, this initial state needs to be 0, which can be achieved with an instruction to zero the CF (e.g. XOR).</p>
<p>This instruction is supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode.</p>
<p>In 64-bit mode, the default operation size is 32 bits. Using a REX Prefix in the form of REX.R permits access to additional registers (R8-15). Using REX Prefix in the form of REX.W promotes operation to 64 bits.</p>
<p>ADCX executes normally either inside or outside a transaction region.</p>
<p>Note: ADCX defines the OF flag differently than the ADD/ADC instructions as defined in the Intel<sup>®</sup> 64 and IA-32 Architectures Software Developers Manual, Volume 2A.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>IF OperandSize is 64-bit
THEN CF:DEST[63:0] := DEST[63:0] + SRC[63:0] + CF;
ELSE CF:DEST[31:0] := DEST[31:0] + SRC[31:0] + CF;
FI;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>CF is updated based on result. OF, SF, ZF, AF, and PF flags are unmodified.</p>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>unsigned char _addcarryx_u32 (unsigned char c_in, unsigned int src1, unsigned int src2, unsigned int *sum_out);
</pre>
<pre>unsigned char _addcarryx_u64 (unsigned char c_in, unsigned __int64 src1, unsigned __int64 src2, unsigned __int64 *sum_out);
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="2">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If CPUID.(EAX=07H, ECX=0H):EBX.ADX[bit 19] = 0.</td></tr>
<tr>
<td>#SS(0)</td>
<td>For an illegal address in the SS segment.</td></tr>
<tr>
<td rowspan="2">#GP(0)</td>
<td>For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments.</td></tr>
<tr>
<td>If the DS, ES, FS, or GS register is used to access memory and it contains a null segment selector.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>For a page fault.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="2">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If CPUID.(EAX=07H, ECX=0H):EBX.ADX[bit 19] = 0.</td></tr>
<tr>
<td>#SS(0)</td>
<td>For an illegal address in the SS segment.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If any part of the operand lies outside the effective address space from 0 to FFFFH.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="2">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If CPUID.(EAX=07H, ECX=0H):EBX.ADX[bit 19] = 0.</td></tr>
<tr>
<td>#SS(0)</td>
<td>For an illegal address in the SS segment.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If any part of the operand lies outside the effective address space from 0 to FFFFH.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>For a page fault.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="2">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If CPUID.(EAX=07H, ECX=0H):EBX.ADX[bit 19] = 0.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory address referencing the SS segment is in a non-canonical form.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If the memory address is in a non-canonical form.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>For a page fault.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

301
x86/add.html Normal file
View file

@ -0,0 +1,301 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>ADD
— Add</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>ADD
— Add</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>04 ib</td>
<td>ADD AL, imm8</td>
<td>I</td>
<td>Valid</td>
<td>Valid</td>
<td>Add imm8 to AL.</td></tr>
<tr>
<td>05 iw</td>
<td>ADD AX, imm16</td>
<td>I</td>
<td>Valid</td>
<td>Valid</td>
<td>Add imm16 to AX.</td></tr>
<tr>
<td>05 id</td>
<td>ADD EAX, imm32</td>
<td>I</td>
<td>Valid</td>
<td>Valid</td>
<td>Add imm32 to EAX.</td></tr>
<tr>
<td>REX.W + 05 id</td>
<td>ADD RAX, imm32</td>
<td>I</td>
<td>Valid</td>
<td>N.E.</td>
<td>Add imm32 sign-extended to 64-bits to RAX.</td></tr>
<tr>
<td>80 /0 ib</td>
<td>ADD r/m8, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>Add imm8 to r/m8.</td></tr>
<tr>
<td>REX + 80 /0 ib</td>
<td>ADD r/m8<sup>*</sup>, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>N.E.</td>
<td>Add sign-extended imm8 to r/m8.</td></tr>
<tr>
<td>81 /0 iw</td>
<td>ADD r/m16, imm16</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>Add imm16 to r/m16.</td></tr>
<tr>
<td>81 /0 id</td>
<td>ADD r/m32, imm32</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>Add imm32 to r/m32.</td></tr>
<tr>
<td>REX.W + 81 /0 id</td>
<td>ADD r/m64, imm32</td>
<td>MI</td>
<td>Valid</td>
<td>N.E.</td>
<td>Add imm32 sign-extended to 64-bits to r/m64.</td></tr>
<tr>
<td>83 /0 ib</td>
<td>ADD r/m16, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>Add sign-extended imm8 to r/m16.</td></tr>
<tr>
<td>83 /0 ib</td>
<td>ADD r/m32, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>Add sign-extended imm8 to r/m32.</td></tr>
<tr>
<td>REX.W + 83 /0 ib</td>
<td>ADD r/m64, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>N.E.</td>
<td>Add sign-extended imm8 to r/m64.</td></tr>
<tr>
<td>00 /r</td>
<td>ADD r/m8, r8</td>
<td>MR</td>
<td>Valid</td>
<td>Valid</td>
<td>Add r8 to r/m8.</td></tr>
<tr>
<td>REX + 00 /r</td>
<td>ADD r/m8<sup>*</sup>, r8<sup>*</sup></td>
<td>MR</td>
<td>Valid</td>
<td>N.E.</td>
<td>Add r8 to r/m8.</td></tr>
<tr>
<td>01 /r</td>
<td>ADD r/m16, r16</td>
<td>MR</td>
<td>Valid</td>
<td>Valid</td>
<td>Add r16 to r/m16.</td></tr>
<tr>
<td>01 /r</td>
<td>ADD r/m32, r32</td>
<td>MR</td>
<td>Valid</td>
<td>Valid</td>
<td>Add r32 to r/m32.</td></tr>
<tr>
<td>REX.W + 01 /r</td>
<td>ADD r/m64, r64</td>
<td>MR</td>
<td>Valid</td>
<td>N.E.</td>
<td>Add r64 to r/m64.</td></tr>
<tr>
<td>02 /r</td>
<td>ADD r8, r/m8</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Add r/m8 to r8.</td></tr>
<tr>
<td>REX + 02 /r</td>
<td>ADD r8<sup>*</sup>, r/m8<sup>*</sup></td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Add r/m8 to r8.</td></tr>
<tr>
<td>03 /r</td>
<td>ADD r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Add r/m16 to r16.</td></tr>
<tr>
<td>03 /r</td>
<td>ADD r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Add r/m32 to r32.</td></tr>
<tr>
<td>REX.W + 03 /r</td>
<td>ADD r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Add r/m64 to r64.</td></tr></table>
<blockquote>
<p>*In 64-bit mode, r/m8 can not be encoded to access the following byte registers if a REX prefix is used: AH, BH, CH, DH.</p></blockquote>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RM</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>MR</td>
<td>ModRM:r/m (r, w)</td>
<td>ModRM:reg (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>MI</td>
<td>ModRM:r/m (r, w)</td>
<td>imm8/16/32</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>I</td>
<td>AL/AX/EAX/RAX</td>
<td>imm8/16/32</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Adds the destination operand (first operand) and the source operand (second operand) and then stores the result in the destination operand. The destination operand can be a register or a memory location; the source operand can be an immediate, a register, or a memory location. (However, two memory operands cannot be used in one instruction.) When an immediate value is used as an operand, it is sign-extended to the length of the destination operand format.</p>
<p>The ADD instruction performs integer addition. It evaluates the result for both signed and unsigned integer operands and sets the OF and CF flags to indicate a carry (overflow) in the signed or unsigned result, respectively. The SF flag indicates the sign of the signed result.</p>
<p>This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.</p>
<p>In 64-bit mode, the instructions default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>DEST := DEST + SRC;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>The OF, SF, ZF, AF, CF, and PF flags are set according to the result.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="3">#GP(0)</td>
<td>If the destination is located in a non-writable segment.</td></tr>
<tr>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>If the DS, ES, FS, or GS register is used to access memory and it contains a NULL segment selector.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP(0)</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#SS(0)</td>
<td>If a memory address referencing the SS segment is in a non-canonical form.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If the memory address is in a non-canonical form.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

203
x86/addpd.html Normal file
View file

@ -0,0 +1,203 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>ADDPD
— Add Packed Double Precision Floating-Point Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>ADDPD
— Add Packed Double Precision Floating-Point Values</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op / En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>66 0F 58 /r ADDPD xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>SSE2</td>
<td>Add packed double precision floating-point values from xmm2/mem to xmm1 and store result in xmm1.</td></tr>
<tr>
<td>VEX.128.66.0F.WIG 58 /r VADDPD xmm1,xmm2, xmm3/m128</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Add packed double precision floating-point values from xmm3/mem to xmm2 and store result in xmm1.</td></tr>
<tr>
<td>VEX.256.66.0F.WIG 58 /r VADDPD ymm1, ymm2, ymm3/m256</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Add packed double precision floating-point values from ymm3/mem to ymm2 and store result in ymm1.</td></tr>
<tr>
<td>EVEX.128.66.0F.W1 58 /r VADDPD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Add packed double precision floating-point values from xmm3/m128/m64bcst to xmm2 and store result in xmm1 with writemask k1.</td></tr>
<tr>
<td>EVEX.256.66.0F.W1 58 /r VADDPD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Add packed double precision floating-point values from ymm3/m256/m64bcst to ymm2 and store result in ymm1 with writemask k1.</td></tr>
<tr>
<td>EVEX.512.66.0F.W1 58 /r VADDPD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst{er}</td>
<td>C</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Add packed double precision floating-point values from zmm3/m512/m64bcst to zmm2 and store result in zmm1 with writemask k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr>
<tr>
<td>C</td>
<td>Full</td>
<td>ModRM:reg (w)</td>
<td>EVEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Adds two, four or eight packed double precision floating-point values from the first source operand to the second source operand, and stores the packed double precision floating-point result in the destination operand.</p>
<p>EVEX encoded versions: The first source operand is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 64-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with writemask k1.</p>
<p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register. The upper bits (MAXVL-1:256) of the corresponding ZMM register destination are zeroed.</p>
<p>VEX.128 encoded version: the first source operand is a XMM register. The second source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding ZMM register destination are zeroed.</p>
<p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper Bits (MAXVL-1:128) of the corresponding ZMM register destination are unmodified.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="vaddpd--evex-encoded-versions--when-src2-operand-is-a-vector-register">VADDPD (EVEX Encoded Versions) When SRC2 Operand is a Vector Register<a class="anchor" href="#vaddpd--evex-encoded-versions--when-src2-operand-is-a-vector-register">
</a></h3>
<pre>(KL, VL) = (2, 128), (4, 256), (8, 512)
IF (VL = 512) AND (EVEX.b = 1)
THEN
SET_ROUNDING_MODE_FOR_THIS_INSTRUCTION(EVEX.RC);
ELSE
SET_ROUNDING_MODE_FOR_THIS_INSTRUCTION(MXCSR.RC);
FI;
FOR j := 0 TO KL-1
i := j * 64
IF k1[j] OR *no writemask*
THEN DEST[i+63:i] := SRC1[i+63:i] + SRC2[i+63:i]
ELSE
IF *merging-masking* ; merging-masking
THEN *DEST[i+63:i] remains unchanged*
ELSE ; zeroing-masking
DEST[i+63:i] := 0
FI
FI;
ENDFOR
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="vaddpd--evex-encoded-versions--when-src2-operand-is-a-memory-source">VADDPD (EVEX Encoded Versions) When SRC2 Operand is a Memory Source<a class="anchor" href="#vaddpd--evex-encoded-versions--when-src2-operand-is-a-memory-source">
</a></h3>
<pre>(KL, VL) = (2, 128), (4, 256), (8, 512)
FOR j := 0 TO KL-1
i := j * 64
IF k1[j] OR *no writemask*
THEN
IF (EVEX.b = 1)
THEN
DEST[i+63:i] := SRC1[i+63:i] + SRC2[63:0]
ELSE
DEST[i+63:i] := SRC1[i+63:i] + SRC2[i+63:i]
FI;
ELSE
IF *merging-masking* ; merging-masking
THEN *DEST[i+63:i] remains unchanged*
ELSE ; zeroing-masking
DEST[i+63:i] := 0
FI
FI;
ENDFOR
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="vaddpd--vex-256-encoded-version-">VADDPD (VEX.256 Encoded Version)<a class="anchor" href="#vaddpd--vex-256-encoded-version-">
</a></h3>
<pre>DEST[63:0] := SRC1[63:0] + SRC2[63:0]
DEST[127:64] := SRC1[127:64] + SRC2[127:64]
DEST[191:128] := SRC1[191:128] + SRC2[191:128]
DEST[255:192] := SRC1[255:192] + SRC2[255:192]
DEST[MAXVL-1:256] := 0
.
</pre>
<h3 id="vaddpd--vex-128-encoded-version-">VADDPD (VEX.128 Encoded Version)<a class="anchor" href="#vaddpd--vex-128-encoded-version-">
</a></h3>
<pre>DEST[63:0] := SRC1[63:0] + SRC2[63:0]
DEST[127:64] := SRC1[127:64] + SRC2[127:64]
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="addpd--128-bit-legacy-sse-version-">ADDPD (128-bit Legacy SSE Version)<a class="anchor" href="#addpd--128-bit-legacy-sse-version-">
</a></h3>
<pre>DEST[63:0] := DEST[63:0] + SRC[63:0]
DEST[127:64] := DEST[127:64] + SRC[127:64]
DEST[MAXVL-1:128] (Unmodified)
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>VADDPD __m512d _mm512_add_pd (__m512d a, __m512d b);
</pre>
<pre>VADDPD __m512d _mm512_mask_add_pd (__m512d s, __mmask8 k, __m512d a, __m512d b);
</pre>
<pre>VADDPD __m512d _mm512_maskz_add_pd (__mmask8 k, __m512d a, __m512d b);
</pre>
<pre>VADDPD __m256d _mm256_mask_add_pd (__m256d s, __mmask8 k, __m256d a, __m256d b);
</pre>
<pre>VADDPD __m256d _mm256_maskz_add_pd (__mmask8 k, __m256d a, __m256d b);
</pre>
<pre>VADDPD __m128d _mm_mask_add_pd (__m128d s, __mmask8 k, __m128d a, __m128d b);
</pre>
<pre>VADDPD __m128d _mm_maskz_add_pd (__mmask8 k, __m128d a, __m128d b);
</pre>
<pre>VADDPD __m512d _mm512_add_round_pd (__m512d a, __m512d b, int);
</pre>
<pre>VADDPD __m512d _mm512_mask_add_round_pd (__m512d s, __mmask8 k, __m512d a, __m512d b, int);
</pre>
<pre>VADDPD __m512d _mm512_maskz_add_round_pd (__mmask8 k, __m512d a, __m512d b, int);
</pre>
<pre>ADDPD __m256d _mm256_add_pd (__m256d a, __m256d b);
</pre>
<pre>ADDPD __m128d _mm_add_pd (__m128d a, __m128d b);
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>Overflow, Underflow, Invalid, Precision, Denormal.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>VEX-encoded instruction, see <span class="not-imported">Table 2-19</span>, “Type 2 Class Exception Conditions.”</p>
<p>EVEX-encoded instruction, see <span class="not-imported">Table 2-46</span>, “Type E2 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

216
x86/addps.html Normal file
View file

@ -0,0 +1,216 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>ADDPS
— Add Packed Single Precision Floating-Point Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>ADDPS
— Add Packed Single Precision Floating-Point Values</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op / En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>NP 0F 58 /r ADDPS xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>SSE</td>
<td>Add packed single precision floating-point values from xmm2/m128 to xmm1 and store result in xmm1.</td></tr>
<tr>
<td>VEX.128.0F.WIG 58 /r VADDPS xmm1,xmm2, xmm3/m128</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Add packed single precision floating-point values from xmm3/m128 to xmm2 and store result in xmm1.</td></tr>
<tr>
<td>VEX.256.0F.WIG 58 /r VADDPS ymm1, ymm2, ymm3/m256</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Add packed single precision floating-point values from ymm3/m256 to ymm2 and store result in ymm1.</td></tr>
<tr>
<td>EVEX.128.0F.W0 58 /r VADDPS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Add packed single precision floating-point values from xmm3/m128/m32bcst to xmm2 and store result in xmm1 with writemask k1.</td></tr>
<tr>
<td>EVEX.256.0F.W0 58 /r VADDPS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Add packed single precision floating-point values from ymm3/m256/m32bcst to ymm2 and store result in ymm1 with writemask k1.</td></tr>
<tr>
<td>EVEX.512.0F.W0 58 /r VADDPS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst {er}</td>
<td>C</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Add packed single precision floating-point values from zmm3/m512/m32bcst to zmm2 and store result in zmm1 with writemask k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr>
<tr>
<td>C</td>
<td>Full</td>
<td>ModRM:reg (w)</td>
<td>EVEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Adds four, eight or sixteen packed single precision floating-point values from the first source operand with the second source operand, and stores the packed single precision floating-point result in the destination operand.</p>
<p>EVEX encoded versions: The first source operand is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 32-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with writemask k1.</p>
<p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register. The upper bits (MAXVL-1:256) of the corresponding ZMM register destination are zeroed.</p>
<p>VEX.128 encoded version: the first source operand is a XMM register. The second source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding ZMM register destination are zeroed.</p>
<p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper Bits (MAXVL-1:128) of the corresponding ZMM register destination are unmodified.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="vaddps--evex-encoded-versions--when-src2-operand-is-a-register">VADDPS (EVEX Encoded Versions) When SRC2 Operand is a Register<a class="anchor" href="#vaddps--evex-encoded-versions--when-src2-operand-is-a-register">
</a></h3>
<pre>(KL, VL) = (4, 128), (8, 256), (16, 512)
IF (VL = 512) AND (EVEX.b = 1)
THEN
SET_ROUNDING_MODE_FOR_THIS_INSTRUCTION(EVEX.RC);
ELSE
SET_ROUNDING_MODE_FOR_THIS_INSTRUCTION(MXCSR.RC);
FI;
FOR j := 0 TO KL-1
i := j * 32
IF k1[j] OR *no writemask*
THEN DEST[i+31:i] := SRC1[i+31:i] + SRC2[i+31:i]
ELSE
IF *merging-masking* ; merging-masking
THEN *DEST[i+31:i] remains unchanged*
ELSE ; zeroing-masking
DEST[i+31:i] := 0
FI
FI;
ENDFOR;
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="vaddps--evex-encoded-versions--when-src2-operand-is-a-memory-source">VADDPS (EVEX Encoded Versions) When SRC2 Operand is a Memory Source<a class="anchor" href="#vaddps--evex-encoded-versions--when-src2-operand-is-a-memory-source">
</a></h3>
<pre>(KL, VL) = (4, 128), (8, 256), (16, 512)
FOR j := 0 TO KL-1
i := j * 32
IF k1[j] OR *no writemask*
THEN
IF (EVEX.b = 1)
THEN
DEST[i+31:i] :=
SRC1[i+31:i] + SRC2[31:0]
ELSE
DEST[i+31:i] :=
SRC1[i+31:i] + SRC2[i+31:i]
FI;
ELSE
IF *merging-masking*
; merging-masking
THEN *DEST[i+31:i]
remains unchanged*
ELSE
; zeroing-masking
DEST[i+31:i] :=
0
FI
FI;
ENDFOR;
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="vaddps--vex-256-encoded-version-">VADDPS (VEX.256 Encoded Version)<a class="anchor" href="#vaddps--vex-256-encoded-version-">
</a></h3>
<pre>DEST[31:0] := SRC1[31:0] + SRC2[31:0]
DEST[63:32] := SRC1[63:32] + SRC2[63:32]
DEST[95:64] := SRC1[95:64] + SRC2[95:64]
DEST[127:96] := SRC1[127:96] + SRC2[127:96]
DEST[159:128] := SRC1[159:128] + SRC2[159:128]
DEST[191:160]:= SRC1[191:160] + SRC2[191:160]
DEST[223:192] := SRC1[223:192] + SRC2[223:192]
DEST[255:224] := SRC1[255:224] + SRC2[255:224].
DEST[MAXVL-1:256] := 0
</pre>
<h3 id="vaddps--vex-128-encoded-version-">VADDPS (VEX.128 Encoded Version)<a class="anchor" href="#vaddps--vex-128-encoded-version-">
</a></h3>
<pre>DEST[31:0] := SRC1[31:0] + SRC2[31:0]
DEST[63:32] := SRC1[63:32] + SRC2[63:32]
DEST[95:64] := SRC1[95:64] + SRC2[95:64]
DEST[127:96] := SRC1[127:96] + SRC2[127:96]
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="addps--128-bit-legacy-sse-version-">ADDPS (128-bit Legacy SSE Version)<a class="anchor" href="#addps--128-bit-legacy-sse-version-">
</a></h3>
<pre>DEST[31:0] := SRC1[31:0] + SRC2[31:0]
DEST[63:32] := SRC1[63:32] + SRC2[63:32]
DEST[95:64] := SRC1[95:64] + SRC2[95:64]
DEST[127:96] := SRC1[127:96] + SRC2[127:96]
DEST[MAXVL-1:128] (Unmodified)
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>VADDPS __m512 _mm512_add_ps (__m512 a, __m512 b);
</pre>
<pre>VADDPS __m512 _mm512_mask_add_ps (__m512 s, __mmask16 k, __m512 a, __m512 b);
</pre>
<pre>VADDPS __m512 _mm512_maskz_add_ps (__mmask16 k, __m512 a, __m512 b);
</pre>
<pre>VADDPS __m256 _mm256_mask_add_ps (__m256 s, __mmask8 k, __m256 a, __m256 b);
</pre>
<pre>VADDPS __m256 _mm256_maskz_add_ps (__mmask8 k, __m256 a, __m256 b);
</pre>
<pre>VADDPS __m128 _mm_mask_add_ps (__m128d s, __mmask8 k, __m128 a, __m128 b);
</pre>
<pre>VADDPS __m128 _mm_maskz_add_ps (__mmask8 k, __m128 a, __m128 b);
</pre>
<pre>VADDPS __m512 _mm512_add_round_ps (__m512 a, __m512 b, int);
</pre>
<pre>VADDPS __m512 _mm512_mask_add_round_ps (__m512 s, __mmask16 k, __m512 a, __m512 b, int);
</pre>
<pre>VADDPS __m512 _mm512_maskz_add_round_ps (__mmask16 k, __m512 a, __m512 b, int);
</pre>
<pre>ADDPS __m256 _mm256_add_ps (__m256 a, __m256 b);
</pre>
<pre>ADDPS __m128 _mm_add_ps (__m128 a, __m128 b);
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>Overflow, Underflow, Invalid, Precision, Denormal.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>VEX-encoded instruction, see <span class="not-imported">Table 2-19</span>, “Type 2 Class Exception Conditions.”</p>
<p>EVEX-encoded instruction, see <span class="not-imported">Table 2-46</span>, “Type E2 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

136
x86/addsd.html Normal file
View file

@ -0,0 +1,136 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>ADDSD
— Add Scalar Double Precision Floating-Point Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>ADDSD
— Add Scalar Double Precision Floating-Point Values</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op / En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>F2 0F 58 /r ADDSD xmm1, xmm2/m64</td>
<td>A</td>
<td>V/V</td>
<td>SSE2</td>
<td>Add the low double precision floating-point value from xmm2/mem to xmm1 and store the result in xmm1.</td></tr>
<tr>
<td>VEX.LIG.F2.0F.WIG 58 /r VADDSD xmm1, xmm2, xmm3/m64</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Add the low double precision floating-point value from xmm3/mem to xmm2 and store the result in xmm1.</td></tr>
<tr>
<td>EVEX.LLIG.F2.0F.W1 58 /r VADDSD xmm1 {k1}{z}, xmm2, xmm3/m64{er}</td>
<td>C</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Add the low double precision floating-point value from xmm3/m64 to xmm2 and store the result in xmm1 with writemask k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr>
<tr>
<td>C</td>
<td>Tuple1 Scalar</td>
<td>ModRM:reg (w)</td>
<td>EVEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Adds the low double precision floating-point values from the second source operand and the first source operand and stores the double precision floating-point result in the destination operand.</p>
<p>The second source operand can be an XMM register or a 64-bit memory location. The first source and destination operands are XMM registers.</p>
<p>128-bit Legacy SSE version: The first source and destination operands are the same. Bits (MAXVL-1:64) of the corresponding destination register remain unchanged.</p>
<p>EVEX and VEX.128 encoded version: The first source operand is encoded by EVEX.vvvv/VEX.vvvv. Bits (127:64) of the XMM register destination are copied from corresponding bits in the first source operand. Bits (MAXVL-1:128) of the destination register are zeroed.</p>
<p>EVEX version: The low quadword element of the destination is updated according to the writemask.</p>
<p>Software should ensure VADDSD is encoded with VEX.L=0. Encoding VADDSD with VEX.L=1 may encounter unpredictable behavior across different processor generations.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="vaddsd--evex-encoded-version-">VADDSD (EVEX Encoded Version)<a class="anchor" href="#vaddsd--evex-encoded-version-">
</a></h3>
<pre>IF (EVEX.b = 1) AND SRC2 *is a register*
THEN
SET_ROUNDING_MODE_FOR_THIS_INSTRUCTION(EVEX.RC);
ELSE
SET_ROUNDING_MODE_FOR_THIS_INSTRUCTION(MXCSR.RC);
FI;
IF k1[0] or *no writemask*
THEN DEST[63:0] := SRC1[63:0] + SRC2[63:0]
ELSE
IF *merging-masking* ; merging-masking
THEN *DEST[63:0] remains unchanged*
ELSE ; zeroing-masking
THEN DEST[63:0] := 0
FI;
FI;
DEST[127:64] := SRC1[127:64]
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="vaddsd--vex-128-encoded-version-">VADDSD (VEX.128 Encoded Version)<a class="anchor" href="#vaddsd--vex-128-encoded-version-">
</a></h3>
<pre>DEST[63:0] := SRC1[63:0] + SRC2[63:0]
DEST[127:64] := SRC1[127:64]
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="addsd--128-bit-legacy-sse-version-">ADDSD (128-bit Legacy SSE Version)<a class="anchor" href="#addsd--128-bit-legacy-sse-version-">
</a></h3>
<pre>DEST[63:0] := DEST[63:0] + SRC[63:0]
DEST[MAXVL-1:64] (Unmodified)
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>VADDSD __m128d _mm_mask_add_sd (__m128d s, __mmask8 k, __m128d a, __m128d b);
</pre>
<pre>VADDSD __m128d _mm_maskz_add_sd (__mmask8 k, __m128d a, __m128d b);
</pre>
<pre>VADDSD __m128d _mm_add_round_sd (__m128d a, __m128d b, int);
</pre>
<pre>VADDSD __m128d _mm_mask_add_round_sd (__m128d s, __mmask8 k, __m128d a, __m128d b, int);
</pre>
<pre>VADDSD __m128d _mm_maskz_add_round_sd (__mmask8 k, __m128d a, __m128d b, int);
</pre>
<pre>ADDSD __m128d _mm_add_sd (__m128d a, __m128d b);
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>Overflow, Underflow, Invalid, Precision, Denormal.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>VEX-encoded instruction, see <span class="not-imported">Table 2-20</span>, “Type 3 Class Exception Conditions.”</p>
<p>EVEX-encoded instruction, see <span class="not-imported">Table 2-47</span>, “Type E3 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

136
x86/addss.html Normal file
View file

@ -0,0 +1,136 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>ADDSS
— Add Scalar Single Precision Floating-Point Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>ADDSS
— Add Scalar Single Precision Floating-Point Values</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op / En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>F3 0F 58 /r ADDSS xmm1, xmm2/m32</td>
<td>A</td>
<td>V/V</td>
<td>SSE</td>
<td>Add the low single precision floating-point value from xmm2/mem to xmm1 and store the result in xmm1.</td></tr>
<tr>
<td>VEX.LIG.F3.0F.WIG 58 /r VADDSS xmm1,xmm2, xmm3/m32</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Add the low single precision floating-point value from xmm3/mem to xmm2 and store the result in xmm1.</td></tr>
<tr>
<td>EVEX.LLIG.F3.0F.W0 58 /r VADDSS xmm1{k1}{z}, xmm2, xmm3/m32{er}</td>
<td>C</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Add the low single precision floating-point value from xmm3/m32 to xmm2 and store the result in xmm1with writemask k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr>
<tr>
<td>C</td>
<td>Tuple1 Scalar</td>
<td>ModRM:reg (w)</td>
<td>EVEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Adds the low single precision floating-point values from the second source operand and the first source operand, and stores the double precision floating-point result in the destination operand.</p>
<p>The second source operand can be an XMM register or a 64-bit memory location. The first source and destination operands are XMM registers.</p>
<p>128-bit Legacy SSE version: The first source and destination operands are the same. Bits (MAXVL-1:32) of the corresponding the destination register remain unchanged.</p>
<p>EVEX and VEX.128 encoded version: The first source operand is encoded by EVEX.vvvv/VEX.vvvv. Bits (127:32) of the XMM register destination are copied from corresponding bits in the first source operand. Bits (MAXVL-1:128) of the destination register are zeroed.</p>
<p>EVEX version: The low doubleword element of the destination is updated according to the writemask.</p>
<p>Software should ensure VADDSS is encoded with VEX.L=0. Encoding VADDSS with VEX.L=1 may encounter unpredictable behavior across different processor generations.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="vaddss--evex-encoded-versions-">VADDSS (EVEX Encoded Versions)<a class="anchor" href="#vaddss--evex-encoded-versions-">
</a></h3>
<pre>IF (EVEX.b = 1) AND SRC2 *is a register*
THEN
SET_ROUNDING_MODE_FOR_THIS_INSTRUCTION(EVEX.RC);
ELSE
SET_ROUNDING_MODE_FOR_THIS_INSTRUCTION(MXCSR.RC);
FI;
IF k1[0] or *no writemask*
THEN DEST[31:0] := SRC1[31:0] + SRC2[31:0]
ELSE
IF *merging-masking* ; merging-masking
THEN *DEST[31:0] remains unchanged*
ELSE ; zeroing-masking
THEN DEST[31:0] := 0
FI;
FI;
DEST[127:32] := SRC1[127:32]
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="vaddss-dest--src1--src2--vex-128-encoded-version-">VADDSS DEST, SRC1, SRC2 (VEX.128 Encoded Version)<a class="anchor" href="#vaddss-dest--src1--src2--vex-128-encoded-version-">
</a></h3>
<pre>DEST[31:0] := SRC1[31:0] + SRC2[31:0]
DEST[127:32] := SRC1[127:32]
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="addss-dest--src--128-bit-legacy-sse-version-">ADDSS DEST, SRC (128-bit Legacy SSE Version)<a class="anchor" href="#addss-dest--src--128-bit-legacy-sse-version-">
</a></h3>
<pre>DEST[31:0] := DEST[31:0] + SRC[31:0]
DEST[MAXVL-1:32] (Unmodified)
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>VADDSS __m128 _mm_mask_add_ss (__m128 s, __mmask8 k, __m128 a, __m128 b);
</pre>
<pre>VADDSS __m128 _mm_maskz_add_ss (__mmask8 k, __m128 a, __m128 b);
</pre>
<pre>VADDSS __m128 _mm_add_round_ss (__m128 a, __m128 b, int);
</pre>
<pre>VADDSS __m128 _mm_mask_add_round_ss (__m128 s, __mmask8 k, __m128 a, __m128 b, int);
</pre>
<pre>VADDSS __m128 _mm_maskz_add_round_ss (__mmask8 k, __m128 a, __m128 b, int);
</pre>
<pre>ADDSS __m128 _mm_add_ss (__m128 a, __m128 b);
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>Overflow, Underflow, Invalid, Precision, Denormal.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>VEX-encoded instruction, see <span class="not-imported">Table 2-20</span>, “Type 3 Class Exception Conditions.”</p>
<p>EVEX-encoded instruction, see <span class="not-imported">Table 2-47</span>, “Type E3 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

135
x86/addsubpd.html Normal file
View file

@ -0,0 +1,135 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>ADDSUBPD
— Packed Double Precision Floating-Point Add/Subtract</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>ADDSUBPD
— Packed Double Precision Floating-Point Add/Subtract</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>66 0F D0 /r ADDSUBPD xmm1, xmm2/m128</td>
<td>RM</td>
<td>V/V</td>
<td>SSE3</td>
<td>Add/subtract double precision floating-point values from xmm2/m128 to xmm1.</td></tr>
<tr>
<td>VEX.128.66.0F.WIG D0 /r VADDSUBPD xmm1, xmm2, xmm3/m128</td>
<td>RVM</td>
<td>V/V</td>
<td>AVX</td>
<td>Add/subtract packed double precision floating-point values from xmm3/mem to xmm2 and stores result in xmm1.</td></tr>
<tr>
<td>VEX.256.66.0F.WIG D0 /r VADDSUBPD ymm1, ymm2, ymm3/m256</td>
<td>RVM</td>
<td>V/V</td>
<td>AVX</td>
<td>Add / subtract packed double precision floating-point values from ymm3/mem to ymm2 and stores result in ymm1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RM</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>RVM</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Adds odd-numbered double precision floating-point values of the first source operand (second operand) with the corresponding double precision floating-point values from the second source operand (third operand); stores the result in the odd-numbered values of the destination operand (first operand). Subtracts the even-numbered double precision floating-point values from the second source operand from the corresponding double precision floating values in the first source operand; stores the result into the even-numbered values of the destination operand.</p>
<p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p>
<p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding YMM register destination are unmodified. See <a href='addsubpd.html#fig-3-3'>Figure 3-3</a>.</p>
<p>VEX.128 encoded version: the first source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding YMM register destination are zeroed.</p>
<p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>
<figure id="fig-3-3">
<svg style="width: 432.018pt; height: 151.206pt" viewBox="128.824 0.0 365.015 131.005">
<g xmlns="http://www.w3.org/2000/svg" style="stroke: none; fill: none">
<rect height="126.00500000000001" style="stroke: rgb(0%, 0%, 0%)" width="360.015" x="131.324" y="0.0"></rect>
<rect height="27.001" style="fill: rgb(0%, 0%, 0%)" width="144.006" x="144.82500000000002" y="25.875999999999976"></rect>
<rect height="27.001" style="stroke: rgb(0%, 0%, 0%)" width="144.006" x="144.82500000000002" y="25.875999999999976"></rect>
<rect height="27.001" style="fill: rgb(0%, 0%, 0%)" width="144.006" x="144.82500000000002" y="74.25300000000004"></rect>
<rect height="27.001" style="stroke: rgb(0%, 0%, 0%)" width="144.006" x="144.82500000000002" y="74.25300000000004"></rect>
<rect height="27.001" style="fill: rgb(0%, 0%, 0%)" width="144.006" x="288.831" y="74.25300000000004"></rect>
<rect height="27.001" style="stroke: rgb(0%, 0%, 0%)" width="144.006" x="288.831" y="74.25300000000004"></rect>
<rect height="27.001" style="fill: rgb(0%, 0%, 0%)" width="144.006" x="288.831" y="25.875999999999976"></rect>
<rect height="27.001" style="stroke: rgb(0%, 0%, 0%)" width="144.006" x="288.831" y="25.875999999999976"></rect>
<path d="M 216.828 69.09300000000007 L 216.828 52.88200000000006" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 219.588 68.73300000000006 L 216.828 74.25300000000004 L 214.06799999999998 68.73300000000006 L 219.588 68.73300000000006" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 360.834 69.09300000000007 L 360.834 52.88200000000006" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 363.594 68.73300000000006 L 360.834 74.25300000000004 L 358.074 68.73300000000006 L 363.594 68.73300000000006" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="116.90838390000019" x="230.38335018" y="18.14341284000011">ADDSUBPD xmm1, xmm2/m128</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="44.009650300000146" x="436.83669189000005" y="43.456362040000045">xmm2/m128</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="28.913084200000014" x="202.3676" y="43.456863">[127:64]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="20.016750600000023" x="350.82370000000003" y="43.456863">[63:0]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="32.30521139999996" x="436.83669189000005" y="87.03319611000006">RESULT:</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="132.73297730000004" x="150.4553" y="91.83386300000006">xmm1[127:64] + xmm2/m128[127:64]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="112.93223480000023" x="304.3718" y="91.83386300000006">xmm1[63:0] - xmm2/m128[63:0]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="21.776816600000075" x="436.83669189000005" y="96.63355611000009">xmm1</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="28.913084200000014" x="202.3775" y="115.48466300000007">[127:64]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="20.016750600000023" x="350.83346689" y="115.48466300000007">[63:0]</text></g></svg>
<figcaption><a href='addsubpd.html#fig-3-3'>Figure 3-3</a>. ADDSUBPD—Packed Double Precision Floating-Point Add/Subtract</figcaption></figure>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="addsubpd--128-bit-legacy-sse-version-">ADDSUBPD (128-bit Legacy SSE Version)<a class="anchor" href="#addsubpd--128-bit-legacy-sse-version-">
</a></h3>
<pre>DEST[63:0] := DEST[63:0] - SRC[63:0]
DEST[127:64] := DEST[127:64] + SRC[127:64]
DEST[MAXVL-1:128] (Unmodified)
</pre>
<h3 id="vaddsubpd--vex-128-encoded-version-">VADDSUBPD (VEX.128 Encoded Version)<a class="anchor" href="#vaddsubpd--vex-128-encoded-version-">
</a></h3>
<pre>DEST[63:0] := SRC1[63:0] - SRC2[63:0]
DEST[127:64] := SRC1[127:64] + SRC2[127:64]
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="vaddsubpd--vex-256-encoded-version-">VADDSUBPD (VEX.256 Encoded Version)<a class="anchor" href="#vaddsubpd--vex-256-encoded-version-">
</a></h3>
<pre>DEST[63:0] := SRC1[63:0] - SRC2[63:0]
DEST[127:64] := SRC1[127:64] + SRC2[127:64]
DEST[191:128] := SRC1[191:128] - SRC2[191:128]
DEST[255:192] := SRC1[255:192] + SRC2[255:192]
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>ADDSUBPD __m128d _mm_addsub_pd(__m128d a, __m128d b)
</pre>
<pre>VADDSUBPD __m256d _mm256_addsub_pd (__m256d a, __m256d b)
</pre>
<h2 class="exceptions" id="exceptions">Exceptions<a class="anchor" href="#exceptions">
</a></h2>
<p>When the source operand is a memory operand, it must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated.</p>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>Overflow, Underflow, Invalid, Precision, Denormal.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>See <span class="not-imported">Table 2-19</span>, “Type 2 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

166
x86/addsubps.html Normal file
View file

@ -0,0 +1,166 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>ADDSUBPS
— Packed Single Precision Floating-Point Add/Subtract</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>ADDSUBPS
— Packed Single Precision Floating-Point Add/Subtract</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>F2 0F D0 /r ADDSUBPS xmm1, xmm2/m128</td>
<td>RM</td>
<td>V/V</td>
<td>SSE3</td>
<td>Add/subtract single precision floating-point values from xmm2/m128 to xmm1.</td></tr>
<tr>
<td>VEX.128.F2.0F.WIG D0 /r VADDSUBPS xmm1, xmm2, xmm3/m128</td>
<td>RVM</td>
<td>V/V</td>
<td>AVX</td>
<td>Add/subtract single precision floating-point values from xmm3/mem to xmm2 and stores result in xmm1.</td></tr>
<tr>
<td>VEX.256.F2.0F.WIG D0 /r VADDSUBPS ymm1, ymm2, ymm3/m256</td>
<td>RVM</td>
<td>V/V</td>
<td>AVX</td>
<td>Add / subtract single precision floating-point values from ymm3/mem to ymm2 and stores result in ymm1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RM</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>RVM</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Adds odd-numbered single precision floating-point values of the first source operand (second operand) with the corresponding single precision floating-point values from the second source operand (third operand); stores the result in the odd-numbered values of the destination operand (first operand). Subtracts the even-numbered single precision floating-point values from the second source operand from the corresponding single precision floating values in the first source operand; stores the result into the even-numbered values of the destination operand.</p>
<p>In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).</p>
<p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding YMM register destination are unmodified. See <a href='addsubps.html#fig-3-4'>Figure 3-4</a>.</p>
<p>VEX.128 encoded version: the first source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding YMM register destination are zeroed.</p>
<p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>
<figure id="fig-3-4">
<svg style="width: 432.018pt; height: 167.40719999999996pt" viewBox="115.111 0.0 365.015 144.50599999999997">
<g xmlns="http://www.w3.org/2000/svg" style="fill: none; stroke: none">
<rect height="139.506" style="stroke: rgb(0%, 0%, 0%)" width="360.015" x="117.611" y="0.0"></rect>
<path d="M 394.372 69.09300000000007 L 394.372 52.88200000000006" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 397.132 68.73300000000006 L 394.372 74.25300000000004 L 391.612 68.73300000000006 L 397.132 68.73300000000006" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<rect height="38.252" style="fill: rgb(0%, 0%, 0%)" width="76.503" x="356.12" y="74.25300000000004"></rect>
<rect height="38.252" style="stroke: rgb(0%, 0%, 0%)" width="76.503" x="356.12" y="74.25300000000004"></rect>
<rect height="27.001" style="fill: rgb(0%, 0%, 0%)" width="76.503" x="356.12" y="25.875999999999976"></rect>
<rect height="27.001" style="stroke: rgb(0%, 0%, 0%)" width="76.503" x="356.12" y="25.875999999999976"></rect>
<path d="M 317.869 69.09300000000007 L 317.869 52.88200000000006" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 320.629 68.73300000000006 L 317.869 74.25300000000004 L 315.10900000000004 68.73300000000006 L 320.629 68.73300000000006" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<rect height="38.252" style="fill: rgb(0%, 0%, 0%)" width="76.503" x="279.617" y="74.25300000000004"></rect>
<rect height="38.252" style="stroke: rgb(0%, 0%, 0%)" width="76.503" x="279.617" y="74.25300000000004"></rect>
<rect height="27.001" style="fill: rgb(0%, 0%, 0%)" width="76.503" x="279.617" y="25.875999999999976"></rect>
<rect height="27.001" style="stroke: rgb(0%, 0%, 0%)" width="76.503" x="279.617" y="25.875999999999976"></rect>
<path d="M 241.366 69.09300000000007 L 241.366 52.88200000000006" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 244.126 68.73300000000006 L 241.366 74.25300000000004 L 238.606 68.73300000000006 L 244.126 68.73300000000006" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<rect height="38.252" style="fill: rgb(0%, 0%, 0%)" width="76.503" x="203.114" y="74.25300000000004"></rect>
<rect height="38.252" style="stroke: rgb(0%, 0%, 0%)" width="76.503" x="203.114" y="74.25300000000004"></rect>
<rect height="27.001" style="fill: rgb(0%, 0%, 0%)" width="76.503" x="203.114" y="25.875999999999976"></rect>
<rect height="27.001" style="stroke: rgb(0%, 0%, 0%)" width="76.503" x="203.114" y="25.875999999999976"></rect>
<path d="M 164.862 69.09300000000007 L 164.862 52.88200000000006" style="fill-rule: nonzero; stroke: rgb(0%, 0%, 0%)"></path>
<path d="M 167.623 68.73300000000006 L 164.863 74.25300000000004 L 162.10299999999998 68.73300000000006 L 167.623 68.73300000000006" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<rect height="38.252" style="fill: rgb(0%, 0%, 0%)" width="76.503" x="126.611" y="74.25300000000004"></rect>
<rect height="38.252" style="stroke: rgb(0%, 0%, 0%)" width="76.503" x="126.611" y="74.25300000000004"></rect>
<rect height="27.001" style="fill: rgb(0%, 0%, 0%)" width="76.503" x="126.611" y="25.875999999999976"></rect>
<rect height="27.001" style="stroke: rgb(0%, 0%, 0%)" width="76.503" x="126.611" y="25.875999999999976"></rect>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="116.4683674000002" x="221.3897" y="18.14386300000001">ADDSUBPS xmm1, xmm2/m128</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="24.00090000000006" x="437.07378785000003" y="38.65743223000004">xmm2/</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="28.913084200000014" x="150.4017" y="43.456863">[127:96]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="24.46491740000002" x="229.135" y="43.456863">[95:64]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="24.464917400000047" x="305.6381" y="43.456863">[63:32]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="20.016750600000023" x="384.3613" y="43.456863">[31:0]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="20.008750300000088" x="437.07378785000003" y="48.257792230000064">m128</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="32.30521139999996" x="437.07378785000003" y="92.65945723000004">RESULT:</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.063515733038685pt; fill: #000" textLength="52.55836626360113" x="139.5214" y="92.58567721807572">xmm1[127:96] +</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.063515733038685pt; fill: #000" textLength="70.42132331726216" x="207.3341" y="92.58567721807572">xmm1[95:64] - xmm2/</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.063515733038685pt; fill: #000" textLength="48.51457826360098" x="294.4876" y="92.58567721807572">xmm1[63:32] +</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.063515733038685pt; fill: #000" textLength="42.55616558866285" x="373.8308" y="92.58567721807572">xmm1[31:0] -</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="21.776816600000075" x="437.07378785000003" y="102.25981723000007">xmm1</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.063515733038685pt; fill: #000" textLength="66.39208131726227" x="132.7713287" y="102.18603721807574">xmm2/m128[127:96]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.063515733038685pt; fill: #000" textLength="40.52929331726219" x="221.7950059" y="102.18603721807574">m128[95:64]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.063515733038685pt; fill: #000" textLength="62.34829331726206" x="287.7375287" y="102.18603721807574">xmm2/m128[63:32]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.063515733038685pt; fill: #000" textLength="58.304505317262056" x="366.2006957" y="102.18603721807574">xmm2/m128[31:0]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="28.913084200000014" x="150.41210039" y="125.53514082000004">[127:96]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="24.46491740000002" x="229.14545278000003" y="125.53514082000004">[95:64]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="24.464917400000047" x="305.64832153000003" y="125.53514082000004">[63:32]</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 7.408277799999951pt; fill: #000" textLength="20.016750600000023" x="384.3712735300001" y="125.53514082000004">[31:0]</text></g></svg>
<figcaption><a href='addsubps.html#fig-3-4'>Figure 3-4</a>. ADDSUBPS—Packed Single Precision Floating-Point Add/Subtract</figcaption></figure>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="addsubps--128-bit-legacy-sse-version-">ADDSUBPS (128-bit Legacy SSE Version)<a class="anchor" href="#addsubps--128-bit-legacy-sse-version-">
</a></h3>
<pre>DEST[31:0] := DEST[31:0] - SRC[31:0]
DEST[63:32] := DEST[63:32] + SRC[63:32]
DEST[95:64] := DEST[95:64] - SRC[95:64]
DEST[127:96] := DEST[127:96] + SRC[127:96]
DEST[MAXVL-1:128] (Unmodified)
</pre>
<h3 id="vaddsubps--vex-128-encoded-version-">VADDSUBPS (VEX.128 Encoded Version)<a class="anchor" href="#vaddsubps--vex-128-encoded-version-">
</a></h3>
<pre>DEST[31:0] := SRC1[31:0] - SRC2[31:0]
DEST[63:32] := SRC1[63:32] + SRC2[63:32]
DEST[95:64] := SRC1[95:64] - SRC2[95:64]
DEST[127:96] := SRC1[127:96] + SRC2[127:96]
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="vaddsubps--vex-256-encoded-version-">VADDSUBPS (VEX.256 Encoded Version)<a class="anchor" href="#vaddsubps--vex-256-encoded-version-">
</a></h3>
<pre>DEST[31:0] := SRC1[31:0] - SRC2[31:0]
DEST[63:32] := SRC1[63:32] + SRC2[63:32]
DEST[95:64] := SRC1[95:64] - SRC2[95:64]
DEST[127:96] := SRC1[127:96] + SRC2[127:96]
DEST[159:128] := SRC1[159:128] - SRC2[159:128]
DEST[191:160] := SRC1[191:160] + SRC2[191:160]
DEST[223:192] := SRC1[223:192] - SRC2[223:192]
DEST[255:224] := SRC1[255:224] + SRC2[255:224]
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>ADDSUBPS __m128 _mm_addsub_ps(__m128 a, __m128 b)
</pre>
<pre>VADDSUBPS __m256 _mm256_addsub_ps (__m256 a, __m256 b)
</pre>
<h2 class="exceptions" id="exceptions">Exceptions<a class="anchor" href="#exceptions">
</a></h2>
<p>When the source operand is a memory operand, the operand must be aligned on a 16-byte boundary or a general-protection exception (#GP) will be generated.</p>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>Overflow, Underflow, Invalid, Precision, Denormal.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>See <span class="not-imported">Table 2-19</span>, “Type 2 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

160
x86/adox.html Normal file
View file

@ -0,0 +1,160 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>ADOX
— Unsigned Integer Addition of Two Operands With Overflow Flag</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>ADOX
— Unsigned Integer Addition of Two Operands With Overflow Flag</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>F3 0F 38 F6 /r ADOX r32, r/m32</td>
<td>RM</td>
<td>V/V</td>
<td>ADX</td>
<td>Unsigned addition of r32 with OF, r/m32 to r32, writes OF.</td></tr>
<tr>
<td>F3 REX.w 0F 38 F6 /r ADOX r64, r/m64</td>
<td>RM</td>
<td>V/N.E.</td>
<td>ADX</td>
<td>Unsigned addition of r64 with OF, r/m64 to r64, writes OF.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RM</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Performs an unsigned addition of the destination operand (first operand), the source operand (second operand) and the overflow-flag (OF) and stores the result in the destination operand. The destination operand is a general-purpose register, whereas the source operand can be a general-purpose register or memory location. The state of OF represents a carry from a previous addition. The instruction sets the OF flag with the carry generated by the unsigned addition of the operands.</p>
<p>The ADOX instruction is executed in the context of multi-precision addition, where we add a series of operands with a carry-chain. At the beginning of a chain of additions, we execute an instruction to zero the OF (e.g. XOR).</p>
<p>This instruction is supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode.</p>
<p>In 64-bit mode, the default operation size is 32 bits. Using a REX Prefix in the form of REX.R permits access to additional registers (R8-15). Using REX Prefix in the form of REX.W promotes operation to 64-bits.</p>
<p>ADOX executes normally either inside or outside a transaction region.</p>
<p>Note: ADOX defines the CF and OF flags differently than the ADD/ADC instructions as defined in Intel<sup>®</sup> 64 and IA-32 Architectures Software Developers Manual, Volume 2A.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>IF OperandSize is 64-bit
THEN OF:DEST[63:0] := DEST[63:0] + SRC[63:0] + OF;
ELSE OF:DEST[31:0] := DEST[31:0] + SRC[31:0] + OF;
FI;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>OF is updated based on result. CF, SF, ZF, AF, and PF flags are unmodified.</p>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>unsigned char _addcarryx_u32 (unsigned char c_in, unsigned int src1, unsigned int src2, unsigned int *sum_out);
</pre>
<pre>unsigned char _addcarryx_u64 (unsigned char c_in, unsigned __int64 src1, unsigned __int64 src2, unsigned __int64 *sum_out);
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="2">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If CPUID.(EAX=07H, ECX=0H):EBX.ADX[bit 19] = 0.</td></tr>
<tr>
<td>#SS(0)</td>
<td>For an illegal address in the SS segment.</td></tr>
<tr>
<td rowspan="2">#GP(0)</td>
<td>For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments.</td></tr>
<tr>
<td>If the DS, ES, FS, or GS register is used to access memory and it contains a null segment selector.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>For a page fault.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="2">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If CPUID.(EAX=07H, ECX=0H):EBX.ADX[bit 19] = 0.</td></tr>
<tr>
<td>#SS(0)</td>
<td>For an illegal address in the SS segment.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If any part of the operand lies outside the effective address space from 0 to FFFFH.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="2">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If CPUID.(EAX=07H, ECX=0H):EBX.ADX[bit 19] = 0.</td></tr>
<tr>
<td>#SS(0)</td>
<td>For an illegal address in the SS segment.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If any part of the operand lies outside the effective address space from 0 to FFFFH.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>For a page fault.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="2">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If CPUID.(EAX=07H, ECX=0H):EBX.ADX[bit 19] = 0.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory address referencing the SS segment is in a non-canonical form.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If the memory address is in a non-canonical form.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>For a page fault.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

149
x86/aesdec.html Normal file
View file

@ -0,0 +1,149 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>AESDEC
— Perform One Round of an AES Decryption Flow</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>AESDEC
— Perform One Round of an AES Decryption Flow</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>66 0F 38 DE /r AESDEC xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>AES</td>
<td>Perform one round of an AES decryption flow, using the Equivalent Inverse Cipher, using one 128-bit data (state) from xmm1 with one 128-bit round key from xmm2/m128.</td></tr>
<tr>
<td>VEX.128.66.0F38.WIG DE /r VAESDEC xmm1, xmm2, xmm3/m128</td>
<td>B</td>
<td>V/V</td>
<td>AES AVX</td>
<td>Perform one round of an AES decryption flow, using the Equivalent Inverse Cipher, using one 128-bit data (state) from xmm2 with one 128-bit round key from xmm3/m128; store the result in xmm1.</td></tr>
<tr>
<td>VEX.256.66.0F38.WIG DE /r VAESDEC ymm1, ymm2, ymm3/m256</td>
<td>B</td>
<td>V/V</td>
<td>VAES</td>
<td>Perform one round of an AES decryption flow, using the Equivalent Inverse Cipher, using two 128-bit data (state) from ymm2 with two 128-bit round keys from ymm3/m256; store the result in ymm1.</td></tr>
<tr>
<td>EVEX.128.66.0F38.WIG DE /r VAESDEC xmm1, xmm2, xmm3/m128</td>
<td>C</td>
<td>V/V</td>
<td>VAES AVX512VL</td>
<td>Perform one round of an AES decryption flow, using the Equivalent Inverse Cipher, using one 128-bit data (state) from xmm2 with one 128-bit round key from xmm3/m128; store the result in xmm1.</td></tr>
<tr>
<td>EVEX.256.66.0F38.WIG DE /r VAESDEC ymm1, ymm2, ymm3/m256</td>
<td>C</td>
<td>V/V</td>
<td>VAES AVX512VL</td>
<td>Perform one round of an AES decryption flow, using the Equivalent Inverse Cipher, using two 128-bit data (state) from ymm2 with two 128-bit round keys from ymm3/m256; store the result in ymm1.</td></tr>
<tr>
<td>EVEX.512.66.0F38.WIG DE /r VAESDEC zmm1, zmm2, zmm3/m512</td>
<td>C</td>
<td>V/V</td>
<td>VAES AVX512F</td>
<td>Perform one round of an AES decryption flow, using the Equivalent Inverse Cipher, using four 128-bit data (state) from zmm2 with four 128-bit round keys from zmm3/m512; store the result in zmm1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr>
<tr>
<td>C</td>
<td>Full Mem</td>
<td>ModRM:reg (w)</td>
<td>EVEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>This instruction performs a single round of the AES decryption flow using the Equivalent Inverse Cipher, using one/two/four (depending on vector length) 128-bit data (state) from the first source operand with one/two/four (depending on vector length) round key(s) from the second source operand, and stores the result in the destination operand.</p>
<p>Use the AESDEC instruction for all but the last decryption round. For the last decryption round, use the AESDECLAST instruction.</p>
<p>VEX and EVEX encoded versions of the instruction allow 3-operand (non-destructive) operation. The legacy encoded versions of the instruction require that the first source operand and the destination operand are the same and must be an XMM register.</p>
<p>The EVEX encoded form of this instruction does not support memory fault suppression.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="aesdec">AESDEC<a class="anchor" href="#aesdec">
</a></h3>
<pre>STATE := SRC1;
RoundKey := SRC2;
STATE := InvShiftRows( STATE );
STATE := InvSubBytes( STATE );
STATE := InvMixColumns( STATE );
DEST[127:0] := STATE XOR RoundKey;
DEST[MAXVL-1:128] (Unmodified)
</pre>
<h3 id="vaesdec--128b-and-256b-vex-encoded-versions-">VAESDEC (128b and 256b VEX Encoded Versions)<a class="anchor" href="#vaesdec--128b-and-256b-vex-encoded-versions-">
</a></h3>
<pre>(KL,VL) = (1,128), (2,256)
FOR i = 0 to KL-1:
STATE := SRC1.xmm[i]
RoundKey := SRC2.xmm[i]
STATE := InvShiftRows( STATE )
STATE := InvSubBytes( STATE )
STATE := InvMixColumns( STATE )
DEST.xmm[i] := STATE XOR RoundKey
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="vaesdec--evex-encoded-version-">VAESDEC (EVEX Encoded Version)<a class="anchor" href="#vaesdec--evex-encoded-version-">
</a></h3>
<pre>(KL,VL) = (1,128), (2,256), (4,512)
FOR i = 0 to KL-1:
STATE := SRC1.xmm[i]
RoundKey := SRC2.xmm[i]
STATE := InvShiftRows( STATE )
STATE := InvSubBytes( STATE )
STATE := InvMixColumns( STATE )
DEST.xmm[i] := STATE XOR RoundKey
DEST[MAXVL-1:VL] :=0
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>(V)AESDEC __m128i _mm_aesdec (__m128i, __m128i)
</pre>
<pre>VAESDEC __m256i _mm256_aesdec_epi128(__m256i, __m256i);
</pre>
<pre>VAESDEC __m512i _mm512_aesdec_epi128(__m512i, __m512i);
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>See <span class="not-imported">Table 2-21</span>, “Type 4 Class Exception Conditions.”</p>
<p>EVEX-encoded: See <span class="not-imported">Table 2-50</span>, “Type E4NF Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

98
x86/aesdec128kl.html Normal file
View file

@ -0,0 +1,98 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>AESDEC128KL
— Perform Ten Rounds of AES Decryption Flow With Key Locker Using 128-BitKey</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>AESDEC128KL
— Perform Ten Rounds of AES Decryption Flow With Key Locker Using 128-BitKey</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>F3 0F 38 DD !(11):rrr:bbb AESDEC128KL xmm, m384</td>
<td>A</td>
<td>V/V</td>
<td>AESKLE</td>
<td>Decrypt xmm using 128-bit AES key indicated by handle at m384 and store result in xmm.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>The AESDEC128KL<sup>1</sup> instruction performs 10 rounds of AES to decrypt the first operand using the 128-bit key indicated by the handle from the second operand. It stores the result in the first operand if the operation succeeds (e.g., does not run into a handle violation failure).</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h4 id="aesdec128kl">AESDEC128KL<a class="anchor" href="#aesdec128kl">
</a></h4>
<pre>Handle := UnalignedLoad of 384 bit (SRC); // Load is not guaranteed to be atomic.
Illegal Handle = (HandleReservedBitSet (Handle) ||
(Handle[0] AND (CPL &gt; 0)) ||
Handle [2] ||
HandleKeyType (Handle) != HANDLE_KEY_TYPE_AES128);
IF (Illegal Handle) {
THEN RFLAGS.ZF := 1;
ELSE
(UnwrappedKey, Authentic) := UnwrapKeyAndAuthenticate384 (Handle[383:0], IWKey);
IF (Authentic == 0)
THEN RFLAGS.ZF := 1;
ELSE
DEST := AES128Decrypt (DEST, UnwrappedKey) ;
RFLAGS.ZF := 0;
FI;
FI;
RFLAGS.OF, SF, AF, PF, CF := 0;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>ZF is set to 0 if the operation succeeded and set to 1 if the operation failed due to a handle violation. The other arithmetic flags (OF, SF, AF, PF, CF) are cleared to 0.</p>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>AESDEC128KL unsigned char _mm_aesdec128kl_u8(__m128i* odata, __m128i idata, const void* h);
</pre>
<pre>1. Further details on Key Locker and usage of this instruction can be found here:
</pre>
<h3 id="https---software-intel-com-content-www-us-en-develop-download-intel-key-locker-specification-html-">https://software.intel.com/content/www/us/en/develop/download/intel-key-locker-specification.html.<a class="anchor" href="#https---software-intel-com-content-www-us-en-develop-download-intel-key-locker-specification-html-">
</a></h3>
<h2 id="exceptions--all-operating-modes-">Exceptions (All Operating Modes)<a class="anchor" href="#exceptions--all-operating-modes-">
</a></h2>
<p>#UD If the LOCK prefix is used.</p>
<p>If CPUID.07H:ECX.KL[bit 23] = 0.</p>
<p>If CR4.KL = 0.</p>
<p>If CPUID.19H:EBX.AESKLE[bit 0] = 0.</p>
<p>If CR0.EM = 1.</p>
<p>If CR4.OSFXSR = 0.</p>
<p>#NM If CR0.TS = 1.</p>
<p>#PF If a page fault occurs.</p>
<p>#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</p>
<p>If the DS, ES, FS, or GS register is used to access memory and it contains a NULL segment selector.</p>
<p>If the memory address is in a non-canonical form.</p>
<p>#SS(0) If a memory operand effective address is outside the SS segment limit.</p>
<p>If a memory address referencing the SS segment is in a non-canonical form.</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

98
x86/aesdec256kl.html Normal file
View file

@ -0,0 +1,98 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>AESDEC256KL
— Perform 14 Rounds of AES Decryption Flow With Key Locker Using 256-Bit Key</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>AESDEC256KL
— Perform 14 Rounds of AES Decryption Flow With Key Locker Using 256-Bit Key</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>F3 0F 38 DF !(11):rrr:bbb AESDEC256KL xmm, m512</td>
<td>A</td>
<td>V/V</td>
<td>AESKLE</td>
<td>Decrypt xmm using 256-bit AES key indicated by handle at m512 and store result in xmm.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>The AESDEC256KL<sup>1</sup> instruction performs 14 rounds of AES to decrypt the first operand using the 256-bit key indicated by the handle from the second operand. It stores the result in the first operand if the operation succeeds (e.g., does not run into a handle violation failure).</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h4 id="aesdec256kl">AESDEC256KL<a class="anchor" href="#aesdec256kl">
</a></h4>
<pre>Handle := UnalignedLoad of 512 bit (SRC); // Load is not guaranteed to be atomic.
Illegal Handle = (HandleReservedBitSet (Handle) ||
(Handle[0] AND (CPL &gt; 0)) ||
Handle [2] ||
HandleKeyType (Handle) != HANDLE_KEY_TYPE_AES256);
IF (Illegal Handle)
THEN RFLAGS.ZF := 1;
ELSE
(UnwrappedKey, Authentic) := UnwrapKeyAndAuthenticate512 (Handle[511:0], IWKey);
IF (Authentic == 0)
THEN RFLAGS.ZF := 1;
ELSE
DEST := AES256Decrypt (DEST, UnwrappedKey) ;
RFLAGS.ZF := 0;
FI;
FI;
RFLAGS.OF, SF, AF, PF, CF := 0;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>ZF is set to 0 if the operation succeeded and set to 1 if the operation failed due to a handle violation. The other arithmetic flags (OF, SF, AF, PF, CF) are cleared to 0.</p>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>AESDEC256KL unsigned char _mm_aesdec256kl_u8(__m128i* odata, __m128i idata, const void* h);
</pre>
<pre>1. Further details on Key Locker and usage of this instruction can be found here:
</pre>
<h3 id="https---software-intel-com-content-www-us-en-develop-download-intel-key-locker-specification-html-">https://software.intel.com/content/www/us/en/develop/download/intel-key-locker-specification.html.<a class="anchor" href="#https---software-intel-com-content-www-us-en-develop-download-intel-key-locker-specification-html-">
</a></h3>
<h2 id="exceptions--all-operating-modes-">Exceptions (All Operating Modes)<a class="anchor" href="#exceptions--all-operating-modes-">
</a></h2>
<p>#UD If the LOCK prefix is used.</p>
<p>If CPUID.07H:ECX.KL[bit 23] = 0.</p>
<p>If CR4.KL = 0.</p>
<p>If CPUID.19H:EBX.AESKLE[bit 0] = 0.</p>
<p>If CR0.EM = 1.</p>
<p>If CR4.OSFXSR = 0.</p>
<p>#NM If CR0.TS = 1.</p>
<p>#PF If a page fault occurs.</p>
<p>#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</p>
<p>If the DS, ES, FS, or GS register is used to access memory and it contains a NULL segment selector.</p>
<p>If the memory address is in a non-canonical form.</p>
<p>#SS(0) If a memory operand effective address is outside the SS segment limit.</p>
<p>If a memory address referencing the SS segment is in a non-canonical form.</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

145
x86/aesdeclast.html Normal file
View file

@ -0,0 +1,145 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>AESDECLAST
— Perform Last Round of an AES Decryption Flow</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>AESDECLAST
— Perform Last Round of an AES Decryption Flow</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>66 0F 38 DF /r AESDECLAST xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>AES</td>
<td>Perform the last round of an AES decryption flow, using the Equivalent Inverse Cipher, using one 128-bit data (state) from xmm1 with one 128-bit round key from xmm2/m128.</td></tr>
<tr>
<td>VEX.128.66.0F38.WIG DF /r VAESDECLAST xmm1, xmm2, xmm3/m128</td>
<td>B</td>
<td>V/V</td>
<td>AES AVX</td>
<td>Perform the last round of an AES decryption flow, using the Equivalent Inverse Cipher, using one 128-bit data (state) from xmm2 with one 128-bit round key from xmm3/m128; store the result in xmm1.</td></tr>
<tr>
<td>VEX.256.66.0F38.WIG DF /r VAESDECLAST ymm1, ymm2, ymm3/m256</td>
<td>B</td>
<td>V/V</td>
<td>VAES</td>
<td>Perform the last round of an AES decryption flow, using the Equivalent Inverse Cipher, using two 128-bit data (state) from ymm2 with two 128-bit round keys from ymm3/m256; store the result in ymm1.</td></tr>
<tr>
<td>EVEX.128.66.0F38.WIG DF /r VAESDECLAST xmm1, xmm2, xmm3/m128</td>
<td>C</td>
<td>V/V</td>
<td>VAES AVX512VL</td>
<td>Perform the last round of an AES decryption flow, using the Equivalent Inverse Cipher, using one 128-bit data (state) from xmm2 with one 128-bit round key from xmm3/m128; store the result in xmm1.</td></tr>
<tr>
<td>EVEX.256.66.0F38.WIG DF /r VAESDECLAST ymm1, ymm2, ymm3/m256</td>
<td>C</td>
<td>V/V</td>
<td>VAES AVX512VL</td>
<td>Perform the last round of an AES decryption flow, using the Equivalent Inverse Cipher, using two 128-bit data (state) from ymm2 with two 128-bit round keys from ymm3/m256; store the result in ymm1.</td></tr>
<tr>
<td>EVEX.512.66.0F38.WIG DF /r VAESDECLAST zmm1, zmm2, zmm3/m512</td>
<td>C</td>
<td>V/V</td>
<td>VAES AVX512F</td>
<td>Perform the last round of an AES decryption flow, using the Equivalent Inverse Cipher, using four128-bit data (state) from zmm2 with four 128-bit round keys from zmm3/m512; store the result in zmm1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr>
<tr>
<td>C</td>
<td>Full Mem</td>
<td>ModRM:reg (w)</td>
<td>EVEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>This instruction performs the last round of the AES decryption flow using the Equivalent Inverse Cipher, using one/two/four (depending on vector length) 128-bit data (state) from the first source operand with one/two/four (depending on vector length) round key(s) from the second source operand, and stores the result in the destination operand.</p>
<p>VEX and EVEX encoded versions of the instruction allow 3-operand (non-destructive) operation. The legacy encoded versions of the instruction require that the first source operand and the destination operand are the same and must be an XMM register.</p>
<p>The EVEX encoded form of this instruction does not support memory fault suppression.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="aesdeclast">AESDECLAST<a class="anchor" href="#aesdeclast">
</a></h3>
<pre>STATE := SRC1;
RoundKey := SRC2;
STATE := InvShiftRows( STATE );
STATE := InvSubBytes( STATE );
DEST[127:0] := STATE XOR RoundKey;
DEST[MAXVL-1:128] (Unmodified)
</pre>
<h3 id="vaesdeclast--128b-and-256b-vex-encoded-versions-">VAESDECLAST (128b and 256b VEX Encoded Versions)<a class="anchor" href="#vaesdeclast--128b-and-256b-vex-encoded-versions-">
</a></h3>
<pre>(KL,VL) = (1,128), (2,256)
FOR i = 0 to KL-1:
STATE := SRC1.xmm[i]
RoundKey := SRC2.xmm[i]
STATE := InvShiftRows( STATE )
STATE := InvSubBytes( STATE )
DEST.xmm[i] := STATE XOR RoundKey
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="vaesdeclast--evex-encoded-version-">VAESDECLAST (EVEX Encoded Version)<a class="anchor" href="#vaesdeclast--evex-encoded-version-">
</a></h3>
<pre>(KL,VL) = (1,128), (2,256), (4,512)
FOR i = 0 to KL-1:
STATE := SRC1.xmm[i]
RoundKey := SRC2.xmm[i]
STATE := InvShiftRows( STATE )
STATE := InvSubBytes( STATE )
DEST.xmm[i] := STATE XOR RoundKey
DEST[MAXVL-1:VL] := 0
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>(V)AESDECLAST __m128i _mm_aesdeclast (__m128i, __m128i)
</pre>
<pre>VAESDECLAST __m256i _mm256_aesdeclast_epi128(__m256i, __m256i);
</pre>
<pre>VAESDECLAST __m512i _mm512_aesdeclast_epi128(__m512i, __m512i);
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>See <span class="not-imported">Table 2-21</span>, “Type 4 Class Exception Conditions.”</p>
<p>EVEX-encoded: See <span class="not-imported">Table 2-50</span>, “Type E4NF Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

101
x86/aesdecwide128kl.html Normal file
View file

@ -0,0 +1,101 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>AESDECWIDE128KL
— Perform Ten Rounds of AES Decryption Flow With Key Locker on 8 BlocksUsing 128-Bit Key</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>AESDECWIDE128KL
— Perform Ten Rounds of AES Decryption Flow With Key Locker on 8 BlocksUsing 128-Bit Key</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>F3 0F 38 D8 !(11):001:bbb AESDECWIDE128KL m384, &lt;XMM0-7&gt;</td>
<td>A</td>
<td>V/V</td>
<td>AESKLEWIDE_KL</td>
<td>Decrypt XMM0-7 using 128-bit AES key indicated by handle at m384 and store each resultant block back to its corresponding register.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple</th>
<th>Operand 1</th>
<th>Operands 2—9</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:r/m (r)</td>
<td>Implicit XMM0-7 (r, w)</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>The AESDECWIDE128KL<sup>1</sup> instruction performs ten rounds of AES to decrypt each of the eight blocks in XMM0-7 using the 128-bit key indicated by the handle from the second operand. It replaces each input block in XMM0-7 with its corresponding decrypted block if the operation succeeds (e.g., does not run into a handle violation failure).</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h4 id="aesdecwide128kl">AESDECWIDE128KL<a class="anchor" href="#aesdecwide128kl">
</a></h4>
<pre>Handle := UnalignedLoad of 384 bit (SRC); // Load is not guaranteed to be atomic.
Illegal Handle = (HandleReservedBitSet (Handle) ||
(Handle[0] AND (CPL &gt; 0)) ||
Handle [2] ||
HandleKeyType (Handle) != HANDLE_KEY_TYPE_AES128);
IF (Illegal Handle)
THEN RFLAGS.ZF := 1;
ELSE
(UnwrappedKey, Authentic) := UnwrapKeyAndAuthenticate384 (Handle[383:0], IWKey);
IF Authentic == 0 {
THEN RFLAGS.ZF := 1;
ELSE
XMM0 := AES128Decrypt (XMM0, UnwrappedKey) ;
XMM1 := AES128Decrypt (XMM1, UnwrappedKey) ;
XMM2 := AES128Decrypt (XMM2, UnwrappedKey) ;
XMM3 := AES128Decrypt (XMM3, UnwrappedKey) ;
XMM4 := AES128Decrypt (XMM4, UnwrappedKey) ;
XMM5 := AES128Decrypt (XMM5, UnwrappedKey) ;
XMM6 := AES128Decrypt (XMM6, UnwrappedKey) ;
XMM7 := AES128Decrypt (XMM7, UnwrappedKey) ;
RFLAGS.ZF := 0;
FI;
FI;
RFLAGS.OF, SF, AF, PF, CF := 0;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>ZF is set to 0 if the operation succeeded and set to 1 if the operation failed due to a handle violation. The other arithmetic flags (OF, SF, AF, PF, CF) are cleared to 0.</p>
<p>1. Further details on Key Locker and usage of this instruction can be found here:</p>
<h3 id="https---software-intel-com-content-www-us-en-develop-download-intel-key-locker-specification-html-">https://software.intel.com/content/www/us/en/develop/download/intel-key-locker-specification.html.<a class="anchor" href="#https---software-intel-com-content-www-us-en-develop-download-intel-key-locker-specification-html-">
</a></h3>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>AESDECWIDE128KLunsigned char _mm_aesdecwide128kl_u8(__m128i odata[8], const __m128i idata[8], const void* h);
</pre>
<h2 id="exceptions--all-operating-modes-">Exceptions (All Operating Modes)<a class="anchor" href="#exceptions--all-operating-modes-">
</a></h2>
<p>#UD If the LOCK prefix is used.</p>
<p>If CPUID.07H:ECX.KL[bit 23] = 0.</p>
<p>If CR4.KL = 0.</p>
<p>If CPUID.19H:EBX.AESKLE[bit 0] = 0.</p>
<p>If CR0.EM = 1.</p>
<p>If CR4.OSFXSR = 0.</p>
<p>If CPUID.19H:EBX.WIDE_KL[bit 2] = 0.</p>
<p>#NM If CR0.TS = 1.</p>
<p>#PF If a page fault occurs.</p>
<p>#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</p>
<p>If the DS, ES, FS, or GS register is used to access memory and it contains a NULL segment selector.</p>
<p>If the memory address is in a non-canonical form.</p>
<p>#SS(0) If a memory operand effective address is outside the SS segment limit.</p>
<p>If a memory address referencing the SS segment is in a non-canonical form.</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

101
x86/aesdecwide256kl.html Normal file
View file

@ -0,0 +1,101 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>AESDECWIDE256KL
— Perform 14 Rounds of AES Decryption Flow With Key Locker on 8 BlocksUsing 256-Bit Key</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>AESDECWIDE256KL
— Perform 14 Rounds of AES Decryption Flow With Key Locker on 8 BlocksUsing 256-Bit Key</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>F3 0F 38 D8 !(11):011:bbb AESDECWIDE256KL m512, &lt;XMM0-7&gt;</td>
<td>A</td>
<td>V/V</td>
<td>AESKLEWIDE_KL</td>
<td>Decrypt XMM0-7 using 256-bit AES key indicated by handle at m512 and store each resultant block back to its corresponding register.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple</th>
<th>Operand 1</th>
<th>Operands 2—9</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:r/m (r)</td>
<td>Implicit XMM0-7 (r, w)</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>The AESDECWIDE256KL<sup>1</sup> instruction performs 14 rounds of AES to decrypt each of the eight blocks in XMM0-7 using the 256-bit key indicated by the handle from the second operand. It replaces each input block in XMM0-7 with its corresponding decrypted block if the operation succeeds (e.g., does not run into a handle violation failure).</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h4 id="aesdecwide256kl">AESDECWIDE256KL<a class="anchor" href="#aesdecwide256kl">
</a></h4>
<pre>Handle := UnalignedLoad of 512 bit (SRC); // Load is not guaranteed to be atomic.
Illegal Handle = (HandleReservedBitSet (Handle) ||
(Handle[0] AND (CPL &gt; 0)) ||
Handle [2] ||
HandleKeyType (Handle) != HANDLE_KEY_TYPE_AES256);
IF (Illegal Handle) {
THEN RFLAGS.ZF := 1;
ELSE
(UnwrappedKey, Authentic) := UnwrapKeyAndAuthenticate512 (Handle[511:0], IWKey);
IF (Authentic == 0)
THEN RFLAGS.ZF := 1;
ELSE
XMM0 := AES256Decrypt (XMM0, UnwrappedKey) ;
XMM1 := AES256Decrypt (XMM1, UnwrappedKey) ;
XMM2 := AES256Decrypt (XMM2, UnwrappedKey) ;
XMM3 := AES256Decrypt (XMM3, UnwrappedKey) ;
XMM4 := AES256Decrypt (XMM4, UnwrappedKey) ;
XMM5 := AES256Decrypt (XMM5, UnwrappedKey) ;
XMM6 := AES256Decrypt (XMM6, UnwrappedKey) ;
XMM7 := AES256Decrypt (XMM7, UnwrappedKey) ;
RFLAGS.ZF := 0;
FI;
FI;
RFLAGS.OF, SF, AF, PF, CF := 0;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>ZF is set to 0 if the operation succeeded and set to 1 if the operation failed due to a handle violation. The other arithmetic flags (OF, SF, AF, PF, CF) are cleared to 0.</p>
<p>1. Further details on Key Locker and usage of this instruction can be found here:</p>
<h3 id="https---software-intel-com-content-www-us-en-develop-download-intel-key-locker-specification-html-">https://software.intel.com/content/www/us/en/develop/download/intel-key-locker-specification.html.<a class="anchor" href="#https---software-intel-com-content-www-us-en-develop-download-intel-key-locker-specification-html-">
</a></h3>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>AESDECWIDE256KLunsigned char _mm_aesdecwide256kl_u8(__m128i odata[8], const __m128i idata[8], const void* h);
</pre>
<h2 id="exceptions--all-operating-modes-">Exceptions (All Operating Modes)<a class="anchor" href="#exceptions--all-operating-modes-">
</a></h2>
<p>#UD If the LOCK prefix is used.</p>
<p>If CPUID.07H:ECX.KL[bit 23] = 0.</p>
<p>If CR4.KL = 0.</p>
<p>If CPUID.19H:EBX.AESKLE[bit 0] = 0.</p>
<p>If CR0.EM = 1.</p>
<p>If CR4.OSFXSR = 0.</p>
<p>If CPUID.19H:EBX.WIDE_KL[bit 2] = 0.</p>
<p>#NM If CR0.TS = 1.</p>
<p>#PF If a page fault occurs.</p>
<p>#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</p>
<p>If the DS, ES, FS, or GS register is used to access memory and it contains a NULL segment selector.</p>
<p>If the memory address is in a non-canonical form.</p>
<p>#SS(0) If a memory operand effective address is outside the SS segment limit.</p>
<p>If a memory address referencing the SS segment is in a non-canonical form.</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

149
x86/aesenc.html Normal file
View file

@ -0,0 +1,149 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>AESENC
— Perform One Round of an AES Encryption Flow</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>AESENC
— Perform One Round of an AES Encryption Flow</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>66 0F 38 DC /r AESENC xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>AES</td>
<td>Perform one round of an AES encryption flow, using one 128-bit data (state) from xmm1 with one 128-bit round key from xmm2/m128.</td></tr>
<tr>
<td>VEX.128.66.0F38.WIG DC /r VAESENC xmm1, xmm2, xmm3/m128</td>
<td>B</td>
<td>V/V</td>
<td>AES AVX</td>
<td>Perform one round of an AES encryption flow, using one 128-bit data (state) from xmm2 with one 128-bit round key from the xmm3/m128; store the result in xmm1.</td></tr>
<tr>
<td>VEX.256.66.0F38.WIG DC /r VAESENC ymm1, ymm2, ymm3/m256</td>
<td>B</td>
<td>V/V</td>
<td>VAES</td>
<td>Perform one round of an AES encryption flow, using two 128-bit data (state) from ymm2 with two 128-bit round keys from the ymm3/m256; store the result in ymm1.</td></tr>
<tr>
<td>EVEX.128.66.0F38.WIG DC /r VAESENC xmm1, xmm2, xmm3/m128</td>
<td>C</td>
<td>V/V</td>
<td>VAES AVX512VL</td>
<td>Perform one round of an AES encryption flow, using one 128-bit data (state) from xmm2 with one 128-bit round key from the xmm3/m128; store the result in xmm1.</td></tr>
<tr>
<td>EVEX.256.66.0F38.WIG DC /r VAESENC ymm1, ymm2, ymm3/m256</td>
<td>C</td>
<td>V/V</td>
<td>VAES AVX512VL</td>
<td>Perform one round of an AES encryption flow, using two 128-bit data (state) from ymm2 with two 128-bit round keys from the ymm3/m256; store the result in ymm1.</td></tr>
<tr>
<td>EVEX.512.66.0F38.WIG DC /r VAESENC zmm1, zmm2, zmm3/m512</td>
<td>C</td>
<td>V/V</td>
<td>VAES AVX512F</td>
<td>Perform one round of an AES encryption flow, using four 128-bit data (state) from zmm2 with four 128-bit round keys from the zmm3/m512; store the result in zmm1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr>
<tr>
<td>C</td>
<td>Full Mem</td>
<td>ModRM:reg (w)</td>
<td>EVEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>This instruction performs a single round of an AES encryption flow using one/two/four (depending on vector length) 128-bit data (state) from the first source operand with one/two/four (depending on vector length) round key(s) from the second source operand, and stores the result in the destination operand.</p>
<p>Use the AESENC instruction for all but the last encryption rounds. For the last encryption round, use the AESENCCLAST instruction.</p>
<p>VEX and EVEX encoded versions of the instruction allow 3-operand (non-destructive) operation. The legacy encoded versions of the instruction require that the first source operand and the destination operand are the same and must be an XMM register.</p>
<p>The EVEX encoded form of this instruction does not support memory fault suppression.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="aesenc">AESENC<a class="anchor" href="#aesenc">
</a></h3>
<pre>STATE := SRC1;
RoundKey := SRC2;
STATE := ShiftRows( STATE );
STATE := SubBytes( STATE );
STATE := MixColumns( STATE );
DEST[127:0] := STATE XOR RoundKey;
DEST[MAXVL-1:128] (Unmodified)
</pre>
<h3 id="vaesenc--128b-and-256b-vex-encoded-versions-">VAESENC (128b and 256b VEX Encoded Versions)<a class="anchor" href="#vaesenc--128b-and-256b-vex-encoded-versions-">
</a></h3>
<pre>(KL,VL) = (1,128), (2,256)
FOR I := 0 to KL-1:
STATE := SRC1.xmm[i]
RoundKey := SRC2.xmm[i]
STATE := ShiftRows( STATE )
STATE := SubBytes( STATE )
STATE := MixColumns( STATE )
DEST.xmm[i] := STATE XOR RoundKey
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="vaesenc--evex-encoded-version-">VAESENC (EVEX Encoded Version)<a class="anchor" href="#vaesenc--evex-encoded-version-">
</a></h3>
<pre>(KL,VL) = (1,128), (2,256), (4,512)
FOR i := 0 to KL-1:
STATE := SRC1.xmm[i] // xmm[i] is the ith xmm word in the SIMD register
RoundKey := SRC2.xmm[i]
STATE := ShiftRows( STATE )
STATE := SubBytes( STATE )
STATE := MixColumns( STATE )
DEST.xmm[i] := STATE XOR RoundKey
DEST[MAXVL-1:VL] := 0
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>(V)AESENC __m128i _mm_aesenc (__m128i, __m128i)
</pre>
<pre>VAESENC __m256i _mm256_aesenc_epi128(__m256i, __m256i);
</pre>
<pre>VAESENC __m512i _mm512_aesenc_epi128(__m512i, __m512i);
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>See <span class="not-imported">Table 2-21</span>, “Type 4 Class Exception Conditions.”</p>
<p>EVEX-encoded: See <span class="not-imported">Table 2-50</span>, “Type E4NF Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

100
x86/aesenc128kl.html Normal file
View file

@ -0,0 +1,100 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>AESENC128KL
— Perform Ten Rounds of AES Encryption Flow With Key Locker Using 128-Bit Key</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>AESENC128KL
— Perform Ten Rounds of AES Encryption Flow With Key Locker Using 128-Bit Key</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>F3 0F 38 DC !(11):rrr:bbb AESENC128KL xmm, m384</td>
<td>A</td>
<td>V/V</td>
<td>AESKLE</td>
<td>Encrypt xmm using 128-bit AES key indicated by handle at m384 and store result in xmm.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>The AESENC128KL<sup>1</sup> instruction performs ten rounds of AES to encrypt the first operand using the 128-bit key indicated by the handle from the second operand. It stores the result in the first operand if the operation succeeds (e.g., does not run into a handle violation failure).</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h4 id="aesenc128kl">AESENC128KL<a class="anchor" href="#aesenc128kl">
</a></h4>
<pre>Handle := UnalignedLoad of 384 bit (SRC); // Load is not guaranteed to be atomic.
Illegal Handle = (
HandleReservedBitSet (Handle) ||
(Handle[0] AND (CPL &gt; 0)) ||
Handle [1] ||
HandleKeyType (Handle) != HANDLE_KEY_TYPE_AES128
);
IF (Illegal Handle) {
THEN RFLAGS.ZF := 1;
ELSE
(UnwrappedKey, Authentic) := UnwrapKeyAndAuthenticate384 (Handle[383:0], IWKey);
IF (Authentic == 0)
THEN RFLAGS.ZF := 1;
ELSE
DEST := AES128Encrypt (DEST, UnwrappedKey) ;
RFLAGS.ZF := 0;
FI;
FI;
RFLAGS.OF, SF, AF, PF, CF := 0;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>ZF is set to 0 if the operation succeeded and set to 1 if the operation failed due to a handle violation. The other arithmetic flags (OF, SF, AF, PF, CF) are cleared to 0.</p>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>AESENC128KL unsigned char _mm_aesenc128kl_u8(__m128i* odata, __m128i idata, const void* h);
</pre>
<pre>1. Further details on Key Locker and usage of this instruction can be found here:
</pre>
<h3 id="https---software-intel-com-content-www-us-en-develop-download-intel-key-locker-specification-html-">https://software.intel.com/content/www/us/en/develop/download/intel-key-locker-specification.html.<a class="anchor" href="#https---software-intel-com-content-www-us-en-develop-download-intel-key-locker-specification-html-">
</a></h3>
<h2 id="exceptions--all-operating-modes-">Exceptions (All Operating Modes)<a class="anchor" href="#exceptions--all-operating-modes-">
</a></h2>
<p>#UD If the LOCK prefix is used.</p>
<p>If CPUID.07H:ECX.KL[bit 23] = 0.</p>
<p>If CR4.KL = 0.</p>
<p>If CPUID.19H:EBX.AESKLE[bit 0] = 0.</p>
<p>If CR0.EM = 1.</p>
<p>If CR4.OSFXSR = 0.</p>
<p>#NM If CR0.TS = 1.</p>
<p>#PF If a page fault occurs.</p>
<p>#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</p>
<p>If the DS, ES, FS, or GS register is used to access memory and it contains a NULL segment selector.</p>
<p>If the memory address is in a non-canonical form.</p>
<p>#SS(0) If a memory operand effective address is outside the SS segment limit.</p>
<p>If a memory address referencing the SS segment is in a non-canonical form.</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

100
x86/aesenc256kl.html Normal file
View file

@ -0,0 +1,100 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>AESENC256KL
— Perform 14 Rounds of AES Encryption Flow With Key Locker Using 256-Bit Key</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>AESENC256KL
— Perform 14 Rounds of AES Encryption Flow With Key Locker Using 256-Bit Key</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>F3 0F 38 DE !(11):rrr:bbb AESENC256KL xmm, m512</td>
<td>A</td>
<td>V/V</td>
<td>AESKLE</td>
<td>Encrypt xmm using 256-bit AES key indicated by handle at m512 and store result in xmm.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>The AESENC256KL<sup>1</sup> instruction performs 14 rounds of AES to encrypt the first operand using the 256-bit key indicated by the handle from the second operand. It stores the result in the first operand if the operation succeeds (e.g., does not run into a handle violation failure).</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h4 id="aesenc256kl">AESENC256KL<a class="anchor" href="#aesenc256kl">
</a></h4>
<pre>Handle := UnalignedLoad of 512 bit (SRC); // Load is not guaranteed to be atomic.
Illegal Handle = (
HandleReservedBitSet (Handle) ||
(Handle[0] AND (CPL &gt; 0)) ||
Handle [1] ||
HandleKeyType (Handle) != HANDLE_KEY_TYPE_AES256
);
IF (Illegal Handle)
THEN RFLAGS.ZF := 1;
ELSE
(UnwrappedKey, Authentic) := UnwrapKeyAndAuthenticate512 (Handle[511:0], IWKey);
IF (Authentic == 0)
THEN RFLAGS.ZF := 1;
ELSE
DEST := AES256Encrypt (DEST, UnwrappedKey) ;
RFLAGS.ZF := 0;
FI;
FI;
RFLAGS.OF, SF, AF, PF, CF := 0;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>ZF is set to 0 if the operation succeeded and set to 1 if the operation failed due to a handle violation. The other arithmetic flags (OF, SF, AF, PF, CF) are cleared to 0.</p>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>AESENC256KL unsigned char _mm_aesenc256kl_u8(__m128i* odata, __m128i idata, const void* h);
</pre>
<pre>1. Further details on Key Locker and usage of this instruction can be found here:
</pre>
<h3 id="https---software-intel-com-content-www-us-en-develop-download-intel-key-locker-specification-html-">https://software.intel.com/content/www/us/en/develop/download/intel-key-locker-specification.html.<a class="anchor" href="#https---software-intel-com-content-www-us-en-develop-download-intel-key-locker-specification-html-">
</a></h3>
<h2 id="exceptions--all-operating-modes-">Exceptions (All Operating Modes)<a class="anchor" href="#exceptions--all-operating-modes-">
</a></h2>
<p>#UD If the LOCK prefix is used.</p>
<p>If CPUID.07H:ECX.KL[bit 23] = 0.</p>
<p>If CR4.KL = 0.</p>
<p>If CPUID.19H:EBX.AESKLE[bit 0] = 0.</p>
<p>If CR0.EM = 1.</p>
<p>If CR4.OSFXSR = 0.</p>
<p>#NM If CR0.TS = 1.</p>
<p>#PF If a page fault occurs.</p>
<p>#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</p>
<p>If the DS, ES, FS, or GS register is used to access memory and it contains a NULL segment selector.</p>
<p>If the memory address is in a non-canonical form.</p>
<p>#SS(0) If a memory operand effective address is outside the SS segment limit.</p>
<p>If a memory address referencing the SS segment is in a non-canonical form.</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

145
x86/aesenclast.html Normal file
View file

@ -0,0 +1,145 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>AESENCLAST
— Perform Last Round of an AES Encryption Flow</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>AESENCLAST
— Perform Last Round of an AES Encryption Flow</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>66 0F 38 DD /r AESENCLAST xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>AES</td>
<td>Perform the last round of an AES encryption flow, using one 128-bit data (state) from xmm1 with one 128-bit round key from xmm2/m128.</td></tr>
<tr>
<td>VEX.128.66.0F38.WIG DD /r VAESENCLAST xmm1, xmm2, xmm3/m128</td>
<td>B</td>
<td>V/V</td>
<td>AES AVX</td>
<td>Perform the last round of an AES encryption flow, using one 128-bit data (state) from xmm2 with one 128-bit round key from xmm3/m128; store the result in xmm1.</td></tr>
<tr>
<td>VEX.256.66.0F38.WIG DD /r VAESENCLAST ymm1, ymm2, ymm3/m256</td>
<td>B</td>
<td>V/V</td>
<td>VAES</td>
<td>Perform the last round of an AES encryption flow, using two 128-bit data (state) from ymm2 with two 128-bit round keys from ymm3/m256; store the result in ymm1.</td></tr>
<tr>
<td>EVEX.128.66.0F38.WIG DD /r VAESENCLAST xmm1, xmm2, xmm3/m128</td>
<td>C</td>
<td>V/V</td>
<td>VAES AVX512VL</td>
<td>Perform the last round of an AES encryption flow, using one 128-bit data (state) from xmm2 with one 128-bit round key from xmm3/m128; store the result in xmm1.</td></tr>
<tr>
<td>EVEX.256.66.0F38.WIG DD /r VAESENCLAST ymm1, ymm2, ymm3/m256</td>
<td>C</td>
<td>V/V</td>
<td>VAES AVX512VL</td>
<td>Perform the last round of an AES encryption flow, using two 128-bit data (state) from ymm2 with two 128-bit round keys from ymm3/m256; store the result in ymm1.</td></tr>
<tr>
<td>EVEX.512.66.0F38.WIG DD /r VAESENCLAST zmm1, zmm2, zmm3/m512</td>
<td>C</td>
<td>V/V</td>
<td>VAES AVX512F</td>
<td>Perform the last round of an AES encryption flow, using four 128-bit data (state) from zmm2 with four 128-bit round keys from zmm3/m512; store the result in zmm1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr>
<tr>
<td>C</td>
<td>Full Mem</td>
<td>ModRM:reg (w)</td>
<td>EVEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>This instruction performs the last round of an AES encryption flow using one/two/four (depending on vector length) 128-bit data (state) from the first source operand with one/two/four (depending on vector length) round key(s) from the second source operand, and stores the result in the destination operand.</p>
<p>VEX and EVEX encoded versions of the instruction allows 3-operand (non-destructive) operation. The legacy encoded versions of the instruction require that the first source operand and the destination operand are the same and must be an XMM register.</p>
<p>The EVEX encoded form of this instruction does not support memory fault suppression.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="aesenclast">AESENCLAST<a class="anchor" href="#aesenclast">
</a></h3>
<pre>STATE := SRC1;
RoundKey := SRC2;
STATE := ShiftRows( STATE );
STATE := SubBytes( STATE );
DEST[127:0] := STATE XOR RoundKey;
DEST[MAXVL-1:128] (Unmodified)
</pre>
<h3 id="vaesenclast--128b-and-256b-vex-encoded-versions-">VAESENCLAST (128b and 256b VEX Encoded Versions)<a class="anchor" href="#vaesenclast--128b-and-256b-vex-encoded-versions-">
</a></h3>
<pre>(KL, VL) = (1,128), (2,256)
FOR I=0 to KL-1:
STATE := SRC1.xmm[i]
RoundKey := SRC2.xmm[i]
STATE := ShiftRows( STATE )
STATE := SubBytes( STATE )
DEST.xmm[i] := STATE XOR RoundKey
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="vaesenclast--evex-encoded-version-">VAESENCLAST (EVEX Encoded Version)<a class="anchor" href="#vaesenclast--evex-encoded-version-">
</a></h3>
<pre>(KL,VL) = (1,128), (2,256), (4,512)
FOR i = 0 to KL-1:
STATE := SRC1.xmm[i]
RoundKey := SRC2.xmm[i]
STATE := ShiftRows( STATE )
STATE := SubBytes( STATE )
DEST.xmm[i] := STATE XOR RoundKey
DEST[MAXVL-1:VL] := 0
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>(V)AESENCLAST __m128i _mm_aesenclast (__m128i, __m128i)
</pre>
<pre>VAESENCLAST __m256i _mm256_aesenclast_epi128(__m256i, __m256i);
</pre>
<pre>VAESENCLAST __m512i _mm512_aesenclast_epi128(__m512i, __m512i);
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>See <span class="not-imported">Table 2-21</span>, “Type 4 Class Exception Conditions.”</p>
<p>EVEX-encoded: See <span class="not-imported">Table 2-50</span>, “Type E4NF Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

103
x86/aesencwide128kl.html Normal file
View file

@ -0,0 +1,103 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>AESENCWIDE128KL
— Perform Ten Rounds of AES Encryption Flow With Key Locker on 8 BlocksUsing 128-Bit Key</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>AESENCWIDE128KL
— Perform Ten Rounds of AES Encryption Flow With Key Locker on 8 BlocksUsing 128-Bit Key</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>F3 0F 38 D8 !(11):000:bbb AESENCWIDE128KL m384, &lt;XMM0-7&gt;</td>
<td>A</td>
<td>V/V</td>
<td>AESKLE WIDE_KL</td>
<td>Encrypt XMM0-7 using 128-bit AES key indicated by handle at m384 and store each resultant block back to its corresponding register.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple</th>
<th>Operand 1</th>
<th>Operands 2—9</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:r/m (r)</td>
<td>Implicit XMM0-7 (r, w)</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>The AESENCWIDE128KL<sup>1</sup> instruction performs ten rounds of AES to encrypt each of the eight blocks in XMM0-7 using the 128-bit key indicated by the handle from the second operand. It replaces each input block in XMM0-7 with its corresponding encrypted block if the operation succeeds (e.g., does not run into a handle violation failure).</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h4 id="aesencwide128kl">AESENCWIDE128KL<a class="anchor" href="#aesencwide128kl">
</a></h4>
<pre>Handle := UnalignedLoad of 384 bit (SRC); // Load is not guaranteed to be atomic.
Illegal Handle = (
HandleReservedBitSet (Handle) ||
(Handle[0] AND (CPL &gt; 0)) ||
Handle [1] ||
HandleKeyType (Handle) != HANDLE_KEY_TYPE_AES128
);
IF (Illegal Handle)
THEN RFLAGS.ZF := 1;
ELSE
(UnwrappedKey, Authentic) := UnwrapKeyAndAuthenticate384 (Handle[383:0], IWKey);
IF Authentic == 0
THEN RFLAGS.ZF := 1;
ELSE
XMM0 := AES128Encrypt (XMM0, UnwrappedKey) ;
XMM1 := AES128Encrypt (XMM1, UnwrappedKey) ;
XMM2 := AES128Encrypt (XMM2, UnwrappedKey) ;
XMM3 := AES128Encrypt (XMM3, UnwrappedKey) ;
XMM4 := AES128Encrypt (XMM4, UnwrappedKey) ;
XMM5 := AES128Encrypt (XMM5, UnwrappedKey) ;
XMM6 := AES128Encrypt (XMM6, UnwrappedKey) ;
XMM7 := AES128Encrypt (XMM7, UnwrappedKey) ;
RFLAGS.ZF := 0;
FI;
FI;
RFLAGS.OF, SF, AF, PF, CF := 0;
1. Further details on Key Locker and usage of this instruction can be found here:
</pre>
<h3 id="https---software-intel-com-content-www-us-en-develop-download-intel-key-locker-specification-html-">https://software.intel.com/content/www/us/en/develop/download/intel-key-locker-specification.html.<a class="anchor" href="#https---software-intel-com-content-www-us-en-develop-download-intel-key-locker-specification-html-">
</a></h3>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>ZF is set to 0 if the operation succeeded and set to 1 if the operation failed due to a handle violation. The other arithmetic flags (OF, SF, AF, PF, CF) are cleared to 0.</p>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>AESENCWIDE128KLunsigned char _mm_aesencwide128kl_u8(__m128i odata[8], const __m128i idata[8], const void* h);
</pre>
<h2 id="exceptions--all-operating-modes-">Exceptions (All Operating Modes)<a class="anchor" href="#exceptions--all-operating-modes-">
</a></h2>
<p>#UD If the LOCK prefix is used.</p>
<p>If CPUID.07H:ECX.KL[bit 23] = 0.</p>
<p>If CR4.KL = 0.</p>
<p>If CPUID.AESKLE = 0.</p>
<p>If CR0.EM = 1.</p>
<p>If CR4.OSFXSR = 0.</p>
<p>If CPUID.19H:EBX.WIDE_KL[bit 2] = 0.</p>
<p>#NM If CR0.TS = 1.</p>
<p>#PF If a page fault occurs.</p>
<p>#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</p>
<p>If the DS, ES, FS, or GS register is used to access memory and it contains a NULL segment selector.</p>
<p>If the memory address is in a non-canonical form.</p>
<p>#SS(0) If a memory operand effective address is outside the SS segment limit.</p>
<p>If a memory address referencing the SS segment is in a non-canonical form.</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

103
x86/aesencwide256kl.html Normal file
View file

@ -0,0 +1,103 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>AESENCWIDE256KL
— Perform 14 Rounds of AES Encryption Flow With Key Locker on 8 BlocksUsing 256-Bit Key</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>AESENCWIDE256KL
— Perform 14 Rounds of AES Encryption Flow With Key Locker on 8 BlocksUsing 256-Bit Key</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>F3 0F 38 D8 !(11):010:bbb AESENCWIDE256KL m512, &lt;XMM0-7&gt;</td>
<td>A</td>
<td>V/V</td>
<td>AESKLE WIDE_KL</td>
<td>Encrypt XMM0-7 using 256-bit AES key indicated by handle at m512 and store each resultant block back to its corresponding register.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple</th>
<th>Operand 1</th>
<th>Operands 2—9</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:r/m (r)</td>
<td>Implicit XMM0-7 (r, w)</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>The AESENCWIDE256KL<sup>1</sup> instruction performs 14 rounds of AES to encrypt each of the eight blocks in XMM0-7 using the 256-bit key indicated by the handle from the second operand. It replaces each input block in XMM0-7 with its corresponding encrypted block if the operation succeeds (e.g., does not run into a handle violation failure).</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h4 id="aesencwide256kl">AESENCWIDE256KL<a class="anchor" href="#aesencwide256kl">
</a></h4>
<pre>Handle := UnalignedLoad of 512 bit (SRC); // Load is not guaranteed to be atomic.
Illegal Handle = (
HandleReservedBitSet (Handle) ||
(Handle[0] AND (CPL &gt; 0)) ||
Handle [1] ||
HandleKeyType (Handle) != HANDLE_KEY_TYPE_AES256
);
IF (Illegal Handle)
THEN RFLAGS.ZF := 1;
ELSE
(UnwrappedKey, Authentic) := UnwrapKeyAndAuthenticate512 (Handle[511:0], IWKey);
IF (Authentic == 0)
THEN RFLAGS.ZF := 1;
ELSE
XMM0 := AES256Encrypt (XMM0, UnwrappedKey) ;
XMM1 := AES256Encrypt (XMM1, UnwrappedKey) ;
XMM2 := AES256Encrypt (XMM2, UnwrappedKey) ;
XMM3 := AES256Encrypt (XMM3, UnwrappedKey) ;
XMM4 := AES256Encrypt (XMM4, UnwrappedKey) ;
XMM5 := AES256Encrypt (XMM5, UnwrappedKey) ;
XMM6 := AES256Encrypt (XMM6, UnwrappedKey) ;
XMM7 := AES256Encrypt (XMM7, UnwrappedKey) ;
RFLAGS.ZF := 0;
FI;
FI;
RFLAGS.OF, SF, AF, PF, CF := 0;
1. Further details on Key Locker and usage of this instruction can be found here:
</pre>
<h3 id="https---software-intel-com-content-www-us-en-develop-download-intel-key-locker-specification-html-">https://software.intel.com/content/www/us/en/develop/download/intel-key-locker-specification.html.<a class="anchor" href="#https---software-intel-com-content-www-us-en-develop-download-intel-key-locker-specification-html-">
</a></h3>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>ZF is set to 0 if the operation succeeded and set to 1 if the operation failed due to a handle violation. The other arithmetic flags (OF, SF, AF, PF, CF) are cleared to 0.</p>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>AESENCWIDE256KLunsigned char _mm_aesencwide256kl_u8(__m128i odata[8], const __m128i idata[8], const void* h);
</pre>
<h2 id="exceptions--all-operating-modes-">Exceptions (All Operating Modes)<a class="anchor" href="#exceptions--all-operating-modes-">
</a></h2>
<p>#UD If the LOCK prefix is used.</p>
<p>If CPUID.07H:ECX.KL[bit 23] = 0.</p>
<p>If CR4.KL = 0.</p>
<p>If CPUID.19H:EBX.AESKLE[bit 0] = 0.</p>
<p>If CR0.EM = 1.</p>
<p>If CR4.OSFXSR = 0.</p>
<p>If CPUID.19H:EBX.WIDE_KL[bit 2] = 0.</p>
<p>#NM If CR0.TS = 1.</p>
<p>#PF If a page fault occurs.</p>
<p>#GP(0) If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</p>
<p>If the DS, ES, FS, or GS register is used to access memory and it contains a NULL segment selector.</p>
<p>If the memory address is in a non-canonical form.</p>
<p>#SS(0) If a memory operand effective address is outside the SS segment limit.</p>
<p>If a memory address referencing the SS segment is in a non-canonical form.</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

84
x86/aesimc.html Normal file
View file

@ -0,0 +1,84 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>AESIMC
— Perform the AES InvMixColumn Transformation</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>AESIMC
— Perform the AES InvMixColumn Transformation</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>66 0F 38 DB /r AESIMC xmm1, xmm2/m128</td>
<td>RM</td>
<td>V/V</td>
<td>AES</td>
<td>Perform the InvMixColumn transformation on a 128-bit round key from xmm2/m128 and store the result in xmm1.</td></tr>
<tr>
<td>VEX.128.66.0F38.WIG DB /r VAESIMC xmm1, xmm2/m128</td>
<td>RM</td>
<td>V/V</td>
<td>Both AES and AVX flags</td>
<td>Perform the InvMixColumn transformation on a 128-bit round key from xmm2/m128 and store the result in xmm1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RM</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Perform the InvMixColumns transformation on the source operand and store the result in the destination operand. The destination operand is an XMM register. The source operand can be an XMM register or a 128-bit memory location.</p>
<p>Note: the AESIMC instruction should be applied to the expanded AES round keys (except for the first and last round key) in order to prepare them for decryption using the “Equivalent Inverse Cipher” (defined in FIPS 197).</p>
<p>128-bit Legacy SSE version: Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p>
<p>VEX.128 encoded version: Bits (MAXVL-1:128) of the destination YMM register are zeroed.</p>
<p>Note: In VEX-encoded versions, VEX.vvvv is reserved and must be 1111b, otherwise instructions will #UD.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="aesimc">AESIMC<a class="anchor" href="#aesimc">
</a></h3>
<pre>DEST[127:0] := InvMixColumns( SRC );
DEST[MAXVL-1:128] (Unmodified)
</pre>
<h3 id="vaesimc">VAESIMC<a class="anchor" href="#vaesimc">
</a></h3>
<pre>DEST[127:0] := InvMixColumns( SRC );
DEST[MAXVL-1:128] := 0;
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>(V)AESIMC __m128i _mm_aesimc (__m128i)
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>See <span class="not-imported">Table 2-21</span>, “Type 4 Class Exception Conditions,” additionally:</p>
<table>
<tr>
<td>#UD</td>
<td>If VEX.vvvv ≠ 1111B.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

100
x86/aeskeygenassist.html Normal file
View file

@ -0,0 +1,100 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>AESKEYGENASSIST
— AES Round Key Generation Assist</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>AESKEYGENASSIST
— AES Round Key Generation Assist</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>66 0F 3A DF /r ib AESKEYGENASSIST xmm1, xmm2/m128, imm8</td>
<td>RMI</td>
<td>V/V</td>
<td>AES</td>
<td>Assist in AES round key generation using an 8 bits Round Constant (RCON) specified in the immediate byte, operating on 128 bits of data specified in xmm2/m128 and stores the result in xmm1.</td></tr>
<tr>
<td>VEX.128.66.0F3A.WIG DF /r ib VAESKEYGENASSIST xmm1, xmm2/m128, imm8</td>
<td>RMI</td>
<td>V/V</td>
<td>Both AES and AVX flags</td>
<td>Assist in AES round key generation using 8 bits Round Constant (RCON) specified in the immediate byte, operating on 128 bits of data specified in xmm2/m128 and stores the result in xmm1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RMI</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>imm8</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Assist in expanding the AES cipher key, by computing steps towards generating a round key for encryption, using 128-bit data specified in the source operand and an 8-bit round constant specified as an immediate, store the result in the destination operand.</p>
<p>The destination operand is an XMM register. The source operand can be an XMM register or a 128-bit memory location.</p>
<p>128-bit Legacy SSE version: Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged.</p>
<p>VEX.128 encoded version: Bits (MAXVL-1:128) of the destination YMM register are zeroed.</p>
<p>Note: In VEX-encoded versions, VEX.vvvv is reserved and must be 1111b, otherwise instructions will #UD.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="aeskeygenassist">AESKEYGENASSIST<a class="anchor" href="#aeskeygenassist">
</a></h3>
<pre>X3[31:0] := SRC [127: 96];
X2[31:0] := SRC [95: 64];
X1[31:0] := SRC [63: 32];
X0[31:0] := SRC [31: 0];
RCON[31:0] := ZeroExtend(imm8[7:0]);
DEST[31:0] := SubWord(X1);
DEST[63:32 ] := RotWord( SubWord(X1) ) XOR RCON;
DEST[95:64] := SubWord(X3);
DEST[127:96] := RotWord( SubWord(X3) ) XOR RCON;
DEST[MAXVL-1:128] (Unmodified)
</pre>
<h3 id="vaeskeygenassist">VAESKEYGENASSIST<a class="anchor" href="#vaeskeygenassist">
</a></h3>
<pre>X3[31:0] := SRC [127: 96];
X2[31:0] := SRC [95: 64];
X1[31:0] := SRC [63: 32];
X0[31:0] := SRC [31: 0];
RCON[31:0] := ZeroExtend(imm8[7:0]);
DEST[31:0] := SubWord(X1);
DEST[63:32 ] := RotWord( SubWord(X1) ) XOR RCON;
DEST[95:64] := SubWord(X3);
DEST[127:96] := RotWord( SubWord(X3) ) XOR RCON;
DEST[MAXVL-1:128] := 0;
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>(V)AESKEYGENASSIST __m128i _mm_aeskeygenassist (__m128i, const int)
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>See <span class="not-imported">Table 2-21</span>, “Type 4 Class Exception Conditions,” additionally:</p>
<table>
<tr>
<td>#UD</td>
<td>If VEX.vvvv ≠ 1111B.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

300
x86/and.html Normal file
View file

@ -0,0 +1,300 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>AND
— Logical AND</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>AND
— Logical AND</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>24 ib</td>
<td>AND AL, imm8</td>
<td>I</td>
<td>Valid</td>
<td>Valid</td>
<td>AL AND imm8.</td></tr>
<tr>
<td>25 iw</td>
<td>AND AX, imm16</td>
<td>I</td>
<td>Valid</td>
<td>Valid</td>
<td>AX AND imm16.</td></tr>
<tr>
<td>25 id</td>
<td>AND EAX, imm32</td>
<td>I</td>
<td>Valid</td>
<td>Valid</td>
<td>EAX AND imm32.</td></tr>
<tr>
<td>REX.W + 25 id</td>
<td>AND RAX, imm32</td>
<td>I</td>
<td>Valid</td>
<td>N.E.</td>
<td>RAX AND imm32 sign-extended to 64-bits.</td></tr>
<tr>
<td>80 /4 ib</td>
<td>AND r/m8, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>r/m8 AND imm8.</td></tr>
<tr>
<td>REX + 80 /4 ib</td>
<td>AND r/m8<sup>*</sup>, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>N.E.</td>
<td>r/m8 AND imm8.</td></tr>
<tr>
<td>81 /4 iw</td>
<td>AND r/m16, imm16</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>r/m16 AND imm16.</td></tr>
<tr>
<td>81 /4 id</td>
<td>AND r/m32, imm32</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>r/m32 AND imm32.</td></tr>
<tr>
<td>REX.W + 81 /4 id</td>
<td>AND r/m64, imm32</td>
<td>MI</td>
<td>Valid</td>
<td>N.E.</td>
<td>r/m64 AND imm32 sign extended to 64-bits.</td></tr>
<tr>
<td>83 /4 ib</td>
<td>AND r/m16, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>r/m16 AND imm8 (sign-extended).</td></tr>
<tr>
<td>83 /4 ib</td>
<td>AND r/m32, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>r/m32 AND imm8 (sign-extended).</td></tr>
<tr>
<td>REX.W + 83 /4 ib</td>
<td>AND r/m64, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>N.E.</td>
<td>r/m64 AND imm8 (sign-extended).</td></tr>
<tr>
<td>20 /r</td>
<td>AND r/m8, r8</td>
<td>MR</td>
<td>Valid</td>
<td>Valid</td>
<td>r/m8 AND r8.</td></tr>
<tr>
<td>REX + 20 /r</td>
<td>AND r/m8<sup>*</sup>, r8<sup>*</sup></td>
<td>MR</td>
<td>Valid</td>
<td>N.E.</td>
<td>r/m64 AND r8 (sign-extended).</td></tr>
<tr>
<td>21 /r</td>
<td>AND r/m16, r16</td>
<td>MR</td>
<td>Valid</td>
<td>Valid</td>
<td>r/m16 AND r16.</td></tr>
<tr>
<td>21 /r</td>
<td>AND r/m32, r32</td>
<td>MR</td>
<td>Valid</td>
<td>Valid</td>
<td>r/m32 AND r32.</td></tr>
<tr>
<td>REX.W + 21 /r</td>
<td>AND r/m64, r64</td>
<td>MR</td>
<td>Valid</td>
<td>N.E.</td>
<td>r/m64 AND r32.</td></tr>
<tr>
<td>22 /r</td>
<td>AND r8, r/m8</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>r8 AND r/m8.</td></tr>
<tr>
<td>REX + 22 /r</td>
<td>AND r8<sup>*</sup>, r/m8<sup>*</sup></td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>r/m64 AND r8 (sign-extended).</td></tr>
<tr>
<td>23 /r</td>
<td>AND r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>r16 AND r/m16.</td></tr>
<tr>
<td>23 /r</td>
<td>AND r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>r32 AND r/m32.</td></tr>
<tr>
<td>REX.W + 23 /r</td>
<td>AND r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>r64 AND r/m64.</td></tr></table>
<blockquote>
<p>*In 64-bit mode, r/m8 can not be encoded to access the following byte registers if a REX prefix is used: AH, BH, CH, DH.</p></blockquote>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RM</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>MR</td>
<td>ModRM:r/m (r, w)</td>
<td>ModRM:reg (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>MI</td>
<td>ModRM:r/m (r, w)</td>
<td>imm8/16/32</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>I</td>
<td>AL/AX/EAX/RAX</td>
<td>imm8/16/32</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Performs a bitwise AND operation on the destination (first) and source (second) operands and stores the result in the destination operand location. The source operand can be an immediate, a register, or a memory location; the destination operand can be a register or a memory location. (However, two memory operands cannot be used in one instruction.) Each bit of the result is set to 1 if both corresponding bits of the first and second operands are 1; otherwise, it is set to 0.</p>
<p>This instruction can be used with a LOCK prefix to allow the it to be executed atomically.</p>
<p>In 64-bit mode, the instructions default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>DEST := DEST AND SRC;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>The OF and CF flags are cleared; the SF, ZF, and PF flags are set according to the result. The state of the AF flag is undefined.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="3">#GP(0)</td>
<td>If the destination operand points to a non-writable segment.</td></tr>
<tr>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>If the DS, ES, FS, or GS register contains a NULL segment selector.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP(0)</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#SS(0)</td>
<td>If a memory address referencing the SS segment is in a non-canonical form.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If the memory address is in a non-canonical form.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

74
x86/andn.html Normal file
View file

@ -0,0 +1,74 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>ANDN
— Logical AND NOT</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>ANDN
— Logical AND NOT</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>VEX.LZ.0F38.W0 F2 /r ANDN r32a, r32b, r/m32</td>
<td>RVM</td>
<td>V/V</td>
<td>BMI1</td>
<td>Bitwise AND of inverted r32b with r/m32, store result in r32a.</td></tr>
<tr>
<td>VEX.LZ. 0F38.W1 F2 /r ANDN r64a, r64b, r/m64</td>
<td>RVM</td>
<td>V/N.E.</td>
<td>BMI1</td>
<td>Bitwise AND of inverted r64b with r/m64, store result in r64a.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RVM</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Performs a bitwise logical AND of inverted second operand (the first source operand) with the third operand (the</p>
<p>second source operand). The result is stored in the first operand (destination operand).</p>
<p>This instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An attempt to execute this instruction with VEX.L not equal to 0 will cause #UD.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>DEST := (NOT SRC1) bitwiseAND SRC2;
SF := DEST[OperandSize -1];
ZF := (DEST = 0);
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>SF and ZF are updated based on result. OF and CF flags are cleared. AF and PF flags are undefined.</p>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>Auto-generated from high-level language.
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>See <span class="not-imported">Table 2-29</span>, “Type 13 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

171
x86/andnpd.html Normal file
View file

@ -0,0 +1,171 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>ANDNPD
— Bitwise Logical AND NOT of Packed Double Precision Floating-Point Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>ANDNPD
— Bitwise Logical AND NOT of Packed Double Precision Floating-Point Values</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op / En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>66 0F 55 /r ANDNPD xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>SSE2</td>
<td>Return the bitwise logical AND NOT of packed double precision floating-point values in xmm1 and xmm2/mem.</td></tr>
<tr>
<td>VEX.128.66.0F 55 /r VANDNPD xmm1, xmm2, xmm3/m128</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Return the bitwise logical AND NOT of packed double precision floating-point values in xmm2 and xmm3/mem.</td></tr>
<tr>
<td>VEX.256.66.0F 55/r VANDNPD ymm1, ymm2, ymm3/m256</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Return the bitwise logical AND NOT of packed double precision floating-point values in ymm2 and ymm3/mem.</td></tr>
<tr>
<td>EVEX.128.66.0F.W1 55 /r VANDNPD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512DQ</td>
<td>Return the bitwise logical AND NOT of packed double precision floating-point values in xmm2 and xmm3/m128/m64bcst subject to writemask k1.</td></tr>
<tr>
<td>EVEX.256.66.0F.W1 55 /r VANDNPD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512DQ</td>
<td>Return the bitwise logical AND NOT of packed double precision floating-point values in ymm2 and ymm3/m256/m64bcst subject to writemask k1.</td></tr>
<tr>
<td>EVEX.512.66.0F.W1 55 /r VANDNPD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst</td>
<td>C</td>
<td>V/V</td>
<td>AVX512DQ</td>
<td>Return the bitwise logical AND NOT of packed double precision floating-point values in zmm2 and zmm3/m512/m64bcst subject to writemask k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr>
<tr>
<td>C</td>
<td>Full</td>
<td>ModRM:reg (w)</td>
<td>EVEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Performs a bitwise logical AND NOT of the two, four or eight packed double precision floating-point values from the first source operand and the second source operand, and stores the result in the destination operand.</p>
<p>EVEX encoded versions: The first source operand is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location, or a 512/256/128-bit vector broadcasted from a 64-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with writemask k1.</p>
<p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand is a YMM register or a 256-bit memory location. The destination operand is a YMM register. The upper bits (MAXVL-1:256) of the corresponding ZMM register destination are zeroed.</p>
<p>VEX.128 encoded version: The first source operand is an XMM register. The second source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding ZMM register destination are zeroed.</p>
<p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding register destination are unmodified.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="vandnpd--evex-encoded-versions-">VANDNPD (EVEX Encoded Versions)<a class="anchor" href="#vandnpd--evex-encoded-versions-">
</a></h3>
<pre>(KL, VL) = (2, 128), (4, 256), (8, 512)
FOR j := 0 TO KL-1
i := j * 64
IF k1[j] OR *no writemask*
IF (EVEX.b == 1) AND (SRC2 *is memory*)
THEN
DEST[i+63:i] := (NOT(SRC1[i+63:i])) BITWISE AND SRC2[63:0]
ELSE
DEST[i+63:i] := (NOT(SRC1[i+63:i])) BITWISE AND SRC2[i+63:i]
FI;
ELSE
IF *merging-masking* ; merging-masking
THEN *DEST[i+63:i] remains unchanged*
ELSE ; zeroing-masking
DEST[i+63:i] = 0
FI;
FI;
ENDFOR
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="vandnpd--vex-256-encoded-version-">VANDNPD (VEX.256 Encoded Version)<a class="anchor" href="#vandnpd--vex-256-encoded-version-">
</a></h3>
<pre>DEST[63:0] := (NOT(SRC1[63:0])) BITWISE AND SRC2[63:0]
DEST[127:64] := (NOT(SRC1[127:64])) BITWISE AND SRC2[127:64]
DEST[191:128] := (NOT(SRC1[191:128])) BITWISE AND SRC2[191:128]
DEST[255:192] := (NOT(SRC1[255:192])) BITWISE AND SRC2[255:192]
DEST[MAXVL-1:256] := 0
</pre>
<h3 id="vandnpd--vex-128-encoded-version-">VANDNPD (VEX.128 Encoded Version)<a class="anchor" href="#vandnpd--vex-128-encoded-version-">
</a></h3>
<pre>DEST[63:0] := (NOT(SRC1[63:0])) BITWISE AND SRC2[63:0]
DEST[127:64] := (NOT(SRC1[127:64])) BITWISE AND SRC2[127:64]
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="andnpd--128-bit-legacy-sse-version-">ANDNPD (128-bit Legacy SSE Version)<a class="anchor" href="#andnpd--128-bit-legacy-sse-version-">
</a></h3>
<pre>DEST[63:0] := (NOT(DEST[63:0])) BITWISE AND SRC[63:0]
DEST[127:64] := (NOT(DEST[127:64])) BITWISE AND SRC[127:64]
DEST[MAXVL-1:128] (Unmodified)
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>VANDNPD __m512d _mm512_andnot_pd (__m512d a, __m512d b);
</pre>
<pre>VANDNPD __m512d _mm512_mask_andnot_pd (__m512d s, __mmask8 k, __m512d a, __m512d b);
</pre>
<pre>VANDNPD __m512d _mm512_maskz_andnot_pd (__mmask8 k, __m512d a, __m512d b);
</pre>
<pre>VANDNPD __m256d _mm256_mask_andnot_pd (__m256d s, __mmask8 k, __m256d a, __m256d b);
</pre>
<pre>VANDNPD __m256d _mm256_maskz_andnot_pd (__mmask8 k, __m256d a, __m256d b);
</pre>
<pre>VANDNPD __m128d _mm_mask_andnot_pd (__m128d s, __mmask8 k, __m128d a, __m128d b);
</pre>
<pre>VANDNPD __m128d _mm_maskz_andnot_pd (__mmask8 k, __m128d a, __m128d b);
</pre>
<pre>VANDNPD __m256d _mm256_andnot_pd (__m256d a, __m256d b);
</pre>
<pre>ANDNPD __m128d _mm_andnot_pd (__m128d a, __m128d b);
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>VEX-encoded instruction, see <span class="not-imported">Table 2-21</span>, “Type 4 Class Exception Conditions.”</p>
<p>EVEX-encoded instruction, see <span class="not-imported">Table 2-49</span>, “Type E4 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

179
x86/andnps.html Normal file
View file

@ -0,0 +1,179 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>ANDNPS
— Bitwise Logical AND NOT of Packed Single Precision Floating-Point Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>ANDNPS
— Bitwise Logical AND NOT of Packed Single Precision Floating-Point Values</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op / En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>NP 0F 55 /r ANDNPS xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>SSE</td>
<td>Return the bitwise logical AND NOT of packed single precision floating-point values in xmm1 and xmm2/mem.</td></tr>
<tr>
<td>VEX.128.0F 55 /r VANDNPS xmm1, xmm2, xmm3/m128</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Return the bitwise logical AND NOT of packed single precision floating-point values in xmm2 and xmm3/mem.</td></tr>
<tr>
<td>VEX.256.0F 55 /r VANDNPS ymm1, ymm2, ymm3/m256</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Return the bitwise logical AND NOT of packed single precision floating-point values in ymm2 and ymm3/mem.</td></tr>
<tr>
<td>EVEX.128.0F.W0 55 /r VANDNPS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512DQ</td>
<td>Return the bitwise logical AND of packed single precision floating-point values in xmm2 and xmm3/m128/m32bcst subject to writemask k1.</td></tr>
<tr>
<td>EVEX.256.0F.W0 55 /r VANDNPS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512DQ</td>
<td>Return the bitwise logical AND of packed single precision floating-point values in ymm2 and ymm3/m256/m32bcst subject to writemask k1.</td></tr>
<tr>
<td>EVEX.512.0F.W0 55 /r VANDNPS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst</td>
<td>C</td>
<td>V/V</td>
<td>AVX512DQ</td>
<td>Return the bitwise logical AND of packed single precision floating-point values in zmm2 and zmm3/m512/m32bcst subject to writemask k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr>
<tr>
<td>C</td>
<td>Full</td>
<td>ModRM:reg (w)</td>
<td>EVEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Performs a bitwise logical AND NOT of the four, eight or sixteen packed single precision floating-point values from the first source operand and the second source operand, and stores the result in the destination operand.</p>
<p>EVEX encoded versions: The first source operand is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location, or a 512/256/128-bit vector broadcasted from a 32-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with writemask k1.</p>
<p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand is a YMM register or a 256-bit memory location. The destination operand is a YMM register. The upper bits (MAXVL-1:256) of the corresponding ZMM register destination are zeroed.</p>
<p>VEX.128 encoded version: The first source operand is an XMM register. The second source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding ZMM register destination are zeroed.</p>
<p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding ZMM register destination are unmodified.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="vandnps--evex-encoded-versions-">VANDNPS (EVEX Encoded Versions)<a class="anchor" href="#vandnps--evex-encoded-versions-">
</a></h3>
<pre>(KL, VL) = (4, 128), (8, 256), (16, 512)
FOR j := 0 TO KL-1
i := j * 32
IF k1[j] OR *no writemask*
IF (EVEX.b == 1) AND (SRC2 *is memory*)
THEN
DEST[i+31:i] := (NOT(SRC1[i+31:i])) BITWISE AND SRC2[31:0]
ELSE
DEST[i+31:i] := (NOT(SRC1[i+31:i])) BITWISE AND SRC2[i+31:i]
FI;
ELSE
IF *merging-masking* ; merging-masking
THEN *DEST[i+31:i] remains unchanged*
ELSE ; zeroing-masking
DEST[i+31:i] = 0
FI;
FI;
ENDFOR
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="vandnps--vex-256-encoded-version-">VANDNPS (VEX.256 Encoded Version)<a class="anchor" href="#vandnps--vex-256-encoded-version-">
</a></h3>
<pre>DEST[31:0] := (NOT(SRC1[31:0])) BITWISE AND SRC2[31:0]
DEST[63:32] := (NOT(SRC1[63:32])) BITWISE AND SRC2[63:32]
DEST[95:64] := (NOT(SRC1[95:64])) BITWISE AND SRC2[95:64]
DEST[127:96] := (NOT(SRC1[127:96])) BITWISE AND SRC2[127:96]
DEST[159:128] := (NOT(SRC1[159:128])) BITWISE AND SRC2[159:128]
DEST[191:160] := (NOT(SRC1[191:160])) BITWISE AND SRC2[191:160]
DEST[223:192] := (NOT(SRC1[223:192])) BITWISE AND SRC2[223:192]
DEST[255:224] := (NOT(SRC1[255:224])) BITWISE AND SRC2[255:224].
DEST[MAXVL-1:256] := 0
</pre>
<h3 id="vandnps--vex-128-encoded-version-">VANDNPS (VEX.128 Encoded Version)<a class="anchor" href="#vandnps--vex-128-encoded-version-">
</a></h3>
<pre>DEST[31:0] := (NOT(SRC1[31:0])) BITWISE AND SRC2[31:0]
DEST[63:32] := (NOT(SRC1[63:32])) BITWISE AND SRC2[63:32]
DEST[95:64] := (NOT(SRC1[95:64])) BITWISE AND SRC2[95:64]
DEST[127:96] := (NOT(SRC1[127:96])) BITWISE AND SRC2[127:96]
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="andnps--128-bit-legacy-sse-version-">ANDNPS (128-bit Legacy SSE Version)<a class="anchor" href="#andnps--128-bit-legacy-sse-version-">
</a></h3>
<pre>DEST[31:0] := (NOT(DEST[31:0])) BITWISE AND SRC[31:0]
DEST[63:32] := (NOT(DEST[63:32])) BITWISE AND SRC[63:32]
DEST[95:64] := (NOT(DEST[95:64])) BITWISE AND SRC[95:64]
DEST[127:96] := (NOT(DEST[127:96])) BITWISE AND SRC[127:96]
DEST[MAXVL-1:128] (Unmodified)
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>VANDNPS __m512 _mm512_andnot_ps (__m512 a, __m512 b);
</pre>
<pre>VANDNPS __m512 _mm512_mask_andnot_ps (__m512 s, __mmask16 k, __m512 a, __m512 b);
</pre>
<pre>VANDNPS __m512 _mm512_maskz_andnot_ps (__mmask16 k, __m512 a, __m512 b);
</pre>
<pre>VANDNPS __m256 _mm256_mask_andnot_ps (__m256 s, __mmask8 k, __m256 a, __m256 b);
</pre>
<pre>VANDNPS __m256 _mm256_maskz_andnot_ps (__mmask8 k, __m256 a, __m256 b);
</pre>
<pre>VANDNPS __m128 _mm_mask_andnot_ps (__m128 s, __mmask8 k, __m128 a, __m128 b);
</pre>
<pre>VANDNPS __m128 _mm_maskz_andnot_ps (__mmask8 k, __m128 a, __m128 b);
</pre>
<pre>VANDNPS __m256 _mm256_andnot_ps (__m256 a, __m256 b);
</pre>
<pre>ANDNPS __m128 _mm_andnot_ps (__m128 a, __m128 b);
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>VEX-encoded instruction, see <span class="not-imported">Table 2-21</span>, “Type 4 Class Exception Conditions.”</p>
<p>EVEX-encoded instruction, see <span class="not-imported">Table 2-49</span>, “Type E4 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

172
x86/andpd.html Normal file
View file

@ -0,0 +1,172 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>ANDPD
— Bitwise Logical AND of Packed Double Precision Floating-Point Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>ANDPD
— Bitwise Logical AND of Packed Double Precision Floating-Point Values</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op / En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>66 0F 54 /r ANDPD xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>SSE2</td>
<td>Return the bitwise logical AND of packed double precision floating-point values in xmm1 and xmm2/mem.</td></tr>
<tr>
<td>VEX.128.66.0F 54 /r VANDPD xmm1, xmm2, xmm3/m128</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Return the bitwise logical AND of packed double precision floating-point values in xmm2 and xmm3/mem.</td></tr>
<tr>
<td>VEX.256.66.0F 54 /r VANDPD ymm1, ymm2, ymm3/m256</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Return the bitwise logical AND of packed double precision floating-point values in ymm2 and ymm3/mem.</td></tr>
<tr>
<td>EVEX.128.66.0F.W1 54 /r VANDPD xmm1 {k1}{z}, xmm2, xmm3/m128/m64bcst</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512DQ</td>
<td>Return the bitwise logical AND of packed double precision floating-point values in xmm2 and xmm3/m128/m64bcst subject to writemask k1.</td></tr>
<tr>
<td>EVEX.256.66.0F.W1 54 /r VANDPD ymm1 {k1}{z}, ymm2, ymm3/m256/m64bcst</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512DQ</td>
<td>Return the bitwise logical AND of packed double precision floating-point values in ymm2 and ymm3/m256/m64bcst subject to writemask k1.</td></tr>
<tr>
<td>EVEX.512.66.0F.W1 54 /r VANDPD zmm1 {k1}{z}, zmm2, zmm3/m512/m64bcst</td>
<td>C</td>
<td>V/V</td>
<td>AVX512DQ</td>
<td>Return the bitwise logical AND of packed double precision floating-point values in zmm2 and zmm3/m512/m64bcst subject to writemask k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr>
<tr>
<td>C</td>
<td>Full</td>
<td>ModRM:reg (w)</td>
<td>EVEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Performs a bitwise logical AND of the two, four or eight packed double precision floating-point values from the first source operand and the second source operand, and stores the result in the destination operand.</p>
<p>EVEX encoded versions: The first source operand is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location, or a 512/256/128-bit vector broadcasted from a 64-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with writemask k1.</p>
<p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand is a YMM register or a 256-bit memory location. The destination operand is a YMM register. The upper bits (MAXVL-1:256) of the corresponding ZMM register destination are zeroed.</p>
<p>VEX.128 encoded version: The first source operand is an XMM register. The second source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding ZMM register destination are zeroed.</p>
<p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding register destination are unmodified.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="vandpd--evex-encoded-versions-">VANDPD (EVEX Encoded Versions)<a class="anchor" href="#vandpd--evex-encoded-versions-">
</a></h3>
<pre>(KL, VL) = (2, 128), (4, 256), (8, 512)
FOR j := 0 TO KL-1
i := j * 64
IF k1[j] OR *no writemask*
THEN
IF (EVEX.b == 1) AND (SRC2 *is memory*)
THEN
DEST[i+63:i] := SRC1[i+63:i] BITWISE AND SRC2[63:0]
ELSE
DEST[i+63:i] := SRC1[i+63:i] BITWISE AND SRC2[i+63:i]
FI;
ELSE
IF *merging-masking* ; merging-masking
THEN *DEST[i+63:i] remains unchanged*
ELSE ; zeroing-masking
DEST[i+63:i] = 0
FI;
FI;
ENDFOR
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="vandpd--vex-256-encoded-version-">VANDPD (VEX.256 Encoded Version)<a class="anchor" href="#vandpd--vex-256-encoded-version-">
</a></h3>
<pre>DEST[63:0] := SRC1[63:0] BITWISE AND SRC2[63:0]
DEST[127:64] := SRC1[127:64] BITWISE AND SRC2[127:64]
DEST[191:128] := SRC1[191:128] BITWISE AND SRC2[191:128]
DEST[255:192] := SRC1[255:192] BITWISE AND SRC2[255:192]
DEST[MAXVL-1:256] := 0
</pre>
<h3 id="vandpd--vex-128-encoded-version-">VANDPD (VEX.128 Encoded Version)<a class="anchor" href="#vandpd--vex-128-encoded-version-">
</a></h3>
<pre>DEST[63:0] := SRC1[63:0] BITWISE AND SRC2[63:0]
DEST[127:64] := SRC1[127:64] BITWISE AND SRC2[127:64]
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="andpd--128-bit-legacy-sse-version-">ANDPD (128-bit Legacy SSE Version)<a class="anchor" href="#andpd--128-bit-legacy-sse-version-">
</a></h3>
<pre>DEST[63:0] := DEST[63:0] BITWISE AND SRC[63:0]
DEST[127:64] := DEST[127:64] BITWISE AND SRC[127:64]
DEST[MAXVL-1:128] (Unmodified)
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>VANDPD __m512d _mm512_and_pd (__m512d a, __m512d b);
</pre>
<pre>VANDPD __m512d _mm512_mask_and_pd (__m512d s, __mmask8 k, __m512d a, __m512d b);
</pre>
<pre>VANDPD __m512d _mm512_maskz_and_pd (__mmask8 k, __m512d a, __m512d b);
</pre>
<pre>VANDPD __m256d _mm256_mask_and_pd (__m256d s, __mmask8 k, __m256d a, __m256d b);
</pre>
<pre>VANDPD __m256d _mm256_maskz_and_pd (__mmask8 k, __m256d a, __m256d b);
</pre>
<pre>VANDPD __m128d _mm_mask_and_pd (__m128d s, __mmask8 k, __m128d a, __m128d b);
</pre>
<pre>VANDPD __m128d _mm_maskz_and_pd (__mmask8 k, __m128d a, __m128d b);
</pre>
<pre>VANDPD __m256d _mm256_and_pd (__m256d a, __m256d b);
</pre>
<pre>ANDPD __m128d _mm_and_pd (__m128d a, __m128d b);
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>VEX-encoded instruction, see <span class="not-imported">Table 2-21</span>, “Type 4 Class Exception Conditions.”</p>
<p>EVEX-encoded instruction, see <span class="not-imported">Table 2-49</span>, “Type E4 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

179
x86/andps.html Normal file
View file

@ -0,0 +1,179 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>ANDPS
— Bitwise Logical AND of Packed Single Precision Floating-Point Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>ANDPS
— Bitwise Logical AND of Packed Single Precision Floating-Point Values</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op / En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>NP 0F 54 /r ANDPS xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>SSE</td>
<td>Return the bitwise logical AND of packed single precision floating-point values in xmm1 and xmm2/mem.</td></tr>
<tr>
<td>VEX.128.0F 54 /r VANDPS xmm1,xmm2, xmm3/m128</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Return the bitwise logical AND of packed single precision floating-point values in xmm2 and xmm3/mem.</td></tr>
<tr>
<td>VEX.256.0F 54 /r VANDPS ymm1, ymm2, ymm3/m256</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Return the bitwise logical AND of packed single precision floating-point values in ymm2 and ymm3/mem.</td></tr>
<tr>
<td>EVEX.128.0F.W0 54 /r VANDPS xmm1 {k1}{z}, xmm2, xmm3/m128/m32bcst</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512DQ</td>
<td>Return the bitwise logical AND of packed single precision floating-point values in xmm2 and xmm3/m128/m32bcst subject to writemask k1.</td></tr>
<tr>
<td>EVEX.256.0F.W0 54 /r VANDPS ymm1 {k1}{z}, ymm2, ymm3/m256/m32bcst</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512DQ</td>
<td>Return the bitwise logical AND of packed single precision floating-point values in ymm2 and ymm3/m256/m32bcst subject to writemask k1.</td></tr>
<tr>
<td>EVEX.512.0F.W0 54 /r VANDPS zmm1 {k1}{z}, zmm2, zmm3/m512/m32bcst</td>
<td>C</td>
<td>V/V</td>
<td>AVX512DQ</td>
<td>Return the bitwise logical AND of packed single precision floating-point values in zmm2 and zmm3/m512/m32bcst subject to writemask k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr>
<tr>
<td>C</td>
<td>Full</td>
<td>ModRM:reg (w)</td>
<td>EVEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Performs a bitwise logical AND of the four, eight or sixteen packed single precision floating-point values from the first source operand and the second source operand, and stores the result in the destination operand.</p>
<p>EVEX encoded versions: The first source operand is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location, or a 512/256/128-bit vector broadcasted from a 32-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with writemask k1.</p>
<p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand is a YMM register or a 256-bit memory location. The destination operand is a YMM register. The upper bits (MAXVL-1:256) of the corresponding ZMM register destination are zeroed.</p>
<p>VEX.128 encoded version: The first source operand is an XMM register. The second source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding ZMM register destination are zeroed.</p>
<p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding ZMM register destination are unmodified.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="vandps--evex-encoded-versions-">VANDPS (EVEX Encoded Versions)<a class="anchor" href="#vandps--evex-encoded-versions-">
</a></h3>
<pre>(KL, VL) = (4, 128), (8, 256), (16, 512)
FOR j := 0 TO KL-1
i := j * 32
IF k1[j] OR *no writemask*
IF (EVEX.b == 1) AND (SRC2 *is memory*)
THEN
DEST[i+63:i] := SRC1[i+31:i] BITWISE AND SRC2[31:0]
ELSE
DEST[i+31:i] := SRC1[i+31:i] BITWISE AND SRC2[i+31:i]
FI;
ELSE
IF *merging-masking* ; merging-masking
THEN *DEST[i+31:i] remains unchanged*
ELSE ; zeroing-masking
DEST[i+31:i] := 0
FI;
FI;
ENDFOR
DEST[MAXVL-1:VL] := 0;
</pre>
<h3 id="vandps--vex-256-encoded-version-">VANDPS (VEX.256 Encoded Version)<a class="anchor" href="#vandps--vex-256-encoded-version-">
</a></h3>
<pre>DEST[31:0] := SRC1[31:0] BITWISE AND SRC2[31:0]
DEST[63:32] := SRC1[63:32] BITWISE AND SRC2[63:32]
DEST[95:64] := SRC1[95:64] BITWISE AND SRC2[95:64]
DEST[127:96] := SRC1[127:96] BITWISE AND SRC2[127:96]
DEST[159:128] := SRC1[159:128] BITWISE AND SRC2[159:128]
DEST[191:160] := SRC1[191:160] BITWISE AND SRC2[191:160]
DEST[223:192] := SRC1[223:192] BITWISE AND SRC2[223:192]
DEST[255:224] := SRC1[255:224] BITWISE AND SRC2[255:224].
DEST[MAXVL-1:256] := 0;
</pre>
<h3 id="vandps--vex-128-encoded-version-">VANDPS (VEX.128 Encoded Version)<a class="anchor" href="#vandps--vex-128-encoded-version-">
</a></h3>
<pre>DEST[31:0] := SRC1[31:0] BITWISE AND SRC2[31:0]
DEST[63:32] := SRC1[63:32] BITWISE AND SRC2[63:32]
DEST[95:64] := SRC1[95:64] BITWISE AND SRC2[95:64]
DEST[127:96] := SRC1[127:96] BITWISE AND SRC2[127:96]
DEST[MAXVL-1:128] := 0;
</pre>
<h3 id="andps--128-bit-legacy-sse-version-">ANDPS (128-bit Legacy SSE Version)<a class="anchor" href="#andps--128-bit-legacy-sse-version-">
</a></h3>
<pre>DEST[31:0] := DEST[31:0] BITWISE AND SRC[31:0]
DEST[63:32] := DEST[63:32] BITWISE AND SRC[63:32]
DEST[95:64] := DEST[95:64] BITWISE AND SRC[95:64]
DEST[127:96] := DEST[127:96] BITWISE AND SRC[127:96]
DEST[MAXVL-1:128] (Unmodified)
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>VANDPS __m512 _mm512_and_ps (__m512 a, __m512 b);
</pre>
<pre>VANDPS __m512 _mm512_mask_and_ps (__m512 s, __mmask16 k, __m512 a, __m512 b);
</pre>
<pre>VANDPS __m512 _mm512_maskz_and_ps (__mmask16 k, __m512 a, __m512 b);
</pre>
<pre>VANDPS __m256 _mm256_mask_and_ps (__m256 s, __mmask8 k, __m256 a, __m256 b);
</pre>
<pre>VANDPS __m256 _mm256_maskz_and_ps (__mmask8 k, __m256 a, __m256 b);
</pre>
<pre>VANDPS __m128 _mm_mask_and_ps (__m128 s, __mmask8 k, __m128 a, __m128 b);
</pre>
<pre>VANDPS __m128 _mm_maskz_and_ps (__mmask8 k, __m128 a, __m128 b);
</pre>
<pre>VANDPS __m256 _mm256_and_ps (__m256 a, __m256 b);
</pre>
<pre>ANDPS __m128 _mm_and_ps (__m128 a, __m128 b);
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>VEX-encoded instruction, see <span class="not-imported">Table 2-21</span>, “Type 4 Class Exception Conditions.”</p>
<p>EVEX-encoded instruction, see <span class="not-imported">Table 2-49</span>, “Type E4 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

116
x86/arpl.html Normal file
View file

@ -0,0 +1,116 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>ARPL
— Adjust RPL Field of Segment Selector</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>ARPL
— Adjust RPL Field of Segment Selector</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>63 /r</td>
<td>ARPL r/m16, r16</td>
<td>MR</td>
<td>N. E.</td>
<td>Valid</td>
<td>Adjust RPL of r/m16 to not less than RPL of r16.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>MR</td>
<td>ModRM:r/m (w)</td>
<td>ModRM:reg (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Compares the RPL fields of two segment selectors. The first operand (the destination operand) contains one segment selector and the second operand (source operand) contains the other. (The RPL field is located in bits 0 and 1 of each operand.) If the RPL field of the destination operand is less than the RPL field of the source operand, the ZF flag is set and the RPL field of the destination operand is increased to match that of the source operand. Otherwise, the ZF flag is cleared and no change is made to the destination operand. (The destination operand can be a word register or a memory location; the source operand must be a word register.)</p>
<p>The ARPL instruction is provided for use by operating-system procedures (however, it can also be used by applications). It is generally used to adjust the RPL of a segment selector that has been passed to the operating system by an application program to match the privilege level of the application program. Here the segment selector passed to the operating system is placed in the destination operand and segment selector for the application programs code segment is placed in the source operand. (The RPL field in the source operand represents the privilege level of the application program.) Execution of the ARPL instruction then ensures that the RPL of the segment selector received by the operating system is no lower (does not have a higher privilege) than the privilege level of the application program (the segment selector for the application programs code segment can be read from the stack following a procedure call).</p>
<p>This instruction executes as described in compatibility mode and legacy mode. It is not encodable in 64-bit mode.</p>
<p>See “Checking Caller Access Privileges” in Chapter 3, “Protected-Mode Memory Management,” of the Intel<sup>®</sup> 64 and IA-32 Architectures Software Developers Manual, Volume 3A, for more information about the use of this instruction.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>IF 64-BIT MODE
THEN
See MOVSXD;
ELSE
IF DEST[RPL] &lt; SRC[RPL]
THEN
ZF := 1;
DEST[RPL] := SRC[RPL];
ELSE
ZF := 0;
FI;
FI;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>The ZF flag is set to 1 if the RPL field of the destination operand is less than that of the source operand; otherwise, it is set to 0.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="3">#GP(0)</td>
<td>If the destination is located in a non-writable segment.</td></tr>
<tr>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>If the DS, ES, FS, or GS register is used to access memory and it contains a NULL segment selector.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="2">#UD</td>
<td>The ARPL instruction is not recognized in real-address mode.</td></tr>
<tr>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="2">#UD</td>
<td>The ARPL instruction is not recognized in virtual-8086 mode.</td></tr>
<tr>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<p>Not applicable.</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

81
x86/bextr.html Normal file
View file

@ -0,0 +1,81 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>BEXTR
— Bit Field Extract</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>BEXTR
— Bit Field Extract</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>VEX.LZ.0F38.W0 F7 /r BEXTR r32a, r/m32, r32b</td>
<td>RMV</td>
<td>V/V</td>
<td>BMI1</td>
<td>Contiguous bitwise extract from r/m32 using r32b as control; store result in r32a.</td></tr>
<tr>
<td>VEX.LZ.0F38.W1 F7 /r BEXTR r64a, r/m64, r64b</td>
<td>RMV</td>
<td>V/N.E.</td>
<td>BMI1</td>
<td>Contiguous bitwise extract from r/m64 using r64b as control; store result in r64a.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RMV</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>VEX.vvvv (r)</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Extracts contiguous bits from the first source operand (the second operand) using an index value and length value specified in the second source operand (the third operand). Bit 7:0 of the second source operand specifies the starting bit position of bit extraction. A START value exceeding the operand size will not extract any bits from the second source operand. Bit 15:8 of the second source operand specifies the maximum number of bits (LENGTH) beginning at the START position to extract. Only bit positions up to (OperandSize -1) of the first source operand are extracted. The extracted bits are written to the destination register, starting from the least significant bit. All higher order bits in the destination operand (starting at bit position LENGTH) are zeroed. The destination register is cleared if no bits are extracted.</p>
<p>This instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An attempt to execute this instruction with VEX.L not equal to 0 will cause #UD.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>START := SRC2[7:0];
LEN := SRC2[15:8];
TEMP := ZERO_EXTEND_TO_512 (SRC1 );
DEST := ZERO_EXTEND(TEMP[START+LEN -1: START]);
ZF := (DEST = 0);
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>ZF is updated based on the result. AF, SF, and PF are undefined. All other flags are cleared.</p>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>BEXTR unsigned __int32 _bextr_u32(unsigned __int32 src, unsigned __int32 start. unsigned __int32 len);
</pre>
<pre>BEXTR unsigned __int64 _bextr_u64(unsigned __int64 src, unsigned __int32 start. unsigned __int32 len);
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>See <span class="not-imported">Table 2-29</span>, “Type 13 Class Exception Conditions,” additionally:</p>
<table>
<tr>
<td>#UD</td>
<td>If VEX.W = 1.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

111
x86/blendpd.html Normal file
View file

@ -0,0 +1,111 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>BLENDPD
— Blend Packed Double Precision Floating-Point Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>BLENDPD
— Blend Packed Double Precision Floating-Point Values</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>66 0F 3A 0D /r ib BLENDPD xmm1, xmm2/m128, imm8</td>
<td>RMI</td>
<td>V/V</td>
<td>SSE4_1</td>
<td>Select packed double precision floating-point values from xmm1 and xmm2/m128 from mask specified in imm8 and store the values into xmm1.</td></tr>
<tr>
<td>VEX.128.66.0F3A.WIG 0D /r ib VBLENDPD xmm1, xmm2, xmm3/m128, imm8</td>
<td>RVMI</td>
<td>V/V</td>
<td>AVX</td>
<td>Select packed double precision floating-point Values from xmm2 and xmm3/m128 from mask in imm8 and store the values in xmm1.</td></tr>
<tr>
<td>VEX.256.66.0F3A.WIG 0D /r ib VBLENDPD ymm1, ymm2, ymm3/m256, imm8</td>
<td>RVMI</td>
<td>V/V</td>
<td>AVX</td>
<td>Select packed double precision floating-point Values from ymm2 and ymm3/m256 from mask in imm8 and store the values in ymm1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RMI</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>imm8</td>
<td>N/A</td></tr>
<tr>
<td>RVMI</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>imm8[3:0]</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Double-precision floating-point values from the second source operand (third operand) are conditionally merged with values from the first source operand (second operand) and written to the destination operand (first operand). The immediate bits [3:0] determine whether the corresponding double precision floating-point value in the destination is copied from the second source or first source. If a bit in the mask, corresponding to a word, is ”1”, then the double precision floating-point value in the second source operand is copied, else the value in the first source operand is copied.</p>
<p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding YMM register destination are unmodified.</p>
<p>VEX.128 encoded version: the first source operand is an XMM register. The second source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding YMM register destination are zeroed.</p>
<p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="blendpd--128-bit-legacy-sse-version-">BLENDPD (128-bit Legacy SSE Version)<a class="anchor" href="#blendpd--128-bit-legacy-sse-version-">
</a></h3>
<pre>IF (IMM8[0] = 0)THEN DEST[63:0] := DEST[63:0]
ELSE DEST [63:0] := SRC[63:0] FI
IF (IMM8[1] = 0) THEN DEST[127:64] := DEST[127:64]
ELSE DEST [127:64] := SRC[127:64] FI
DEST[MAXVL-1:128] (Unmodified)
</pre>
<h3 id="vblendpd--vex-128-encoded-version-">VBLENDPD (VEX.128 Encoded Version)<a class="anchor" href="#vblendpd--vex-128-encoded-version-">
</a></h3>
<pre>IF (IMM8[0] = 0)THEN DEST[63:0] := SRC1[63:0]
ELSE DEST [63:0] := SRC2[63:0] FI
IF (IMM8[1] = 0) THEN DEST[127:64] := SRC1[127:64]
ELSE DEST [127:64] := SRC2[127:64] FI
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="vblendpd--vex-256-encoded-version-">VBLENDPD (VEX.256 Encoded Version)<a class="anchor" href="#vblendpd--vex-256-encoded-version-">
</a></h3>
<pre>IF (IMM8[0] = 0)THEN DEST[63:0] := SRC1[63:0]
ELSE DEST [63:0] := SRC2[63:0] FI
IF (IMM8[1] = 0) THEN DEST[127:64] := SRC1[127:64]
ELSE DEST [127:64] := SRC2[127:64] FI
IF (IMM8[2] = 0) THEN DEST[191:128] := SRC1[191:128]
ELSE DEST [191:128] := SRC2[191:128] FI
IF (IMM8[3] = 0) THEN DEST[255:192] := SRC1[255:192]
ELSE DEST [255:192] := SRC2[255:192] FI
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>BLENDPD __m128d _mm_blend_pd (__m128d v1, __m128d v2, const int mask);
</pre>
<pre>VBLENDPD __m256d _mm256_blend_pd (__m256d a, __m256d b, const int mask);
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>See <span class="not-imported">Table 2-21</span>, “Type 4 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

127
x86/blendps.html Normal file
View file

@ -0,0 +1,127 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>BLENDPS
— Blend Packed Single Precision Floating-Point Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>BLENDPS
— Blend Packed Single Precision Floating-Point Values</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>66 0F 3A 0C /r ib BLENDPS xmm1, xmm2/m128, imm8</td>
<td>RMI</td>
<td>V/V</td>
<td>SSE4_1</td>
<td>Select packed single precision floating-point values from xmm1 and xmm2/m128 from mask specified in imm8 and store the values into xmm1.</td></tr>
<tr>
<td>VEX.128.66.0F3A.WIG 0C /r ib VBLENDPS xmm1, xmm2, xmm3/m128, imm8</td>
<td>RVMI</td>
<td>V/V</td>
<td>AVX</td>
<td>Select packed single precision floating-point values from xmm2 and xmm3/m128 from mask in imm8 and store the values in xmm1.</td></tr>
<tr>
<td>VEX.256.66.0F3A.WIG 0C /r ib VBLENDPS ymm1, ymm2, ymm3/m256, imm8</td>
<td>RVMI</td>
<td>V/V</td>
<td>AVX</td>
<td>Select packed single precision floating-point values from ymm2 and ymm3/m256 from mask in imm8 and store the values in ymm1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RMI</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>imm8</td>
<td>N/A</td></tr>
<tr>
<td>RVMI</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>imm8</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Packed single precision floating-point values from the second source operand (third operand) are conditionally merged with values from the first source operand (second operand) and written to the destination operand (first operand). The immediate bits [7:0] determine whether the corresponding single precision floating-point value in the destination is copied from the second source or first source. If a bit in the mask, corresponding to a word, is “1”, then the single precision floating-point value in the second source operand is copied, else the value in the first source operand is copied.</p>
<p>128-bit Legacy SSE version: The second source can be an XMM register or an 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (MAXVL-1:128) of the corresponding YMM register destination are unmodified.</p>
<p>VEX.128 encoded version: The first source operand an XMM register. The second source operand is an XMM register or 128-bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding YMM register destination are zeroed.</p>
<p>VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="blendps--128-bit-legacy-sse-version-">BLENDPS (128-bit Legacy SSE Version)<a class="anchor" href="#blendps--128-bit-legacy-sse-version-">
</a></h3>
<pre>IF (IMM8[0] = 0) THEN DEST[31:0] :=DEST[31:0]
ELSE DEST [31:0] := SRC[31:0] FI
IF (IMM8[1] = 0) THEN DEST[63:32] := DEST[63:32]
ELSE DEST [63:32] := SRC[63:32] FI
IF (IMM8[2] = 0) THEN DEST[95:64] := DEST[95:64]
ELSE DEST [95:64] := SRC[95:64] FI
IF (IMM8[3] = 0) THEN DEST[127:96] := DEST[127:96]
ELSE DEST [127:96] := SRC[127:96] FI
DEST[MAXVL-1:128] (Unmodified)
</pre>
<h3 id="vblendps--vex-128-encoded-version-">VBLENDPS (VEX.128 Encoded Version)<a class="anchor" href="#vblendps--vex-128-encoded-version-">
</a></h3>
<pre>IF (IMM8[0] = 0) THEN DEST[31:0] :=SRC1[31:0]
ELSE DEST [31:0] := SRC2[31:0] FI
IF (IMM8[1] = 0) THEN DEST[63:32] := SRC1[63:32]
ELSE DEST [63:32] := SRC2[63:32] FI
IF (IMM8[2] = 0) THEN DEST[95:64] := SRC1[95:64]
ELSE DEST [95:64] := SRC2[95:64] FI
IF (IMM8[3] = 0) THEN DEST[127:96] := SRC1[127:96]
ELSE DEST [127:96] := SRC2[127:96] FI
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="vblendps--vex-256-encoded-version-">VBLENDPS (VEX.256 Encoded Version)<a class="anchor" href="#vblendps--vex-256-encoded-version-">
</a></h3>
<pre>IF (IMM8[0] = 0) THEN DEST[31:0] :=SRC1[31:0]
ELSE DEST [31:0] := SRC2[31:0] FI
IF (IMM8[1] = 0) THEN DEST[63:32] := SRC1[63:32]
ELSE DEST [63:32] := SRC2[63:32] FI
IF (IMM8[2] = 0) THEN DEST[95:64] := SRC1[95:64]
ELSE DEST [95:64] := SRC2[95:64] FI
IF (IMM8[3] = 0) THEN DEST[127:96] := SRC1[127:96]
ELSE DEST [127:96] := SRC2[127:96] FI
IF (IMM8[4] = 0) THEN DEST[159:128] := SRC1[159:128]
ELSE DEST [159:128] := SRC2[159:128] FI
IF (IMM8[5] = 0) THEN DEST[191:160] := SRC1[191:160]
ELSE DEST [191:160] := SRC2[191:160] FI
IF (IMM8[6] = 0) THEN DEST[223:192] := SRC1[223:192]
ELSE DEST [223:192] := SRC2[223:192] FI
IF (IMM8[7] = 0) THEN DEST[255:224] := SRC1[255:224]
ELSE DEST [255:224] := SRC2[255:224] FI.
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>BLENDPS __m128 _mm_blend_ps (__m128 v1, __m128 v2, const int mask);
</pre>
<pre>VBLENDPS __m256 _mm256_blend_ps (__m256 a, __m256 b, const int mask);
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>See <span class="not-imported">Table 2-21</span>, “Type 4 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

126
x86/blendvpd.html Normal file
View file

@ -0,0 +1,126 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>BLENDVPD
— Variable Blend Packed Double Precision Floating-Point Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>BLENDVPD
— Variable Blend Packed Double Precision Floating-Point Values</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>66 0F 38 15 /r BLENDVPD xmm1, xmm2/m128 , &lt;XMM0&gt;</td>
<td>RM0</td>
<td>V/V</td>
<td>SSE4_1</td>
<td>Select packed double precision floating-point values from xmm1 and xmm2 from mask specified in XMM0 and store the values in xmm1.</td></tr>
<tr>
<td>VEX.128.66.0F3A.W0 4B /r /is4 VBLENDVPD xmm1, xmm2, xmm3/m128, xmm4</td>
<td>RVMR</td>
<td>V/V</td>
<td>AVX</td>
<td>Conditionally copy double precision floating-point values from xmm2 or xmm3/m128 to xmm1, based on mask bits in the mask operand, xmm4.</td></tr>
<tr>
<td>VEX.256.66.0F3A.W0 4B /r /is4 VBLENDVPD ymm1, ymm2, ymm3/m256, ymm4</td>
<td>RVMR</td>
<td>V/V</td>
<td>AVX</td>
<td>Conditionally copy double precision floating-point values from ymm2 or ymm3/m256 to ymm1, based on mask bits in the mask operand, ymm4.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RM0</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>implicit XMM0</td>
<td>N/A</td></tr>
<tr>
<td>RVMR</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>imm8[7:4]</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Conditionally copy each quadword data element of double precision floating-point value from the second source operand and the first source operand depending on mask bits defined in the mask register operand. The mask bits are the most significant bit in each quadword element of the mask register.</p>
<p>Each quadword element of the destination operand is copied from:</p>
<ul>
<li>the corresponding quadword element in the second source operand, if a mask bit is “1”; or</li>
<li>the corresponding quadword element in the first source operand, if a mask bit is “0”</li></ul>
<p>The register assignment of the implicit mask operand for BLENDVPD is defined to be the architectural register XMM0.</p>
<p>128-bit Legacy SSE version: The first source operand and the destination operand is the same. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged. The mask register operand is implicitly defined to be the architectural register XMM0. An attempt to execute BLENDVPD with a VEX prefix will cause #UD.</p>
<p>VEX.128 encoded version: The first source operand and the destination operand are XMM registers. The second source operand is an XMM register or 128-bit memory location. The mask operand is the third source register, and encoded in bits[7:4] of the immediate byte(imm8). The bits[3:0] of imm8 are ignored. In 32-bit mode, imm8[7] is ignored. The upper bits (MAXVL-1:128) of the corresponding YMM register (destination register) are zeroed. VEX.W must be 0, otherwise, the instruction will #UD.</p>
<p>VEX.256 encoded version: The first source operand and destination operand are YMM registers. The second source operand can be a YMM register or a 256-bit memory location. The mask operand is the third source register, and encoded in bits[7:4] of the immediate byte(imm8). The bits[3:0] of imm8 are ignored. In 32-bit mode, imm8[7] is ignored. VEX.W must be 0, otherwise, the instruction will #UD.</p>
<p>VBLENDVPD permits the mask to be any XMM or YMM register. In contrast, BLENDVPD treats XMM0 implicitly as the mask and do not support non-destructive destination operation.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="blendvpd--128-bit-legacy-sse-version-">BLENDVPD (128-bit Legacy SSE Version)<a class="anchor" href="#blendvpd--128-bit-legacy-sse-version-">
</a></h3>
<pre>MASK := XMM0
IF (MASK[63] = 0) THEN DEST[63:0] := DEST[63:0]
ELSE DEST [63:0] := SRC[63:0] FI
IF (MASK[127] = 0) THEN DEST[127:64] := DEST[127:64]
ELSE DEST [127:64] := SRC[127:64] FI
DEST[MAXVL-1:128] (Unmodified)
</pre>
<h3 id="vblendvpd--vex-128-encoded-version-">VBLENDVPD (VEX.128 Encoded Version)<a class="anchor" href="#vblendvpd--vex-128-encoded-version-">
</a></h3>
<pre>MASK := SRC3
IF (MASK[63] = 0) THEN DEST[63:0] := SRC1[63:0]
ELSE DEST [63:0] := SRC2[63:0] FI
IF (MASK[127] = 0) THEN DEST[127:64] := SRC1[127:64]
ELSE DEST [127:64] := SRC2[127:64] FI
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="vblendvpd--vex-256-encoded-version-">VBLENDVPD (VEX.256 Encoded Version)<a class="anchor" href="#vblendvpd--vex-256-encoded-version-">
</a></h3>
<pre>MASK := SRC3
IF (MASK[63] = 0) THEN DEST[63:0] := SRC1[63:0]
ELSE DEST [63:0] := SRC2[63:0] FI
IF (MASK[127] = 0) THEN DEST[127:64] := SRC1[127:64]
ELSE DEST [127:64] := SRC2[127:64] FI
IF (MASK[191] = 0) THEN DEST[191:128] := SRC1[191:128]
ELSE DEST [191:128] := SRC2[191:128] FI
IF (MASK[255] = 0) THEN DEST[255:192] := SRC1[255:192]
ELSE DEST [255:192] := SRC2[255:192] FI
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>BLENDVPD __m128d _mm_blendv_pd(__m128d v1, __m128d v2, __m128d v3);
</pre>
<pre>VBLENDVPD __m128 _mm_blendv_pd (__m128d a, __m128d b, __m128d mask);
</pre>
<pre>VBLENDVPD __m256 _mm256_blendv_pd (__m256d a, __m256d b, __m256d mask);
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>See <span class="not-imported">Table 2-21</span>, “Type 4 Class Exception Conditions,” additionally:</p>
<table>
<tr>
<td>#UD</td>
<td>If VEX.W = 1.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

142
x86/blendvps.html Normal file
View file

@ -0,0 +1,142 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>BLENDVPS
— Variable Blend Packed Single Precision Floating-Point Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>BLENDVPS
— Variable Blend Packed Single Precision Floating-Point Values</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>66 0F 38 14 /r BLENDVPS xmm1, xmm2/m128, &lt;XMM0&gt;</td>
<td>RM0</td>
<td>V/V</td>
<td>SSE4_1</td>
<td>Select packed single precision floating-point values from xmm1 and xmm2/m128 from mask specified in XMM0 and store the values into xmm1.</td></tr>
<tr>
<td>VEX.128.66.0F3A.W0 4A /r /is4 VBLENDVPS xmm1, xmm2, xmm3/m128, xmm4</td>
<td>RVMR</td>
<td>V/V</td>
<td>AVX</td>
<td>Conditionally copy single precision floating-point values from xmm2 or xmm3/m128 to xmm1, based on mask bits in the specified mask operand, xmm4.</td></tr>
<tr>
<td>VEX.256.66.0F3A.W0 4A /r /is4 VBLENDVPS ymm1, ymm2, ymm3/m256, ymm4</td>
<td>RVMR</td>
<td>V/V</td>
<td>AVX</td>
<td>Conditionally copy single precision floating-point values from ymm2 or ymm3/m256 to ymm1, based on mask bits in the specified mask register, ymm4.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RM0</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>implicit XMM0</td>
<td>N/A</td></tr>
<tr>
<td>RVMR</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>imm8[7:4]</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Conditionally copy each dword data element of single precision floating-point value from the second source operand and the first source operand depending on mask bits defined in the mask register operand. The mask bits are the most significant bit in each dword element of the mask register.</p>
<p>Each quadword element of the destination operand is copied from:</p>
<ul>
<li>the corresponding dword element in the second source operand, if a mask bit is “1”; or</li>
<li>the corresponding dword element in the first source operand, if a mask bit is “0”.</li></ul>
<p>The register assignment of the implicit mask operand for BLENDVPS is defined to be the architectural register XMM0.</p>
<p>128-bit Legacy SSE version: The first source operand and the destination operand is the same. Bits (MAXVL-1:128) of the corresponding YMM destination register remain unchanged. The mask register operand is implicitly defined to be the architectural register XMM0. An attempt to execute BLENDVPS with a VEX prefix will cause #UD.</p>
<p>VEX.128 encoded version: The first source operand and the destination operand are XMM registers. The second source operand is an XMM register or 128-bit memory location. The mask operand is the third source register, and encoded in bits[7:4] of the immediate byte(imm8). The bits[3:0] of imm8 are ignored. In 32-bit mode, imm8[7] is ignored. The upper bits (MAXVL-1:128) of the corresponding YMM register (destination register) are zeroed. VEX.W must be 0, otherwise, the instruction will #UD.</p>
<p>VEX.256 encoded version: The first source operand and destination operand are YMM registers. The second source operand can be a YMM register or a 256-bit memory location. The mask operand is the third source register, and encoded in bits[7:4] of the immediate byte(imm8). The bits[3:0] of imm8 are ignored. In 32-bit mode, imm8[7] is ignored. VEX.W must be 0, otherwise, the instruction will #UD.</p>
<p>VBLENDVPS permits the mask to be any XMM or YMM register. In contrast, BLENDVPS treats XMM0 implicitly as the mask and do not support non-destructive destination operation.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="blendvps--128-bit-legacy-sse-version-">BLENDVPS (128-bit Legacy SSE Version)<a class="anchor" href="#blendvps--128-bit-legacy-sse-version-">
</a></h3>
<pre>MASK := XMM0
IF (MASK[31] = 0) THEN DEST[31:0] := DEST[31:0]
ELSE DEST [31:0] := SRC[31:0] FI
IF (MASK[63] = 0) THEN DEST[63:32] := DEST[63:32]
ELSE DEST [63:32] := SRC[63:32] FI
IF (MASK[95] = 0) THEN DEST[95:64] := DEST[95:64]
ELSE DEST [95:64] := SRC[95:64] FI
IF (MASK[127] = 0) THEN DEST[127:96] := DEST[127:96]
ELSE DEST [127:96] := SRC[127:96] FI
DEST[MAXVL-1:128] (Unmodified)
</pre>
<h3 id="vblendvps--vex-128-encoded-version-">VBLENDVPS (VEX.128 Encoded Version)<a class="anchor" href="#vblendvps--vex-128-encoded-version-">
</a></h3>
<pre>MASK := SRC3
IF (MASK[31] = 0) THEN DEST[31:0] := SRC1[31:0]
ELSE DEST [31:0] := SRC2[31:0] FI
IF (MASK[63] = 0) THEN DEST[63:32] := SRC1[63:32]
ELSE DEST [63:32] := SRC2[63:32] FI
IF (MASK[95] = 0) THEN DEST[95:64] := SRC1[95:64]
ELSE DEST [95:64] := SRC2[95:64] FI
IF (MASK[127] = 0) THEN DEST[127:96] := SRC1[127:96]
ELSE DEST [127:96] := SRC2[127:96] FI
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="vblendvps--vex-256-encoded-version-">VBLENDVPS (VEX.256 Encoded Version)<a class="anchor" href="#vblendvps--vex-256-encoded-version-">
</a></h3>
<pre>MASK := SRC3
IF (MASK[31] = 0) THEN DEST[31:0] := SRC1[31:0]
ELSE DEST [31:0] := SRC2[31:0] FI
IF (MASK[63] = 0) THEN DEST[63:32] := SRC1[63:32]
ELSE DEST [63:32] := SRC2[63:32] FI
IF (MASK[95] = 0) THEN DEST[95:64] := SRC1[95:64]
ELSE DEST [95:64] := SRC2[95:64] FI
IF (MASK[127] = 0) THEN DEST[127:96] := SRC1[127:96]
ELSE DEST [127:96] := SRC2[127:96] FI
IF (MASK[159] = 0) THEN DEST[159:128] := SRC1[159:128]
ELSE DEST [159:128] := SRC2[159:128] FI
IF (MASK[191] = 0) THEN DEST[191:160] := SRC1[191:160]
ELSE DEST [191:160] := SRC2[191:160] FI
IF (MASK[223] = 0) THEN DEST[223:192] := SRC1[223:192]
ELSE DEST [223:192] := SRC2[223:192] FI
IF (MASK[255] = 0) THEN DEST[255:224] := SRC1[255:224]
ELSE DEST [255:224] := SRC2[255:224] FI
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>BLENDVPS __m128 _mm_blendv_ps(__m128 v1, __m128 v2, __m128 v3);
</pre>
<pre>VBLENDVPS __m128 _mm_blendv_ps (__m128 a, __m128 b, __m128 mask);
</pre>
<pre>VBLENDVPS __m256 _mm256_blendv_ps (__m256 a, __m256 b, __m256 mask);
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>See <span class="not-imported">Table 2-21</span>, “Type 4 Class Exception Conditions,” additionally:</p>
<table>
<tr>
<td>#UD</td>
<td>If VEX.W = 1.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

81
x86/blsi.html Normal file
View file

@ -0,0 +1,81 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>BLSI
— Extract Lowest Set Isolated Bit</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>BLSI
— Extract Lowest Set Isolated Bit</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>VEX.LZ.0F38.W0 F3 /3 BLSI r32, r/m32</td>
<td>VM</td>
<td>V/V</td>
<td>BMI1</td>
<td>Extract lowest set bit from r/m32 and set that bit in r32.</td></tr>
<tr>
<td>VEX.LZ.0F38.W1 F3 /3 BLSI r64, r/m64</td>
<td>VM</td>
<td>V/N.E.</td>
<td>BMI1</td>
<td>Extract lowest set bit from r/m64, and set that bit in r64.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>VM</td>
<td>VEX.vvvv (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Extracts the lowest set bit from the source operand and set the corresponding bit in the destination register. All other bits in the destination operand are zeroed. If no bits are set in the source operand, BLSI sets all the bits in the destination to 0 and sets ZF and CF.</p>
<p>This instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An attempt to execute this instruction with VEX.L not equal to 0 will cause #UD.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>temp := (-SRC) bitwiseAND (SRC);
SF := temp[OperandSize -1];
ZF := (temp = 0);
IF SRC = 0
CF := 0;
ELSE
CF := 1;
FI
DEST := temp;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>ZF and SF are updated based on the result. CF is set if the source is not zero. OF flags are cleared. AF and PF flags are undefined.</p>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>BLSI unsigned __int32 _blsi_u32(unsigned __int32 src);
</pre>
<pre>BLSI unsigned __int64 _blsi_u64(unsigned __int64 src);
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>See <span class="not-imported">Table 2-29</span>, “Type 13 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

81
x86/blsmsk.html Normal file
View file

@ -0,0 +1,81 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>BLSMSK
— Get Mask Up to Lowest Set Bit</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>BLSMSK
— Get Mask Up to Lowest Set Bit</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>VEX.LZ.0F38.W0 F3 /2 BLSMSK r32, r/m32</td>
<td>VM</td>
<td>V/V</td>
<td>BMI1</td>
<td>Set all lower bits in r32 to “1” starting from bit 0 to lowest set bit in r/m32.</td></tr>
<tr>
<td>VEX.LZ.0F38.W1 F3 /2 BLSMSK r64, r/m64</td>
<td>VM</td>
<td>V/N.E.</td>
<td>BMI1</td>
<td>Set all lower bits in r64 to “1” starting from bit 0 to lowest set bit in r/m64.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>VM</td>
<td>VEX.vvvv (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Sets all the lower bits of the destination operand to “1” up to and including lowest set bit (=1) in the source operand. If source operand is zero, BLSMSK sets all bits of the destination operand to 1 and also sets CF to 1.</p>
<p>This instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An attempt to execute this instruction with VEX.L not equal to 0 will cause #UD.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>temp := (SRC-1) XOR (SRC) ;
SF := temp[OperandSize -1];
ZF := 0;
IF SRC = 0
CF := 1;
ELSE
CF := 0;
FI
DEST := temp;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>SF is updated based on the result. CF is set if the source if zero. ZF and OF flags are cleared. AF and PF flag are undefined.</p>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>BLSMSK unsigned __int32 _blsmsk_u32(unsigned __int32 src);
</pre>
<pre>BLSMSK unsigned __int64 _blsmsk_u64(unsigned __int64 src);
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>See <span class="not-imported">Table 2-29</span>, “Type 13 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

81
x86/blsr.html Normal file
View file

@ -0,0 +1,81 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>BLSR
— Reset Lowest Set Bit</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>BLSR
— Reset Lowest Set Bit</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>VEX.LZ.0F38.W0 F3 /1 BLSR r32, r/m32</td>
<td>VM</td>
<td>V/V</td>
<td>BMI1</td>
<td>Reset lowest set bit of r/m32, keep all other bits of r/m32 and write result to r32.</td></tr>
<tr>
<td>VEX.LZ.0F38.W1 F3 /1 BLSR r64, r/m64</td>
<td>VM</td>
<td>V/N.E.</td>
<td>BMI1</td>
<td>Reset lowest set bit of r/m64, keep all other bits of r/m64 and write result to r64.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>VM</td>
<td>VEX.vvvv (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Copies all bits from the source operand to the destination operand and resets (=0) the bit position in the destination operand that corresponds to the lowest set bit of the source operand. If the source operand is zero BLSR sets CF.</p>
<p>This instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An attempt to execute this instruction with VEX.L not equal to 0 will cause #UD.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>temp := (SRC-1) bitwiseAND ( SRC );
SF := temp[OperandSize -1];
ZF := (temp = 0);
IF SRC = 0
CF := 1;
ELSE
CF := 0;
FI
DEST := temp;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>ZF and SF flags are updated based on the result. CF is set if the source is zero. OF flag is cleared. AF and PF flags are undefined.</p>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>BLSR unsigned __int32 _blsr_u32(unsigned __int32 src);
</pre>
<pre>BLSR unsigned __int64 _blsr_u64(unsigned __int64 src);
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>See <span class="not-imported">Table 2-29</span>, “Type 13 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

132
x86/bndcl.html Normal file
View file

@ -0,0 +1,132 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>BNDCL
— Check Lower Bound</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>BNDCL
— Check Lower Bound</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>F3 0F 1A /r BNDCL bnd, r/m32</td>
<td>RM</td>
<td>N.E./V</td>
<td>MPX</td>
<td>Generate a #BR if the address in r/m32 is lower than the lower bound in bnd.LB.</td></tr>
<tr>
<td>F3 0F 1A /r BNDCL bnd, r/m64</td>
<td>RM</td>
<td>V/N.E.</td>
<td>MPX</td>
<td>Generate a #BR if the address in r/m64 is lower than the lower bound in bnd.LB.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th></tr>
<tr>
<td>RM</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Compare the address in the second operand with the lower bound in bnd. The second operand can be either a register or memory operand. If the address is lower than the lower bound in bnd.LB, it will set BNDSTATUS to 01H and signal a #BR exception.</p>
<p>This instruction does not cause any memory access, and does not read or write any flags.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="bndcl-bnd--reg">BNDCL BND, reg<a class="anchor" href="#bndcl-bnd--reg">
</a></h3>
<pre>IF reg &lt; BND.LB Then
BNDSTATUS := 01H;
#BR;
FI;
</pre>
<h3 id="bndcl-bnd--mem">BNDCL BND, mem<a class="anchor" href="#bndcl-bnd--mem">
</a></h3>
<pre>TEMP := LEA(mem);
IF TEMP &lt; BND.LB Then
BNDSTATUS := 01H;
#BR;
FI;
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>BNDCL void _bnd_chk_ptr_lbounds(const void *q)
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>None</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#BR</td>
<td>If lower bound check fails.</td></tr>
<tr>
<td rowspan="4">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.</td></tr>
<tr>
<td>If 67H prefix is not used and CS.D=0.</td></tr>
<tr>
<td>If 67H prefix is used and CS.D=1.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#BR</td>
<td>If lower bound check fails.</td></tr>
<tr>
<td rowspan="3">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.</td></tr>
<tr>
<td>If 16-bit addressing is used.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#BR</td>
<td>If lower bound check fails.</td></tr>
<tr>
<td rowspan="3">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.</td></tr>
<tr>
<td>If 16-bit addressing is used.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>If ModRM.r/m and REX encodes BND4-BND15 when Intel MPX is enabled.</td></tr></table>
<p>Same exceptions as in protected mode.</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

164
x86/bndcu.bndcn.html Normal file
View file

@ -0,0 +1,164 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>BNDCU/BNDCN
— Check Upper Bound</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>BNDCU/BNDCN
— Check Upper Bound</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>F2 0F 1A /r BNDCU bnd, r/m32</td>
<td>RM</td>
<td>N.E./V</td>
<td>MPX</td>
<td>Generate a #BR if the address in r/m32 is higher than the upper bound in bnd.UB (bnb.UB in 1's complement form).</td></tr>
<tr>
<td>F2 0F 1A /r BNDCU bnd, r/m64</td>
<td>RM</td>
<td>V/N.E.</td>
<td>MPX</td>
<td>Generate a #BR if the address in r/m64 is higher than the upper bound in bnd.UB (bnb.UB in 1's complement form).</td></tr>
<tr>
<td>F2 0F 1B /r BNDCN bnd, r/m32</td>
<td>RM</td>
<td>N.E./V</td>
<td>MPX</td>
<td>Generate a #BR if the address in r/m32 is higher than the upper bound in bnd.UB (bnb.UB not in 1's complement form).</td></tr>
<tr>
<td>F2 0F 1B /r BNDCN bnd, r/m64</td>
<td>RM</td>
<td>V/N.E.</td>
<td>MPX</td>
<td>Generate a #BR if the address in r/m64 is higher than the upper bound in bnd.UB (bnb.UB not in 1's complement form).</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th></tr>
<tr>
<td>RM</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Compare the address in the second operand with the upper bound in bnd. The second operand can be either a register or a memory operand. If the address is higher than the upper bound in bnd.UB, it will set BNDSTATUS to 01H and signal a #BR exception.</p>
<p>BNDCU perform 1s complement operation on the upper bound of bnd first before proceeding with address comparison. BNDCN perform address comparison directly using the upper bound in bnd that is already reverted out of 1s complement form.</p>
<p>This instruction does not cause any memory access, and does not read or write any flags.</p>
<p>Effective address computation of m32/64 has identical behavior to LEA</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="bndcu-bnd--reg">BNDCU BND, reg<a class="anchor" href="#bndcu-bnd--reg">
</a></h3>
<pre>IF reg &gt; NOT(BND.UB) Then
BNDSTATUS := 01H;
#BR;
FI;
</pre>
<h3 id="bndcu-bnd--mem">BNDCU BND, mem<a class="anchor" href="#bndcu-bnd--mem">
</a></h3>
<pre>TEMP := LEA(mem);
IF TEMP &gt; NOT(BND.UB) Then
BNDSTATUS := 01H;
#BR;
FI;
</pre>
<h3 id="bndcn-bnd--reg">BNDCN BND, reg<a class="anchor" href="#bndcn-bnd--reg">
</a></h3>
<pre>IF reg &gt; BND.UB Then
BNDSTATUS := 01H;
#BR;
FI;
</pre>
<h3 id="bndcn-bnd--mem">BNDCN BND, mem<a class="anchor" href="#bndcn-bnd--mem">
</a></h3>
<pre>TEMP := LEA(mem);
IF TEMP &gt; BND.UB Then
BNDSTATUS := 01H;
#BR;
FI;
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>BNDCU .void _bnd_chk_ptr_ubounds(const void *q)
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>None</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#BR</td>
<td>If upper bound check fails.</td></tr>
<tr>
<td rowspan="4">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.</td></tr>
<tr>
<td>If 67H prefix is not used and CS.D=0.</td></tr>
<tr>
<td>If 67H prefix is used and CS.D=1.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#BR</td>
<td>If upper bound check fails.</td></tr>
<tr>
<td rowspan="3">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.</td></tr>
<tr>
<td>If 16-bit addressing is used.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#BR</td>
<td>If upper bound check fails.</td></tr>
<tr>
<td rowspan="3">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.</td></tr>
<tr>
<td>If 16-bit addressing is used.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>If ModRM.r/m and REX encodes BND4-BND15 when Intel MPX is enabled.</td></tr></table>
<p>Same exceptions as in protected mode.</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

185
x86/bndldx.html Normal file
View file

@ -0,0 +1,185 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>BNDLDX
— Load Extended Bounds Using Address Translation</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>BNDLDX
— Load Extended Bounds Using Address Translation</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>NP 0F 1A /r BNDLDX bnd, mib</td>
<td>RM</td>
<td>V/V</td>
<td>MPX</td>
<td>Load the bounds stored in a bound table entry (BTE) into bnd with address translation using the base of mib and conditional on the index of mib matching the pointer value in the BTE.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th></tr>
<tr>
<td>RM</td>
<td>ModRM:reg (w)</td>
<td>SIB.base (r): Address of pointer SIB.index(r)</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>BNDLDX uses the linear address constructed from the base register and displacement of the SIB-addressing form of the memory operand (mib) to perform address translation to access a bound table entry and conditionally load the bounds in the BTE to the destination. The destination register is updated with the bounds in the BTE, if the content of the index register of mib matches the pointer value stored in the BTE.</p>
<p>If the pointer value comparison fails, the destination is updated with INIT bounds (lb = 0x0, ub = 0x0) (note: as articulated earlier, the upper bound is represented using 1's complement, therefore, the 0x0 value of upper bound allows for access to full memory).</p>
<p>This instruction does not cause memory access to the linear address of mib nor the effective address referenced by the base, and does not read or write any flags.</p>
<p>Segment overrides apply to the linear address computation with the base of mib, and are used during address translation to generate the address of the bound table entry. By default, the address of the BTE is assumed to be linear address. There are no segmentation checks performed on the base of mib.</p>
<p>The base of mib will not be checked for canonical address violation as it does not access memory.</p>
<p>Any encoding of this instruction that does not specify base or index register will treat those registers as zero (constant). The reg-reg form of this instruction will remain a NOP.</p>
<p>The scale field of the SIB byte has no effect on these instructions and is ignored.</p>
<p>The bound register may be partially updated on memory faults. The order in which memory operands are loaded is implementation specific.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>base := mib.SIB.base ? mib.SIB.base + Disp: 0;
ptr_value := mib.SIB.index ? mib.SIB.index : 0;
</pre>
<h3 id="outside-64-bit-mode">Outside 64-bit Mode<a class="anchor" href="#outside-64-bit-mode">
</a></h3>
<pre>A_BDE[31:0] := (Zero_extend32(base[31:12] « 2) + (BNDCFG[31:12] «12 );
A_BT[31:0] := LoadFrom(A_BDE );
IF A_BT[0] equal 0 Then
BNDSTATUS := A_BDE | 02H;
#BR;
FI;
A_BTE[31:0] := (Zero_extend32(base[11:2] « 4) + (A_BT[31:2] « 2 );
Temp_lb[31:0] := LoadFrom(A_BTE);
Temp_ub[31:0] := LoadFrom(A_BTE + 4);
Temp_ptr[31:0] := LoadFrom(A_BTE + 8);
IF Temp_ptr equal ptr_value Then
BND.LB := Temp_lb;
BND.UB := Temp_ub;
ELSE
BND.LB := 0;
BND.UB := 0;
FI;
</pre>
<h3 id="in-64-bit-mode">In 64-bit Mode<a class="anchor" href="#in-64-bit-mode">
</a></h3>
<pre>A_BDE[63:0] := (Zero_extend64(base[47+MAWA:20] « 3) + (BNDCFG[63:12] «12 );<sup>1</sup>
A_BT[63:0] := LoadFrom(A_BDE);
IF A_BT[0] equal 0 Then
BNDSTATUS := A_BDE | 02H;
#BR;
FI;
A_BTE[63:0] := (Zero_extend64(base[19:3] « 5) + (A_BT[63:3] « 3 );
Temp_lb[63:0] := LoadFrom(A_BTE);
Temp_ub[63:0] := LoadFrom(A_BTE + 8);
Temp_ptr[63:0] := LoadFrom(A_BTE + 16);
IF Temp_ptr equal ptr_value Then
BND.LB := Temp_lb;
BND.UB := Temp_ub;
ELSE
BND.LB := 0;
BND.UB := 0;
FI;
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>BNDLDX: Generated by compiler as needed.
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#BR</td>
<td>If the bound directory entry is invalid.</td></tr>
<tr>
<td rowspan="4">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.</td></tr>
<tr>
<td>If 67H prefix is not used and CS.D=0.</td></tr>
<tr>
<td>If 67H prefix is used and CS.D=1.</td></tr>
<tr>
<td rowspan="2">#GP(0)</td>
<td>If a destination effective address of the Bound Table entry is outside the DS segment limit.</td></tr>
<tr>
<td>If DS register contains a NULL segment selector.</td></tr>
<tr>
<td>#PF(fault</td>
<td>code) If a page fault occurs.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="3">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.</td></tr>
<tr>
<td>If 16-bit addressing is used.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If a destination effective address of the Bound Table entry is outside the DS segment limit.</td></tr></table>
<blockquote>
<p>1. If CPL &lt; 3, the supervisor MAWA (MAWAS) is used; this value is 0. If CPL = 3, the user MAWA (MAWAU) is used; this value is enumerated in CPUID.(EAX=07H,ECX=0H):ECX.MAWAU[bits 21:17]. See Appendix E.3.1 of Intel<sup>®</sup> 64 and IA-32 Architectures Software Developers Manual, Volume 1.</p></blockquote>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="3">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.</td></tr>
<tr>
<td>If 16-bit addressing is used.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If a destination effective address of the Bound Table entry is outside the DS segment limit.</td></tr>
<tr>
<td>#PF(fault</td>
<td>code) If a page fault occurs.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#BR</td>
<td>If the bound directory entry is invalid.</td></tr>
<tr>
<td rowspan="3">#UD</td>
<td>If ModRM is RIP relative.</td></tr>
<tr>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If ModRM.r/m and REX encodes BND4-BND15 when Intel MPX is enabled.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If the memory address (A_BDE or A_BTE) is in a non-canonical form.</td></tr>
<tr>
<td>#PF(fault</td>
<td>code) If a page fault occurs.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

125
x86/bndmk.html Normal file
View file

@ -0,0 +1,125 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>BNDMK
— Make Bounds</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>BNDMK
— Make Bounds</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>F3 0F 1B /r BNDMK bnd, m32</td>
<td>RM</td>
<td>N.E./V</td>
<td>MPX</td>
<td>Make lower and upper bounds from m32 and store them in bnd.</td></tr>
<tr>
<td>F3 0F 1B /r BNDMK bnd, m64</td>
<td>RM</td>
<td>V/N.E.</td>
<td>MPX</td>
<td>Make lower and upper bounds from m64 and store them in bnd.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th></tr>
<tr>
<td>RM</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Makes bounds from the second operand and stores the lower and upper bounds in the bound register bnd. The second operand must be a memory operand. The content of the base register from the memory operand is stored in the lower bound bnd.LB. The 1's complement of the effective address of m32/m64 is stored in the upper bound b.UB. Computation of m32/m64 has identical behavior to LEA.</p>
<p>This instruction does not cause any memory access, and does not read or write any flags.</p>
<p>If the instruction did not specify base register, the lower bound will be zero. The reg-reg form of this instruction retains legacy behavior (NOP).</p>
<p>The instruction causes an invalid-opcode exception (#UD) if executed in 64-bit mode with RIP-relative addressing.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>BND.LB := SRCMEM.base;
IF 64-bit mode Then
BND.UB := NOT(LEA.64_bits(SRCMEM));
ELSE
BND.UB := Zero_Extend.64_bits(NOT(LEA.32_bits(SRCMEM)));
FI;
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>BNDMKvoid * _bnd_set_ptr_bounds(const void * q, size_t size);
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="4">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.</td></tr>
<tr>
<td>If 67H prefix is not used and CS.D=0.</td></tr>
<tr>
<td>If 67H prefix is used and CS.D=1.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="3">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.</td></tr>
<tr>
<td>If 16-bit addressing is used.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="3">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.</td></tr>
<tr>
<td>If 16-bit addressing is used.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="3">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If ModRM.r/m and REX encodes BND4-BND15 when Intel MPX is enabled.</td></tr>
<tr>
<td>If RIP-relative addressing is used.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If the memory address referencing the SS segment is in a non-canonical form.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If the memory address is in a non-canonical form.</td></tr></table>
<p>Same exceptions as in protected mode.</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

274
x86/bndmov.html Normal file
View file

@ -0,0 +1,274 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>BNDMOV
— Move Bounds</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>BNDMOV
— Move Bounds</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>66 0F 1A /r BNDMOV bnd1, bnd2/m64</td>
<td>RM</td>
<td>N.E./V</td>
<td>MPX</td>
<td>Move lower and upper bound from bnd2/m64 to bound register bnd1.</td></tr>
<tr>
<td>66 0F 1A /r BNDMOV bnd1, bnd2/m128</td>
<td>RM</td>
<td>V/N.E.</td>
<td>MPX</td>
<td>Move lower and upper bound from bnd2/m128 to bound register bnd1.</td></tr>
<tr>
<td>66 0F 1B /r BNDMOV bnd1/m64, bnd2</td>
<td>MR</td>
<td>N.E./V</td>
<td>MPX</td>
<td>Move lower and upper bound from bnd2 to bnd1/m64.</td></tr>
<tr>
<td>66 0F 1B /r BNDMOV bnd1/m128, bnd2</td>
<td>MR</td>
<td>V/N.E.</td>
<td>MPX</td>
<td>Move lower and upper bound from bnd2 to bound register bnd1/m128.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th></tr>
<tr>
<td>RM</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td></tr>
<tr>
<td>MR</td>
<td>ModRM:r/m (w)</td>
<td>ModRM:reg (r)</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>BNDMOV moves a pair of lower and upper bound values from the source operand (the second operand) to the destination (the first operand). Each operation is 128-bit move. The exceptions are same as the MOV instruction. The memory format for loading/store bounds in 64-bit mode is shown in <a href='bndmov.html#fig-3-5'>Figure 3-5</a>.</p>
<figure id="fig-3-5">
<svg style="width: 596.736pt; height: 219.527976pt" viewBox="50.84 0.0 502.28 187.93998">
<g xmlns="http://www.w3.org/2000/svg" style="stroke: none; fill: none">
<rect height="181.98" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="53.34" y="0.47998000000001184"></rect>
<rect height="181.98" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="550.14" y="0.47998000000001184"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="497.28000000000003" x="53.34" y="0.0"></rect>
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="497.28000000000003" x="53.34" y="182.45997"></rect>
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="286.14" x="84.06" y="23.819970000000012"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="369.72" y="24.059979999999996"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="286.14" x="83.82000000000001" y="37.31999999999999"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="83.82000000000001" y="23.819979999999987"></rect>
<rect height="0.24002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="233.04" y="23.699959999999976"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="233.04" y="23.93997999999999"></rect>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="320.1" x="144.24" y="66.89997999999997"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="320.34000000000003" x="144.24" y="66.65999999999997"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="464.1" y="66.89997999999997"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="320.34000000000003" x="144.0" y="80.15999999999997"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="144.0" y="66.65997999999996"></rect>
<rect height="0.24002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="313.62" y="66.47996"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="313.62" y="66.71997999999996"></rect>
<path d="M 433.26 61.67998 L 433.8 59.099980000000016 L 433.98 58.499979999999994 L 434.58 58.799980000000005 L 443.03999999999996 63.23998 L 444.53999999999996 64.01997999999998 L 442.8 64.13997999999998 L 433.26 64.85998000000001 L 432.59999999999997 64.91998000000001 L 432.71999999999997 64.25997999999998 L 433.2 63.89998000000003 L 442.74 63.17998 L 442.8 64.13997999999998 L 442.5 64.07997999999998 L 434.03999999999996 59.63997999999998 L 434.58 58.799980000000005 L 434.82 59.339980000000025 L 434.28 61.91998000000001" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 432.72 64.25997999999998 L 433.26000000000005 61.67998 L 434.28000000000003 61.91998000000001 L 433.74 64.49998" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 433.74 61.799980000000005 L 434.28000000000003 59.21998000000002 L 442.74 63.65998000000002 L 433.2 64.37997999999999" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 346.20000000000005 43.37997999999999 L 345.96000000000004 43.31997999999999 L 345.84000000000003 43.799980000000005 L 346.08000000000004 43.85998000000001" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 433.26 61.43997999999999 L 433.5 61.499979999999994 L 433.38 61.97998000000001 L 433.14 61.91998000000001" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 346.20000000000005 43.37997999999999 L 346.08000000000004 43.85998000000001 L 433.14000000000004 61.91998000000001 L 433.26000000000005 61.43997999999999" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 284.64 59.45997999999997 L 285.18 56.87997999999999 L 285.36 56.279979999999966 L 285.96 56.57997999999998 L 294.41999999999996 61.019979999999975 L 295.91999999999996 61.79997999999995 L 294.18 61.91997999999995 L 284.64 62.63997999999998 L 283.97999999999996 62.69997999999998 L 284.09999999999997 62.03997999999996 L 284.58 61.67998 L 294.12 60.95997999999997 L 294.18 61.91997999999995 L 293.88 61.85997999999995 L 285.41999999999996 57.41997999999995 L 285.96 56.57997999999998 L 286.2 57.11998 L 285.65999999999997 59.69997999999998" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 284.1 62.03997999999996 L 284.64000000000004 59.45997999999997 L 285.66 59.69997999999998 L 285.12 62.279979999999966" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 285.12 59.57997999999998 L 285.66 56.999979999999994 L 294.12 61.43997999999999 L 284.58 62.15997999999996" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 197.58 41.21998000000002 L 197.34 41.15998000000002 L 197.22 41.63998000000004 L 197.46 41.69998000000004" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 284.7 59.27998000000002 L 284.94 59.339980000000025 L 284.82 59.819980000000044 L 284.58 59.75998000000004" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 197.58 41.21998000000002 L 197.46 41.69998000000004 L 284.58000000000004 59.75998000000004 L 284.70000000000005 59.27998000000002" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="285.90000000000003" x="76.44" y="108.29998"></rect>
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="286.14" x="76.44" y="108.05997000000002"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="362.1" y="108.29998"></rect>
<rect height="0.48001000000000005" style="fill: rgb(0%, 0%, 0%)" width="286.14" x="76.2" y="121.55997000000002"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="76.2" y="108.05998"></rect>
<rect height="0.23999" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="225.42000000000002" y="107.87999000000002"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="225.42000000000002" y="108.11998"></rect>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="320.1" x="136.62" y="151.07998000000003"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="320.34000000000003" x="136.62" y="150.83999999999997"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="456.48" y="151.07997999999998"></rect>
<rect height="0.48" style="fill: rgb(0%, 0%, 0%)" width="320.34000000000003" x="136.38" y="164.33998"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="136.38" y="150.83998000000003"></rect>
<rect height="0.23999" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="306.06" y="150.71999"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="306.06" y="150.95998000000003"></rect>
<path d="M 427.38 153.35998 L 428.4 150.89998000000003 L 428.7 150.29998 L 429.18 150.71998000000002 L 436.68 156.71998000000002 L 438.06 157.79998 L 436.32 157.61998 L 426.84 156.53998 L 426.12 156.41998 L 426.42 155.81998000000002 L 426.9 155.51998 L 436.38 156.59998000000002 L 436.32 157.61998 L 436.02 157.49998 L 428.52 151.49998 L 429.18 150.71998000000002 L 429.3 151.25997999999998 L 428.28 153.71998000000002" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 426.42 155.81998 L 427.38 153.35997999999998 L 428.28000000000003 153.71998 L 427.32 156.17998" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 427.8 153.53997999999999 L 428.82 151.07997999999998 L 436.32 157.07997999999998 L 426.84000000000003 155.99998" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 342.12 118.07997999999998 L 341.94 118.01997999999998 L 341.76 118.43997999999999 L 341.94 118.49998" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 427.5 153.17998 L 427.68 153.23998 L 427.5 153.65998 L 427.32 153.59998" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 342.12 118.07997999999998 L 341.94 118.49998 L 427.32 153.59998 L 427.5 153.17997999999997" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 363.24 153.89998 L 363.78000000000003 151.31998 L 363.96000000000004 150.71998000000002 L 364.56 151.01997999999998 L 373.02 155.45998 L 374.52 156.23998 L 372.78000000000003 156.35998 L 363.24 157.07998 L 362.58 157.13998 L 362.7 156.47998 L 363.18 156.11998 L 372.72 155.39998 L 372.78000000000003 156.35998 L 372.48 156.29998 L 364.02 151.85998 L 364.56 151.01997999999998 L 364.8 151.55998 L 364.26 154.13998" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 362.70000000000005 156.47997999999998 L 363.24000000000007 153.89997999999997 L 364.26000000000005 154.13997999999998 L 363.72 156.71998" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 363.72 154.01998 L 364.26000000000005 151.43998 L 372.72 155.87998000000002 L 363.18 156.59998000000002" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 192.06 118.49998 L 191.82 118.43997999999999 L 191.7 118.91998000000001 L 191.94 118.97998000000001" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 363.3 153.65998 L 363.54 153.71998 L 363.42 154.19997999999998 L 363.18 154.13997999999998" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 192.06 118.49998 L 191.94 118.97998000000001 L 363.18 154.13997999999998 L 363.3 153.65998" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<rect height="0.24002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="385.56" y="151.19995999999998"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="385.56" y="151.43998"></rect>
<rect height="0.24002" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="224.04" y="151.67996"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="224.04" y="151.91998"></rect>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.04699072423972pt; fill: #000" textLength="110.37412605178969" x="395.93869273" y="30.74140924983385">BNDMOV to memory in 64-bit mode</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.04699072423972pt; fill: #000" textLength="56.7711470999748" x="133.44" y="34.64140924983383">Upper Bound (UB)</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.04699072423972pt; fill: #000" textLength="55.55452844997467" x="259.13923652" y="34.70140924983383">Lower Bound (LB)</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.82699072423975pt; fill: #000" textLength="44.05975405589487" x="458.99932715" y="64.64140924983383">0 Byteoffset</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.04699072423972pt; fill: #000" textLength="7.848181971789671" x="132.06" y="64.34140924983382">16</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.046990724239715pt; fill: #000" textLength="4.009074231789668" x="306.0" y="64.40140924983382">8</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.04699072423972pt; fill: #000" textLength="110.28677907178968" x="388.31838172000005" y="114.98140924983386">BNDMOV to memory in 32-bit mode</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.04699072423972pt; fill: #000" textLength="56.7115293199747" x="125.82000000000001" y="118.82140924983383">Upper Bound (UB)</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.04699072423972pt; fill: #000" textLength="55.567699819974706" x="251.51923652000002" y="118.88140924983384">Lower Bound (LB)</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.886990724239752pt; fill: #000" textLength="44.065299895894725" x="451.37901614000003" y="148.88140924983384">0 Byteoffset</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.04699072423972pt; fill: #000" textLength="7.848181971789671" x="124.5" y="148.52140924983382">16</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.046990724239715pt; fill: #000" textLength="4.009074231789668" x="298.44" y="148.58140924983383">8</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.046990724239715pt; fill: #000" textLength="4.009074231789668" x="379.44" y="148.58140924983383">4</text></g></svg>
<figcaption><a href='bndmov.html#fig-3-5'>Figure 3-5</a>. Memory Layout of BNDMOV to/from Memory</figcaption></figure>
<p>This instruction does not change flags.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="bndmov-register-to-register">BNDMOV register to register<a class="anchor" href="#bndmov-register-to-register">
</a></h3>
<pre>DEST.LB := SRC.LB;
DEST.UB := SRC.UB;
</pre>
<h3 id="bndmov-from-memory">BNDMOV from memory<a class="anchor" href="#bndmov-from-memory">
</a></h3>
<pre>IF 64-bit mode THEN
DEST.LB := LOAD_QWORD(SRC);
DEST.UB := LOAD_QWORD(SRC+8);
ELSE
DEST.LB := LOAD_DWORD_ZERO_EXT(SRC);
DEST.UB := LOAD_DWORD_ZERO_EXT(SRC+4);
FI;
</pre>
<h3 id="bndmov-to-memory">BNDMOV to memory<a class="anchor" href="#bndmov-to-memory">
</a></h3>
<pre>IF 64-bit mode THEN
DEST[63:0] := SRC.LB;
DEST[127:64] := SRC.UB;
ELSE
DEST[31:0] := SRC.LB;
DEST[63:32] := SRC.UB;
FI;
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>BNDMOV void * _bnd_copy_ptr_bounds(const void *q, const void *r)
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="4">#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr>
<tr>
<td>If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.</td></tr>
<tr>
<td>If 67H prefix is not used and CS.D=0.</td></tr>
<tr>
<td>If 67H prefix is used and CS.D=1.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If the memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td rowspan="3">#GP(0)</td>
<td>If the memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>If the destination operand points to a non-writable segment</td></tr>
<tr>
<td>If the DS, ES, FS, or GS segment register contains a NULL segment selector.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while CPL is 3.</td></tr>
<tr>
<td>#PF(fault</td>
<td>code) If a page fault occurs.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="3">#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr>
<tr>
<td>If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.</td></tr>
<tr>
<td>If 16-bit addressing is used.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If the memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS</td>
<td>If the memory operand effective address is outside the SS segment limit.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="3">#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr>
<tr>
<td>If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.</td></tr>
<tr>
<td>If 16-bit addressing is used.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If the memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If the memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while CPL is 3.</td></tr>
<tr>
<td>#PF(fault</td>
<td>code) If a page fault occurs.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="2">#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr>
<tr>
<td>If ModRM.r/m and REX encodes BND4-BND15 when Intel MPX is enabled.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If the memory address referencing the SS segment is in a non-canonical form.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If the memory address is in a non-canonical form.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while CPL is 3.</td></tr>
<tr>
<td>#PF(fault</td>
<td>code) If a page fault occurs.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

174
x86/bndstx.html Normal file
View file

@ -0,0 +1,174 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>BNDSTX
— Store Extended Bounds Using Address Translation</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>BNDSTX
— Store Extended Bounds Using Address Translation</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>NP 0F 1B /r BNDSTX mib, bnd</td>
<td>MR</td>
<td>V/V</td>
<td>MPX</td>
<td>Store the bounds in bnd and the pointer value in the index register of mib to a bound table entry (BTE) with address translation using the base of mib.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th></tr>
<tr>
<td>MR</td>
<td>SIB.base (r): Address of pointer SIB.index(r)</td>
<td>ModRM:reg (r)</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>BNDSTX uses the linear address constructed from the displacement and base register of the SIB-addressing form of the memory operand (mib) to perform address translation to store to a bound table entry. The bounds in the source operand bnd are written to the lower and upper bounds in the BTE. The content of the index register of mib is written to the pointer value field in the BTE.</p>
<p>This instruction does not cause memory access to the linear address of mib nor the effective address referenced by the base, and does not read or write any flags.</p>
<p>Segment overrides apply to the linear address computation with the base of mib, and are used during address translation to generate the address of the bound table entry. By default, the address of the BTE is assumed to be linear address. There are no segmentation checks performed on the base of mib.</p>
<p>The base of mib will not be checked for canonical address violation as it does not access memory.</p>
<p>Any encoding of this instruction that does not specify base or index register will treat those registers as zero (constant). The reg-reg form of this instruction will remain a NOP.</p>
<p>The scale field of the SIB byte has no effect on these instructions and is ignored.</p>
<p>The bound register may be partially updated on memory faults. The order in which memory operands are loaded is implementation specific.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>base := mib.SIB.base ? mib.SIB.base + Disp: 0;
ptr_value := mib.SIB.index ? mib.SIB.index : 0;
</pre>
<h3 id="outside-64-bit-mode">Outside 64-bit Mode<a class="anchor" href="#outside-64-bit-mode">
</a></h3>
<pre>A_BDE[31:0] := (Zero_extend32(base[31:12] « 2) + (BNDCFG[31:12] «12 );
A_BT[31:0] := LoadFrom(A_BDE);
IF A_BT[0] equal 0 Then
BNDSTATUS := A_BDE | 02H;
#BR;
FI;
A_DEST[31:0] := (Zero_extend32(base[11:2] « 4) + (A_BT[31:2] « 2 ); // address of Bound table entry
A_DEST[8][31:0] := ptr_value;
A_DEST[0][31:0] := BND.LB;
A_DEST[4][31:0] := BND.UB;
</pre>
<h3 id="in-64-bit-mode">In 64-bit Mode<a class="anchor" href="#in-64-bit-mode">
</a></h3>
<pre>A_BDE[63:0] := (Zero_extend64(base[47+MAWA:20] « 3) + (BNDCFG[63:12] «12 );<sup>1</sup>
A_BT[63:0] := LoadFrom(A_BDE);
IF A_BT[0] equal 0 Then
BNDSTATUS := A_BDE | 02H;
#BR;
FI;
A_DEST[63:0] := (Zero_extend64(base[19:3] « 5) + (A_BT[63:3] « 3 ); // address of Bound table entry
A_DEST[16][63:0] := ptr_value;
A_DEST[0][63:0] := BND.LB;
A_DEST[8][63:0] := BND.UB;
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>BNDSTX: _bnd_store_ptr_bounds(const void **ptr_addr, const void *ptr_val);
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#BR</td>
<td>If the bound directory entry is invalid.</td></tr>
<tr>
<td rowspan="4">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.</td></tr>
<tr>
<td>If 67H prefix is not used and CS.D=0.</td></tr>
<tr>
<td>If 67H prefix is used and CS.D=1.</td></tr>
<tr>
<td rowspan="3">#GP(0)</td>
<td>If a destination effective address of the Bound Table entry is outside the DS segment limit.</td></tr>
<tr>
<td>If DS register contains a NULL segment selector.</td></tr>
<tr>
<td>If the destination operand points to a non-writable segment</td></tr>
<tr>
<td>#PF(fault</td>
<td>code) If a page fault occurs.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="3">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.</td></tr>
<tr>
<td>If 16-bit addressing is used.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If a destination effective address of the Bound Table entry is outside the DS segment limit.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="3">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If ModRM.r/m encodes BND4-BND7 when Intel MPX is enabled.</td></tr>
<tr>
<td>If 16-bit addressing is used.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If a destination effective address of the Bound Table entry is outside the DS segment limit.</td></tr>
<tr>
<td>#PF(fault</td>
<td>code) If a page fault occurs.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<blockquote>
<p>1. If CPL &lt; 3, the supervisor MAWA (MAWAS) is used; this value is 0. If CPL = 3, the user MAWA (MAWAU) is used; this value is enumerated in CPUID.(EAX=07H,ECX=0H):ECX.MAWAU[bits 21:17]. See Appendix E.3.1 of Intel<sup>®</sup> 64 and IA-32 Architectures Software Developers Manual, Volume 1.</p></blockquote>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#BR</td>
<td>If the bound directory entry is invalid.</td></tr>
<tr>
<td rowspan="3">#UD</td>
<td>If ModRM is RIP relative.</td></tr>
<tr>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If ModRM.r/m and REX encodes BND4-BND15 when Intel MPX is enabled.</td></tr>
<tr>
<td rowspan="2">#GP(0)</td>
<td>If the memory address (A_BDE or A_BTE) is in a non-canonical form.</td></tr>
<tr>
<td>If the destination operand points to a non-writable segment</td></tr>
<tr>
<td>#PF(fault</td>
<td>code) If a page fault occurs.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

150
x86/bound.html Normal file
View file

@ -0,0 +1,150 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>BOUND
— Check Array Index Against Bounds</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>BOUND
— Check Array Index Against Bounds</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>62 /r</td>
<td>BOUND r16, m16&amp;16</td>
<td>RM</td>
<td>Invalid</td>
<td>Valid</td>
<td>Check if r16 (array index) is within bounds specified by m16&amp;16.</td></tr>
<tr>
<td>62 /r</td>
<td>BOUND r32, m32&amp;32</td>
<td>RM</td>
<td>Invalid</td>
<td>Valid</td>
<td>Check if r32 (array index) is within bounds specified by m32&amp;32.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RM</td>
<td>ModRM:reg (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>BOUND determines if the first operand (array index) is within the bounds of an array specified the second operand (bounds operand). The array index is a signed integer located in a register. The bounds operand is a memory location that contains a pair of signed doubleword-integers (when the operand-size attribute is 32) or a pair of signed word-integers (when the operand-size attribute is 16). The first doubleword (or word) is the lower bound of the array and the second doubleword (or word) is the upper bound of the array. The array index must be greater than or equal to the lower bound and less than or equal to the upper bound plus the operand size in bytes. If the index is not within bounds, a BOUND range exceeded exception (#BR) is signaled. When this exception is generated, the saved return instruction pointer points to the BOUND instruction.</p>
<p>The bounds limit data structure (two words or doublewords containing the lower and upper limits of the array) is usually placed just before the array itself, making the limits addressable via a constant offset from the beginning of the array. Because the address of the array already will be present in a register, this practice avoids extra bus cycles to obtain the effective address of the array bounds.</p>
<p>This instruction executes as described in compatibility mode and legacy mode. It is not valid in 64-bit mode.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>IF 64bit Mode
THEN
#UD;
ELSE
IF (ArrayIndex &lt; LowerBound OR ArrayIndex &gt; UpperBound) THEN
(* Below lower bound or above upper bound *)
IF &lt;equation for PL enabled&gt; THEN BNDSTATUS := 0
#BR;
FI;
FI;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#BR</td>
<td>If the bounds test fails.</td></tr>
<tr>
<td rowspan="2">#UD</td>
<td>If second operand is not a memory location.</td></tr>
<tr>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td rowspan="2">#GP(0)</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>If the DS, ES, FS, or GS register contains a NULL segment selector.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#BR</td>
<td>If the bounds test fails.</td></tr>
<tr>
<td rowspan="2">#UD</td>
<td>If second operand is not a memory location.</td></tr>
<tr>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>#GP</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#BR</td>
<td>If the bounds test fails.</td></tr>
<tr>
<td rowspan="2">#UD</td>
<td>If second operand is not a memory location.</td></tr>
<tr>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>If in 64-bit mode.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

156
x86/bsf.html Normal file
View file

@ -0,0 +1,156 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>BSF
— Bit Scan Forward</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>BSF
— Bit Scan Forward</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>0F BC /r</td>
<td>BSF r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Bit scan forward on r/m16.</td></tr>
<tr>
<td>0F BC /r</td>
<td>BSF r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Bit scan forward on r/m32.</td></tr>
<tr>
<td>REX.W + 0F BC /r</td>
<td>BSF r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Bit scan forward on r/m64.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RM</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Searches the source operand (second operand) for the least significant set bit (1 bit). If a least significant 1 bit is found, its bit index is stored in the destination operand (first operand). The source operand can be a register or a memory location; the destination operand is a register. The bit index is an unsigned offset from bit 0 of the source operand. If the content of the source operand is 0, the content of the destination operand is undefined.</p>
<p>In 64-bit mode, the instructions default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>IF SRC = 0
THEN
ZF := 1;
DEST is undefined;
ELSE
ZF := 0;
temp := 0;
WHILE Bit(SRC, temp) = 0
DO
temp := temp + 1;
OD;
DEST := temp;
FI;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>The ZF flag is set to 1 if the source operand is 0; otherwise, the ZF flag is cleared. The CF, OF, SF, AF, and PF flags are undefined.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="2">#GP(0)</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>If the DS, ES, FS, or GS register contains a NULL segment selector.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP(0)</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#SS(0)</td>
<td>If a memory address referencing the SS segment is in a non-canonical form.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If the memory address is in a non-canonical form.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

156
x86/bsr.html Normal file
View file

@ -0,0 +1,156 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>BSR
— Bit Scan Reverse</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>BSR
— Bit Scan Reverse</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>0F BD /r</td>
<td>BSR r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Bit scan reverse on r/m16.</td></tr>
<tr>
<td>0F BD /r</td>
<td>BSR r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Bit scan reverse on r/m32.</td></tr>
<tr>
<td>REX.W + 0F BD /r</td>
<td>BSR r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Bit scan reverse on r/m64.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RM</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Searches the source operand (second operand) for the most significant set bit (1 bit). If a most significant 1 bit is found, its bit index is stored in the destination operand (first operand). The source operand can be a register or a memory location; the destination operand is a register. The bit index is an unsigned offset from bit 0 of the source operand. If the content source operand is 0, the content of the destination operand is undefined.</p>
<p>In 64-bit mode, the instructions default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>IF SRC = 0
THEN
ZF := 1;
DEST is undefined;
ELSE
ZF := 0;
temp := OperandSize 1;
WHILE Bit(SRC, temp) = 0
DO
temp := temp - 1;
OD;
DEST := temp;
FI;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>The ZF flag is set to 1 if the source operand is 0; otherwise, the ZF flag is cleared. The CF, OF, SF, AF, and PF flags are undefined.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="2">#GP(0)</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>If the DS, ES, FS, or GS register contains a NULL segment selector.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP(0)</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#SS(0)</td>
<td>If a memory address referencing the SS segment is in a non-canonical form.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If the memory address is in a non-canonical form.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

87
x86/bswap.html Normal file
View file

@ -0,0 +1,87 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>BSWAP
— Byte Swap</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>BSWAP
— Byte Swap</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>0F C8+<em>rd</em></td>
<td>BSWAP <em>r32</em></td>
<td>O</td>
<td>Valid*</td>
<td>Valid</td>
<td>Reverses the byte order of a 32-bit register.</td></tr>
<tr>
<td>REX.W + 0F C8+<em>rd</em></td>
<td>BSWAP <em>r64</em></td>
<td>O</td>
<td>Valid</td>
<td>N.E.</td>
<td>Reverses the byte order of a 64-bit register.</td></tr></table>
<blockquote>
<p>* SeeIA-32ArchitectureCompatibilitysectionbelow.</p></blockquote>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>O</td>
<td>opcode + rd (r, w)</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Reverses the byte order of a 32-bit or 64-bit (destination) register. This instruction is provided for converting little-endian values to big-endian format and vice versa. To swap bytes in a word value (16-bit register), use the XCHG instruction. When the BSWAP instruction references a 16-bit register, the result is undefined.</p>
<p>In 64-bit mode, the instructions default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>
<h2 id="ia-32-architecture-legacy-compatibility">IA-32 Architecture Legacy Compatibility<a class="anchor" href="#ia-32-architecture-legacy-compatibility">
</a></h2>
<p>The BSWAP instruction is not supported on IA-32 processors earlier than the Intel486TM processor family. For compatibility with this instruction, software should include functionally equivalent code for execution on Intel processors earlier than the Intel486 processor family.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>TEMP := DEST
IF 64-bit mode AND OperandSize = 64
THEN
DEST[7:0] := TEMP[63:56];
DEST[15:8] := TEMP[55:48];
DEST[23:16] := TEMP[47:40];
DEST[31:24] := TEMP[39:32];
DEST[39:32] := TEMP[31:24];
DEST[47:40] := TEMP[23:16];
DEST[55:48] := TEMP[15:8];
DEST[63:56] := TEMP[7:0];
ELSE
DEST[7:0] := TEMP[31:24];
DEST[15:8] := TEMP[23:16];
DEST[23:16] := TEMP[15:8];
DEST[31:24] := TEMP[7:0];
FI;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>None.</p>
<h2 id="exceptions--all-operating-modes-">Exceptions (All Operating Modes)<a class="anchor" href="#exceptions--all-operating-modes-">
</a></h2>
<p>#UD If the LOCK prefix is used.</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

181
x86/bt.html Normal file
View file

@ -0,0 +1,181 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>BT
— Bit Test</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>BT
— Bit Test</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>0F A3 /r</td>
<td>BT r/m16, r16</td>
<td>MR</td>
<td>Valid</td>
<td>Valid</td>
<td>Store selected bit in CF flag.</td></tr>
<tr>
<td>0F A3 /r</td>
<td>BT r/m32, r32</td>
<td>MR</td>
<td>Valid</td>
<td>Valid</td>
<td>Store selected bit in CF flag.</td></tr>
<tr>
<td>REX.W + 0F A3 /r</td>
<td>BT r/m64, r64</td>
<td>MR</td>
<td>Valid</td>
<td>N.E.</td>
<td>Store selected bit in CF flag.</td></tr>
<tr>
<td>0F BA /4 ib</td>
<td>BT r/m16, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>Store selected bit in CF flag.</td></tr>
<tr>
<td>0F BA /4 ib</td>
<td>BT r/m32, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>Store selected bit in CF flag.</td></tr>
<tr>
<td>REX.W + 0F BA /4 ib</td>
<td>BT r/m64, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>N.E.</td>
<td>Store selected bit in CF flag.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>MR</td>
<td>ModRM:r/m (r)</td>
<td>ModRM:reg (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>MI</td>
<td>ModRM:r/m (r)</td>
<td>imm8</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Selects the bit in a bit string (specified with the first operand, called the bit base) at the bit-position designated by the bit offset (specified by the second operand) and stores the value of the bit in the CF flag. The bit base operand can be a register or a memory location; the bit offset operand can be a register or an immediate value:</p>
<ul>
<li>If the bit base operand specifies a register, the instruction takes the modulo 16, 32, or 64 of the bit offset operand (modulo size depends on the mode and register size; 64-bit operands are available only in 64-bit mode).</li>
<li>If the bit base operand specifies a memory location, the operand represents the address of the byte in memory that contains the bit base (bit 0 of the specified byte) of the bit string. The range of the bit position that can be referenced by the offset operand depends on the operand size.</li></ul>
<p>See also: <strong>Bit(BitBase, BitOffset)</strong> on page 3-11.</p>
<p>Some assemblers support immediate bit offsets larger than 31 by using the immediate bit offset field in combination with the displacement field of the memory operand. In this case, the low-order 3 or 5 bits (3 for 16-bit operands, 5 for 32-bit operands) of the immediate bit offset are stored in the immediate bit offset field, and the high-order bits are shifted and combined with the byte displacement in the addressing mode by the assembler. The processor will ignore the high order bits if they are not zero.</p>
<p>When accessing a bit in memory, the processor may access 4 bytes starting from the memory address for a 32-bit operand size, using by the following relationship:</p>
<p>Effective Address + (4 (BitOffset DIV 32))</p>
<p>Or, it may access 2 bytes starting from the memory address for a 16-bit operand, using this relationship:</p>
<p>Effective Address + (2 (BitOffset DIV 16))</p>
<p>It may do so even when only a single byte needs to be accessed to reach the given bit. When using this bit addressing mechanism, software should avoid referencing areas of memory close to address space holes. In particular, it should avoid references to memory-mapped I/O registers. Instead, software should use the MOV instructions to load from or store to these addresses, and use the register form of these instructions to manipulate the data.</p>
<p>In 64-bit mode, the instructions default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bit operands. See the summary chart at the beginning of this section for encoding data and limits.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>CF := Bit(BitBase, BitOffset);
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>The CF flag contains the value of the selected bit. The ZF flag is unaffected. The OF, SF, AF, and PF flags are undefined.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="2">#GP(0)</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>If the DS, ES, FS, or GS register contains a NULL segment selector.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP(0)</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#SS(0)</td>
<td>If a memory address referencing the SS segment is in a non-canonical form.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If the memory address is in a non-canonical form.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

180
x86/btc.html Normal file
View file

@ -0,0 +1,180 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>BTC
— Bit Test and Complement</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>BTC
— Bit Test and Complement</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>0F BB /r</td>
<td>BTC r/m16, r16</td>
<td>MR</td>
<td>Valid</td>
<td>Valid</td>
<td>Store selected bit in CF flag and complement.</td></tr>
<tr>
<td>0F BB /r</td>
<td>BTC r/m32, r32</td>
<td>MR</td>
<td>Valid</td>
<td>Valid</td>
<td>Store selected bit in CF flag and complement.</td></tr>
<tr>
<td>REX.W + 0F BB /r</td>
<td>BTC r/m64, r64</td>
<td>MR</td>
<td>Valid</td>
<td>N.E.</td>
<td>Store selected bit in CF flag and complement.</td></tr>
<tr>
<td>0F BA /7 ib</td>
<td>BTC r/m16, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>Store selected bit in CF flag and complement.</td></tr>
<tr>
<td>0F BA /7 ib</td>
<td>BTC r/m32, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>Store selected bit in CF flag and complement.</td></tr>
<tr>
<td>REX.W + 0F BA /7 ib</td>
<td>BTC r/m64, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>N.E.</td>
<td>Store selected bit in CF flag and complement.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>MR</td>
<td>ModRM:r/m (r, w)</td>
<td>ModRM:reg (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>MI</td>
<td>ModRM:r/m (r, w)</td>
<td>imm8</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Selects the bit in a bit string (specified with the first operand, called the bit base) at the bit-position designated by the bit offset operand (second operand), stores the value of the bit in the CF flag, and complements the selected bit in the bit string. The bit base operand can be a register or a memory location; the bit offset operand can be a register or an immediate value:</p>
<ul>
<li>If the bit base operand specifies a register, the instruction takes the modulo 16, 32, or 64 of the bit offset operand (modulo size depends on the mode and register size; 64-bit operands are available only in 64-bit mode). This allows any bit position to be selected.</li>
<li>If the bit base operand specifies a memory location, the operand represents the address of the byte in memory that contains the bit base (bit 0 of the specified byte) of the bit string. The range of the bit position that can be referenced by the offset operand depends on the operand size.</li></ul>
<p>See also: <strong>Bit(BitBase, BitOffset)</strong> on page 3-11.</p>
<p>Some assemblers support immediate bit offsets larger than 31 by using the immediate bit offset field in combination with the displacement field of the memory operand. See “BT—Bit Test” in this chapter for more information on this addressing mechanism.</p>
<p>This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.</p>
<p>In 64-bit mode, the instructions default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>CF := Bit(BitBase, BitOffset);
Bit(BitBase, BitOffset) := NOT Bit(BitBase, BitOffset);
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>The CF flag contains the value of the selected bit before it is complemented. The ZF flag is unaffected. The OF, SF, AF, and PF flags are undefined.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="3">#GP(0)</td>
<td>If the destination operand points to a non-writable segment.</td></tr>
<tr>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>If the DS, ES, FS, or GS register contains a NULL segment selector.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP(0)</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#SS(0)</td>
<td>If a memory address referencing the SS segment is in a non-canonical form.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If the memory address is in a non-canonical form.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

180
x86/btr.html Normal file
View file

@ -0,0 +1,180 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>BTR
— Bit Test and Reset</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>BTR
— Bit Test and Reset</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>0F B3 /r</td>
<td>BTR r/m16, r16</td>
<td>MR</td>
<td>Valid</td>
<td>Valid</td>
<td>Store selected bit in CF flag and clear.</td></tr>
<tr>
<td>0F B3 /r</td>
<td>BTR r/m32, r32</td>
<td>MR</td>
<td>Valid</td>
<td>Valid</td>
<td>Store selected bit in CF flag and clear.</td></tr>
<tr>
<td>REX.W + 0F B3 /r</td>
<td>BTR r/m64, r64</td>
<td>MR</td>
<td>Valid</td>
<td>N.E.</td>
<td>Store selected bit in CF flag and clear.</td></tr>
<tr>
<td>0F BA /6 ib</td>
<td>BTR r/m16, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>Store selected bit in CF flag and clear.</td></tr>
<tr>
<td>0F BA /6 ib</td>
<td>BTR r/m32, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>Store selected bit in CF flag and clear.</td></tr>
<tr>
<td>REX.W + 0F BA /6 ib</td>
<td>BTR r/m64, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>N.E.</td>
<td>Store selected bit in CF flag and clear.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>MR</td>
<td>ModRM:r/m (r, w)</td>
<td>ModRM:reg (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>MI</td>
<td>ModRM:r/m (r, w)</td>
<td>imm8</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Selects the bit in a bit string (specified with the first operand, called the bit base) at the bit-position designated by the bit offset operand (second operand), stores the value of the bit in the CF flag, and clears the selected bit in the bit string to 0. The bit base operand can be a register or a memory location; the bit offset operand can be a register or an immediate value:</p>
<ul>
<li>If the bit base operand specifies a register, the instruction takes the modulo 16, 32, or 64 of the bit offset operand (modulo size depends on the mode and register size; 64-bit operands are available only in 64-bit mode). This allows any bit position to be selected.</li>
<li>If the bit base operand specifies a memory location, the operand represents the address of the byte in memory that contains the bit base (bit 0 of the specified byte) of the bit string. The range of the bit position that can be referenced by the offset operand depends on the operand size.</li></ul>
<p>See also: <strong>Bit(BitBase, BitOffset)</strong> on page 3-11.</p>
<p>Some assemblers support immediate bit offsets larger than 31 by using the immediate bit offset field in combination with the displacement field of the memory operand. See “BT—Bit Test” in this chapter for more information on this addressing mechanism.</p>
<p>This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.</p>
<p>In 64-bit mode, the instructions default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>CF := Bit(BitBase, BitOffset);
Bit(BitBase, BitOffset) := 0;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>The CF flag contains the value of the selected bit before it is cleared. The ZF flag is unaffected. The OF, SF, AF, and PF flags are undefined.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="3">#GP(0)</td>
<td>If the destination operand points to a non-writable segment.</td></tr>
<tr>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>If the DS, ES, FS, or GS register contains a NULL segment selector.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP(0)</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#SS(0)</td>
<td>If a memory address referencing the SS segment is in a non-canonical form.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If the memory address is in a non-canonical form.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

180
x86/bts.html Normal file
View file

@ -0,0 +1,180 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>BTS
— Bit Test and Set</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>BTS
— Bit Test and Set</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>0F AB /r</td>
<td>BTS r/m16, r16</td>
<td>MR</td>
<td>Valid</td>
<td>Valid</td>
<td>Store selected bit in CF flag and set.</td></tr>
<tr>
<td>0F AB /r</td>
<td>BTS r/m32, r32</td>
<td>MR</td>
<td>Valid</td>
<td>Valid</td>
<td>Store selected bit in CF flag and set.</td></tr>
<tr>
<td>REX.W + 0F AB /r</td>
<td>BTS r/m64, r64</td>
<td>MR</td>
<td>Valid</td>
<td>N.E.</td>
<td>Store selected bit in CF flag and set.</td></tr>
<tr>
<td>0F BA /5 ib</td>
<td>BTS r/m16, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>Store selected bit in CF flag and set.</td></tr>
<tr>
<td>0F BA /5 ib</td>
<td>BTS r/m32, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>Store selected bit in CF flag and set.</td></tr>
<tr>
<td>REX.W + 0F BA /5 ib</td>
<td>BTS r/m64, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>N.E.</td>
<td>Store selected bit in CF flag and set.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>MR</td>
<td>ModRM:r/m (r, w)</td>
<td>ModRM:reg (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>MI</td>
<td>ModRM:r/m (r, w)</td>
<td>imm8</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Selects the bit in a bit string (specified with the first operand, called the bit base) at the bit-position designated by the bit offset operand (second operand), stores the value of the bit in the CF flag, and sets the selected bit in the bit string to 1. The bit base operand can be a register or a memory location; the bit offset operand can be a register or an immediate value:</p>
<ul>
<li>If the bit base operand specifies a register, the instruction takes the modulo 16, 32, or 64 of the bit offset operand (modulo size depends on the mode and register size; 64-bit operands are available only in 64-bit mode). This allows any bit position to be selected.</li>
<li>If the bit base operand specifies a memory location, the operand represents the address of the byte in memory that contains the bit base (bit 0 of the specified byte) of the bit string. The range of the bit position that can be referenced by the offset operand depends on the operand size.</li></ul>
<p>See also: <strong>Bit(BitBase, BitOffset)</strong> on page 3-11.</p>
<p>Some assemblers support immediate bit offsets larger than 31 by using the immediate bit offset field in combination with the displacement field of the memory operand. See “BT—Bit Test” in this chapter for more information on this addressing mechanism.</p>
<p>This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.</p>
<p>In 64-bit mode, the instructions default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>CF := Bit(BitBase, BitOffset);
Bit(BitBase, BitOffset) := 1;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>The CF flag contains the value of the selected bit before it is set. The ZF flag is unaffected. The OF, SF, AF, and PF flags are undefined.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="3">#GP(0)</td>
<td>If the destination operand points to a non-writable segment.</td></tr>
<tr>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>If the DS, ES, FS, or GS register contains a NULL segment selector.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#SS(0)</td>
<td>If a memory address referencing the SS segment is in a non-canonical form.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If the memory address is in a non-canonical form.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

82
x86/bzhi.html Normal file
View file

@ -0,0 +1,82 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>BZHI
— Zero High Bits Starting with Specified Bit Position</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>BZHI
— Zero High Bits Starting with Specified Bit Position</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32-bit Mode</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>VEX.LZ.0F38.W0 F5 /r BZHI r32a, r/m32, r32b</td>
<td>RMV</td>
<td>V/V</td>
<td>BMI2</td>
<td>Zero bits in r/m32 starting with the position in r32b, write result to r32a.</td></tr>
<tr>
<td>VEX.LZ.0F38.W1 F5 /r BZHI r64a, r/m64, r64b</td>
<td>RMV</td>
<td>V/N.E.</td>
<td>BMI2</td>
<td>Zero bits in r/m64 starting with the position in r64b, write result to r64a.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RMV</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>VEX.vvvv (r)</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>BZHI copies the bits of the first source operand (the second operand) into the destination operand (the first operand) and clears the higher bits in the destination according to the INDEX value specified by the second source operand (the third operand). The INDEX is specified by bits 7:0 of the second source operand. The INDEX value is saturated at the value of OperandSize -1. CF is set, if the number contained in the 8 low bits of the third operand is greater than OperandSize -1.</p>
<p>This instruction is not supported in real mode and virtual-8086 mode. The operand size is always 32 bits if not in 64-bit mode. In 64-bit mode operand size 64 requires VEX.W1. VEX.W1 is ignored in non-64-bit modes. An attempt to execute this instruction with VEX.L not equal to 0 will cause #UD.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>N := SRC2[7:0]
DEST := SRC1
IF (N &lt; OperandSize)
DEST[OperandSize-1:N] := 0
FI
IF (N &gt; OperandSize - 1)
CF := 1
ELSE
CF := 0
FI
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>ZF and SF flags are updated based on the result. CF flag is set as specified in the Operation section. OF flag is cleared. AF and PF flags are undefined.</p>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>BZHI unsigned __int32 _bzhi_u32(unsigned __int32 src, unsigned __int32 index);
</pre>
<pre>BZHI unsigned __int64 _bzhi_u64(unsigned __int64 src, unsigned __int32 index);
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>See <span class="not-imported">Table 2-29</span>, “Type 13 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

890
x86/call.html Normal file
View file

@ -0,0 +1,890 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CALL
— Call Procedure</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CALL
— Call Procedure</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>E8 cw</td>
<td>CALL rel16</td>
<td>D</td>
<td>N.S.</td>
<td>Valid</td>
<td>Call near, relative, displacement relative to next instruction.</td></tr>
<tr>
<td>E8 cd</td>
<td>CALL rel32</td>
<td>D</td>
<td>Valid</td>
<td>Valid</td>
<td>Call near, relative, displacement relative to next instruction. 32-bit displacement sign extended to 64-bits in 64-bit mode.</td></tr>
<tr>
<td>FF /2</td>
<td>CALL r/m16</td>
<td>M</td>
<td>N.E.</td>
<td>Valid</td>
<td>Call near, absolute indirect, address given in r/m16.</td></tr>
<tr>
<td>FF /2</td>
<td>CALL r/m32</td>
<td>M</td>
<td>N.E.</td>
<td>Valid</td>
<td>Call near, absolute indirect, address given in r/m32.</td></tr>
<tr>
<td>FF /2</td>
<td>CALL r/m64</td>
<td>M</td>
<td>Valid</td>
<td>N.E.</td>
<td>Call near, absolute indirect, address given in r/m64.</td></tr>
<tr>
<td>9A cd</td>
<td>CALL ptr16:16</td>
<td>D</td>
<td>Invalid</td>
<td>Valid</td>
<td>Call far, absolute, address given in operand.</td></tr>
<tr>
<td>9A cp</td>
<td>CALL ptr16:32</td>
<td>D</td>
<td>Invalid</td>
<td>Valid</td>
<td>Call far, absolute, address given in operand.</td></tr>
<tr>
<td>FF /3</td>
<td>CALL m16:16</td>
<td>M</td>
<td>Valid</td>
<td>Valid</td>
<td>Call far, absolute indirect address given in m16:16. In 32-bit mode: if selector points to a gate, then RIP = 32-bit zero extended displacement taken from gate; else RIP = zero extended 16-bit offset from far pointer referenced in the instruction.</td></tr>
<tr>
<td>FF /3</td>
<td>CALL m16:32</td>
<td>M</td>
<td>Valid</td>
<td>Valid</td>
<td>In 64-bit mode: If selector points to a gate, then RIP = 64-bit displacement taken from gate; else RIP = zero extended 32-bit offset from far pointer referenced in the instruction.</td></tr>
<tr>
<td>REX.W FF /3</td>
<td>CALL m16:64</td>
<td>M</td>
<td>Valid</td>
<td>N.E.</td>
<td>In 64-bit mode: If selector points to a gate, then RIP = 64-bit displacement taken from gate; else RIP = 64-bit offset from far pointer referenced in the instruction.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>D</td>
<td>Offset</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>M</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Saves procedure linking information on the stack and branches to the called procedure specified using the target operand. The target operand specifies the address of the first instruction in the called procedure. The operand can be an immediate value, a general-purpose register, or a memory location.</p>
<p>This instruction can be used to execute four types of calls:</p>
<ul>
<li><strong>Near Call </strong>— A call to a procedure in the current code segment (the segment currently pointed to by the CS register), sometimes referred to as an intra-segment call.</li>
<li><strong>Far Call </strong>— A call to a procedure located in a different segment than the current code segment, sometimes referred to as an inter-segment call.</li>
<li><strong>Inter-privilege-level far call </strong>— A far call to a procedure in a segment at a different privilege level than that of the currently executing program or procedure.</li>
<li><strong>Task switch </strong>— A call to a procedure located in a different task.</li></ul>
<p>The latter two call types (inter-privilege-level call and task switch) can only be executed in protected mode. See “Calling Procedures Using Call and RET” in Chapter 6 of the Intel<sup>®</sup> 64 and IA-32 Architectures Software Developers Manual, Volume 1, for additional information on near, far, and inter-privilege-level calls. See Chapter 8, “Task Management,” in the Intel<sup>®</sup> 64 and IA-32 Architectures Software Developers Manual, Volume 3A, for information on performing task switches with the CALL instruction.</p>
<p><strong>Near Call.</strong> When executing a near call, the processor pushes the value of the EIP register (which contains the offset of the instruction following the CALL instruction) on the stack (for use later as a return-instruction pointer). The processor then branches to the address in the current code segment specified by the target operand. The target operand specifies either an absolute offset in the code segment (an offset from the base of the code segment) or a relative offset (a signed displacement relative to the current value of the instruction pointer in the EIP register; this value points to the instruction following the CALL instruction). The CS register is not changed on near calls.</p>
<p>For a near call absolute, an absolute offset is specified indirectly in a general-purpose register or a memory location (<em>r/m16</em>, <em>r/m32, or r/m64</em>). The operand-size attribute determines the size of the target operand (16, 32 or 64 bits). When in 64-bit mode, the operand size for near call (and all near branches) is forced to 64-bits. Absolute offsets are loaded directly into the EIP(RIP) register. If the operand size attribute is 16, the upper two bytes of the EIP register are cleared, resulting in a maximum instruction pointer size of 16 bits. When accessing an absolute offset indirectly using the stack pointer [ESP] as the base register, the base value used is the value of the ESP before the instruction executes.</p>
<p>A relative offset (<em>rel16</em> or <em>rel32</em>) is generally specified as a label in assembly code. But at the machine code level, it is encoded as a signed, 16- or 32-bit immediate value. This value is added to the value in the EIP(RIP) register. In 64-bit mode the relative offset is always a 32-bit immediate value which is sign extended to 64-bits before it is added to the value in the RIP register for the target calculation. As with absolute offsets, the operand-size attribute determines the size of the target operand (16, 32, or 64 bits). In 64-bit mode the target operand will always be 64-bits because the operand size is forced to 64-bits for near branches.</p>
<p><strong>Far Calls in Real-Address or Virtual-8086 Mode.</strong> When executing a far call in real- address or virtual-8086 mode, the processor pushes the current value of both the CS and EIP registers on the stack for use as a return-instruction pointer. The processor then performs a “far branch” to the code segment and offset specified with the target operand for the called procedure. The target operand specifies an absolute far address either directly with a pointer (<em>ptr16:16</em> or <em>ptr16:32</em>) or indirectly with a memory location (<em>m16:16</em> or <em>m16:32</em>). With the pointer method, the segment and offset of the called procedure is encoded in the instruction using a 4-byte (16-bit operand size) or 6-byte (32-bit operand size) far address immediate. With the indirect method, the target operand specifies a memory location that contains a 4-byte (16-bit operand size) or 6-byte (32-bit operand size) far address. The operand-size attribute determines the size of the offset (16 or 32 bits) in the far address. The far address is loaded directly into the CS and EIP registers. If the operand-size attribute is 16, the upper two bytes of the EIP register are cleared.</p>
<p><strong>Far Calls in Protected Mode.</strong> When the processor is operating in protected mode, the CALL instruction can be used to perform the following types of far calls:</p>
<ul>
<li>Far call to the same privilege level</li>
<li>Far call to a different privilege level (inter-privilege level call)</li>
<li>Task switch (far call to another task)</li></ul>
<p>In protected mode, the processor always uses the segment selector part of the far address to access the corresponding descriptor in the GDT or LDT. The descriptor type (code segment, call gate, task gate, or TSS) and access rights determine the type of call operation to be performed.</p>
<p>If the selected descriptor is for a code segment, a far call to a code segment at the same privilege level is performed. (If the selected code segment is at a different privilege level and the code segment is non-conforming, a general-protection exception is generated.) A far call to the same privilege level in protected mode is very similar to one carried out in real-address or virtual-8086 mode. The target operand specifies an absolute far address either directly with a pointer (<em>ptr16:16</em> or <em>ptr16:32</em>) or indirectly with a memory location (<em>m16:16</em> or <em>m16:32</em>). The operand- size attribute determines the size of the offset (16 or 32 bits) in the far address. The new code segment selector and its descriptor are loaded into CS register; the offset from the instruction is loaded into the EIP register.</p>
<p>A call gate (described in the next paragraph) can also be used to perform a far call to a code segment at the same privilege level. Using this mechanism provides an extra level of indirection and is the preferred method of making calls between 16-bit and 32-bit code segments.</p>
<p>When executing an inter-privilege-level far call, the code segment for the procedure being called must be accessed through a call gate. The segment selector specified by the target operand identifies the call gate. The target operand can specify the call gate segment selector either directly with a pointer (<em>ptr16:16</em> or <em>ptr16:32</em>) or indirectly with a memory location (<em>m16:16</em> or <em>m16:32</em>). The processor obtains the segment selector for the new code segment and the new instruction pointer (offset) from the call gate descriptor. (The offset from the target operand is ignored when a call gate is used.)</p>
<p>On inter-privilege-level calls, the processor switches to the stack for the privilege level of the called procedure. The segment selector for the new stack segment is specified in the TSS for the currently running task. The branch to the new code segment occurs after the stack switch. (Note that when using a call gate to perform a far call to a segment at the same privilege level, no stack switch occurs.) On the new stack, the processor pushes the segment selector and stack pointer for the calling procedures stack, an optional set of parameters from the calling procedures stack, and the segment selector and instruction pointer for the calling procedures code segment. (A value in the call gate descriptor determines how many parameters to copy to the new stack.) Finally, the processor branches to the address of the procedure being called within the new code segment.</p>
<p>Executing a task switch with the CALL instruction is similar to executing a call through a call gate. The target operand specifies the segment selector of the task gate for the new task activated by the switch (the offset in the target operand is ignored). The task gate in turn points to the TSS for the new task, which contains the segment selectors for the tasks code and stack segments. Note that the TSS also contains the EIP value for the next instruction that was to be executed before the calling task was suspended. This instruction pointer value is loaded into the EIP register to re-start the calling task.</p>
<p>The CALL instruction can also specify the segment selector of the TSS directly, which eliminates the indirection of the task gate. See Chapter 8, “Task Management,” in the Intel<sup>®</sup> 64 and IA-32 Architectures Software Developers Manual, Volume 3A, for information on the mechanics of a task switch.</p>
<p>When you execute at task switch with a CALL instruction, the nested task flag (NT) is set in the EFLAGS register and the new TSSs previous task link field is loaded with the old tasks TSS selector. Code is expected to suspend this nested task by executing an IRET instruction which, because the NT flag is set, automatically uses the previous task link to return to the calling task. (See “Task Linking” in Chapter 8 of the Intel<sup>®</sup> 64 and IA-32 Architectures Software Developers Manual, Volume 3A, for information on nested tasks.) Switching tasks with the CALL instruction differs in this regard from JMP instruction. JMP does not set the NT flag and therefore does not expect an IRET instruction to suspend the task.</p>
<p><strong>Mixing 16-Bit and 32-Bit Calls.</strong> When making far calls between 16-bit and 32-bit code segments, use a call gate. If the far call is from a 32-bit code segment to a 16-bit code segment, the call should be made from the first 64 KBytes of the 32-bit code segment. This is because the operand-size attribute of the instruction is set to 16, so only a 16-bit return address offset can be saved. Also, the call should be made using a 16-bit call gate so that 16-bit values can be pushed on the stack. See Chapter 22, “Mixing 16-Bit and 32-Bit Code,” in the Intel<em><sup>® </sup></em>64 and IA-32 Architectures Software Developers Manual, Volume 3B, for more information.</p>
<p><strong>Far Calls in Compatibility Mode.</strong> When the processor is operating in compatibility mode, the CALL instruction can be used to perform the following types of far calls:</p>
<ul>
<li>Far call to the same privilege level, remaining in compatibility mode</li>
<li>Far call to the same privilege level, transitioning to 64-bit mode</li>
<li>Far call to a different privilege level (inter-privilege level call), transitioning to 64-bit mode</li></ul>
<p>Note that a CALL instruction can not be used to cause a task switch in compatibility mode since task switches are not supported in IA-32e mode.</p>
<p>In compatibility mode, the processor always uses the segment selector part of the far address to access the corresponding descriptor in the GDT or LDT. The descriptor type (code segment, call gate) and access rights determine the type of call operation to be performed.</p>
<p>If the selected descriptor is for a code segment, a far call to a code segment at the same privilege level is performed. (If the selected code segment is at a different privilege level and the code segment is non-conforming, a general-protection exception is generated.) A far call to the same privilege level in compatibility mode is very similar to one carried out in protected mode. The target operand specifies an absolute far address either directly with a pointer (<em>ptr16:16</em> or <em>ptr16:32</em>) or indirectly with a memory location (<em>m16:16</em> or <em>m16:32</em>). The operand-size attribute determines the size of the offset (16 or 32 bits) in the far address. The new code segment selector and its descriptor are loaded into CS register and the offset from the instruction is loaded into the EIP register. The difference is that 64-bit mode may be entered. This specified by the L bit in the new code segment descriptor.</p>
<p>Note that a 64-bit call gate (described in the next paragraph) can also be used to perform a far call to a code segment at the same privilege level. However, using this mechanism requires that the target code segment descriptor have the L bit set, causing an entry to 64-bit mode.</p>
<p>When executing an inter-privilege-level far call, the code segment for the procedure being called must be accessed through a 64-bit call gate. The segment selector specified by the target operand identifies the call gate. The target</p>
<p>operand can specify the call gate segment selector either directly with a pointer (<em>ptr16:16</em> or <em>ptr16:32</em>) or indirectly with a memory location (<em>m16:16</em> or <em>m16:32</em>). The processor obtains the segment selector for the new code segment and the new instruction pointer (offset) from the 16-byte call gate descriptor. (The offset from the target operand is ignored when a call gate is used.)</p>
<p>On inter-privilege-level calls, the processor switches to the stack for the privilege level of the called procedure. The segment selector for the new stack segment is set to NULL. The new stack pointer is specified in the TSS for the currently running task. The branch to the new code segment occurs after the stack switch. (Note that when using a call gate to perform a far call to a segment at the same privilege level, an implicit stack switch occurs as a result of entering 64-bit mode. The SS selector is unchanged, but stack segment accesses use a segment base of 0x0, the limit is ignored, and the default stack size is 64-bits. The full value of RSP is used for the offset, of which the upper 32-bits are undefined.) On the new stack, the processor pushes the segment selector and stack pointer for the calling procedures stack and the segment selector and instruction pointer for the calling procedures code segment. (Parameter copy is not supported in IA-32e mode.) Finally, the processor branches to the address of the procedure being called within the new code segment.</p>
<p><strong>Near/(Far) Calls in 64-bit Mode.</strong> When the processor is operating in 64-bit mode, the CALL instruction can be used to perform the following types of far calls:</p>
<ul>
<li>Far call to the same privilege level, transitioning to compatibility mode</li>
<li>Far call to the same privilege level, remaining in 64-bit mode</li>
<li>Far call to a different privilege level (inter-privilege level call), remaining in 64-bit mode</li></ul>
<p>Note that in this mode the CALL instruction can not be used to cause a task switch in 64-bit mode since task switches are not supported in IA-32e mode.</p>
<p>In 64-bit mode, the processor always uses the segment selector part of the far address to access the corresponding descriptor in the GDT or LDT. The descriptor type (code segment, call gate) and access rights determine the type of call operation to be performed.</p>
<p>If the selected descriptor is for a code segment, a far call to a code segment at the same privilege level is performed. (If the selected code segment is at a different privilege level and the code segment is non-conforming, a general-protection exception is generated.) A far call to the same privilege level in 64-bit mode is very similar to one carried out in compatibility mode. The target operand specifies an absolute far address indirectly with a memory location (<em>m16:16, m16:32</em> or <em>m16:64</em>). The form of CALL with a direct specification of absolute far address is not defined in 64-bit mode. The operand-size attribute determines the size of the offset (16, 32, or 64 bits) in the far address. The new code segment selector and its descriptor are loaded into the CS register; the offset from the instruction is loaded into the EIP register. The new code segment may specify entry either into compatibility or 64-bit mode, based on the L bit value.</p>
<p>A 64-bit call gate (described in the next paragraph) can also be used to perform a far call to a code segment at the same privilege level. However, using this mechanism requires that the target code segment descriptor have the L bit set.</p>
<p>When executing an inter-privilege-level far call, the code segment for the procedure being called must be accessed through a 64-bit call gate. The segment selector specified by the target operand identifies the call gate. The target operand can only specify the call gate segment selector indirectly with a memory location (<em>m16:16, m16:32</em> or <em>m16:64</em>). The processor obtains the segment selector for the new code segment and the new instruction pointer (offset) from the 16-byte call gate descriptor. (The offset from the target operand is ignored when a call gate is used.)</p>
<p>On inter-privilege-level calls, the processor switches to the stack for the privilege level of the called procedure. The segment selector for the new stack segment is set to NULL. The new stack pointer is specified in the TSS for the currently running task. The branch to the new code segment occurs after the stack switch.</p>
<p>Note that when using a call gate to perform a far call to a segment at the same privilege level, an implicit stack switch occurs as a result of entering 64-bit mode. The SS selector is unchanged, but stack segment accesses use a segment base of 0x0, the limit is ignored, and the default stack size is 64-bits. (The full value of RSP is used for the offset.) On the new stack, the processor pushes the segment selector and stack pointer for the calling procedures stack and the segment selector and instruction pointer for the calling procedures code segment. (Parameter copy is not supported in IA-32e mode.) Finally, the processor branches to the address of the procedure being called within the new code segment.</p>
<p>Refer to Chapter 6, “Procedure Calls, Interrupts, and Exceptions” and Chapter 17, “Control-flow Enforcement Technology (CET)‚” in the Intel<sup>®</sup> 64 and IA-32 Architectures Software Developers Manual, Volume 1, for CET details.</p>
<p><strong>Instruction ordering.</strong> Instructions following a far call may be fetched from memory before earlier instructions complete execution, but they will not execute (even speculatively) until all instructions prior to the far call have completed execution (the later instructions may execute before data stored by the earlier instructions have become globally visible).</p>
<p>Instructions sequentially following a near indirect CALL instruction (i.e., those not at the target) may be executed speculatively. If software needs to prevent this (e.g., in order to prevent a speculative execution side channel), then an LFENCE instruction opcode can be placed after the near indirect CALL in order to block speculative execution.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>IF near call
THEN IF near relative call
THEN
IF OperandSize = 64
THEN
tempDEST := SignExtend(DEST); (* DEST is rel32 *)
tempRIP := RIP + tempDEST;
IF stack not large enough for a 8-byte return address
THEN #SS(0); FI;
Push(RIP);
IF ShadowStackEnabled(CPL) AND DEST != 0
ShadowStackPush8B(RIP);
FI;
RIP := tempRIP;
FI;
IF OperandSize = 32
THEN
tempEIP := EIP + DEST; (* DEST is <em>rel32</em> *)
IF tempEIP is not within code segment limit THEN #GP(0); FI;
IF stack not large enough for a 4-byte return address
THEN #SS(0); FI;
Push(EIP);
IF ShadowStackEnabled(CPL) AND DEST != 0
ShadowStackPush4B(EIP);
FI;
EIP := tempEIP;
FI;
IF OperandSize = 16
THEN
tempEIP := (EIP + DEST) AND 0000FFFFH; (* DEST is <em>rel16</em> *)
IF tempEIP is not within code segment limit THEN #GP(0); FI;
IF stack not large enough for a 2-byte return address
THEN #SS(0); FI;
Push(IP);
IF ShadowStackEnabled(CPL) AND DEST != 0
(* IP is zero extended and pushed as a 32 bit value on shadow stack *)
ShadowStackPush4B(IP);
FI;
EIP := tempEIP;
FI;
ELSE (* Near absolute call *)
IF OperandSize = 64
THEN
tempRIP := DEST; (* DEST is <em>r/m64</em> *)
IF stack not large enough for a 8-byte return address
THEN #SS(0); FI;
Push(RIP);
IF ShadowStackEnabled(CPL)
ShadowStackPush8B(RIP);
FI;
RIP := tempRIP;
FI;
IF OperandSize = 32
THEN
tempEIP := DEST; (* DEST is <em>r/m32</em> *)
IF tempEIP is not within code segment limit THEN #GP(0); FI;
IF stack not large enough for a 4-byte return address
THEN #SS(0); FI;
Push(EIP);
IF ShadowStackEnabled(CPL)
ShadowStackPush4B(EIP);
FI;
EIP := tempEIP;
FI;
IF OperandSize = 16
THEN
tempEIP := DEST AND 0000FFFFH; (* DEST is <em>r/m16</em> *)
IF tempEIP is not within code segment limit THEN #GP(0); FI;
IF stack not large enough for a 2-byte return address
THEN #SS(0); FI;
Push(IP);
IF ShadowStackEnabled(CPL)
(* IP is zero extended and pushed as a 32 bit value on shadow stack *)
ShadowStackPush4B(IP);
FI;
EIP := tempEIP;
FI;
FI;rel/abs
IF (Call near indirect, absolute indirect)
IF EndbranchEnabledAndNotSuppressed(CPL)
IF CPL = 3
THEN
IF ( no 3EH prefix OR IA32_U_CET.NO_TRACK_EN == 0 )
THEN
IA32_U_CET.TRACKER = WAIT_FOR_ENDBRANCH
FI;
ELSE
IF ( no 3EH prefix OR IA32_S_CET.NO_TRACK_EN == 0 )
THEN
IA32_S_CET.TRACKER = WAIT_FOR_ENDBRANCH
FI;
FI;
FI;
FI;
FI; near
IF far call and (PE = 0 or (PE = 1 and VM = 1)) (* Real-address or virtual-8086 mode *)
THEN
IF OperandSize = 32
THEN
IF stack not large enough for a 6-byte return address
THEN #SS(0); FI;
IF DEST[31:16] is not zero THEN #GP(0); FI;
Push(CS); (* Padded with 16 high-order bits *)
Push(EIP);
CS := DEST[47:32]; (* DEST is <em>ptr16:32</em> or [<em>m16:32</em>] *)
EIP := DEST[31:0]; (* DEST is <em>ptr16:32</em> or [<em>m16:32</em>] *)
ELSE (* OperandSize = 16 *)
IF stack not large enough for a 4-byte return address
THEN #SS(0); FI;
Push(CS);
Push(IP);
CS := DEST[31:16]; (* DEST is <em>ptr16:16</em> or [<em>m16:16</em>] *)
EIP := DEST[15:0]; (* DEST is <em>ptr16:16</em> or [<em>m16:16</em>]; clear upper 16 bits *)
FI;
FI;
IF far call and (PE = 1 and VM = 0) (* Protected mode or IA-32e Mode, not virtual-8086 mode*)
THEN
IF segment selector in target operand NULL
THEN #GP(0); FI;
IF segment selector index not within descriptor table limits
THEN #GP(new code segment selector); FI;
Read type and access rights of selected segment descriptor;
IF IA32_EFER.LMA = 0
THEN
IF segment type is not a conforming or nonconforming code segment, call
gate, task gate, or TSS
THEN #GP(segment selector); FI;
ELSE
IF segment type is not a conforming or nonconforming code segment or
64-bit call gate,
THEN #GP(segment selector); FI;
FI;
Depending on type and access rights:
GO TO CONFORMING-CODE-SEGMENT;
GO TO NONCONFORMING-CODE-SEGMENT;
GO TO CALL-GATE;
GO TO TASK-GATE;
GO TO TASK-STATE-SEGMENT;
FI;
CONFORMING-CODE-SEGMENT:
IF L bit = 1 and D bit = 1 and IA32_EFER.LMA = 1
THEN GP(new code segment selector); FI;
IF DPL &gt; CPL
THEN #GP(new code segment selector); FI;
IF segment not present
THEN #NP(new code segment selector); FI;
IF stack not large enough for return address
THEN #SS(0); FI;
tempEIP := DEST(Offset);
IF target mode = Compatibility mode
THEN tempEIP := tempEIP AND 00000000_FFFFFFFFH; FI;
IF OperandSize = 16
THEN
tempEIP := tempEIP AND 0000FFFFH; FI; (* Clear upper 16 bits *)
IF (IA32_EFER.LMA = 0 or target mode = Compatibility mode) and (tempEIP outside new code segment limit)
THEN #GP(0); FI;
IF tempEIP is non-canonical
THEN #GP(0); FI;
IF ShadowStackEnabled(CPL)
IF OperandSize = 32
THEN
tempPushLIP = CSBASE + EIP;
ELSE
IF OperandSize = 16
THEN
tempPushLIP = CSBASE + IP;
ELSE (* OperandSize = 64 *)
tempPushLIP = RIP;
FI;
FI;
tempPushCS = CS;
FI;
IF OperandSize = 32
THEN
Push(CS); (* Padded with 16 high-order bits *)
Push(EIP);
CS := DEST(CodeSegmentSelector);
(* Segment descriptor information also loaded *)
CS(RPL) := CPL;
EIP := tempEIP;
ELSE
IF OperandSize = 16
THEN
Push(CS);
Push(IP);
CS := DEST(CodeSegmentSelector);
(* Segment descriptor information also loaded *)
CS(RPL) := CPL;
EIP := tempEIP;
ELSE (* OperandSize = 64 *)
Push(CS); (* Padded with 48 high-order bits *)
Push(RIP);
CS := DEST(CodeSegmentSelector);
(* Segment descriptor information also loaded *)
CS(RPL) := CPL;
RIP := tempEIP;
FI;
FI;
IF ShadowStackEnabled(CPL)
IF (IA32_EFER.LMA and DEST(CodeSegmentSelector).L) = 0
(* If target is legacy or compatibility mode then the SSP must be in low 4GB *)
IF (SSP &amp; 0xFFFFFFFF00000000 != 0)
THEN #GP(0); FI;
FI;
(* align to 8 byte boundary if not already aligned *)
tempSSP = SSP;
Shadow_stack_store 4 bytes of 0 to (SSP 4)
SSP = SSP &amp; 0xFFFFFFFFFFFFFFF8H
ShadowStackPush8B(tempPushCS); (* Padded with 48 high-order bits of 0 *)
ShadowStackPush8B(tempPushLIP); (* Padded with 32 high-order bits of 0 for 32 bit LIP*)
ShadowStackPush8B(tempSSP);
FI;
IF EndbranchEnabled(CPL)
IF CPL = 3
THEN
IA32_U_CET.TRACKER = WAIT_FOR_ENDBRANCH
IA32_U_CET.SUPPRESS = 0
ELSE
IA32_S_CET.TRACKER = WAIT_FOR_ENDBRANCH
IA32_S_CET.SUPPRESS = 0
FI;
FI;
END;
NONCONFORMING-CODE-SEGMENT:
IF L-Bit = 1 and D-BIT = 1 and IA32_EFER.LMA = 1
THEN GP(new code segment selector); FI;
IF (RPL &gt; CPL) or (DPL ≠ CPL)
THEN #GP(new code segment selector); FI;
IF segment not present
THEN #NP(new code segment selector); FI;
IF stack not large enough for return address
THEN #SS(0); FI;
tempEIP := DEST(Offset);
IF target mode = Compatibility mode
THEN tempEIP := tempEIP AND 00000000_FFFFFFFFH; FI;
IF OperandSize = 16
THEN tempEIP := tempEIP AND 0000FFFFH; FI; (* Clear upper 16 bits *)
IF (IA32_EFER.LMA = 0 or target mode = Compatibility mode) and (tempEIP outside new code segment limit)
THEN #GP(0); FI;
IF tempEIP is non-canonical
THEN #GP(0); FI;
IF ShadowStackEnabled(CPL)
IF IA32_EFER.LMA &amp; CS.L
tempPushLIP = RIP
ELSE
tempPushLIP = CSBASE + EIP;
FI;
tempPushCS = CS;
FI;
IF OperandSize = 32
THEN
Push(CS); (* Padded with 16 high-order bits *)
Push(EIP);
CS := DEST(CodeSegmentSelector);
(* Segment descriptor information also loaded *)
CS(RPL) := CPL;
EIP := tempEIP;
ELSE
IF OperandSize = 16
THEN
Push(CS);
Push(IP);
CS := DEST(CodeSegmentSelector);
(* Segment descriptor information also loaded *)
CS(RPL) := CPL;
EIP := tempEIP;
ELSE (* OperandSize = 64 *)
Push(CS); (* Padded with 48 high-order bits *)
Push(RIP);
CS := DEST(CodeSegmentSelector);
(* Segment descriptor information also loaded *)
CS(RPL) := CPL;
RIP := tempEIP;
FI;
FI;
IF ShadowStackEnabled(CPL)
IF (IA32_EFER.LMA and DEST(CodeSegmentSelector).L) = 0
(* If target is legacy or compatibility mode then the SSP must be in low 4GB *)
IF (SSP &amp; 0xFFFFFFFF00000000 != 0)
THEN #GP(0); FI;
FI;
(* align to 8 byte boundary if not already aligned *)
tempSSP = SSP;
Shadow_stack_store 4 bytes of 0 to (SSP 4)
SSP = SSP &amp; 0xFFFFFFFFFFFFFFF8H
ShadowStackPush8B(tempPushCS); (* Padded with 48 high-order 0 bits *)
ShadowStackPush8B(tempPushLIP); (* Padded 32 high-order bits of 0 for 32 bit LIP*)
ShadowStackPush8B(tempSSP);
FI;
IF EndbranchEnabled(CPL)
IF CPL = 3
THEN
IA32_U_CET.TRACKER = WAIT_FOR_ENDBRANCH
IA32_U_CET.SUPPRESS = 0
ELSE
IA32_S_CET.TRACKER = WAIT_FOR_ENDBRANCH
IA32_S_CET.SUPPRESS = 0
FI;
FI;
END;
CALL-GATE:
IF call gate (DPL &lt; CPL) or (RPL &gt; DPL)
THEN #GP(call-gate selector); FI;
IF call gate not present
THEN #NP(call-gate selector); FI;
IF call-gate code-segment selector is NULL
THEN #GP(0); FI;
IF call-gate code-segment selector index is outside descriptor table limits
THEN #GP(call-gate code-segment selector); FI;
Read call-gate code-segment descriptor;
IF call-gate code-segment descriptor does not indicate a code segment
or call-gate code-segment descriptor DPL &gt; CPL
THEN #GP(call-gate code-segment selector); FI;
IF IA32_EFER.LMA = 1 AND (call-gate code-segment descriptor is
not a 64-bit code segment or call-gate code-segment descriptor has both L-bit and D-bit set)
THEN #GP(call-gate code-segment selector); FI;
IF call-gate code segment not present
THEN #NP(call-gate code-segment selector); FI;
IF call-gate code segment is non-conforming and DPL &lt; CPL
THEN go to MORE-PRIVILEGE;
ELSE go to SAME-PRIVILEGE;
FI;
END;
MORE-PRIVILEGE:
IF current TSS is 32-bit
THEN
TSSstackAddress := (new code-segment DPL 8) + 4;
IF (TSSstackAddress + 5) &gt; current TSS limit
THEN #TS(current TSS selector); FI;
NewSS := 2 bytes loaded from (TSS base + TSSstackAddress + 4);
NewESP := 4 bytes loaded from (TSS base + TSSstackAddress);
ELSE
IF current TSS is 16-bit
THEN
TSSstackAddress := (new code-segment DPL 4) + 2
IF (TSSstackAddress + 3) &gt; current TSS limit
THEN #TS(current TSS selector); FI;
NewSS := 2 bytes loaded from (TSS base + TSSstackAddress + 2);
NewESP := 2 bytes loaded from (TSS base + TSSstackAddress);
ELSE (* current TSS is 64-bit *)
TSSstackAddress := (new code-segment DPL 8) + 4;
IF (TSSstackAddress + 7) &gt; current TSS limit
THEN #TS(current TSS selector); FI;
NewSS := new code-segment DPL; (* NULL selector with RPL = new CPL *)
NewRSP := 8 bytes loaded from (current TSS base + TSSstackAddress);
FI;
FI;
IF IA32_EFER.LMA = 0 and NewSS is NULL
THEN #TS(NewSS); FI;
Read new stack-segment descriptor;
IF IA32_EFER.LMA = 0 and (NewSS RPL ≠ new code-segment DPL
or new stack-segment DPL ≠ new code-segment DPL or new stack segment is not a
writable data segment)
THEN #TS(NewSS); FI
IF IA32_EFER.LMA = 0 and new stack segment not present
THEN #SS(NewSS); FI;
IF CallGateSize = 32
THEN
IF new stack does not have room for parameters plus 16 bytes
THEN #SS(NewSS); FI;
IF CallGate(InstructionPointer) not within new code-segment limit
THEN #GP(0); FI;
SS:=newSS; (*Segmentdescriptorinformationalsoloaded*)
ESP := newESP;
CS:EIP := CallGate(CS:InstructionPointer);
(* Segment descriptor information also loaded *)
Push(oldSS:oldESP); (* From calling procedure *)
temp := parameter count from call gate, masked to 5 bits;
Push(parameters from calling procedures stack, temp)
Push(oldCS:oldEIP); (* Return address to calling procedure *)
ELSE
IF CallGateSize = 16
THEN
IF new stack does not have room for parameters plus 8 bytes
THEN #SS(NewSS); FI;
IF (CallGate(InstructionPointer) AND FFFFH) not in new code-segment limit
THEN #GP(0); FI;
SS:=newSS; (*Segmentdescriptorinformationalsoloaded*)
ESP := newESP;
CS:IP := CallGate(CS:InstructionPointer);
(* Segment descriptor information also loaded *)
Push(oldSS:oldESP); (* From calling procedure *)
temp := parameter count from call gate, masked to 5 bits;
Push(parameters from calling procedures stack, temp)
Push(oldCS:oldEIP); (* Return address to calling procedure *)
ELSE (* CallGateSize = 64 *)
IF pushing 32 bytes on the stack would use a non-canonical address
THEN #SS(NewSS); FI;
IF (CallGate(InstructionPointer) is non-canonical)
THEN #GP(0); FI;
SS := NewSS; (* NewSS is NULL)
RSP := NewESP;
CS:IP := CallGate(CS:InstructionPointer);
(* Segment descriptor information also loaded *)
Push(oldSS:oldESP); (* From calling procedure *)
Push(oldCS:oldEIP); (* Return address to calling procedure *)
FI;
FI;
IF ShadowStackEnabled(CPL) AND CPL = 3
THEN
IF IA32_EFER.LMA = 0
THEN IA32_PL3_SSP := SSP;
ELSE (* adjust so bits 63:N get the value of bit N1, where N is the CPUs maximum linear-address width *)
IA32_PL3_SSP := LA_adjust(SSP);
FI;
FI;
CPL := CodeSegment(DPL)
CS(RPL) := CPL
IF ShadowStackEnabled(CPL)
oldSSP := SSP
SSP := IA32_PLi_SSP; (* where i is the CPL *)
IF SSP &amp; 0x07 != 0 (* if SSP not aligned to 8 bytes then #GP *)
THEN #GP(0); FI;
(* Token and CS:LIP:oldSSP pushed on shadow stack must be contained in a naturally aligned 32-byte region*)
IF (SSP &amp; ~0x1F) != ((SSP 24) &amp; ~0x1F)
#GP(0); FI;
IF ((IA32_EFER.LMA and CS.L) = 0 AND SSP[63:32] != 0)
THEN #GP(0); FI;
expected_token_value = SSP (* busy bit - bit position 0 - must be clear *)
new_token_value = SSP | BUSY_BIT (* Set the busy bit *)
IF shadow_stack_lock_cmpxchg8b(SSP, new_token_value, expected_token_value) != expected_token_value
THEN #GP(0); FI;
IF oldSS.DPL != 3
ShadowStackPush8B(oldCS); (* Padded with 48 high-order bits of 0 *)
ShadowStackPush8B(oldCSBASE+oldRIP); (* Padded with 32 high-order bits of 0 for 32 bit LIP*)
ShadowStackPush8B(oldSSP);
FI;
FI;
IF EndbranchEnabled (CPL)
IA32_S_CET.TRACKER = WAIT_FOR_ENDBRANCH
IA32_S_CET.SUPPRESS = 0
FI;
END;
SAME-PRIVILEGE:
IF CallGateSize = 32
THEN
IF stack does not have room for 8 bytes
THEN #SS(0); FI;
IF CallGate(InstructionPointer) not within code segment limit
THEN #GP(0); FI;
CS:EIP := CallGate(CS:EIP) (* Segment descriptor information also loaded *)
Push(oldCS:oldEIP); (* Return address to calling procedure *)
ELSE
If CallGateSize = 16
THEN
IF stack does not have room for 4 bytes
THEN #SS(0); FI;
IF CallGate(InstructionPointer) not within code segment limit
THEN #GP(0); FI;
CS:IP := CallGate(CS:instruction pointer);
(* Segment descriptor information also loaded *)
Push(oldCS:oldIP); (* Return address to calling procedure *)
ELSE (* CallGateSize = 64)
IF pushing 16 bytes on the stack touches non-canonical addresses
THEN #SS(0); FI;
IF RIP non-canonical
THEN #GP(0); FI;
CS:IP := CallGate(CS:instruction pointer);
(* Segment descriptor information also loaded *)
Push(oldCS:oldIP); (* Return address to calling procedure *)
FI;
FI;
CS(RPL) := CPL
IF ShadowStackEnabled(CPL)
(* Align to next 8 byte boundary *)
tempSSP = SSP;
Shadow_stack_store 4 bytes of 0 to (SSP 4)
SSP = SSP &amp; 0xFFFFFFFFFFFFFFF8H;
(* push cs:lip:ssp on shadow stack *)
ShadowStackPush8B(oldCS); (* Padded with 48 high-order bits of 0 *)
ShadowStackPush8B(oldCSBASE + oldRIP); (* Padded with 32 high-order bits of 0 for 32 bit LIP*)
ShadowStackPush8B(tempSSP);
FI;
IF EndbranchEnabled (CPL)
IF CPL = 3
THEN
IA32_U_CET.TRACKER = WAIT_FOR_ENDBRANCH;
IA32_U_CET.SUPPRESS = 0
ELSE
IA32_S_CET.TRACKER = WAIT_FOR_ENDBRANCH;
IA32_S_CET.SUPPRESS = 0
FI;
FI;
END;
TASK-GATE:
IF task gate DPL &lt; CPL or RPL
THEN #GP(task gate selector); FI;
IF task gate not present
THEN #NP(task gate selector); FI;
Read the TSS segment selector in the task-gate descriptor;
IF TSS segment selector local/global bit is set to local
or index not within GDT limits
THEN #GP(TSS selector); FI;
Access TSS descriptor in GDT;
IF descriptor is not a TSS segment
THEN #GP(TSS selector); FI;
IF TSS descriptor specifies that the TSS is busy
THEN #GP(TSS selector); FI;
IF TSS not present
THEN #NP(TSS selector); FI;
SWITCH-TASKS (with nesting) to TSS;
IF EIP not within code segment limit
THEN #GP(0); FI;
END;
TASK-STATE-SEGMENT:
IF TSS DPL &lt; CPL or RPL
or TSS descriptor indicates TSS not available
THEN #GP(TSS selector); FI;
IF TSS is not present
THEN #NP(TSS selector); FI;
SWITCH-TASKS (with nesting) to TSS;
IF EIP not within code segment limit
THEN #GP(0); FI;
END;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>All flags are affected if a task switch occurs; no flags are affected if a task switch does not occur.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="11">#GP(0)</td>
<td>If the target offset in destination operand is beyond the new code segment limit.</td></tr>
<tr>
<td>If the segment selector in the destination operand is NULL.</td></tr>
<tr>
<td>If the code segment selector in the gate is NULL.</td></tr>
<tr>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>If the DS, ES, FS, or GS register is used to access memory and it contains a NULL segment selector.</td></tr>
<tr>
<td>If target mode is compatibility mode and SSP is not in low 4GB.</td></tr>
<tr>
<td>If SSP in IA32_PLi_SSP (where i is the new CPL) is not 8 byte aligned.</td></tr>
<tr>
<td>If the token and the stack frame to be pushed on shadow stack are not contained in a naturally aligned 32-byte region of the shadow stack.</td></tr>
<tr>
<td>If “supervisor Shadow Stack” token on new shadow stack is marked busy.</td></tr>
<tr>
<td>If destination mode is 32-bit or compatibility mode, but SSP address in “supervisor shadow stack” token is beyond 4GB.</td></tr>
<tr>
<td>If SSP address in “supervisor shadow stack” token does not match SSP address in IA32_PLi_SSP (where i is the new CPL).</td></tr>
<tr>
<td rowspan="10">#GP(selector)</td>
<td>If a code segment or gate or TSS selector index is outside descriptor table limits.</td></tr>
<tr>
<td>If the segment descriptor pointed to by the segment selector in the destination operand is not for a conforming-code segment, nonconforming-code segment, call gate, task gate, or task state segment.</td></tr>
<tr>
<td>If the DPL for a nonconforming-code segment is not equal to the CPL or the RPL for the segments segment selector is greater than the CPL.</td></tr>
<tr>
<td>If the DPL for a conforming-code segment is greater than the CPL.</td></tr>
<tr>
<td>If the DPL from a call-gate, task-gate, or TSS segment descriptor is less than the CPL or than the RPL of the call-gate, task-gate, or TSSs segment selector.</td></tr>
<tr>
<td>If the segment descriptor for a segment selector from a call gate does not indicate it is a code segment.</td></tr>
<tr>
<td>If the segment selector from a call gate is beyond the descriptor table limits.</td></tr>
<tr>
<td>If the DPL for a code-segment obtained from a call gate is greater than the CPL.</td></tr>
<tr>
<td>If the segment selector for a TSS has its local/global bit set for local.</td></tr>
<tr>
<td>If a TSS segment descriptor specifies that the TSS is busy or not available.</td></tr>
<tr>
<td rowspan="2">#SS(0)</td>
<td>If pushing the return address, parameters, or stack segment pointer onto the stack exceeds the bounds of the stack segment, when no stack switch occurs.</td></tr>
<tr>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td rowspan="3">#SS(selector)</td>
<td>If pushing the return address, parameters, or stack segment pointer onto the stack exceeds the bounds of the stack segment, when a stack switch occurs.</td></tr>
<tr>
<td>If the SS register is being loaded as part of a stack switch and the segment pointed to is marked not present.</td></tr>
<tr>
<td>If stack segment does not have room for the return address, parameters, or stack segment pointer, when stack switch occurs.</td></tr>
<tr>
<td>#NP(selector)</td>
<td>If a code segment, data segment, call gate, task gate, or TSS is not present.</td></tr>
<tr>
<td rowspan="6">#TS(selector)</td>
<td>If the new stack segment selector and ESP are beyond the end of the TSS.</td></tr>
<tr>
<td>If the new stack segment selector is NULL.</td></tr>
<tr>
<td>If the RPL of the new stack segment selector in the TSS is not equal to the DPL of the code segment being accessed.</td></tr>
<tr>
<td>If DPL of the stack segment descriptor for the new stack segment is not equal to the DPL of the code segment descriptor.</td></tr>
<tr>
<td>If the new stack segment is not a writable data segment.</td></tr>
<tr>
<td>If segment-selector index for stack segment is outside descriptor table limits.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="2">#GP</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>If the target offset is beyond the code segment limit.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="2">#GP(0)</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>If the target offset is beyond the code segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<table>
<tr>
<td>#GP(selector)</td>
<td>If a memory address accessed by the selector is in non-canonical space.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If the target offset in the destination operand is non-canonical.</td></tr></table>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="10">#GP(0)</td>
<td>If a memory address is non-canonical.</td></tr>
<tr>
<td>If target offset in destination operand is non-canonical.</td></tr>
<tr>
<td>If the segment selector in the destination operand is NULL.</td></tr>
<tr>
<td>If the code segment selector in the 64-bit gate is NULL.</td></tr>
<tr>
<td>If target mode is compatibility mode and SSP is not in low 4GB.</td></tr>
<tr>
<td>If SSP in IA32_PLi_SSP (where i is the new CPL) is not 8 byte aligned.</td></tr>
<tr>
<td>If the token and the stack frame to be pushed on shadow stack are not contained in a naturally aligned 32-byte region of the shadow stack.</td></tr>
<tr>
<td>If “supervisor Shadow Stack” token on new shadow stack is marked busy.</td></tr>
<tr>
<td>If destination mode is 32-bit mode or compatibility mode, but SSP address in “super-visor shadow” stack token is beyond 4GB.</td></tr>
<tr>
<td>If SSP address in “supervisor shadow stack” token does not match SSP address in IA32_PLi_SSP (where i is the new CPL).</td></tr>
<tr>
<td rowspan="12">#GP(selector)</td>
<td>If code segment or 64-bit call gate is outside descriptor table limits.</td></tr>
<tr>
<td>If code segment or 64-bit call gate overlaps non-canonical space.</td></tr>
<tr>
<td>If the segment descriptor pointed to by the segment selector in the destination operand is not for a conforming-code segment, nonconforming-code segment, or 64-bit call gate.</td></tr>
<tr>
<td>If the segment descriptor pointed to by the segment selector in the destination operand is a code segment and has both the D-bit and the L- bit set.</td></tr>
<tr>
<td>If the DPL for a nonconforming-code segment is not equal to the CPL, or the RPL for the segments segment selector is greater than the CPL.</td></tr>
<tr>
<td>If the DPL for a conforming-code segment is greater than the CPL.</td></tr>
<tr>
<td>If the DPL from a 64-bit call-gate is less than the CPL or than the RPL of the 64-bit call-gate.</td></tr>
<tr>
<td>If the upper type field of a 64-bit call gate is not 0x0.</td></tr>
<tr>
<td>If the segment selector from a 64-bit call gate is beyond the descriptor table limits.</td></tr>
<tr>
<td>If the DPL for a code-segment obtained from a 64-bit call gate is greater than the CPL.</td></tr>
<tr>
<td>If the code segment descriptor pointed to by the selector in the 64-bit gate doesn't have the L-bit set and the D-bit clear.</td></tr>
<tr>
<td>If the segment descriptor for a segment selector from the 64-bit call gate does not indicate it is a code segment.</td></tr>
<tr>
<td rowspan="3">#SS(0)</td>
<td>If pushing the return offset or CS selector onto the stack exceeds the bounds of the stack segment when no stack switch occurs.</td></tr>
<tr>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>If the stack address is in a non-canonical form.</td></tr>
<tr>
<td>#SS(selector)</td>
<td>If pushing the old values of SS selector, stack pointer, EFLAGS, CS selector, offset, or error code onto the stack violates the canonical boundary when a stack switch occurs.</td></tr>
<tr>
<td>#NP(selector)</td>
<td>If a code segment or 64-bit call gate is not present.</td></tr>
<tr>
<td>#TS(selector)</td>
<td>If the load of the new RSP exceeds the limit of the TSS.</td></tr>
<tr>
<td rowspan="2">#UD</td>
<td>(64-bit mode only) If a far call is direct to an absolute address in memory.</td></tr>
<tr>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

159
x86/capabilities.html Normal file
View file

@ -0,0 +1,159 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>GETSEC[CAPABILITIES]
— Report the SMX Capabilities</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>GETSEC[CAPABILITIES]
— Report the SMX Capabilities</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Description</th></tr>
<tr>
<td>NP 0F 37 (EAX = 0)</td>
<td>GETSEC[CAPABILITIES]</td>
<td>Report the SMX capabilities. The capabilities index is input in EBX with the result returned in EAX.</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>The GETSEC[CAPABILITIES] function returns a bit vector of supported GETSEC leaf functions. The CAPABILITIES leaf of GETSEC is selected with EAX set to 0 at entry. EBX is used as the selector for returning the bit vector field in EAX. GETSEC[CAPABILITIES] may be executed at all privilege levels, but the CR4.SMXE bit must be set or an undefined opcode exception (#UD) is returned.</p>
<p>With EBX = 0 upon execution of GETSEC[CAPABILITIES], EAX returns the a bit vector representing status on the presence of a Intel<sup>®</sup> TXT-capable chipset and the first 30 available GETSEC leaf functions. The format of the returned bit vector is provided in <span class="not-imported">Table 7-3</span>.</p>
<p>If bit 0 is set to 1, then an Intel<sup>®</sup> TXT-capable chipset has been sampled present by the processor. If bits in the range of 1-30 are set, then the corresponding GETSEC leaf function is available. If the bit value at a given bit index is 0, then the GETSEC leaf function corresponding to that index is unsupported and attempted execution results in a #UD.</p>
<p>Bit 31 of EAX indicates if further leaf indexes are supported. If the Extended Leafs bit 31 is set, then additional leaf functions are accessed by repeating GETSEC[CAPABILITIES] with EBX incremented by one. When the most significant bit of EAX is not set, then additional GETSEC leaf functions are not supported; indexing EBX to a higher value results in EAX returning zero.</p>
<figure id="tbl-7-3">
<table>
<tr>
<th>Field</th>
<th>Bit position</th>
<th>Description</th></tr>
<tr>
<td>Chipset Present</td>
<td>0</td>
<td>Intel® TXT-capable chipset is present.</td></tr>
<tr>
<td>Undefined</td>
<td>1</td>
<td>Reserved</td></tr>
<tr>
<td>ENTERACCS</td>
<td>2</td>
<td>GETSEC[ENTERACCS] is available.</td></tr>
<tr>
<td>EXITAC</td>
<td>3</td>
<td>GETSEC[EXITAC] is available.</td></tr>
<tr>
<td>SENTER</td>
<td>4</td>
<td>GETSEC[SENTER] is available.</td></tr>
<tr>
<td>SEXIT</td>
<td>5</td>
<td>GETSEC[SEXIT] is available.</td></tr>
<tr>
<td>PARAMETERS</td>
<td>6</td>
<td>GETSEC[PARAMETERS] is available.</td></tr>
<tr>
<td>SMCTRL</td>
<td>7</td>
<td>GETSEC[SMCTRL] is available.</td></tr>
<tr>
<td>WAKEUP</td>
<td>8</td>
<td>GETSEC[WAKEUP] is available.</td></tr>
<tr>
<td>Undefined</td>
<td>30:9</td>
<td>Reserved</td></tr>
<tr>
<td>Extended Leafs</td>
<td>31</td>
<td>Reserved for extended information reporting of GETSEC capabilities.</td></tr></table>
<figcaption><span class="not-imported">Table 7-3</span>. GETSEC Capability Result Encoding (EBX = 0)</figcaption></figure>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>IF (CR4.SMXE=0)
THEN #UD;
ELSIF (in VMX non-root operation)
THEN VM Exit (reason=”GETSEC instruction”);
IF (EBX=0) THEN
BitVector := 0;
IF (TXT chipset present)
BitVector[Chipset present] := 1;
IF (ENTERACCS Available)
THEN BitVector[ENTERACCS] := 1;
IF (EXITAC Available)
THEN BitVector[EXITAC] := 1;
IF (SENTER Available)
THEN BitVector[SENTER] := 1;
IF (SEXIT Available)
THEN BitVector[SEXIT] := 1;
IF (PARAMETERS Available)
THEN BitVector[PARAMETERS] := 1;
IF (SMCTRL Available)
THEN BitVector[SMCTRL] := 1;
IF (WAKEUP Available)
THEN BitVector[WAKEUP] := 1;
EAX := BitVector;
ELSE
EAX := 0;
END;;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>None.</p>
<h2 id="use-of-prefixes">Use of Prefixes<a class="anchor" href="#use-of-prefixes">
</a></h2>
<p>LOCK Causes #UD.</p>
<p>REP* Cause #UD (includes REPNE/REPNZ and REP/REPE/REPZ).</p>
<p>Operand size Causes #UD.</p>
<p>NP 66/F2/F3 prefixes are not allowed.</p>
<p>Segmentoverrides Ignored.</p>
<p>Address size Ignored.</p>
<p>REX Ignored.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>If CR4.SMXE = 0.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>If CR4.SMXE = 0.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>If CR4.SMXE = 0.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>If CR4.SMXE = 0.</td></tr></table>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>If CR4.SMXE = 0.</td></tr></table>
<h2 id="vm-exit-condition">VM-exit Condition<a class="anchor" href="#vm-exit-condition">
</a></h2>
<p>Reason (GETSEC) If in VMX non-root operation.</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

82
x86/cbw.cwde.cdqe.html Normal file
View file

@ -0,0 +1,82 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CBW/CWDE/CDQE
— Convert Byte to Word/Convert Word to Doubleword/Convert Doubleword toQuadword</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CBW/CWDE/CDQE
— Convert Byte to Word/Convert Word to Doubleword/Convert Doubleword toQuadword</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>98</td>
<td>CBW</td>
<td>ZO</td>
<td>Valid</td>
<td>Valid</td>
<td>AX := sign-extend of AL.</td></tr>
<tr>
<td>98</td>
<td>CWDE</td>
<td>ZO</td>
<td>Valid</td>
<td>Valid</td>
<td>EAX := sign-extend of AX.</td></tr>
<tr>
<td>REX.W + 98</td>
<td>CDQE</td>
<td>ZO</td>
<td>Valid</td>
<td>N.E.</td>
<td>RAX := sign-extend of EAX.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>ZO</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Double the size of the source operand by means of sign extension. The CBW (convert byte to word) instruction copies the sign (bit 7) in the source operand into every bit in the AH register. The CWDE (convert word to double-word) instruction copies the sign (bit 15) of the word in the AX register into the high 16 bits of the EAX register.</p>
<p>CBW and CWDE reference the same opcode. The CBW instruction is intended for use when the operand-size attribute is 16; CWDE is intended for use when the operand-size attribute is 32. Some assemblers may force the operand size. Others may treat these two mnemonics as synonyms (CBW/CWDE) and use the setting of the operand-size attribute to determine the size of values to be converted.</p>
<p>In 64-bit mode, the default operation size is the size of the destination register. Use of the REX.W prefix promotes this instruction (CDQE when promoted) to operate on 64-bit operands. In which case, CDQE copies the sign (bit 31) of the doubleword in the EAX register into the high 32 bits of RAX.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>IF OperandSize = 16 (* Instruction = CBW *)
THEN
AX := SignExtend(AL);
ELSE IF (OperandSize = 32, Instruction = CWDE)
EAX := SignExtend(AX); FI;
ELSE (* 64-Bit Mode, OperandSize = 64, Instruction = CDQE*)
RAX := SignExtend(EAX);
FI;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>None.</p>
<h2 id="exceptions--all-operating-modes-">Exceptions (All Operating Modes)<a class="anchor" href="#exceptions--all-operating-modes-">
</a></h2>
<p>#UD If the LOCK prefix is used.</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

101
x86/clac.html Normal file
View file

@ -0,0 +1,101 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CLAC
— Clear AC Flag in EFLAGS Register</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CLAC
— Clear AC Flag in EFLAGS Register</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op / En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>NP 0F 01 CA CLAC</td>
<td>ZO</td>
<td>V/V</td>
<td>SMAP</td>
<td>Clear the AC flag in the EFLAGS register.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>ZO</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Clears the AC flag bit in EFLAGS register. This disables any alignment checking of user-mode data accesses. Ifthe SMAP bit is set in the CR4 register, this disallows explicit supervisor-mode data accesses to user-mode pages.</p>
<p>This instruction's operation is the same in non-64-bit modes and 64-bit mode. Attempts to execute CLAC when CPL &gt; 0 cause #UD.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>EFLAGS.AC := 0;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>AC cleared. Other flags are unaffected.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="3">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If the CPL &gt; 0.</td></tr>
<tr>
<td>If CPUID.(EAX=07H, ECX=0H):EBX.SMAP[bit 20] = 0.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="2">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If CPUID.(EAX=07H, ECX=0H):EBX.SMAP[bit 20] = 0.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>The CLAC instruction is not recognized in virtual-8086 mode.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="3">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If the CPL &gt; 0.</td></tr>
<tr>
<td>If CPUID.(EAX=07H, ECX=0H):EBX.SMAP[bit 20] = 0.</td></tr></table>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="3">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If the CPL &gt; 0.</td></tr>
<tr>
<td>If CPUID.(EAX=07H, ECX=0H):EBX.SMAP[bit 20] = 0.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

57
x86/clc.html Normal file
View file

@ -0,0 +1,57 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CLC
— Clear Carry Flag</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CLC
— Clear Carry Flag</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>F8</td>
<td>CLC</td>
<td>ZO</td>
<td>Valid</td>
<td>Valid</td>
<td>Clear CF flag.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>ZO</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Clears the CF flag in the EFLAGS register. Operation is the same in all modes.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>CF := 0;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>The CF flag is set to 0. The OF, ZF, SF, AF, and PF flags are unaffected.</p>
<h2 id="exceptions--all-operating-modes-">Exceptions (All Operating Modes)<a class="anchor" href="#exceptions--all-operating-modes-">
</a></h2>
<p>#UD If the LOCK prefix is used.</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

57
x86/cld.html Normal file
View file

@ -0,0 +1,57 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CLD
— Clear Direction Flag</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CLD
— Clear Direction Flag</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>FC</td>
<td>CLD</td>
<td>ZO</td>
<td>Valid</td>
<td>Valid</td>
<td>Clear DF flag.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>ZO</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Clears the DF flag in the EFLAGS register. When the DF flag is set to 0, string operations increment the index registers (ESI and/or EDI). Operation is the same in all modes.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>DF := 0;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>The DF flag is set to 0. The CF, OF, ZF, SF, AF, and PF flags are unaffected.</p>
<h2 id="exceptions--all-operating-modes-">Exceptions (All Operating Modes)<a class="anchor" href="#exceptions--all-operating-modes-">
</a></h2>
<p>#UD If the LOCK prefix is used.</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

96
x86/cldemote.html Normal file
View file

@ -0,0 +1,96 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CLDEMOTE
— Cache Line Demote</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CLDEMOTE
— Cache Line Demote</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>NP 0F 1C /0 CLDEMOTE m8</td>
<td>A</td>
<td>V/V</td>
<td>CLDEMOTE</td>
<td>Hint to hardware to move the cache line containing m8 to a more distant level of the cache without writing back to memory.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<sup>1</sup><a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<blockquote>
<p>1. The Mod field of the ModR/M byte cannot have value 11B.</p></blockquote>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>ModRM:r/m (w)</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Hints to hardware that the cache line that contains the linear address specified with the memory operand should be moved (“demoted”) from the cache(s) closest to the processor core to a level more distant from the processor core. This may accelerate subsequent accesses to the line by other cores in the same coherence domain, especially if the line was written by the core that demotes the line. Moving the line in such a manner is a performance optimization, i.e., it is a hint which does not modify architectural state. Hardware may choose which level in the cache hierarchy to retain the line (e.g., L3 in typical server designs). The source operand is a byte memory location.</p>
<p>The availability of the CLDEMOTE instruction is indicated by the presence of the CPUID feature flag CLDEMOTE (bit 25 of the ECX register in sub-leaf 07H, see “CPUID—CPU Identification”). On processors which do not support the CLDEMOTE instruction (including legacy hardware) the instruction will be treated as a NOP.</p>
<p>A CLDEMOTE instruction is ordered with respect to stores to the same cache line, but unordered with respect to other instructions including memory fences, CLDEMOTE, CLWB or CLFLUSHOPT instructions to a different cache line. Since CLDEMOTE will retire in order with respect to stores to the same cache line, software should ensure that after issuing CLDEMOTE the line is not accessed again immediately by the same core to avoid cache data movement penalties.</p>
<p>The effective memory type of the page containing the affected line determines the effect; cacheable types are likely to generate a data movement operation, while uncacheable types may cause the instruction to be ignored.</p>
<p>Speculative fetching can occur at any time and is not tied to instruction execution. The CLDEMOTE instruction is not ordered with respect to PREFETCHh instructions or any of the speculative fetching mechanisms. That is, data can be speculatively loaded into a cache line just before, during, or after the execution of a CLDEMOTE instruction that references the cache line.</p>
<p>Unlike CLFLUSH, CLFLUSHOPT, and CLWB instructions, CLDEMOTE is not guaranteed to write back modified data to memory.</p>
<p>The CLDEMOTE instruction may be ignored by hardware in certain cases and is not a guarantee.</p>
<p>The CLDEMOTE instruction can be used at all privilege levels. In certain processor implementations the CLDEMOTE instruction may set the A bit but not the D bit in the page tables.</p>
<p>If the line is not found in the cache, the instruction will be treated as a NOP.</p>
<p>In some implementations, the CLDEMOTE instruction may always cause a transactional abort with Transactional Synchronization Extensions (TSX). However, programmers must not rely on CLDEMOTE instruction to force a transactional abort.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>Cache_Line_Demote(m8);
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>None.</p>
<h2 id="c-c++-compiler-intrinsic-equivalent">C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>CLDEMOTE void _cldemote(const void*);
</pre>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<p>Same exceptions as in real address mode.</p>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

121
x86/clflush.html Normal file
View file

@ -0,0 +1,121 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CLFLUSH
— Flush Cache Line</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CLFLUSH
— Flush Cache Line</h1>
<table>
<tr>
<th>Opcode / Instruction</th>
<th>Op/En</th>
<th>64-bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>NP 0F AE /7 CLFLUSH <em>m8</em></td>
<td>M</td>
<td>Valid</td>
<td>Valid</td>
<td>Flushes cache line containing <em>m8</em>.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>M</td>
<td>ModRM:r/m (w)</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Invalidates from every level of the cache hierarchy in the cache coherence domain the cache line that contains the linear address specified with the memory operand. If that cache line contains modified data at any level of the cache hierarchy, that data is written back to memory. The source operand is a byte memory location.</p>
<p>The availability of CLFLUSH is indicated by the presence of the CPUID feature flag CLFSH (CPUID.01H:EDX[bit 19]). The aligned cache line size affected is also indicated with the CPUID instruction (bits 8 through 15 of the EBX register when the initial value in the EAX register is 1).</p>
<p>The memory attribute of the page containing the affected line has no effect on the behavior of this instruction. It should be noted that processors are free to speculatively fetch and cache data from system memory regions assigned a memory-type allowing for speculative reads (such as, the WB, WC, and WT memory types). PREFETCH<em>h</em> instructions can be used to provide the processor with hints for this speculative behavior. Because this speculative fetching can occur at any time and is not tied to instruction execution, the CLFLUSH instruction is not ordered with respect to PREFETCH<em>h</em> instructions or any of the speculative fetching mechanisms (that is, data can be speculatively loaded into a cache line just before, during, or after the execution of a CLFLUSH instruction that references the cache line).</p>
<p>Executions of the CLFLUSH instruction are ordered with respect to each other and with respect to writes, locked read-modify-write instructions, and fence instructions.<sup>1</sup> They are not ordered with respect to executions of CLFLUSHOPT and CLWB. Software can use the SFENCE instruction to order an execution of CLFLUSH relative to one of those operations.</p>
<blockquote>
<p>1. Earlier versions of this manual specified that executions of the CLFLUSH instruction were ordered only by the MFENCE instruction. All processors implementing the CLFLUSH instruction also order it relative to the other operations enumerated above.</p></blockquote>
<p>The CLFLUSH instruction can be used at all privilege levels and is subject to all permission checking and faults associated with a byte load (and in addition, a CLFLUSH instruction is allowed to flush a linear address in an execute-only segment). Like a load, the CLFLUSH instruction sets the A bit but not the D bit in the page tables.</p>
<p>In some implementations, the CLFLUSH instruction may always cause transactional abort with Transactional Synchronization Extensions (TSX). The CLFLUSH instruction is not expected to be commonly used inside typical transactional regions. However, programmers must not rely on CLFLUSH instruction to force a transactional abort, since whether they cause transactional abort is implementation dependent.</p>
<p>The CLFLUSH instruction was introduced with the SSE2 extensions; however, because it has its own CPUID feature flag, it can be implemented in IA-32 processors that do not include the SSE2 extensions. Also, detecting the presence of the SSE2 extensions with the CPUID instruction does not guarantee that the CLFLUSH instruction is implemented in the processor.</p>
<p>CLFLUSH operation is the same in non-64-bit modes and 64-bit mode.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>Flush_Cache_Line(SRC);
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalents">Intel C/C++ Compiler Intrinsic Equivalents<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalents">
</a></h2>
<pre>CLFLUSH void _mm_clflush(void const *p)
</pre>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP(0)</td>
<td>For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments.</td></tr>
<tr>
<td>#SS(0)</td>
<td>For an illegal address in the SS segment.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>For a page fault.</td></tr>
<tr>
<td rowspan="2">#UD</td>
<td>If CPUID.01H:EDX.CLFSH[bit 19] = 0.</td></tr>
<tr>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP</td>
<td>If any part of the operand lies outside the effective address space from 0 to FFFFH.</td></tr>
<tr>
<td rowspan="2">#UD</td>
<td>If CPUID.01H:EDX.CLFSH[bit 19] = 0.</td></tr>
<tr>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<p>Same exceptions as in real address mode.</p>
<table>
<tr>
<td>#PF(fault-code)</td>
<td>For a page fault.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#SS(0)</td>
<td>If a memory address referencing the SS segment is in a non-canonical form.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If the memory address is in a non-canonical form.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>For a page fault.</td></tr>
<tr>
<td rowspan="2">#UD</td>
<td>If CPUID.01H:EDX.CLFSH[bit 19] = 0.</td></tr>
<tr>
<td>If the LOCK prefix is used.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

124
x86/clflushopt.html Normal file
View file

@ -0,0 +1,124 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CLFLUSHOPT
— Flush Cache Line Optimized</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CLFLUSHOPT
— Flush Cache Line Optimized</h1>
<table>
<tr>
<th>Opcode / Instruction</th>
<th>Op/En</th>
<th>64-bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>NFx 66 0F AE /7 CLFLUSHOPT m8</td>
<td>M</td>
<td>Valid</td>
<td>Valid</td>
<td>Flushes cache line containing m8.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>M</td>
<td>ModRM:r/m (w)</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Invalidates from every level of the cache hierarchy in the cache coherence domain the cache line that contains the linear address specified with the memory operand. If that cache line contains modified data at any level of the cache hierarchy, that data is written back to memory. The source operand is a byte memory location.</p>
<p>The availability of CLFLUSHOPT is indicated by the presence of the CPUID feature flag CLFLUSHOPT (CPUID.(EAX=07H,ECX=0H):EBX[bit 23]). The aligned cache line size affected is also indicated with the CPUID instruction (bits 8 through 15 of the EBX register when the initial value in the EAX register is 1).</p>
<p>The memory attribute of the page containing the affected line has no effect on the behavior of this instruction. It should be noted that processors are free to speculatively fetch and cache data from system memory regions assigned a memory-type allowing for speculative reads (such as, the WB, WC, and WT memory types). PREFETCH<em>h</em> instructions can be used to provide the processor with hints for this speculative behavior. Because this speculative fetching can occur at any time and is not tied to instruction execution, the CLFLUSH instruction is not ordered with respect to PREFETCH<em>h</em> instructions or any of the speculative fetching mechanisms (that is, data can be speculatively loaded into a cache line just before, during, or after the execution of a CLFLUSH instruction that references the cache line).</p>
<p>Executions of the CLFLUSHOPT instruction are ordered with respect to fence instructions and to locked read-modify-write instructions; they are also ordered with respect to older writes to the cache line being invalidated. They are not ordered with respect to other executions of CLFLUSHOPT, to executions of CLFLUSH and CLWB, or to younger writes to the cache line being invalidated. Software can use the SFENCE instruction to order an execution of CLFLUSHOPT relative to one of those operations.</p>
<p>The CLFLUSHOPT instruction can be used at all privilege levels and is subject to all permission checking and faults associated with a byte load (and in addition, a CLFLUSHOPT instruction is allowed to flush a linear address in an execute-only segment). Like a load, the CLFLUSHOPT instruction sets the A bit but not the D bit in the page tables.</p>
<p>In some implementations, the CLFLUSHOPT instruction may always cause transactional abort with Transactional Synchronization Extensions (TSX). The CLFLUSHOPT instruction is not expected to be commonly used inside typical transactional regions. However, programmers must not rely on CLFLUSHOPT instruction to force a transactional abort, since whether they cause transactional abort is implementation dependent.</p>
<p>CLFLUSHOPT operation is the same in non-64-bit modes and 64-bit mode.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>Flush_Cache_Line_Optimized(SRC);
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalents">Intel C/C++ Compiler Intrinsic Equivalents<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalents">
</a></h2>
<pre>CLFLUSHOPT void _mm_clflushopt(void const *p)
</pre>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP(0)</td>
<td>For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments.</td></tr>
<tr>
<td>#SS(0)</td>
<td>For an illegal address in the SS segment.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>For a page fault.</td></tr>
<tr>
<td rowspan="3">#UD</td>
<td>If CPUID.(EAX=07H,ECX=0H):EBX.CLFLUSHOPT[bit 23] = 0.</td></tr>
<tr>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If an instruction prefix F2H or F3H is used.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP</td>
<td>If any part of the operand lies outside the effective address space from 0 to FFFFH.</td></tr>
<tr>
<td rowspan="3">#UD</td>
<td>If CPUID.(EAX=07H,ECX=0H):EBX.CLFLUSHOPT[bit 23] = 0.</td></tr>
<tr>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If an instruction prefix F2H or F3H is used.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<p>Same exceptions as in real address mode.</p>
<table>
<tr>
<td>#PF(fault-code)</td>
<td>For a page fault.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#SS(0)</td>
<td>If a memory address referencing the SS segment is in a non-canonical form.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If the memory address is in a non-canonical form.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>For a page fault.</td></tr>
<tr>
<td rowspan="3">#UD</td>
<td>If CPUID.(EAX=07H,ECX=0H):EBX.CLFLUSHOPT[bit 23] = 0.</td></tr>
<tr>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If an instruction prefix F2H or F3H is used.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

150
x86/cli.html Normal file
View file

@ -0,0 +1,150 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CLI
— Clear Interrupt Flag</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CLI
— Clear Interrupt Flag</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>FA</td>
<td>CLI</td>
<td>ZO</td>
<td>Valid</td>
<td>Valid</td>
<td>Clear interrupt flag; interrupts disabled when interrupt flag cleared.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>ZO</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>In most cases, CLI clears the IF flag in the EFLAGS register and no other flags are affected. Clearing the IF flag causes the processor to ignore maskable external interrupts. The IF flag and the CLI and STI instruction have no effect on the generation of exceptions and NMI interrupts.</p>
<p>Operation is different in two modes defined as follows:</p>
<ul>
<li><strong>PVI mode </strong>(protected-mode virtual interrupts): CR0.PE = 1, EFLAGS.VM = 0, CPL = 3, and CR4.PVI = 1;</li>
<li><strong>VME mode </strong>(virtual-8086 mode extensions): CR0.PE = 1, EFLAGS.VM = 1, and CR4.VME = 1.</li></ul>
<p>If IOPL &lt; 3 and either VME mode or PVI mode is active, CLI clears the VIF flag in the EFLAGS register, leaving IF unaffected.</p>
<p><a href='cli.html#tbl-3-7'>Table 3-7</a> indicates the action of the CLI instruction depending on the processor operating mode, IOPL, and CPL.</p>
<figure id="tbl-3-7">
<table>
<tr>
<th>Mode</th>
<th>IOPL</th>
<th>CLI Result</th></tr>
<tr>
<td>Real-address</td>
<td><sub>X</sub><sup>1</sup></td>
<td>IF = 0</td></tr>
<tr>
<td rowspan="2">Protected, not PVI<sup>2</sup></td>
<td>≥ CPL</td>
<td>IF = 0</td></tr>
<tr>
<td>&lt; CPL</td>
<td>#GP fault</td></tr>
<tr>
<td rowspan="2">Protected, PVI<sup>3</sup></td>
<td>3</td>
<td>IF = 0</td></tr>
<tr>
<td>02</td>
<td>VIF = 0</td></tr>
<tr>
<td rowspan="2">Virtual-8086, not VME<sup>3</sup></td>
<td>3</td>
<td>IF = 0</td></tr>
<tr>
<td>02</td>
<td>#GP fault</td></tr>
<tr>
<td rowspan="2">Virtual-8086, VME<sup>3</sup></td>
<td>3</td>
<td>IF = 0</td></tr>
<tr>
<td>02</td>
<td>VIF = 0</td></tr></table>
<figcaption><a href='cli.html#tbl-3-7'>Table 3-7</a>. Decision Table for CLI Results</figcaption></figure>
<blockquote>
<p>1. X = This setting has no effect on instruction operation.</p>
<p>2. For this table, “protected mode” applies whenever CR0.PE = 1 and EFLAGS.VM = 0; it includes compatibility mode and 64-bit mode.</p>
<p>3. PVI mode and virtual-8086 mode each imply CPL = 3.</p></blockquote>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>IF CR0.PE = 0
THEN IF := 0; (* Reset Interrupt Flag *)
ELSE
IF IOPL ≥ CPL (* CPL = 3 if EFLAGS.VM = 1 *)
THEN IF := 0; (* Reset Interrupt Flag *)
ELSE
IF VME mode OR PVI mode
THEN VIF := 0; (* Reset Virtual Interrupt Flag *)
ELSE #GP(0);
FI;
FI;
FI;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>Either the IF flag or the VIF flag is cleared to 0. Other flags are unaffected.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="2">#GP(0)</td>
<td>If CPL is greater than IOPL and PVI mode is not active.</td></tr>
<tr>
<td>If CPL is greater than IOPL and less than 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP(0)</td>
<td>If IOPL is less than 3 and VME mode is not active.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

150
x86/clrssbsy.html Normal file
View file

@ -0,0 +1,150 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CLRSSBSY
— Clear Busy Flag in a Supervisor Shadow Stack Token</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CLRSSBSY
— Clear Busy Flag in a Supervisor Shadow Stack Token</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op / En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>F3 0F AE /6 CLRSSBSY m64</td>
<td>M</td>
<td>V/V</td>
<td>CET_SS</td>
<td>Clear busy flag in supervisor shadow stack token reference by m64.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>M</td>
<td>N/A</td>
<td>ModRM:r/m (r, w)</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Clear busy flag in supervisor shadow stack token reference by m64. Subsequent to marking the shadow stack as not busy the SSP is loaded with value 0.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>IF (CR4.CET = 0)
THEN #UD; FI;
IF (IA32_S_CET.SH_STK_EN = 0)
THEN #UD; FI;
IF CPL &gt; 0
THEN GP(0); FI;
SSP_LA = Linear_Address(mem operand)
IF SSP_LA not aligned to 8 bytes
THEN #GP(0); FI;
expected_token_value=SSP_LA|BUSY_BIT (*busybit-bitposition0-mustbeset*)
new_token_value = SSP_LA (* Clear the busy bit *)
IF shadow_stack_lock_cmpxchg8b(SSP_LA, new_token_value, expected_token_value) != expected_token_value
invalid_token := 1; FI
(* Set the CF if invalid token was detected *)
RFLAGS.CF = (invalid_token == 1) ? 1 : 0;
RFLAGS.ZF,PF,AF,OF,SF := 0;
SSP := 0
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>CF is set if an invalid token was detected, else it is cleared. ZF, PF, AF, OF, and SF are cleared.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="3">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If CR4.CET = 0.</td></tr>
<tr>
<td>IF IA32_S_CET.SH_STK_EN = 0.</td></tr>
<tr>
<td rowspan="5">#GP(0)</td>
<td>If memory operand linear address not aligned to 8 bytes.</td></tr>
<tr>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>If destination is located in a non-writeable segment.</td></tr>
<tr>
<td>If the DS, ES, FS, or GS register is used to access memory and it contains a NULL segment selector.</td></tr>
<tr>
<td>If CPL is not 0.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>The CLRSSBSY instruction is not recognized in real-address mode.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>The CLRSSBSY instruction is not recognized in virtual-8086 mode.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>Same exceptions as in protected mode.</td></tr>
<tr>
<td>#GP(0)</td>
<td>Same exceptions as in protected mode.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr></table>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="3">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If CR4.CET = 0.</td></tr>
<tr>
<td>IF IA32_S_CET.SH_STK_EN = 0.</td></tr>
<tr>
<td rowspan="4">#GP(0)</td>
<td>If memory operand linear address not aligned to 8 bytes.</td></tr>
<tr>
<td>If CPL is not 0.</td></tr>
<tr>
<td>If the memory address is in a non-canonical form.</td></tr>
<tr>
<td>If token is invalid.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory address referencing the SS segment is in a non-canonical form.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

97
x86/clts.html Normal file
View file

@ -0,0 +1,97 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CLTS
— Clear Task-Switched Flag in CR0</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CLTS
— Clear Task-Switched Flag in CR0</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>0F 06</td>
<td>CLTS</td>
<td>ZO</td>
<td>Valid</td>
<td>Valid</td>
<td>Clears TS flag in CR0.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>ZO</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Clears the task-switched (TS) flag in the CR0 register. This instruction is intended for use in operating-system procedures. It is a privileged instruction that can only be executed at a CPL of 0. It is allowed to be executed in real-address mode to allow initialization for protected mode.</p>
<p>The processor sets the TS flag every time a task switch occurs. The flag is used to synchronize the saving of FPU context in multitasking applications. See the description of the TS flag in the section titled “Control Registers” in Chapter 2 of the Intel<sup>®</sup> 64 and IA-32 Architectures Software Developers Manual, Volume 3A, for more information about this flag.</p>
<p>CLTS operation is the same in non-64-bit modes and 64-bit mode.</p>
<p>See Chapter 26, “VMX Non-Root Operation,” of the Intel<sup>®</sup> 64 and IA-32 Architectures Software Developers Manual, Volume 3C, for more information about the behavior of this instruction in VMX non-root operation.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>CR0.TS[bit 3] := 0;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>The TS flag in CR0 register is cleared.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP(0)</td>
<td>If the current privilege level is not 0.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP(0)</td>
<td>CLTS is not recognized in virtual-8086 mode.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP(0)</td>
<td>If the CPL is greater than 0.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

95
x86/clui.html Normal file
View file

@ -0,0 +1,95 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CLUI
— Clear User Interrupt Flag</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CLUI
— Clear User Interrupt Flag</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>F3 0F 01 EE CLUI</td>
<td>ZO</td>
<td>V/I</td>
<td>UINTR</td>
<td>Clear user interrupt flag; user interrupts blocked when user interrupt flag cleared.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>ZO</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h3 id="description">Description<a class="anchor" href="#description">
</a></h3>
<p>CLUI clears the user interrupt flag (UIF). Its effect takes place immediately: a user interrupt cannot be delivered on the instruction boundary following CLUI.</p>
<p>An execution of CLUI inside a transactional region causes a transactional abort; the abort loads EAX as it would have had it been caused due to an execution of CLI.</p>
<h3 id="operation">Operation<a class="anchor" href="#operation">
</a></h3>
<pre>UIF := 0;
</pre>
<h3 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h3>
<p>None.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>The CLUI instruction is not recognized in protected mode.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>The CLUI instruction is not recognized in real-address mode.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>The CLUI instruction is not recognized in virtual-8086 mode.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>The CLUI instruction is not recognized in compatibility mode.</td></tr></table>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="4">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If executed inside an enclave.</td></tr>
<tr>
<td>If CR4.UINTR = 0.</td></tr>
<tr>
<td>If CPUID.07H.0H:EDX.UINTR[bit 5] = 0.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

124
x86/clwb.html Normal file
View file

@ -0,0 +1,124 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CLWB
— Cache Line Write Back</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CLWB
— Cache Line Write Back</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>66 0F AE /6 CLWB m8</td>
<td>M</td>
<td>V/V</td>
<td>CLWB</td>
<td>Writes back modified cache line containing m8, and may retain the line in cache hierarchy in non-modified state.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<sup>1</sup><a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<blockquote>
<p>1. The Mod field of the ModR/M byte cannot have value 11B.</p></blockquote>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>M</td>
<td>ModRM:r/m (w)</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Writes back to memory the cache line (if modified) that contains the linear address specified with the memory operand from any level of the cache hierarchy in the cache coherence domain. The line may be retained in the cache hierarchy in non-modified state. Retaining the line in the cache hierarchy is a performance optimization (treated as a hint by hardware) to reduce the possibility of cache miss on a subsequent access. Hardware may choose to retain the line at any of the levels in the cache hierarchy, and in some cases, may invalidate the line from the cache hierarchy. The source operand is a byte memory location.</p>
<p>The availability of CLWB instruction is indicated by the presence of the CPUID feature flag CLWB (bit 24 of the EBX register, see “CPUID — CPU Identification” in this chapter). The aligned cache line size affected is also indicated with the CPUID instruction (bits 8 through 15 of the EBX register when the initial value in the EAX register is 1).</p>
<p>The memory attribute of the page containing the affected line has no effect on the behavior of this instruction. It should be noted that processors are free to speculatively fetch and cache data from system memory regions that are assigned a memory-type allowing for speculative reads (such as, the WB, WC, and WT memory types). PREFETCHh instructions can be used to provide the processor with hints for this speculative behavior. Because this speculative fetching can occur at any time and is not tied to instruction execution, the CLWB instruction is not ordered with respect to PREFETCHh instructions or any of the speculative fetching mechanisms (that is, data can be speculatively loaded into a cache line just before, during, or after the execution of a CLWB instruction that references the cache line).</p>
<p>Executions of the CLWB instruction are ordered with respect to fence instructions and to locked read-modify-write instructions; they are also ordered with respect to older writes to the cache line being written back. They are not ordered with respect to other executions of CLWB, to executions of CLFLUSH and CLFLUSHOPT, or to younger writes to the cache line being written back. Software can use the SFENCE instruction to order an execution of CLWB relative to one of those operations.</p>
<p>For usages that require only writing back modified data from cache lines to memory (do not require the line to be invalidated), and expect to subsequently access the data, software is recommended to use CLWB (with appropriate fencing) instead of CLFLUSH or CLFLUSHOPT for improved performance.</p>
<p>The CLWB instruction can be used at all privilege levels and is subject to all permission checking and faults associated with a byte load. Like a load, the CLWB instruction sets the accessed flag but not the dirty flag in the page tables.</p>
<p>In some implementations, the CLWB instruction may always cause transactional abort with Transactional Synchronization Extensions (TSX). CLWB instruction is not expected to be commonly used inside typical transactional regions. However, programmers must not rely on CLWB instruction to force a transactional abort, since whether they cause transactional abort is implementation dependent.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>Cache_Line_Write_Back(m8);
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>None.</p>
<h2 id="c-c++-compiler-intrinsic-equivalent">C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>CLWB void _mm_clwb(void const *p);
</pre>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="2">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If CPUID.(EAX=07H, ECX=0H):EBX.CLWB[bit 24] = 0.</td></tr>
<tr>
<td>#GP(0)</td>
<td>For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments.</td></tr>
<tr>
<td>#SS(0)</td>
<td>For an illegal address in the SS segment.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>For a page fault.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="2">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If CPUID.(EAX=07H, ECX=0H):EBX.CLWB[bit 24] = 0.</td></tr>
<tr>
<td>#GP</td>
<td>If any part of the operand lies outside the effective address space from 0 to FFFFH.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<p>Same exceptions as in real address mode.</p>
<table>
<tr>
<td>#PF(fault-code)</td>
<td>For a page fault.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="2">#UD</td>
<td>If the LOCK prefix is used.</td></tr>
<tr>
<td>If CPUID.(EAX=07H, ECX=0H):EBX.CLWB[bit 24] = 0.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory address referencing the SS segment is in a non-canonical form.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If the memory address is in a non-canonical form.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>For a page fault.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

57
x86/cmc.html Normal file
View file

@ -0,0 +1,57 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CMC
— Complement Carry Flag</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CMC
— Complement Carry Flag</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>F5</td>
<td>CMC</td>
<td>ZO</td>
<td>Valid</td>
<td>Valid</td>
<td>Complement CF flag.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>ZO</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Complements the CF flag in the EFLAGS register. CMC operation is the same in non-64-bit modes and 64-bit mode.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>EFLAGS.CF[bit 0] := NOT EFLAGS.CF[bit 0];
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>The CF flag contains the complement of its original value. The OF, ZF, SF, AF, and PF flags are unaffected.</p>
<h2 id="exceptions--all-operating-modes-">Exceptions (All Operating Modes)<a class="anchor" href="#exceptions--all-operating-modes-">
</a></h2>
<p>#UD If the LOCK prefix is used.</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

763
x86/cmovcc.html Normal file
View file

@ -0,0 +1,763 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CMOVcc
— Conditional Move</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CMOVcc
— Conditional Move</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-Bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>0F 47 /r</td>
<td>CMOVA r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if above (CF=0 and ZF=0).</td></tr>
<tr>
<td>0F 47 /r</td>
<td>CMOVA r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if above (CF=0 and ZF=0).</td></tr>
<tr>
<td>REX.W + 0F 47 /r</td>
<td>CMOVA r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if above (CF=0 and ZF=0).</td></tr>
<tr>
<td>0F 43 /r</td>
<td>CMOVAE r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if above or equal (CF=0).</td></tr>
<tr>
<td>0F 43 /r</td>
<td>CMOVAE r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if above or equal (CF=0).</td></tr>
<tr>
<td>REX.W + 0F 43 /r</td>
<td>CMOVAE r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if above or equal (CF=0).</td></tr>
<tr>
<td>0F 42 /r</td>
<td>CMOVB r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if below (CF=1).</td></tr>
<tr>
<td>0F 42 /r</td>
<td>CMOVB r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if below (CF=1).</td></tr>
<tr>
<td>REX.W + 0F 42 /r</td>
<td>CMOVB r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if below (CF=1).</td></tr>
<tr>
<td>0F 46 /r</td>
<td>CMOVBE r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if below or equal (CF=1 or ZF=1).</td></tr>
<tr>
<td>0F 46 /r</td>
<td>CMOVBE r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if below or equal (CF=1 or ZF=1).</td></tr>
<tr>
<td>REX.W + 0F 46 /r</td>
<td>CMOVBE r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if below or equal (CF=1 or ZF=1).</td></tr>
<tr>
<td>0F 42 /r</td>
<td>CMOVC r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if carry (CF=1).</td></tr>
<tr>
<td>0F 42 /r</td>
<td>CMOVC r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if carry (CF=1).</td></tr>
<tr>
<td>REX.W + 0F 42 /r</td>
<td>CMOVC r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if carry (CF=1).</td></tr>
<tr>
<td>0F 44 /r</td>
<td>CMOVE r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if equal (ZF=1).</td></tr>
<tr>
<td>0F 44 /r</td>
<td>CMOVE r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if equal (ZF=1).</td></tr>
<tr>
<td>REX.W + 0F 44 /r</td>
<td>CMOVE r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if equal (ZF=1).</td></tr>
<tr>
<td>0F 4F /r</td>
<td>CMOVG r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if greater (ZF=0 and SF=OF).</td></tr>
<tr>
<td>0F 4F /r</td>
<td>CMOVG r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if greater (ZF=0 and SF=OF).</td></tr>
<tr>
<td>REX.W + 0F 4F /r</td>
<td>CMOVG r64, r/m64</td>
<td>RM</td>
<td>V/N.E.</td>
<td>N/A</td>
<td>Move if greater (ZF=0 and SF=OF).</td></tr>
<tr>
<td>0F 4D /r</td>
<td>CMOVGE r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if greater or equal (SF=OF).</td></tr>
<tr>
<td>0F 4D /r</td>
<td>CMOVGE r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if greater or equal (SF=OF).</td></tr>
<tr>
<td>REX.W + 0F 4D /r</td>
<td>CMOVGE r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if greater or equal (SF=OF).</td></tr>
<tr>
<td>0F 4C /r</td>
<td>CMOVL r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if less (SF≠ OF).</td></tr>
<tr>
<td>0F 4C /r</td>
<td>CMOVL r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if less (SF≠ OF).</td></tr>
<tr>
<td>REX.W + 0F 4C /r</td>
<td>CMOVL r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if less (SF≠ OF).</td></tr>
<tr>
<td>0F 4E /r</td>
<td>CMOVLE r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if less or equal (ZF=1 or SF≠ OF).</td></tr>
<tr>
<td>0F 4E /r</td>
<td>CMOVLE r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if less or equal (ZF=1 or SF≠ OF).</td></tr>
<tr>
<td>REX.W + 0F 4E /r</td>
<td>CMOVLE r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if less or equal (ZF=1 or SF≠ OF).</td></tr>
<tr>
<td>0F 46 /r</td>
<td>CMOVNA r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not above (CF=1 or ZF=1).</td></tr>
<tr>
<td>0F 46 /r</td>
<td>CMOVNA r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not above (CF=1 or ZF=1).</td></tr>
<tr>
<td>REX.W + 0F 46 /r</td>
<td>CMOVNA r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if not above (CF=1 or ZF=1).</td></tr>
<tr>
<td>0F 42 /r</td>
<td>CMOVNAE r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not above or equal (CF=1).</td></tr>
<tr>
<td>0F 42 /r</td>
<td>CMOVNAE r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not above or equal (CF=1).</td></tr>
<tr>
<td>REX.W + 0F 42 /r</td>
<td>CMOVNAE r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if not above or equal (CF=1).</td></tr>
<tr>
<td>0F 43 /r</td>
<td>CMOVNB r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not below (CF=0).</td></tr>
<tr>
<td>0F 43 /r</td>
<td>CMOVNB r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not below (CF=0).</td></tr>
<tr>
<td>REX.W + 0F 43 /r</td>
<td>CMOVNB r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if not below (CF=0).</td></tr>
<tr>
<td>0F 47 /r</td>
<td>CMOVNBE r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not below or equal (CF=0 and ZF=0).</td></tr>
<tr>
<td>0F 47 /r</td>
<td>CMOVNBE r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not below or equal (CF=0 and ZF=0).</td></tr>
<tr>
<td>REX.W + 0F 47 /r</td>
<td>CMOVNBE r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if not below or equal (CF=0 and ZF=0).</td></tr>
<tr>
<td>0F 43 /r</td>
<td>CMOVNC r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not carry (CF=0).</td></tr>
<tr>
<td>0F 43 /r</td>
<td>CMOVNC r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not carry (CF=0).</td></tr>
<tr>
<td>REX.W + 0F 43 /r</td>
<td>CMOVNC r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if not carry (CF=0).</td></tr>
<tr>
<td>0F 45 /r</td>
<td>CMOVNE r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not equal (ZF=0).</td></tr>
<tr>
<td>0F 45 /r</td>
<td>CMOVNE r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not equal (ZF=0).</td></tr>
<tr>
<td>REX.W + 0F 45 /r</td>
<td>CMOVNE r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if not equal (ZF=0).</td></tr>
<tr>
<td>0F 4E /r</td>
<td>CMOVNG r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not greater (ZF=1 or SF≠ OF).</td></tr>
<tr>
<td>0F 4E /r</td>
<td>CMOVNG r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not greater (ZF=1 or SF≠ OF).</td></tr>
<tr>
<td>REX.W + 0F 4E /r</td>
<td>CMOVNG r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if not greater (ZF=1 or SF≠ OF).</td></tr>
<tr>
<td>0F 4C /r</td>
<td>CMOVNGE r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not greater or equal (SF≠ OF).</td></tr>
<tr>
<td>0F 4C /r</td>
<td>CMOVNGE r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not greater or equal (SF≠ OF).</td></tr>
<tr>
<td>REX.W + 0F 4C /r</td>
<td>CMOVNGE r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if not greater or equal (SF≠ OF).</td></tr>
<tr>
<td>0F 4D /r</td>
<td>CMOVNL r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not less (SF=OF).</td></tr>
<tr>
<td>0F 4D /r</td>
<td>CMOVNL r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not less (SF=OF).</td></tr>
<tr>
<td>REX.W + 0F 4D /r</td>
<td>CMOVNL r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if not less (SF=OF).</td></tr>
<tr>
<td>0F 4F /r</td>
<td>CMOVNLE r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not less or equal (ZF=0 and SF=OF).</td></tr>
<tr>
<td>0F 4F /r</td>
<td>CMOVNLE r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not less or equal (ZF=0 and SF=OF).</td></tr>
<tr>
<td>REX.W + 0F 4F /r</td>
<td>CMOVNLE r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if not less or equal (ZF=0 and SF=OF).</td></tr>
<tr>
<td>0F 41 /r</td>
<td>CMOVNO r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not overflow (OF=0).</td></tr>
<tr>
<td>0F 41 /r</td>
<td>CMOVNO r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not overflow (OF=0).</td></tr>
<tr>
<td>REX.W + 0F 41 /r</td>
<td>CMOVNO r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if not overflow (OF=0).</td></tr>
<tr>
<td>0F 4B /r</td>
<td>CMOVNP r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not parity (PF=0).</td></tr>
<tr>
<td>0F 4B /r</td>
<td>CMOVNP r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not parity (PF=0).</td></tr>
<tr>
<td>REX.W + 0F 4B /r</td>
<td>CMOVNP r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if not parity (PF=0).</td></tr>
<tr>
<td>0F 49 /r</td>
<td>CMOVNS r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not sign (SF=0).</td></tr>
<tr>
<td>0F 49 /r</td>
<td>CMOVNS r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not sign (SF=0).</td></tr>
<tr>
<td>REX.W + 0F 49 /r</td>
<td>CMOVNS r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if not sign (SF=0).</td></tr>
<tr>
<td>0F 45 /r</td>
<td>CMOVNZ r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not zero (ZF=0).</td></tr>
<tr>
<td>0F 45 /r</td>
<td>CMOVNZ r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if not zero (ZF=0).</td></tr>
<tr>
<td>REX.W + 0F 45 /r</td>
<td>CMOVNZ r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if not zero (ZF=0).</td></tr>
<tr>
<td>0F 40 /r</td>
<td>CMOVO r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if overflow (OF=1).</td></tr>
<tr>
<td>0F 40 /r</td>
<td>CMOVO r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if overflow (OF=1).</td></tr>
<tr>
<td>REX.W + 0F 40 /r</td>
<td>CMOVO r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if overflow (OF=1).</td></tr>
<tr>
<td>0F 4A /r</td>
<td>CMOVP r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if parity (PF=1).</td></tr>
<tr>
<td>0F 4A /r</td>
<td>CMOVP r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if parity (PF=1).</td></tr>
<tr>
<td>REX.W + 0F 4A /r</td>
<td>CMOVP r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if parity (PF=1).</td></tr>
<tr>
<td>0F 4A /r</td>
<td>CMOVPE r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if parity even (PF=1).</td></tr>
<tr>
<td>0F 4A /r</td>
<td>CMOVPE r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if parity even (PF=1).</td></tr>
<tr>
<td>REX.W + 0F 4A /r</td>
<td>CMOVPE r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if parity even (PF=1).</td></tr>
<tr>
<td>0F 4B /r</td>
<td>CMOVPO r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if parity odd (PF=0).</td></tr>
<tr>
<td>0F 4B /r</td>
<td>CMOVPO r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if parity odd (PF=0).</td></tr>
<tr>
<td>REX.W + 0F 4B /r</td>
<td>CMOVPO r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if parity odd (PF=0).</td></tr>
<tr>
<td>0F 48 /r</td>
<td>CMOVS r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if sign (SF=1).</td></tr>
<tr>
<td>0F 48 /r</td>
<td>CMOVS r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if sign (SF=1).</td></tr>
<tr>
<td>REX.W + 0F 48 /r</td>
<td>CMOVS r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if sign (SF=1).</td></tr>
<tr>
<td>0F 44 /r</td>
<td>CMOVZ r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if zero (ZF=1).</td></tr>
<tr>
<td>0F 44 /r</td>
<td>CMOVZ r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Move if zero (ZF=1).</td></tr>
<tr>
<td>REX.W + 0F 44 /r</td>
<td>CMOVZ r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Move if zero (ZF=1).</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RM</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Each of the CMOVcc instructions performs a move operation if the status flags in the EFLAGS register (CF, OF, PF, SF, and ZF) are in a specified state (or condition). A condition code (<em>cc</em>) is associated with each instruction to indicate the condition being tested for. If the condition is not satisfied, a move is not performed and execution continues with the instruction following the CMOV<em>cc</em> instruction.</p>
<p>Specifically, CMOVcc loads data from its source operand into a temporary register unconditionally (regardless of the condition code and the status flags in the EFLAGS register). If the condition code associated with the instruction (cc) is satisfied, the data in the temporary register is then copied into the instruction's destination operand.</p>
<p>These instructions can move 16-bit, 32-bit or 64-bit values from memory to a general-purpose register or from one general-purpose register to another. Conditional moves of 8-bit register operands are not supported.</p>
<p>The condition for each CMOV<em>cc</em> mnemonic is given in the description column of the above table. The terms “less” and “greater” are used for comparisons of signed integers and the terms “above” and “below” are used for unsigned integers.</p>
<p>Because a particular state of the status flags can sometimes be interpreted in two ways, two mnemonics are defined for some opcodes. For example, the CMOVA (conditional move if above) instruction and the CMOVNBE (conditional move if not below or equal) instruction are alternate mnemonics for the opcode 0F 47H.</p>
<p>The CMOV<em>cc</em> instructions were introduced in P6 family processors; however, these instructions may not be supported by all IA-32 processors. Software can determine if the CMOV<em>cc</em> instructions are supported by checking the processors feature information with the CPUID instruction (see “CPUID—CPU Identification” in this chapter).</p>
<p>In 64-bit mode, the instructions default operation size is 32 bits. Use of the REX.R prefix permits access to additional registers (R8-R15). Use of the REX.W prefix promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>temp := SRC
IF condition TRUE
THEN DEST := temp;
ELSE IF (OperandSize = 32 and IA-32e mode active)
THEN DEST[63:32] := 0;
FI;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="2">#GP(0)</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>If the DS, ES, FS, or GS register contains a NULL segment selector.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP(0)</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#SS(0)</td>
<td>If a memory address referencing the SS segment is in a non-canonical form.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If the memory address is in a non-canonical form.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

296
x86/cmp.html Normal file
View file

@ -0,0 +1,296 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CMP
— Compare Two Operands</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CMP
— Compare Two Operands</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-Bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>3C ib</td>
<td>CMP AL, imm8</td>
<td>I</td>
<td>Valid</td>
<td>Valid</td>
<td>Compare imm8 with AL.</td></tr>
<tr>
<td>3D iw</td>
<td>CMP AX, imm16</td>
<td>I</td>
<td>Valid</td>
<td>Valid</td>
<td>Compare imm16 with AX.</td></tr>
<tr>
<td>3D id</td>
<td>CMP EAX, imm32</td>
<td>I</td>
<td>Valid</td>
<td>Valid</td>
<td>Compare imm32 with EAX.</td></tr>
<tr>
<td>REX.W + 3D id</td>
<td>CMP RAX, imm32</td>
<td>I</td>
<td>Valid</td>
<td>N.E.</td>
<td>Compare imm32 sign-extended to 64-bits with RAX.</td></tr>
<tr>
<td>80 /7 ib</td>
<td>CMP r/m8, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>Compare imm8 with r/m8.</td></tr>
<tr>
<td>REX + 80 /7 ib</td>
<td>CMP r/m8<sup>*</sup>, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>N.E.</td>
<td>Compare imm8 with r/m8.</td></tr>
<tr>
<td>81 /7 iw</td>
<td>CMP r/m16, imm16</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>Compare imm16 with r/m16.</td></tr>
<tr>
<td>81 /7 id</td>
<td>CMP r/m32, imm32</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>Compare imm32 with r/m32.</td></tr>
<tr>
<td>REX.W + 81 /7 id</td>
<td>CMP r/m64, imm32</td>
<td>MI</td>
<td>Valid</td>
<td>N.E.</td>
<td>Compare imm32 sign-extended to 64-bits with r/m64.</td></tr>
<tr>
<td>83 /7 ib</td>
<td>CMP r/m16, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>Compare imm8 with r/m16.</td></tr>
<tr>
<td>83 /7 ib</td>
<td>CMP r/m32, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>Valid</td>
<td>Compare imm8 with r/m32.</td></tr>
<tr>
<td>REX.W + 83 /7 ib</td>
<td>CMP r/m64, imm8</td>
<td>MI</td>
<td>Valid</td>
<td>N.E.</td>
<td>Compare imm8 with r/m64.</td></tr>
<tr>
<td>38 /r</td>
<td>CMP r/m8, r8</td>
<td>MR</td>
<td>Valid</td>
<td>Valid</td>
<td>Compare r8 with r/m8.</td></tr>
<tr>
<td>REX + 38 /r</td>
<td>CMP r/m8<sup>*</sup>, r8<sup>*</sup></td>
<td>MR</td>
<td>Valid</td>
<td>N.E.</td>
<td>Compare r8 with r/m8.</td></tr>
<tr>
<td>39 /r</td>
<td>CMP r/m16, r16</td>
<td>MR</td>
<td>Valid</td>
<td>Valid</td>
<td>Compare r16 with r/m16.</td></tr>
<tr>
<td>39 /r</td>
<td>CMP r/m32, r32</td>
<td>MR</td>
<td>Valid</td>
<td>Valid</td>
<td>Compare r32 with r/m32.</td></tr>
<tr>
<td>REX.W + 39 /r</td>
<td>CMP r/m64,r64</td>
<td>MR</td>
<td>Valid</td>
<td>N.E.</td>
<td>Compare r64 with r/m64.</td></tr>
<tr>
<td>3A /r</td>
<td>CMP r8, r/m8</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Compare r/m8 with r8.</td></tr>
<tr>
<td>REX + 3A /r</td>
<td>CMP r8<sup>*</sup>, r/m8<sup>*</sup></td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Compare r/m8 with r8.</td></tr>
<tr>
<td>3B /r</td>
<td>CMP r16, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Compare r/m16 with r16.</td></tr>
<tr>
<td>3B /r</td>
<td>CMP r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Compare r/m32 with r32.</td></tr>
<tr>
<td>REX.W + 3B /r</td>
<td>CMP r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Compare r/m64 with r64.</td></tr></table>
<blockquote>
<p>* In64-bitmode,r/m8cannotbeencodedtoaccessthefollowingbyteregistersifaREXprefixisused:AH,BH,CH,DH.</p></blockquote>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RM</td>
<td>ModRM:reg (r)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>MR</td>
<td>ModRM:r/m (r)</td>
<td>ModRM:reg (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>MI</td>
<td>ModRM:r/m (r)</td>
<td>imm8/16/32</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>I</td>
<td>AL/AX/EAX/RAX (r)</td>
<td>imm8/16/32</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Compares the first source operand with the second source operand and sets the status flags in the EFLAGS register according to the results. The comparison is performed by subtracting the second operand from the first operand and then setting the status flags in the same manner as the SUB instruction. When an immediate value is used as an operand, it is sign-extended to the length of the first operand.</p>
<p>The condition codes used by the J<em>cc</em>, CMOV<em>cc</em>, and SET<em>cc</em> instructions are based on the results of a CMP instruction. Appendix B, “EFLAGS Condition Codes,” in the Intel<sup>®</sup> 64 and IA-32 Architectures Software Developers Manual, Volume 1, shows the relationship of the status flags and the condition codes.</p>
<p>In 64-bit mode, the instructions default operation size is 32 bits. Use of the REX.R prefix permits access to additional registers (R8-R15). Use of the REX.W prefix promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>temp := SRC1 SignExtend(SRC2);
ModifyStatusFlags; (* Modify status flags in the same manner as the SUB instruction*)
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>The CF, OF, SF, ZF, AF, and PF flags are set according to the result.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="2">#GP(0)</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>If the DS, ES, FS, or GS register contains a NULL segment selector.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP(0)</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#SS(0)</td>
<td>If a memory address referencing the SS segment is in a non-canonical form.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If the memory address is in a non-canonical form.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

680
x86/cmppd.html Normal file
View file

@ -0,0 +1,680 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CMPPD
— Compare Packed Double Precision Floating-Point Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CMPPD
— Compare Packed Double Precision Floating-Point Values</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op / En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>66 0F C2 /r ib CMPPD xmm1, xmm2/m128, imm8</td>
<td>A</td>
<td>V/V</td>
<td>SSE2</td>
<td>Compare packed double precision floating-point values in xmm2/m128 and xmm1 using bits 2:0 of imm8 as a comparison predicate.</td></tr>
<tr>
<td>VEX.128.66.0F.WIG C2 /r ib VCMPPD xmm1, xmm2, xmm3/m128, imm8</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Compare packed double precision floating-point values in xmm3/m128 and xmm2 using bits 4:0 of imm8 as a comparison predicate.</td></tr>
<tr>
<td>VEX.256.66.0F.WIG C2 /r ib VCMPPD ymm1, ymm2, ymm3/m256, imm8</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Compare packed double precision floating-point values in ymm3/m256 and ymm2 using bits 4:0 of imm8 as a comparison predicate.</td></tr>
<tr>
<td>EVEX.128.66.0F.W1 C2 /r ib VCMPPD k1 {k2}, xmm2, xmm3/m128/m64bcst, imm8</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Compare packed double precision floating-point values in xmm3/m128/m64bcst and xmm2 using bits 4:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1.</td></tr>
<tr>
<td>EVEX.256.66.0F.W1 C2 /r ib VCMPPD k1 {k2}, ymm2, ymm3/m256/m64bcst, imm8</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Compare packed double precision floating-point values in ymm3/m256/m64bcst and ymm2 using bits 4:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1.</td></tr>
<tr>
<td>EVEX.512.66.0F.W1 C2 /r ib VCMPPD k1 {k2}, zmm2, zmm3/m512/m64bcst{sae}, imm8</td>
<td>C</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Compare packed double precision floating-point values in zmm3/m512/m64bcst and zmm2 using bits 4:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>imm8</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>imm8</td></tr>
<tr>
<td>C</td>
<td>Full</td>
<td>ModRM:reg (w)</td>
<td>EVEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>imm8</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Performs a SIMD compare of the packed double precision floating-point values in the second source operand and the first source operand and returns the result of the comparison to the destination operand. The comparison predicate operand (immediate byte) specifies the type of comparison performed on each pair of packed values in the two source operands.</p>
<p>EVEX encoded versions: The first source operand (second operand) is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 64-bit memory location. The destination operand (first operand) is an opmask register. Comparison results are written to the destination operand under the writemask k2. Each comparison result is a single mask bit of 1 (comparison true) or 0 (comparison false).</p>
<p>VEX.256 encoded version: The first source operand (second operand) is a YMM register. The second source operand (third operand) can be a YMM register or a 256-bit memory location. The destination operand (first operand) is a YMM register. Four comparisons are performed with results written to the destination operand. The result of each comparison is a quadword mask of all 1s (comparison true) or all 0s (comparison false).</p>
<p>128-bit Legacy SSE version: The first source and destination operand (first operand) is an XMM register. The second source operand (second operand) can be an XMM register or 128-bit memory location. Bits (MAXVL-1:128) of the corresponding ZMM destination register remain unchanged. Two comparisons are performed with results written to bits 127:0 of the destination operand. The result of each comparison is a quadword mask of all 1s (comparison true) or all 0s (comparison false).</p>
<p>VEX.128 encoded version: The first source operand (second operand) is an XMM register. The second source operand (third operand) can be an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the destination ZMM register are zeroed. Two comparisons are performed with results written to bits 127:0 of the destination operand.</p>
<p>The comparison predicate operand is an 8-bit immediate:</p>
<ul>
<li>For instructions encoded using the VEX or EVEX prefix, bits 4:0 define the type of comparison to be performed (see <a href='cmppd.html#tbl-3-1'>Table 3-1</a>). Bits 5 through 7 of the immediate are reserved.</li>
<li>For instruction encodings that do not use VEX prefix, bits 2:0 define the type of comparison to be made (see the first 8 rows of <a href='cmppd.html#tbl-3-1'>Table 3-1</a>). Bits 3 through 7 of the immediate are reserved.</li></ul>
<figure id="tbl-3-1">
<table>
<tr>
<th rowspan="2">Predicate</th>
<th rowspan="2">imm8 Value</th>
<th rowspan="2">Description</th>
<th colspan="4">Result: A Is 1st Operand, B Is 2nd Operand</th>
<th rowspan="2">Signals #IA on QNAN</th></tr>
<tr>
<th>A &gt;B</th>
<th>A&lt;B</th>
<th>A=B</th>
<th>Unordered<sup>1</sup></th></tr>
<tr>
<td>EQ_OQ (EQ)</td>
<td>0H</td>
<td>Equal (ordered, non-signaling)</td>
<td>False</td>
<td>False</td>
<td>True</td>
<td>False</td>
<td>No</td></tr>
<tr>
<td>LT_OS (LT)</td>
<td>1H</td>
<td>Less-than (ordered, signaling)</td>
<td>False</td>
<td>True</td>
<td>False</td>
<td>False</td>
<td>Yes</td></tr>
<tr>
<td>LE_OS (LE)</td>
<td>2H</td>
<td>Less-than-or-equal (ordered, signaling)</td>
<td>False</td>
<td>True</td>
<td>True</td>
<td>False</td>
<td>Yes</td></tr>
<tr>
<td>UNORD_Q (UNORD)</td>
<td>3H</td>
<td>Unordered (non-signaling)</td>
<td>False</td>
<td>False</td>
<td>False</td>
<td>True</td>
<td>No</td></tr>
<tr>
<td>NEQ_UQ (NEQ)</td>
<td>4H</td>
<td>Not-equal (unordered, non-signaling)</td>
<td>True</td>
<td>True</td>
<td>False</td>
<td>True</td>
<td>No</td></tr>
<tr>
<td>NLT_US (NLT)</td>
<td>5H</td>
<td>Not-less-than (unordered, signaling)</td>
<td>True</td>
<td>False</td>
<td>True</td>
<td>True</td>
<td>Yes</td></tr>
<tr>
<td>NLE_US (NLE)</td>
<td>6H</td>
<td>Not-less-than-or-equal (unordered, signaling)</td>
<td>True</td>
<td>False</td>
<td>False</td>
<td>True</td>
<td>Yes</td></tr>
<tr>
<td>ORD_Q (ORD)</td>
<td>7H</td>
<td>Ordered (non-signaling)</td>
<td>True</td>
<td>True</td>
<td>True</td>
<td>False</td>
<td>No</td></tr>
<tr>
<td>EQ_UQ</td>
<td>8H</td>
<td>Equal (unordered, non-signaling)</td>
<td>False</td>
<td>False</td>
<td>True</td>
<td>True</td>
<td>No</td></tr>
<tr>
<td>NGE_US (NGE)</td>
<td>9H</td>
<td>Not-greater-than-or-equal (unordered, signaling)</td>
<td>False</td>
<td>True</td>
<td>False</td>
<td>True</td>
<td>Yes</td></tr>
<tr>
<td>NGT_US (NGT)</td>
<td>AH</td>
<td>Not-greater-than (unordered, signaling)</td>
<td>False</td>
<td>True</td>
<td>True</td>
<td>True</td>
<td>Yes</td></tr>
<tr>
<td>FALSE_OQ(FALSE)</td>
<td>BH</td>
<td>False (ordered, non-signaling)</td>
<td>False</td>
<td>False</td>
<td>False</td>
<td>False</td>
<td>No</td></tr>
<tr>
<td>NEQ_OQ</td>
<td>CH</td>
<td>Not-equal (ordered, non-signaling)</td>
<td>True</td>
<td>True</td>
<td>False</td>
<td>False</td>
<td>No</td></tr>
<tr>
<td>GE_OS (GE)</td>
<td>DH</td>
<td>Greater-than-or-equal (ordered, signaling)</td>
<td>True</td>
<td>False</td>
<td>True</td>
<td>False</td>
<td>Yes</td></tr>
<tr>
<td>GT_OS (GT)</td>
<td>EH</td>
<td>Greater-than (ordered, signaling)</td>
<td>True</td>
<td>False</td>
<td>False</td>
<td>False</td>
<td>Yes</td></tr>
<tr>
<td>TRUE_UQ(TRUE)</td>
<td>FH</td>
<td>True (unordered, non-signaling)</td>
<td>True</td>
<td>True</td>
<td>True</td>
<td>True</td>
<td>No</td></tr>
<tr>
<td>EQ_OS</td>
<td>10H</td>
<td>Equal (ordered, signaling)</td>
<td>False</td>
<td>False</td>
<td>True</td>
<td>False</td>
<td>Yes</td></tr>
<tr>
<td>LT_OQ</td>
<td>11H</td>
<td>Less-than (ordered, nonsignaling)</td>
<td>False</td>
<td>True</td>
<td>False</td>
<td>False</td>
<td>No</td></tr>
<tr>
<td>LE_OQ</td>
<td>12H</td>
<td>Less-than-or-equal (ordered, nonsignaling)</td>
<td>False</td>
<td>True</td>
<td>True</td>
<td>False</td>
<td>No</td></tr>
<tr>
<td>UNORD_S</td>
<td>13H</td>
<td>Unordered (signaling)</td>
<td>False</td>
<td>False</td>
<td>False</td>
<td>True</td>
<td>Yes</td></tr>
<tr>
<td>NEQ_US</td>
<td>14H</td>
<td>Not-equal (unordered, signaling)</td>
<td>True</td>
<td>True</td>
<td>False</td>
<td>True</td>
<td>Yes</td></tr>
<tr>
<td>NLT_UQ</td>
<td>15H</td>
<td>Not-less-than (unordered, nonsignaling)</td>
<td>True</td>
<td>False</td>
<td>True</td>
<td>True</td>
<td>No</td></tr>
<tr>
<td>NLE_UQ</td>
<td>16H</td>
<td>Not-less-than-or-equal (unordered, nonsignaling)</td>
<td>True</td>
<td>False</td>
<td>False</td>
<td>True</td>
<td>No</td></tr>
<tr>
<td>ORD_S</td>
<td>17H</td>
<td>Ordered (signaling)</td>
<td>True</td>
<td>True</td>
<td>True</td>
<td>False</td>
<td>Yes</td></tr>
<tr>
<td>EQ_US</td>
<td>18H</td>
<td>Equal (unordered, signaling)</td>
<td>False</td>
<td>False</td>
<td>True</td>
<td>True</td>
<td>Yes</td></tr>
<tr>
<td>NGE_UQ</td>
<td>19H</td>
<td>Not-greater-than-or-equal (unordered, non-signaling)</td>
<td>False</td>
<td>True</td>
<td>False</td>
<td>True</td>
<td>No</td></tr>
<tr>
<td>NGT_UQ</td>
<td>1AH</td>
<td>Not-greater-than (unordered, nonsignaling)</td>
<td>False</td>
<td>True</td>
<td>True</td>
<td>True</td>
<td>No</td></tr>
<tr>
<td>FALSE_OS</td>
<td>1BH</td>
<td>False (ordered, signaling)</td>
<td>False</td>
<td>False</td>
<td>False</td>
<td>False</td>
<td>Yes</td></tr>
<tr>
<td>NEQ_OS</td>
<td>1CH</td>
<td>Not-equal (ordered, signaling)</td>
<td>True</td>
<td>True</td>
<td>False</td>
<td>False</td>
<td>Yes</td></tr>
<tr>
<td>GE_OQ</td>
<td>1DH</td>
<td>Greater-than-or-equal (ordered, nonsignaling)</td>
<td>True</td>
<td>False</td>
<td>True</td>
<td>False</td>
<td>No</td></tr>
<tr>
<td>GT_OQ</td>
<td>1EH</td>
<td>Greater-than (ordered, nonsignaling)</td>
<td>True</td>
<td>False</td>
<td>False</td>
<td>False</td>
<td>No</td></tr>
<tr>
<td>TRUE_US</td>
<td>1FH</td>
<td>True (unordered, signaling)</td>
<td>True</td>
<td>True</td>
<td>True</td>
<td>True</td>
<td>Yes</td></tr></table>
<figcaption><a href='cmppd.html#tbl-3-1'>Table 3-1</a>. Comparison Predicate for CMPPD and CMPPS Instructions</figcaption></figure>
<blockquote>
<p>1. If either operand A or B is a NAN.</p></blockquote>
<p>The unordered relationship is true when at least one of the two source operands being compared is a NaN; the ordered relationship is true when neither source operand is a NaN.</p>
<p>A subsequent computational instruction that uses the mask result in the destination operand as an input operand will not generate an exception, because a mask of all 0s corresponds to a floating-point value of +0.0 and a mask of all 1s corresponds to a QNaN.</p>
<p>Note that processors with “CPUID.1H:ECX.AVX =0” do not implement the “greater-than”, “greater-than-or-equal”, “not-greater than”, and “not-greater-than-or-equal relations” predicates. These comparisons can be made either by using the inverse relationship (that is, use the “not-less-than-or-equal” to make a “greater-than” comparison) or by using software emulation. When using software emulation, the program must swap the operands (copying registers when necessary to protect the data that will now be in the destination), and then perform the compare using a different predicate. The predicate to be used for these emulations is listed in the first 8 rows of <a href='cli.html#tbl-3-7'>Table 3-7</a> (Intel<sup>®</sup> 64 and IA-32 Architectures Software Developers Manual, Volume 2A) under the heading Emulation.</p>
<p>Compilers and assemblers may implement the following two-operand pseudo-ops in addition to the three-operand CMPPD instruction, for processors with “CPUID.1H:ECX.AVX =0”. See <a href='cmppd.html#tbl-3-2'>Table 3-2</a>. The compiler should treat reserved imm8 values as illegal syntax.</p>
<figure id="tbl-3-2">
<table>
<tr>
<th>Pseudo-Op</th>
<th>CMPPD Implementation</th></tr>
<tr>
<td>CMPEQPD xmm1, xmm2</td>
<td>CMPPD xmm1, xmm2, 0</td></tr>
<tr>
<td>CMPLTPD xmm1, xmm2</td>
<td>CMPPD xmm1, xmm2, 1</td></tr>
<tr>
<td>CMPLEPD xmm1, xmm2</td>
<td>CMPPD xmm1, xmm2, 2</td></tr>
<tr>
<td>CMPUNORDPD xmm1, xmm2</td>
<td>CMPPD xmm1, xmm2, 3</td></tr>
<tr>
<td>CMPNEQPD xmm1, xmm2</td>
<td>CMPPD xmm1, xmm2, 4</td></tr>
<tr>
<td>CMPNLTPD xmm1, xmm2</td>
<td>CMPPD xmm1, xmm2, 5</td></tr>
<tr>
<td>CMPNLEPD xmm1, xmm2</td>
<td>CMPPD xmm1, xmm2, 6</td></tr>
<tr>
<td>CMPORDPD xmm1, xmm2</td>
<td>CMPPD xmm1, xmm2, 7</td></tr></table>
<figcaption><a href='cmppd.html#tbl-3-2'>Table 3-2</a>. Pseudo-Op and CMPPD Implementation </figcaption></figure>
<p>The greater-than relations that the processor does not implement require more than one instruction to emulate in software and therefore should not be implemented as pseudo-ops. (For these, the programmer should reverse the operands of the corresponding less than relations and use move instructions to ensure that the mask is moved to the correct destination register and that the source operand is left intact.)</p>
<p>Processors with “CPUID.1H:ECX.AVX =1” implement the full complement of 32 predicates shown in <a href='cmppd.html#tbl-3-3'>Table 3-3</a>, software emulation is no longer needed. Compilers and assemblers may implement the following three-operand pseudo-ops in addition to the four-operand VCMPPD instruction. See <a href='cmppd.html#tbl-3-3'>Table 3-3</a>, where the notations of reg1 reg2, and reg3 represent either XMM registers or YMM registers. The compiler should treat reserved imm8 values as</p>
<p>illegal syntax. Alternately, intrinsics can map the pseudo-ops to pre-defined constants to support a simpler intrinsic interface. Compilers and assemblers may implement three-operand pseudo-ops for EVEX encoded VCMPPD instructions in a similar fashion by extending the syntax listed in <a href='cmppd.html#tbl-3-3'>Table 3-3</a>.</p>
<figure id="tbl-3-3">
<table>
<tr>
<th>Pseudo-Op</th>
<th>CMPPD Implementation</th></tr>
<tr>
<td>VCMPEQPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 0</em></td></tr>
<tr>
<td>VCMPLTPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 1</em></td></tr>
<tr>
<td>VCMPLEPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 2</em></td></tr>
<tr>
<td>VCMPUNORDPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 3</em></td></tr>
<tr>
<td>VCMPNEQPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 4</em></td></tr>
<tr>
<td>VCMPNLTPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 5</em></td></tr>
<tr>
<td>VCMPNLEPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 6</em></td></tr>
<tr>
<td>VCMPORDPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 7</em></td></tr>
<tr>
<td>VCMPEQ_UQPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 8</em></td></tr>
<tr>
<td>VCMPNGEPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 9</em></td></tr>
<tr>
<td>VCMPNGTPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 0AH</em></td></tr>
<tr>
<td>VCMPFALSEPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 0BH</em></td></tr>
<tr>
<td>VCMPNEQ_OQPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 0CH</em></td></tr>
<tr>
<td>VCMPGEPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 0DH</em></td></tr>
<tr>
<td>VCMPGTPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 0EH</em></td></tr>
<tr>
<td>VCMPTRUEPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 0FH</em></td></tr>
<tr>
<td>VCMPEQ_OSPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 10H</em></td></tr>
<tr>
<td>VCMPLT_OQPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 11H</em></td></tr>
<tr>
<td>VCMPLE_OQPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 12H</em></td></tr>
<tr>
<td>VCMPUNORD_SPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 13H</em></td></tr>
<tr>
<td>VCMPNEQ_USPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 14H</em></td></tr>
<tr>
<td>VCMPNLT_UQPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 15H</em></td></tr>
<tr>
<td>VCMPNLE_UQPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 16H</em></td></tr>
<tr>
<td>VCMPORD_SPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 17H</em></td></tr>
<tr>
<td>VCMPEQ_USPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 18H</em></td></tr>
<tr>
<td>VCMPNGE_UQPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 19H</em></td></tr>
<tr>
<td>VCMPNGT_UQPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 1AH</em></td></tr>
<tr>
<td>VCMPFALSE_OSPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 1BH</em></td></tr>
<tr>
<td>VCMPNEQ_OSPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 1CH</em></td></tr>
<tr>
<td>VCMPGE_OQPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 1DH</em></td></tr>
<tr>
<td>VCMPGT_OQPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 1EH</em></td></tr>
<tr>
<td>VCMPTRUE_USPD <em>reg1, reg2, reg3</em></td>
<td>VCMPPD <em>reg1, reg2, reg3, 1FH</em></td></tr></table>
<figcaption><a href='cmppd.html#tbl-3-3'>Table 3-3</a>. Pseudo-Op and VCMPPD Implementation </figcaption></figure>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>CASE (COMPARISON PREDICATE) OF
0: OP3 := EQ_OQ; OP5 := EQ_OQ;
1: OP3 := LT_OS; OP5 := LT_OS;
2: OP3 := LE_OS; OP5 := LE_OS;
3: OP3 := UNORD_Q; OP5 := UNORD_Q;
4: OP3 := NEQ_UQ; OP5 := NEQ_UQ;
5: OP3 := NLT_US; OP5 := NLT_US;
6: OP3 := NLE_US; OP5 := NLE_US;
7: OP3 := ORD_Q; OP5 := ORD_Q;
8: OP5 := EQ_UQ;
9: OP5 := NGE_US;
10: OP5 := NGT_US;
11: OP5 := FALSE_OQ;
12: OP5 := NEQ_OQ;
13: OP5 := GE_OS;
14: OP5 := GT_OS;
15: OP5 := TRUE_UQ;
16: OP5 := EQ_OS;
17: OP5 := LT_OQ;
18: OP5 := LE_OQ;
19: OP5 := UNORD_S;
20: OP5 := NEQ_US;
21: OP5 := NLT_UQ;
22: OP5 := NLE_UQ;
23: OP5 := ORD_S;
24: OP5 := EQ_US;
25: OP5 := NGE_UQ;
26: OP5 := NGT_UQ;
27: OP5 := FALSE_OS;
28: OP5 := NEQ_OS;
29: OP5 := GE_OQ;
30: OP5 := GT_OQ;
31: OP5 := TRUE_US;
DEFAULT: Reserved;
ESAC;
</pre>
<h3 id="vcmppd--evex-encoded-versions-">VCMPPD (EVEX Encoded Versions)<a class="anchor" href="#vcmppd--evex-encoded-versions-">
</a></h3>
<pre>(KL, VL) = (2, 128), (4, 256), (8, 512)
FOR j := 0 TO KL-1
i := j * 64
IF k2[j] OR *no writemask*
THEN
IF (EVEX.b = 1) AND (SRC2 *is memory*)
THEN
CMP := SRC1[i+63:i] OP5 SRC2[63:0]
ELSE
CMP := SRC1[i+63:i] OP5 SRC2[i+63:i]
FI;
IF CMP = TRUE
THEN DEST[j] := 1;
ELSE DEST[j] := 0; FI;
ELSE DEST[j] := 0
; zeroing-masking only
FI;
ENDFOR
DEST[MAX_KL-1:KL] := 0
</pre>
<h3 id="vcmppd--vex-256-encoded-version-">VCMPPD (VEX.256 Encoded Version)<a class="anchor" href="#vcmppd--vex-256-encoded-version-">
</a></h3>
<pre>CMP0 := SRC1[63:0] OP5 SRC2[63:0];
CMP1 := SRC1[127:64] OP5 SRC2[127:64];
CMP2 := SRC1[191:128] OP5 SRC2[191:128];
CMP3 := SRC1[255:192] OP5 SRC2[255:192];
IF CMP0 = TRUE
THEN DEST[63:0] := FFFFFFFFFFFFFFFFH;
ELSE DEST[63:0] := 0000000000000000H; FI;
IF CMP1 = TRUE
THEN DEST[127:64] := FFFFFFFFFFFFFFFFH;
ELSE DEST[127:64] := 0000000000000000H; FI;
IF CMP2 = TRUE
THEN DEST[191:128] := FFFFFFFFFFFFFFFFH;
ELSE DEST[191:128] := 0000000000000000H; FI;
IF CMP3 = TRUE
THEN DEST[255:192] := FFFFFFFFFFFFFFFFH;
ELSE DEST[255:192] := 0000000000000000H; FI;
DEST[MAXVL-1:256] := 0
</pre>
<h3 id="vcmppd--vex-128-encoded-version-">VCMPPD (VEX.128 Encoded Version)<a class="anchor" href="#vcmppd--vex-128-encoded-version-">
</a></h3>
<pre>CMP0 := SRC1[63:0] OP5 SRC2[63:0];
CMP1 := SRC1[127:64] OP5 SRC2[127:64];
IF CMP0 = TRUE
THEN DEST[63:0] := FFFFFFFFFFFFFFFFH;
ELSE DEST[63:0] := 0000000000000000H; FI;
IF CMP1 = TRUE
THEN DEST[127:64] := FFFFFFFFFFFFFFFFH;
ELSE DEST[127:64] := 0000000000000000H; FI;
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="cmppd--128-bit-legacy-sse-version-">CMPPD (128-bit Legacy SSE Version)<a class="anchor" href="#cmppd--128-bit-legacy-sse-version-">
</a></h3>
<pre>CMP0 := SRC1[63:0] OP3 SRC2[63:0];
CMP1 := SRC1[127:64] OP3 SRC2[127:64];
IF CMP0 = TRUE
THEN DEST[63:0] := FFFFFFFFFFFFFFFFH;
ELSE DEST[63:0] := 0000000000000000H; FI;
IF CMP1 = TRUE
THEN DEST[127:64] := FFFFFFFFFFFFFFFFH;
ELSE DEST[127:64] := 0000000000000000H; FI;
DEST[MAXVL-1:128] (Unmodified)
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>VCMPPD __mmask8 _mm512_cmp_pd_mask( __m512d a, __m512d b, int imm);
</pre>
<pre>VCMPPD __mmask8 _mm512_cmp_round_pd_mask( __m512d a, __m512d b, int imm, int sae);
</pre>
<pre>VCMPPD __mmask8 _mm512_mask_cmp_pd_mask( __mmask8 k1, __m512d a, __m512d b, int imm);
</pre>
<pre>VCMPPD __mmask8 _mm512_mask_cmp_round_pd_mask( __mmask8 k1, __m512d a, __m512d b, int imm, int sae);
</pre>
<pre>VCMPPD __mmask8 _mm256_cmp_pd_mask( __m256d a, __m256d b, int imm);
</pre>
<pre>VCMPPD __mmask8 _mm256_mask_cmp_pd_mask( __mmask8 k1, __m256d a, __m256d b, int imm);
</pre>
<pre>VCMPPD __mmask8 _mm_cmp_pd_mask( __m128d a, __m128d b, int imm);
</pre>
<pre>VCMPPD __mmask8 _mm_mask_cmp_pd_mask( __mmask8 k1, __m128d a, __m128d b, int imm);
</pre>
<pre>VCMPPD __m256 _mm256_cmp_pd(__m256d a, __m256d b, int imm)
</pre>
<pre>(V)CMPPD __m128 _mm_cmp_pd(__m128d a, __m128d b, int imm)
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>Invalid if SNaN operand and invalid if QNaN and predicate as listed in <a href='cmppd.html#tbl-3-1'>Table 3-1</a>, Denormal.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>VEX-encoded instructions, see <span class="not-imported">Table 2-19</span>, “Type 2 Class Exception Conditions.”</p>
<p>EVEX-encoded instructions, see <span class="not-imported">Table 2-46</span>, “Type E2 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

408
x86/cmpps.html Normal file
View file

@ -0,0 +1,408 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CMPPS
— Compare Packed Single Precision Floating-Point Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CMPPS
— Compare Packed Single Precision Floating-Point Values</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op / En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>NP 0F C2 /r ib CMPPS xmm1, xmm2/m128, imm8</td>
<td>A</td>
<td>V/V</td>
<td>SSE</td>
<td>Compare packed single precision floating-point values in xmm2/m128 and xmm1 using bits 2:0 of imm8 as a comparison predicate.</td></tr>
<tr>
<td>VEX.128.0F.WIG C2 /r ib VCMPPS xmm1, xmm2, xmm3/m128, imm8</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Compare packed single precision floating-point values in xmm3/m128 and xmm2 using bits 4:0 of imm8 as a comparison predicate.</td></tr>
<tr>
<td>VEX.256.0F.WIG C2 /r ib VCMPPS ymm1, ymm2, ymm3/m256, imm8</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Compare packed single precision floating-point values in ymm3/m256 and ymm2 using bits 4:0 of imm8 as a comparison predicate.</td></tr>
<tr>
<td>EVEX.128.0F.W0 C2 /r ib VCMPPS k1 {k2}, xmm2, xmm3/m128/m32bcst, imm8</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Compare packed single precision floating-point values in xmm3/m128/m32bcst and xmm2 using bits 4:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1.</td></tr>
<tr>
<td>EVEX.256.0F.W0 C2 /r ib VCMPPS k1 {k2}, ymm2, ymm3/m256/m32bcst, imm8</td>
<td>C</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Compare packed single precision floating-point values in ymm3/m256/m32bcst and ymm2 using bits 4:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1.</td></tr>
<tr>
<td>EVEX.512.0F.W0 C2 /r ib VCMPPS k1 {k2}, zmm2, zmm3/m512/m32bcst{sae}, imm8</td>
<td>C</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Compare packed single precision floating-point values in zmm3/m512/m32bcst and zmm2 using bits 4:0 of imm8 as a comparison predicate with writemask k2 and leave the result in mask register k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>imm8</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>imm8</td></tr>
<tr>
<td>C</td>
<td>Full</td>
<td>ModRM:reg (w)</td>
<td>EVEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>imm8</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Performs a SIMD compare of the packed single precision floating-point values in the second source operand and the first source operand and returns the result of the comparison to the destination operand. The comparison predicate operand (immediate byte) specifies the type of comparison performed on each of the pairs of packed values.</p>
<p>EVEX encoded versions: The first source operand (second operand) is a ZMM/YMM/XMM register. The second source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 32-bit memory location. The destination operand (first operand) is an opmask register. Comparison results are written to the destination operand under the writemask k2. Each comparison result is a single mask bit of 1 (comparison true) or 0 (comparison false).</p>
<p>VEX.256 encoded version: The first source operand (second operand) is a YMM register. The second source operand (third operand) can be a YMM register or a 256-bit memory location. The destination operand (first operand) is a YMM register. Eight comparisons are performed with results written to the destination operand. The result of each comparison is a doubleword mask of all 1s (comparison true) or all 0s (comparison false).</p>
<p>128-bit Legacy SSE version: The first source and destination operand (first operand) is an XMM register. The second source operand (second operand) can be an XMM register or 128-bit memory location. Bits (MAXVL-1:128) of the corresponding ZMM destination register remain unchanged. Four comparisons are performed with results written to bits 127:0 of the destination operand. The result of each comparison is a doubleword mask of all 1s (comparison true) or all 0s (comparison false).</p>
<p>VEX.128 encoded version: The first source operand (second operand) is an XMM register. The second source operand (third operand) can be an XMM register or a 128-bit memory location. Bits (MAXVL-1:128) of the destination ZMM register are zeroed. Four comparisons are performed with results written to bits 127:0 of the destination operand.</p>
<p>The comparison predicate operand is an 8-bit immediate:</p>
<ul>
<li>For instructions encoded using the VEX prefix and EVEX prefix, bits 4:0 define the type of comparison to be performed (see <a href='cmppd.html#tbl-3-1'>Table 3-1</a>). Bits 5 through 7 of the immediate are reserved.</li>
<li>For instruction encodings that do not use VEX prefix, bits 2:0 define the type of comparison to be made (see the first 8 rows of <a href='cmppd.html#tbl-3-1'>Table 3-1</a>). Bits 3 through 7 of the immediate are reserved.</li></ul>
<p>The unordered relationship is true when at least one of the two source operands being compared is a NaN; the ordered relationship is true when neither source operand is a NaN.</p>
<p>A subsequent computational instruction that uses the mask result in the destination operand as an input operand will not generate an exception, because a mask of all 0s corresponds to a floating-point value of +0.0 and a mask of all 1s corresponds to a QNaN.</p>
<p>Note that processors with “CPUID.1H:ECX.AVX =0” do not implement the “greater-than”, “greater-than-or-equal”, “not-greater than”, and “not-greater-than-or-equal relations” predicates. These comparisons can be made either by using the inverse relationship (that is, use the “not-less-than-or-equal” to make a “greater-than” comparison) or by using software emulation. When using software emulation, the program must swap the operands (copying registers when necessary to protect the data that will now be in the destination), and then perform the compare using a different predicate. The predicate to be used for these emulations is listed in the first 8 rows of <a href='cli.html#tbl-3-7'>Table 3-7</a> (Intel<sup>®</sup> 64 and IA-32 Architectures Software Developers Manual, Volume 2A) under the heading Emulation.</p>
<p>Compilers and assemblers may implement the following two-operand pseudo-ops in addition to the three-operand CMPPS instruction, for processors with “CPUID.1H:ECX.AVX =0”. See <a href='cmpps.html#tbl-3-4'>Table 3-4</a>. The compiler should treat reserved imm8 values as illegal syntax.</p>
<figure id="tbl-3-4">
<table>
<tr>
<th>Pseudo-Op</th>
<th>CMPPS Implementation</th></tr>
<tr>
<td>CMPEQPS xmm1, xmm2</td>
<td>CMPPS xmm1, xmm2, 0</td></tr>
<tr>
<td>CMPLTPS xmm1, xmm2</td>
<td>CMPPS xmm1, xmm2, 1</td></tr>
<tr>
<td>CMPLEPS xmm1, xmm2</td>
<td>CMPPS xmm1, xmm2, 2</td></tr>
<tr>
<td>CMPUNORDPS xmm1, xmm2</td>
<td>CMPPS xmm1, xmm2, 3</td></tr>
<tr>
<td>CMPNEQPS xmm1, xmm2</td>
<td>CMPPS xmm1, xmm2, 4</td></tr>
<tr>
<td>CMPNLTPS xmm1, xmm2</td>
<td>CMPPS xmm1, xmm2, 5</td></tr>
<tr>
<td>CMPNLEPS xmm1, xmm2</td>
<td>CMPPS xmm1, xmm2, 6</td></tr>
<tr>
<td>CMPORDPS xmm1, xmm2</td>
<td>CMPPS xmm1, xmm2, 7</td></tr></table>
<figcaption><a href='cmpps.html#tbl-3-4'>Table 3-4</a>. Pseudo-Op and CMPPS Implementation </figcaption></figure>
<p>The greater-than relations that the processor does not implement require more than one instruction to emulate in software and therefore should not be implemented as pseudo-ops. (For these, the programmer should reverse the operands of the corresponding less than relations and use move instructions to ensure that the mask is moved to the correct destination register and that the source operand is left intact.)</p>
<p>Processors with “CPUID.1H:ECX.AVX =1” implement the full complement of 32 predicates shown in <a href='cmpps.html#tbl-3-5'>Table 3-5</a>, software emulation is no longer needed. Compilers and assemblers may implement the following three-operand pseudo-ops in addition to the four-operand VCMPPS instruction. See <a href='cmpps.html#tbl-3-5'>Table 3-5</a>, where the notation of reg1 and reg2 represent either XMM registers or YMM registers. The compiler should treat reserved imm8 values as illegal syntax. Alternately, intrinsics can map the pseudo-ops to pre-defined constants to support a simpler intrinsic interface. Compilers and assemblers may implement three-operand pseudo-ops for EVEX encoded VCMPPS instructions in a similar fashion by extending the syntax listed in <a href='cmpps.html#tbl-3-5'>Table 3-5</a>.</p>
<p>:</p>
<figure id="tbl-3-5">
<table>
<tr>
<th>Pseudo-Op</th>
<th>CMPPS Implementation</th></tr>
<tr>
<td>VCMPEQPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 0</em></td></tr>
<tr>
<td>VCMPLTPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 1</em></td></tr>
<tr>
<td>VCMPLEPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 2</em></td></tr>
<tr>
<td>VCMPUNORDPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 3</em></td></tr>
<tr>
<td>VCMPNEQPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 4</em></td></tr>
<tr>
<td>VCMPNLTPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 5</em></td></tr>
<tr>
<td>VCMPNLEPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 6</em></td></tr>
<tr>
<td>VCMPORDPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 7</em></td></tr>
<tr>
<td>VCMPEQ_UQPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 8</em></td></tr>
<tr>
<td>VCMPNGEPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 9</em></td></tr>
<tr>
<td>VCMPNGTPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 0AH</em></td></tr>
<tr>
<td>VCMPFALSEPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 0BH</em></td></tr>
<tr>
<td>VCMPNEQ_OQPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 0CH</em></td></tr>
<tr>
<td>VCMPGEPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 0DH</em></td></tr>
<tr>
<td>VCMPGTPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 0EH</em></td></tr>
<tr>
<td>VCMPTRUEPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 0FH</em></td></tr>
<tr>
<td>VCMPEQ_OSPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 10H</em></td></tr>
<tr>
<td>VCMPLT_OQPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 11H</em></td></tr>
<tr>
<td>VCMPLE_OQPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 12H</em></td></tr>
<tr>
<td>VCMPUNORD_SPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 13H</em></td></tr>
<tr>
<td>VCMPNEQ_USPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 14H</em></td></tr>
<tr>
<td>VCMPNLT_UQPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 15H</em></td></tr>
<tr>
<td>VCMPNLE_UQPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 16H</em></td></tr>
<tr>
<td>VCMPORD_SPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 17H</em></td></tr>
<tr>
<td>VCMPEQ_USPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 18H</em></td></tr>
<tr>
<td>VCMPNGE_UQPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 19H</em></td></tr>
<tr>
<td>VCMPNGT_UQPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 1AH</em></td></tr>
<tr>
<td>VCMPFALSE_OSPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 1BH</em></td></tr>
<tr>
<td>VCMPNEQ_OSPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 1CH</em></td></tr>
<tr>
<td>VCMPGE_OQPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 1DH</em></td></tr>
<tr>
<td>VCMPGT_OQPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 1EH</em></td></tr>
<tr>
<td>VCMPTRUE_USPS r<em>eg1, reg2, reg3</em></td>
<td>VCMPPS r<em>eg1, reg2, reg3, 1FH</em></td></tr></table>
<figcaption><a href='cmpps.html#tbl-3-5'>Table 3-5</a>. Pseudo-Op and VCMPPS Implementation</figcaption></figure>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>CASE (COMPARISON PREDICATE) OF
0: OP3 := EQ_OQ; OP5 := EQ_OQ;
1: OP3 := LT_OS; OP5 := LT_OS;
2: OP3 := LE_OS; OP5 := LE_OS;
3: OP3 := UNORD_Q; OP5 := UNORD_Q;
4: OP3 := NEQ_UQ; OP5 := NEQ_UQ;
5: OP3 := NLT_US; OP5 := NLT_US;
6: OP3 := NLE_US; OP5 := NLE_US;
7: OP3 := ORD_Q; OP5 := ORD_Q;
8: OP5 := EQ_UQ;
9: OP5 := NGE_US;
10: OP5 := NGT_US;
11: OP5 := FALSE_OQ;
12: OP5 := NEQ_OQ;
13: OP5 := GE_OS;
14: OP5 := GT_OS;
15: OP5 := TRUE_UQ;
16: OP5 := EQ_OS;
17: OP5 := LT_OQ;
18: OP5 := LE_OQ;
19: OP5 := UNORD_S;
20: OP5 := NEQ_US;
21: OP5 := NLT_UQ;
22: OP5 := NLE_UQ;
23: OP5 := ORD_S;
24: OP5 := EQ_US;
25: OP5 := NGE_UQ;
26: OP5 := NGT_UQ;
27: OP5 := FALSE_OS;
28: OP5 := NEQ_OS;
29: OP5 := GE_OQ;
30: OP5 := GT_OQ;
31: OP5 := TRUE_US;
DEFAULT: Reserved
ESAC;
</pre>
<h3 id="vcmpps--evex-encoded-versions-">VCMPPS (EVEX Encoded Versions)<a class="anchor" href="#vcmpps--evex-encoded-versions-">
</a></h3>
<pre>(KL, VL) = (4, 128), (8, 256), (16, 512)
FOR j := 0 TO KL-1
i := j * 32
IF k2[j] OR *no writemask*
THEN
IF (EVEX.b = 1) AND (SRC2 *is memory*)
THEN
CMP := SRC1[i+31:i] OP5 SRC2[31:0]
ELSE
CMP := SRC1[i+31:i] OP5 SRC2[i+31:i]
FI;
IF CMP = TRUE
THEN DEST[j] := 1;
ELSE DEST[j] := 0; FI;
ELSE DEST[j] := 0
; zeroing-masking onlyFI;
FI;
ENDFOR
DEST[MAX_KL-1:KL] := 0
</pre>
<h3 id="vcmpps--vex-256-encoded-version-">VCMPPS (VEX.256 Encoded Version)<a class="anchor" href="#vcmpps--vex-256-encoded-version-">
</a></h3>
<pre>CMP0 := SRC1[31:0] OP5 SRC2[31:0];
CMP1 := SRC1[63:32] OP5 SRC2[63:32];
CMP2 := SRC1[95:64] OP5 SRC2[95:64];
CMP3 := SRC1[127:96] OP5 SRC2[127:96];
CMP4 := SRC1[159:128] OP5 SRC2[159:128];
CMP5 := SRC1[191:160] OP5 SRC2[191:160];
CMP6 := SRC1[223:192] OP5 SRC2[223:192];
CMP7 := SRC1[255:224] OP5 SRC2[255:224];
IF CMP0 = TRUE
THEN DEST[31:0] :=FFFFFFFFH;
ELSE DEST[31:0] := 000000000H; FI;
IF CMP1 = TRUE
THEN DEST[63:32] := FFFFFFFFH;
ELSE DEST[63:32] :=000000000H; FI;
IF CMP2 = TRUE
THEN DEST[95:64] := FFFFFFFFH;
ELSE DEST[95:64] := 000000000H; FI;
IF CMP3 = TRUE
THEN DEST[127:96] := FFFFFFFFH;
ELSE DEST[127:96] := 000000000H; FI;
IF CMP4 = TRUE
THEN DEST[159:128] := FFFFFFFFH;
ELSE DEST[159:128] := 000000000H; FI;
IF CMP5 = TRUE
THEN DEST[191:160] := FFFFFFFFH;
ELSE DEST[191:160] := 000000000H; FI;
IF CMP6 = TRUE
THEN DEST[223:192] := FFFFFFFFH;
ELSE DEST[223:192] :=000000000H; FI;
IF CMP7 = TRUE
THEN DEST[255:224] := FFFFFFFFH;
ELSE DEST[255:224] := 000000000H; FI;
DEST[MAXVL-1:256] := 0
</pre>
<h3 id="vcmpps--vex-128-encoded-version-">VCMPPS (VEX.128 Encoded Version)<a class="anchor" href="#vcmpps--vex-128-encoded-version-">
</a></h3>
<pre>CMP0 := SRC1[31:0] OP5 SRC2[31:0];
CMP1 := SRC1[63:32] OP5 SRC2[63:32];
CMP2 := SRC1[95:64] OP5 SRC2[95:64];
CMP3 := SRC1[127:96] OP5 SRC2[127:96];
IF CMP0 = TRUE
THEN DEST[31:0] :=FFFFFFFFH;
ELSE DEST[31:0] := 000000000H; FI;
IF CMP1 = TRUE
THEN DEST[63:32] := FFFFFFFFH;
ELSE DEST[63:32] := 000000000H; FI;
IF CMP2 = TRUE
THEN DEST[95:64] := FFFFFFFFH;
ELSE DEST[95:64] := 000000000H; FI;
IF CMP3 = TRUE
THEN DEST[127:96] := FFFFFFFFH;
ELSE DEST[127:96] :=000000000H; FI;
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="cmpps--128-bit-legacy-sse-version-">CMPPS (128-bit Legacy SSE Version)<a class="anchor" href="#cmpps--128-bit-legacy-sse-version-">
</a></h3>
<pre>CMP0 := SRC1[31:0] OP3 SRC2[31:0];
CMP1 := SRC1[63:32] OP3 SRC2[63:32];
CMP2 := SRC1[95:64] OP3 SRC2[95:64];
CMP3 := SRC1[127:96] OP3 SRC2[127:96];
IF CMP0 = TRUE
THEN DEST[31:0] :=FFFFFFFFH;
ELSE DEST[31:0] := 000000000H; FI;
IF CMP1 = TRUE
THEN DEST[63:32] := FFFFFFFFH;
ELSE DEST[63:32] := 000000000H; FI;
IF CMP2 = TRUE
THEN DEST[95:64] := FFFFFFFFH;
ELSE DEST[95:64] := 000000000H; FI;
IF CMP3 = TRUE
THEN DEST[127:96] := FFFFFFFFH;
ELSE DEST[127:96] :=000000000H; FI;
DEST[MAXVL-1:128] (Unmodified)
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>VCMPPS __mmask16 _mm512_cmp_ps_mask( __m512 a, __m512 b, int imm);
</pre>
<pre>VCMPPS __mmask16 _mm512_cmp_round_ps_mask( __m512 a, __m512 b, int imm, int sae);
</pre>
<pre>VCMPPS __mmask16 _mm512_mask_cmp_ps_mask( __mmask16 k1, __m512 a, __m512 b, int imm);
</pre>
<pre>VCMPPS __mmask16 _mm512_mask_cmp_round_ps_mask( __mmask16 k1, __m512 a, __m512 b, int imm, int sae);
</pre>
<pre>VCMPPS __mmask8 _mm256_cmp_ps_mask( __m256 a, __m256 b, int imm);
</pre>
<pre>VCMPPS __mmask8 _mm256_mask_cmp_ps_mask( __mmask8 k1, __m256 a, __m256 b, int imm);
</pre>
<pre>VCMPPS __mmask8 _mm_cmp_ps_mask( __m128 a, __m128 b, int imm);
</pre>
<pre>VCMPPS __mmask8 _mm_mask_cmp_ps_mask( __mmask8 k1, __m128 a, __m128 b, int imm);
</pre>
<pre>VCMPPS __m256 _mm256_cmp_ps(__m256 a, __m256 b, int imm)
</pre>
<pre>CMPPS __m128 _mm_cmp_ps(__m128 a, __m128 b, int imm)
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>Invalid if SNaN operand and invalid if QNaN and predicate as listed in <a href='cmppd.html#tbl-3-1'>Table 3-1</a>, Denormal.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>VEX-encoded instructions, see <span class="not-imported">Table 2-19</span>, “Type 2 Class Exception Conditions.”</p>
<p>EVEX-encoded instructions, see <span class="not-imported">Table 2-46</span>, “Type E2 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

View file

@ -0,0 +1,265 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CMPS/CMPSB/CMPSW/CMPSD/CMPSQ
— Compare String Operands</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CMPS/CMPSB/CMPSW/CMPSD/CMPSQ
— Compare String Operands</h1>
<table>
<tr>
<th>Opcode</th>
<th>Instruction</th>
<th>Op/En</th>
<th>64-Bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>A6</td>
<td>CMPS <em>m8, m8</em></td>
<td>ZO</td>
<td>Valid</td>
<td>Valid</td>
<td>For legacy mode, compare byte at address DS:(E)SI with byte at address ES:(E)DI; For 64-bit mode compare byte at address (R|E)SI to byte at address (R|E)DI. The status flags are set accordingly.</td></tr>
<tr>
<td>A7</td>
<td>CMPS <em>m16</em>, <em>m16</em></td>
<td>ZO</td>
<td>Valid</td>
<td>Valid</td>
<td>For legacy mode, compare word at address DS:(E)SI with word at address ES:(E)DI; For 64-bit mode compare word at address (R|E)SI with word at address (R|E)DI. The status flags are set accordingly.</td></tr>
<tr>
<td>A7</td>
<td>CMPS <em>m32, m32</em></td>
<td>ZO</td>
<td>Valid</td>
<td>Valid</td>
<td>For legacy mode, compare dword at address DS:(E)SI at dword at address ES:(E)DI; For 64-bit mode compare dword at address (R|E)SI at dword at address (R|E)DI. The status flags are set accordingly.</td></tr>
<tr>
<td>REX.W + A7</td>
<td>CMPS <em>m64, m64</em></td>
<td>ZO</td>
<td>Valid</td>
<td>N.E.</td>
<td>Compares quadword at address (R|E)SI with quadword at address (R|E)DI and sets the status flags accordingly.</td></tr>
<tr>
<td>A6</td>
<td>CMPSB</td>
<td>ZO</td>
<td>Valid</td>
<td>Valid</td>
<td>For legacy mode, compare byte at address DS:(E)SI with byte at address ES:(E)DI; For 64-bit mode compare byte at address (R|E)SI with byte at address (R|E)DI. The status flags are set accordingly.</td></tr>
<tr>
<td>A7</td>
<td>CMPSW</td>
<td>ZO</td>
<td>Valid</td>
<td>Valid</td>
<td>For legacy mode, compare word at address DS:(E)SI with word at address ES:(E)DI; For 64-bit mode compare word at address (R|E)SI with word at address (R|E)DI. The status flags are set accordingly.</td></tr>
<tr>
<td>A7</td>
<td>CMPSD</td>
<td>ZO</td>
<td>Valid</td>
<td>Valid</td>
<td>For legacy mode, compare dword at address DS:(E)SI with dword at address ES:(E)DI; For 64-bit mode compare dword at address (R|E)SI with dword at address (R|E)DI. The status flags are set accordingly.</td></tr>
<tr>
<td>REX.W + A7</td>
<td>CMPSQ</td>
<td>ZO</td>
<td>Valid</td>
<td>N.E.</td>
<td>Compares quadword at address (R|E)SI with quadword at address (R|E)DI and sets the status flags accordingly.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>ZO</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Compares the byte, word, doubleword, or quadword specified with the first source operand with the byte, word, doubleword, or quadword specified with the second source operand and sets the status flags in the EFLAGS register according to the results.</p>
<p>Both source operands are located in memory. The address of the first source operand is read from DS:SI, DS:ESI or RSI (depending on the address-size attribute of the instruction is 16, 32, or 64, respectively). The address of the second source operand is read from ES:DI, ES:EDI or RDI (again depending on the address-size attribute of the instruction is 16, 32, or 64). The DS segment may be overridden with a segment override prefix, but the ES segment cannot be overridden.</p>
<p>At the assembly-code level, two forms of this instruction are allowed: the “explicit-operands” form and the “no-operands” form. The explicit-operands form (specified with the CMPS mnemonic) allows the two source operands to be specified explicitly. Here, the source operands should be symbols that indicate the size and location of the source values. This explicit-operand form is provided to allow documentation. However, note that the documentation provided by this form can be misleading. That is, the source operand symbols must specify the correct type (size) of the operands (bytes, words, or doublewords, quadwords), but they do not have to specify the correct loca-</p>
<p>tion. Locations of the source operands are always specified by the DS:(E)SI (or RSI) and ES:(E)DI (or RDI) registers, which must be loaded correctly before the compare string instruction is executed.</p>
<p>The no-operands form provides “short forms” of the byte, word, and doubleword versions of the CMPS instructions. Here also the DS:(E)SI (or RSI) and ES:(E)DI (or RDI) registers are assumed by the processor to specify the location of the source operands. The size of the source operands is selected with the mnemonic: CMPSB (byte comparison), CMPSW (word comparison), CMPSD (doubleword comparison), or CMPSQ (quadword comparison using REX.W).</p>
<p>After the comparison, the (E/R)SI and (E/R)DI registers increment or decrement automatically according to the setting of the DF flag in the EFLAGS register. (If the DF flag is 0, the (E/R)SI and (E/R)DI register increment; if the DF flag is 1, the registers decrement.) The registers increment or decrement by 1 for byte operations, by 2 for word operations, 4 for doubleword operations. If operand size is 64, RSI and RDI registers increment by 8 for quadword operations.</p>
<p>The CMPS, CMPSB, CMPSW, CMPSD, and CMPSQ instructions can be preceded by the REP prefix for block comparisons. More often, however, these instructions will be used in a LOOP construct that takes some action based on the setting of the status flags before the next comparison is made. See “REP/REPE/REPZ /REPNE/REPNZ—Repeat String Operation Prefix” in Chapter 4 of the Intel<sup>®</sup> 64 and IA-32 Architectures Software Developers Manual, Volume 2B, for a description of the REP prefix.</p>
<p>In 64-bit mode, the instructions default address size is 64 bits, 32 bit address size is supported using the prefix 67H. Use of the REX.W prefix promotes doubleword operation to 64 bits (see CMPSQ). See the summary chart at the beginning of this section for encoding data and limits.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>temp := SRC1 - SRC2;
SetStatusFlags(temp);
IF (64-Bit Mode)
THEN
IF (Byte comparison)
THEN IF DF = 0
THEN
(R|E)SI := (R|E)SI + 1;
(R|E)DI := (R|E)DI + 1;
ELSE
(R|E)SI := (R|E)SI 1;
(R|E)DI := (R|E)DI 1;
FI;
ELSE IF (Word comparison)
THEN IF DF = 0
THEN
(R|E)SI
:= (R|E)SI + 2;
(R|E)DI
:= (R|E)DI + 2;
ELSE
(R|E)SI
:= (R|E)SI 2;
(R|E)DI
:= (R|E)DI 2;
FI;
ELSE IF (Doubleword
comparison)
THEN IF DF = 0
THEN
(R|E)SI
:= (R|E)SI + 4;
(R|E)DI
:= (R|E)DI + 4;
ELSE
(R|E)SI
:= (R|E)SI 4;
(R|E)DI
:= (R|E)DI 4;
FI;
ELSE (* Quadword comparison *)
THEN IF DF = 0
(R|E)SI := (R|E)SI + 8;
(R|E)DI := (R|E)DI + 8;
ELSE
(R|E)SI := (R|E)SI 8;
(R|E)DI := (R|E)DI 8;
FI;
FI;
ELSE (* Non-64-bit Mode *)
IF (byte comparison)
THEN IF DF = 0
THEN
(E)SI := (E)SI + 1;
(E)DI := (E)DI + 1;
ELSE
(E)SI := (E)SI 1;
(E)DI := (E)DI 1;
FI;
ELSE IF (Word comparison)
THENIFDF =0
(E)SI := (E)SI + 2;
(E)DI := (E)DI + 2;
ELSE
(E)SI := (E)SI 2;
(E)DI := (E)DI 2;
FI;
ELSE (* Doubleword comparison *)
THEN IF DF = 0
(E)SI := (E)SI + 4;
(E)DI := (E)DI + 4;
ELSE
(E)SI := (E)SI 4;
(E)DI := (E)DI 4;
FI;
FI;
FI;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>The CF, OF, SF, ZF, AF, and PF flags are set according to the temporary result of the comparison.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="2">#GP(0)</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>If the DS, ES, FS, or GS register contains a NULL segment selector.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP(0)</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#SS(0)</td>
<td>If a memory address referencing the SS segment is in a non-canonical form.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If the memory address is in a non-canonical form.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

316
x86/cmpsd.html Normal file
View file

@ -0,0 +1,316 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CMPSD
— Compare Scalar Double Precision Floating-Point Value</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CMPSD
— Compare Scalar Double Precision Floating-Point Value</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op / En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>F2 0F C2 /r ib CMPSD xmm1, xmm2/m64, imm8</td>
<td>A</td>
<td>V/V</td>
<td>SSE2</td>
<td>Compare low double precision floating-point value in xmm2/m64 and xmm1 using bits 2:0 of imm8 as comparison predicate.</td></tr>
<tr>
<td>VEX.LIG.F2.0F.WIG C2 /r ib VCMPSD xmm1, xmm2, xmm3/m64, imm8</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Compare low double precision floating-point value in xmm3/m64 and xmm2 using bits 4:0 of imm8 as comparison predicate.</td></tr>
<tr>
<td>EVEX.LLIG.F2.0F.W1 C2 /r ib VCMPSD k1 {k2}, xmm2, xmm3/m64{sae}, imm8</td>
<td>C</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Compare low double precision floating-point value in xmm3/m64 and xmm2 using bits 4:0 of imm8 as comparison predicate with writemask k2 and leave the result in mask register k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>imm8</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>imm8</td></tr>
<tr>
<td>C</td>
<td>Tuple1 Scalar</td>
<td>ModRM:reg (w)</td>
<td>EVEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>imm8</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Compares the low double precision floating-point values in the second source operand and the first source operand and returns the result of the comparison to the destination operand. The comparison predicate operand (immediate operand) specifies the type of comparison performed.</p>
<p>128-bit Legacy SSE version: The first source and destination operand (first operand) is an XMM register. The second source operand (second operand) can be an XMM register or 64-bit memory location. Bits (MAXVL-1:64) of the corresponding YMM destination register remain unchanged. The comparison result is a quadword mask of all 1s (comparison true) or all 0s (comparison false).</p>
<p>VEX.128 encoded version: The first source operand (second operand) is an XMM register. The second source operand (third operand) can be an XMM register or a 64-bit memory location. The result is stored in the low quadword of the destination operand; the high quadword is filled with the contents of the high quadword of the first source operand. Bits (MAXVL-1:128) of the destination ZMM register are zeroed. The comparison result is a quadword mask of all 1s (comparison true) or all 0s (comparison false).</p>
<p>EVEX encoded version: The first source operand (second operand) is an XMM register. The second source operand can be a XMM register or a 64-bit memory location. The destination operand (first operand) is an opmask register. The comparison result is a single mask bit of 1 (comparison true) or 0 (comparison false), written to the destination starting from the LSB according to the writemask k2. Bits (MAX_KL-1:128) of the destination register are cleared.</p>
<p>The comparison predicate operand is an 8-bit immediate:</p>
<ul>
<li>For instructions encoded using the VEX prefix, bits 4:0 define the type of comparison to be performed (see <a href='cmppd.html#tbl-3-1'>Table 3-1</a>). Bits 5 through 7 of the immediate are reserved.</li>
<li>For instruction encodings that do not use VEX prefix, bits 2:0 define the type of comparison to be made (see the first 8 rows of <a href='cmppd.html#tbl-3-1'>Table 3-1</a>). Bits 3 through 7 of the immediate are reserved.</li></ul>
<p>The unordered relationship is true when at least one of the two source operands being compared is a NaN; the ordered relationship is true when neither source operand is a NaN.</p>
<p>A subsequent computational instruction that uses the mask result in the destination operand as an input operand will not generate an exception, because a mask of all 0s corresponds to a floating-point value of +0.0 and a mask of all 1s corresponds to a QNaN.</p>
<p>Note that processors with “CPUID.1H:ECX.AVX =0” do not implement the “greater-than”, “greater-than-or-equal”, “not-greater than”, and “not-greater-than-or-equal relations” predicates. These comparisons can be made either</p>
<p>by using the inverse relationship (that is, use the “not-less-than-or-equal” to make a “greater-than” comparison) or by using software emulation. When using software emulation, the program must swap the operands (copying registers when necessary to protect the data that will now be in the destination), and then perform the compare using a different predicate. The predicate to be used for these emulations is listed in the first 8 rows of <a href='cli.html#tbl-3-7'>Table 3-7</a> (Intel<sup>®</sup> 64 and IA-32 Architectures Software Developers Manual, Volume 2A) under the heading Emulation.</p>
<p>Compilers and assemblers may implement the following two-operand pseudo-ops in addition to the three-operand CMPSD instruction, for processors with “CPUID.1H:ECX.AVX =0”. See <a href='cmpsd.html#tbl-3-6'>Table 3-6</a>. The compiler should treat reserved imm8 values as illegal syntax.</p>
<figure id="tbl-3-6">
<table>
<tr>
<th>Pseudo-Op</th>
<th>CMPSD Implementation</th></tr>
<tr>
<td>CMPEQSD xmm1, xmm2</td>
<td>CMPSD xmm1, xmm2, 0</td></tr>
<tr>
<td>CMPLTSD xmm1, xmm2</td>
<td>CMPSD xmm1, xmm2, 1</td></tr>
<tr>
<td>CMPLESD xmm1, xmm2</td>
<td>CMPSD xmm1, xmm2, 2</td></tr>
<tr>
<td>CMPUNORDSD xmm1, xmm2</td>
<td>CMPSD xmm1, xmm2, 3</td></tr>
<tr>
<td>CMPNEQSD xmm1, xmm2</td>
<td>CMPSD xmm1, xmm2, 4</td></tr>
<tr>
<td>CMPNLTSD xmm1, xmm2</td>
<td>CMPSD xmm1, xmm2, 5</td></tr>
<tr>
<td>CMPNLESD xmm1, xmm2</td>
<td>CMPSD xmm1, xmm2, 6</td></tr>
<tr>
<td>CMPORDSD xmm1, xmm2</td>
<td>CMPSD xmm1, xmm2, 7</td></tr></table>
<figcaption><a href='cmpsd.html#tbl-3-6'>Table 3-6</a>. Pseudo-Op and CMPSD Implementation </figcaption></figure>
<p>The greater-than relations that the processor does not implement require more than one instruction to emulate in software and therefore should not be implemented as pseudo-ops. (For these, the programmer should reverse the operands of the corresponding less than relations and use move instructions to ensure that the mask is moved to the correct destination register and that the source operand is left intact.)</p>
<p>Processors with “CPUID.1H:ECX.AVX =1” implement the full complement of 32 predicates shown in <a href='cli.html#tbl-3-7'>Table 3-7</a>, software emulation is no longer needed. Compilers and assemblers may implement the following three-operand pseudo-ops in addition to the four-operand VCMPSD instruction. See <a href='cli.html#tbl-3-7'>Table 3-7</a>, where the notations of reg1 reg2, and reg3 represent either XMM registers or YMM registers. The compiler should treat reserved imm8 values as illegal syntax. Alternately, intrinsics can map the pseudo-ops to pre-defined constants to support a simpler intrinsic interface. Compilers and assemblers may implement three-operand pseudo-ops for EVEX encoded VCMPSD instructions in a similar fashion by extending the syntax listed in <a href='cli.html#tbl-3-7'>Table 3-7</a>.</p>
<figure id="tbl-3-7">
<table>
<tr>
<th>Pseudo-Op</th>
<th>CMPSD Implementation</th></tr>
<tr>
<td>VCMPEQSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 0</td></tr>
<tr>
<td>VCMPLTSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 1</td></tr>
<tr>
<td>VCMPLESD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 2</td></tr>
<tr>
<td>VCMPUNORDSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 3</td></tr>
<tr>
<td>VCMPNEQSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 4</td></tr>
<tr>
<td>VCMPNLTSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 5</td></tr>
<tr>
<td>VCMPNLESD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 6</td></tr>
<tr>
<td>VCMPORDSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 7</td></tr>
<tr>
<td>VCMPEQ_UQSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 8</td></tr>
<tr>
<td>VCMPNGESD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 9</td></tr>
<tr>
<td>VCMPNGTSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 0AH</td></tr>
<tr>
<td>VCMPFALSESD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 0BH</td></tr>
<tr>
<td>VCMPNEQ_OQSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 0CH</td></tr>
<tr>
<td>VCMPGESD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 0DH</td></tr></table>
<figcaption><a href='cli.html#tbl-3-7'>Table 3-7</a>. Pseudo-Op and VCMPSD Implementation </figcaption></figure>
<figure id="tbl-3-7">
<table>
<tr>
<th>Pseudo-Op</th>
<th>CMPSD Implementation</th></tr>
<tr>
<td>VCMPGTSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 0EH</td></tr>
<tr>
<td>VCMPTRUESD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 0FH</td></tr>
<tr>
<td>VCMPEQ_OSSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 10H</td></tr>
<tr>
<td>VCMPLT_OQSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 11H</td></tr>
<tr>
<td>VCMPLE_OQSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 12H</td></tr>
<tr>
<td>VCMPUNORD_SSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 13H</td></tr>
<tr>
<td>VCMPNEQ_USSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 14H</td></tr>
<tr>
<td>VCMPNLT_UQSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 15H</td></tr>
<tr>
<td>VCMPNLE_UQSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 16H</td></tr>
<tr>
<td>VCMPORD_SSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 17H</td></tr>
<tr>
<td>VCMPEQ_USSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 18H</td></tr>
<tr>
<td>VCMPNGE_UQSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 19H</td></tr>
<tr>
<td>VCMPNGT_UQSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 1AH</td></tr>
<tr>
<td>VCMPFALSE_OSSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 1BH</td></tr>
<tr>
<td>VCMPNEQ_OSSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 1CH</td></tr>
<tr>
<td>VCMPGE_OQSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 1DH</td></tr>
<tr>
<td>VCMPGT_OQSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 1EH</td></tr>
<tr>
<td>VCMPTRUE_USSD reg1, reg2, reg3</td>
<td>VCMPSD reg1, reg2, reg3, 1FH</td></tr></table>
<figcaption><a href='cli.html#tbl-3-7'>Table 3-7</a>. Pseudo-Op and VCMPSD Implementation</figcaption></figure>
<p>Software should ensure VCMPSD is encoded with VEX.L=0. Encoding VCMPSD with VEX.L=1 may encounter unpredictable behavior across different processor generations.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>CASE (COMPARISON PREDICATE) OF
0: OP3 := EQ_OQ; OP5 := EQ_OQ;
1: OP3 := LT_OS; OP5 := LT_OS;
2: OP3 := LE_OS; OP5 := LE_OS;
3: OP3 := UNORD_Q; OP5 := UNORD_Q;
4: OP3 := NEQ_UQ; OP5 := NEQ_UQ;
5: OP3 := NLT_US; OP5 := NLT_US;
6: OP3 := NLE_US; OP5 := NLE_US;
7: OP3 := ORD_Q; OP5 := ORD_Q;
8: OP5 := EQ_UQ;
9: OP5 := NGE_US;
10: OP5 := NGT_US;
11: OP5 := FALSE_OQ;
12: OP5 := NEQ_OQ;
13: OP5 := GE_OS;
14: OP5 := GT_OS;
15: OP5 := TRUE_UQ;
16: OP5 := EQ_OS;
17: OP5 := LT_OQ;
18: OP5 := LE_OQ;
19: OP5 := UNORD_S;
20: OP5 := NEQ_US;
21: OP5 := NLT_UQ;
22: OP5 := NLE_UQ;
23: OP5 := ORD_S;
24: OP5 := EQ_US;
25: OP5 := NGE_UQ;
26: OP5 := NGT_UQ;
27: OP5 := FALSE_OS;
28: OP5 := NEQ_OS;
29: OP5 := GE_OQ;
30: OP5 := GT_OQ;
31: OP5 := TRUE_US;
DEFAULT: Reserved
ESAC;
</pre>
<h3 id="vcmpsd--evex-encoded-version-">VCMPSD (EVEX Encoded Version)<a class="anchor" href="#vcmpsd--evex-encoded-version-">
</a></h3>
<pre>CMP0 := SRC1[63:0] OP5 SRC2[63:0];
IF k2[0] or *no writemask*
THEN IF CMP0 = TRUE
THEN DEST[0] := 1;
ELSE DEST[0] := 0; FI;
ELSE DEST[0] := 0
; zeroing-masking only
FI;
DEST[MAX_KL-1:1] := 0
</pre>
<h3 id="cmpsd--128-bit-legacy-sse-version-">CMPSD (128-bit Legacy SSE Version)<a class="anchor" href="#cmpsd--128-bit-legacy-sse-version-">
</a></h3>
<pre>CMP0 := DEST[63:0] OP3 SRC[63:0];
IF CMP0 = TRUE
THEN DEST[63:0] := FFFFFFFFFFFFFFFFH;
ELSE DEST[63:0] := 0000000000000000H; FI;
DEST[MAXVL-1:64] (Unmodified)
</pre>
<h3 id="vcmpsd--vex-128-encoded-version-">VCMPSD (VEX.128 Encoded Version)<a class="anchor" href="#vcmpsd--vex-128-encoded-version-">
</a></h3>
<pre>CMP0 := SRC1[63:0] OP5 SRC2[63:0];
IF CMP0 = TRUE
THEN DEST[63:0] := FFFFFFFFFFFFFFFFH;
ELSE DEST[63:0] := 0000000000000000H; FI;
DEST[127:64] := SRC1[127:64]
DEST[MAXVL-1:128] := 0
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>VCMPSD __mmask8 _mm_cmp_sd_mask( __m128d a, __m128d b, int imm);
</pre>
<pre>VCMPSD __mmask8 _mm_cmp_round_sd_mask( __m128d a, __m128d b, int imm, int sae);
</pre>
<pre>VCMPSD __mmask8 _mm_mask_cmp_sd_mask( __mmask8 k1, __m128d a, __m128d b, int imm);
</pre>
<pre>VCMPSD __mmask8 _mm_mask_cmp_round_sd_mask( __mmask8 k1, __m128d a, __m128d b, int imm, int sae);
</pre>
<pre>(V)CMPSD __m128d _mm_cmp_sd(__m128d a, __m128d b, const int imm)
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>Invalid if SNaN operand, Invalid if QNaN and predicate as listed in <a href='cmppd.html#tbl-3-1'>Table 3-1</a>, Denormal.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>VEX-encoded instructions, see <span class="not-imported">Table 2-20</span>, “Type 3 Class Exception Conditions.”</p>
<p>EVEX-encoded instructions, see <span class="not-imported">Table 2-47</span>, “Type E3 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

315
x86/cmpss.html Normal file
View file

@ -0,0 +1,315 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CMPSS
— Compare Scalar Single Precision Floating-Point Value</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CMPSS
— Compare Scalar Single Precision Floating-Point Value</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op / En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>F3 0F C2 /r ib CMPSS xmm1, xmm2/m32, imm8</td>
<td>A</td>
<td>V/V</td>
<td>SSE</td>
<td>Compare low single precision floating-point value in xmm2/m32 and xmm1 using bits 2:0 of imm8 as comparison predicate.</td></tr>
<tr>
<td>VEX.LIG.F3.0F.WIG C2 /r ib VCMPSS xmm1, xmm2, xmm3/m32, imm8</td>
<td>B</td>
<td>V/V</td>
<td>AVX</td>
<td>Compare low single precision floating-point value in xmm3/m32 and xmm2 using bits 4:0 of imm8 as comparison predicate.</td></tr>
<tr>
<td>EVEX.LLIG.F3.0F.W0 C2 /r ib VCMPSS k1 {k2}, xmm2, xmm3/m32{sae}, imm8</td>
<td>C</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Compare low single precision floating-point value in xmm3/m32 and xmm2 using bits 4:0 of imm8 as comparison predicate with writemask k2 and leave the result in mask register k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>imm8</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>VEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>imm8</td></tr>
<tr>
<td>C</td>
<td>Tuple1 Scalar</td>
<td>ModRM:reg (w)</td>
<td>EVEX.vvvv (r)</td>
<td>ModRM:r/m (r)</td>
<td>imm8</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Compares the low single precision floating-point values in the second source operand and the first source operand and returns the result of the comparison to the destination operand. The comparison predicate operand (immediate operand) specifies the type of comparison performed.</p>
<p>128-bit Legacy SSE version: The first source and destination operand (first operand) is an XMM register. The second source operand (second operand) can be an XMM register or 32-bit memory location. Bits (MAXVL-1:32) of the corresponding YMM destination register remain unchanged. The comparison result is a doubleword mask of all 1s (comparison true) or all 0s (comparison false).</p>
<p>VEX.128 encoded version: The first source operand (second operand) is an XMM register. The second source operand (third operand) can be an XMM register or a 32-bit memory location. The result is stored in the low 32 bits of the destination operand; bits 127:32 of the destination operand are copied from the first source operand. Bits (MAXVL-1:128) of the destination ZMM register are zeroed. The comparison result is a doubleword mask of all 1s (comparison true) or all 0s (comparison false).</p>
<p>EVEX encoded version: The first source operand (second operand) is an XMM register. The second source operand can be a XMM register or a 32-bit memory location. The destination operand (first operand) is an opmask register. The comparison result is a single mask bit of 1 (comparison true) or 0 (comparison false), written to the destination starting from the LSB according to the writemask k2. Bits (MAX_KL-1:128) of the destination register are cleared.</p>
<p>The comparison predicate operand is an 8-bit immediate:</p>
<ul>
<li>For instructions encoded using the VEX prefix, bits 4:0 define the type of comparison to be performed (see <a href='cmppd.html#tbl-3-1'>Table 3-1</a>). Bits 5 through 7 of the immediate are reserved.</li>
<li>For instruction encodings that do not use VEX prefix, bits 2:0 define the type of comparison to be made (see the first 8 rows of <a href='cmppd.html#tbl-3-1'>Table 3-1</a>). Bits 3 through 7 of the immediate are reserved.</li></ul>
<p>The unordered relationship is true when at least one of the two source operands being compared is a NaN; the ordered relationship is true when neither source operand is a NaN.</p>
<p>A subsequent computational instruction that uses the mask result in the destination operand as an input operand will not generate an exception, because a mask of all 0s corresponds to a floating-point value of +0.0 and a mask of all 1s corresponds to a QNaN.</p>
<p>Note that processors with “CPUID.1H:ECX.AVX =0” do not implement the “greater-than”, “greater-than-or-equal”, “not-greater than”, and “not-greater-than-or-equal relations” predicates. These comparisons can be made either by using the inverse relationship (that is, use the “not-less-than-or-equal” to make a “greater-than” comparison) or by using software emulation. When using software emulation, the program must swap the operands (copying registers when necessary to protect the data that will now be in the destination), and then perform the compare using a different predicate. The predicate to be used for these emulations is listed in the first 8 rows of <a href='cli.html#tbl-3-7'>Table 3-7</a> (Intel<sup>®</sup> 64 and IA-32 Architectures Software Developers Manual, Volume 2A) under the heading Emulation.</p>
<p>Compilers and assemblers may implement the following two-operand pseudo-ops in addition to the three-operand CMPSS instruction, for processors with “CPUID.1H:ECX.AVX =0”. See <a href='cmpss.html#tbl-3-8'>Table 3-8</a>. The compiler should treat reserved imm8 values as illegal syntax.</p>
<figure id="tbl-3-8">
<table>
<tr>
<th>Pseudo-Op</th>
<th>CMPSS Implementation</th></tr>
<tr>
<td>CMPEQSS xmm1, xmm2</td>
<td>CMPSS xmm1, xmm2, 0</td></tr>
<tr>
<td>CMPLTSS xmm1, xmm2</td>
<td>CMPSS xmm1, xmm2, 1</td></tr>
<tr>
<td>CMPLESS xmm1, xmm2</td>
<td>CMPSS xmm1, xmm2, 2</td></tr>
<tr>
<td>CMPUNORDSS xmm1, xmm2</td>
<td>CMPSS xmm1, xmm2, 3</td></tr>
<tr>
<td>CMPNEQSS xmm1, xmm2</td>
<td>CMPSS xmm1, xmm2, 4</td></tr>
<tr>
<td>CMPNLTSS xmm1, xmm2</td>
<td>CMPSS xmm1, xmm2, 5</td></tr>
<tr>
<td>CMPNLESS xmm1, xmm2</td>
<td>CMPSS xmm1, xmm2, 6</td></tr>
<tr>
<td>CMPORDSS xmm1, xmm2</td>
<td>CMPSS xmm1, xmm2, 7</td></tr></table>
<figcaption><a href='cmpss.html#tbl-3-8'>Table 3-8</a>. Pseudo-Op and CMPSS Implementation </figcaption></figure>
<p>The greater-than relations that the processor does not implement require more than one instruction to emulate in software and therefore should not be implemented as pseudo-ops. (For these, the programmer should reverse the operands of the corresponding less than relations and use move instructions to ensure that the mask is moved to the correct destination register and that the source operand is left intact.)</p>
<p>Processors with “CPUID.1H:ECX.AVX =1” implement the full complement of 32 predicates shown in <a href='cli.html#tbl-3-7'>Table 3-7</a>, software emulation is no longer needed. Compilers and assemblers may implement the following three-operand pseudo-ops in addition to the four-operand VCMPSS instruction. See <a href='cmpss.html#tbl-3-9'>Table 3-9</a>, where the notations of reg1 reg2, and reg3 represent either XMM registers or YMM registers. The compiler should treat reserved imm8 values as illegal syntax. Alternately, intrinsics can map the pseudo-ops to pre-defined constants to support a simpler intrinsic interface. Compilers and assemblers may implement three-operand pseudo-ops for EVEX encoded VCMPSS instructions in a similar fashion by extending the syntax listed in <a href='cmpss.html#tbl-3-9'>Table 3-9</a>.</p>
<figure id="tbl-3-9">
<table>
<tr>
<th>Pseudo-Op</th>
<th>CMPSS Implementation</th></tr>
<tr>
<td>VCMPEQSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 0</td></tr>
<tr>
<td>VCMPLTSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 1</td></tr>
<tr>
<td>VCMPLESS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 2</td></tr>
<tr>
<td>VCMPUNORDSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 3</td></tr>
<tr>
<td>VCMPNEQSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 4</td></tr>
<tr>
<td>VCMPNLTSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 5</td></tr>
<tr>
<td>VCMPNLESS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 6</td></tr>
<tr>
<td>VCMPORDSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 7</td></tr>
<tr>
<td>VCMPEQ_UQSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 8</td></tr>
<tr>
<td>VCMPNGESS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 9</td></tr>
<tr>
<td>VCMPNGTSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 0AH</td></tr>
<tr>
<td>VCMPFALSESS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 0BH</td></tr></table>
<figcaption><a href='cmpss.html#tbl-3-9'>Table 3-9</a>. Pseudo-Op and VCMPSS Implementation </figcaption></figure>
<figure id="tbl-3-9">
<table>
<tr>
<th>Pseudo-Op</th>
<th>CMPSS Implementation</th></tr>
<tr>
<td>VCMPNEQ_OQSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 0CH</td></tr>
<tr>
<td>VCMPGESS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 0DH</td></tr>
<tr>
<td>VCMPGTSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 0EH</td></tr>
<tr>
<td>VCMPTRUESS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 0FH</td></tr>
<tr>
<td>VCMPEQ_OSSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 10H</td></tr>
<tr>
<td>VCMPLT_OQSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 11H</td></tr>
<tr>
<td>VCMPLE_OQSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 12H</td></tr>
<tr>
<td>VCMPUNORD_SSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 13H</td></tr>
<tr>
<td>VCMPNEQ_USSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 14H</td></tr>
<tr>
<td>VCMPNLT_UQSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 15H</td></tr>
<tr>
<td>VCMPNLE_UQSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 16H</td></tr>
<tr>
<td>VCMPORD_SSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 17H</td></tr>
<tr>
<td>VCMPEQ_USSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 18H</td></tr>
<tr>
<td>VCMPNGE_UQSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 19H</td></tr>
<tr>
<td>VCMPNGT_UQSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 1AH</td></tr>
<tr>
<td>VCMPFALSE_OSSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 1BH</td></tr>
<tr>
<td>VCMPNEQ_OSSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 1CH</td></tr>
<tr>
<td>VCMPGE_OQSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 1DH</td></tr>
<tr>
<td>VCMPGT_OQSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 1EH</td></tr>
<tr>
<td>VCMPTRUE_USSS reg1, reg2, reg3</td>
<td>VCMPSS reg1, reg2, reg3, 1FH</td></tr></table>
<figcaption><a href='cmpss.html#tbl-3-9'>Table 3-9</a>. Pseudo-Op and VCMPSS Implementation</figcaption></figure>
<p>Software should ensure VCMPSS is encoded with VEX.L=0. Encoding VCMPSS with VEX.L=1 may encounter unpredictable behavior across different processor generations.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>CASE (COMPARISON PREDICATE) OF
0: OP3 := EQ_OQ; OP5 := EQ_OQ;
1: OP3 := LT_OS; OP5 := LT_OS;
2: OP3 := LE_OS; OP5 := LE_OS;
3: OP3 := UNORD_Q; OP5 := UNORD_Q;
4: OP3 := NEQ_UQ; OP5 := NEQ_UQ;
5: OP3 := NLT_US; OP5 := NLT_US;
6: OP3 := NLE_US; OP5 := NLE_US;
7: OP3 := ORD_Q; OP5 := ORD_Q;
8: OP5 := EQ_UQ;
9: OP5 := NGE_US;
10: OP5 := NGT_US;
11: OP5 := FALSE_OQ;
12: OP5 := NEQ_OQ;
13: OP5 := GE_OS;
14: OP5 := GT_OS;
15: OP5 := TRUE_UQ;
16: OP5 := EQ_OS;
17: OP5 := LT_OQ;
18: OP5 := LE_OQ;
19: OP5 := UNORD_S;
20: OP5 := NEQ_US;
21: OP5 := NLT_UQ;
22: OP5 := NLE_UQ;
23: OP5 := ORD_S;
24: OP5 := EQ_US;
25: OP5 := NGE_UQ;
26: OP5 := NGT_UQ;
27: OP5 := FALSE_OS;
28: OP5 := NEQ_OS;
29: OP5 := GE_OQ;
30: OP5 := GT_OQ;
31: OP5 := TRUE_US;
DEFAULT: Reserved
ESAC;
</pre>
<h3 id="vcmpss--evex-encoded-version-">VCMPSS (EVEX Encoded Version)<a class="anchor" href="#vcmpss--evex-encoded-version-">
</a></h3>
<pre>CMP0 := SRC1[31:0] OP5 SRC2[31:0];
IF k2[0] or *no writemask*
THEN IF CMP0 = TRUE
THEN DEST[0] := 1;
ELSE DEST[0] := 0; FI;
ELSE DEST[0] := 0
; zeroing-masking only
FI;
DEST[MAX_KL-1:1] := 0
</pre>
<h3 id="cmpss--128-bit-legacy-sse-version-">CMPSS (128-bit Legacy SSE Version)<a class="anchor" href="#cmpss--128-bit-legacy-sse-version-">
</a></h3>
<pre>CMP0 := DEST[31:0] OP3 SRC[31:0];
IF CMP0 = TRUE
THEN DEST[31:0] := FFFFFFFFH;
ELSE DEST[31:0] := 00000000H; FI;
DEST[MAXVL-1:32] (Unmodified)
</pre>
<h3 id="vcmpss--vex-128-encoded-version-">VCMPSS (VEX.128 Encoded Version)<a class="anchor" href="#vcmpss--vex-128-encoded-version-">
</a></h3>
<pre>CMP0 := SRC1[31:0] OP5 SRC2[31:0];
IF CMP0 = TRUE
THEN DEST[31:0] := FFFFFFFFH;
ELSE DEST[31:0] := 00000000H; FI;
DEST[127:32] := SRC1[127:32]
DEST[MAXVL-1:128] := 0
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>VCMPSS __mmask8 _mm_cmp_ss_mask( __m128 a, __m128 b, int imm);
</pre>
<pre>VCMPSS __mmask8 _mm_cmp_round_ss_mask( __m128 a, __m128 b, int imm, int sae);
</pre>
<pre>VCMPSS __mmask8 _mm_mask_cmp_ss_mask( __mmask8 k1, __m128 a, __m128 b, int imm);
</pre>
<pre>VCMPSS __mmask8 _mm_mask_cmp_round_ss_mask( __mmask8 k1, __m128 a, __m128 b, int imm, int sae);
</pre>
<pre>(V)CMPSS __m128 _mm_cmp_ss(__m128 a, __m128 b, const int imm)
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>Invalid if SNaN operand, Invalid if QNaN and predicate as listed in <a href='cmppd.html#tbl-3-1'>Table 3-1</a>, Denormal.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>VEX-encoded instructions, see <span class="not-imported">Table 2-20</span>, “Type 3 Class Exception Conditions.”</p>
<p>EVEX-encoded instructions, see <span class="not-imported">Table 2-47</span>, “Type E3 Class Exception Conditions.”</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

172
x86/cmpxchg.html Normal file
View file

@ -0,0 +1,172 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CMPXCHG
— Compare and Exchange</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CMPXCHG
— Compare and Exchange</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64-Bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>0F B0/r CMPXCHG r/m8, r8</td>
<td>MR</td>
<td>Valid</td>
<td>Valid*</td>
<td>Compare AL with r/m8. If equal, ZF is set and r8 is loaded into r/m8. Else, clear ZF and load r/m8 into AL.</td></tr>
<tr>
<td>REX + 0F B0/r CMPXCHG r/m8**,r8</td>
<td>MR</td>
<td>Valid</td>
<td>N.E.</td>
<td>Compare AL with r/m8. If equal, ZF is set and r8 is loaded into r/m8. Else, clear ZF and load r/m8 into AL.</td></tr>
<tr>
<td>0F B1/r CMPXCHG r/m16, r16</td>
<td>MR</td>
<td>Valid</td>
<td>Valid*</td>
<td>Compare AX with r/m16. If equal, ZF is set and r16 is loaded into r/m16. Else, clear ZF and load r/m16 into AX.</td></tr>
<tr>
<td>0F B1/r CMPXCHG r/m32, r32</td>
<td>MR</td>
<td>Valid</td>
<td>Valid*</td>
<td>Compare EAX with r/m32. If equal, ZF is set and r32 is loaded into r/m32. Else, clear ZF and load r/m32 into EAX.</td></tr>
<tr>
<td>REX.W + 0F B1/r CMPXCHG r/m64, r64</td>
<td>MR</td>
<td>Valid</td>
<td>N.E.</td>
<td>Compare RAX with r/m64. If equal, ZF is set and r64 is loaded into r/m64. Else, clear ZF and load r/m64 into RAX.</td></tr></table>
<blockquote>
<p>* SeetheIA-32ArchitectureCompatibilitysectionbelow.</p>
<p>** In 64-bit mode, r/m8 can not be encoded to access the following byte registers if a REX prefix is used: AH, BH, CH, DH.</p></blockquote>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>MR</td>
<td>ModRM:r/m (r, w)</td>
<td>ModRM:reg (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Compares the value in the AL, AX, EAX, or RAX register with the first operand (destination operand). If the two values are equal, the second operand (source operand) is loaded into the destination operand. Otherwise, the destination operand is loaded into the AL, AX, EAX or RAX register. RAX register is available only in 64-bit mode.</p>
<p>This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically. To simplify the interface to the processors bus, the destination operand receives a write cycle without regard to the result of the comparison. The destination operand is written back if the comparison fails; otherwise, the source operand is written into the destination. (The processor never produces a locked read without also producing a locked write.)</p>
<p>In 64-bit mode, the instructions default operation size is 32 bits. Use of the REX.R prefix permits access to additional registers (R8-R15). Use of the REX.W prefix promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.</p>
<h2 id="ia-32-architecture-compatibility">IA-32 Architecture Compatibility<a class="anchor" href="#ia-32-architecture-compatibility">
</a></h2>
<p>This instruction is not supported on Intel processors earlier than the Intel486 processors.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>(* Accumulator = AL, AX, EAX, or RAX depending on whether a byte, word, doubleword, or quadword comparison is being performed *)
TEMP := DEST
IF accumulator = TEMP
THEN
ZF := 1;
DEST := SRC;
ELSE
ZF := 0;
accumulator := TEMP;
DEST := TEMP;
FI;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>The ZF flag is set if the values in the destination operand and register AL, AX, or EAX are equal; otherwise it is cleared. The CF, PF, AF, SF, and OF flags are set according to the results of the comparison operation.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td rowspan="3">#GP(0)</td>
<td>If the destination is located in a non-writable segment.</td></tr>
<tr>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>If the DS, ES, FS, or GS register contains a NULL segment selector.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP(0)</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#SS(0)</td>
<td>If a memory address referencing the SS segment is in a non-canonical form.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If the memory address is in a non-canonical form.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td>#UD</td>
<td>If the LOCK prefix is used but the destination is not a memory operand.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

View file

@ -0,0 +1,173 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CMPXCHG8B/CMPXCHG16B
— Compare and Exchange Bytes</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CMPXCHG8B/CMPXCHG16B
— Compare and Exchange Bytes</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64-Bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>0F C7 /1 CMPXCHG8B m64</td>
<td>M</td>
<td>Valid</td>
<td>Valid*</td>
<td>Compare EDX:EAX with m64. If equal, set ZF and load ECX:EBX into m64. Else, clear ZF and load m64 into EDX:EAX.</td></tr>
<tr>
<td>REX.W + 0F C7 /1 CMPXCHG16B m128</td>
<td>M</td>
<td>Valid</td>
<td>N.E.</td>
<td>Compare RDX:RAX with m128. If equal, set ZF and load RCX:RBX into m128. Else, clear ZF and load m128 into RDX:RAX.</td></tr></table>
<blockquote>
<p>*See IA-32 Architecture Compatibility section below.</p></blockquote>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>M</td>
<td>ModRM:r/m (r, w)</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Compares the 64-bit value in EDX:EAX (or 128-bit value in RDX:RAX if operand size is 128 bits) with the operand (destination operand). If the values are equal, the 64-bit value in ECX:EBX (or 128-bit value in RCX:RBX) is stored in the destination operand. Otherwise, the value in the destination operand is loaded into EDX:EAX (or RDX:RAX). The destination operand is an 8-byte memory location (or 16-byte memory location if operand size is 128 bits). For the EDX:EAX and ECX:EBX register pairs, EDX and ECX contain the high-order 32 bits and EAX and EBX contain the low-order 32 bits of a 64-bit value. For the RDX:RAX and RCX:RBX register pairs, RDX and RCX contain the high-order 64 bits and RAX and RBX contain the low-order 64bits of a 128-bit value.</p>
<p>This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically. To simplify the interface to the processors bus, the destination operand receives a write cycle without regard to the result of the comparison. The destination operand is written back if the comparison fails; otherwise, the source operand is written into the destination. (The processor never produces a locked read without also producing a locked write.)</p>
<p>In 64-bit mode, default operation size is 64 bits. Use of the REX.W prefix promotes operation to 128 bits. Note that CMPXCHG16B requires that the destination (memory) operand be 16-byte aligned. See the summary chart at the beginning of this section for encoding data and limits. For information on the CPUID flag that indicates CMPX-CHG16B, see page 3-243.</p>
<h2 id="ia-32-architecture-compatibility">IA-32 Architecture Compatibility<a class="anchor" href="#ia-32-architecture-compatibility">
</a></h2>
<p>This instruction encoding is not supported on Intel processors earlier than the Pentium processors.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>IF (64-Bit Mode and OperandSize = 64)
THEN
TEMP128 := DEST
IF (RDX:RAX = TEMP128)
THEN
ZF := 1;
DEST := RCX:RBX;
ELSE
ZF := 0;
RDX:RAX := TEMP128;
DEST := TEMP128;
FI;
FI
ELSE
TEMP64 := DEST;
IF (EDX:EAX = TEMP64)
THEN
ZF := 1;
DEST := ECX:EBX;
ELSE
ZF := 0;
EDX:EAX := TEMP64;
DEST := TEMP64;
FI;
FI;
FI;
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>The ZF flag is set if the destination operand and EDX:EAX are equal; otherwise it is cleared. The CF, PF, AF, SF, and OF flags are unaffected.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>If the destination is not a memory operand.</td></tr>
<tr>
<td rowspan="3">#GP(0)</td>
<td>If the destination is located in a non-writable segment.</td></tr>
<tr>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>If the DS, ES, FS, or GS register contains a NULL segment selector.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>If the destination operand is not a memory location.</td></tr>
<tr>
<td>#GP</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual-8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#UD</td>
<td>If the destination operand is not a memory location.</td></tr>
<tr>
<td>#GP(0)</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in protected mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#SS(0)</td>
<td>If a memory address referencing the SS segment is in a non-canonical form.</td></tr>
<tr>
<td rowspan="3">#GP(0)</td>
<td>If the memory address is in a non-canonical form.</td></tr>
<tr>
<td>If memory operand for CMPXCHG16B is not aligned on a 16-byte boundary.</td></tr>
<tr>
<td>If CPUID.01H:ECX.CMPXCHG16B[bit 13] = 0.</td></tr>
<tr>
<td>#UD</td>
<td>If the destination operand is not a memory location.</td></tr>
<tr>
<td>#PF(fault-code)</td>
<td>If a page fault occurs.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

113
x86/comisd.html Normal file
View file

@ -0,0 +1,113 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>COMISD
— Compare Scalar Ordered Double Precision Floating-Point Values and Set EFLAGS</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>COMISD
— Compare Scalar Ordered Double Precision Floating-Point Values and Set EFLAGS</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op / En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>66 0F 2F /r COMISD xmm1, xmm2/m64</td>
<td>A</td>
<td>V/V</td>
<td>SSE2</td>
<td>Compare low double precision floating-point values in xmm1 and xmm2/mem64 and set the EFLAGS flags accordingly.</td></tr>
<tr>
<td>VEX.LIG.66.0F.WIG 2F /r VCOMISD xmm1, xmm2/m64</td>
<td>A</td>
<td>V/V</td>
<td>AVX</td>
<td>Compare low double precision floating-point values in xmm1 and xmm2/mem64 and set the EFLAGS flags accordingly.</td></tr>
<tr>
<td>EVEX.LLIG.66.0F.W1 2F /r VCOMISD xmm1, xmm2/m64{sae}</td>
<td>B</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Compare low double precision floating-point values in xmm1 and xmm2/mem64 and set the EFLAGS flags accordingly.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>Tuple1 Scalar</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Compares the double precision floating-point values in the low quadwords of operand 1 (first operand) and operand 2 (second operand), and sets the ZF, PF, and CF flags in the EFLAGS register according to the result (unordered, greater than, less than, or equal). The OF, SF, and AF flags in the EFLAGS register are set to 0. The unordered result is returned if either source operand is a NaN (QNaN or SNaN).</p>
<p>Operand 1 is an XMM register; operand 2 can be an XMM register or a 64 bit memory location. The COMISD instruction differs from the UCOMISD instruction in that it signals a SIMD floating-point invalid operation exception (#I) when a source operand is either a QNaN or SNaN. The UCOMISD instruction signals an invalid operation exception only if a source operand is an SNaN.</p>
<p>The EFLAGS register is not updated if an unmasked SIMD floating-point exception is generated.</p>
<p>VEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions will #UD.</p>
<p>Software should ensure VCOMISD is encoded with VEX.L=0. Encoding VCOMISD with VEX.L=1 may encounter unpredictable behavior across different processor generations.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="comisd--all-versions-">COMISD (All Versions)<a class="anchor" href="#comisd--all-versions-">
</a></h3>
<pre>RESULT := OrderedCompare(DEST[63:0] &lt;&gt; SRC[63:0]) {
(* Set EFLAGS *) CASE (RESULT) OF
UNORDERED: ZF,PF,CF := 111;
GREATER_THAN: ZF,PF,CF := 000;
LESS_THAN: ZF,PF,CF := 001;
EQUAL: ZF,PF,CF := 100;
ESAC;
OF, AF, SF := 0; }
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>VCOMISD int _mm_comi_round_sd(__m128d a, __m128d b, int imm, int sae);
</pre>
<pre>VCOMISD int _mm_comieq_sd (__m128d a, __m128d b)
</pre>
<pre>VCOMISD int _mm_comilt_sd (__m128d a, __m128d b)
</pre>
<pre>VCOMISD int _mm_comile_sd (__m128d a, __m128d b)
</pre>
<pre>VCOMISD int _mm_comigt_sd (__m128d a, __m128d b)
</pre>
<pre>VCOMISD int _mm_comige_sd (__m128d a, __m128d b)
</pre>
<pre>VCOMISD int _mm_comineq_sd (__m128d a, __m128d b)
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>Invalid (if SNaN or QNaN operands), Denormal.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>VEX-encoded instructions, see <span class="not-imported">Table 2-20</span>, “Type 3 Class Exception Conditions.”</p>
<p>EVEX-encoded instructions, see <span class="not-imported">Table 2-48</span>, “Type E3NF Class Exception Conditions.”</p>
<p>Additionally:</p>
<table>
<tr>
<td>#UD</td>
<td>If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

114
x86/comiss.html Normal file
View file

@ -0,0 +1,114 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>COMISS
— Compare Scalar Ordered Single Precision Floating-Point Values and Set EFLAGS</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>COMISS
— Compare Scalar Ordered Single Precision Floating-Point Values and Set EFLAGS</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op / En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>NP 0F 2F /r COMISS xmm1, xmm2/m32</td>
<td>A</td>
<td>V/V</td>
<td>SSE</td>
<td>Compare low single precision floating-point values in xmm1 and xmm2/mem32 and set the EFLAGS flags accordingly.</td></tr>
<tr>
<td>VEX.LIG.0F.WIG 2F /r VCOMISS xmm1, xmm2/m32</td>
<td>A</td>
<td>V/V</td>
<td>AVX</td>
<td>Compare low single precision floating-point values in xmm1 and xmm2/mem32 and set the EFLAGS flags accordingly.</td></tr>
<tr>
<td>EVEX.LLIG.0F.W0 2F /r VCOMISS xmm1, xmm2/m32{sae}</td>
<td>B</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Compare low single precision floating-point values in xmm1 and xmm2/mem32 and set the EFLAGS flags accordingly.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>Tuple1 Scalar</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Compares the single precision floating-point values in the low quadwords of operand 1 (first operand) and operand 2 (second operand), and sets the ZF, PF, and CF flags in the EFLAGS register according to the result (unordered, greater than, less than, or equal). The OF, SF, and AF flags in the EFLAGS register are set to 0. The unordered result is returned if either source operand is a NaN (QNaN or SNaN).</p>
<p>Operand 1 is an XMM register; operand 2 can be an XMM register or a 32 bit memory location.</p>
<p>The COMISS instruction differs from the UCOMISS instruction in that it signals a SIMD floating-point invalid operation exception (#I) when a source operand is either a QNaN or SNaN. The UCOMISS instruction signals an invalid operation exception only if a source operand is an SNaN.</p>
<p>The EFLAGS register is not updated if an unmasked SIMD floating-point exception is generated.</p>
<p>VEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions will #UD.</p>
<p>Software should ensure VCOMISS is encoded with VEX.L=0. Encoding VCOMISS with VEX.L=1 may encounter unpredictable behavior across different processor generations.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="comiss--all-versions-">COMISS (All Versions)<a class="anchor" href="#comiss--all-versions-">
</a></h3>
<pre>RESULT := OrderedCompare(DEST[31:0] &lt;&gt; SRC[31:0]) {
(* Set EFLAGS *) CASE (RESULT) OF
UNORDERED: ZF,PF,CF := 111;
GREATER_THAN: ZF,PF,CF := 000;
LESS_THAN: ZF,PF,CF := 001;
EQUAL: ZF,PF,CF := 100;
ESAC;
OF, AF, SF := 0; }
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>VCOMISS int _mm_comi_round_ss(__m128 a, __m128 b, int imm, int sae);
</pre>
<pre>VCOMISS int _mm_comieq_ss (__m128 a, __m128 b)
</pre>
<pre>VCOMISS int _mm_comilt_ss (__m128 a, __m128 b)
</pre>
<pre>VCOMISS int _mm_comile_ss (__m128 a, __m128 b)
</pre>
<pre>VCOMISS int _mm_comigt_ss (__m128 a, __m128 b)
</pre>
<pre>VCOMISS int _mm_comige_ss (__m128 a, __m128 b)
</pre>
<pre>VCOMISS int _mm_comineq_ss (__m128 a, __m128 b)
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>Invalid (if SNaN or QNaN operands), Denormal.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>VEX-encoded instructions, see <span class="not-imported">Table 2-20</span>, “Type 3 Class Exception Conditions.”</p>
<p>EVEX-encoded instructions, see <span class="not-imported">Table 2-48</span>, “Type E3NF Class Exception Conditions.”</p>
<p>Additionally:</p>
<table>
<tr>
<td>#UD</td>
<td>If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

2180
x86/cpuid.html Normal file

File diff suppressed because it is too large Load diff

228
x86/crc32.html Normal file
View file

@ -0,0 +1,228 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CRC32
— Accumulate CRC32 Value</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CRC32
— Accumulate CRC32 Value</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64-Bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>F2 0F 38 F0 /r CRC32 r32, r/m8</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Accumulate CRC32 on r/m8.</td></tr>
<tr>
<td>F2 REX 0F 38 F0 /r CRC32 r32, r/m8<sup>1</sup></td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Accumulate CRC32 on r/m8.</td></tr>
<tr>
<td>F2 0F 38 F1 /r CRC32 r32, r/m16</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Accumulate CRC32 on r/m16.</td></tr>
<tr>
<td>F2 0F 38 F1 /r CRC32 r32, r/m32</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Accumulate CRC32 on r/m32.</td></tr>
<tr>
<td>F2 REX.W 0F 38 F0 /r CRC32 r64, r/m8</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Accumulate CRC32 on r/m8.</td></tr>
<tr>
<td>F2 REX.W 0F 38 F1 /r CRC32 r64, r/m64</td>
<td>RM</td>
<td>Valid</td>
<td>N.E.</td>
<td>Accumulate CRC32 on r/m64.</td></tr></table>
<blockquote>
<p>1. In 64-bit mode, r/m8 can not be encoded to access the following byte registers if a REX prefix is used: AH, BH, CH, DH.</p></blockquote>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RM</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Starting with an initial value in the first operand (destination operand), accumulates a CRC32 (polynomial 11EDC6F41H) value for the second operand (source operand) and stores the result in the destination operand. The source operand can be a register or a memory location. The destination operand must be an r32 or r64 register. If the destination is an r64 register, then the 32-bit result is stored in the least significant double word and 00000000H is stored in the most significant double word of the r64 register.</p>
<p>The initial value supplied in the destination operand is a double word integer stored in the r32 register or the least significant double word of the r64 register. To incrementally accumulate a CRC32 value, software retains the result of the previous CRC32 operation in the destination operand, then executes the CRC32 instruction again with new input data in the source operand. Data contained in the source operand is processed in reflected bit order. This means that the most significant bit of the source operand is treated as the least significant bit of the quotient, and so on, for all the bits of the source operand. Likewise, the result of the CRC operation is stored in the destination operand in reflected bit order. This means that the most significant bit of the resulting CRC (bit 31) is stored in the least significant bit of the destination operand (bit 0), and so on, for all the bits of the CRC.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<blockquote>
<p>BIT_REFLECT64: DST[63-0] = SRC[0-63]</p>
<p>BIT_REFLECT32: DST[31-0] = SRC[0-31]</p>
<p>BIT_REFLECT16: DST[15-0] = SRC[0-15]</p>
<p>BIT_REFLECT8: DST[7-0] = SRC[0-7]</p>
<p>MOD2: Remainder from Polynomial division modulus 2</p></blockquote>
<pre>CRC32 instruction for 64-bit source operand and 64-bit destination operand:
TEMP1[63-0] := BIT_REFLECT64 (SRC[63-0])
TEMP2[31-0] := BIT_REFLECT32 (DEST[31-0])
TEMP3[95-0] := TEMP1[63-0] « 32
TEMP4[95-0] := TEMP2[31-0] « 64
TEMP5[95-0] := TEMP3[95-0] XOR TEMP4[95-0]
TEMP6[31-0] := TEMP5[95-0] MOD2 11EDC6F41H
DEST[31-0] := BIT_REFLECT (TEMP6[31-0])
DEST[63-32] := 00000000H
CRC32 instruction for 32-bit source operand and 32-bit destination operand:
TEMP1[31-0] := BIT_REFLECT32 (SRC[31-0])
TEMP2[31-0] := BIT_REFLECT32 (DEST[31-0])
TEMP3[63-0] := TEMP1[31-0] « 32
TEMP4[63-0] := TEMP2[31-0] « 32
TEMP5[63-0] := TEMP3[63-0] XOR TEMP4[63-0]
TEMP6[31-0] := TEMP5[63-0] MOD2 11EDC6F41H
DEST[31-0] := BIT_REFLECT (TEMP6[31-0])
CRC32 instruction for 16-bit source operand and 32-bit destination operand:
TEMP1[15-0] := BIT_REFLECT16 (SRC[15-0])
TEMP2[31-0] := BIT_REFLECT32 (DEST[31-0])
TEMP3[47-0] := TEMP1[15-0] « 32
TEMP4[47-0] := TEMP2[31-0] « 16
TEMP5[47-0] := TEMP3[47-0] XOR TEMP4[47-0]
TEMP6[31-0] := TEMP5[47-0] MOD2 11EDC6F41H
DEST[31-0] := BIT_REFLECT (TEMP6[31-0])
CRC32 instruction for 8-bit source operand and 64-bit destination operand:
TEMP1[7-0] := BIT_REFLECT8(SRC[7-0])
TEMP2[31-0] := BIT_REFLECT32 (DEST[31-0])
TEMP3[39-0] := TEMP1[7-0] « 32
TEMP4[39-0] := TEMP2[31-0] « 8
TEMP5[39-0] := TEMP3[39-0] XOR TEMP4[39-0]
TEMP6[31-0] := TEMP5[39-0] MOD2 11EDC6F41H
DEST[31-0] := BIT_REFLECT (TEMP6[31-0])
DEST[63-32] := 00000000H
CRC32 instruction for 8-bit source operand and 32-bit destination operand:
TEMP1[7-0] := BIT_REFLECT8(SRC[7-0])
TEMP2[31-0] := BIT_REFLECT32 (DEST[31-0])
TEMP3[39-0] := TEMP1[7-0] « 32
TEMP4[39-0] := TEMP2[31-0] « 8
TEMP5[39-0] := TEMP3[39-0] XOR TEMP4[39-0]
TEMP6[31-0] := TEMP5[39-0] MOD2 11EDC6F41H
DEST[31-0] := BIT_REFLECT (TEMP6[31-0])
</pre>
<h2 id="flags-affected">Flags Affected<a class="anchor" href="#flags-affected">
</a></h2>
<p>None.</p>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>unsigned int _mm_crc32_u8( unsigned int crc, unsigned char data )
</pre>
<pre>unsigned int _mm_crc32_u16( unsigned int crc, unsigned short data )
</pre>
<pre>unsigned int _mm_crc32_u32( unsigned int crc, unsigned int data )
</pre>
<pre>unsigned __int64 _mm_crc32_u64( unsigned __int64 crc, unsigned __int64 data )
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="protected-mode-exceptions">Protected Mode Exceptions<a class="anchor" href="#protected-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP(0)</td>
<td>If a memory operand effective address is outside the CS, DS, ES, FS or GS segments.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF</td>
<td>(fault-code) For a page fault.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td rowspan="2">#UD</td>
<td>If CPUID.01H:ECX.SSE4_2[Bit 20] = 0.</td></tr>
<tr>
<td>If LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="real-address-mode-exceptions">Real-Address Mode Exceptions<a class="anchor" href="#real-address-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP(0)</td>
<td>If any part of the operand lies outside of the effective address space from 0 to 0FFFFH.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td rowspan="2">#UD</td>
<td>If CPUID.01H:ECX.SSE4_2[Bit 20] = 0.</td></tr>
<tr>
<td>If LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="virtual-8086-mode-exceptions">Virtual 8086 Mode Exceptions<a class="anchor" href="#virtual-8086-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP(0)</td>
<td>If any part of the operand lies outside of the effective address space from 0 to 0FFFFH.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory operand effective address is outside the SS segment limit.</td></tr>
<tr>
<td>#PF</td>
<td>(fault-code) For a page fault.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made.</td></tr>
<tr>
<td rowspan="2">#UD</td>
<td>If CPUID.01H:ECX.SSE4_2[Bit 20] = 0.</td></tr>
<tr>
<td>If LOCK prefix is used.</td></tr></table>
<h2 class="exceptions" id="compatibility-mode-exceptions">Compatibility Mode Exceptions<a class="anchor" href="#compatibility-mode-exceptions">
</a></h2>
<p>Same exceptions as in Protected Mode.</p>
<h2 class="exceptions" id="64-bit-mode-exceptions">64-Bit Mode Exceptions<a class="anchor" href="#64-bit-mode-exceptions">
</a></h2>
<table>
<tr>
<td>#GP(0)</td>
<td>If the memory address is in a non-canonical form.</td></tr>
<tr>
<td>#SS(0)</td>
<td>If a memory address referencing the SS segment is in a non-canonical form.</td></tr>
<tr>
<td>#PF</td>
<td>(fault-code) For a page fault.</td></tr>
<tr>
<td>#AC(0)</td>
<td>If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.</td></tr>
<tr>
<td rowspan="2">#UD</td>
<td>If CPUID.01H:ECX.SSE4_2[Bit 20] = 0.</td></tr>
<tr>
<td>If LOCK prefix is used.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

265
x86/cvtdq2pd.html Normal file
View file

@ -0,0 +1,265 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CVTDQ2PD
— Convert Packed Doubleword Integers to Packed Double Precision Floating-PointValues</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CVTDQ2PD
— Convert Packed Doubleword Integers to Packed Double Precision Floating-PointValues</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op / En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>F3 0F E6 /r CVTDQ2PD xmm1, xmm2/m64</td>
<td>A</td>
<td>V/V</td>
<td>SSE2</td>
<td>Convert two packed signed doubleword integers from xmm2/mem to two packed double precision floating-point values in xmm1.</td></tr>
<tr>
<td>VEX.128.F3.0F.WIG E6 /r VCVTDQ2PD xmm1, xmm2/m64</td>
<td>A</td>
<td>V/V</td>
<td>AVX</td>
<td>Convert two packed signed doubleword integers from xmm2/mem to two packed double precision floating-point values in xmm1.</td></tr>
<tr>
<td>VEX.256.F3.0F.WIG E6 /r VCVTDQ2PD ymm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>AVX</td>
<td>Convert four packed signed doubleword integers from xmm2/mem to four packed double precision floating-point values in ymm1.</td></tr>
<tr>
<td>EVEX.128.F3.0F.W0 E6 /r VCVTDQ2PD xmm1 {k1}{z}, xmm2/m64/m32bcst</td>
<td>B</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Convert 2 packed signed doubleword integers from xmm2/m64/m32bcst to eight packed double precision floating-point values in xmm1 with writemask k1.</td></tr>
<tr>
<td>EVEX.256.F3.0F.W0 E6 /r VCVTDQ2PD ymm1 {k1}{z}, xmm2/m128/m32bcst</td>
<td>B</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Convert 4 packed signed doubleword integers from xmm2/m128/m32bcst to 4 packed double precision floating-point values in ymm1 with writemask k1.</td></tr>
<tr>
<td>EVEX.512.F3.0F.W0 E6 /r VCVTDQ2PD zmm1 {k1}{z}, ymm2/m256/m32bcst</td>
<td>B</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Convert eight packed signed doubleword integers from ymm2/m256/m32bcst to eight packed double precision floating-point values in zmm1 with writemask k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>Half</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Converts two, four or eight packed signed doubleword integers in the source operand (the second operand) to two, four or eight packed double precision floating-point values in the destination operand (the first operand).</p>
<p>EVEX encoded versions: The source operand can be a YMM/XMM/XMM (low 64 bits) register, a 256/128/64-bit memory location or a 256/128/64-bit vector broadcasted from a 32-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with writemask k1. Attempt to encode this instruction with EVEX embedded rounding is ignored.</p>
<p>VEX.256 encoded version: The source operand is an XMM register or 128- bit memory location. The destination operand is a YMM register.</p>
<p>VEX.128 encoded version: The source operand is an XMM register or 64- bit memory location. The destination operand is a XMM register. The upper Bits (MAXVL-1:128) of the corresponding ZMM register destination are zeroed.</p>
<p>128-bit Legacy SSE version: The source operand is an XMM register or 64- bit memory location. The destination operand is an XMM register. The upper Bits (MAXVL-1:128) of the corresponding ZMM register destination are unmodified.</p>
<p>VEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions will #UD.</p>
<figure id="fig-3-11">
<svg style="width: 475.77604799999995pt; height: 142.12797599999985pt" viewBox="101.24000000000001 0.0 401.48004 123.43997999999988">
<g xmlns="http://www.w3.org/2000/svg" style="stroke: none; fill: none">
<rect height="117.48" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="103.74000000000001" y="0.479979999999955"></rect>
<rect height="117.48" style="fill: rgb(0%, 0%, 0%)" width="0.48004" x="499.74" y="0.479979999999955"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="396.48" x="103.74000000000001" y="0.0"></rect>
<rect height="0.48004" style="fill: rgb(0%, 0%, 0%)" width="396.48" x="103.74000000000001" y="117.95993999999985"></rect>
<path d="M 219.18 83.81997999999999 L 220.8 87.77998000000002 L 221.04000000000002 88.43997999999999 L 220.38 88.49997999999994 L 204.9 90.17998 L 203.16 90.35997999999995 L 204.54000000000002 89.27998000000002 L 216.78 79.61997999999994 L 217.32 79.19997999999998 L 217.56 79.79998 L 217.44 80.39998000000003 L 205.20000000000002 90.05998 L 204.54000000000002 89.27998000000002 L 204.84 89.15998000000002 L 220.32 87.47997999999995 L 220.38 88.49997999999994 L 219.9 88.13998000000004 L 218.28 84.17998" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 217.56 79.79997999999989 L 219.18 83.81997999999987 L 218.28 84.17997999999989 L 216.66 80.1599799999999" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 218.76 83.99997999999994 L 220.38 87.95997999999997 L 204.89999999999998 89.63997999999992 L 217.14 79.97997999999995" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 335.28000000000003 37.31997999999999 L 334.50000000000006 35.45997999999997 L 218.88000000000002 82.85997999999995 L 219.66000000000003 84.71997999999996" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="37.14" x="429.36" y="22.85997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="37.38" x="429.36" y="22.620000000000005"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="466.26" y="22.85997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="37.38" x="429.12" y="36.120000000000005"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="429.12" y="22.61997999999994"></rect>
<path d="M 431.70000000000005 75.05997999999988 L 431.88000000000005 75.11997999999983 L 432.12000000000006 75.17997999999989 C 433.1 74.98597999999993 433.446 73.89997999999991 432.6 73.31997999999987L 432.4200000000001 73.19997999999987 L 432.24000000000007 73.13997999999992 C 431.266 72.99497999999983 430.617 74.2709799999999 431.40000000000003 74.87997999999993L 431.52000000000004 74.99997999999994 L 431.70000000000005 75.05997999999988" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 432.06 74.21997999999996 L 436.08 75.71997999999996 L 436.68 75.95997999999997 L 436.26 76.49997999999994 L 426.96 89.03998000000001 L 425.94 90.41998000000001 L 426.06 88.67998 L 427.32 73.13997999999992 L 427.44 72.47997999999995 L 428.04 72.71997999999996 L 428.34 73.19997999999998 L 427.08 88.73997999999995 L 426.06 88.67998 L 426.18 88.37997999999993 L 435.48 75.83997999999997 L 436.26 76.49997999999994 L 435.72 76.61997999999994 L 431.7 75.11997999999994" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 428.04 72.71997999999996 L 432.06 74.21997999999996 L 431.70000000000005 75.11997999999994 L 427.68 73.61997999999994" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 431.88 74.63997999999992 L 435.9 76.13997999999992 L 426.6 88.67997999999989 L 427.86 73.13997999999992" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 447.18 36.719979999999964 L 445.32 35.99997999999994 L 431.16 73.79998 L 433.02 74.51997999999992" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 366.66 78.83997999999985 L 366.78000000000003 78.95997999999986 L 366.96000000000004 79.0199799999998 C 367.836 79.46497999999985 368.76300000000003 78.32597999999984 368.16 77.57997999999986L 368.04 77.3999799999998 L 367.92 77.27997999999991 L 367.56 77.1599799999999 L 367.32000000000005 77.09997999999985 L 366.78000000000003 77.27997999999991 L 366.66 77.3999799999998 C 366.351 77.8859799999999 366.35 77.7269799999998 366.36 78.29997999999989L 366.42 78.47997999999984 L 366.66 78.83997999999985" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 367.38 78.11997999999994 L 370.38 81.17997999999989 L 370.8 81.71997999999996 L 370.26 82.01997999999992 L 356.64 89.57997999999998 L 355.08 90.4199799999999 L 355.92 88.85997999999995 L 363.48 75.17997999999989 L 363.84 74.63997999999992 L 364.32 75.05998 L 364.38 75.6599799999999 L 356.82 89.33997999999997 L 355.92 88.85997999999995 L 356.15999999999997 88.67997999999989 L 369.78 81.11997999999994 L 370.26 82.01997999999992 L 369.65999999999997 81.89997999999991 L 366.65999999999997 78.83997999999997" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 364.32 75.05997999999988 L 367.38 78.11997999999983 L 366.65999999999997 78.83997999999985 L 363.59999999999997 75.77997999999991" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 367.02 78.47997999999995 L 370.02 81.5399799999999 L 356.4 89.09997999999996 L 363.96 75.41998000000001" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 409.86 37.07997999999998 L 408.48 35.69997999999998 L 366.66 77.45997999999997 L 368.04 78.83997999999997" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 294.90000000000003 82.01997999999992 L 295.02000000000004 82.19997999999987 L 295.14000000000004 82.31997999999987 C 295.867 82.97597999999994 297.043 82.05997999999988 296.70000000000005 81.17997999999989L 296.58000000000004 80.99997999999994 L 296.46000000000004 80.87997999999993 C 295.939 80.07897999999989 294.54600000000005 80.78897999999992 294.78000000000003 81.6599799999999L 294.78000000000003 81.83997999999997 L 294.90000000000003 82.01997999999992" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 295.74 81.5399799999999 L 297.90000000000003 85.25997999999993 L 298.2 85.85997999999995 L 297.54 85.97997999999995 L 282.54 89.87997999999993 L 280.8 90.35997999999995 L 282.0 89.09997999999985 L 292.68 77.81997999999987 L 293.16 77.27997999999991 L 293.52 77.87997999999993 L 293.46000000000004 78.47997999999995 L 282.78000000000003 89.75997999999993 L 282.0 89.09997999999985 L 282.3 88.9199799999999 L 297.36 85.01997999999992 L 297.54 85.97997999999995 L 297.0 85.79997999999989 L 294.84000000000003 82.07997999999986" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 293.52 77.87997999999993 L 295.74 81.5399799999999 L 294.84 82.07997999999998 L 292.62 78.4199799999999" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 295.32 81.77997999999991 L 297.48 85.49997999999994 L 282.42 89.39997999999991 L 293.09999999999997 78.11997999999994" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 372.54 37.19997999999998 L 371.52000000000004 35.51998000000003 L 295.26 80.69997999999998 L 296.28000000000003 82.37997999999993" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="37.14" x="392.22" y="22.85997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="37.38" x="392.22" y="22.620000000000005"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="429.12" y="22.85997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="37.38" x="391.98" y="36.120000000000005"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="391.98" y="22.61997999999994"></rect>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="37.08" x="355.14" y="22.85997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="37.32" x="355.14" y="22.620000000000005"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="391.98" y="22.85997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="37.32" x="354.90000000000003" y="36.120000000000005"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="354.90000000000003" y="22.61997999999994"></rect>
<rect height="13.86" style="fill: rgb(100%, 100%, 100%)" width="38.64" x="316.5" y="22.499979999999937"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="38.88" x="316.5" y="22.25999999999999"></rect>
<rect height="14.100000000000001" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="354.90000000000003" y="22.499979999999937"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="38.88" x="316.26" y="36.120000000000005"></rect>
<rect height="14.100000000000001" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="316.26" y="22.259979999999928"></rect>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="74.22" x="391.68" y="90.35997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="74.46000000000001" x="391.68" y="90.12"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="465.66" y="90.35997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="74.46000000000001" x="391.44" y="103.62"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="391.44" y="90.11997999999994"></rect>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="74.28" x="168.9" y="90.35997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="74.52" x="168.9" y="90.12"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="242.94" y="90.35997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="74.52" x="168.66" y="103.62"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="168.66" y="90.11997999999994"></rect>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="84.06" x="233.34" y="90.35997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="84.30000000000001" x="233.34" y="90.12"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="317.16" y="90.35997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="84.30000000000001" x="233.10000000000002" y="103.62"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="233.10000000000002" y="90.11997999999994"></rect>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="74.28" x="317.40000000000003" y="90.35997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="74.52" x="317.40000000000003" y="90.12"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="391.44" y="90.35997999999995"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="74.52" x="317.16" y="103.62"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="317.16" y="90.11997999999994"></rect>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.414416800000026pt; fill: #000" textLength="9.172166699999991" x="329.04" y="32.970877799999926">X3</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.414416800000026pt; fill: #000" textLength="83.45216670000002" x="366.18" y="32.970877799999926">X2 X1 X0</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.414416800000026pt; fill: #000" textLength="15.82151632" x="149.45976564" y="34.59138643999995">SRC</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.414416800000026pt; fill: #000" textLength="9.23248510000002" x="192.54" y="100.47087779999993">X3</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.414416800000026pt; fill: #000" textLength="9.172166699999991" x="276.90000000000003" y="100.47087779999993">X2</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.80058169554718pt; fill: #000" textLength="9.904999895989476" x="351.24" y="100.5438892632261">X1</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.80058169554718pt; fill: #000" textLength="9.844471945989483" x="425.399176" y="100.5438892632261">X0</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.414416800000026pt; fill: #000" textLength="20.024954820000005" x="149.22" y="103.83087779999994">DEST</text></g></svg>
<figcaption><a href='cvtdq2pd.html#fig-3-11'>Figure 3-11</a>. CVTDQ2PD (VEX.256 encoded version)</figcaption></figure>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="vcvtdq2pd--evex-encoded-versions--when-src-operand-is-a-register">VCVTDQ2PD (EVEX Encoded Versions) When SRC Operand is a Register<a class="anchor" href="#vcvtdq2pd--evex-encoded-versions--when-src-operand-is-a-register">
</a></h3>
<pre>(KL, VL) = (2, 128), (4, 256), (8, 512)
FOR j := 0 TO KL-1
i := j * 64
k := j * 32
IF k1[j] OR *no writemask*
THEN DEST[i+63:i] :=
Convert_Integer_To_Double_Precision_Floating_Point(SRC[k+31:k])
ELSE
IF *merging-masking* ; merging-masking
THEN *DEST[i+63:i] remains unchanged*
ELSE ; zeroing-masking
DEST[i+63:i] := 0
FI
FI;
ENDFOR
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="vcvtdq2pd--evex-encoded-versions--when-src-operand-is-a-memory-source">VCVTDQ2PD (EVEX Encoded Versions) When SRC Operand is a Memory Source<a class="anchor" href="#vcvtdq2pd--evex-encoded-versions--when-src-operand-is-a-memory-source">
</a></h3>
<pre>(KL, VL) = (2, 128), (4, 256), (8, 512)
FOR j := 0 TO KL-1
i := j * 64
k := j * 32
IF k1[j] OR *no writemask*
THEN
IF (EVEX.b = 1)
THEN
DEST[i+63:i] :=
Convert_Integer_To_Double_Precision_Floating_Point(SRC[31:0])
ELSE
DEST[i+63:i] :=
Convert_Integer_To_Double_Precision_Floating_Point(SRC[k+31:k])
FI;
ELSE
IF *merging-masking* ; merging-masking
THEN *DEST[i+63:i] remains unchanged*
ELSE ; zeroing-masking
DEST[i+63:i] := 0
FI
FI;
ENDFOR
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="vcvtdq2pd--vex-256-encoded-version-">VCVTDQ2PD (VEX.256 Encoded Version)<a class="anchor" href="#vcvtdq2pd--vex-256-encoded-version-">
</a></h3>
<pre>DEST[63:0] := Convert_Integer_To_Double_Precision_Floating_Point(SRC[31:0])
DEST[127:64] := Convert_Integer_To_Double_Precision_Floating_Point(SRC[63:32])
DEST[191:128] := Convert_Integer_To_Double_Precision_Floating_Point(SRC[95:64])
DEST[255:192] := Convert_Integer_To_Double_Precision_Floating_Point(SRC[127:96)
DEST[MAXVL-1:256] := 0
</pre>
<h3 id="vcvtdq2pd--vex-128-encoded-version-">VCVTDQ2PD (VEX.128 Encoded Version)<a class="anchor" href="#vcvtdq2pd--vex-128-encoded-version-">
</a></h3>
<pre>DEST[63:0] := Convert_Integer_To_Double_Precision_Floating_Point(SRC[31:0])
DEST[127:64] := Convert_Integer_To_Double_Precision_Floating_Point(SRC[63:32])
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="cvtdq2pd--128-bit-legacy-sse-version-">CVTDQ2PD (128-bit Legacy SSE Version)<a class="anchor" href="#cvtdq2pd--128-bit-legacy-sse-version-">
</a></h3>
<pre>DEST[63:0] := Convert_Integer_To_Double_Precision_Floating_Point(SRC[31:0])
DEST[127:64] := Convert_Integer_To_Double_Precision_Floating_Point(SRC[63:32])
DEST[MAXVL-1:128] (unmodified)
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>VCVTDQ2PD __m512d _mm512_cvtepi32_pd( __m256i a);
</pre>
<pre>VCVTDQ2PD __m512d _mm512_mask_cvtepi32_pd( __m512d s, __mmask8 k, __m256i a);
</pre>
<pre>VCVTDQ2PD __m512d _mm512_maskz_cvtepi32_pd( __mmask8 k, __m256i a);
</pre>
<pre>VCVTDQ2PD __m256d _mm256_cvtepi32_pd (__m128i src);
</pre>
<pre>VCVTDQ2PD __m256d _mm256_mask_cvtepi32_pd( __m256d s, __mmask8 k, __m256i a);
</pre>
<pre>VCVTDQ2PD __m256d _mm256_maskz_cvtepi32_pd( __mmask8 k, __m256i a);
</pre>
<pre>VCVTDQ2PD __m128d _mm_mask_cvtepi32_pd( __m128d s, __mmask8 k, __m128i a);
</pre>
<pre>VCVTDQ2PD __m128d _mm_maskz_cvtepi32_pd( __mmask8 k, __m128i a);
</pre>
<pre>CVTDQ2PD __m128d _mm_cvtepi32_pd (__m128i src)
</pre>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>VEX-encoded instructions, see <span class="not-imported">Table 2-22</span>, “Type 5 Class Exception Conditions.”</p>
<p>EVEX-encoded instructions, see <span class="not-imported">Table 2-51</span>, “Type E5 Class Exception Conditions.”</p>
<p>Additionally:</p>
<table>
<tr>
<td>#UD</td>
<td>If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

214
x86/cvtdq2ps.html Normal file
View file

@ -0,0 +1,214 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CVTDQ2PS
— Convert Packed Doubleword Integers to Packed Single Precision Floating-PointValues</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CVTDQ2PS
— Convert Packed Doubleword Integers to Packed Single Precision Floating-PointValues</h1>
<table>
<tr>
<th>Opcode Instruction</th>
<th>Op / En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>NP 0F 5B /r CVTDQ2PS xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>SSE2</td>
<td>Convert four packed signed doubleword integers from xmm2/mem to four packed single precision floating-point values in xmm1.</td></tr>
<tr>
<td>VEX.128.0F.WIG 5B /r VCVTDQ2PS xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>AVX</td>
<td>Convert four packed signed doubleword integers from xmm2/mem to four packed single precision floating-point values in xmm1.</td></tr>
<tr>
<td>VEX.256.0F.WIG 5B /r VCVTDQ2PS ymm1, ymm2/m256</td>
<td>A</td>
<td>V/V</td>
<td>AVX</td>
<td>Convert eight packed signed doubleword integers from ymm2/mem to eight packed single precision floating-point values in ymm1.</td></tr>
<tr>
<td>EVEX.128.0F.W0 5B /r VCVTDQ2PS xmm1 {k1}{z}, xmm2/m128/m32bcst</td>
<td>B</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Convert four packed signed doubleword integers from xmm2/m128/m32bcst to four packed single precision floating-point values in xmm1with writemask k1.</td></tr>
<tr>
<td>EVEX.256.0F.W0 5B /r VCVTDQ2PS ymm1 {k1}{z}, ymm2/m256/m32bcst</td>
<td>B</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Convert eight packed signed doubleword integers from ymm2/m256/m32bcst to eight packed single precision floating-point values in ymm1with writemask k1.</td></tr>
<tr>
<td>EVEX.512.0F.W0 5B /r VCVTDQ2PS zmm1 {k1}{z}, zmm2/m512/m32bcst{er}</td>
<td>B</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Convert sixteen packed signed doubleword integers from zmm2/m512/m32bcst to sixteen packed single precision floating-point values in zmm1with writemask k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>Full</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Converts four, eight or sixteen packed signed doubleword integers in the source operand to four, eight or sixteen packed single precision floating-point values in the destination operand.</p>
<p>EVEX encoded versions: The source operand can be a ZMM/YMM/XMM register, a 512/256/128-bit memory location or a 512/256/128-bit vector broadcasted from a 32-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with writemask k1.</p>
<p>VEX.256 encoded version: The source operand is a YMM register or 256- bit memory location. The destination operand is a YMM register. Bits (MAXVL-1:256) of the corresponding register destination are zeroed.</p>
<p>VEX.128 encoded version: The source operand is an XMM register or 128- bit memory location. The destination operand is a XMM register. The upper bits (MAXVL-1:128) of the corresponding register destination are zeroed.</p>
<p>128-bit Legacy SSE version: The source operand is an XMM register or 128- bit memory location. The destination operand is an XMM register. The upper Bits (MAXVL-1:128) of the corresponding register destination are unmodified.</p>
<p>VEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions will #UD.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="vcvtdq2ps--evex-encoded-versions--when-src-operand-is-a-register">VCVTDQ2PS (EVEX Encoded Versions) When SRC Operand is a Register<a class="anchor" href="#vcvtdq2ps--evex-encoded-versions--when-src-operand-is-a-register">
</a></h3>
<pre>(KL, VL) = (4, 128), (8, 256), (16, 512)
IF (VL = 512) AND (EVEX.b = 1)
THEN
SET_ROUNDING_MODE_FOR_THIS_INSTRUCTION(EVEX.RC); ; refer to <span class="not-imported">Table 15-4</span> in the Intel<sup>®</sup> 64 and IA-32 Architectures
Software Developers Manual, Volume 1
ELSE
SET_ROUNDING_MODE_FOR_THIS_INSTRUCTION(MXCSR.RC); ; refer to <span class="not-imported">Table 15-4</span> in the Intel<sup>®</sup> 64 and IA-32 Architectures
Software Developers Manual, Volume 1
FI;
FOR j := 0 TO KL-1
i := j * 32
IF k1[j] OR *no writemask*
THEN DEST[i+31:i] :=
Convert_Integer_To_Single_Precision_Floating_Point(SRC[i+31:i])
ELSE
IF *merging-masking* ; merging-masking
THEN *DEST[i+31:i] remains unchanged*
ELSE ; zeroing-masking
DEST[i+31:i] := 0
FI
FI;
ENDFOR
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="vcvtdq2ps--evex-encoded-versions--when-src-operand-is-a-memory-source">VCVTDQ2PS (EVEX Encoded Versions) When SRC Operand is a Memory Source<a class="anchor" href="#vcvtdq2ps--evex-encoded-versions--when-src-operand-is-a-memory-source">
</a></h3>
<pre>(KL, VL) = (4, 128), (8, 256), (16, 512)
FOR j := 0 TO KL-1
i := j * 32
IF k1[j] OR *no writemask*
THEN
IF (EVEX.b = 1)
THEN
DEST[i+31:i] :=
Convert_Integer_To_Single_Precision_Floating_Point(SRC[31:0])
ELSE
DEST[i+31:i] :=
Convert_Integer_To_Single_Precision_Floating_Point(SRC[i+31:i])
FI;
ELSE
IF *merging-masking* ; merging-masking
THEN *DEST[i+31:i] remains unchanged*
ELSE ; zeroing-masking
DEST[i+31:i] := 0
FI
FI;
ENDFOR
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="vcvtdq2ps--vex-256-encoded-version-">VCVTDQ2PS (VEX.256 Encoded Version)<a class="anchor" href="#vcvtdq2ps--vex-256-encoded-version-">
</a></h3>
<pre>DEST[31:0] := Convert_Integer_To_Single_Precision_Floating_Point(SRC[31:0])
DEST[63:32] := Convert_Integer_To_Single_Precision_Floating_Point(SRC[63:32])
DEST[95:64] := Convert_Integer_To_Single_Precision_Floating_Point(SRC[95:64])
DEST[127:96] := Convert_Integer_To_Single_Precision_Floating_Point(SRC[127:96)
DEST[159:128] := Convert_Integer_To_Single_Precision_Floating_Point(SRC[159:128])
DEST[191:160] := Convert_Integer_To_Single_Precision_Floating_Point(SRC[191:160])
DEST[223:192] := Convert_Integer_To_Single_Precision_Floating_Point(SRC[223:192])
DEST[255:224] := Convert_Integer_To_Single_Precision_Floating_Point(SRC[255:224)
DEST[MAXVL-1:256] := 0
</pre>
<h3 id="vcvtdq2ps--vex-128-encoded-version-">VCVTDQ2PS (VEX.128 Encoded Version)<a class="anchor" href="#vcvtdq2ps--vex-128-encoded-version-">
</a></h3>
<pre>DEST[31:0] := Convert_Integer_To_Single_Precision_Floating_Point(SRC[31:0])
DEST[63:32] := Convert_Integer_To_Single_Precision_Floating_Point(SRC[63:32])
DEST[95:64] := Convert_Integer_To_Single_Precision_Floating_Point(SRC[95:64])
DEST[127:96] := Convert_Integer_To_Single_Precision_Floating_Point(SRC[127z:96)
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="cvtdq2ps--128-bit-legacy-sse-version-">CVTDQ2PS (128-bit Legacy SSE Version)<a class="anchor" href="#cvtdq2ps--128-bit-legacy-sse-version-">
</a></h3>
<pre>DEST[31:0] := Convert_Integer_To_Single_Precision_Floating_Point(SRC[31:0])
DEST[63:32] := Convert_Integer_To_Single_Precision_Floating_Point(SRC[63:32])
DEST[95:64] := Convert_Integer_To_Single_Precision_Floating_Point(SRC[95:64])
DEST[127:96] := Convert_Integer_To_Single_Precision_Floating_Point(SRC[127z:96)
DEST[MAXVL-1:128] (unmodified)
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>VCVTDQ2PS __m512 _mm512_cvtepi32_ps( __m512i a);
</pre>
<pre>VCVTDQ2PS __m512 _mm512_mask_cvtepi32_ps( __m512 s, __mmask16 k, __m512i a);
</pre>
<pre>VCVTDQ2PS __m512 _mm512_maskz_cvtepi32_ps( __mmask16 k, __m512i a);
</pre>
<pre>VCVTDQ2PS __m512 _mm512_cvt_roundepi32_ps( __m512i a, int r);
</pre>
<pre>VCVTDQ2PS __m512 _mm512_mask_cvt_roundepi_ps( __m512 s, __mmask16 k, __m512i a, int r);
</pre>
<pre>VCVTDQ2PS __m512 _mm512_maskz_cvt_roundepi32_ps( __mmask16 k, __m512i a, int r);
</pre>
<pre>VCVTDQ2PS __m256 _mm256_mask_cvtepi32_ps( __m256 s, __mmask8 k, __m256i a);
</pre>
<pre>VCVTDQ2PS __m256 _mm256_maskz_cvtepi32_ps( __mmask8 k, __m256i a);
</pre>
<pre>VCVTDQ2PS __m128 _mm_mask_cvtepi32_ps( __m128 s, __mmask8 k, __m128i a);
</pre>
<pre>VCVTDQ2PS __m128 _mm_maskz_cvtepi32_ps( __mmask8 k, __m128i a);
</pre>
<pre>CVTDQ2PS __m256 _mm256_cvtepi32_ps (__m256i src)
</pre>
<pre>CVTDQ2PS __m128 _mm_cvtepi32_ps (__m128i src)
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>Precision.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>VEX-encoded instructions, see <span class="not-imported">Table 2-19</span>, “Type 2 Class Exception Conditions.”</p>
<p>EVEX-encoded instructions, see <span class="not-imported">Table 2-46</span>, “Type E2 Class Exception Conditions.”</p>
<p>Additionally:</p>
<table>
<tr>
<td>#UD</td>
<td>If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

289
x86/cvtpd2dq.html Normal file
View file

@ -0,0 +1,289 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CVTPD2DQ
— Convert Packed Double Precision Floating-Point Values to Packed DoublewordIntegers</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CVTPD2DQ
— Convert Packed Double Precision Floating-Point Values to Packed DoublewordIntegers</h1>
<table>
<tr>
<th>Opcode Instruction</th>
<th>Op / En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>F2 0F E6 /r CVTPD2DQ xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>SSE2</td>
<td>Convert two packed double precision floating-point values in xmm2/mem to two signed doubleword integers in xmm1.</td></tr>
<tr>
<td>VEX.128.F2.0F.WIG E6 /r VCVTPD2DQ xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>AVX</td>
<td>Convert two packed double precision floating-point values in xmm2/mem to two signed doubleword integers in xmm1.</td></tr>
<tr>
<td>VEX.256.F2.0F.WIG E6 /r VCVTPD2DQ xmm1, ymm2/m256</td>
<td>A</td>
<td>V/V</td>
<td>AVX</td>
<td>Convert four packed double precision floating-point values in ymm2/mem to four signed doubleword integers in xmm1.</td></tr>
<tr>
<td>EVEX.128.F2.0F.W1 E6 /r VCVTPD2DQ xmm1 {k1}{z}, xmm2/m128/m64bcst</td>
<td>B</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Convert two packed double precision floating-point values in xmm2/m128/m64bcst to two signed doubleword integers in xmm1 subject to writemask k1.</td></tr>
<tr>
<td>EVEX.256.F2.0F.W1 E6 /r VCVTPD2DQ xmm1 {k1}{z}, ymm2/m256/m64bcst</td>
<td>B</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Convert four packed double precision floating-point values in ymm2/m256/m64bcst to four signed doubleword integers in xmm1 subject to writemask k1.</td></tr>
<tr>
<td>EVEX.512.F2.0F.W1 E6 /r VCVTPD2DQ ymm1 {k1}{z}, zmm2/m512/m64bcst{er}</td>
<td>B</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Convert eight packed double precision floating-point values in zmm2/m512/m64bcst to eight signed doubleword integers in ymm1 subject to writemask k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>Full</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Converts packed double precision floating-point values in the source operand (second operand) to packed signed doubleword integers in the destination operand (first operand).</p>
<p>When a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register or the embedded rounding control bits. If a converted result cannot be represented in the destination format, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value (2<sup>w</sup>-1, where w represents the number of bits in the destination format) is returned.</p>
<p>EVEX encoded versions: The source operand is a ZMM/YMM/XMM register, a 512-bit memory location, or a 512-bit vector broadcasted from a 64-bit memory location. The destination operand is a ZMM/YMM/XMM register conditionally updated with writemask k1. The upper bits (MAXVL-1:256/128/64) of the corresponding destination are zeroed.</p>
<p>VEX.256 encoded version: The source operand is a YMM register or 256- bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding ZMM register destination are zeroed.</p>
<p>VEX.128 encoded version: The source operand is an XMM register or 128- bit memory location. The destination operand is a XMM register. The upper bits (MAXVL-1:64) of the corresponding ZMM register destination are zeroed.</p>
<p>128-bit Legacy SSE version: The source operand is an XMM register or 128- bit memory location. The destination operand is an XMM register. Bits[127:64] of the destination XMM register are zeroed. However, the upper bits (MAXVL-1:128) of the corresponding ZMM register destination are unmodified.</p>
<p>VEX.vvvv and EVEX.vvvv are reserved and must be 1111b, otherwise instructions will #UD.</p>
<figure id="fig-3-12">
<svg style="width: 475.848pt; height: 151.0559759999999pt" viewBox="100.7 0.0 401.54 130.87997999999993">
<g xmlns="http://www.w3.org/2000/svg" style="stroke: none; fill: none">
<rect height="124.92" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="103.2" y="0.479979999999955"></rect>
<rect height="124.92" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="499.20000000000005" y="0.479979999999955"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="396.54" x="103.2" y="0.0"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="396.54" x="103.2" y="125.39999999999998"></rect>
<path d="M 285.96 88.55997999999988 L 287.46 84.5399799999999 L 287.7 83.93997999999988 L 288.23999999999995 84.35997999999984 L 300.71999999999997 93.6599799999999 L 302.09999999999997 94.67997999999989 L 300.35999999999996 94.55997999999988 L 284.82 93.29997999999989 L 284.15999999999997 93.17997999999989 L 284.4 92.57997999999986 L 284.88 92.27997999999991 L 300.41999999999996 93.5399799999999 L 300.35999999999996 94.55997999999988 L 300.06 94.43997999999988 L 287.58 85.13997999999992 L 288.23999999999995 84.35997999999984 L 288.35999999999996 84.89997999999991 L 286.85999999999996 88.9199799999999" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 284.40000000000003 92.57997999999998 L 285.96000000000004 88.55998 L 286.86 88.91998000000001 L 285.3 92.93997999999999" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 286.38 88.73997999999995 L 287.88 84.71997999999996 L 300.36 94.01997999999992 L 284.82 92.75997999999993" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 167.46 43.07997999999998 L 166.74 44.93997999999999 L 285.54 89.51997999999992 L 286.26 87.65998000000002" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="37.08" x="393.24" y="94.61997999999994"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="37.32" x="393.24" y="94.38"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="430.08" y="94.61997999999994"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="37.32" x="393.0" y="107.88"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="393.0" y="94.37997999999993"></rect>
<path d="M 404.94 79.13997999999992 L 405.12 79.07997999999998 L 405.3 78.95997999999997 L 405.42 78.83997999999997 L 405.54 78.6599799999999 L 405.6 78.47997999999995 L 405.6 78.11997999999994 L 405.54 77.87997999999993 L 405.48 77.69997999999987 C 405.043 77.18997999999988 405.364 77.47197999999992 404.7 77.21997999999996L 404.52 77.21997999999996 L 404.28 77.27997999999991 L 404.1 77.33997999999997 L 403.98 77.45997999999997 L 403.8 77.57997999999998 L 403.68 77.93997999999988 L 403.62 78.11997999999994 L 403.62 78.35997999999995 L 403.68 78.5399799999999 C 403.727 78.8329799999999 404.047 79.15297999999996 404.34 79.19997999999987L 404.76 79.19997999999987 L 404.94 79.13997999999992" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 404.58 78.17997999999989 L 408.65999999999997 76.85997999999984 L 409.32 76.73997999999983 L 409.38 77.33997999999985 L 410.03999999999996 92.93997999999988 L 410.09999999999997 94.67997999999989 L 409.08 93.23997999999983 L 400.26 80.33997999999985 L 399.9 79.79997999999989 L 400.5 79.55997999999988 L 401.09999999999997 79.79997999999989 L 409.91999999999996 92.69997999999987 L 409.08 93.23997999999983 L 409.02 92.99997999999994 L 408.35999999999996 77.39997999999991 L 409.38 77.33997999999985 L 409.02 77.87997999999993 L 404.94 79.19997999999987" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 400.5 79.55997999999988 L 404.58 78.17997999999989 L 404.94 79.19997999999987 L 400.86 80.57997999999986" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 404.76 78.6599799999999 L 408.84 77.33997999999985 L 409.5 92.93997999999988 L 400.68 80.0399799999999" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 394.20000000000005 43.67997999999989 L 392.34000000000003 44.27997999999991 L 403.74000000000007 78.5399799999999 L 405.6 77.93997999999988" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 361.86 82.73997999999995 L 362.1 82.37997999999993 C 362.41700000000003 81.45097999999996 361.553 80.67597999999998 360.66 81.11997999999994L 360.48 81.23997999999995 L 360.36 81.4199799999999 C 359.802 82.13897999999995 360.54 83.28797999999995 361.5 82.9199799999999L 361.68 82.85997999999995 L 361.86 82.73997999999995" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 361.20000000000005 81.95997999999997 L 364.32000000000005 79.07997999999998 L 364.80000000000007 78.65998000000002 L 365.1 79.25997999999993 L 372.18000000000006 93.11997999999994 L 372.96000000000004 94.67998 L 371.46000000000004 93.77998000000002 L 358.14000000000004 85.73997999999995 L 357.6 85.37997999999993 L 358.08000000000004 84.89998000000003 L 358.62000000000006 84.83997999999997 L 371.94000000000005 92.87997999999993 L 371.46000000000004 93.77998000000002 L 371.28000000000003 93.53998000000001 L 364.20000000000005 79.67998 L 365.1 79.25997999999993 L 364.98 79.85997999999995 L 361.86000000000007 82.73997999999995" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 358.08 84.89997999999991 L 361.2 81.95997999999986 L 361.85999999999996 82.73997999999995 L 358.74 85.67997999999989" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 361.5 82.31997999999999 L 364.62 79.43997999999999 L 371.7 93.29998 L 358.38 85.25998000000004" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 326.46 43.31997999999999 L 325.02 44.69997999999998 L 360.41999999999996 82.67998 L 361.85999999999996 81.29998" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 324.96 86.69997999999998 L 325.02 86.51998000000003 L 325.08 86.33997999999997 C 325.19 85.30097999999998 323.916 84.90098 323.34 85.55998L 323.21999999999997 85.73997999999995 L 323.15999999999997 85.91998000000001 C 322.789 86.83897999999999 323.96 87.66197999999997 324.71999999999997 86.99997999999994L 324.84 86.87997999999993 L 324.96 86.69997999999998" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 324.12 86.21997999999996 L 326.22 82.49997999999994 L 326.58 81.95997999999997 L 327.06 82.37997999999993 L 337.98 93.41998000000001 L 339.24 94.67998 L 337.56 94.25997999999993 L 322.38 90.71997999999996 L 321.72 90.59997999999996 L 322.02 89.99997999999994 L 322.56 89.75997999999993 L 337.74 93.29998 L 337.56 94.25997999999993 L 337.26 94.13997999999992 L 326.28000000000003 83.09997999999996 L 327.06 82.37997999999993 L 327.12 82.97997999999995 L 325.02 86.69997999999998" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 322.02 89.99997999999994 L 324.12 86.21997999999996 L 325.02 86.69997999999998 L 322.91999999999996 90.47997999999995" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 324.54 86.45997999999997 L 326.64000000000004 82.73997999999995 L 337.62 93.77998000000002 L 322.44 90.23997999999995" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 248.58 43.13997999999992 L 247.62 44.87997999999993 L 323.58000000000004 87.11997999999994 L 324.54 85.37997999999993" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="37.14" x="356.1" y="94.61997999999994"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="37.38" x="356.1" y="94.38"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="393.0" y="94.61997999999994"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="37.38" x="355.86" y="107.88"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="355.86" y="94.37997999999993"></rect>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="37.14" x="318.96" y="94.61997999999994"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="37.38" x="318.96" y="94.38"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="355.86" y="94.61997999999994"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="37.38" x="318.72" y="107.88"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="318.72" y="94.37997999999993"></rect>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="37.14" x="281.82" y="94.61997999999994"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="37.38" x="281.82" y="94.38"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="318.72" y="94.61997999999994"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="37.38" x="281.58" y="107.88"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="281.58" y="94.37997999999993"></rect>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="74.22" x="356.1" y="30.479979999999955"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="74.46000000000001" x="356.1" y="30.24000000000001"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="430.08" y="30.479979999999955"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="74.46000000000001" x="355.86" y="43.74000000000001"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="355.86" y="30.239979999999946"></rect>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="74.28" x="133.32" y="30.479979999999955"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="74.52" x="133.32" y="30.24000000000001"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="207.36" y="30.479979999999955"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="74.52" x="133.08" y="43.74000000000001"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="133.08" y="30.239979999999946"></rect>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="74.22" x="207.60000000000002" y="30.479979999999955"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="74.46000000000001" x="207.60000000000002" y="30.24000000000001"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="281.58" y="30.479979999999955"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="74.46000000000001" x="207.36" y="43.74000000000001"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="207.36" y="30.239979999999946"></rect>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="74.28" x="281.82" y="30.479979999999955"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="74.52" x="281.82" y="30.24000000000001"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="355.86" y="30.479979999999955"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="74.52" x="281.58" y="43.74000000000001"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="281.58" y="30.239979999999946"></rect>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="145.5" x="135.42000000000002" y="94.61997999999994"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="145.74" x="135.42000000000002" y="94.38"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="280.68" y="94.61997999999994"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="145.74" x="135.18" y="107.88"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="135.18" y="94.37997999999993"></rect>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.414416800000026pt; fill: #000" textLength="10.438853099999989" x="113.10000000000001" y="40.650871579999944">SR</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.414416800000026pt; fill: #000" textLength="9.17216670000002" x="156.96" y="40.650877799999876">X3</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.414416800000026pt; fill: #000" textLength="9.17216670000002" x="241.32" y="40.650877799999876">X2</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.80058169554718pt; fill: #000" textLength="9.904170745989461" x="315.66" y="40.723889263226056">X1</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.80058169554718pt; fill: #000" textLength="9.844471945989483" x="389.81834685000007" y="40.723889263226056">X0</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.414416800000026pt; fill: #000" textLength="20.02495481999999" x="113.10000000000001" y="104.73087779999992">DEST</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.414416800000001pt; fill: #000" textLength="4.192128800000006" x="204.9" y="104.73087779999989">0</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.414416800000026pt; fill: #000" textLength="120.5321667" x="292.92" y="104.73087779999992">X3 X2 X1 X0</text></g></svg>
<figcaption><a href='cvtpd2dq.html#fig-3-12'>Figure 3-12</a>. VCVTPD2DQ (VEX.256 encoded version)</figcaption></figure>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="vcvtpd2dq--evex-encoded-versions--when-src-operand-is-a-register">VCVTPD2DQ (EVEX Encoded Versions) When SRC Operand is a Register<a class="anchor" href="#vcvtpd2dq--evex-encoded-versions--when-src-operand-is-a-register">
</a></h3>
<pre>(KL, VL) = (2, 128), (4, 256), (8, 512)
IF (VL = 512) AND (EVEX.b = 1)
THEN
SET_ROUNDING_MODE_FOR_THIS_INSTRUCTION(EVEX.RC);
ELSE
SET_ROUNDING_MODE_FOR_THIS_INSTRUCTION(MXCSR.RC);
FI;
FOR j := 0 TO KL-1
i := j * 32
k := j * 64
IF k1[j] OR *no writemask*
THEN DEST[i+31:i] :=
Convert_Double_Precision_Floating_Point_To_Integer(SRC[k+63:k])
ELSE
IF *merging-masking* ; merging-masking
THEN *DEST[i+31:i] remains unchanged*
ELSE
; zeroing-masking
DEST[i+31:i] := 0
FI
FI;
ENDFOR
DEST[MAXVL-1:VL/2] := 0
</pre>
<h3 id="vcvtpd2dq--evex-encoded-versions--when-src-operand-is-a-memory-source">VCVTPD2DQ (EVEX Encoded Versions) When SRC Operand is a Memory Source<a class="anchor" href="#vcvtpd2dq--evex-encoded-versions--when-src-operand-is-a-memory-source">
</a></h3>
<pre>(KL, VL) = (2, 128), (4, 256), (8, 512)
FOR j := 0 TO KL-1
i := j * 32
k := j * 64
IF k1[j] OR *no writemask*
THEN
IF (EVEX.b = 1)
THEN
DEST[i+31:i] :=
Convert_Double_Precision_Floating_Point_To_Integer(SRC[63:0])
ELSE
DEST[i+31:i] :=
Convert_Double_Precision_Floating_Point_To_Integer(SRC[k+63:k])
FI;
ELSE
IF *merging-masking* ; merging-masking
THEN *DEST[i+31:i] remains unchanged*
ELSE ; zeroing-masking
DEST[i+31:i] := 0
FI
FI;
ENDFOR
DEST[MAXVL-1:VL/2] := 0
</pre>
<h3 id="vcvtpd2dq--vex-256-encoded-version-">VCVTPD2DQ (VEX.256 Encoded Version)<a class="anchor" href="#vcvtpd2dq--vex-256-encoded-version-">
</a></h3>
<pre>DEST[31:0] := Convert_Double_Precision_Floating_Point_To_Integer(SRC[63:0])
DEST[63:32] := Convert_Double_Precision_Floating_Point_To_Integer(SRC[127:64])
DEST[95:64] := Convert_Double_Precision_Floating_Point_To_Integer(SRC[191:128])
DEST[127:96] := Convert_Double_Precision_Floating_Point_To_Integer(SRC[255:192)
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="vcvtpd2dq--vex-128-encoded-version-">VCVTPD2DQ (VEX.128 Encoded Version)<a class="anchor" href="#vcvtpd2dq--vex-128-encoded-version-">
</a></h3>
<pre>DEST[31:0] := Convert_Double_Precision_Floating_Point_To_Integer(SRC[63:0])
DEST[63:32] := Convert_Double_Precision_Floating_Point_To_Integer(SRC[127:64])
DEST[MAXVL-1:64] := 0
</pre>
<h3 id="cvtpd2dq--128-bit-legacy-sse-version-">CVTPD2DQ (128-bit Legacy SSE Version)<a class="anchor" href="#cvtpd2dq--128-bit-legacy-sse-version-">
</a></h3>
<pre>DEST[31:0] := Convert_Double_Precision_Floating_Point_To_Integer(SRC[63:0])
DEST[63:32] := Convert_Double_Precision_Floating_Point_To_Integer(SRC[127:64])
DEST[127:64] := 0
DEST[MAXVL-1:128] (unmodified)
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>VCVTPD2DQ __m256i _mm512_cvtpd_epi32( __m512d a);
</pre>
<pre>VCVTPD2DQ __m256i _mm512_mask_cvtpd_epi32( __m256i s, __mmask8 k, __m512d a);
</pre>
<pre>VCVTPD2DQ __m256i _mm512_maskz_cvtpd_epi32( __mmask8 k, __m512d a);
</pre>
<pre>VCVTPD2DQ __m256i _mm512_cvt_roundpd_epi32( __m512d a, int r);
</pre>
<pre>VCVTPD2DQ __m256i _mm512_mask_cvt_roundpd_epi32( __m256i s, __mmask8 k, __m512d a, int r);
</pre>
<pre>VCVTPD2DQ __m256i _mm512_maskz_cvt_roundpd_epi32( __mmask8 k, __m512d a, int r);
</pre>
<pre>VCVTPD2DQ __m128i _mm256_mask_cvtpd_epi32( __m128i s, __mmask8 k, __m256d a);
</pre>
<pre>VCVTPD2DQ __m128i _mm256_maskz_cvtpd_epi32( __mmask8 k, __m256d a);
</pre>
<pre>VCVTPD2DQ __m128i _mm_mask_cvtpd_epi32( __m128i s, __mmask8 k, __m128d a);
</pre>
<pre>VCVTPD2DQ __m128i _mm_maskz_cvtpd_epi32( __mmask8 k, __m128d a);
</pre>
<pre>VCVTPD2DQ __m128i _mm256_cvtpd_epi32 (__m256d src)
</pre>
<pre>CVTPD2DQ __m128i _mm_cvtpd_epi32 (__m128d src)
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>Invalid, Precision.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>See <span class="not-imported">Table 2-19</span>, “Type 2 Class Exception Conditions.”</p>
<p>EVEX-encoded instructions, see <span class="not-imported">Table 2-46</span>, “Type E2 Class Exception Conditions.”</p>
<p>Additionally:</p>
<table>
<tr>
<td>#UD</td>
<td>If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

65
x86/cvtpd2pi.html Normal file
View file

@ -0,0 +1,65 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CVTPD2PI
— Convert Packed Double Precision Floating-Point Values to Packed Dword Integers</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CVTPD2PI
— Convert Packed Double Precision Floating-Point Values to Packed Dword Integers</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>66 0F 2D /r CVTPD2PI mm, xmm/m128</td>
<td>RM</td>
<td>V/V</td>
<td>SSE2</td>
<td>Convert two packed double precision floating-point values from xmm/m128 to two packed signed doubleword integers in mm.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RM</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Converts two packed double precision floating-point values in the source operand (second operand) to two packed signed doubleword integers in the destination operand (first operand).</p>
<p>The source operand can be an XMM register or a 128-bit memory location. The destination operand is an MMX technology register.</p>
<p>When a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register. If a converted result is larger than the maximum signed doubleword integer, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value (80000000H) is returned.</p>
<p>This instruction causes a transition from x87 FPU to MMX technology operation (that is, the x87 FPU top-of-stack pointer is set to 0 and the x87 FPU tag word is set to all 0s [valid]). If this instruction is executed while an x87 FPU floating-point exception is pending, the exception is handled before the CVTPD2PI instruction is executed.</p>
<p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>DEST[31:0] := Convert_Double_Precision_Floating_Point_To_Integer32(SRC[63:0]);
DEST[63:32] := Convert_Double_Precision_Floating_Point_To_Integer32(SRC[127:64]);
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>CVTPD1PI __m64 _mm_cvtpd_pi32(__m128d a)
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>Invalid, Precision.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>See <span class="not-imported">Table 23-4</span>, “Exception Conditions for Legacy SIMD/MMX Instructions with FP Exception and 16-Byte Alignment” in the Intel<sup>®</sup> 64 and IA-32 Architectures Software Developers Manual, Volume 3B.</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

286
x86/cvtpd2ps.html Normal file
View file

@ -0,0 +1,286 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CVTPD2PS
— Convert Packed Double Precision Floating-Point Values to Packed Single PrecisionFloating-Point Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CVTPD2PS
— Convert Packed Double Precision Floating-Point Values to Packed Single PrecisionFloating-Point Values</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op / En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>66 0F 5A /r CVTPD2PS xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>SSE2</td>
<td>Convert two packed double precision floating-point values in xmm2/mem to two single precision floating-point values in xmm1.</td></tr>
<tr>
<td>VEX.128.66.0F.WIG 5A /r VCVTPD2PS xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>AVX</td>
<td>Convert two packed double precision floating-point values in xmm2/mem to two single precision floating-point values in xmm1.</td></tr>
<tr>
<td>VEX.256.66.0F.WIG 5A /r VCVTPD2PS xmm1, ymm2/m256</td>
<td>A</td>
<td>V/V</td>
<td>AVX</td>
<td>Convert four packed double precision floating-point values in ymm2/mem to four single precision floating-point values in xmm1.</td></tr>
<tr>
<td>EVEX.128.66.0F.W1 5A /r VCVTPD2PS xmm1 {k1}{z}, xmm2/m128/m64bcst</td>
<td>B</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Convert two packed double precision floating-point values in xmm2/m128/m64bcst to two single precision floating-point values in xmm1with writemask k1.</td></tr>
<tr>
<td>EVEX.256.66.0F.W1 5A /r VCVTPD2PS xmm1 {k1}{z}, ymm2/m256/m64bcst</td>
<td>B</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Convert four packed double precision floating-point values in ymm2/m256/m64bcst to four single precision floating-point values in xmm1with writemask k1.</td></tr>
<tr>
<td>EVEX.512.66.0F.W1 5A /r VCVTPD2PS ymm1 {k1}{z}, zmm2/m512/m64bcst{er}</td>
<td>B</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Convert eight packed double precision floating-point values in zmm2/m512/m64bcst to eight single precision floating-point values in ymm1with writemask k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>Full</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Converts two, four or eight packed double precision floating-point values in the source operand (second operand) to two, four or eight packed single precision floating-point values in the destination operand (first operand).</p>
<p>When a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register or the embedded rounding control bits.</p>
<p>EVEX encoded versions: The source operand is a ZMM/YMM/XMM register, a 512/256/128-bit memory location, or a 512/256/128-bit vector broadcasted from a 64-bit memory location. The destination operand is a YMM/XMM/XMM (low 64-bits) register conditionally updated with writemask k1. The upper bits (MAXVL-1:256/128/64) of the corresponding destination are zeroed.</p>
<p>VEX.256 encoded version: The source operand is a YMM register or 256- bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding ZMM register destination are zeroed.</p>
<p>VEX.128 encoded version: The source operand is an XMM register or 128- bit memory location. The destination operand is a XMM register. The upper bits (MAXVL-1:64) of the corresponding ZMM register destination are zeroed.</p>
<p>128-bit Legacy SSE version: The source operand is an XMM register or 128- bit memory location. The destination operand is an XMM register. Bits[127:64] of the destination XMM register are zeroed. However, the upper Bits (MAXVL-1:128) of the corresponding ZMM register destination are unmodified.</p>
<p>VEX.vvvv and EVEX.vvvv are reserved and must be 1111b otherwise instructions will #UD.</p>
<figure id="fig-3-13">
<svg style="width: 475.77604799999995pt; height: 151.0559759999999pt" viewBox="101.24000000000001 0.0 401.48004 130.87997999999993">
<g xmlns="http://www.w3.org/2000/svg" style="stroke: none; fill: none">
<rect height="124.86" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="103.74000000000001" y="0.479979999999955"></rect>
<rect height="124.86" style="fill: rgb(0%, 0%, 0%)" width="0.48004" x="499.74" y="0.479979999999955"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="396.48" x="103.74000000000001" y="0.0"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="396.48" x="103.74000000000001" y="125.39999999999998"></rect>
<path d="M 286.44 86.21997999999996 L 287.94 82.19997999999998 L 288.18 81.59997999999996 L 288.71999999999997 82.01997999999992 L 301.26 91.31997999999999 L 302.64 92.33997999999997 L 300.9 92.21997999999996 L 285.36 90.95997999999997 L 284.7 90.83997999999997 L 284.94 90.23997999999995 L 285.42 89.93997999999999 L 300.96 91.19997999999998 L 300.9 92.21997999999996 L 300.6 92.09997999999996 L 288.06 82.79998 L 288.71999999999997 82.01997999999992 L 288.84 82.55998 L 287.34 86.57997999999998" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 284.94 90.23997999999995 L 286.44 86.21997999999996 L 287.34 86.57997999999998 L 285.84 90.59997999999996" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 286.86 86.39997999999991 L 288.36 82.37997999999993 L 300.90000000000003 91.67997999999989 L 285.36 90.4199799999999" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 167.94 40.79997999999989 L 167.22 42.659979999999905 L 286.02 87.17997999999989 L 286.74 85.31997999999987" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="37.14" x="393.72" y="92.27997999999991"></rect>
<rect height="0.48004" style="fill: rgb(0%, 0%, 0%)" width="37.38" x="393.72" y="92.03993999999989"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="430.62" y="92.27997999999991"></rect>
<rect height="0.48004" style="fill: rgb(0%, 0%, 0%)" width="37.38" x="393.48" y="105.53993999999989"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="393.48" y="92.0399799999999"></rect>
<path d="M 405.48 76.85997999999995 L 405.66 76.73997999999995 L 405.78000000000003 76.61997999999994 C 406.555 76.04697999999996 405.95500000000004 74.74797999999998 405.0 74.87997999999993L 404.64000000000004 74.99997999999994 L 404.46000000000004 75.11997999999994 L 404.34000000000003 75.29998 L 404.22 75.4199799999999 L 404.16 75.59997999999996 L 404.16 76.01997999999992 L 404.22 76.19997999999998 C 404.267 76.49297999999999 404.58700000000005 76.81297999999992 404.88 76.85997999999995L 405.06 76.9199799999999 L 405.24 76.85997999999995 L 405.48 76.85997999999995" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 405.12 75.89997999999991 L 409.2 74.51997999999992 L 409.86 74.39997999999991 L 409.92 74.99997999999994 L 410.58 90.59997999999996 L 410.64 92.33997999999997 L 409.62 90.89997999999991 L 400.8 77.99997999999994 L 400.44 77.45997999999986 L 401.04 77.21997999999996 L 401.64 77.45997999999986 L 410.46 90.35997999999995 L 409.62 90.89997999999991 L 409.56 90.6599799999999 L 408.9 75.05997999999988 L 409.92 74.99997999999994 L 409.56 75.5399799999999 L 405.48 76.9199799999999" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 401.04 77.21997999999996 L 405.12 75.89997999999991 L 405.48 76.91998000000001 L 401.40000000000003 78.23997999999995" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 405.3 76.37997999999993 L 409.38 74.99997999999994 L 410.04 90.59997999999996 L 401.22 77.69997999999998" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 394.68 41.399979999999914 L 392.82 41.99997999999994 L 404.22 76.19997999999987 L 406.08 75.59997999999996" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 362.34000000000003 80.39997999999991 L 362.52000000000004 80.21997999999996 L 362.58000000000004 80.0399799999999 L 362.64000000000004 79.85997999999995 L 362.70000000000005 79.67997999999989 L 362.64000000000004 79.49997999999994 L 362.64000000000004 79.31997999999987 L 362.52000000000004 79.13997999999992 L 362.40000000000003 78.95997999999986 L 362.28000000000003 78.83997999999997 L 362.1 78.71997999999996 L 361.92 78.6599799999999 L 361.50000000000006 78.6599799999999 L 361.14000000000004 78.77997999999991 L 360.84000000000003 79.07997999999986 L 360.66 79.61997999999994 L 360.72 79.79997999999989 L 360.72 79.97997999999995 L 360.96000000000004 80.33997999999997 C 361.30800000000005 80.70397999999989 361.42 80.6359799999999 361.86 80.63997999999992L 362.22 80.51997999999992 L 362.34000000000003 80.39997999999991" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 361.74 79.67997999999989 L 364.86 76.73997999999983 L 365.34000000000003 76.31997999999987 L 365.64 76.9199799999999 L 372.72 90.77997999999991 L 373.5 92.33997999999985 L 372.0 91.43997999999988 L 358.62 83.45997999999986 L 358.08 83.09997999999985 L 358.56 82.61997999999994 L 359.1 82.55997999999988 L 372.48 90.5399799999999 L 372.0 91.43997999999988 L 371.82 91.19997999999987 L 364.74 77.33997999999985 L 365.64 76.9199799999999 L 365.52 77.51997999999992 L 362.40000000000003 80.45997999999986" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 358.56 82.61997999999994 L 361.74 79.67997999999989 L 362.4 80.45997999999997 L 359.22 83.39997999999991" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 362.04 80.0399799999999 L 365.16 77.09997999999985 L 372.24 90.95997999999986 L 358.86 82.97997999999995" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 326.94 41.0399799999999 L 325.5 42.419979999999896 L 360.96 80.39997999999991 L 362.4 79.01997999999992" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 325.5 84.4199799999999 L 325.56 84.17997999999989 L 325.62 83.99997999999994 C 325.69 83.0689799999999 324.422 82.4969799999999 323.88 83.27997999999991L 323.76 83.39997999999991 L 323.64 83.57997999999986 C 323.454 84.67097999999987 324.353 85.19097999999985 325.2 84.71997999999985L 325.5 84.4199799999999" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 324.6 83.87997999999993 L 326.70000000000005 80.1599799999999 L 327.06 79.61997999999994 L 327.54 80.0399799999999 L 338.52000000000004 91.07997999999998 L 339.78000000000003 92.33997999999997 L 338.1 91.9199799999999 L 322.92 88.37997999999993 L 322.26000000000005 88.25997999999993 L 322.56 87.6599799999999 L 323.1 87.4199799999999 L 338.28000000000003 90.95997999999997 L 338.1 91.9199799999999 L 337.8 91.79997999999989 L 326.76000000000005 80.75997999999993 L 327.54 80.0399799999999 L 327.6 80.63997999999992 L 325.5 84.35997999999995" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 322.56 87.6599799999999 L 324.6 83.87997999999993 L 325.5 84.35997999999995 L 323.46 88.13997999999992" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<path d="M 325.02 84.11997999999994 L 327.12 80.39997999999991 L 338.15999999999997 91.43997999999999 L 322.97999999999996 87.89997999999991" style="fill: rgb(0%, 0%, 0%); fill-rule: evenodd"></path>
<path d="M 249.06 40.85997999999995 L 248.1 42.59997999999996 L 324.12 84.77997999999991 L 325.08 83.0399799999999" style="fill: rgb(0%, 0%, 0%); fill-rule: nonzero"></path>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="37.08" x="356.64" y="92.27997999999991"></rect>
<rect height="0.48004" style="fill: rgb(0%, 0%, 0%)" width="37.32" x="356.64" y="92.03993999999989"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="393.48" y="92.27997999999991"></rect>
<rect height="0.48004" style="fill: rgb(0%, 0%, 0%)" width="37.32" x="356.40000000000003" y="105.53993999999989"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="356.40000000000003" y="92.0399799999999"></rect>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="37.14" x="319.5" y="92.27997999999991"></rect>
<rect height="0.48004" style="fill: rgb(0%, 0%, 0%)" width="37.38" x="319.5" y="92.03993999999989"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="356.40000000000003" y="92.27997999999991"></rect>
<rect height="0.48004" style="fill: rgb(0%, 0%, 0%)" width="37.38" x="319.26" y="105.53993999999989"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="319.26" y="92.0399799999999"></rect>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="37.14" x="282.36" y="92.27997999999991"></rect>
<rect height="0.48004" style="fill: rgb(0%, 0%, 0%)" width="37.38" x="282.36" y="92.03993999999989"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.47998" x="319.26" y="92.27997999999991"></rect>
<rect height="0.48004" style="fill: rgb(0%, 0%, 0%)" width="37.38" x="282.12" y="105.53993999999989"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="282.12" y="92.0399799999999"></rect>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="74.22" x="356.64" y="28.199979999999982"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="74.46000000000001" x="356.64" y="27.959999999999923"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="430.62" y="28.19997999999987"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="74.46000000000001" x="356.40000000000003" y="41.45999999999992"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="356.40000000000003" y="27.959979999999973"></rect>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="74.22" x="133.86" y="28.199979999999982"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="74.46000000000001" x="133.86" y="27.959999999999923"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="207.84" y="28.19997999999987"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="74.46000000000001" x="133.62" y="41.45999999999992"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="133.62" y="27.959979999999973"></rect>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="74.28" x="208.08" y="28.199979999999982"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="74.52" x="208.08" y="27.959999999999923"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="282.12" y="28.19997999999987"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="74.52" x="207.84" y="41.45999999999992"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48" x="207.84" y="27.959979999999973"></rect>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="74.28" x="282.36" y="28.199979999999982"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="74.52" x="282.36" y="27.959999999999923"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="356.40000000000003" y="28.19997999999987"></rect>
<rect height="0.47998" style="fill: rgb(0%, 0%, 0%)" width="74.52" x="282.12" y="41.45999999999992"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="282.12" y="27.959979999999973"></rect>
<rect height="13.5" style="fill: rgb(100%, 100%, 100%)" width="145.56" x="135.9" y="92.27997999999991"></rect>
<rect height="0.48004" style="fill: rgb(0%, 0%, 0%)" width="145.8" x="135.9" y="92.03993999999989"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="281.22" y="92.27997999999991"></rect>
<rect height="0.48004" style="fill: rgb(0%, 0%, 0%)" width="145.8" x="135.66" y="105.53993999999989"></rect>
<rect height="13.74" style="fill: rgb(0%, 0%, 0%)" width="0.48001000000000005" x="135.66" y="92.0399799999999"></rect>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.414416800000026pt; fill: #000" textLength="10.438853099999989" x="113.58" y="38.311307159999956">SR</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.414416800000026pt; fill: #000" textLength="9.17216670000002" x="157.5" y="38.31087779999996">X3</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.414416800000026pt; fill: #000" textLength="9.17216670000002" x="241.86" y="38.31087779999996">X2</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.80058169554718pt; fill: #000" textLength="9.904170745989461" x="316.20000000000005" y="38.38388926322614">X1</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.80058169554718pt; fill: #000" textLength="9.844471945989483" x="390.3583468500001" y="38.38388926322614">X0</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.414416800000026pt; fill: #000" textLength="20.02495481999999" x="113.58" y="102.45087779999994">DEST</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.414416800000001pt; fill: #000" textLength="4.192128800000006" x="205.44" y="102.45087779999992">0</text>
<text lengthAdjust="spacingAndGlyphs" style="font-size: 8.414416800000026pt; fill: #000" textLength="120.5321667" x="293.46" y="102.45087779999994">X3 X2 X1 X0</text></g></svg>
<figcaption><a href='cvtpd2ps.html#fig-3-13'>Figure 3-13</a>. VCVTPD2PS (VEX.256 encoded version)</figcaption></figure>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="vcvtpd2ps--evex-encoded-version--when-src-operand-is-a-register">VCVTPD2PS (EVEX Encoded Version) When SRC Operand is a Register<a class="anchor" href="#vcvtpd2ps--evex-encoded-version--when-src-operand-is-a-register">
</a></h3>
<pre>(KL, VL) = (2, 128), (4, 256), (8, 512)
IF (VL = 512) AND (EVEX.b = 1)
THEN
SET_ROUNDING_MODE_FOR_THIS_INSTRUCTION(EVEX.RC);
ELSE
SET_ROUNDING_MODE_FOR_THIS_INSTRUCTION(MXCSR.RC);
FI;
FOR j := 0 TO KL-1
i := j * 32
k := j * 64
IF k1[j] OR *no writemask*
THEN
DEST[i+31:i] := Convert_Double_Precision_Floating_Point_To_Single_Precision_Floating_Point(SRC[k+63:k])
ELSE
IF *merging-masking* ; merging-masking
THEN *DEST[i+31:i] remains unchanged*
ELSE ; zeroing-masking
DEST[i+31:i] := 0
FI
FI;
ENDFOR
DEST[MAXVL-1:VL/2] := 0
</pre>
<h3 id="vcvtpd2ps--evex-encoded-version--when-src-operand-is-a-memory-source">VCVTPD2PS (EVEX Encoded Version) When SRC Operand is a Memory Source<a class="anchor" href="#vcvtpd2ps--evex-encoded-version--when-src-operand-is-a-memory-source">
</a></h3>
<pre>(KL, VL) = (2, 128), (4, 256), (8, 512)
FOR j := 0 TO KL-1
i := j * 32
k := j * 64
IF k1[j] OR *no writemask*
THEN
IF (EVEX.b = 1)
THEN
DEST[i+31:i] :=Convert_Double_Precision_Floating_Point_To_Single_Precision_Floating_Point(SRC[63:0])
ELSE
DEST[i+31:i] := Convert_Double_Precision_Floating_Point_To_Single_Precision_Floating_Point(SRC[k+63:k])
FI;
ELSE
IF *merging-masking* ; merging-masking
THEN *DEST[i+31:i] remains unchanged*
ELSE ; zeroing-masking
DEST[i+31:i] := 0
FI
FI;
ENDFOR
DEST[MAXVL-1:VL/2] := 0
</pre>
<h3 id="vcvtpd2ps--vex-256-encoded-version-">VCVTPD2PS (VEX.256 Encoded Version)<a class="anchor" href="#vcvtpd2ps--vex-256-encoded-version-">
</a></h3>
<pre>DEST[31:0] := Convert_Double_Precision_To_Single_Precision_Floating_Point(SRC[63:0])
DEST[63:32] := Convert_Double_Precision_To_Single_Precision_Floating_Point(SRC[127:64])
DEST[95:64] := Convert_Double_Precision_To_Single_Precision_Floating_Point(SRC[191:128])
DEST[127:96] := Convert_Double_Precision_To_Single_Precision_Floating_Point(SRC[255:192)
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="vcvtpd2ps--vex-128-encoded-version-">VCVTPD2PS (VEX.128 Encoded Version)<a class="anchor" href="#vcvtpd2ps--vex-128-encoded-version-">
</a></h3>
<pre>DEST[31:0] := Convert_Double_Precision_To_Single_Precision_Floating_Point(SRC[63:0])
DEST[63:32] := Convert_Double_Precision_To_Single_Precision_Floating_Point(SRC[127:64])
DEST[MAXVL-1:64] := 0
</pre>
<h3 id="cvtpd2ps--128-bit-legacy-sse-version-">CVTPD2PS (128-bit Legacy SSE Version)<a class="anchor" href="#cvtpd2ps--128-bit-legacy-sse-version-">
</a></h3>
<pre>DEST[31:0] := Convert_Double_Precision_To_Single_Precision_Floating_Point(SRC[63:0])
DEST[63:32] := Convert_Double_Precision_To_Single_Precision_Floating_Point(SRC[127:64])
DEST[127:64] := 0
DEST[MAXVL-1:128] (unmodified)
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>VCVTPD2PS __m256 _mm512_cvtpd_ps( __m512d a);
</pre>
<pre>VCVTPD2PS __m256 _mm512_mask_cvtpd_ps( __m256 s, __mmask8 k, __m512d a);
</pre>
<pre>VCVTPD2PS __m256 _mm512_maskz_cvtpd_ps( __mmask8 k, __m512d a);
</pre>
<pre>VCVTPD2PS __m256 _mm512_cvt_roundpd_ps( __m512d a, int r);
</pre>
<pre>VCVTPD2PS __m256 _mm512_mask_cvt_roundpd_ps( __m256 s, __mmask8 k, __m512d a, int r);
</pre>
<pre>VCVTPD2PS __m256 _mm512_maskz_cvt_roundpd_ps( __mmask8 k, __m512d a, int r);
</pre>
<pre>VCVTPD2PS __m128 _mm256_mask_cvtpd_ps( __m128 s, __mmask8 k, __m256d a);
</pre>
<pre>VCVTPD2PS __m128 _mm256_maskz_cvtpd_ps( __mmask8 k, __m256d a);
</pre>
<pre>VCVTPD2PS __m128 _mm_mask_cvtpd_ps( __m128 s, __mmask8 k, __m128d a);
</pre>
<pre>VCVTPD2PS __m128 _mm_maskz_cvtpd_ps( __mmask8 k, __m128d a);
</pre>
<pre>VCVTPD2PS __m128 _mm256_cvtpd_ps (__m256d a)
</pre>
<pre>CVTPD2PS __m128 _mm_cvtpd_ps (__m128d a)
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>Invalid, Precision, Underflow, Overflow, Denormal.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>VEX-encoded instructions, see <span class="not-imported">Table 2-19</span>, “Type 2 Class Exception Conditions.”</p>
<p>EVEX-encoded instructions, see <span class="not-imported">Table 2-46</span>, “Type E2 Class Exception Conditions.”</p>
<p>Additionally:</p>
<table>
<tr>
<td>#UD</td>
<td>If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

68
x86/cvtpi2pd.html Normal file
View file

@ -0,0 +1,68 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CVTPI2PD
— Convert Packed Dword Integers to Packed Double Precision Floating-Point Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CVTPI2PD
— Convert Packed Dword Integers to Packed Double Precision Floating-Point Values</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64-Bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>66 0F 2A /r CVTPI2PD xmm, mm/m64<sup>1</sup></td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Convert two packed signed doubleword integers from mm/mem64 to two packed double precision floating-point values in xmm.</td></tr></table>
<blockquote>
<p>1. Operation is different for different operand sets; see the Description section.</p></blockquote>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RM</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Converts two packed signed doubleword integers in the source operand (second operand) to two packed double precision floating-point values in the destination operand (first operand).</p>
<p>The source operand can be an MMX technology register or a 64-bit memory location. The destination operand is an XMM register. In addition, depending on the operand configuration:</p>
<ul>
<li><strong>For operands </strong>xmm<strong>, </strong>mm<strong>: </strong>the instruction causes a transition from x87 FPU to MMX technology operation (that is, the x87 FPU top-of-stack pointer is set to 0 and the x87 FPU tag word is set to all 0s [valid]). If this instruction is executed while an x87 FPU floating-point exception is pending, the exception is handled before the CVTPI2PD instruction is executed.</li>
<li><strong>For operands </strong>xmm<strong>, </strong>m64<strong>: </strong>the instruction does not cause a transition to MMX technology and does not take x87 FPU exceptions.</li></ul>
<p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>DEST[63:0] := Convert_Integer_To_Double_Precision_Floating_Point(SRC[31:0]);
DEST[127:64] := Convert_Integer_To_Double_Precision_Floating_Point(SRC[63:32]);
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>CVTPI2PD __m128d _mm_cvtpi32_pd(__m64 a)
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>None.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>See <span class="not-imported">Table 23-6</span>, “Exception Conditions for Legacy SIMD/MMX Instructions with XMM and without FP Exception” in the Intel<sup>®</sup> 64 and IA-32 Architectures Software Developers Manual, Volume 3B.</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

65
x86/cvtpi2ps.html Normal file
View file

@ -0,0 +1,65 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CVTPI2PS
— Convert Packed Dword Integers to Packed Single Precision Floating-Point Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CVTPI2PS
— Convert Packed Dword Integers to Packed Single Precision Floating-Point Values</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op/En</th>
<th>64-Bit Mode</th>
<th>Compat/Leg Mode</th>
<th>Description</th></tr>
<tr>
<td>NP 0F 2A /r CVTPI2PS xmm, mm/m64</td>
<td>RM</td>
<td>Valid</td>
<td>Valid</td>
<td>Convert two signed doubleword integers from mm/m64 to two single precision floating-point values in xmm.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>RM</td>
<td>ModRM:reg (r, w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Converts two packed signed doubleword integers in the source operand (second operand) to two packed single precision floating-point values in the destination operand (first operand).</p>
<p>The source operand can be an MMX technology register or a 64-bit memory location. The destination operand is an XMM register. The results are stored in the low quadword of the destination operand, and the high quadword remains unchanged. When a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register.</p>
<p>This instruction causes a transition from x87 FPU to MMX technology operation (that is, the x87 FPU top-of-stack pointer is set to 0 and the x87 FPU tag word is set to all 0s [valid]). If this instruction is executed while an x87 FPU floating-point exception is pending, the exception is handled before the CVTPI2PS instruction is executed.</p>
<p>In 64-bit mode, use of the REX.R prefix permits this instruction to access additional registers (XMM8-XMM15).</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<pre>DEST[31:0] := Convert_Integer_To_Single_Precision_Floating_Point(SRC[31:0]);
DEST[63:32] := Convert_Integer_To_Single_Precision_Floating_Point(SRC[63:32]);
(* High quadword of destination unchanged *)
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>CVTPI2PS __m128 _mm_cvtpi32_ps(__m128 a, __m64 b)
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>Precision.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>See <span class="not-imported">Table 23-5</span>, “Exception Conditions for Legacy SIMD/MMX Instructions with XMM and FP Exception” in the Intel<sup>®</sup> 64 and IA-32 Architectures Software Developers Manual, Volume 3B.</p><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

212
x86/cvtps2dq.html Normal file
View file

@ -0,0 +1,212 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xmlns:x86="http://www.felixcloutier.com/x86"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><link rel="stylesheet" type="text/css" href="style.css"></link><title>CVTPS2DQ
— Convert Packed Single Precision Floating-Point Values to Packed SignedDoubleword Integer Values</title></head><body><header><nav><ul><li><a href='index.html'>Index</a></li><li>December 2023</li></ul></nav></header><h1>CVTPS2DQ
— Convert Packed Single Precision Floating-Point Values to Packed SignedDoubleword Integer Values</h1>
<table>
<tr>
<th>Opcode/Instruction</th>
<th>Op / En</th>
<th>64/32 bit Mode Support</th>
<th>CPUID Feature Flag</th>
<th>Description</th></tr>
<tr>
<td>66 0F 5B /r CVTPS2DQ xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>SSE2</td>
<td>Convert four packed single precision floating-point values from xmm2/mem to four packed signed doubleword values in xmm1.</td></tr>
<tr>
<td>VEX.128.66.0F.WIG 5B /r VCVTPS2DQ xmm1, xmm2/m128</td>
<td>A</td>
<td>V/V</td>
<td>AVX</td>
<td>Convert four packed single precision floating-point values from xmm2/mem to four packed signed doubleword values in xmm1.</td></tr>
<tr>
<td>VEX.256.66.0F.WIG 5B /r VCVTPS2DQ ymm1, ymm2/m256</td>
<td>A</td>
<td>V/V</td>
<td>AVX</td>
<td>Convert eight packed single precision floating-point values from ymm2/mem to eight packed signed doubleword values in ymm1.</td></tr>
<tr>
<td>EVEX.128.66.0F.W0 5B /r VCVTPS2DQ xmm1 {k1}{z}, xmm2/m128/m32bcst</td>
<td>B</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Convert four packed single precision floating-point values from xmm2/m128/m32bcst to four packed signed doubleword values in xmm1 subject to writemask k1.</td></tr>
<tr>
<td>EVEX.256.66.0F.W0 5B /r VCVTPS2DQ ymm1 {k1}{z}, ymm2/m256/m32bcst</td>
<td>B</td>
<td>V/V</td>
<td>AVX512VL AVX512F</td>
<td>Convert eight packed single precision floating-point values from ymm2/m256/m32bcst to eight packed signed doubleword values in ymm1 subject to writemask k1.</td></tr>
<tr>
<td>EVEX.512.66.0F.W0 5B /r VCVTPS2DQ zmm1 {k1}{z}, zmm2/m512/m32bcst{er}</td>
<td>B</td>
<td>V/V</td>
<td>AVX512F</td>
<td>Convert sixteen packed single precision floating-point values from zmm2/m512/m32bcst to sixteen packed signed doubleword values in zmm1 subject to writemask k1.</td></tr></table>
<h2 id="instruction-operand-encoding">Instruction Operand Encoding<a class="anchor" href="#instruction-operand-encoding">
</a></h2>
<table>
<tr>
<th>Op/En</th>
<th>Tuple Type</th>
<th>Operand 1</th>
<th>Operand 2</th>
<th>Operand 3</th>
<th>Operand 4</th></tr>
<tr>
<td>A</td>
<td>N/A</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr>
<tr>
<td>B</td>
<td>Full</td>
<td>ModRM:reg (w)</td>
<td>ModRM:r/m (r)</td>
<td>N/A</td>
<td>N/A</td></tr></table>
<h2 id="description">Description<a class="anchor" href="#description">
</a></h2>
<p>Converts four, eight or sixteen packed single precision floating-point values in the source operand to four, eight or sixteen signed doubleword integers in the destination operand.</p>
<p>When a conversion is inexact, the value returned is rounded according to the rounding control bits in the MXCSR register or the embedded rounding control bits. If a converted result cannot be represented in the destination format, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value (2<sup>w</sup>-1, where w represents the number of bits in the destination format) is returned.</p>
<p>EVEX encoded versions: The source operand is a ZMM register, a 512-bit memory location or a 512-bit vector broadcasted from a 32-bit memory location. The destination operand is a ZMM register conditionally updated with writemask k1.</p>
<p>VEX.256 encoded version: The source operand is a YMM register or 256- bit memory location. The destination operand is a YMM register. The upper bits (MAXVL-1:256) of the corresponding ZMM register destination are zeroed.</p>
<p>VEX.128 encoded version: The source operand is an XMM register or 128- bit memory location. The destination operand is a XMM register. The upper bits (MAXVL-1:128) of the corresponding ZMM register destination are zeroed.</p>
<p>128-bit Legacy SSE version: The source operand is an XMM register or 128- bit memory location. The destination operand is an XMM register. The upper bits (MAXVL-1:128) of the corresponding ZMM register destination are unmodified.</p>
<p>VEX.vvvv and EVEX.vvvv are reserved and must be 1111b otherwise instructions will #UD.</p>
<h2 id="operation">Operation<a class="anchor" href="#operation">
</a></h2>
<h3 id="vcvtps2dq--encoded-versions--when-src-operand-is-a-register">VCVTPS2DQ (Encoded Versions) When SRC Operand is a Register<a class="anchor" href="#vcvtps2dq--encoded-versions--when-src-operand-is-a-register">
</a></h3>
<pre>(KL, VL) = (4, 128), (8, 256), (16, 512)
IF (VL = 512) AND (EVEX.b = 1)
THEN
SET_ROUNDING_MODE_FOR_THIS_INSTRUCTION(EVEX.RC);
ELSE
SET_ROUNDING_MODE_FOR_THIS_INSTRUCTION(MXCSR.RC);
FI;
FOR j := 0 TO KL-1
i := j * 32
IF k1[j] OR *no writemask*
THEN DEST[i+31:i] :=
Convert_Single_Precision_Floating_Point_To_Integer(SRC[i+31:i])
ELSE
IF *merging-masking* ; merging-masking
THEN *DEST[i+31:i] remains unchanged*
ELSE ; zeroing-masking
DEST[i+31:i] := 0
FI
FI;
ENDFOR
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="vcvtps2dq--evex-encoded-versions--when-src-operand-is-a-memory-source">VCVTPS2DQ (EVEX Encoded Versions) When SRC Operand is a Memory Source<a class="anchor" href="#vcvtps2dq--evex-encoded-versions--when-src-operand-is-a-memory-source">
</a></h3>
<pre>(KL, VL) = (4, 128), (8, 256), (16, 512)
FOR j := 0 TO 15
i := j * 32
IF k1[j] OR *no writemask*
THEN
IF (EVEX.b = 1)
THEN
DEST[i+31:i] :=
Convert_Single_Precision_Floating_Point_To_Integer(SRC[31:0])
ELSE
DEST[i+31:i] :=
Convert_Single_Precision_Floating_Point_To_Integer(SRC[i+31:i])
FI;
ELSE
IF *merging-masking* ; merging-masking
THEN *DEST[i+31:i] remains unchanged*
ELSE ; zeroing-masking
DEST[i+31:i] := 0
FI
FI;
ENDFOR
DEST[MAXVL-1:VL] := 0
</pre>
<h3 id="vcvtps2dq--vex-256-encoded-version-">VCVTPS2DQ (VEX.256 Encoded Version)<a class="anchor" href="#vcvtps2dq--vex-256-encoded-version-">
</a></h3>
<pre>DEST[31:0] := Convert_Single_Precision_Floating_Point_To_Integer(SRC[31:0])
DEST[63:32] := Convert_Single_Precision_Floating_Point_To_Integer(SRC[63:32])
DEST[95:64] := Convert_Single_Precision_Floating_Point_To_Integer(SRC[95:64])
DEST[127:96] := Convert_Single_Precision_Floating_Point_To_Integer(SRC[127:96)
DEST[159:128] := Convert_Single_Precision_Floating_Point_To_Integer(SRC[159:128])
DEST[191:160] := Convert_Single_Precision_Floating_Point_To_Integer(SRC[191:160])
DEST[223:192] := Convert_Single_Precision_Floating_Point_To_Integer(SRC[223:192])
DEST[255:224] := Convert_Single_Precision_Floating_Point_To_Integer(SRC[255:224])
</pre>
<h3 id="vcvtps2dq--vex-128-encoded-version-">VCVTPS2DQ (VEX.128 Encoded Version)<a class="anchor" href="#vcvtps2dq--vex-128-encoded-version-">
</a></h3>
<pre>DEST[31:0] := Convert_Single_Precision_Floating_Point_To_Integer(SRC[31:0])
DEST[63:32] := Convert_Single_Precision_Floating_Point_To_Integer(SRC[63:32])
DEST[95:64] := Convert_Single_Precision_Floating_Point_To_Integer(SRC[95:64])
DEST[127:96] := Convert_Single_Precision_Floating_Point_To_Integer(SRC[127:96])
DEST[MAXVL-1:128] := 0
</pre>
<h3 id="cvtps2dq--128-bit-legacy-sse-version-">CVTPS2DQ (128-bit Legacy SSE Version)<a class="anchor" href="#cvtps2dq--128-bit-legacy-sse-version-">
</a></h3>
<pre>DEST[31:0] := Convert_Single_Precision_Floating_Point_To_Integer(SRC[31:0])
DEST[63:32] := Convert_Single_Precision_Floating_Point_To_Integer(SRC[63:32])
DEST[95:64] := Convert_Single_Precision_Floating_Point_To_Integer(SRC[95:64])
DEST[127:96] := Convert_Single_Precision_Floating_Point_To_Integer(SRC[127:96])
DEST[MAXVL-1:128] (unmodified)
</pre>
<h2 id="intel-c-c++-compiler-intrinsic-equivalent">Intel C/C++ Compiler Intrinsic Equivalent<a class="anchor" href="#intel-c-c++-compiler-intrinsic-equivalent">
</a></h2>
<pre>VCVTPS2DQ __m512i _mm512_cvtps_epi32( __m512 a);
</pre>
<pre>VCVTPS2DQ __m512i _mm512_mask_cvtps_epi32( __m512i s, __mmask16 k, __m512 a);
</pre>
<pre>VCVTPS2DQ __m512i _mm512_maskz_cvtps_epi32( __mmask16 k, __m512 a);
</pre>
<pre>VCVTPS2DQ __m512i _mm512_cvt_roundps_epi32( __m512 a, int r);
</pre>
<pre>VCVTPS2DQ __m512i _mm512_mask_cvt_roundps_epi32( __m512i s, __mmask16 k, __m512 a, int r);
</pre>
<pre>VCVTPS2DQ __m512i _mm512_maskz_cvt_roundps_epi32( __mmask16 k, __m512 a, int r);
</pre>
<pre>VCVTPS2DQ __m256i _mm256_mask_cvtps_epi32( __m256i s, __mmask8 k, __m256 a);
</pre>
<pre>VCVTPS2DQ __m256i _mm256_maskz_cvtps_epi32( __mmask8 k, __m256 a);
</pre>
<pre>VCVTPS2DQ __m128i _mm_mask_cvtps_epi32( __m128i s, __mmask8 k, __m128 a);
</pre>
<pre>VCVTPS2DQ __m128i _mm_maskz_cvtps_epi32( __mmask8 k, __m128 a);
</pre>
<pre>VCVTPS2DQ __ m256i _mm256_cvtps_epi32 (__m256 a)
</pre>
<pre>CVTPS2DQ __m128i _mm_cvtps_epi32 (__m128 a)
</pre>
<h2 class="exceptions" id="simd-floating-point-exceptions">SIMD Floating-Point Exceptions<a class="anchor" href="#simd-floating-point-exceptions">
</a></h2>
<p>Invalid, Precision.</p>
<h2 class="exceptions" id="other-exceptions">Other Exceptions<a class="anchor" href="#other-exceptions">
</a></h2>
<p>VEX-encoded instructions, see <span class="not-imported">Table 2-19</span>, “Type 2 Class Exception Conditions.”</p>
<p>EVEX-encoded instructions, see <span class="not-imported">Table 2-46</span>, “Type E2 Class Exception Conditions.”</p>
<p>Additionally:</p>
<table>
<tr>
<td>#UD</td>
<td>If VEX.vvvv != 1111B or EVEX.vvvv != 1111B.</td></tr></table><footer><p>
This UNOFFICIAL, mechanically-separated, non-verified reference is provided for convenience, but it may be
inc<span style="opacity: 0.2">omp</span>lete or b<sub>r</sub>oke<sub>n</sub> in various obvious or non-obvious
ways. Refer to <a href="https://software.intel.com/en-us/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4">Intel® 64 and IA-32 Architectures Software Developers Manual</a> for anything serious.
</p></footer></body></html>

Some files were not shown because too many files have changed in this diff Show more