proj-plbook-plChX86Isa


x86

We won't go into much detail.

Addressing modes: http://cs.nyu.edu/courses/fall10/V22.0201-002/addressing_modes.pdf
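As a rough illustration of what an addressing-mode reference covers, here is a minimal NASM sketch in the same 32-bit Linux style as the Hello World example further down this page. In 32-bit code the general memory operand is base + index*scale + displacement, where the scale is 1, 2, 4, or 8. The label `table`, its contents, and the file name `modes.asm` are hypothetical, chosen only for illustration.

```nasm
; modes.asm (hypothetical file name): common x86 addressing modes, 32-bit.
section .data
table   dd  10, 20, 30, 40              ; four 32-bit values (illustrative data)

section .text
global  _start

_start:
    mov     eax, 2                      ; immediate: the operand is the constant 2
    mov     ebx, eax                    ; register: copy EAX into EBX
    mov     ecx, [table]                ; direct (displacement only): loads table[0] = 10
    mov     esi, table                  ; ESI = address of table
    mov     edx, [esi]                  ; register indirect: loads table[0] = 10
    mov     edx, [esi + 4]              ; base + displacement: loads table[1] = 20
    mov     edx, [esi + eax*4]          ; base + index*scale: loads table[2] = 30
    mov     edx, [table + eax*4 + 4]    ; index*scale + displacement: loads table[3] = 40
    lea     edi, [esi + eax*4 + 4]      ; LEA computes the same kind of address, no memory access

    mov     ebx, edx                    ; exit status = last value loaded (40)
    mov     eax, 1                      ; system call number (sys_exit)
    int     0x80                        ; call kernel
```

Assuming a NASM and GNU ld toolchain with 32-bit support, `nasm -f elf32 modes.asm` followed by `ld -m elf_i386 -o modes modes.o` should build it; running `./modes; echo $?` should then print 40, the value loaded by the last memory access.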

http://www.agner.org/optimize/blog/read.php?i=25 claims "The total number of x86 instructions is well above one thousand"

https://www.nayuki.io/page/a-fundamental-introduction-to-x86-assembly-programming

http://savannah.nongnu.org/projects/pgubook/

https://en.wikipedia.org/wiki/X86_instruction_listings

http://zsmith.co/intel.html

    AAA - Ascii Adjust for Addition
    AAD - Ascii Adjust for Division
    AAM - Ascii Adjust for Multiplication
    AAS - Ascii Adjust for Subtraction
    ADC - Add With Carry
    ADD - Arithmetic Addition
    AND - Logical And
    ARPL - Adjust Requested Privilege Level of Selector (286+ PM)
    BOUND - Array Index Bound Check (80188+)
    BSF - Bit Scan Forward (386+)
    BSR - Bit Scan Reverse (386+)
    BSWAP - Byte Swap (486+)
    BT - Bit Test (386+)
    BTC - Bit Test with Complement (386+)
    BTR - Bit Test with Reset (386+)
    BTS - Bit Test and Set (386+)
    CALL - Procedure Call
    CBW - Convert Byte to Word
    CDQ - Convert Double to Quad (386+)
    CLC - Clear Carry
    CLD - Clear Direction Flag
    CLI - Clear Interrupt Flag (disable)
    CLTS - Clear Task Switched Flag (286+ privileged)
    CMC - Complement Carry Flag
    CMP - Compare
    CMPS - Compare String (Byte, Word or Doubleword)
    CMPXCHG - Compare and Exchange
    CWD - Convert Word to Doubleword
    CWDE - Convert Word to Extended Doubleword (386+)
    DAA - Decimal Adjust for Addition
    DAS - Decimal Adjust for Subtraction
    DEC - Decrement
    DIV - Divide
    ENTER - Make Stack Frame (80188+)
    ESC - Escape
    HALT - Halt
    F2XM1 - Compute 2^x - 1
    FABS - Absolute value
    FADD - Floating point add
    FADDP - Floating point add and pop
    FBLD - Load BCD
    FBSTP - Store BCD and pop
    FCHS - Change sign
    FCLEX - Clear exceptions
    FNCLEX - Clear exceptions / no wait
    FCOM - Floating point compare
    FCOMP - Floating point compare and pop
    FCOMPP - Floating point compare and pop twice
    FCOS - Floating point cosine (387+)
    FDECSTP - Decrement floating point stack pointer
    FDISI - Disable interrupts (8087 only; others do fnop)
    FNDISI - Disable interrupts no wait (8087 only; others do fnop)
    FDIV - Floating divide
    FDIVP - Floating divide and pop
    FDIVR - Floating divide reversed
    FDIVRP - Floating divide reversed and pop
    FENI - Enable interrupts (8087 only; others do fnop)
    FNENI - Enable interrupts nowait (8087 only; others do fnop)
    FFREE - Free register
    FIADD - Integer add
    FICOM - Integer compare
    FICOMP - Integer compare and pop
    FIDIV - Integer divide
    FIDIVR - Integer divide reversed
    FILD - Load integer
    FIMUL - Integer multiply
    FINCSTP - Increment floating point stack pointer
    FINIT - Initialize floating point processor
    FNINIT - Initialize floating point processor no wait
    FIST - Store integer
    FISTP - Store integer and pop
    FISUB - Integer subtract
    FISUBR - Integer subtract reversed
    FLD - Floating point load
    FLDZ - Load constant onto stack: 0.0
    FLD1 - Load constant onto stack: 1.0
    FLDL2E - Load constant onto stack: logarithm base 2 (e)
    FLDL2T - Load constant onto stack: logarithm base 2 (10)
    FLDLG2 - Load constant onto stack: logarithm base 10 (2)
    FLDLN2 - Load constant onto stack: natural logarithm (2)
    FLDPI - Load constant onto stack: pi (3.14159...)
    FLDCW - Load control word
    FLDENV - Load environment state
    FMUL - Floating point multiply
    FMULP - Floating point multiply and pop
    FNOP - no operation
    FPATAN - Partial arctangent
    FPREM - Partial remainder
    FPREM1 - Partial remainder (IEEE compatible, 387+)
    FPTAN - Partial tangent
    FRNDINT - Round to integer
    FRSTOR - Restore saved state
    FSAVE - Save FPU state
    FSAVEW - Save FPU state / 16-bit format (387+)
    FSAVED - Save FPU state / 32-bit format (387+)
    FNSAVE - Save FPU state no wait
    FNSAVEW - Save FPU state no wait / 16-bit format (387+)
    FNSAVED - Save FPU state no wait / 32-bit format (387+)
    FSCALE - Scale by power of 2
    FSETPM - Set protected mode (287 only; 387+ = fnop)
    FSIN - Sine (387+)
    FSINCOS - Sine and cosine (387+)
    FSQRT - Square root
    FST - Floating point store
    FSTP - Floating point store and pop
    FSTCW - Store control word
    FNSTCW - Store control word no wait
    FSTENV - Store FPU environment
    FSTENVW - Store FPU environment / 16-bit format (387+)
    FSTENVD - Store FPU environment / 32-bit format (387+)
    FNSTENV - Store FPU environment no wait
    FNSTENVW - Store FPU environment no wait / 16-bit format (387+)
    FNSTENVD - Store FPU environment no wait / 32-bit format (387+)
    FSTSW - Store status word
    FNSTSW - Store status word no wait
    FSUB - Floating point subtract
    FSUBP - Floating point subtract and pop
    FSUBR - Floating point reverse subtract
    FSUBRP - Floating point reverse subtract and pop
    FTST - Floating point test for zero
    FUCOM - Unordered floating point compare (387+)
    FUCOMP - Unordered floating point compare and pop (387+)
    FUCOMPP - Unordered floating point compare and pop twice (387+)
    FWAIT - Wait while FPU is executing
    FXAM - Examine condition flags
    FXCH - Exchange floating point registers
    FXTRACT - Extract exponent and significand
    FYL2X - Compute Y * log2(x)
    FYL2XP1 - Compute Y * log2(x+1)
    HLT - Halt CPU
    IDIV - Signed Integer Division
    IMUL - Signed Multiply
    IN - Input Byte or Word From Port
    INC - Increment
    INS - Input String from Port (80188+)
    INT - Interrupt
    INTO - Interrupt on Overflow
    INVD - Invalidate Cache (486+)
    INVLPG - Invalidate Translation Look-Aside Buffer Entry (486+)
    IRET/IRETD - Interrupt Return
    Jxx - Jump Instructions Table
    JCXZ/JECXZ - Jump if Register (E)CX is Zero
    JMP - Unconditional Jump
    LAHF - Load Register AH From Flags
    LAR - Load Access Rights (286+ protected)
    LDS - Load Pointer Using DS
    LEA - Load Effective Address
    LEAVE - Restore Stack for Procedure Exit (80188+)
    LES - Load Pointer Using ES
    LFS - Load Pointer Using FS (386+)
    LGDT - Load Global Descriptor Table (286+ privileged)
    LIDT - Load Interrupt Descriptor Table (286+ privileged)
    LGS - Load Pointer Using GS (386+)
    LLDT - Load Local Descriptor Table (286+ privileged)
    LMSW - Load Machine Status Word (286+ privileged)
    LOCK - Lock Bus
    LODS - Load String (Byte, Word or Double)
    LOOP - Decrement CX and Loop if CX Not Zero
    LOOPE/LOOPZ - Loop While Equal / Loop While Zero
    LOOPNZ/LOOPNE - Loop While Not Zero / Loop While Not Equal
    LSL - Load Segment Limit (286+ protected)
    LSS - Load Pointer Using SS (386+)
    LTR - Load Task Register (286+ privileged)
    MOV - Move Byte or Word
    MOVS - Move String (Byte or Word)
    MOVSX - Move with Sign Extend (386+)
    MOVZX - Move with Zero Extend (386+)
    MUL - Unsigned Multiply
    NEG - Two's Complement Negation
    NOP - No Operation (90h)
    NOT - One's Complement Negation (Logical NOT)
    OR - Inclusive Logical OR
    OUT - Output Data to Port
    OUTS - Output String to Port (80188+)
    POP - Pop Word off Stack
    POPA/POPAD - Pop All Registers off Stack (80188+)
    POPF/POPFD - Pop Flags off Stack
    PUSH - Push Word onto Stack
    PUSHA/PUSHAD - Push All Registers onto Stack (80188+)
    PUSHF/PUSHFD - Push Flags onto Stack
    RCL - Rotate Through Carry Left
    RCR - Rotate Through Carry Right
    REP - Repeat String Operation
    REPE/REPZ - Repeat Equal / Repeat Zero
    REPNE/REPNZ - Repeat Not Equal / Repeat Not Zero
    RET/RETF - Return From Procedure
    ROL - Rotate Left
    ROR - Rotate Right
    SAHF - Store AH Register into FLAGS
    SAL/SHL - Shift Arithmetic Left / Shift Logical Left
    SAR - Shift Arithmetic Right
    SBB - Subtract with Borrow/Carry
    SCAS - Scan String (Byte, Word or Doubleword)
    SETAE/SETNB - Set if Above or Equal / Set if Not Below (386+)
    SETB/SETNAE - Set if Below / Set if Not Above or Equal (386+)
    SETBE/SETNA - Set if Below or Equal / Set if Not Above (386+)
    SETE/SETZ - Set if Equal / Set if Zero (386+)
    SETNE/SETNZ - Set if Not Equal / Set if Not Zero (386+)
    SETL/SETNGE - Set if Less / Set if Not Greater or Equal (386+)
    SETGE/SETNL - Set if Greater or Equal / Set if Not Less (386+)
    SETLE/SETNG - Set if Less or Equal / Set if Not Greater (386+)
    SETG/SETNLE - Set if Greater / Set if Not Less or Equal (386+)
    SETS - Set if Signed (386+)
    SETNS - Set if Not Signed (386+)
    SETC - Set if Carry (386+)
    SETNC - Set if Not Carry (386+)
    SETO - Set if Overflow (386+)
    SETNO - Set if Not Overflow (386+)
    SETP/SETPE - Set if Parity / Set if Parity Even (386+)
    SETNP/SETPO - Set if No Parity / Set if Parity Odd (386+)
    SGDT - Store Global Descriptor Table (286+ privileged)
    SIDT - Store Interrupt Descriptor Table (286+ privileged)
    SHL - Shift Logical Left
    SHR - Shift Logical Right
    SHLD/SHRD - Double Precision Shift (386+)
    SLDT - Store Local Descriptor Table (286+ privileged)
    SMSW - Store Machine Status Word (286+ privileged)
    STC - Set Carry
    STD - Set Direction Flag
    STI - Set Interrupt Flag (Enable Interrupts)
    STOS - Store String (Byte, Word or Doubleword)
    STR - Store Task Register (286+ privileged)
    SUB - Subtract
    TEST - Test For Bit Pattern
    VERR - Verify Read (286+ protected)
    VERW - Verify Write (286+ protected)
    WAIT/FWAIT - Event Wait
    WBINVD - Write-Back and Invalidate Cache (486+)
    XCHG - Exchange
    XLAT/XLATB - Translate
    XOR - Exclusive OR

http://cse.unl.edu/~goddard/Courses/CSCE351/IntelArchitecture/InstructionSetSummary.pdf

30.1 New Intel Architecture Instructions

The following sections give the Intel Architecture instructions that were new in the MMX Technology and in the Pentium Pro, Pentium, and Intel486 processors.

30.1.1 New Instructions Introduced with the MMX™ Technology

The Intel MMX technology introduced a new set of instructions to the Intel Architecture, designed to enhance the performance of multimedia applications. These instructions are recognized by all Intel Architecture processors that implement the MMX technology. The MMX instructions are listed in “MMX™ Technology Instructions”.

30.1.2 New Instructions in the Pentium® Pro Processor

The following instructions are new in the Pentium Pro processor:

• CMOVcc - Conditional move (see “Conditional Move Instructions”).
• FCMOVcc - Floating-point conditional move on condition-code flags in EFLAGS register (see “Data Transfer Instructions”).
• FCOMI/FCOMIP/FUCOMI/FUCOMIP - Floating-point compare and set condition-code flags in EFLAGS register (see “Comparison and Classification Instructions”).
• RDPMC - Read performance-monitoring counters (see “RDPMC - Read Performance-Monitoring Counters” in Chapter 3 of the Intel Architecture Software Developer’s Manual, Volume 2). (This instruction is also available in all Pentium® processors that implement the MMX™ technology.)
• UD2 - Undefined instruction (see “No-Operation and Undefined Instructions”).

30-516 Embedded Pentium® Processor Family Instruction Set Summary

30.1.3 New Instructions in the Pentium® Processor

The following instructions are new in the Pentium processor:

• CMPXCHG8B (compare and exchange 8 bytes) instruction.
• CPUID (CPU identification) instruction. (This instruction was introduced in the Pentium® processor and added to later versions of the Intel486™ processor.)
• RDTSC (read time-stamp counter) instruction.
• RDMSR (read model-specific register) instruction.
• WRMSR (write model-specific register) instruction.
• RSM (resume from SMM) instruction.

The form of the MOV instruction used to access the test registers has been removed on the Pentium and future Intel Architecture processors.

30.1.4 New Instructions in the Intel486™ Processor

The following instructions are new in the Intel486 processor:

• BSWAP (byte swap) instruction.
• XADD (exchange and add) instruction.
• CMPXCHG (compare and exchange) instruction.
• INVD (invalidate cache) instruction.
• WBINVD (write-back and invalidate cache) instruction.
• INVLPG (invalidate TLB entry) instruction.

30.2 Instruction Set List

This section lists all the Intel Architecture instructions divided into four major groups: integer, MMX technology, floating-point, and system instructions. For each instruction, the mnemonic and descriptive names are given. When two or more mnemonics are given (for example, CMOVA/CMOVNBE), they represent different mnemonics for the same instruction opcode. Assemblers support redundant mnemonics for some instructions to make it easier to read code listings. For instance, CMOVA (Conditional move if above) and CMOVNBE (Conditional move if not below or equal) represent the same condition.

30.2.1 Integer Instructions

Integer instructions perform the integer arithmetic, logic, and program flow control operations that programmers commonly use to write application and system software to run on an Intel Architecture processor. In the following sections, the integer instructions are divided into several instruction subgroups.

30.2.1.1 Data Transfer Instructions

    MOV - Move
    CMOVE/CMOVZ - Conditional move if equal / Conditional move if zero
    CMOVNE/CMOVNZ - Conditional move if not equal / Conditional move if not zero
    CMOVA/CMOVNBE - Conditional move if above / Conditional move if not below or equal
    CMOVAE/CMOVNB - Conditional move if above or equal / Conditional move if not below
    CMOVB/CMOVNAE - Conditional move if below / Conditional move if not above or equal
    CMOVBE/CMOVNA - Conditional move if below or equal / Conditional move if not above
    CMOVG/CMOVNLE - Conditional move if greater / Conditional move if not less or equal
    CMOVGE/CMOVNL - Conditional move if greater or equal / Conditional move if not less
    CMOVL/CMOVNGE - Conditional move if less / Conditional move if not greater or equal
    CMOVLE/CMOVNG - Conditional move if less or equal / Conditional move if not greater
    CMOVC - Conditional move if carry
    CMOVNC - Conditional move if not carry
    CMOVO - Conditional move if overflow
    CMOVNO - Conditional move if not overflow
    CMOVS - Conditional move if sign (negative)
    CMOVNS - Conditional move if not sign (non-negative)
    CMOVP/CMOVPE - Conditional move if parity / Conditional move if parity even
    CMOVNP/CMOVPO - Conditional move if not parity / Conditional move if parity odd
    XCHG - Exchange
    BSWAP - Byte swap
    XADD - Exchange and add
    CMPXCHG - Compare and exchange
    CMPXCHG8B - Compare and exchange 8 bytes
    PUSH - Push onto stack
    POP - Pop off of stack
    PUSHA/PUSHAD - Push general-purpose registers onto stack
    POPA/POPAD - Pop general-purpose registers from stack
    IN - Read from a port
    OUT - Write to a port
    CWD/CDQ - Convert word to doubleword / Convert doubleword to quadword
    CBW/CWDE - Convert byte to word / Convert word to doubleword in EAX register
    MOVSX - Move and sign extend
    MOVZX - Move and zero extend

30.2.1.2 Binary Arithmetic Instructions

    ADD - Integer add
    ADC - Add with carry
    SUB - Subtract
    SBB - Subtract with borrow
    IMUL - Signed multiply
    MUL - Unsigned multiply
    IDIV - Signed divide
    DIV - Unsigned divide
    INC - Increment
    DEC - Decrement
    NEG - Negate
    CMP - Compare

30.2.1.3 Decimal Arithmetic

    DAA - Decimal adjust after addition
    DAS - Decimal adjust after subtraction
    AAA - ASCII adjust after addition
    AAS - ASCII adjust after subtraction
    AAM - ASCII adjust after multiplication
    AAD - ASCII adjust before division

30.2.1.4 Logic Instructions

    AND - And
    OR - Or
    XOR - Exclusive or
    NOT - Not

30.2.1.5 Shift and Rotate Instructions

    SAR - Shift arithmetic right
    SHR - Shift logical right
    SAL/SHL - Shift arithmetic left / Shift logical left
    SHRD - Shift right double
    SHLD - Shift left double
    ROR - Rotate right
    ROL - Rotate left
    RCR - Rotate through carry right
    RCL - Rotate through carry left

30.2.1.6 Bit and Byte Instructions

    BT - Bit test
    BTS - Bit test and set
    BTR - Bit test and reset
    BTC - Bit test and complement
    BSF - Bit scan forward
    BSR - Bit scan reverse
    SETE/SETZ - Set byte if equal / Set byte if zero
    SETNE/SETNZ - Set byte if not equal / Set byte if not zero
    SETA/SETNBE - Set byte if above / Set byte if not below or equal
    SETAE/SETNB/SETNC - Set byte if above or equal / Set byte if not below / Set byte if not carry
    SETB/SETNAE/SETC - Set byte if below / Set byte if not above or equal / Set byte if carry
    SETBE/SETNA - Set byte if below or equal / Set byte if not above
    SETG/SETNLE - Set byte if greater / Set byte if not less or equal
    SETGE/SETNL - Set byte if greater or equal / Set byte if not less
    SETL/SETNGE - Set byte if less / Set byte if not greater or equal
    SETLE/SETNG - Set byte if less or equal / Set byte if not greater
    SETS - Set byte if sign (negative)
    SETNS - Set byte if not sign (non-negative)
    SETO - Set byte if overflow
    SETNO - Set byte if not overflow
    SETPE/SETP - Set byte if parity even / Set byte if parity
    SETPO/SETNP - Set byte if parity odd / Set byte if not parity
    TEST - Logical compare

30.2.1.7 Control Transfer Instructions

    JMP - Jump
    JE/JZ - Jump if equal / Jump if zero
    JNE/JNZ - Jump if not equal / Jump if not zero
    JA/JNBE - Jump if above / Jump if not below or equal
    JAE/JNB - Jump if above or equal / Jump if not below
    JB/JNAE - Jump if below / Jump if not above or equal
    JBE/JNA - Jump if below or equal / Jump if not above
    JG/JNLE - Jump if greater / Jump if not less or equal
    JGE/JNL - Jump if greater or equal / Jump if not less
    JL/JNGE - Jump if less / Jump if not greater or equal
    JLE/JNG - Jump if less or equal / Jump if not greater
    JC - Jump if carry
    JNC - Jump if not carry
    JO - Jump if overflow
    JNO - Jump if not overflow
    JS - Jump if sign (negative)
    JNS - Jump if not sign (non-negative)
    JPO/JNP - Jump if parity odd / Jump if not parity
    JPE/JP - Jump if parity even / Jump if parity
    JCXZ/JECXZ - Jump register CX zero / Jump register ECX zero
    LOOP - Loop with ECX counter
    LOOPZ/LOOPE - Loop with ECX and zero / Loop with ECX and equal
    LOOPNZ/LOOPNE - Loop with ECX and not zero / Loop with ECX and not equal
    CALL - Call procedure
    RET - Return
    IRET - Return from interrupt
    INT - Software interrupt
    INTO - Interrupt on overflow
    BOUND - Detect value out of range
    ENTER - High-level procedure entry
    LEAVE - High-level procedure exit

30.2.1.8 String Instructions

    MOVS/MOVSB - Move string / Move byte string
    MOVS/MOVSW - Move string / Move word string
    MOVS/MOVSD - Move string / Move doubleword string
    CMPS/CMPSB - Compare string / Compare byte string
    CMPS/CMPSW - Compare string / Compare word string
    CMPS/CMPSD - Compare string / Compare doubleword string
    SCAS/SCASB - Scan string / Scan byte string
    SCAS/SCASW - Scan string / Scan word string
    SCAS/SCASD - Scan string / Scan doubleword string
    LODS/LODSB - Load string / Load byte string
    LODS/LODSW - Load string / Load word string
    LODS/LODSD - Load string / Load doubleword string
    STOS/STOSB - Store string / Store byte string
    STOS/STOSW - Store string / Store word string
    STOS/STOSD - Store string / Store doubleword string
    REP - Repeat while ECX not zero
    REPE/REPZ - Repeat while equal / Repeat while zero
    REPNE/REPNZ - Repeat while not equal / Repeat while not zero
    INS/INSB - Input string from port / Input byte string from port
    INS/INSW - Input string from port / Input word string from port
    INS/INSD - Input string from port / Input doubleword string from port
    OUTS/OUTSB - Output string to port / Output byte string to port
    OUTS/OUTSW - Output string to port / Output word string to port
    OUTS/OUTSD - Output string to port / Output doubleword string to port

30.2.1.9 Flag Control Instructions

    STC - Set carry flag
    CLC - Clear the carry flag
    CMC - Complement the carry flag
    CLD - Clear the direction flag
    STD - Set direction flag
    LAHF - Load flags into AH register
    SAHF - Store AH register into flags
    PUSHF/PUSHFD - Push EFLAGS onto stack
    POPF/POPFD - Pop EFLAGS from stack
    STI - Set interrupt flag
    CLI - Clear the interrupt flag

30.2.1.10 Segment Register Instructions

    LDS - Load far pointer using DS
    LES - Load far pointer using ES
    LFS - Load far pointer using FS
    LGS - Load far pointer using GS
    LSS - Load far pointer using SS

30.2.1.11 Miscellaneous Instructions

    LEA - Load effective address
    NOP - No operation
    UD2 - Undefined instruction
    XLAT/XLATB - Table lookup translation
    CPUID - Processor Identification

30.2.2 MMX™ Technology Instructions

The MMX instructions execute on those Intel Architecture processors that implement the Intel MMX technology. These instructions operate on packed-byte, packed-word, packed-doubleword, and quadword operands. As with the integer instructions, the following list of MMX instructions is divided into subgroups.

30.2.2.1 MMX™ Data Transfer Instructions

    MOVD - Move doubleword
    MOVQ - Move quadword

30.2.2.2 MMX™ Conversion Instructions

    PACKSSWB - Pack words into bytes with signed saturation
    PACKSSDW - Pack doublewords into words with signed saturation
    PACKUSWB - Pack words into bytes with unsigned saturation
    PUNPCKHBW - Unpack high-order bytes from words
    PUNPCKHWD - Unpack high-order words from doublewords
    PUNPCKHDQ - Unpack high-order doublewords from quadword
    PUNPCKLBW - Unpack low-order bytes from words
    PUNPCKLWD - Unpack low-order words from doublewords
    PUNPCKLDQ - Unpack low-order doublewords from quadword

30.2.2.3 MMX™ Packed Arithmetic Instructions

    PADDB - Add packed bytes
    PADDW - Add packed words
    PADDD - Add packed doublewords
    PADDSB - Add packed bytes with saturation
    PADDSW - Add packed words with saturation
    PADDUSB - Add packed unsigned bytes with saturation
    PADDUSW - Add packed unsigned words with saturation
    PSUBB - Subtract packed bytes
    PSUBW - Subtract packed words
    PSUBD - Subtract packed doublewords
    PSUBSB - Subtract packed bytes with saturation
    PSUBSW - Subtract packed words with saturation
    PSUBUSB - Subtract packed unsigned bytes with saturation
    PSUBUSW - Subtract packed unsigned words with saturation
    PMULHW - Multiply packed words and store high result
    PMULLW - Multiply packed words and store low result
    PMADDWD - Multiply and add packed words

30.2.2.4 MMX™ Comparison Instructions

    PCMPEQB - Compare packed bytes for equal
    PCMPEQW - Compare packed words for equal
    PCMPEQD - Compare packed doublewords for equal
    PCMPGTB - Compare packed bytes for greater than
    PCMPGTW - Compare packed words for greater than
    PCMPGTD - Compare packed doublewords for greater than

30.2.2.5 MMX™ Logic Instructions

    PAND - Bitwise logical and
    PANDN - Bitwise logical and not
    POR - Bitwise logical or
    PXOR - Bitwise logical exclusive or

30.2.2.6 MMX™ Shift and Rotate Instructions

    PSLLW - Shift packed words left logical
    PSLLD - Shift packed doublewords left logical
    PSLLQ - Shift packed quadword left logical
    PSRLW - Shift packed words right logical
    PSRLD - Shift packed doublewords right logical
    PSRLQ - Shift packed quadword right logical
    PSRAW - Shift packed words right arithmetic
    PSRAD - Shift packed doublewords right arithmetic

30.2.2.7 MMX™ State Management

    EMMS - Empty MMX state

30.2.3 Floating-Point Instructions

The floating-point instructions are those that are executed by the processor’s floating-point unit (FPU). These instructions operate on floating-point (real), extended integer, and binary-coded decimal (BCD) operands. As with the integer instructions, the following list of floating-point instructions is divided into subgroups.

30.2.3.1 Data Transfer

    FLD - Load real
    FST - Store real
    FSTP - Store real and pop
    FILD - Load integer
    FIST - Store integer
    FISTP - Store integer and pop
    FBLD - Load BCD
    FBSTP - Store BCD and pop
    FXCH - Exchange registers
    FCMOVE - Floating-point conditional move if equal
    FCMOVNE - Floating-point conditional move if not equal
    FCMOVB - Floating-point conditional move if below
    FCMOVBE - Floating-point conditional move if below or equal
    FCMOVNB - Floating-point conditional move if not below
    FCMOVNBE - Floating-point conditional move if not below or equal
    FCMOVU - Floating-point conditional move if unordered
    FCMOVNU - Floating-point conditional move if not unordered

30.2.3.2 Basic Arithmetic

    FADD - Add real
    FADDP - Add real and pop
    FIADD - Add integer
    FSUB - Subtract real
    FSUBP - Subtract real and pop
    FISUB - Subtract integer
    FSUBR - Subtract real reverse
    FSUBRP - Subtract real reverse and pop
    FISUBR - Subtract integer reverse
    FMUL - Multiply real
    FMULP - Multiply real and pop
    FIMUL - Multiply integer
    FDIV - Divide real
    FDIVP - Divide real and pop
    FIDIV - Divide integer
    FDIVR - Divide real reverse
    FDIVRP - Divide real reverse and pop
    FIDIVR - Divide integer reverse
    FPREM - Partial remainder
    FPREM1 - IEEE Partial remainder
    FABS - Absolute value
    FCHS - Change sign
    FRNDINT - Round to integer
    FSCALE - Scale by power of two
    FSQRT - Square root
    FXTRACT - Extract exponent and significand

30.2.3.3 Comparison

    FCOM - Compare real
    FCOMP - Compare real and pop
    FCOMPP - Compare real and pop twice
    FUCOM - Unordered compare real
    FUCOMP - Unordered compare real and pop
    FUCOMPP - Unordered compare real and pop twice
    FICOM - Compare integer
    FICOMP - Compare integer and pop
    FCOMI - Compare real and set EFLAGS
    FUCOMI - Unordered compare real and set EFLAGS
    FCOMIP - Compare real, set EFLAGS, and pop
    FUCOMIP - Unordered compare real, set EFLAGS, and pop
    FTST - Test real
    FXAM - Examine real

30.2.3.4 Transcendental

    FSIN - Sine
    FCOS - Cosine
    FSINCOS - Sine and cosine
    FPTAN - Partial tangent
    FPATAN - Partial arctangent
    F2XM1 - 2^x - 1
    FYL2X - y * log2(x)
    FYL2XP1 - y * log2(x+1)

30.2.3.5 Load Constants

    FLD1 - Load +1.0
    FLDZ - Load +0.0
    FLDPI - Load π
    FLDL2E - Load log2(e)
    FLDLN2 - Load loge(2)
    FLDL2T - Load log2(10)
    FLDLG2 - Load log10(2)

30.2.3.6 FPU Control

    FINCSTP - Increment FPU register stack pointer
    FDECSTP - Decrement FPU register stack pointer
    FFREE - Free floating-point register
    FINIT - Initialize FPU after checking error conditions
    FNINIT - Initialize FPU without checking error conditions
    FCLEX - Clear floating-point exception flags after checking for error conditions
    FNCLEX - Clear floating-point exception flags without checking for error conditions
    FSTCW - Store FPU control word after checking error conditions
    FNSTCW - Store FPU control word without checking error conditions
    FLDCW - Load FPU control word
    FSTENV - Store FPU environment after checking error conditions
    FNSTENV - Store FPU environment without checking error conditions
    FLDENV - Load FPU environment
    FSAVE - Save FPU state after checking error conditions
    FNSAVE - Save FPU state without checking error conditions
    FRSTOR - Restore FPU state
    FSTSW - Store FPU status word after checking error conditions
    FNSTSW - Store FPU status word without checking error conditions
    WAIT/FWAIT - Wait for FPU
    FNOP - FPU no operation

30.2.4 System Instructions

The following system instructions are used to control those functions of the processor that are provided to support operating systems and executives.

    LGDT - Load global descriptor table (GDT) register
    SGDT - Store global descriptor table (GDT) register
    LLDT - Load local descriptor table (LDT) register
    SLDT - Store local descriptor table (LDT) register
    LTR - Load task register
    STR - Store task register
    LIDT - Load interrupt descriptor table (IDT) register
    SIDT - Store interrupt descriptor table (IDT) register
    MOV - Load and store control registers
    LMSW - Load machine status word
    SMSW - Store machine status word
    CLTS - Clear the task-switched flag
    ARPL - Adjust requested privilege level
    LAR - Load access rights
    LSL - Load segment limit
    VERR - Verify segment for reading
    VERW - Verify segment for writing
    MOV - Load and store debug registers
    INVD - Invalidate cache, no writeback
    WBINVD - Invalidate cache, with writeback
    INVLPG - Invalidate TLB Entry
    LOCK (prefix) - Lock Bus
    HLT - Halt processor
    RSM - Return from system management mode (SMM)
    RDMSR - Read model-specific register
    WRMSR - Write model-specific register
    RDPMC - Read performance monitoring counters
    RDTSC - Read time stamp counter

30.3 Data Movement Instructions

The data movement instructions move bytes, words, doublewords, or quadwords both between memory and the processor’s registers and between registers. These instructions are divided into four groups:

• General-purpose data movement.
• Exchange.
• Stack manipulation.
• Type-conversion.
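
The excerpt above notes that paired mnemonics such as CMOVA/CMOVNBE are different names for the same opcode. A minimal NASM sketch for checking this by comparing the bytes in an assembler listing follows; the file name `syn.asm` and the label `target` are hypothetical, and the snippet is meant to be assembled and inspected, not executed.

```nasm
; syn.asm (hypothetical file name): synonymous mnemonics side by side.
; Assemble with a listing file and compare the emitted bytes per line:
;   nasm -f bin syn.asm -o syn.bin -l syn.lst
bits 32

    cmova   eax, ebx        ; "above": CF = 0 and ZF = 0
    cmovnbe eax, ebx        ; "not below or equal": same condition, same opcode (0F 47)

    setb    al              ; "below": CF = 1
    setnae  al              ; "not above or equal": same condition
    setc    al              ; "carry": same flag test, same opcode (0F 92)

target:
    je      target          ; "equal": ZF = 1
    jz      target          ; "zero": same flag test as JE
```

In the listing, each group of synonyms should show identical opcode bytes, which is why a disassembler can only print one name from each group.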

It's also useful to see which features of an architecture turned out not to be considered good ones:

" Removal of older features: A number of "system programming" features of the x86 architecture are not used in modern operating systems and are not available on AMD64 in long (64-bit and compatibility) mode. These include segmented addressing (although the FS and GS segments are retained in vestigial form for use as extra base pointers to operating system structures)[1](p70), the task state switch mechanism, and Virtual 8086 mode. These features remain fully implemented in "legacy mode," thus permitting these processors to run 32-bit and 16-bit operating systems without modification. A number of instructions which proved to be rarely useful are not supported in 64-bit mode: saving/restoring of segment registers on the stack, saving/restoring of all registers (PUSHA/POPA), decimal arithmetic, BOUND and INTO instructions, and "far" jumps and calls with immediate operands. " -- https://en.wikipedia.org/wiki/X86-64#Architectural_features

" Often it is the case that an instruction the CPU designer feels is important turns out to be less useful than anticipated. For example, the LOOP instruction on the 80x86 CPU sees very little use in modern high-performance programs. The 80x86 ENTER instruction is another good example." -- http://www.plantation-productions.com/Webster/www.artofasm.com/Linux/HTML/ISA.html#1013376

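To make the quoted point about LOOP and ENTER concrete, here is a minimal NASM sketch that places each next to a simpler instruction sequence doing the same job. It is only an illustration in the same 32-bit Linux style as the Hello World example below; the local labels and the 16-byte frame size are arbitrary choices.

```nasm
section .text
global  _start

_start:
    ; Sum 5+4+3+2+1 using LOOP.
    xor     eax, eax
    mov     ecx, 5
.with_loop:
    add     eax, ecx
    loop    .with_loop      ; ECX = ECX - 1, jump if ECX != 0 (flags are not modified)

    ; The same loop using DEC/JNZ.
    xor     ebx, ebx
    mov     ecx, 5
.with_dec:
    add     ebx, ecx
    dec     ecx             ; sets ZF when ECX reaches 0
    jnz     .with_dec

    ; ENTER 16, 0 builds a stack frame with 16 bytes of locals; LEAVE tears it down.
    enter   16, 0
    leave

    ; The equivalent explicit sequence.
    push    ebp
    mov     ebp, esp
    sub     esp, 16
    mov     esp, ebp        ; these two instructions are what LEAVE does
    pop     ebp

    sub     ebx, eax        ; both sums are 15, so the exit status is 0
    mov     eax, 1          ; system call number (sys_exit)
    int     0x80            ; call kernel
```

The one behavioral difference worth remembering is that LOOP leaves the flags alone while DEC changes them.
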
As of Sep 6, 2014, my very sloppy attempt to count the mnemonics listed on https://en.wikipedia.org/wiki/X86_instruction_listings (excluding undocumented ones) yields a count of 678 mnemonics (AAA,AAD,AAM,AAS,ADC,ADD,ADDPD,ADDPS,ADDSD,ADDSS,ADDSUBPD,ADDSUBPS,AESDEC,AESDECLAST,AESENC,AESENCLAST,AESIMC,AESKEYGENASSIST,AND,ANDNPD,ANDNPS,ANDPD,ANDPS,ARPL,BLENDPD,BLENDPS,BLENDVPD,BLENDVPS,BOUND,BSF,BSR,BSWAP,BT,BTC,BTR,BTS,CALL,CBW,CDQ,CDQE,CLC,CLD,CLFLUSH,CLGI,CLI,CLTS,CMC,CMOVA,CMOVAE,CMOVB,CMOVBE,CMOVC,CMOVE,CMOVG,CMOVGE,CMOVL,CMOVLE,CMOVNA,CMOVNAE,CMOVNB,CMOVNBE,CMOVNC,CMOVNE,CMOVNG,CMOVNGE,CMOVNL,CMOVNLE,CMOVNO,CMOVNP,CMOVNS,CMOVNZ,CMOVO,CMOVP,CMOVPE,CMOVPO,CMOVS,CMOVZ,CMP,CMPPD,CMPPS,CMPSB,CMPSD,CMPSD*,CMPSQ,CMPSS,CMPSW,CMPXCHG,CMPXCHG16B,CMPXCHG8B,COMISD,COMISS,CPUID,CQO,CRC32,CVTDQ2PD,CVTDQ2PS,CVTPD2DQ,CVTPD2PI,CVTPD2PS,CVTPI2PD,CVTPI2PS,CVTPS2DQ,CVTPS2PD,CVTPS2PI,CVTSD2SI,CVTSD2SS,CVTSI2SD,CVTSI2SS,CVTSS2SD,CVTSS2SI,CVTTPD2DQ,CVTTPD2PI,CVTTPS2DQ,CVTTPS2PI,CVTTSD2SI,CVTTSS2SI,CWD,CWDE,DAA,DAS,DEC,DIV,DIVPD,DIVPS,DIVSD,DIVSS,DPPD,DPPS,EMMS,ENTER,ESC,EXTRACTPS,EXTRQ,F2XM1,FABS,FADD,FADDP,FBLD,FBSTP,FCHS,FCLEX,FCMOVB,FCMOVBE,FCMOVE,FCMOVNB,FCMOVNBE,FCMOVNE,FCMOVNU,FCMOVU,FCOM,FCOMI,FCOMIP,FCOMP,FCOMPP,FCOS,FDECSTP,FDISI,FDIV,FDIVP,FDIVR,FDIVRP,FEMMS,FENI,FFREE,FIADD,FICOM,FICOMP,FIDIV,FIDIVR,FILD,FIMUL,FINCSTP,FINIT,FIST,FISTP,FISTTP,FISUB,FISUBR,FLD,FLD1,FLDCW,FLDENV,FLDENVD,FLDENVW,FLDL2E,FLDL2T,FLDLG2,FLDLN2,FLDPI,FLDZ,FMUL,FMULP,FNCLEX,FNDISI,FNENI,FNINIT,FNOP,FNSAVE,FNSAVEW,FNSTCW,FNSTENV,FNSTENVW,FNSTSW,FPATAN,FPREM,FPREM1,FPTAN,FRNDINT,FRSTOR,FRSTORD,FRSTORW,FSAVE,FSAVED,FSAVEW,FSCALE,FSETPM,FSIN,FSINCOS,FSQRT,FST,FSTCW,FSTENV,FSTENVD,FSTENVW,FSTP,FSTSW,FSUB,FSUBP,FSUBR,FSUBRP,FTST,FUCOM,FUCOMI,FUCOMIP,FUCOMP,FUCOMPP,FWAIT,FXAM,FXCH,FXRSTOR,FXSAVE,FXTRACT,FYL2X,FYL2XP1,HADDPD,HADDPS,HLT,HSUBPD,HSUBPS,IDIV,IMUL,IN,INC,INS,INSD,INSERTPS,INSERTQ,INT,INTO,INVD,INVLPG,INVLPGA,IRET,IRETQ,IRETx,Jcc,JCXZ,JECXZ,JMP,JRCXZ,LAHF,LAR,LDDQU,LDMXCSR,LDS,LEA,LEAVE,LES,LFENCE
,LFS,LGDT,LIDT,LLDT,LMSW,LOADALL,LOCK,LODSB,LODSD,LODSQ,LODSW,LOOP,LOOPD,LOOPW,LSL,LSS,LTR,LZCNT,MASKMOVDQU,MASKMOVQ,MAXPD,MAXPS,MAXSD,MAXSS,MFENCE,MINPD,MINPS,MINSD,MINSS,MONITOR,MOV,MOV(,MOVAPD,MOVAPS,MOVD,MOVDDUP,MOVDQ2Q,MOVDQA,MOVDQU,MOVHLPS,MOVHPD,MOVHPS,MOVLHPS,MOVLPD,MOVLPS,MOVMSKPD,MOVMSKPS,MOVNTDQ,MOVNTDQA,MOVNTI,MOVNTPD,MOVNTPS,MOVNTQ,MOVNTSD,MOVNTSS,MOVQ,MOVQ2DQ,MOVSB,MOVSD,MOVSD*,MOVSHDUP,MOVSLDUP,MOVSS,MOVSW,MOVSX,MOVSXD,MOVUPD,MOVUPS,MOVZX,MPSADBW,MUL,MULPD,MULPS,MULSD,MULSS,MWAIT,NEG,NOP,NOT,OR,ORPD,ORPS,OUT,OUTS,OUTSD,PABSB,PABSD,PABSW,PACKSSDW,PACKSSWB,PACKUSDW,PACKUSWB,PADDB,PADDD,PADDQ,PADDSB,PADDSW,PADDUSB,PADDUSW,PADDW,PALIGNR,PAND,PANDN,PAUSE,PAVGB,PAVGUSB,PAVGW,PBLENDVB,PBLENDW,PCMPEQB,PCMPEQD,PCMPEQQ,PCMPEQW,PCMPESTRI,PCMPESTRM,PCMPGTB,PCMPGTD,PCMPGTQ,PCMPGTW,PCMPISTRI,PCMPISTRM,PEXTRB,PEXTRD,PEXTRQ,PEXTRW,PF2ID,PF2IW,PFACC,PFADD,PFCMPEQ,PFCMPGE,PFCMPGT,PFMAX,PFMIN,PFMUL,PFNACC,PFPNACC,PFRCP,PFRCPIT1,PFRCPIT2,PFRCPV,PFRSQIT1,PFRSQRT,PFRSQRTV,PFSUB,PFSUBR,PHADDD,PHADDSW,PHADDW,PHMINPOSUW,PHSUBD,PHSUBSW,PHSUBW,PI2FD,PI2FW,PINSRB,PINSRD/PINSRQ,PINSRW,PMADDUBSW,PMADDWD,PMAXSB,PMAXSD,PMAXSW,PMAXUB,PMAXUD,PMAXUW,PMINSB,PMINSD,PMINSW,PMINUB,PMINUD,PMINUW,PMOVMSKB,PMOVSXBD,PMOVSXBQ,PMOVSXBW,PMOVSXDQ,PMOVSXWD,PMOVSXWQ,PMOVZXBD,PMOVZXBQ,PMOVZXBW,PMOVZXDQ,PMOVZXWD,PMOVZXWQ,PMULDQ,PMULHRSW,PMULHRW,PMULHUW,PMULHW,PMULLD,PMULLW,PMULUDQ,POP,POPA,POPAD,POPCNT,POPF,POPFD,POPFQ,POR,PREFETCH,PREFETCH0,PREFETCH1,PREFETCH2,PREFETCHNTA,PREFETCHW,PSADBW,PSHUFB,PSHUFD,PSHUFHW,PSHUFLW,PSHUFW,PSIGNB,PSIGND,PSIGNW,PSLLD,PSLLDQ,PSLLQ,PSLLW,PSRAD,PSRAW,PSRLD,PSRLDQ,PSRLQ,PSRLW,PSUBB,PSUBD,PSUBQ,PSUBSB,PSUBSW,PSUBUSB,PSUBUSW,PSUBW,PSWAPD,PTEST,PUNPCKHBW,PUNPCKHDQ,PUNPCKHQDQ,PUNPCKHWD,PUNPCKLBW,PUNPCKLDQ,PUNPCKLQDQ,PUNPCKLWD,PUSH,PUSHA,PUSHAD,PUSHF,PUSHFD,PUSHFQ,PXOR,RCL,RCPPS,RCPSS,RCR,RDMSR,RDPMC,RDTSC,RDTSCP,REPxx,RET,RETF,RETN,ROL,ROR,ROUNDPD,ROUNDPS,ROUNDSD,ROUNDSS,RSM,RSQRTPS,RSQRTSS,SAHF,SAL,SAR,SBB,SCASB,SCASD,SCASQ,SCASW,SETA,SETAE,SETB,SETBE,SETC,SETE,SETG,SETGE,
SETL,SETLE,SETNA,SETNAE,SETNB,SETNBE,SETNC,SETNE,SETNG,SETNGE,SETNL,SETNLE,SETNO,SETNP,SETNS,SETNZ,SETO,SETP,SETPE,SETPO,SETS,SETZ,SFENCE,SGDT,SHL,SHLD,SHR,SHRD,SHUFPD,SHUFPS,SIDT,SKINIT,SLDT,SMSW,SQRTPD,SQRTPS,SQRTSD,SQRTSS,STC,STD,STGI,STI,STMXCSR,STOSB,STOSD,STOSQ,STOSW,STR,SUB,SUBPD,SUBPS,SUBSD,SUBSS,SWAPGS,SYSCALL,SYSENTER,SYSEXIT,SYSRET,TEST,UCOMISD,UCOMISS,UD2,UNPCKHPD,UNPCKHPS,UNPCKLPD,UNPCKLPS,VERR,VERW,VFMADDPD,VFMADDPS,VFMADDSD,VFMADDSS,VFMADDSUBPD,VFMADDSUBPS,VFMSUBADDPD,VFMSUBADDPS,VFMSUBPD,VFMSUBPS,VFMSUBSD,VFMSUBSS,VFNMADDPD,VFNMADDPS,VFNMADDSD,VFNMADDSS,VFNMSUBPD,VFNMSUBPS,VFNMSUBSD,VFNMSUBSS,VMCALL,VMCLEAR,VMLAUNCH,VMLOAD,VMMCALL,VMPTRLD,VMPTRST,VMREAD,VMRESUME,VMRUN,VMSAVE,VMWRITE,VMXOFF,VMXON,WAIT,WBINVD,WRMSR,XADD,XCHG,XLAT,XOR,XORPD,XORPS).

https://news.ycombinator.com/item?id=12354445 counts: "As for how many x86 instructions, there are 981 unique mnemonics and 3,684 variants (per https://stefanheule.com/papers/pldi16-strata.pdf). Note that some mnemonics mask several instructions--mov is particularly bad about that. I don't know if those counts are considered only up to AVX-2 or if they extend to the AVX-512 instruction set as well."
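To make the remark about mov concrete, here is a minimal Python sketch (not taken from the linked paper or comment) that produces three different machine encodings, all written as "mov" in assembly. The helper names are hypothetical; the opcode bytes (89 /r, 8B /r, B8+rd) are the standard 32-bit encodings. Note that "mov eax, ebx" alone has two valid encodings.

```python
# Register numbers used in the ModRM byte and in "+rd" opcodes.
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}


def modrm_reg_reg(reg_field, rm_field):
    """Build a ModRM byte with mod=11 (both operands are registers)."""
    return 0xC0 | (reg_field << 3) | rm_field


def mov_store_form(dst, src):
    """Opcode 89 /r: MOV r/m32, r32 (source goes in the reg field)."""
    return bytes([0x89, modrm_reg_reg(REG32[src], REG32[dst])])


def mov_load_form(dst, src):
    """Opcode 8B /r: MOV r32, r/m32 (destination goes in the reg field)."""
    return bytes([0x8B, modrm_reg_reg(REG32[dst], REG32[src])])


def mov_imm32(dst, value):
    """Opcode B8+rd: MOV r32, imm32 (immediate stored little-endian)."""
    return bytes([0xB8 + REG32[dst]]) + (value & 0xFFFFFFFF).to_bytes(4, "little")


if __name__ == "__main__":
    print(mov_store_form("eax", "ebx").hex())  # 89d8
    print(mov_load_form("eax", "ebx").hex())   # 8bc3
    print(mov_imm32("eax", 0xF00D).hex())      # b80df00000
```

Further forms of mov (byte-sized operands, segment, control and debug registers, absolute memory offsets) have their own opcodes again, which is one reason mnemonic counts and instruction counts diverge so much.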

x86 examples

x86 Hello World

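On an x86-64 GNU/Linux system (the code below uses the 32-bit int 0x80 system-call interface, which x86-64 Linux kernels generally still accept from 64-bit programs):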

Create a file hello.asm with the following content:

section     .text
global      _start                              ;must be declared for linker (ld)

_start:                                         ;tell linker entry point

    mov     edx,len                             ;message length
    mov     ecx,msg                             ;message to write
    mov     ebx,1                               ;file descriptor (stdout)
    mov     eax,4                               ;system call number (sys_write)
    int     0x80                                ;call kernel

    xor     ebx,ebx                             ;exit status 0 (ebx still held 1 from the write call)
    mov     eax,1                               ;system call number (sys_exit)
    int     0x80                                ;call kernel

section     .data

msg     db  'Hello, world!',0xa                 ;our dear string
len     equ $ - msg                             ;length of our dear string

Now do:

nasm -f elf64 hello.asm
ld -s -o hello hello.o
./hello

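To run with a debugger, for example ddd (a GUI front end to gdb), pass the '-g' flag to nasm to include debugging info, and select the DWARF debug format with '-F dwarf' (capital F; lowercase '-f' selects the output format):

nasm -g -F dwarf -f elf64 hello.asm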
ld -o hello hello.o
ddd hello

Adapted from [1] and [2].

x86 instruction set encoding

x86 variants

gem5 x86 microop ISA

gem5 simulator's x86 microop ISA

All of these except for the unload instructions have immediate variants; these immediate variants are not listed here:

The rest don't have immediate variants:

Out of the above, the following can be predicated:

the book x86-64 Assembly Language Programming with Ubuntu

The subset of instructions discussed in the book is (from [3] Appendix B):

Position independent code

misc/todo

" pcwalton:

> Instruction density is very important especially since caches are big and consume a lot of power too.

Which hurts x86 a lot, because x86-64 is very space inefficient for a variable length ISA. The REX prefixes add up to make x86-64 just as space-inefficient as AArch64.

" -- https://news.ycombinator.com/item?id=19329161
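As a rough illustration of the REX overhead mentioned in the quote, the following Python sketch encodes a register-to-register ADD (opcode 01 /r) in 64-bit mode. The function name and its register-number interface are assumptions made for this example. A REX prefix byte is emitted whenever the operands are 64 bits wide or either register is one of r8..r15, and it is omitted otherwise.

```python
def encode_add(dst, src, wide):
    """Encode ADD dst, src (opcode 01 /r) for 64-bit mode.

    dst and src are register numbers 0..15 (0 = rax/eax, 3 = rbx/ebx,
    8 = r8/r8d, ...). wide=True selects 64-bit operands via REX.W.
    """
    rex = 0x40
    if wide:
        rex |= 0x08          # REX.W: 64-bit operand size
    if src >= 8:
        rex |= 0x04          # REX.R: extends the ModRM reg field
    if dst >= 8:
        rex |= 0x01          # REX.B: extends the ModRM rm field
    modrm = 0xC0 | ((src & 7) << 3) | (dst & 7)
    prefix = bytes([rex]) if rex != 0x40 else b""
    return prefix + bytes([0x01, modrm])


if __name__ == "__main__":
    print(encode_add(0, 3, wide=False).hex())  # add eax, ebx  -> 01d8
    print(encode_add(0, 3, wide=True).hex())   # add rax, rbx  -> 4801d8
    print(encode_add(8, 9, wide=False).hex())  # add r8d, r9d  -> 4501c8
    print(encode_add(8, 9, wide=True).hex())   # add r8, r9    -> 4d01c8
```

The two-byte 32-bit form grows to three bytes as soon as 64-bit operands or the extended registers are involved; AArch64 instructions are a fixed four bytes each.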

" snvzz:

>CISC vs RISC is almost completely irrelevant these days.

This nonsense keeps coming up. No, it's not irrelevant. It matters. A lot.

A CISC design is complex, but it doesn't stop there. This complexity spreads down the chain. Implementations get complex, bugs happen. Making formal proofs of an implementation's correctness becomes impossible. Writing a compiler back end will be complex. Debugging it will be complex. Writing a proof that the machine code meets both the ISA specification and implements the same thing the higher level language does is also complex.

Now, where's the advantage of CISC to justify this complexity? Yeah, right.

" -- https://news.ycombinator.com/item?id=19329469

x86 Calling conventions

https://c9x.me/compile/bib/abi-x64.pdf

Opinions

Retrospectives

On AVX-512 (according to [5]):

"First thing I did last time was write the compiler. Then we designed the architecture. Any "neat features" the compiler couldn't use - thrown away. Went through 4-5 major architectures that way before converging on something usable." -- Tom Forsyth

Tools

256LOL

256LOL -- x86 subset assembler in 256 lines of code

The subset is:

"
    Instruction                 Example            Description of the Example
    nop                         nop                No operation (do nothing)

    -- Data Movement --
    mov register, immediate     mov eax, 0F00Dh    Place the value F00D (hexadecimal) in EAX
    mov register, register      mov eax, ebx       Copy the value from the EBX register into EAX
    mov register, [register]    mov eax, [ebx]     Treat EBX as pointer; load 32-bit value from memory into EAX
    mov [register], register    mov [eax], ebx     Treat EAX as pointer; store 32-bit value from EBX in memory

    -- Arithmetic --
    add register, register      add eax, ebx       EAX = EAX + EBX
    cdq                         cdq                Sign-extend EAX into EDX in preparation for idiv
    dec register                dec eax            EAX = EAX - 1
    div register                div ebx            Unsigned division: EDX:EAX ÷ EBX, setting EAX = quotient, EDX = remainder
    idiv register               idiv ebx           Signed division: EDX:EAX ÷ EBX, setting EAX = quotient, EDX = remainder
    imul register               imul ebx           Signed multiplication: EDX:EAX = EAX × EBX
    inc register                inc eax            EAX = EAX + 1
    neg register                neg eax            EAX = -EAX
    mul register                mul ebx            Unsigned multiplication: EDX:EAX = EAX × EBX
    sub register, register      sub eax, ebx       EAX = EAX - EBX

    -- Bitwise Operations --
    and register, register      and eax, ebx       EAX = EAX & EBX
    not register                not eax            EAX = ~EAX
    or register, register       or eax, ebx        EAX = EAX | EBX
    sar register, immediate     sar eax, 2         Shift EAX right by 2 bits (sign-fill)
    sar register, cl            sar eax, cl        Shift EAX right by CL bits (sign-fill)
    shl register, immediate     shl eax, 2         Shift EAX left by 2 bits
    shl register, cl            shl eax, cl        Shift EAX left by number of bits in CL
    shr register, immediate     shr eax, 2         Shift EAX right by 2 bits (zero-fill)
    shr register, cl            shr eax, cl        Shift EAX right by CL bits (zero-fill)
    xor register, register      xor eax, ebx       EAX = EAX ^ EBX

    -- Comparison --
    cmp register, register      cmp eax, ebx       Compare EAX to EBX, setting flags for conditional jump

    -- Control Flow --
    jmp bytes                   jmp -10            Jump -10 bytes, i.e., move EIP backward by 10 bytes
    ja bytes                    ja -10             Jump if above (>, unsigned)
    jae bytes                   jae -10            Jump if above or equal (>=, unsigned)
    jb bytes                    jb -10             Jump if below (<, unsigned)
    jbe bytes                   jbe -10            Jump if below or equal (<=, unsigned)
    je bytes                    je -10             Jump if equal
    jg bytes                    jg -10             Jump if greater (>, signed)
    jge bytes                   jge -10            Jump if greater or equal (>=, signed)
    jl bytes                    jl -10             Jump if less (<, signed)
    jle bytes                   jle -10            Jump if less or equal (<=, signed)
    jne bytes                   jne -10            Jump if not equal

    -- Function Calls --
    call register               call eax           Call function at pointer stored in EAX
    push register               push eax           Push value of EAX onto the stack
    pop register                pop eax            Pop a value from the stack into EAX
    ret immediate               ret 4              Return from function, removing 4 bytes of stack arguments
" -- [6]
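To give a feel for how little machinery such a subset assembler needs, here is a minimal Python sketch that assembles a handful of the instructions listed above into 32-bit machine code. It is not the 256LOL implementation; the function names and the parsing approach are assumptions made for this illustration, and only register and immediate operands are handled (no memory operands, no jumps).

```python
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

# Opcodes of the "op r/m32, r32" forms used for register-to-register operands.
REG_REG_OPCODES = {"add": 0x01, "sub": 0x29, "mov": 0x89}


def parse_imm(text):
    """Parse an immediate such as '4', '0x10' or the assembler style '0F00Dh'."""
    if text.endswith("h"):
        return int(text[:-1], 16)
    return int(text, 0)


def assemble(line):
    """Assemble one instruction of the tiny subset into bytes (32-bit mode)."""
    mnemonic, _, rest = line.strip().lower().partition(" ")
    ops = [op.strip() for op in rest.split(",")] if rest.strip() else []

    if mnemonic == "nop":
        return b"\x90"
    if mnemonic == "ret":
        if not ops:
            return b"\xc3"                      # plain near return
        return b"\xc2" + parse_imm(ops[0]).to_bytes(2, "little")
    if mnemonic == "push":
        return bytes([0x50 + REG32[ops[0]]])
    if mnemonic == "pop":
        return bytes([0x58 + REG32[ops[0]]])
    if mnemonic in REG_REG_OPCODES and len(ops) == 2 and ops[1] in REG32:
        # ModRM with mod=11: reg field = source, rm field = destination.
        modrm = 0xC0 | (REG32[ops[1]] << 3) | REG32[ops[0]]
        return bytes([REG_REG_OPCODES[mnemonic], modrm])
    if mnemonic == "mov" and len(ops) == 2:
        # mov register, immediate: opcode B8+rd followed by a 32-bit value.
        value = parse_imm(ops[1]) & 0xFFFFFFFF
        return bytes([0xB8 + REG32[ops[0]]]) + value.to_bytes(4, "little")
    raise ValueError("unsupported instruction: " + line)


if __name__ == "__main__":
    for source in ["mov eax, 0F00Dh", "mov ebx, eax", "add eax, ebx",
                   "push eax", "pop ebx", "ret 4"]:
        print(source.ljust(18), assemble(source).hex())
```

Most of the register-only instructions in the table follow the same opcode-plus-ModRM pattern, so extending the sketch is mainly a matter of adding opcode table entries; the relative jumps would additionally need their displacements encoded.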

Opinions and comparisons

x86 Links