.topic hcContents=1640

  80x86 integer opcode help
 

Welcome to the NASM-IDE 1.7 online help. This section contains details on the
Intel 80x86 instruction set, covering instructions supported by the 8086 up
to the Pentium(tm) II. This help does not include instructions relating to
3DNow! or SSE. Please refer to the NASM documentation for information on
these instructions.


  Contents
 

  {Using the opcode listings:UsingOpcodeHelp}

  {Alphabetical opcode listing:AlphaList}

  {Opcode listing by minimum processor requirement:ProcList}


.topic UsingOpcodeHelp=20000

  Using 80x86 integer opcode help
 

This help file contains entries for all 80x86 instructions. Each entry
contains the following sections:

  Description

This contains a detailed description of how the instruction works. This
information is based on the Intel Architecture Software Developer's Manual
Volume 2, Instruction Set Reference (#243191), available from Intel's web
site (http://www.intel.com/).

  Flags affected

Describes any changes to the EFLAGS register as a result of the instruction
being executed.

  Instruction size and timings

Contains a table showing the size of an instruction in bytes and the number
of clock cycles the instruction takes to execute. Timings are shown for the
8086 up to the Pentium processor. Instruction pairing information is also
included for the Pentium timings. This information is taken from an HTML
document available from http://www.quantasm.com/.

The following symbols are used in the timing tables:

 Operands
 

 acc   = AL, AX or EAX unless specified otherwise
 reg   = any general register
 r8    = any 8-bit register
 r16   = any general purpose 16-bit register
 r32   = any general purpose 32-bit register
 imm   = immediate data
 imm8  = 8-bit immediate data
 imm16 = 16-bit immediate data
 mem   = memory address
 mem8  = address of 8-bit data item
 mem16 = address of 16-bit data item
 mem32 = address of 32-bit data item
 mem48 = address of 48-bit data item
 dest  = 16/32-bit destination
 short = 8-bit destination

 Instruction timings
 

 n  -  generally refers to a number of repeated counts
 m  -  in a jump or call;
          286     : bytes in next instruction
          386/486 : number of components
                    (each byte of opcode) + 1 (if immediate data)
                    + 1 (if displacement)

 EA = cycles to calculate the Effective Address
          8088/8086: base                             = 5
                     BP + DI or BX + SI               = 7
                     BP + DI + disp or BX + SI + disp = 11

                     index                            = 5
                     BX + DI or BP + SI               = 8
                     BX + DI + disp or BP + SI + disp = 12

                     disp                             = 6
                     segment override                 = +2

          286 - 486: base + index + disp              = +1
                     all others, no penalty
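
The 8088/8086 part of the EA table above can be read as a small lookup. The
following Python sketch is only an illustration of how the figures combine;
the function name ea_cycles_8086 and its parameters are hypothetical, and
addressing forms that the table does not list are rejected rather than guessed.

```python
# EA cycle costs for the 8088/8086, copied from the table above.
PAIR_CYCLES = {
    ("BP", "DI"): 7, ("BX", "SI"): 7,
    ("BX", "DI"): 8, ("BP", "SI"): 8,
}
SINGLE_REGS = {"BX", "BP", "SI", "DI"}  # base only or index only = 5


def ea_cycles_8086(regs=(), disp=False, segment_override=False):
    """Return the EA cycle count for one addressing form listed in the table."""
    regs = tuple(sorted(r.upper() for r in regs))
    if len(regs) == 0 and disp:
        cycles = 6                       # displacement only
    elif len(regs) == 1 and regs[0] in SINGLE_REGS and not disp:
        cycles = 5                       # base only or index only
    elif regs in PAIR_CYCLES:
        # base + index, plus 4 more cycles when a displacement is present
        cycles = PAIR_CYCLES[regs] + (4 if disp else 0)
    else:
        raise ValueError("addressing form not listed in the table")
    return cycles + (2 if segment_override else 0)


print(ea_cycles_8086(("bx", "si"), disp=True))
```

As a usage example, a timing entry of 24+EA with a [bx+si+disp] operand works
out to 24 + 11 = 35 cycles.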


 Instruction length
 

The byte count includes the opcode length and length of any required
displacement or immediate data. If the displacement is optional, it
is shown as d() with the possible lengths in parentheses. If the
immediate data is optional, it is shown as i() with the possible
lengths in parentheses.
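
To make the d() and i() notation concrete, the sketch below expands a byte
count entry into every total length it allows. It is a minimal illustration in
Python, not part of NASM-IDE; the helper name possible_lengths is hypothetical
and it assumes entries are written exactly as in the timing tables.

```python
import re
from itertools import product


def possible_lengths(spec):
    """Expand an entry such as '2+d(0,2)+i(1,2)' into its possible byte totals."""
    spec = spec.replace(" ", "")
    choices = []
    for term in re.findall(r"[di]\([\d,]+\)|\d+", spec):
        if term[0] in "di":
            # optional displacement or immediate: one length is chosen
            choices.append([int(n) for n in term[2:-1].split(",")])
        else:
            # fixed opcode bytes
            choices.append([int(term)])
    return sorted({sum(combo) for combo in product(*choices)})


print(possible_lengths("2+d(0,2)+i(1,2)"))
```

For the entry 2+d(0,2) this gives 2 or 4 bytes; for 2+d(0,2)+i(1,2) it gives
3, 4, 5 or 6 bytes.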


 Pairing categories for Pentium
 

 NP = not pairable
 UV = pairable in the U pipe or V pipe
 PU = pairable in the U pipe only
 PV = pairable in the V pipe only


  Example

This section contains an example of how the instruction can be used.


 { Back to contents screen:hcContents}


.topic AlphaList

  80x86 integer opcodes (alphabetical)
 

 {AAA}                           - ASCII adjust after addition
 {AAD}                           - ASCII adjust AX before division
 {AAM}                           - ASCII adjust AX after multiplication
 {AAS}                           - ASCII adjust AL after subtraction
 {ADC}                           - Add with carry
 {ADD}                           - Integer addition
 {AND}                           - Logical AND
 {ARPL}                          - Adjust Requested Privilege Level of selector (286+)
 {BOUND}                         - Array index bound check (186+)
 {BSF}                           - Bit scan forward (386+)
 {BSR}                           - Bit scan reverse (386+)
 {BSWAP}                         - Byte swap (486+)
 {BT}                            - Bit test (386+)
 {BTC}                           - Bit test and complement (386+)
 {BTR}                           - Bit test and reset (386+)
 {BTS}                           - Bit test and set (386+)
 {CALL}                          - Call subroutine
 {CBW}                           - Convert byte to word
 {CDQ}                           - Convert double to quad (386+)
 {CLC}                           - Clear carry
 {CLD}                           - Clear direction flag
 {CLI}                           - Clear interrupt flag
 {CLTS}                          - Clear task switched flag (286+)
 {CMC}                           - Complement carry flag
 {CMOVcc}                        - Conditional move (Pentium Pro+)
 {CMP}                           - Compare
 {CMPS}                          - Compare string (byte, word or doubleword)
 {CMPXCHG}                       - Compare and exchange (486+)
 {CMPXCHG8B}                     - Compare and exchange 8 bytes (Pentium+)
 {CPUID}                         - CPU identification (486+)
 {CWD}                           - Convert word to doubleword
 {CWDE}                          - Convert word to extended doubleword (386+)
 {DAA}                           - Decimal adjust AL after addition
 {DAS}                           - Decimal adjust AL after subtraction
 {DEC}                           - Decrement
 {DIV}                           - Unsigned divide
 {EMMS}                          - Empty MMX state (MMX)
 {ENTER}                         - Make stack frame for procedure parameters (186+)
 {ESC}                           - Escape
 {HLT}                           - Halt CPU
 {IDIV}                          - Signed integer division
 {IMUL}                          - Signed multiply
 {IN}                            - Input byte or word from port
 {INC}                           - Increment
 {INS}                           - Input string from port (byte, word or doubleword) (186+)
 {INT}                           - Interrupt
 {INTO}                          - Interrupt on overflow
 {INVD}                          - Invalidate data cache (486+)
 {INVLPG}                        - Invalidate translation look-aside buffer (TLB) entry (486+)
 {IRET/IRETD:IRET}                    - Interrupt return (IRETD 386+)
 {Jcc}                           - Jump on condition code
 {JMP}                           - Unconditional jump
 {LAHF}                          - Load register flags into AH
 {LAR}                           - Load access rights (286+)
 {LDS}                           - Load far pointer
 {LEA}                           - Load effective address
 {LEAVE}                         - Restore stack for procedure exit (186+)
 {LES:LDS}                           - Load far pointer
 {LFS:LDS}                           - Load far pointer (386+)
 {LGDT}                          - Load Global Descriptor Table (286+)
 {LGS:LDS}                           - Load far pointer (386+)
 {LIDT:LGDT}                          - Load Interrupt Descriptor Table (286+)
 {LLDT}                          - Load Local Descriptor Table (286+)
 {LMSW}                          - Load Machine Status Word (286+)
 {LOCK}                          - Lock bus
 {LODS}                          - Load string (byte, word or doubleword)
 {LOOP}                          - Decrement CX and loop if CX not zero
 {LOOPE/LOOPZ:LOOPE}                   - Loop while equal / loop while zero
 {LOOPNZ/LOOPNE:LOOPNZ}                 - Loop while not zero / loop while not equal
 {LSL}                           - Load segment limit (286+)
 {LSS:LDS}                           - Load far pointer (386+)
 {LTR}                           - Load task register (286+)
 {MOV}                           - Move data
 {MOVD}                          - Move 32 bits (doubleword) (MMX)
 {MOVQ}                          - Move 64 bits (quadword) (MMX)
 {MOVS}                          - Move string (byte, word or doubleword)
 {MOVSX}                         - Move with sign extend (386+)
 {MOVZX}                         - Move with zero extend (386+)
 {MUL}                           - Unsigned multiply
 {NEG}                           - Two's complement negation
 {NOP}                           - No operation
 {NOT}                           - One's complement negation (Logical NOT)
 {OR}                            - Inclusive logical OR
 {OUT}                           - Output data to port
 {OUTS}                          - Output string to port (byte, word or doubleword) (186+)
 {PACKSSWB/PACKSSDW:PACKSSWB}             - Pack with signed saturation (MMX)
 {PACKUSWB}                      - Pack with unsigned saturation (MMX)
 {PADDB/PADDW/PADDD:PADDB}             - Packed add (MMX)
 {PADDSB/PADDSW:PADDSB}                 - Packed add with saturation (MMX)
 {PADDUSB/PADDUSW:PADDUSB}               - Packed add unsigned with saturation (MMX)
 {PAND}                          - Logical AND (MMX)
 {PANDN}                         - Logical AND NOT (MMX)
 {PCMPEQB/PCMPEQW/PCMPEQD:PCMPEQB}       - Packed compare for equal (MMX)
 {PCMPGTB/PCMPGTW/PCMPGTD:PCMPGTB}       - Packed compare for greater than (MMX)
 {PMADDWD}                       - Packed multiply and add (MMX)
 {PMULHW}                        - Packed multiply high (MMX)
 {PMULLW}                        - Packed multiply low (MMX)
 {POP}                           - Pop word off stack
 {POPA/POPAD:POPA}                    - Pop all registers off stack (186+)
 {POPF/POPFD:POPF}                    - Pop flags off stack
 {POR}                           - Bitwise logical OR (MMX)
 {PSLLW/PSLLD/PSLLQ:PSLLW}             - Packed shift left logical (MMX)
 {PSRAW/PSRAD:PSRAW}                   - Packed shift right arithmetic (MMX)
 {PSRLW/PSRLD/PSRLQ:PSRLW}             - Packed shift right logical (MMX)
 {PSUBB/PSUBW/PSUBD:PSUBB}             - Packed subtract (MMX)
 {PSUBSB/PSUBSW:PSUBSB}                 - Packed subtract with saturation (MMX)
 {PSUBUSB/PSUBUSW:PSUBUSB}               - Packed subtract unsigned with saturation (MMX)
 {PUNPCKHBW/PUNPCKHWD/PUNPCKHDQ:PUNPCKHBW} - Unpack high packed data (MMX)
 {PUNPCKLBW/PUNPCKLWD/PUNPCKLDQ:PUNPCKLBW} - Unpack low packed data (MMX)
 {PUSH}                          - Push word onto stack
 {PUSHA/PUSHAD:PUSHA}                  - Push all registers onto stack (186+)
 {PUSHF/PUSHFD:PUSHF}                  - Push flags onto stack
 {PXOR}                          - Logical exclusive OR (MMX)
 {RCL}                           - Rotate through carry left
 {RCR:RCL}                           - Rotate through carry right
 {RDMSR}                         - Read from Model Specific Register (Pentium+)
 {RDPMC}                         - Read Performance-Monitoring Counters (MMX/Pentium Pro+)
 {RDTSC}                         - Read Time-Stamp Counter (Pentium+)
 {REP}                           - Repeat string operation
 {REPE/REPZ:REP}                     - Repeat while equal / repeat while zero
 {REPNE/REPNZ:REP}                   - Repeat while not equal / repeat while not zero
 {RET/RETF:RET}                      - Return from procedure
 {ROL:RCL}                           - Rotate left
 {ROR:RCL}                           - Rotate right
 {RSM}                           - Resume from System Management Mode (Pentium+)
 {SAHF}                          - Store AH register into flags
 {SAL}                           - Shift arithmetic left
 {SAR:SAL}                           - Shift arithmetic right
 {SBB}                           - Subtract with borrow / carry
 {SCAS}                          - Scan string (byte, word or doubleword)
 {SETcc}                         - Set byte on condition (386+)
 {SGDT}                          - Store Global Descriptor Table (286+)
 {SIDT:SGDT}                          - Store Interrupt Descriptor Table (286+)
 {SHL:SAL}                           - Shift logical left
 {SHR:SAL}                           - Shift logical right
 {SHLD}                          - Double precision shift left (386+)
 {SHRD}                          - Double precision shift right (386+)
 {SLDT}                          - Store Local Descriptor Table (286+)
 {SMSW}                          - Store Machine Status Word (286+)
 {STC}                           - Set carry flag
 {STD}                           - Set direction flag
 {STI}                           - Set interrupt flag (enable interrupts)
 {STOS}                          - Store string (byte, word or doubleword)
 {STR}                           - Store task register (286+)
 {SUB}                           - Subtract
 {TEST}                          - Test for bit pattern (logical compare)
 {UD2}                           - Undefined instruction
 {VERR}                          - Verify read (286+)
 {VERW:VERR}                          - Verify write (286+)
 {WAIT}                          - Wait for coprocessor
 {WBINVD}                        - Write-back and invalidate data cache (486+)
 {WRMSR}                         - Write to Model Specific Register (Pentium+)
 {XADD}                          - Exchange and add (486+)
 {XCHG}                          - Exchange
 {XLAT/XLATB:XLAT}                    - Translate
 {XOR}                           - Exclusive OR

 { Back to contents screen:hcContents}


.topic ProcList

  80x86 integer opcodes (by processor)
 

  8086 and above

 {AAA}                           - ASCII adjust after addition
 {AAD}                           - ASCII adjust AX before division
 {AAM}                           - ASCII adjust AX after multiplication
 {AAS}                           - ASCII adjust AL after subtraction
 {ADC}                           - Add with carry
 {ADD}                           - Integer addition
 {AND}                           - Logical AND
 {CALL}                          - Call subroutine
 {CBW}                           - Convert byte to word
 {CLC}                           - Clear carry
 {CLD}                           - Clear direction flag
 {CLI}                           - Clear interrupt flag
 {CMC}                           - Complement carry flag
 {CMP}                           - Compare
 {CMPS}                          - Compare string (byte, word or doubleword)
 {CWD}                           - Convert word to doubleword
 {DAA}                           - Decimal adjust AL after addition
 {DAS}                           - Decimal adjust AL after subtraction
 {DEC}                           - Decrement
 {DIV}                           - Unsigned divide
 {ESC}                           - Escape
 {HLT}                           - Halt CPU
 {IDIV}                          - Signed integer division
 {IMUL}                          - Signed multiply
 {IN}                            - Input byte or word from port
 {INC}                           - Increment
 {INT}                           - Interrupt
 {INTO}                          - Interrupt on overflow
 {IRET}                          - Interrupt return
 {Jcc}                           - Jump on condition code
 {JMP}                           - Unconditional jump
 {LAHF}                          - Load register flags into AH
 {LDS}                           - Load far pointer
 {LEA}                           - Load effective address
 {LES:LDS}                           - Load far pointer
 {LOCK}                          - Lock bus
 {LODS}                          - Load string (byte, word or doubleword)
 {LOOP}                          - Decrement CX and loop if CX not zero
 {LOOPE/LOOPZ:LOOPE}                   - Loop while equal / loop while zero
 {LOOPNZ/LOOPNE:LOOPNZ}                 - Loop while not zero / loop while not equal
 {MOV}                           - Move data
 {MOVS}                          - Move string (byte, word or doubleword)
 {MUL}                           - Unsigned multiply
 {NEG}                           - Two's complement negation
 {NOP}                           - No operation
 {NOT}                           - One's complement negation (Logical NOT)
 {OR}                            - Inclusive logical OR
 {OUT}                           - Output data to port
 {POP}                           - Pop word off stack
 {POPF/POPFD:POPF}                    - Pop flags off stack
 {PUSH}                          - Push word onto stack
 {PUSHF/PUSHFD:PUSHF}                  - Push flags onto stack
 {RCL}                           - Rotate through carry left
 {RCR:RCL}                           - Rotate through carry right
 {REP}                           - Repeat string operation
 {REPE/REPZ:REP}                     - Repeat while equal / repeat while zero
 {REPNE/REPNZ:REP}                   - Repeat while not equal / repeat while not zero
 {RET/RETF:RET}                      - Return from procedure
 {ROL:RCL}                           - Rotate left
 {ROR:RCL}                           - Rotate right
 {SAHF}                          - Store AH register into flags
 {SAL}                           - Shift arithmetic left
 {SAR:SAL}                           - Shift arithmetic right
 {SBB}                           - Subtract with borrow / carry
 {SCAS}                          - Scan string (byte, word or doubleword)
 {SHL:SAL}                           - Shift logical left
 {SHR:SAL}                           - Shift logical right
 {STC}                           - Set carry flag
 {STD}                           - Set direction flag
 {STI}                           - Set interrupt flag (enable interrupts)
 {STOS}                          - Store string (byte, word or doubleword)
 {SUB}                           - Subtract
 {TEST}                          - Test for bit pattern (logical compare)
 {UD2}                           - Undefined instruction
 {WAIT}                          - Wait for coprocessor
 {XCHG}                          - Exchange
 {XLAT/XLATB:XLAT}                    - Translate
 {XOR}                           - Exclusive OR

  186 and above

 {BOUND}                         - Array index bound check (186+)
 {ENTER}                         - Make stack frame for procedure parameters (186+)
 {INS}                           - Input string from port (byte, word or doubleword) (186+)
 {LEAVE}                         - Restore stack for procedure exit (186+)
 {OUTS}                          - Output string to port (byte, word or doubleword) (186+)
 {POPA/POPAD:POPA}                    - Pop all registers off stack (186+)
 {PUSHA/PUSHAD:PUSHA}                  - Push all registers onto stack (186+)

  286 and above

 {ARPL}                          - Adjust Requested Privilege Level of selector (286+)
 {CLTS}                          - Clear task switched flag (286+)
 {LAR}                           - Load access rights (286+)
 {LGDT}                          - Load Global Descriptor Table (286+)
 {LIDT:LGDT}                          - Load Interrupt Descriptor Table (286+)
 {LLDT}                          - Load Local Descriptor Table (286+)
 {LMSW}                          - Load Machine Status Word (286+)
 {LSL}                           - Load segment limit (286+)
 {LTR}                           - Load task register (286+)
 {SGDT}                          - Store Global Descriptor Table (286+)
 {SIDT:SGDT}                          - Store Interrupt Descriptor Table (286+)
 {SLDT}                          - Store Local Descriptor Table (286+)
 {SMSW}                          - Store Machine Status Word (286+)
 {STR}                           - Store task register (286+)
 {VERR}                          - Verify read (286+)
 {VERW:VERR}                          - Verify write (286+)

  386 and above

 {BSF}                           - Bit scan forward (386+)
 {BSR}                           - Bit scan reverse (386+)
 {BT}                            - Bit test (386+)
 {BTC}                           - Bit test and complement (386+)
 {BTR}                           - Bit test and reset (386+)
 {BTS}                           - Bit test and set (386+)
 {CDQ}                           - Convert double to quad (386+)
 {CWDE}                          - Convert word to extended doubleword (386+)
 {IRETD:IRET}                         - Interrupt return (386+)
 {LFS:LDS}                           - Load far pointer (386+)
 {LGS:LDS}                           - Load far pointer (386+)
 {LSS:LDS}                           - Load far pointer (386+)
 {MOVSX}                         - Move with sign extend (386+)
 {MOVZX}                         - Move with zero extend (386+)
 {SETcc}                         - Set byte on condition (386+)
 {SHLD}                          - Double precision shift left (386+)
 {SHRD}                          - Double precision shift right (386+)

  486 and above

 {BSWAP}                         - Byte swap (486+)
 {CMPXCHG}                       - Compare and exchange (486+)
 {CPUID}                         - CPU identification (486+)
 {INVD}                          - Invalidate data cache (486+)
 {INVLPG}                        - Invalidate translation look-aside buffer (TLB) entry (486+)
 {WBINVD}                        - Write-back and invalidate data cache (486+)
 {XADD}                          - Exchange and add (486+)

  Pentium(tm) class and above

 {CMPXCHG8B}                     - Compare and exchange 8 bytes (Pentium+)
 {RDMSR}                         - Read from Model Specific Register (Pentium+)
 {RDTSC}                         - Read Time-Stamp Counter (Pentium+)
 {RSM}                           - Resume from System Management Mode (Pentium+)
 {WRMSR}                         - Write to Model Specific Register (Pentium+)

  Pentium Pro(tm) class and above

 {CMOVcc}                        - Conditional move (Pentium Pro+)
 {RDPMC}                         - Read Performance-Monitoring Counters (MMX/Pentium Pro+)

  Multimedia Extensions (MMX) opcodes

 {EMMS}                          - Empty MMX state (MMX)
 {MOVD}                          - Move 32 bits (doubleword) (MMX)
 {MOVQ}                          - Move 64 bits (quadword) (MMX)
 {PACKSSWB/PACKSSDW:PACKSSWB}             - Pack with signed saturation (MMX)
 {PACKUSWB}                      - Pack with unsigned saturation (MMX)
 {PADDB/PADDW/PADDD:PADDB}             - Packed add (MMX)
 {PADDSB/PADDSW:PADDSB}                 - Packed add with saturation (MMX)
 {PADDUSB/PADDUSW:PADDUSB}               - Packed add unsigned with saturation (MMX)
 {PAND}                          - Logical AND (MMX)
 {PANDN}                         - Logical AND NOT (MMX)
 {PCMPEQB/PCMPEQW/PCMPEQD:PCMPEQB}       - Packed compare for equal (MMX)
 {PCMPGTB/PCMPGTW/PCMPGTD:PCMPGTB}       - Packed compare for greater than (MMX)
 {PMADDWD}                       - Packed multiply and add (MMX)
 {PMULHW}                        - Packed multiply high (MMX)
 {PMULLW}                        - Packed multiply low (MMX)
 {POR}                           - Bitwise logical OR (MMX)
 {PSLLW/PSLLD/PSLLQ:PSLLW}             - Packed shift left logical (MMX)
 {PSRAW/PSRAD:PSRAW}                   - Packed shift right arithmetic (MMX)
 {PSRLW/PSRLD/PSRLQ:PSRLW}             - Packed shift right logical (MMX)
 {PSUBB/PSUBW/PSUBD:PSUBB}             - Packed subtract (MMX)
 {PSUBSB/PSUBSW:PSUBSB}                 - Packed subtract with saturation (MMX)
 {PSUBUSB/PSUBUSW:PSUBUSB}               - Packed subtract unsigned with saturation (MMX)
 {PUNPCKHBW/PUNPCKHWD/PUNPCKHDQ:PUNPCKHBW} - Unpack high packed data (MMX)
 {PUNPCKLBW/PUNPCKLWD/PUNPCKLDQ:PUNPCKLBW} - Unpack low packed data (MMX)
 {PXOR}                          - Logical exclusive OR (MMX)
 {RDPMC}                         - Read Performance-Monitoring Counters (MMX/Pentium Pro+)


 { Back to contents screen:hcContents}


.topic AAA

  AAA - ASCII adjust after addition
 

  Description
 

Adjusts the sum of two unpacked BCD values to create an unpacked BCD result.
The AL register is the implied source and destination operand for this instruction.
The AAA instruction is only useful when it follows an ADD instruction that adds
(binary addition) two unpacked BCD values and stores a byte result in the AL
register. The AAA instruction then adjusts the contents of the AL register
to contain the correct 1-digit unpacked BCD result.

If the addition produces a decimal carry, the AH register is incremented by 1,
and the CF and AF flags are set. If there was no decimal carry, the CF and AF
flags are cleared and the AH register is unchanged. In either case, bits 4
through 7 of the AL register are cleared to 0.
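
The adjustment can be modelled in a few lines. The Python sketch below is an
illustration only; the function name aaa is hypothetical, and it assumes the
usual test for a decimal carry (low nibble of AL greater than 9, or AF set by
the preceding ADD) and AL values produced by adding two valid unpacked BCD
digits.

```python
def aaa(al, ah, af=False):
    """Model AAA; return (al, ah, cf, af) after the adjustment."""
    if (al & 0x0F) > 9 or af:
        al = (al + 6) & 0xFF   # correct the low digit
        ah = (ah + 1) & 0xFF   # carry into the next decimal digit
        carry = True
    else:
        carry = False
    al &= 0x0F                 # bits 4 through 7 of AL are cleared
    return al, ah, carry, carry


# 9 + 8 = 11h in binary; ADD sets AF because of the carry out of bit 3.
print(aaa(0x11, 0x00, af=True))
```

With AL = 11h, AH = 0 and AF set, the model leaves AH = 1 and AL = 7, which
is the unpacked BCD form of 17.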


  Flags affected
 

The AF and CF flags are set to 1 if the adjustment results in a decimal carry;
otherwise they are cleared to 0. The OF, SF, ZF, and PF flags are undefined.


  Instruction size and timings
 

 bytes   8088    186     286     386     486     Pentium
  1       8       8       3       4       3       3   NP


  Example
 

 aaa       ; ASCII adjust after addition


 { Back to contents screen:hcContents}


.topic AAD

  AAD - ASCII adjust for division
 

  Description
 

Adjusts two unpacked BCD digits (the least-significant digit in the AL
register and the most-significant digit in the AH register) so that a
division operation performed on the result will yield a correct unpacked
BCD value. The AAD instruction is only useful when it precedes a DIV
instruction that divides (binary division) the adjusted value in the AX
register by an unpacked BCD value.

The AAD instruction sets the value in the AL register to (AL + (10 * AH)),
and then clears the AH register to 00H. The value in the AX register is then
equal to the binary equivalent of the original unpacked two-digit (base 10)
number in registers AH and AL.
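
The formula above can be checked with a short model. This is a minimal Python
sketch, with the hypothetical helper name aad; it only tracks AL and AH and
ignores the flags.

```python
def aad(al, ah):
    """Model AAD; return (al, ah) after the adjustment."""
    al = (al + 10 * ah) & 0xFF  # AL = AL + (10 * AH), kept to 8 bits
    return al, 0x00             # AH is cleared to 00H


# Unpacked BCD 73: AH = 7 (most-significant digit), AL = 3.
al, ah = aad(al=3, ah=7)
print(al, ah)
```

After the adjustment AX holds binary 73 (49h), so a following DIV by 8 would
give a quotient of 9 and a remainder of 1.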


  Flags affected
 

The SF, ZF, and PF flags are set according to the result; the OF, AF, and
CF flags are undefined.


  Instruction size and timings
 

 bytes   8088    186     286     386     486     Pentium
  2      60      15      14      19      14      10   NP


  Example
 

 aad       ; ASCII adjust for division


 { Back to contents screen:hcContents}


.topic AAM

  AAM - ASCII adjust for multiplication
 

  Description
 

Adjusts the result of the multiplication of two unpacked BCD values to create
a pair of unpacked (base 10) BCD values. The AX register is the implied source
and destination operand for this instruction. The AAM instruction is only
useful when it follows a MUL instruction that multiplies (binary
multiplication) two unpacked BCD values and stores a word result in the AX
register. The AAM instruction then adjusts the contents of the AX register to
contain the correct 2-digit unpacked (base 10) BCD result.
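
The description above does not spell out the arithmetic. In Intel's documented
operation, AAM divides AL by 10 and stores the quotient in AH and the
remainder in AL. The Python sketch below models that step; the function name
aam is hypothetical and the flags are ignored.

```python
def aam(al):
    """Model AAM; return (al, ah) after the adjustment."""
    ah = al // 10   # most-significant decimal digit
    al = al % 10    # least-significant decimal digit
    return al, ah


# 7 * 9 = 63 (3Fh) is left in AL by MUL.
print(aam(63))
```

For AL = 63 the model returns AL = 3 and AH = 6, the two unpacked BCD digits
of 63.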


  Flags affected
 

The SF, ZF, and PF flags are set according to the result. The OF, AF, and CF
flags are undefined.


  Instruction size and timings
 

 bytes   8088    186     286     386     486     Pentium
  2      83      19      16      17      15      18   NP


  Example
 

 aam       ; ASCII adjust for multiplication


 { Back to contents screen:hcContents}


.topic AAS

  AAS - ASCII adjust after subtraction
 

  Description
 

Adjusts the result of the subtraction of two unpacked BCD values to create an
unpacked BCD result. The AL register is the implied source and destination
operand for this instruction. The AAS instruction is only useful when it
follows a SUB instruction that subtracts (binary subtraction) one unpacked
BCD value from another and stores a byte result in the AL register. The AAS
instruction then adjusts the contents of the AL register to contain the
correct 1-digit unpacked BCD result.

If the subtraction produced a decimal borrow, the AH register is decremented
by 1, and the CF and AF flags are set. If no decimal borrow occurred, the CF
and AF flags are cleared, and the AH register is unchanged. In either case,
the AL register is left with its top nibble set to 0.
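
As a rough model of this adjustment, the Python sketch below mirrors the
steps described above. The function name aas is hypothetical, and it assumes
the usual test for a decimal borrow (low nibble of AL greater than 9, or AF
set by the preceding SUB).

```python
def aas(al, ah, af=False):
    """Model AAS; return (al, ah, cf, af) after the adjustment."""
    if (al & 0x0F) > 9 or af:
        al = (al - 6) & 0xFF   # correct the low digit
        ah = (ah - 1) & 0xFF   # borrow from the next decimal digit
        borrow = True
    else:
        borrow = False
    al &= 0x0F                 # top nibble of AL is cleared
    return al, ah, borrow, borrow


# 13 - 8 as unpacked BCD: AH = 1, AL = 3; SUB AL, 8 leaves AL = FBh with AF set.
print(aas(0xFB, 0x01, af=True))
```

Here the model returns AL = 5 and AH = 0, the unpacked BCD form of 5.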

  Flags affected
 

The AF and CF flags are set to 1 if there is a decimal borrow; otherwise,
they are cleared to 0.

The OF, SF, ZF, and PF flags are undefined.


  Instruction size and timings
 

 bytes   8088    186     286     386     486     Pentium
  1       8       7       3       4       3       3   NP


  Example
 

 aas       ; ASCII adjust after subtraction


 { Back to contents screen:hcContents}


.topic ADC

  ADC - Add with carry
 

  Description
 

Adds the destination operand (first operand), the source operand (second
operand), and the carry (CF) flag and stores the result in the destination
operand.

The destination operand can be a register or a memory location; the source
operand can be an immediate, a register, or a memory location. (However, two
memory operands cannot be used in one instruction.)

The state of the CF flag represents a carry from a previous addition. When an
immediate value is used as an operand, it is sign-extended to the length of
the destination operand format.

The ADC instruction does not distinguish between signed or unsigned operands.
Instead, the processor evaluates the result for both data types and sets the
OF and CF flags to indicate a carry in the signed or unsigned result,
respectively. The SF flag indicates the sign of the signed result.

The ADC instruction is usually executed as part of a multibyte or multiword
addition in which an ADD instruction is followed by an ADC instruction.
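
The ADD followed by ADC pattern can be sketched as follows. This Python model
is illustrative only: the names add16 and add32_via_adc are hypothetical, and
it assumes two 32-bit values added as a pair of 16-bit words.

```python
MASK16 = 0xFFFF


def add16(dst, src, carry_in=0):
    """Model a 16-bit ADD (carry_in=0) or ADC (carry_in=CF); return (result, cf)."""
    total = dst + src + carry_in
    return total & MASK16, total >> 16


def add32_via_adc(a, b):
    """Add two 32-bit values word by word; return (result, final cf)."""
    lo, cf = add16(a & MASK16, b & MASK16)   # ADD the low words
    hi, cf = add16(a >> 16, b >> 16, cf)     # ADC the high words plus CF
    return (hi << 16) | lo, cf


print(hex(add32_via_adc(0x0001FFFF, 0x00000001)[0]))
```

The carry out of the low word is what makes the high word come out as 0002h
in this example, giving 00020000h overall.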


  Flags affected
 

The OF, SF, ZF, AF, CF, and PF flags are set according to the result.


  Instruction size and timings
 

 operands    bytes   8088    186     286     386     486     Pentium
 reg, reg     2       3       3       2       2       1       1   PU
 mem, reg  2+d(0,2)  24+EA   10       7       7       3       3   PU
 reg, mem  2+d(0,2)  13+EA   10       7       6       2       2   PU
 reg, imm  2+i(1,2)   4       4       3       2       1       1   PU
 mem, imm  2+d(0,2)  23+EA   16       7       7       3       3   PU*
            +i(1,2)
 acc, imm  1+i(1,2)   4       4       3       2       1       1   PU

 * = not pairable if there is a displacement and immediate


  Example
 

 adc  eax, ebx  ; add with carry


 { Back to contents screen:hcContents}


.topic ADD

  ADD - Integer addition
 

  Description
 

Adds the first operand (destination operand) and the second operand (source
operand) and stores the result in the destination operand. The destination
operand can be a register or a memory location; the source operand can be an
immediate, a register, or a memory location. (However, two memory operands
cannot be used in one instruction.)

When an immediate value is used as an operand, it is sign-extended to the
length of the destination operand format.

The ADD instruction does not distinguish between signed or unsigned operands.
Instead, the processor evaluates the result for both data types and sets the
OF and CF flags to indicate a carry in the signed or unsigned result,
respectively.

The SF flag indicates the sign of the signed result.
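
How one addition serves both signed and unsigned operands can be seen in a
small model. The Python sketch below handles 8-bit operands only; the function
name add8_flags is hypothetical, and AF and PF are left out for brevity.

```python
def add8_flags(dst, src):
    """Model an 8-bit ADD; return (result, flags)."""
    total = dst + src
    result = total & 0xFF
    flags = {
        # unsigned overflow: the sum does not fit in 8 bits
        "CF": total > 0xFF,
        # signed overflow: operands share a sign that the result does not
        "OF": ((dst ^ result) & (src ^ result) & 0x80) != 0,
        "SF": (result & 0x80) != 0,
        "ZF": result == 0,
    }
    return result, flags


print(add8_flags(0x7F, 0x01))
print(add8_flags(0xFF, 0x01))
```

Adding 7Fh and 01h overflows as a signed sum (127 + 1), so OF is set and CF
is clear. Adding FFh and 01h overflows as an unsigned sum (255 + 1), so CF is
set and OF is clear, because as signed values it is simply -1 + 1 = 0.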


  Flags affected
 

The OF, SF, ZF, AF, CF, and PF flags are set according to the result.


  Instruction size and timings
 

 operands    bytes   8088    186     286     386     486     Pentium
 reg, reg     2       3       3       2       2       1       1   UV
 mem, reg  2+d(0,2)  24+EA   10       7       7       3       3   UV
 reg, mem  2+d(0,2)  13+EA   10       7       6       2       2   UV
 reg, imm  2+i(1,2)   4       4       3       2       1       1   UV
 mem, imm  2+d(0,2)  23+EA   16       7       7       3       3   UV*
            +i(1,2)
 acc, imm  1+i(1,2)   4       4       3       2       1       1   UV

 * = not pairable if there is a displacement and immediate


  Example
 

 add  eax, ebx  ; addition
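
As a minimal sketch of how the carry from ADD is typically consumed, the
fragment below adds the 64-bit value in ECX:EBX to the 64-bit value in
EDX:EAX. The register assignment is an assumption chosen for illustration;
{ADC} folds the carry out of the low doublewords into the high doublewords.

 add  eax, ebx  ; add low doublewords; CF = carry out of bit 31
 adc  edx, ecx  ; add high doublewords plus CF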


 { Back to contents screen:hcContents}


.topic AND

  AND - Logical AND
 

  Description
 

Performs a bitwise AND operation on the destination (first) and source
(second) operands and stores the result in the destination operand location.
The source operand can be an immediate, a register, or a memory location; the
destination operand can be a register or a memory location. (However, two
memory operands cannot be used in one instruction.) Each bit of the result of
the AND instruction is a 1 if both corresponding bits of the operands are 1;
otherwise, it becomes a 0.


  Flags affected
 

The OF and CF flags are cleared; the SF, ZF, and PF flags are set according to
the result. The state of the AF flag is undefined.


  Instruction size and timings
 

 operands    bytes   8088    186     286     386     486     Pentium
 reg, reg     2       3       3       2       2       1       1   UV
 mem, reg  2+d(0,2)  24+EA   10       7       7       3       3   UV
 reg, mem  2+d(0,2)  13+EA   10       7       6       2       2   UV
 reg, imm  2+i(1,2)   4       4       3       2       1       1   UV
 mem, imm  2+d(0,2)  23+EA   16       7       7       3       3   UV*
            +i(1,2)
 acc, imm  1+i(1,2)   4       4       3       2       1       1   UV

 * = not pairable if there is a displacement and immediate


  Example
 

 and  eax, ebx  ; logical AND
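
A common use of AND is masking. The sketch below, with values chosen purely
for illustration, keeps the low four bits of AL and clears the rest.

 mov  al, 0A7h  ; AL = 1010 0111b
 and  al, 0Fh   ; AL = 0000 0111b (07h); CF = OF = 0, ZF = 0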


 { Back to contents screen:hcContents}


.topic ARPL

  ARPL - Adjust RPL field of segment selector (286+)
 

  Description
 

Compares the RPL fields of two segment selectors. The first operand (the
destination operand) contains one segment selector and the second operand
(source operand) contains the other. (The RPL field is located in bits 0 and
1 of each operand.)

If the RPL field of the destination operand is less than the RPL field of the
source operand, the ZF flag is set and the RPL field of the destination
operand is increased to match that of the source operand. Otherwise, the ZF
flag is cleared and no change is made to the destination operand. (The
destination operand can be a word register or a memory location; the source
operand must be a word register.)

The ARPL instruction is provided for use by operating-system procedures
(however, it can also be used by applications). It is generally used to
adjust the RPL of a segment selector that has been passed to the operating
system by an application program to match the privilege level of the
application program. Here the segment selector passed to the operating
system is placed in the destination operand and the segment selector for the
application program's code segment is placed in the source operand. (The RPL
field in the source operand represents the privilege level of the application
program.)

Execution of the ARPL instruction then ensures that the RPL of the
segment selector received by the operating system is no lower (does not have
a higher privilege) than the privilege level of the application program. (The
segment selector for the application program's code segment can be read from
the stack following a procedure call.)


  Flags affected
 

The ZF flag is set to 1 if the RPL field of the destination operand is less
than that of the source operand; otherwise, it is cleared to 0.


  Instruction size and timings
 

 operands   bytes                   286     386     486     Pentium
 reg, reg    2                      10      20       9       7   NP
 mem, reg  2+d(0-2)                 11      21       9       7   NP


  Example
 

 arpl  ax, bx  ; Adjust RPL field of selector


 { Back to contents screen:hcContents}


.topic BOUND

  BOUND - Check array index against bounds (186+)
 

  Description
 

Determines if the first operand (array index) is within the bounds of an array
specified by the second operand (bounds operand). The array index is a signed
integer located in a register. The bounds operand is a memory location that
contains a pair of signed doubleword-integers (when the operand-size attribute
is 32) or a pair of signed word-integers (when the operand-size attribute
is 16).

The first doubleword (or word) is the lower bound of the array and the second
doubleword (or word) is the upper bound of the array. The array index must be
greater than or equal to the lower bound and less than or equal to the upper
bound plus the operand size in bytes.

If the index is not within bounds, a BOUND range exceeded exception (#BR) is
signaled. (When this exception is generated, the saved return instruction
pointer points to the BOUND instruction.)

The bounds limit data structure (two words or doublewords containing the lower
and upper limits of the array) is usually placed just before the array itself,
making the limits addressable via a constant offset from the beginning of the
array. Because the address of the array already will be present in a register,
this practice avoids extra bus cycles to obtain the effective address of the
array bounds.


  Flags affected
 

None.


  Instruction size and timings
 

 operands  bytes           186     286     386     486     Pentium
 reg, mem    4             35      13      10       7       8   NP


  Example
 

 bound  bx, [array]     ; check array index (bx) against bounds at [array]
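
The layout described above, with the bounds pair placed just before the
array, might look like the following sketch for 16-bit operands. The labels
(limits, table) and the array size are hypothetical, and data and code are
shown together only for brevity.

 limits: dw    0, 9            ; lower and upper bound (signed words)
 table:  times 10 dw 0         ; the array itself follows the bounds pair

         mov   bx, 4           ; index to validate
         bound bx, [limits]    ; #BR (interrupt 5) if bx is out of bounds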


 { Back to contents screen:hcContents}


.topic BSF

  BSF - Bit scan forward (386+)
 

  Description
 

Searches the source operand (second operand) for the least significant set
bit (1 bit). If a least significant 1 bit is found, its bit index is stored
in the destination operand (first operand).

The source operand can be a register or a memory location; the destination
operand is a register. The bit index is an unsigned offset from bit 0 of the
source operand. If the source operand is 0, the content of the destination
operand is undefined.


  Flags affected
 

The ZF flag is set to 1 if the source operand is 0; otherwise, the ZF flag
is cleared. The CF, OF, SF, AF, and PF flags are undefined.


  Instruction size and timings
 

 operands    bytes                           386     486     Pentium
 r16, r16     3                             10+3n    6-42   6-34  NP
 r32, r32     3                             10+3n    6-42   6-42  NP
 r16, m16  3+d(0,1,2)                       10+3n    7-43   6-35  NP
 r32, m32  3+d(0,1,2,4)                     10+3n    7-43   6-43  NP


  Example
 

 bsf  eax, [esi]  ; Bit scan forward
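
Because the destination is undefined when the source is 0, code usually
tests ZF before using the result. A minimal sketch, with an illustrative
value and a hypothetical label (no_bits):

 mov  ebx, 28h    ; 0010 1000b, lowest set bit is bit 3
 bsf  eax, ebx    ; eax = 3, ZF = 0
 jz   no_bits     ; taken only when ebx was 0 (eax undefined)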


 { Back to contents screen:hcContents}


.topic BSR

  BSR - Bit scan reverse (386+)
 

  Description
 

Searches the source operand (second operand) for the most significant set bit
(1 bit). If a most significant 1 bit is found, its bit index is stored in the
destination operand (first operand). The source operand can be a register or
a memory location; the destination operand is a register. The bit index is an
unsigned offset from bit 0 of the source operand. If the source operand is 0,
the content of the destination operand is undefined.


  Flags affected
 

The ZF flag is set to 1 if the source operand is 0; otherwise, the ZF
flag is cleared. The CF, OF, SF, AF, and PF flags are undefined.


  Instruction size and timings
 

 operands    bytes                           386     486     Pentium
 r16, r16     3                             10+3n    6-103  7-39  NP
 r32, r32     3                             10+3n    7-104  7-71  NP
 r16, m16  3+d(0,1,2)                       10+3n    6-103  7-40  NP
 r32, m32  3+d(0,1,2,4)                     10+3n    7-104  7-72  NP


  Example
 

 bsr  eax, [esi]  ; Bit scan reverse
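
For a nonzero source, the bit index returned by BSR is the integer part of
the base-2 logarithm of the value. A short sketch with an illustrative
value:

 mov  ebx, 28h    ; 0010 1000b (40), highest set bit is bit 5
 bsr  eax, ebx    ; eax = 5, ZF = 0; 2^5 <= 40 < 2^6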


 { Back to contents screen:hcContents}


.topic BSWAP

  BSWAP - Byte swap (486+)
 

  Description
 

Reverses the byte order of a 32-bit (destination) register: bits 0 through 7
are swapped with bits 24 through 31, and bits 8 through 15 are swapped with
bits 16 through 23. This instruction is provided for converting little-endian
values to big-endian format and vice versa. To swap bytes in a word value
(16-bit register), use the {XCHG} instruction. When the BSWAP instruction
references a 16-bit register, the result is undefined.


  Flags affected
 

None.


  Instruction size and timings
 

 operand   bytes                                   486     Pentium
 r32        2                                       1       1   NP


  Example
 

 bswap  eax  ; byte swap
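
The two cases described above can be sketched as follows; the constants are
illustrative only.

 mov    eax, 12345678h
 bswap  eax            ; eax = 78563412h

 mov    ax, 1234h
 xchg   al, ah         ; ax = 3412h (byte swap of a 16-bit register)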


 { Back to contents screen:hcContents}


.topic BT

  BT - Bit test (386+)
 

  Description
 


Selects the bit in a bit string (specified with the first operand, called the
bit base) at the bit-position designated by the bit offset operand (second
operand) and stores the value of the bit in the CF flag.

The bit base operand can be a register or a memory location; the bit offset
operand can be a register or an immediate value. If the bit base operand
specifies a register, the instruction takes the modulo 16 or 32 (depending
on the register size) of the bit offset operand, allowing any bit position
to be selected in a 16- or 32-bit register, respectively.

If the bit base operand specifies a memory location, it represents the
address of the byte in memory that contains the bit base (bit 0 of the
specified byte) of the bit string. The offset operand then selects a bit
position within the range -2^31 to 2^31 - 1 for a register offset and 0 to
31 for an immediate offset.


  Flags affected
 

The CF flag contains the value of the selected bit. The OF, SF, ZF, AF, and
PF flags are undefined.


  Instruction size and timings
 

 operands     bytes                           386     486     Pentium
 reg, reg      3                               3       3       4   NP
 mem, reg    3+d(0,1,2,4)                     12       8       9   NP
 reg, imm8     3+i(1)                          3       3       4   NP
 mem, imm8   3+d(0,1,2,4)+i(1)                 6       3       4   NP


  Example
 

 bt  eax, 4  ; Bit test
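
Since the selected bit lands in CF, BT is normally followed by a
carry-based branch. A minimal sketch (the label bit_is_set is
hypothetical):

 mov  eax, 10h    ; only bit 4 set
 bt   eax, 4      ; CF = 1
 jc   bit_is_set  ; branch on the tested bit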


 { Back to contents screen:hcContents}


.topic BTC

  BTC - Bit test and complement (386+)
 

  Description
 

Selects the bit in a bit string (specified with the first operand, called the
bit base) at the bit-position designated by the bit offset operand (second
operand), stores the value of the bit in the CF flag, and complements the
selected bit in the bit string.

The bit base operand can be a register or a memory location; the bit offset
operand can be a register or an immediate value. If the bit base operand
specifies a register, the instruction takes the modulo 16 or 32 (depending on
the register size) of the bit offset operand, allowing any bit position to be
selected in a 16- or 32-bit register, respectively.

If the bit base operand specifies a memory location, it represents the
address of the byte in memory that contains the bit base (bit 0 of the
specified byte) of the bit string.

The offset operand then selects a bit position within the range -2^31 to
2^31 - 1 for a register offset and 0 to 31 for an immediate offset.


  Flags affected
 

The CF flag contains the value of the selected bit before it is complemented.
The OF, SF, ZF, AF, and PF flags are undefined.


  Instruction size and timings
 

 operands    bytes                           386     486     Pentium
 reg, reg     3                               6       6       7   NP
 mem, reg   3+d(0,1,2,4)                     13      13      13   NP
 reg, imm8    3+i(1)                          6       6       7   NP
 mem, imm8  3+d(0,1,2,4)+i(1)                 8       8       8   NP


  Example
 

 btc  eax, 4  ; Bit test and complement


 { Back to contents screen:hcContents}


.topic BTR

  BTR - Bit test and reset (386+)
 

  Description
 

Selects the bit in a bit string (specified with the first operand, called the
bit base) at the bit-position designated by the bit offset operand (second
operand), stores the value of the bit in the CF flag, and clears the selected
bit in the bit string to 0. The bit base operand can be a register or a memory
location; the bit offset operand can be a register or an immediate value.

If the bit base operand specifies a register, the instruction takes the modulo
16 or 32 (depending on the register size) of the bit offset operand, allowing
any bit position to be selected in a 16- or 32-bit register, respectively.

If the bit base operand specifies a memory location, it represents the address
of the byte in memory that contains the bit base (bit 0 of the specified byte)
of the bit string.

The offset operand then selects a bit position within the range -2^31 to
2^31 - 1 for a register offset and 0 to 31 for an immediate offset.


  Flags affected
 

The CF flag contains the value of the selected bit before it is cleared. The
OF, SF, ZF, AF, and PF flags are undefined.


  Instruction size and timings
 

 operands    bytes                           386     486     Pentium
 reg, reg     3                               6       6       7   NP
 mem, reg   3+d(0,1,2,4)                     13      13      13   NP
 reg, imm8    3+i(1)                          6       6       7   NP
 mem, imm8  3+d(0,1,2,4)+i(1)                 8       8       8   NP


  Example
 

 btr  eax, 4    ; Bit test and reset


 { Back to contents screen:hcContents}


.topic BTS

  BTS - Bit test and set (386+)
 

  Description
 

Selects the bit in a bit string (specified with the first operand, called the
bit base) at the bit-position designated by the bit offset operand (second
operand), stores the value of the bit in the CF flag, and sets the selected
bit in the bit string to 1. The bit base operand can be a register or a
memory location; the bit offset operand can be a register or an immediate
value.

If the bit base operand specifies a register, the instruction takes the modulo
16 or 32 (depending on the register size) of the bit offset operand, allowing
any bit position to be selected in a 16- or 32-bit register, respectively.

If the bit base operand specifies a memory location, it represents the address
of the byte in memory that contains the bit base (bit 0 of the specified byte)
of the bit string.

The offset operand then selects a bit position within the range -2^31 to
2^31 - 1 for a register offset and 0 to 31 for an immediate offset.


  Flags affected
 

The CF flag contains the value of the selected bit before it is set. The OF,
SF, ZF, AF, and PF flags are undefined.


  Instruction size and timings
 

 operands    bytes                           386     486     Pentium
 reg, reg     3                               6       6       7   NP
 mem, reg   3+d(0,1,2,4)                     13      13      13   NP
 reg, imm8    3+i(1)                          6       6       7   NP
 mem, imm8  3+d(0,1,2,4)+i(1)                 8       8       8   NP


  Example
 

 bts  eax, 4    ; Bit test and set
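
With a memory bit base and a register offset, BTS can index past the first
doubleword of a bit string. The sketch below assumes 32-bit code and a
hypothetical bitmap in memory (labels bitmap and already_set are not part
of the original text).

 mov  ecx, 37          ; bit index within the bit string
 bts  [bitmap], ecx    ; CF = old value of bit 37, then bit 37 = 1
 jc   already_set      ; taken if the bit was set before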


 { Back to contents screen:hcContents}


.topic CALL

  CALL - procedure call
 

  Description
 

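Pushes the return address (the Instruction Pointer of the instruction
following the CALL, preceded by the Code Segment for far calls) onto the
stack and loads the Instruction Pointer (and Code Segment for far calls)
with the address of the procedure. Execution continues at the new CS:IP.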

  Flags affected
 

None.

  Instruction size and timings
 

 operand    bytes   8088    186     286     386     486     Pentium
 near        3      23      14      7+m     7+m      3       1   PV
 reg         2      20      13      7+m     7+m      5       2   NP
 mem16    2+d(0-2)  29+EA   19      11+m    10+m     5       2   NP
 far         5      36      23      13+m    17+m    18       4   NP
 mem32    2+d(0-2)  53+EA   38      16+m    22+m    17       4   NP

 Protected Mode

 operand    bytes                   286     386     486     Pentium
 far         5                      26+m    34+m    20     4-13  NP
 mem32    2+d(0-2)                  29+m    38+m    20     5-14  NP

 Cycles not shown for calls through call and task gates


  Example
 

 call my_procedure      ; Call procedure 'my_procedure'
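
A near call and its matching return can be sketched as below. The label
done is hypothetical and only keeps the fragment from falling through into
the procedure body.

        call  my_procedure  ; push return address, jump to my_procedure
        jmp   done          ; reached after my_procedure returns

 my_procedure:
        ret                 ; pop return address into (E)IP

 done: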


 { Back to contents screen:hcContents}


.topic CBW

  CBW - Convert byte to word
 

  Description
 

Converts the signed byte in AL to a signed word in AX by copying the sign
bit (bit 7) of AL into every bit of AH.


  Flags affected
 

None.


  Instruction size and timings
 

 bytes   8088    186     286     386     486     Pentium
  1       2       2       2       3       3       3   NP


  Example
 

 cbw       ; Convert byte to word
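
A short sketch with an illustrative negative value:

 mov  al, 0FBh  ; AL = -5 as a signed byte
 cbw            ; AX = 0FFFBh (-5 as a signed word)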

 { Back to contents screen:hcContents}


.topic CDQ

  CDQ - Convert double to quad (386+)
 

  Description
 

Converts signed DWORD in EAX to a signed quad word in EDX:EAX by extending
the high order bit of EAX throughout EDX.


  Flags affected
 

None.


  Instruction size and timings
 

 bytes                           386     486     Pentium
  1                               2       3       2   NP


  Example
 

 cdq       ; Convert double to quad
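
CDQ is commonly used to prepare EDX:EAX for a signed 32-bit division. A
minimal sketch with illustrative values:

 mov   eax, -7   ; dividend
 cdq             ; EDX:EAX = -7 (EDX = 0FFFFFFFFh)
 mov   ecx, 2    ; divisor
 idiv  ecx       ; EAX = -3 (quotient), EDX = -1 (remainder)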


 { Back to contents screen:hcContents}


.topic CLC

  CLC - Clear carry
 

  Description
 

Clears the Carry Flag in the EFLAGS register (sets it to 0).

  Flags affected
 

Modifies CF.


  Instruction size and timings
 

 bytes   8088    186     286     386     486     Pentium
  1       2       2       2       2       2       2   NP


  Example
 

 clc       ; Clear carry flag


 { Back to contents screen:hcContents}

.topic CLD

  CLD - Clear direction flag
 

  Description
 

Clears the Direction Flag in the EFLAGS register causing string instructions
to increment the (E)SI and (E)DI index registers.


  Flags affected
 

Modifies DF.


  Instruction size and timings
 

 bytes   8088    186     286     386     486     Pentium
  1       2       2       2       2       2       2   NP


  Example
 

 cld       ; Clear direction flag


 { Back to contents screen:hcContents}


.topic CLI

  CLI - Clear interrupt flag
 

  Description
 

Clears the IF flag in the EFLAGS register. No other flags are affected.
Clearing the IF flag causes the processor to ignore maskable external
interrupts. The IF flag and the {CLI} and {STI} instructions have no effect
on the generation of exceptions and NMI interrupts.


  Flags affected
 

Modifies IF.


  Instruction size and timings
 

 bytes   8088    186     286     386     486     Pentium
  1       2       2       3       3       5       7   NP


  Example
 

 cli       ; Clear interrupt flag
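
A typical pattern, sketched here under the assumption that the code runs
with sufficient privilege to change IF, brackets a short critical section
with CLI and {STI}:

 cli             ; maskable interrupts are now ignored
                 ; ... short critical section ...
 sti             ; allow maskable interrupts again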


 { Back to contents screen:hcContents}


.topic CLTS

  CLTS - Clear task switched flag (286+)
 

  Description
 

Clears the task-switched (TS) flag in the CR0 register. This instruction is
intended for use in operating-system procedures. It is a privileged
instruction that can only be executed at a CPL of 0. It is allowed to be
executed in real-address mode to allow initialization for protected mode.

The processor sets the TS flag every time a task switch occurs. The flag is
used to synchronize the saving of FPU context in multitasking applications.


  Flags affected
 

Modifies the TS flag in the CR0 register.


  Instruction size and timings
 

 bytes                   286     386     486     Pentium
  2                       3       5       7      10   NP


  Example
 

 clts       ; Clear task switched flag


 { Back to contents screen:hcContents}


.topic CMC

  CMC - Complement carry flag
 

  Description
 

Complements the carry flag in the EFLAGS register.


  Flags affected
 

Modifies CF to the value of ~CF.


  Instruction size and timings
 

 bytes   8088    186     286     386     486     Pentium
  1       2       2       2       2       2       2   NP


  Example
 

 cmc        ; Complement carry flag


 { Back to contents screen:hcContents}


.topic CMOVcc

  CMOVcc - Conditional move (Pentium Pro+)
 

  Description
 

The CMOVcc instructions check the state of one or more of the status flags
in the EFLAGS register (CF, OF, PF, SF, and ZF) and perform a move operation
if the flags are in a specified state (or condition). A condition code (cc)
is associated with each instruction to indicate the condition being tested
for. If the condition is not satisfied, a move is not performed and execution
continues with the instruction following the CMOVcc instruction.

These instructions can move a 16 or 32 bit value from memory to a general
purpose register or from one general purpose register to another. Conditional
moves of 8 bit register operands are not supported.

The condition for each CMOVcc mnemonic is given in the description column
of the table below. The terms "less" and "greater" are used for comparisons
of signed integers and the terms "above" and "below" are used for unsigned
integers.

Because a particular state of the status flags can sometimes be interpreted
in two ways, two mnemonics are defined for some opcodes. For example, the
CMOVA (conditional move if above) instruction and the CMOVNBE (conditional
move if not below or equal) instruction are alternate mnemonics for the same
opcode.

The CMOVcc instructions are new for the Pentium Pro processor family;
however, they may not be supported by all the processors in the family.
Software can determine if the CMOVcc instructions are supported by checking
the processor's feature information with the {CPUID} instruction.


 Instruction    Description

 CMOVA          Move if above (CF=0 and ZF=0)
 CMOVAE         Move if above or equal (CF=0)
 CMOVB          Move if below (CF=1)
 CMOVBE         Move if below or equal (CF=1 or ZF=1)
 CMOVC          Move if carry (CF=1)
 CMOVE          Move if equal (ZF=1)
 CMOVG          Move if greater (ZF=0 and SF=OF)
 CMOVGE         Move if greater or equal (SF=OF)
 CMOVL          Move if less (SF<>OF)
 CMOVLE         Move if less or equal (ZF=1 or SF<>OF)
 CMOVNA         Move if not above (CF=1 or ZF=1)
 CMOVNAE        Move if not above or equal (CF=1)
 CMOVNB         Move if not below (CF=0)
 CMOVNBE        Move if not below or equal (CF=0 and ZF=0)
 CMOVNC         Move if not carry (CF=0)
 CMOVNE         Move if not equal (ZF=0)
 CMOVNG         Move if not greater (ZF=1 or SF<>OF)
 CMOVNGE        Move if not greater or equal (SF<>OF)
 CMOVNL         Move if not less (SF=OF)
 CMOVNLE        Move if not less or equal (ZF=0 and SF=OF)
 CMOVNO         Move if not overflow (OF=0)
 CMOVNP         Move if not parity (PF=0)
 CMOVNS         Move if not sign (SF=0)
 CMOVNZ         Move if not zero (ZF=0)
 CMOVO          Move if overflow (OF=1)
 CMOVP          Move if parity (PF=1)
 CMOVPE         Move if parity even (PF=1)
 CMOVPO         Move if parity odd (PF=0)
 CMOVS          Move if sign (SF=1)
 CMOVZ          Move if zero (ZF=1)


  Flags affected
 

None.


  Instruction size and timings
 

Not available.


  Example
 

 cmovs ax, bx   ; Move bx to ax if the sign flag is set (SF = 1)
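
CMOVcc can replace a short conditional jump. The sketch below, which
assumes a processor that supports CMOVcc, leaves the larger of two signed
values in EAX:

 cmp    eax, ebx   ; compare as signed integers
 cmovl  eax, ebx   ; if eax < ebx then eax = ebx, so eax = max(eax, ebx)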


 { Back to contents screen:hcContents}


.topic CMP

  CMP - Compare two operands
 

  Description
 

Compares the first source operand with the second source operand and sets
the status flags in the EFLAGS register according to the results. The
comparison is performed by subtracting the second operand from the first
operand and then setting the status flags in the same manner as the {SUB}
instruction.

When an immediate value is used as an operand, it is sign-extended to the
length of the first operand.

The CMP instruction is typically used in conjunction with a conditional jump
({Jcc}), conditional move ({CMOVcc}), or {SETcc} instruction. The condition
codes used by the {Jcc}, {CMOVcc}, and {SETcc} instructions are based on the
results of a CMP instruction.


  Flags affected
 

The CF, OF, SF, ZF, AF, and PF flags are set according to the result.


  Instruction size and timings
 

 operands    bytes   8088    186     286     386     486     Pentium
 reg, reg     2       3       3       2       2       1       1   UV
 mem, reg  2+d(0,2)  13+EA   10       7       5       2       2   UV
 reg, mem  2+d(0,2)  13+EA   10       6       6       2       2   UV
 reg, imm  2+i(1,2)   4       4       3       2       1       1   UV
 mem, imm  2+d(0,2)  14+EA   10       6       5       2       2   UV*
            +i(1,2)
 acc, imm  1+i(1,2)   4       4       3       2       1       1   UV

 * = not pairable if there is a displacement and immediate


  Example
 

 cmp eax, 3     ; Compare eax register with 3
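
The same comparison can be read as signed or unsigned, depending on the
condition code tested afterwards. A minimal sketch with an illustrative
value (the label is_less is hypothetical):

 mov  eax, -1    ; eax = 0FFFFFFFFh
 cmp  eax, 3     ; eax - 3: SF = 1, OF = 0, CF = 0, ZF = 0
 jl   is_less    ; taken, because -1 < 3 as signed integers
                 ; (ja would also be taken here: 0FFFFFFFFh > 3 unsigned)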


 { Back to contents screen:hcContents}


.topic CMPS

  CMPS - Compare string operands
 

  Description
 

Compares the byte, word, or double word specified with the first source
operand with the byte, word, or double word specified with the second source
operand and sets the status flags in the EFLAGS register according to the
results.

Both the source operands are located in memory. The address of the first
source operand is read from either the DS:ESI or the DS:SI registers
(depending on the address-size attribute of the instruction, 32 or 16,
respectively).

The address of the second source operand is read from either the ES:EDI or
the ES:DI registers (again depending on the address-size attribute of the
instruction). The DS segment may be overridden with a segment override
prefix, but the ES segment cannot be overridden.

After the comparison, the (E)SI and (E)DI registers are incremented or
decremented automatically according to the setting of the DF flag in the
EFLAGS register. (If the DF flag is 0, the (E)SI and (E)DI register are
incremented; if the DF flag is 1, the (E)SI and (E)DI registers are
decremented.) The registers are incremented or decremented by 1 for byte
operations, by 2 for word operations, or by 4 for doubleword operations.

The CMPSB, CMPSW, and CMPSD instructions can be preceded by the {REP}
prefix for block comparisons of ECX bytes, words, or doublewords. More often,
however, these instructions will be used in a {LOOP} construct that takes
some action based on the setting of the status flags before the next
comparison is made.

Please note that NASM does not support the CMPS instruction. It does however
support the CMPSB, CMPSW and CMPSD versions of the instruction.


  Flags affected
 

The CF, OF, SF, ZF, AF, and PF flags are set according to the temporary
result of the comparison.


  Instruction size and timings
 

 variations    bytes   8088    186     286     386     486     Pentium
 cmpsb          1      30      22       8      10       8       5   NP
 cmpsw          1      -       -        -      10       8       5   NP
 cmpsd          1      -       -        -      10       8       5   NP
 repX cmpsb     2      9+30n   5+22n   5+9n    5+9n    7+7n*   9+4n NP
 repX cmpsw     2      9+30n   5+22n   5+9n    5+9n    7+7n*   9+4n NP
 repX cmpsd     2       -       -       -      5+9n    7+7n*   9+4n NP


 repX = repe, repz, repne or repnz
 * : 5 if n = 0


  Example
 

 repne cmpsb        ; Repeat compare while operands differ and (E)CX <> 0
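
A block comparison for equality is sketched below. It assumes 32-bit code
with DS and ES addressing the same data; str1, str2, len and differ are
hypothetical names.

 cld                 ; DF = 0: compare towards higher addresses
 mov   esi, str1     ; DS:ESI -> first string
 mov   edi, str2     ; ES:EDI -> second string
 mov   ecx, len      ; number of bytes to compare
 repe  cmpsb         ; repeat while bytes are equal and ecx <> 0
 jne   differ        ; ZF = 0: a mismatch was found

If len is 0, no comparison is made and the flags are left unchanged, so the
count should be checked first when it can be zero.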


 { Back to contents screen:hcContents}


.topic CMPXCHG

  CMPXCHG - Compare and exchange (486+)
 

  Description
 

Compares the value in the AL, AX, or EAX register (depending on the size of
the operand) with the first operand (destination operand). If the two values
are equal, the second operand (source operand) is loaded into the destination
operand. Otherwise, the destination operand is loaded into the AL, AX, or EAX
register.


  Flags affected
 

The ZF flag is set if the values in the destination operand and register AL,
AX, or EAX are equal; otherwise it is cleared. The CF, PF, AF, SF, and OF
flags are set according to the results of the comparison operation.


  Instruction size and timings
 


 operands        bytes                           486     Pentium
 reg, reg         3                               6       5   NP
 mem, reg       3+d(0-2)                         7-10     6   NP


  Example
 

 cmpxchg ebx, edx   ; If eax = ebx then ebx = edx, else eax = ebx
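
With a memory destination, CMPXCHG is often used in a retry loop to update
a shared variable. The sketch below increments a hypothetical doubleword
named counter; the LOCK prefix is an addition not covered in this entry
and is assumed here to make the update atomic on multiprocessor systems.

 retry:
        mov   eax, [counter]        ; expected old value
        mov   edx, eax
        inc   edx                   ; desired new value
        lock  cmpxchg [counter], edx ; if [counter] = eax, store edx
        jnz   retry                 ; ZF = 0: value changed, try again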


 { Back to contents screen:hcContents}


.topic CMPXCHG8B

  CMPXCHG8B - Compare and exchange 8 bytes (Pentium+)
 

  Description
 

Compares the 64-bit value in EDX:EAX with the operand (destination operand).
If the values are equal, the 64-bit value in ECX:EBX is stored in the
destination operand. Otherwise, the value in the destination operand is
loaded into EDX:EAX. The destination operand is an 8-byte memory location.
For the EDX:EAX and ECX:EBX register pairs, EDX and ECX contain the high-order
32 bits and EAX and EBX contain the low-order 32 bits of a 64-bit value.


  Flags affected
 

The ZF flag is set if the destination operand and EDX:EAX are equal;
otherwise it is cleared. The CF, PF, AF, SF, and OF flags are unaffected.


  Instruction size and timings
 


 operand       bytes                                   Pentium
 mem          3+d(0-2)                                 10   NP


  Example
 

 cmpxchg8b [ebx]    ; Compare edx:eax with 8 bytes at [ebx], exchange if equal


 { Back to contents screen:hcContents}


.topic CPUID

  CPUID - CPU identification (486+)
 

  Description
 

Provides processor identification information in registers EAX, EBX, ECX, and
EDX. This information identifies the vendor, family, model, and stepping of
the processor, feature information, and cache information.

Note that only later versions of 486 processors support the CPUID instruction.
The CPUID instruction officially became available on Pentium processors.

An input value loaded into the EAX register determines what information is
returned as shown below.

 Initial
   EAX          Information Provided about the Processor
  Value

    0           EAX  Maximum CPUID Input Value (2 for the Pentium Pro
                     processor and 1 for the Pentium processor and the later
                     versions of 486 class processors that support the
                     CPUID instruction).

                EBX, ECX and EDX contain a vendor name, e.g.

                     "GenuineIntel" for Intel processors
                     "AuthenticAMD" for AMD processors
                     "CyrixInstead" for Cyrix processors
                     "UMCUMCUMCUMC" for UMC processors


    1           EAX  Version Information (Type, Family, Model, and Stepping ID)
                EBX  Reserved
                ECX  Reserved
                EDX  Feature Information

    2           EAX  Cache and TLB Information
                EBX  Cache and TLB Information
                ECX  Cache and TLB Information
                EDX  Cache and TLB Information


The ID flag (bit 21) of the EFLAGS register can be used to determine if the
CPUID instruction is supported by the processor: if software can change the
value of this flag, CPUID is available.

The following information details the contents of the registers for the
CPUID instruction.

 Calling CPUID with EAX = 1 returns:

         EAX[3:0]       <- Stepping ID
         EAX[7:4]       <- Model
         EAX[11:8]      <- Family
                                 ; 3 - 386 family
                                 ; 4 - 486 family
                                 ; 5 - Pentium family
                                 ; 6 - Pentium Pro family
         EAX[13:12]     <- Processor type
                                 ; 0 - Original OEM processor
                                 ; 1 - OverDrive
                                 ; 2 - Dual Processor
         EAX[15:14]     <- Reserved

         Note: Pentium chips have pin CPUTYPE which defines which socket the
               CPU is located in. For example: if the chip is in the first
               socket AX = 0245h, however in the second socket it would
               equal 2425h

         EAX[31:16]     <- Reserved and set to zeros

         EDX            <- Compatibility flags (a bit is set if a feature
                           is supported by the CPU)

         EDX[0]         <- FPU : FPU on Chip
         EDX[1]         <- VME : Virtual Mode Extension present
         EDX[2]         <- DE  : Debugging Extensions
         EDX[3]         <- PSE : CPU supports 4MB size pages
         EDX[4]         <- TSC : TSC present (see {RDTSC} opcode)
         EDX[5]         <- MSR : CPU has Pentium Compatible MSRs
         EDX[6]         <- PAE : Physical Address Extension
         EDX[7]         <- MCE : Machine Check Exception
         EDX[8]         <- CX8 : Supports the {CMPXCHG8B} instruction
         EDX[9]         <- APIC: Local APIC on Chip (Intel)
                           PGE : Page Global Extension (AMD K5)
         EDX[10]        <- Reserved
         EDX[11]        <- Reserved
         EDX[12]        <- MTRR: CPU supports Memory Type Range Register
                                 (MTRR)
         EDX[13]        <- PGE : Page Global Feature support
         EDX[14]        <- MCA : Machine Check Architecture
         EDX[15]        <- CMOV: CPU supports {CMOVcc} instruction
         EDX[22:16]     <- Reserved
         EDX[23]        <- MMX : CPU supports IA MMX
         EDX[31:24]     <- Reserved and set to zeros


 Calling CPUID with EAX = 2 returns (Pentium Pro+):

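         AL = 1     (Pentium Pro; AL gives the number of times CPUID must
                    be executed with EAX = 2 to obtain all descriptors)

         The remaining bytes of EAX, together with EBX, ECX and EDX, contain
         descriptor bytes which describe the cache and TLB architecture of
         the chip.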

         Value         Description
         00h           None
         01h           Instruction TLB, 4K page, 4 way, 64 entry
         02h           Instruction TLB, 4M page, 4 way, 4 entry
         03h           Data TLB, 4K page, 4 way, 64 entry
         04h           Data TLB, 4M page, 4 way, 8 entry
         06h           Instruction Cache, 8K, 4 way, 32 bytes per line
         0Ah           Data cache, 8K, 2 way, 32 bytes per line
         41h           Unified cache, 32 bytes per line, 4 way, 128KB
         42h           Unified cache, 32 bytes per line, 4 way, 256KB
         43h           Unified cache, 32 bytes per line, 4 way, 512KB


  Flags affected
 

None.


  Instruction size and timings
 


 bytes                                           Pentium
  2                                              14   NP


  Example
 

 cpuid          ; Get CPU information
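
A slightly fuller sketch, assuming CPUID is known to be available: request
the feature flags (EAX = 1) and test EDX[23] for MMX support. CPUID
overwrites EAX, EBX, ECX and EDX; the label no_mmx is hypothetical.

 mov  eax, 1            ; Select the version and feature information
 cpuid
 test edx, 00800000h    ; EDX[23] = MMX
 jz   no_mmx            ; Bit clear: MMX is not supported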


 { Back to contents screen:hcContents}


.topic CWD

  CWD - Convert word to doubleword
 

  Description
 

Extends the sign of the word in register AX throughout register DX, forming
a doubleword quantity in DX:AX.


  Flags affected
 

None.


  Instruction size and timings
 

 bytes   8088    186     286     386     486     Pentium
  1       5       4       2       2       3       2   NP


  Example
 

 cwd            ; Convert word in AX to doubleword in DX:AX
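
For illustration, a small sketch with a negative value in AX:

 mov ax, -5     ; AX = FFFBh
 cwd            ; DX:AX = FFFFh:FFFBh (still -5 as a doubleword)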


 { Back to contents screen:hcContents}


.topic CWDE

  CWDE - Convert word to extended doubleword (386+)
 

  Description
 

Converts a signed word in AX to a signed doubleword in EAX by extending the
sign bit of AX throughout EAX.


  Flags affected
 

None.


  Instruction size and timings
 

 bytes                           386     486     Pentium
  1                               3       3       3   NP


  Example
 

 cwde           ; Convert word in AX to doubleword in EAX


 { Back to contents screen:hcContents}


.topic DAA

  DAA - Decimal adjust AL after addition
 

  Description
 

Adjusts the sum of two packed BCD values to create a packed BCD result. The
AL register is the implied source and destination operand. The DAA instruction
is only useful when it follows an ADD instruction that adds (binary addition)
two 2-digit, packed BCD values and stores a byte result in the AL register.
The DAA instruction then adjusts the contents of the AL register to contain
the correct 2-digit, packed BCD result. If a decimal carry is detected, the
CF and AF flags are set accordingly.
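
As a sketch of how the carry can be used, the following adds two 4-digit
packed BCD numbers one byte at a time, low byte first. The labels num1, num2
and sum are hypothetical 2-byte variables:

 mov al, [num1]         ; Low two digits
 add al, [num2]
 daa                    ; Adjust; CF = decimal carry
 mov [sum], al          ; MOV does not change the flags
 mov al, [num1+1]       ; High two digits
 adc al, [num2+1]       ; Include the carry from the low digits
 daa
 mov [sum+1], al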


  Flags affected
 

The CF and AF flags are set if the adjustment of the value results in a
decimal carry in either digit of the result (see above). The SF, ZF, and PF
flags are set according to the result. The OF flag is undefined.


  Instruction size and timings
 

 bytes   8088    186     286     386     486     Pentium
  1       4       4       3       4       2       3   NP


  Example
 

 add al, bl     ; Before: AL=79H BL=35H EFLAGS(OSZAPC)=XXXXXX
                ; After : AL=AEH BL=35H EFLAGS(OSZAPC)=110000

 daa            ; Before: AL=AEH BL=35H EFLAGS(OSZAPC)=110000
                ; After : AL=14H BL=35H EFLAGS(OSZAPC)=X00111


 { Back to contents screen:hcContents}


.topic DAS

  DAS - Decimal adjust AL after subtraction
 

  Description
 

Adjusts the result of the subtraction of two packed BCD values to create a
packed BCD result. The AL register is the implied source and destination
operand. The DAS instruction is only useful when it follows a SUB instruction
that subtracts (binary subtraction) one 2-digit, packed BCD value from
another and stores a byte result in the AL register. The DAS instruction
then adjusts the contents of the AL register to contain the correct 2-digit,
packed BCD result. If a decimal borrow is detected, the CF and AF flags are
set accordingly.


  Flags affected
 

The CF and AF flags are set if the adjustment of the value results in a
decimal borrow in either digit of the result (see above). The SF, ZF, and PF
flags are set according to the result. The OF flag is undefined.


  Instruction size and timings
 

 bytes   8088    186     286     386     486     Pentium
  1       4       4       3       4       2       3   NP


  Example
 

 sub al, bl     ; Before: AL=35H BL=47H EFLAGS(OSZAPC)=XXXXXX
                ; After : AL=EEH BL=47H EFLAGS(OSZAPC)=010111

 das            ; Before: AL=EEH BL=47H EFLAGS(OSZAPC)=010111
                ; After : AL=88H BL=47H EFLAGS(OSZAPC)=X10111


 { Back to contents screen:hcContents}


.topic DEC

  DEC - Decrement
 

  Description
 

Subtracts 1 from the destination operand, while preserving the state of the
CF flag. The destination operand can be a register or a memory location.
This instruction allows a loop counter to be updated without disturbing the
CF flag. (To perform a decrement operation that updates the CF flag, use a
{SUB} instruction with an immediate operand of 1.)


  Flags affected
 

The CF flag is not affected. The OF, SF, ZF, AF, and PF flags are set
according to the result.


  Instruction size and timings
 

 operand     bytes   8088    186     286     386     486     Pentium
 r8           2       3       3       2       2       1       1   UV
 r16          1       3       3       2       2       1       1   UV
 r32          1       3       3       2       2       1       1   UV
 mem       2+d(0,2)  23+EA   15       7       6       3       3   UV


  Example
 

 dec eax        ; Subtract one from eax (eax = eax - 1)
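
The following sketch shows why preserving CF matters: a multi-precision
addition in which ADC carries from one doubleword to the next. It assumes
32-bit code, ESI and EDI pointing to the two numbers (least significant
doubleword first) and a non-zero count in ECX; the label is hypothetical.

 clc                    ; No carry into the first doubleword
next_dword:
 mov eax, [esi]
 adc [edi], eax         ; Add with carry from the previous step
 lea esi, [esi+4]       ; LEA advances the pointers without
 lea edi, [edi+4]       ; changing any flags
 dec ecx                ; Updates ZF but leaves CF intact
 jnz next_dword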


 { Back to contents screen:hcContents}


.topic DIV

  DIV - Unsigned divide
 

  Description
 

Divides (unsigned) the value in the AX register, DX:AX register pair, or
EDX:EAX register pair (dividend) by the source operand (divisor) and stores
the result in the AX (AH:AL), DX:AX, or EDX:EAX registers. The source operand
can be a general-purpose register or a memory location. The action of this
instruction depends on the operand size, as shown in the following table:


 Operand Size     Dividend    Divisor    Quotient    Remainder    Maximum
                                                                  Quotient

 Word/byte        AX          r/m8       AL          AH           255
 Doubleword/
 word             DX:AX       r/m16      AX          DX           65,535
 Quadword/
 doubleword       EDX:EAX     r/m32      EAX         EDX          2^32 - 1


Non-integral results are truncated towards 0. The remainder is always less
than the divisor in magnitude.


  Flags affected
 

The CF, OF, SF, ZF, AF, and PF flags are undefined.


  Instruction size and timings
 

 operand     bytes   8088    186     286     386     486     Pentium
 r8           2     80-90     29     14      14      16      17   NP
 r16          2    144-162    38     22      22      24      25   NP
 r32          2       -       -       -      38      40      41   NP
 mem8    2+d(0-2)   86-96+EA  35     17      17      16      17   NP
 mem16   2+d(0-2)  150-168+EA 44     25      25      24      25   NP
 mem32   2+d(0-2)     -       -       -      41      40      41   NP


  Example
 

 div ebx        ; divide EDX:EAX by EBX
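
A short worked sketch for the 32-bit form. The high half of the dividend
(EDX) must be set up before dividing; for an unsigned 32-bit value it is
simply cleared. If the quotient does not fit in EAX, or the divisor is 0,
the processor raises a divide error (interrupt 0).

 mov eax, 100000        ; Low doubleword of the dividend
 xor edx, edx           ; High doubleword = 0
 mov ebx, 7             ; Divisor
 div ebx                ; EAX = 14285 (quotient), EDX = 5 (remainder)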


 { Back to contents screen:hcContents}


.topic EMMS

  EMMS - Empty MMX state (MMX)
 

  Description
 

Sets the values of all the tags in the FPU tag word to empty (all ones).
This operation marks the MMX registers as available, so they can subsequently
be used by floating-point instructions. All MMX instructions other than EMMS
set all the tags in the FPU tag word to valid (all zeros).

The EMMS instruction must be used to clear the MMX state at the end of all
MMX routines and before calling other procedures or subroutines that may
execute floating-point instructions. If a floating-point instruction loads
one of the registers in the FPU register stack before the FPU tag word has
been reset by the EMMS instruction, a floating-point stack overflow can occur
that will result in a floating-point exception or incorrect result.


  Flags affected
 

None.


  Instruction size and timings
 

Not available.


  Example
 

 emms           ; Set the FP tag word to empty
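
A minimal sketch of the usual pattern, assuming ESI and EDI point to 8-byte
buffers and value is a hypothetical doubleword variable:

 movq  mm0, [esi]       ; Load eight packed bytes
 paddb mm0, [edi]       ; Add eight packed bytes
 movq  [edi], mm0       ; Store the result
 emms                   ; Mark the FPU registers as empty again
 fld   dword [value]    ; Floating-point code can now run safely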


 { Back to contents screen:hcContents}


.topic ENTER

  ENTER - Make stack frame for parameters (186+)
 

  Description
 

Creates a stack frame for a procedure. The first operand (size operand)
specifies the size of the stack frame (that is, the number of bytes of
dynamic storage allocated on the stack for the procedure). The second operand
(nesting level operand) gives the lexical nesting level (0 to 31) of the
procedure. The nesting level determines the number of stack frame pointers
that are copied into the "display area" of the new stack frame from the
preceding frame. Both of these operands are immediate values.

The stack-size attribute determines whether the BP (16 bits) or EBP (32 bits)
register specifies the current frame pointer and whether SP (16 bits) or ESP
(32 bits) specifies the stack pointer.

The ENTER and companion LEAVE instructions are provided to support block
structured languages. The ENTER instruction (when used) is typically the
first instruction in a procedure and is used to set up a new stack frame for
a procedure. The LEAVE instruction is then used at the end of the procedure
(just before the RET instruction) to release the stack frame.

If the nesting level is 0, the processor pushes the frame pointer from the
EBP register onto the stack, copies the current stack pointer from the ESP
register into the EBP register, and loads the ESP register with the current
stack-pointer value minus the value in the size operand. For nesting levels
of 1 or greater, the processor pushes additional frame pointers on the stack
before adjusting the stack pointer. These additional frame pointers provide
the called procedure with access points to other nested frames on the stack.


  Flags affected
 

None.


  Instruction size and timings
 

 operands     bytes   8088    186     286     386     486     Pentium
 imm16, 0       3      -      15      11      10      14      11   NP
 imm16, 1       4      -      25      15      12      17      15   NP
 imm16, imm8    4      -   22+16n    12+4n   15+4n   17+3i  15+2i  NP
                           n = imm8-1;  i = imm8


  Example
 

 enter 1, 0             ; Create stack frame of 1 byte with 0 nesting levels
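
For a nesting level of 0 the description above amounts to the familiar
procedure prologue. A sketch in 32-bit code (the frame size of 8 is
arbitrary):

 enter 8, 0             ; Has the same effect as:
                        ;   push ebp
                        ;   mov  ebp, esp
                        ;   sub  esp, 8
 mov dword [ebp-4], 0   ; Local storage is addressed below EBP
 leave                  ; Release the frame (restores ESP and EBP)
 ret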


 { Back to contents screen:hcContents}


.topic ESC

  ESC - Escape
 

  Description
 

Provides access to the data bus for other resident processors. The CPU
treats the instruction as a NOP but places the memory operand on the bus.


  Flags affected
 

None.


  Instruction size and timings
 

Not available.


  Example
 

Not available.


 { Back to contents screen:hcContents}


.topic HLT

  HLT - Halt
 

  Description
 

Stops instruction execution and places the processor in a HALT state. An
enabled interrupt, NMI, or a reset will resume execution. If an interrupt
(including NMI) is used to resume execution after a HLT instruction, the
saved instruction pointer (CS:EIP) points to the instruction following the
HLT instruction.

The HLT instruction is a privileged instruction. When the processor is
running in protected or virtual-8086 mode, the privilege level of a program
or procedure must be 0 to execute the HLT instruction.


  Flags affected
 

None.


  Instruction size and timings
 

 bytes   8088    186     286     386     486     Pentium
  1       2       2       2       5       4       4   NP


  Example
 

 hlt            ; Enter halt state
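
A typical idle loop, sketched under the assumption that the code runs in
real-address mode or at privilege level 0 (the label is hypothetical):

idle:
 sti                    ; Make sure interrupts can wake the processor
 hlt                    ; Wait for the next interrupt
 jmp idle               ; Execution resumes here after the handler returns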


 { Back to contents screen:hcContents}


.topic IDIV

  IDIV - Signed divide
 

  Description
 

Divides (signed) the value in the AX register, DX:AX register pair, or
EDX:EAX register pair (dividend) by the source operand (divisor) and stores
the result in the AX (AH:AL), DX:AX, or EDX:EAX registers. The source operand
can be a general-purpose register or a memory location.

The action of this instruction depends on the operand size, as shown in the
following table:


 Operand      Dividend    Divisor    Quotient  Remainder  Quotient
 Size                                                     Range

 Word/
 byte         AX          r/m8       AL        AH         -128 to +127
 Doubleword/
 word         DX:AX       r/m16      AX        DX         -32,768 to +32,767
 Quadword/
 doubleword   EDX:EAX     r/m32      EAX       EDX        -2^31 to 2^31 - 1


Non-integral results are truncated towards 0. The sign of the remainder is
always the same as the sign of the dividend. The absolute value of the
remainder is always less than the absolute value of the divisor.
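
For example, dividing -100 by 7 (a sketch in 32-bit code; the dividend is
sign-extended into EDX:EAX with CDQ first):

 mov eax, -100
 cdq                    ; EDX:EAX = -100
 mov ebx, 7
 idiv ebx               ; EAX = -14 (quotient), EDX = -2 (remainder)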


  Flags affected
 

The CF, OF, SF, ZF, AF, and PF flags are undefined.


  Instruction size and timings
 

 operand    bytes    8088      186    286    386    486     Pentium
 r8          2     101-112    44-52   17     19     19      22   NP
 r16         2     165-184    53-61   25     27     27      30   NP
 r32         2       -          -      -     43     43      46   NP
 mem8   2+d(0-2)  107-118+EA  50-58   20     22     20      22   NP
 mem16  2+d(0-2)  171-190+EA  59-67   28     30     28      30   NP
 mem32  2+d(0-2)     -          -      -     46     44      46   NP


  Example
 

 idiv ebx       ; Signed divide EDX:EAX by EBX


 { Back to contents screen:hcContents}


.topic IMUL

  IMUL - Signed multiply
 

  Description
 

Performs a signed multiplication of two operands. This instruction has three
forms, depending on the number of operands.

  One-operand form

This form is identical to that used by the MUL instruction. Here, the source
operand (in a general-purpose register or memory location) is multiplied by
the value in the AL, AX, or EAX register (depending on the operand size) and
the product is stored in the AX, DX:AX, or EDX:EAX registers, respectively.

  Two-operand form (286+)

With this form the destination operand (the first operand) is multiplied by
the source operand (second operand). The destination operand is a general-purpose
register and the source operand is an immediate value, a general-purpose
register (386+), or a memory location (386+). The product is then stored in
the destination operand location.

  Three-operand form (286+)

This form requires a destination operand (the first operand) and two source
operands (the second and the third operands). Here, the first source operand
(which can be a general-purpose register or a memory location) is multiplied
by the second source operand (an immediate value). The product is then stored
in the destination operand (a general-purpose register).

When an immediate value is used as an operand, it is sign-extended to the
length of the destination operand format.

The CF and OF flags are set when significant bits are carried into the upper
half of the result. The CF and OF flags are cleared when the result fits
exactly in the lower half of the result.

The three forms of the IMUL instruction are similar in that the length of the
product is calculated to twice the length of the operands. With the one-operand
form, the product is stored exactly in the destination. With the two- and three-
operand forms, however, the result is truncated to the length of the
destination before it is stored in the destination register. Because of this
truncation, the CF or OF flag should be tested to ensure that no significant
bits are lost.

The two- and three-operand forms may also be used with unsigned operands
because the lower half of the product is the same regardless of whether the
operands are signed or unsigned. The CF and OF flags, however, cannot be used
to determine if the upper half of the result is non-zero.


  Flags affected
 

For the one operand form of the instruction, the CF and OF flags are set when
significant bits are carried into the upper half of the result and cleared
when the result fits exactly in the lower half of the result. For the two- and
three-operand forms of the instruction, the CF and OF flags are set when the
result must be truncated to fit in the destination operand size and cleared
when the result fits exactly in the destination operand size. The SF, ZF, AF,
and PF flags are undefined.


  Instruction size and timings
 

  One-operand form

 operand    bytes   8088     186    286     386     486     Pentium
 r8          2      80-98    25-28  13      9-14    13-18   11   NP
 r16         2     128-154   34-37  21      9-22    13-26   11   NP
 r32         2       -        -      -      9-38    13-42   10   NP
 mem8    2+d(0-2)  86-104+EA 32-34  16     12-17    13-18   11   NP
 mem16   2+d(0-2) 134-160+EA 40-43  24     12-25    13-26   11   NP
 mem32   2+d(0-2)    -        -      -     12-41    13-42   10   NP


  Two- and three-operand form

 operands       bytes     186   286    386         486      Pentium
 r16, imm      2+i(1,2)    -    21  9-14/9-22  13-18/13-26  10   NP
 r32, imm      2+i(1,2)    -     -     9-38       13-42     10   NP
 r16,r16,imm   2+i(1,2)  22/29  21  9-14/9-22  13-18/13-26  10   NP
 r32,r32,imm   2+i(1,2)    -     -     9-38       13-42     10   NP
 r16,m16,imm   2+d(0-2)  25/32  24 12-17/12-25 13-18/13-26  10   NP
                +i(1,2)
 r32,m32,imm   2+d(0-2)+i(1,2)   -    12-41       13-42     10   NP
 r16, r16      2+i(1,2)    -     -     9-22    13-18/13-26  10   NP
 r32, r32      2+i(1,2)    -     -     9-38       13-42     10   NP
 r16, m16      2+d(0-2)+i(1,2)   -    12-25    13-18/13-26  10   NP
 r32, m32      2+d(0-2)+i(1,2)   -    12-41       13-42     10   NP


  Example
 

 imul ebx               ; EDX:EAX = EAX * EBX
 imul ecx, ebx          ; ECX = ECX * EBX
 imul ecx, ebx, 10      ; ECX = EBX * 10
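
Because the two- and three-operand forms truncate the product, a check of
OF (or CF) can follow the multiply. A sketch, with a hypothetical label:

 imul ecx, ebx          ; ECX = low 32 bits of ECX * EBX
 jo   too_big           ; Product did not fit in 32 signed bits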


 { Back to contents screen:hcContents}


.topic IN

  IN - Input from port
 

  Description
 

Copies the value from the I/O port specified with the second operand (source
operand) to the destination operand (first operand). The source operand can
be a byte-immediate or the DX register; the destination operand can be
register AL, AX, or EAX, depending on the size of the port being accessed
(8, 16, or 32 bits, respectively). Using the DX register as a source operand
allows I/O port addresses from 0 to 65,535 to be accessed; using a byte
immediate allows I/O port addresses 0 to 255 to be accessed.

When accessing an 8-bit I/O port, the opcode determines the port size; when
accessing a 16- and 32-bit I/O port, the operand-size attribute determines
the port size.

At the machine code level, I/O instructions are shorter when accessing 8-bit
I/O ports. Here, the upper eight bits of the port address will be 0.


  Flags affected
 

None.


  Instruction size and timings
 

 operands    bytes   8088    186     286     386     486     Pentium
 al, imm8     2      14      10       5      12      14       7   NP
 ax, imm8     2      14      10       5      12      14       7   NP
 eax, imm8    2       -       -       -      12      14       7   NP
 al, dx       1      12       8       5      13      14       7   NP
 ax, dx       1      12       8       5      13      14       7   NP
 eax, dx      1       -       -       -      13      14       7   NP

                             Protected mode

 operands     bytes                           386     486     Pentium
 acc, imm      2                           6/26/26  9/29/27  4/21/19 NP
 acc, dx       1                           7/27/27  8/28/27  4/21/19 NP

 Cycles for: CPL <= IOPL / CPL > IOPL / V86


  Example
 

 in  al, dx     ; AL = the value read from I/O port number DX
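
Both addressing forms side by side. The port numbers are only illustrative
(60h and 3F8h are the conventional PC keyboard and COM1 data ports, an
assumption about the hardware rather than part of the instruction):

 in  al, 60h            ; Immediate form, ports 0 to 255 only
 mov dx, 3F8h
 in  al, dx             ; DX form, any port from 0 to 65,535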


 { Back to contents screen:hcContents}


.topic INC

  INC - Increment
 

  Description
 

Adds 1 to the destination operand, while preserving the state of the CF flag.
The destination operand can be a register or a memory location. This
instruction allows a loop counter to be updated without disturbing the CF
flag. (Use an {ADD} instruction with an immediate operand of 1 to perform an
increment operation that does update the CF flag.)


  Flags affected
 

The CF flag is not affected. The OF, SF, ZF, AF, and PF flags are set
according to the result.


  Instruction size and timings
 

 operand     bytes   8088    186     286     386     486     Pentium
 r8           2       3       3       2       2       1       1   UV
 r16          1       3       3       2       2       1       1   UV
 r32          1       3       3       2       2       1       1   UV
 mem       2+d(0,2)  23+EA   15       7       6       3       3   UV


  Example
 

 inc ebx        ; EBX = EBX + 1


 { Back to contents screen:hcContents}


.topic INS

  INS - Input from port to string (186+)
 

  Description
 

Copies the data from the I/O port specified with the source operand (second
operand) to the destination operand (first operand). The source operand is an
I/O port address (from 0 to 65,535) that is read from the DX register. The
destination operand is a memory location, the address of which is read from
either the ES:EDI or the ES:DI registers (depending on the address-size
attribute of the instruction, 32 or 16, respectively). (The ES segment cannot
be overridden with a segment override prefix.)

The size of the I/O port being accessed (that is, the size of the source and
destination operands) is determined by the opcode for an 8-bit I/O port or by
the operand-size attribute of the instruction for a 16- or 32-bit I/O port.

At the assembly-code level, two forms of this instruction are allowed: the
"explicit-operands" form and the "no-operands" form. The explicit-operands
form (specified with the INS mnemonic) allows the source and destination
operands to be specified explicitly. This version of the instruction is NOT
supported by NASM.

The no-operands form provides "short forms" of the byte, word, and doubleword
versions of the INS instructions. Here also DX is assumed by the processor to
be the source operand and ES:(E)DI is assumed to be the destination operand.
The size of the I/O port is specified with the choice of mnemonic: INSB (byte),
INSW (word), or INSD (doubleword).

After the byte, word, or doubleword is transferred from the I/O port to the
memory location, the (E)DI register is incremented or decremented
automatically according to the setting of the DF flag in the EFLAGS register.
(If the DF flag is 0, the (E)DI register is incremented; if the DF flag is 1,
the (E)DI register is decremented.) The (E)DI register is incremented or
decremented by 1 for byte operations, by 2 for word operations, or by 4 for
doubleword operations.

The INSB, INSW, and INSD instructions can be preceded by the REP prefix
for block input of ECX bytes, words, or doublewords. See {REP/REPE/REPZ/REPNE/REPNZ:REP}
for a description of the REP prefix.


  Flags affected
 

None.


  Instruction size and timings
 

 variations  bytes   8088    186     286     386     486     Pentium
 insb         1       -      14       5      15      17      9    NP
 insw         1       -      14       5      15      17      9    NP
 insd         1       -       -       -      15      17      9    NP

 Protected Mode

             bytes                           386     486     Pentium
              1                           9/29/29 10/32/30 6/24/22 NP

 Cycles for: CPL <= IOPL / CPL > IOPL / V86


  Example
 

 rep insb       ; Repeatedly input bytes from the port specified by DX
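
A fuller sketch of block input in 32-bit code. The port number, the count
and the label buffer are hypothetical, and ES is assumed to address the
segment that contains buffer:

 mov dx, 1F0h           ; I/O port to read from
 mov edi, buffer        ; ES:EDI = destination
 mov ecx, 256           ; Number of words to transfer
 cld                    ; DF = 0, so EDI is incremented
 rep insw               ; Read 256 words (512 bytes) from the port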


 { Back to contents screen:hcContents}


.topic INT

  INT - Call interrupt procedure
 

  Description
 


The INT n instruction generates a call to the interrupt or exception handler
specified with the destination operand. The destination operand specifies an
interrupt vector number from 0 to 255, encoded as an 8-bit unsigned
immediate value. Each interrupt vector number provides an index to a gate
descriptor in the IDT. The first 32 interrupt vector numbers are reserved by
Intel for system use. Some of these interrupts are used for internally
generated exceptions.

The INT n instruction is the general mnemonic for executing a software-generated
call to an interrupt handler. The {INTO} instruction is a special mnemonic for
calling overflow exception interrupt vector number 4.

The INT 3 instruction generates a special one byte opcode (CC) that is
intended for calling the debug exception handler. (This one byte form is
valuable because it can be used to replace the first byte of any instruction
with a breakpoint, including other one byte instructions, without overwriting
other code). To further support its function as a debug breakpoint, the
interrupt generated with the CC opcode also differs from the regular software
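interrupts as follows:

 - Interrupt redirection does not happen when in VME mode; the interrupt
   is handled by a protected-mode handler.
 - The virtual-8086 mode IOPL checks do not occur. The interrupt is taken
   without faulting at any IOPL level.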

Note that the "normal" 2-byte opcode for INT 3 (CD03) does not have these
special features. Intel and Microsoft assemblers will not generate the CD03
opcode from any mnemonic, but this opcode can be created by direct numeric
code definition or by self-modifying code.

The action of the INT n instruction (including the {INTO} and INT 3
instructions) is similar to that of a far call made with the CALL instruction.
The primary difference is that with the INT n instruction, the EFLAGS register
is pushed onto the stack before the return address. (The return address is a
far address consisting of the current values of the CS and EIP registers.)
Returns from interrupt procedures are handled with the IRET instruction, which
pops the EFLAGS information and return address from the stack.

The interrupt vector number specifies an interrupt descriptor in the interrupt
descriptor table (IDT); that is, it provides an index into the IDT. The
selected interrupt descriptor in turn contains a pointer to an interrupt or
exception handler procedure. In protected mode, the IDT contains an array of
8-byte descriptors, each of which is an interrupt gate, trap gate, or task
gate. In real-address mode, the IDT is an array of 4-byte far pointers (a
2-byte code segment selector and a 2-byte instruction pointer), each of which
points directly to a procedure in the selected segment. (Note that in
real-address mode, the IDT is called the interrupt vector table, and its
pointers are called interrupt vectors.)


  Flags affected
 

The EFLAGS register is pushed onto the stack. The IF, TF, NT, AC, RF, and VM
flags may be cleared, depending on the mode of operation of the processor
when the INT instruction is executed. If the interrupt uses a task gate, any
flags may be set or cleared, controlled by the EFLAGS image in the new task's
TSS.


  Instruction size and timings
 

 operands  bytes   8088    186     286     386     486     Pentium
   3        1      72      45      23+m    33      26      13   NP
   imm8     2      71      47      23+m    37      30      16   NP

                             Protected mode

   bytes   8088    186     286     386     486     Pentium
    1      -       -     (40-78)+m 59-99   44-71  27-82 NP


  Example
 

 int 21h        ; Call DOS services interrupt vector
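
As a sketch, a complete DOS .COM program that uses two INT 21h services
(function 09h prints the '$'-terminated string at DS:DX, function 4Ch
terminates the program). This assumes a DOS environment; the label msg is
hypothetical.

 org 100h
 mov ah, 09h            ; DOS: print string
 mov dx, msg
 int 21h
 mov ax, 4C00h          ; DOS: terminate with return code 0
 int 21h
msg:
 db 'Hello', 13, 10, '$'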


 { Back to contents screen:hcContents}


.topic INTO

  INTO - Call interrupt procedure if overflow
 

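  Description
 

If the overflow flag (OF) is set, this instruction generates an INT 4. In
real-address mode this transfers control to the handler whose far pointer is
stored at 0000:0010. If OF is clear, execution continues with the next
instruction.


  Flags affected
 

None if the interrupt is not taken. If it is taken, the flags are affected as
described for {INT}.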


  Instruction size and timings
 

 bytes   8088    186     286     386     486     Pentium
  1      4/73    4/48    3/24+m  3/35    3/28    4/13 NP

                             Protected mode

 bytes                   286     386     486     Pentium
  1                    (40-78)+m 59-99   44-71  27-56 NP


  Example
 

 into           ; Call interrupt 4 if overflow flag is set
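
INTO is normally placed directly after a signed arithmetic instruction, as
in this sketch:

 add eax, ebx           ; May overflow as a signed addition
 into                   ; Calls the INT 4 handler if OF = 1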


 { Back to contents screen:hcContents}


.topic INVD

  INVD - Invalidate data cache (486+)
 

  Description
 

Invalidates (flushes) the processor's internal caches and issues a special-function
bus cycle that directs external caches to also flush themselves. Data held in
internal caches is not written back to main memory.

After executing this instruction, the processor does not wait for the external
caches to complete their flushing operation before proceeding with instruction
execution. It is the responsibility of hardware to respond to the cache flush
signal.

The INVD instruction is a privileged instruction. When the processor is
running in protected mode, the CPL of a program or procedure must be 0 to
execute this instruction.

Use this instruction with care. Data cached internally and not written back
to main memory will be lost. Unless there is a specific requirement or benefit
to flushing caches without writing back modified cache lines (for example,
testing or fault recovery where cache coherency with main memory is not a
concern), software should use the WBINVD instruction.


  Flags affected
 

None.


  Instruction size and timings
 

 bytes   8088    186     286     386     486     Pentium
  2       -       -       -       -       4      15   NP


  Example
 

 invd      ; Invalidate data cache


 { Back to contents screen:hcContents}


.topic INVLPG

  INVLPG - Invalidate TLB entry (486+)
 

  Description
 

Invalidates (flushes) the translation lookaside buffer (TLB) entry specified
with the source operand. The source operand is a memory address. The processor
determines the page that contains that address and flushes the TLB entry for
that page.

The INVLPG instruction is a privileged instruction. When the processor is
running in protected mode, the CPL of a program or procedure must be 0 to
execute this instruction.

The INVLPG instruction normally flushes the TLB entry only for the specified
page; however, in some cases, it flushes the entire TLB.


  Flags affected
 

None.


  Instruction size and timings
 

  operands  bytes                                   486     Pentium
  mem32      5                                       12      25   NP


  Example
 

 invlpg [eax]   ; Invalidate TLB entry specified by eax


 { Back to contents screen:hcContents}


.topic IRET

  IRET/IRETD - Interrupt return
 

  Description
 

Returns control to the point of interruption by popping IP, CS and then the
flags from the stack, and continues execution at that location. After a
fault-type CPU exception, IRET returns to the instruction that caused the
exception, because the CS:IP placed on the stack during the interrupt is the
address of the offending instruction.


  Flags affected
 

All the flags and fields in the EFLAGS register are potentially modified,
depending on the mode of operation of the processor. If performing a return
from a nested task to a previous task, the EFLAGS register will be modified
according to the EFLAGS image stored in the previous task's TSS.


  Instruction size and timings
 


  IRET

 bytes   8088    186     286     386     486     Pentium
  1       44      28      17+m    22      15     8-27  NP


  IRETD (386+)

 bytes                           386     486     Pentium
  1                               22      15     10-27  NP


  Example
 

 iret           ; Return from interrupt
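
A skeleton of a real-address mode interrupt handler, as a sketch only (the
label and the choice of saved register are hypothetical):

my_handler:
 push ax                ; Preserve any registers the handler uses
                        ; (handler body goes here)
 pop  ax
 iret                   ; Pop IP, CS and the flags, resume the program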


 { Back to contents screen:hcContents}


.topic Jcc

  Jcc - Jump on condition
 

  Description
 

Checks the state of one or more of the status flags in the EFLAGS register
(CF, OF, PF, SF, and ZF) and, if the flags are in the specified state
(condition), performs a jump to the target instruction specified by the
destination operand. A condition code (cc) is associated with each instruction
to indicate the condition being tested for. If the condition is not satisfied,
the jump is not performed and execution continues with the instruction
following the Jcc instruction.

The target instruction is specified with a relative offset (a signed offset
relative to the current value of the instruction pointer in the EIP register).
A relative offset (rel8, rel16, or rel32) is generally specified as a label
in assembly code, but at the machine code level, it is encoded as a signed
8-bit, 16-bit or 32-bit immediate value, which is added to the instruction
pointer. Instruction coding is most efficient for offsets of -128 to +127. If
the operand-size attribute is 16, the upper two bytes of the EIP register are
cleared to 0s, resulting in a maximum instruction pointer size of 16 bits.

The conditions for each Jcc mnemonic are given in the "Description" column of
the table below. The terms "less" and "greater" are used for comparisons of
signed integers and the terms "above" and "below" are used for unsigned
integers.
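
The choice between the two families matters whenever an operand has its top
bit set. A short sketch with hypothetical labels: with AL = 0FFh and BL = 1,
the same CMP result is read as "255 is above 1" by JA but as "-1 is not
greater than 1" by JG.

         mov al, 0xFF
         mov bl, 1
         cmp al, bl          ; Computes AL - BL and sets the status flags
         jg  is_greater      ; Not taken: SF=1 and OF=0, so SF != OF
         ja  is_above        ; Taken: CF=0 and ZF=0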

Because a particular state of the status flags can sometimes be interpreted
in two ways, two mnemonics are defined for some opcodes. For example, the JA
(jump if above) instruction and the JNBE (jump if not below or equal)
instruction are alternate mnemonics for the same opcode (77H in the short
form).

The Jcc instruction does not support far jumps (jumps to other code segments).
When the target for the conditional jump is in a different segment, use the
opposite condition from the condition being tested for the Jcc instruction,
and then access the target with an unconditional far jump (JMP instruction)
to the other segment. For example, the following conditional far jump is
illegal:

        jz FARLABEL  ; Far jump

To accomplish this far jump, use the following two instructions:

        jnz BEYOND   ; If not zero skip far jump
        jmp FARLABEL ; Unconditional far jump

        BEYOND:      ; Label

The JECXZ and JCXZ instructions differ from the other Jcc instructions
because they do not check the status flags. Instead they check the contents
of the ECX and CX registers, respectively, for 0. Either the CX or ECX
register is chosen according to the address-size attribute. These instructions
are useful at the beginning of a conditional loop that terminates with a
conditional loop instruction (such as {LOOPNE:LOOPNZ}). They prevent entering
the loop when the ECX or CX register is equal to 0, which would cause the loop
to execute 2^32 or 64K times, respectively, instead of zero times.
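
A sketch of this guard, assuming 32-bit code. The labels and the count
variable are hypothetical.

         mov   ecx, [count]  ; Number of iterations, possibly zero
         jecxz skip_loop     ; Without this, ECX=0 would loop 2^32 times
 next_item:
         ; ... loop body ...
         loop  next_item     ; Decrement ECX, repeat while ECX is not 0
 skip_loop: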

All conditional jumps are converted to code fetches of one or two cache lines,
regardless of jump address or cacheability.


  Jump Instructions Table

 Mnemonic    Meaning                               Jump Condition

 JA          Jump if Above                         CF=0 and ZF=0
 JAE         Jump if Above or Equal                CF=0
 JB          Jump if Below                         CF=1
 JBE         Jump if Below or Equal                CF=1 or ZF=1
 JC          Jump if Carry                         CF=1
 JCXZ        Jump if CX Zero                       CX=0
 JE          Jump if Equal                         ZF=1
 JECXZ       Jump if ECX Zero                      ECX=0
 JG          Jump if Greater (signed)              ZF=0 and SF=OF
 JGE         Jump if Greater or Equal (signed)     SF=OF
 JL          Jump if Less (signed)                 SF != OF
 JLE         Jump if Less or Equal (signed)        ZF=1 or SF != OF
 JMP         Unconditional Jump                    unconditional
 JNA         Jump if Not Above                     CF=1 or ZF=1
 JNAE        Jump if Not Above or Equal            CF=1
 JNB         Jump if Not Below                     CF=0
 JNBE        Jump if Not Below or Equal            CF=0 and ZF=0
 JNC         Jump if Not Carry                     CF=0
 JNE         Jump if Not Equal                     ZF=0
 JNG         Jump if Not Greater (signed)          ZF=1 or SF != OF
 JNGE        Jump if Not Greater or Equal (signed) SF != OF
 JNL         Jump if Not Less (signed)             SF=OF
 JNLE        Jump if Not Less or Equal (signed)    ZF=0 and SF=OF
 JNO         Jump if Not Overflow (signed)         OF=0
 JNP         Jump if No Parity                     PF=0
 JNS         Jump if Not Signed (signed)           SF=0
 JNZ         Jump if Not Zero                      ZF=0
 JO          Jump if Overflow (signed)             OF=1
 JP          Jump if Parity                        PF=1
 JPE         Jump if Parity Even                   PF=1
 JPO         Jump if Parity Odd                    PF=0
 JS          Jump if Signed (signed)               SF=1
 JZ          Jump if Zero                          ZF=1


  Flags affected
 

None.


  Instruction size and timings
 

  Jcc

 operand     bytes   8088    186     286     386     486     Pentium
 near8        2      4/16    4/13    3/7+m   3/7+m   1/3     1    PV
 near16       3       -       -       -      3/7+m   1/3     1    PV

 Note: Cycles shown for no jump/jump

  JCXZ / JECXZ

 operand    bytes   8088    186     286     386     486     Pentium
 dest        2      6/18    5/16    4/8+m   5/9+m   5/8     5/6  NP
 dest        2       -       -       -      5/9+m   5/8     5/6  NP


  Example
 

 jne not_equal  ; Jump to label 'not_equal' if not equal condition is met

 jcxz cx_zero   ; Jump to label 'cx_zero' if the CX register is zero


 { Back to contents screen:hcContents}


.topic JMP

  JMP - Unconditional jump
 

  Description
 

Transfers program control to a different point in the instruction stream
without recording return information. The destination (target) operand
specifies the address of the instruction being jumped to. This operand can
be an immediate value, a general-purpose register, or a memory location.

This instruction can be used to execute four different types of jumps:

  Near jump

A jump to an instruction within the current code segment (the segment
currently pointed to by the CS register), sometimes referred to as an
intrasegment jump.

  Short jump

A near jump where the jump range is limited to -128 to +127 from the current
EIP value.

  Far jump

A jump to an instruction located in a different segment than the current code
segment but at the same privilege level, sometimes referred to as an
intersegment jump.

  Task switch

A jump to an instruction located in a different task. A task switch can only
be executed in protected mode.

  Near and Short Jumps

When executing a near jump, the processor jumps to the address (within the
current code segment) that is specified with the target operand. The target
operand specifies either an absolute offset (that is, an offset from the base
of the code segment) or a relative offset (a signed displacement relative to
the current value of the instruction pointer in the EIP register). A near
jump to a relative offset of 8-bits (rel8) is referred to as a short jump.
The CS register is not changed on near and short jumps.

An absolute offset is specified indirectly in a general-purpose register or a
memory location (r/m16 or r/m32). The operand-size attribute determines the
size of the target operand (16 or 32 bits). Absolute offsets are loaded
directly into the EIP register. If the operand-size attribute is 16, the
upper two bytes of the EIP register are cleared to 0s, resulting in a maximum
instruction pointer size of 16 bits.

A relative offset (rel8, rel16, or rel32) is generally specified as a label
in assembly code. This value is added to the value in the EIP register. (Here,
the EIP register contains the address of the instruction following the JMP
instruction). When using relative offsets, the opcode (for short vs. near
jumps) and the operand-size attribute (for near relative jumps) determine the
size of the target operand (8, 16, or 32 bits).

  Far Jumps in Real-Address or Virtual-8086 Mode

When executing a far jump in real-address or virtual-8086 mode, the processor
jumps to the code segment and offset specified with the target operand. Here
the target operand specifies an absolute far address either directly with a
pointer (ptr16:16 or ptr16:32) or indirectly with a memory location (m16:16
or m16:32). With the pointer method, the segment and offset of the target
instruction are encoded in the instruction, using a 4-byte (16-bit operand
size) or 6-byte (32-bit operand size) far address immediate. With the indirect
method, the target operand specifies a memory location that contains a 4-byte
(16-bit operand size) or 6-byte (32-bit operand size) far address. The far
address is loaded directly into the CS and EIP registers. If the operand-size
attribute is 16, the upper two bytes of the EIP register are cleared to 0s.

  Far Jumps in Protected Mode

When the processor is operating in protected mode, the JMP instruction can be
used to perform the following three types of far jumps:

  - A far jump to a conforming or non-conforming code segment.
  - A far jump through a call gate.
  - A task switch.

(The JMP instruction cannot be used to perform interprivilege level far jumps.)

In protected mode, the processor always uses the segment selector part of the
far address to access the corresponding descriptor in the GDT or LDT. The
descriptor type (code segment, call gate, task gate, or TSS) and access
rights determine the type of jump to be performed.

If the selected descriptor is for a code segment, a far jump to a code
segment at the same privilege level is performed. (If the selected code
segment is at a different privilege level and the code segment is non-conforming,
a general-protection exception is generated.) A far jump to the same privilege
level in protected mode is very similar to one carried out in real-address or
virtual-8086 mode. The target operand specifies an absolute far address either
directly with a pointer (ptr16:16 or ptr16:32) or indirectly with a memory
location (m16:16 or m16:32). The operand-size attribute determines the size
of the offset (16 or 32 bits) in the far address. The new code segment
selector and its descriptor are loaded into the CS register, and the offset
from the instruction is loaded into the EIP register. Note that a call gate
can also be used to perform a far jump to a code segment at the same privilege
level. Using this mechanism provides an extra level of indirection and is the
preferred method of making jumps between 16-bit and 32-bit code segments.


  Flags affected
 

All flags are affected if a task switch occurs; no flags are affected if a
task switch does not occur.


  Instruction size and timings
 

 operand     bytes   8088    186     286     386     486     Pentium
 short        2      15      13      7+m     7+m      3       1   PV
 near         3      15      13      7+m     7+m      3       1   PV
 far          5      15      13     11+m    12+m     17       3   NP
 r16          2      11      11      7+m     7+m      5       2   NP
 mem16      2+d(0,2) 18+EA   17     11+m    10+m      5       2   NP
 mem32      2+d(4)   24+EA   26     15+m    12+m     13       4   NP
 r32          2       -       -       -      7+m      5       2   NP
 mem32      2+d(0,2)  -       -       -     10+m      5       2   NP
 mem48      2+d(6)    -       -       -     12+m     13       4   NP


  Example
 

 jmp target_address     ; Unconditional jump to target_address
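
The different jump types are written as follows in NASM syntax. This is a
sketch only: the labels, the jump_table and far_ptr variables and the 0x0008
selector are hypothetical, and 32-bit code is assumed.

 jmp short near_label       ; Short jump: rel8 displacement
 jmp near distant_label     ; Near jump: rel16/rel32 displacement
 jmp eax                    ; Near indirect jump: offset taken from EAX
 jmp [jump_table+ebx*4]     ; Near indirect jump: offset read from memory
 jmp 0x0008:new_code        ; Far jump: selector:offset encoded directly
 jmp far [far_ptr]          ; Far indirect jump: far address read from memory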


 { Back to contents screen:hcContents}


.topic LAHF

  LAHF - Load status flags into AH register
 

  Description
 

Moves the low byte of the EFLAGS register (which includes status flags SF, ZF,
AF, PF, and CF) to the AH register. Reserved bits 1, 3, and 5 of the EFLAGS
register are copied along with the flags: bit 1 always reads as 1, and bits 3
and 5 always read as 0.


  Flags affected
 

None (that is, the state of the flags in the EFLAGS register is not affected).


  Instruction size and timings
 

 bytes   8088    186     286     386     486     Pentium
  1       4       2       2       2       3       2   NP


  Example
 

 lahf      ; Load flags into AH
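
After LAHF, the flag images sit at the same bit positions they occupy in the
low byte of EFLAGS: SF in bit 7, ZF in bit 6, AF in bit 4, PF in bit 2 and CF
in bit 0. A short sketch that preserves these flags around code that would
otherwise change them, using the counterpart instruction SAHF to write AH back.
Note that OF is not part of the low byte and is not preserved this way.

 lahf              ; AH = SF:ZF:0:AF:0:PF:1:CF
 add bx, cx        ; Arithmetic that changes the status flags
 sahf              ; Restore SF, ZF, AF, PF and CF from AH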


 { Back to contents screen:hcContents}


.topic LAR

  LAR - Load access rights byte (286+)
 

  Description
 

Loads the access rights from the segment descriptor specified by the second
operand (source operand) into the first operand (destination operand) and
sets the ZF flag in the EFLAGS register. The source operand (which can be a
register or a memory location) contains the segment selector for the segment
descriptor being accessed. The destination operand is a general-purpose register.

The processor performs access checks as part of the loading process. Once loaded
in the destination register, software can perform additional checks on the
access rights information.

When the operand size is 32 bits, the access rights for a segment descriptor
include the type and DPL fields and the S, P, AVL, D/B, and G flags, all of
which are located in the second double-word (bytes 4 through 7) of the segment
descriptor. The doubleword is masked by 00FXFF00H before it is loaded into the
destination operand.

When the operand size is 16 bits, the access rights include the type and DPL
fields. Here, the two lower-order bytes of the doubleword are masked by FF00H
before being loaded into the destination operand.

This instruction performs the following checks before it loads the access rights
in the destination register:

 - Checks that the segment selector is not null.

 - Checks that the segment selector points to a descriptor that is within the
   limits of the GDT or LDT being accessed.

 - Checks that the descriptor type is valid for this instruction. All code and
   data segment descriptors are valid for (can be accessed with) the LAR
   instruction. The valid system segment and gate descriptor types are given
   in the table below.

 - If the segment is not a conforming code segment, it checks that the
   specified segment descriptor is visible at the CPL (that is, if the CPL and
   the RPL of the segment selector are less than or equal to the DPL of the
   segment descriptor).

If the segment descriptor cannot be accessed or is an invalid type for the
instruction, the ZF flag is cleared and no access rights are loaded in the
destination operand.

The LAR instruction can only be executed in protected mode.

 Type               Name                    Valid
 0                  Reserved                No
 1                  Available 16-bit TSS    Yes
 2                  LDT                     Yes
 3                  Busy 16-bit TSS         Yes
 4                  16-bit call gate        Yes
 5                  16-bit/32-bit task gate Yes
 6                  16-bit interrupt gate   No
 7                  16-bit trap gate        No
 8                  Reserved                No
 9                  Available 32-bit TSS    Yes
 A                  Reserved                No
 B                  Busy 32-bit TSS         Yes
 C                  32-bit call gate        Yes
 D                  Reserved                No
 E                  32-bit interrupt gate   No
 F                  32-bit trap gate        No


  Flags affected
 

The ZF flag is set to 1 if the access rights are loaded successfully;
otherwise, it is cleared to 0.


  Instruction size and timings
 

 operands    bytes                   286     386     486     Pentium
 r16, r16     3                      14      15      11       8   NP
 r32, r32     3                       -      15      11       8   NP
 r16, m16     3                      16      16      11       8   NP
 r32, m32     3                       -      16      11       8   NP


  Example
 

 lar ebx, eax           ; EBX = access rights of the descriptor selected by AX
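
Because LAR reports an inaccessible descriptor through ZF, the flag should be
tested before the result is used. A sketch, assuming 32-bit protected-mode
code and a hypothetical not_accessible label:

 lar  eax, ecx          ; CX holds the selector to examine
 jnz  not_accessible    ; ZF=0: descriptor not visible or invalid type
 test ah, 0x80          ; Bit 15 of the result is the P (present) flag
 jz   not_accessible    ; Segment is marked not present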


 { Back to contents screen:hcContents}


.topic LDS

  LDS/LES/LFS/LGS/LSS - Load far pointer
 

  Description
 

Loads a far pointer (segment selector and offset) from the second operand
(source operand) into a segment register and the first operand (destination
operand). The source operand specifies a 48-bit or a 32-bit pointer in memory
depending on the current setting of the operand-size attribute (32 bits or 16
bits, respectively). The instruction opcode and the destination operand
specify a segment register/general-purpose register pair. The 16-bit segment
selector from the source operand is loaded into the segment register
specified with the opcode (DS, SS, ES, FS, or GS). The 32-bit or 16-bit
offset is loaded into the register specified with the destination operand.

If one of these instructions is executed in protected mode, additional
information from the segment descriptor pointed to by the segment selector
in the source operand is loaded in the hidden part of the selected segment
register.

Also in protected mode, a null selector (values 0000 through 0003) can be
loaded into DS, ES, FS, or GS registers without causing a protection
exception. (Any subsequent reference to a segment whose corresponding segment
register is loaded with a null selector causes a general-protection exception
and no memory reference to the segment occurs.)


  Flags affected
 

None.


  Instruction size and timings
 

  LDS/LES

 operands    bytes   8088    186     286     386     486     Pentium
 reg, mem   2+d(2)   24+EA   18       7       7       6       4   NP


  LFS/LGS/LSS (386+)

 operands    bytes                           386     486     Pentium
 reg, mem   3+d(2,4)                          7       6       4   NP


  Example
 

 lds si, [ptr_1]        ; Load DS:SI with the far pointer stored at ptr_1
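
In memory, the offset comes first and the segment selector follows it. A
16-bit sketch using a hypothetical variable video_ptr:

 video_ptr: dw 0x0000         ; Offset part (low word)
            dw 0xB800         ; Segment part (high word)

 lds si, [video_ptr]          ; SI = 0x0000, DS = 0xB800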


 { Back to contents screen:hcContents}


.topic LEA

  LEA - Load effective address
 

  Description
 

Computes the effective address of the second operand (the source operand) and
stores it in the first operand (destination operand). The source operand is a
memory address (offset part) specified with one of the processor's addressing
modes; the destination operand is a general-purpose register. The address-size
and operand-size attributes affect the action performed by this instruction,
as shown in the following table. The operand-size attribute of the instruction
is determined by the chosen register; the address-size attribute is determined
by the attribute of the code segment.

 Operand   Address      Action Performed
 Size      Size

 16        16           16-bit effective address is calculated and stored in
                        requested 16-bit register destination.

 16        32           32-bit effective address is calculated. The lower 16
                        bits of the address are stored in the requested 16-bit
                        register destination.

 32        16           16-bit effective address is calculated. The 16-bit
                        address is zero-extended and stored in the requested
                        32-bit register destination.

 32        32           32-bit effective address is calculated and stored in
                        the requested 32-bit register destination.


  Flags affected
 

None.


  Instruction size and timings
 

 operands    bytes   8088    186     286     386     486     Pentium
 r16, mem    2+d(2)  2+EA     6       3       2      1-2      1   UV
 r32, mem    2+d(2)   -       -       -       2      1-2      1   UV


  Example
 


 lea  eax, [eax+ebx*2+3]       ; EAX = effective address of [eax+ebx*2+3]
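
Because LEA only performs the address calculation and never accesses memory,
it is often used for simple arithmetic that leaves the flags untouched. A few
sketches, assuming 32-bit code (buffer is a hypothetical label):

 lea  eax, [eax+eax*4]         ; EAX = EAX * 5
 lea  edx, [ebx+ecx]           ; EDX = EBX + ECX, EBX and ECX unchanged
 lea  esi, [buffer+16]         ; ESI = address of buffer plus 16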


 { Back to contents screen:hcContents}


.topic LEAVE

  LEAVE - High level procedure exit (186+)
 

  Description
 

Releases the stack frame set up by an earlier {ENTER} instruction. The LEAVE
instruction copies the frame pointer (in the EBP register) into the stack
pointer register (ESP), which releases the stack space allocated to the stack
frame. The old frame pointer (the frame pointer for the calling procedure that
was saved by the {ENTER} instruction) is then popped from the stack into the
EBP register, restoring the calling procedure's stack frame.

A RET instruction is commonly executed following a LEAVE instruction to
return program control to the calling procedure.


  Flags affected
 

None.


  Instruction size and timings
 

  bytes           186     286     386     486     Pentium
   1               8       5       4       5       3   NP


  Example
 

 leave     ; Exit current procedure
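
In 32-bit code, LEAVE is equivalent to MOV ESP, EBP followed by POP EBP. A
sketch of a procedure with 8 bytes of local storage (my_proc is a hypothetical
label):

 my_proc:
         enter 8, 0              ; PUSH EBP / MOV EBP, ESP / SUB ESP, 8
         mov   dword [ebp-4], 0  ; Use the first local variable
         leave                   ; MOV ESP, EBP / POP EBP
         ret                     ; Return to the calling procedure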


 { Back to contents screen:hcContents}


.topic LGDT

  LGDT/LIDT - Load Global/Interrupt Descriptor Table Register (286+)
 

  Description
 

Loads the values in the source operand into the global descriptor table
register (GDTR) or the interrupt descriptor table register (IDTR). The source
operand specifies a 6-byte memory location that contains the base address
(a linear address) and the limit (size of table in bytes) of the global
descriptor table (GDT) or the interrupt descriptor table (IDT). If the
operand-size attribute is 32 bits, a 16-bit limit (lower 2 bytes of the 6-byte
data operand) and a 32-bit base address (upper 4 bytes of the data operand) are
loaded into the register. If the operand-size attribute is 16 bits, a 16-bit
limit (lower 2 bytes) and a 24-bit base address (third, fourth, and fifth byte)
are loaded. Here, the high-order byte of the operand is not used and the
high-order byte of the base address in the GDTR or IDTR is filled with zeros.

The LGDT and LIDT instructions are used only in operating-system software;
they are not used in application programs. They are the only instructions
that directly load a linear address (that is, not a segment-relative address)
and a limit in protected mode. They are commonly executed in real-address
mode to allow processor initialization prior to switching to protected mode.


  Flags affected
 

None.


  Instruction size and timings
 

 operand     bytes                   286     386     486     Pentium
  mem48       5                      11      11      11       6   NP


  Example
 

 lgdt [descriptor+ebx]  ; Load Global Descriptor Table
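
The 6-byte operand is usually built as a data item. In the sketch below all
labels are hypothetical, the two descriptors are the commonly used flat 4 GB
code and data segments, and the code is assumed to run with a segment base of
0 so that the value of gdt_start equals its linear address. The limit is
stored here as the table size minus 1, the offset of the last valid byte.

 gdt_start:
         dq 0                        ; Entry 0: null descriptor
         dq 0x00CF9A000000FFFF       ; Entry 1: flat 32-bit code segment
         dq 0x00CF92000000FFFF       ; Entry 2: flat 32-bit data segment
 gdt_end:

 gdt_ptr:
         dw gdt_end - gdt_start - 1  ; 16-bit limit
         dd gdt_start                ; 32-bit linear base address

         lgdt [gdt_ptr]              ; Load GDTR from the 6-byte operand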


 { Back to contents screen:hcContents}


.topic LLDT

  LLDT - Load Local Descriptor Table register (286+)
 

  Description
 

Loads the source operand into the segment selector field of the local
descriptor table register (LDTR). The source operand (a general-purpose
register or a memory location) contains a segment selector that points to a
local descriptor table (LDT). After the segment selector is loaded in the
LDTR, the processor uses the segment selector to locate the segment descriptor
for the LDT in the global descriptor table (GDT). It then loads the segment
limit and base address for the LDT from the segment descriptor into the LDTR.

The segment registers DS, ES, SS, FS, GS, and CS are not affected by this
instruction, nor is the LDTR field in the task state segment (TSS) for the
current task.

If the source operand is 0, the LDTR is marked invalid and all references to
descriptors in the LDT (except by the LAR, VERR, VERW or LSL instructions)
cause a general protection exception.

The operand-size attribute has no effect on this instruction.

The LLDT instruction is provided for use in operating-system software; it should
not be used in application programs. Also, this instruction can only be
executed in protected mode.


  Flags affected
 

None.


  Instruction size and timings
 

 operand     bytes                   286     386     486     Pentium
  r16         3                      17      20      11       9   NP
  mem16     3+d(0-2)                 19      24      11       9   NP


  Example
 

 lldt ax        ; Load LDT with AX


 { Back to contents screen:hcContents}


.topic LMSW

  LMSW - Load Machine Status Word (286+)
 

  Description
 

Loads the source operand into the machine status word, bits 0 through 15 of
register CR0. The source operand can be a 16-bit general-purpose register or
a memory location. Only the low-order 4 bits of the source operand (which
contain the PE, MP, EM, and TS flags) are loaded into CR0. The PG, CD, NW,
AM, WP, NE, and ET flags of CR0 are not affected. The operand-size attribute
has no effect on this instruction.

If the PE flag of the source operand (bit 0) is set to 1, the instruction
causes the processor to switch to protected mode. While in protected mode,
the LMSW instruction cannot be used to clear the PE flag and force a switch
back to real-address mode.

The LMSW instruction is provided for use in operating-system software; it
should not be used in application programs. In protected or virtual-8086 mode,
it can only be executed at CPL 0.

This instruction is provided for compatibility with the Intel 286 processor;
programs and procedures intended to run on the Pentium Pro, Pentium, 486, and
386 processors should use the {MOV} (control registers) instruction to load
the whole CR0 register. The MOV CR0 instruction can be used to set and clear
the PE flag in CR0, allowing a procedure or program to switch between
protected and real-address modes.

This instruction is a serializing instruction.


  Flags affected
 

None.


  Instruction size and timings
 

 operand     bytes                   286     386     486     Pentium
  r16         3                       3      10      13       8   NP
  mem16     3+d(0-2)                  6      13      13       8   NP


  Example
 

 lmsw ax     ; Load machine status word from AX
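
A sketch of a 286-style switch to protected mode. It assumes that the GDT has
already been loaded with LGDT and that interrupts are disabled. The 0x0008
selector and the pmode_entry label are hypothetical. SMSW reads the current
machine status word so that the other flags are preserved.

 smsw ax                  ; AX = current machine status word
 or   ax, 1               ; Set PE (bit 0)
 lmsw ax                  ; Processor switches to protected mode
 jmp  0x0008:pmode_entry  ; Far jump reloads CS with a protected-mode selector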


 { Back to contents screen:hcContents}


.topic LOCK

  LOCK - Lock bus
 

  Description
 

Causes the processor's LOCK# signal to be asserted during execution of the
accompanying instruction (turns the instruction into an atomic instruction).
In a multiprocessor environment, the LOCK# signal ensures that the processor
has exclusive use of any shared memory while the signal is asserted.

Note that in later Intel Architecture processors (such as the Pentium Pro
processor), locking may occur without the LOCK# signal being asserted.

The LOCK prefix can be prepended only to the following instructions and to
those forms of the instructions that use a memory operand: ADD, ADC, AND,
BTC, BTR, BTS, CMPXCHG, DEC, INC, NEG, NOT, OR, SBB, SUB, XOR, XADD, and XCHG.
An undefined opcode exception will be generated if the LOCK prefix is used
with any other instruction. The XCHG instruction always asserts the LOCK#
signal regardless of the presence or absence of the LOCK prefix.

The LOCK prefix is typically used with the BTS instruction to perform a
read-modify-write operation on a memory location in a shared memory
environment.

The integrity of the LOCK prefix is not affected by the alignment of the
memory field. Memory locking is observed for arbitrarily misaligned fields.

  Intel Architecture Compatibility

Beginning with the Pentium Pro processor, when the LOCK prefix is prefixed to
an instruction and the memory area being accessed is cached internally in the
processor, the LOCK# signal is generally not asserted. Instead, only the
processor's cache is locked. Here, the processor's cache coherency mechanism
ensures that the operation is carried out atomically with regard to memory.


  Flags affected
 

None.


  Instruction size and timings
 

 bytes   8088    186     286     386     486     Pentium
  1       2       2       0       0       1       1   NP


  Example
 

 lock  inc word [mem]   ; Lock the bus while incrementing a memory operand
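
A sketch of the typical BTS usage: a simple spin lock held in bit 0 of a
hypothetical lock_var variable. 32-bit code on a 386 or later is assumed.

 acquire:
         lock bts dword [lock_var], 0  ; CF = old bit 0, then bit 0 is set
         jc   acquire                  ; Bit was already set: lock is held
         ; ... critical section ...
         mov  dword [lock_var], 0      ; Release the lock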


 { Back to contents screen:hcContents}


.topic LODS

  LODS - Load string
 

  Description
 

Loads a byte, word, or doubleword from the source operand into the AL, AX, or
EAX register, respectively. The source operand is a memory location, the
address of which is read from the DS:ESI or the DS:SI registers (depending on
the address-size attribute of the instruction, 32 or 16, respectively). The
DS segment may be overridden with a segment override prefix.

At the assembly-code level, two forms of this instruction are allowed: the
"explicit-operands" form and the "no-operands" form. The explicit-operands
form (specified with the LODS mnemonic) is not supported by NASM.

The no-operands form provides "short forms" of the byte, word, and doubleword
versions of the LODS instructions. Here also DS:(E)SI is assumed to be the
source operand and the AL, AX, or EAX register is assumed to be the
destination operand. The size of the source and destination operands is
selected with the mnemonic: LODSB (byte loaded into register AL), LODSW
(word loaded into AX), or LODSD (doubleword loaded into EAX).

After the byte, word, or doubleword is transferred from the memory location
into the AL, AX, or EAX register, the (E)SI register is incremented or
decremented automatically according to the setting of the DF flag in the
EFLAGS register. (If the DF flag is 0, the (E)SI register is incremented;
if the DF flag is 1, the (E)SI register is decremented.) The (E)SI register
is incremented or decremented by 1 for byte operations, by 2 for word
operations, or by 4 for doubleword operations.

The LODSB, LODSW, and LODSD instructions can be preceded by the REP prefix
for block loads of ECX bytes, words, or doublewords. More often, however,
these instructions are used within a LOOP construct because further
processing of the data moved into the register is usually necessary before
the next transfer can be made. See {REP/REPE/REPZ/REPNE/REPNZ:REP} for more
information on the repeat prefix.


  Flags affected
 

None.


  Instruction size and timings
 

 variations  bytes   8088    186     286     386     486     Pentium
 lodsb        1      16      10       5       5       5       2   NP
 lodsw        1      16      10       5       5       5       2   NP
 lodsd        1       -       -       -       5       5       2   NP


  Example
 

 lodsb     ; Move the byte from DS:SI into AL
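
A sketch of LODSB inside a LOOP construct, summing the bytes of a buffer.
32-bit code is assumed. The names buffer and buffer_len are hypothetical, and
buffer_len is taken to be an assemble-time constant.

         mov   esi, buffer       ; DS:ESI = start of the data
         mov   ecx, buffer_len   ; Number of bytes to process
         xor   ebx, ebx          ; EBX = running total
         cld                     ; DF=0: ESI is incremented
         jecxz sum_done          ; Nothing to do for an empty buffer
 next_byte:
         lodsb                   ; AL = byte at DS:ESI, ESI = ESI + 1
         movzx eax, al           ; Zero-extend the byte
         add   ebx, eax          ; Add it to the total
         loop  next_byte         ; Decrement ECX, repeat while ECX is not 0
 sum_done: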


 { Back to contents screen:hcContents}


.topic LOOP

  LOOP - Decrement CX and loop if CX not zero
 

  Description
 

Performs a loop operation using the ECX or CX register as a counter. Each
time the LOOP instruction is executed, the count register is decremented,
then checked for 0. If the count is 0, the loop is terminated and program
execution continues with the instruction following the LOOP instruction.
If the count is not zero, a near jump is performed to the destination
(target) operand, which is presumably the instruction at the beginning of
the loop. If the address-size attribute is 32 bits, the ECX register is used
as the count register; otherwise the CX register is used.

The target instruction is specified with a relative offset (a signed offset
relative to the current value of the instruction pointer in the EIP register).
This offset is generally specified as a label in assembly code, but at the
machine code level, it is encoded as a signed, 8-bit immediate value, which
is added to the instruction pointer. Offsets of -128 to +127 are allowed
with this instruction.


  Flags affected
 

None.


  Instruction size and timings
 

 operand   bytes   8088    186     286     386     486     Pentium
 short      2      5/17    5/15    4/8+m   11+m    6/7     5/6  NP

 Note: timings shown for no jump (CX = 0) / jump (CX <> 0)


  Example
 

 loop loop_start        ; If CX <> 0 then loop to 'loop_start'
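
A complete loop, as a sketch for 16-bit code with a hypothetical label, that
clears AX and then adds the numbers 10 down to 1 to it:

         xor  ax, ax         ; AX = 0
         mov  cx, 10         ; Loop counter
 add_next:
         add  ax, cx         ; Add the current counter value
         loop add_next       ; CX = CX - 1, jump if CX is not 0
                             ; Here AX = 55 and CX = 0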


 { Back to contents screen:hcContents}


.topic LOOPE

  LOOPE/LOOPZ - Loop while equal / loop while zero
 

  Description
 

Loops in the same way as the {LOOP} instruction except that the loop occurs
only when (E)CX <> 0 and the zero flag (ZF) is set.


  Flags affected
 

None.


  Instruction size and timings
 

 operand   bytes   8088    186     286     386     486     Pentium
 short      2      6/18    5/16    4/8     11+m    6/9     7/8  NP


  Example
 

 loope loop_start             ; Loop if CX <> 0 and ZF = 1
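
A sketch that compares two byte buffers and stops at the first difference.
32-bit code and a non-zero count are assumed, and all names are hypothetical.
LEA is used to advance EDI because, unlike INC, it leaves ZF untouched.

         mov   esi, buf_a        ; DS:ESI = first buffer
         mov   edi, buf_b        ; DS:EDI = second buffer
         mov   ecx, count        ; Number of bytes, assumed non-zero
         cld                     ; DF=0: LODSB increments ESI
 compare:
         lodsb                   ; AL = byte at DS:ESI, flags unchanged
         cmp   al, [edi]         ; ZF=1 if the bytes match
         lea   edi, [edi+1]      ; Advance EDI without changing the flags
         loope compare           ; Repeat while ECX is not 0 and ZF = 1
         jne   buffers_differ    ; ZF=0 here: a mismatch ended the loop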


 { Back to contents screen:hcContents}

.topic LOOPNZ

  LOOPNE/LOOPNZ - Loop while not equal / loop while not zero
 

  Description
 

Loops in the same way as the {LOOP} instruction except that the loop occurs
only when (E)CX <> 0 and the zero flag (ZF) is not set.


  Flags affected
 

None.


  Instruction size and timings
 

 operand   bytes   8088    186     286     386     486     Pentium
 short      2      5/19    5/16    4/8     11+m    6/9     7/8  NP


  Example
 

 loopne loop_start      ; Loop if CX <> 0 and ZF = 0
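
A sketch that scans a buffer for the first zero byte. 32-bit code and a
non-zero count are assumed, and all names are hypothetical. Because neither
LODSB nor LOOPNE changes the flags, ZF still holds the result of the last CMP
when the loop ends.

         mov    esi, data_buf    ; DS:ESI = data to scan
         mov    ecx, count       ; Maximum number of bytes, assumed non-zero
         cld                     ; DF=0: LODSB increments ESI
 scan:
         lodsb                   ; AL = byte at DS:ESI, ESI = ESI + 1
         cmp    al, 0            ; ZF=1 if the byte is zero
         loopne scan             ; Repeat while ECX is not 0 and ZF = 0
         je     found_zero       ; ZF=1 here: a zero byte was found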


 { Back to contents screen:hcContents}


.topic LSL

  LSL - Load segment limit (286+)
 

  Description
 

Loads the unscrambled segment limit from the segment descriptor specified
with the second operand (source operand) into the first operand (destination
operand) and sets the ZF flag in the EFLAGS register. The source operand
(which can be a register or a memory location) contains the segment selector
for the segment descriptor being accessed. The destination operand is a
general-purpose register.

The processor performs access checks as part of the loading process. Once
loaded in the destination register, software can compare the segment limit
with the offset of a pointer.

The segment limit is a 20-bit value contained in bytes 0 and 1 and in the
first 4 bits of byte 6 of the segment descriptor. If the descriptor has a
byte granular segment limit (the granularity flag is set to 0), the
destination operand is loaded with a byte granular value (byte limit).

If the descriptor has a page granular segment limit (the granularity flag is
set to 1), the LSL instruction will translate the page granular limit (page
limit) into a byte limit before loading it into the destination operand. The
translation is performed by shifting the 20-bit "raw" limit left 12 bits and
filling the low-order 12 bits with 1s.
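
As an illustration of this translation, written as NASM constant expressions
(the raw limit value is an arbitrary example, and the names are hypothetical):

 raw_limit  equ 0x000FF                      ; 20-bit raw (page) limit
 byte_limit equ (raw_limit << 12) | 0xFFF    ; = 0x000FFFFF (byte limit)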

When the operand size is 32 bits, the 32-bit byte limit is stored in the
destination operand. When the operand size is 16 bits, a valid 32-bit limit
is computed; however, the upper 16 bits are truncated and only the low-order
16 bits are loaded into the destination operand.

This instruction performs the following checks before it loads the segment
limit into the destination register:

  Checks that the segment selector is not null.
  Checks that the segment selector points to a descriptor that is within
   the limits of the GDT or LDT being accessed.
  Checks that the descriptor type is valid for this instruction. All code
   and data segment descriptors are valid for (can be accessed with) the LSL
   instruction.
  If the segment is not a conforming code segment, the instruction checks
   that the specified segment descriptor is visible at the CPL (that is, if
   the CPL and the RPL of the segment selector are less than or equal to the
   DPL of the segment selector).

If the segment descriptor cannot be accessed or is an invalid type for the
instruction, the ZF flag is cleared and no value is loaded in the destination
operand.


  Flags affected
 

The ZF flag is set to 1 if the segment limit is loaded successfully;
otherwise, it is cleared to 0.


  Instruction size and timings
 

 operands    bytes                   286     386     486     Pentium
 r16, r16     3                      14      20/25   10       8   NP
 r32, r32     3                       -      20/25   10       8
 r16, m16   3+d(0,2)                 16      21/26   10       8
 r32, m32   3+d(0,2)                  -      21/26   10       8


  Example
 

 lsl eax, ebx           ; EAX = limit of the segment selected by BX
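
Because the destination is only loaded on success, ZF should be tested before
the result is used. A minimal sketch, assuming 32-bit protected-mode code and
an expand-up data segment; 'no_limit' and 'out_of_range' are hypothetical
labels:

 mov bx, ds             ; BX = selector to examine (DS used for illustration)
 lsl eax, ebx           ; EAX = byte limit of that segment, ZF = 1 on success
 jnz no_limit           ; ZF = 0: descriptor not accessible, EAX not loaded
 cmp esi, eax           ; Is the offset in ESI within the limit?
 ja  out_of_range       ; Offsets above the limit lie outside the segment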


 { Back to contents screen:hcContents}


.topic LTR

  LTR - Load task register (286+)
 

  Description
 

Loads the source operand into the segment selector field of the task register.
The source operand (a general-purpose register or a memory location) contains
a segment selector that points to a task state segment (TSS). After the segment
selector is loaded in the task register, the processor uses the segment
selector to locate the segment descriptor for the TSS in the global
descriptor table (GDT). It then loads the segment limit and base address for
the TSS from the segment descriptor into the task register. The task pointed
to by the task register is marked busy, but a switch to the task does not
occur.

The LTR instruction is provided for use in operating-system software; it
should not be used in application programs. It can only be executed in
protected mode when the CPL is 0. It is commonly used in initialization
code to establish the first task to be executed.

The operand-size attribute has no effect on this instruction.


  Flags affected
 

None.


  Instruction size and timings
 

 operand     bytes                   286     386     486     Pentium
 r16          3                      17      23      20      10   NP
 mem16      3+d(0,2)                 19      27      20      10


  Example
 

 ltr ax         ; Load task register from AX


 { Back to contents screen:hcContents}


.topic MOV

  MOV - Move data
 

  Description
 

Copies the second operand (source operand) to the first operand (destination
operand). The source operand can be an immediate value, general-purpose
register, segment register, or memory location; the destination operand can
be a general-purpose register, segment register, or memory location. Both
operands must be the same size, which can be a byte, a word, or a doubleword.

The MOV instruction cannot be used to load the CS register. Attempting to do
so results in an invalid opcode exception. To load the CS register, use the
far {JMP}, {CALL}, or {RET} instruction.

If the destination operand is a segment register (DS, ES, FS, GS, or SS), the
source operand must be a valid segment selector. In protected mode, moving a
segment selector into a segment register automatically causes the segment
descriptor information associated with that segment selector to be loaded
into the hidden (shadow) part of the segment register. While loading this
information, the segment selector and segment descriptor information is
validated. The segment descriptor data is obtained from the GDT or LDT entry
for the specified segment selector.

A null segment selector (values 0000-0003) can be loaded into the DS, ES, FS,
and GS registers without causing a protection exception. However, any
subsequent attempt to reference a segment whose corresponding segment
register is loaded with a null value causes a general protection exception
and no memory reference occurs.

Loading the SS register with a MOV instruction inhibits all interrupts until
after the execution of the next instruction. This operation allows a stack
pointer to be loaded into the ESP register with the next instruction
(MOV ESP, stack-pointer value) before an interrupt occurs. The {LSS:LDS}
instruction offers a more efficient method of loading the SS and ESP
registers.
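
The two approaches can be sketched as follows, assuming 32-bit code.
'stack_sel', 'stack_top' and 'stack_ptr' are hypothetical names: a selector
constant, a stack offset, and a 6-byte variable holding the offset followed
by the selector:

 mov ax, stack_sel      ; AX = new stack selector
 mov ss, ax             ; Interrupts are inhibited until after the next
 mov esp, stack_top     ; instruction, so SS:ESP changes as a unit

 lss esp, [stack_ptr]   ; Alternative: load SS and ESP in one instruction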

When operating in 32-bit mode and moving data between a segment register and
a general-purpose register, the Intel Architecture 32-bit processors do not
require the use of the 16-bit operand-size prefix (a byte with the value 66H)
with this instruction, but most assemblers will insert it if the standard
form of the instruction is used (for example, MOV DS, AX). The processor will
execute this instruction correctly, but it will usually require an extra
clock. With most assemblers, using the instruction form MOV DS, EAX will
avoid this unneeded 66H prefix. When the processor executes the instruction
with a 32-bit general-purpose register, it assumes that the 16 least-significant
bits of the general-purpose register are the destination or source operand.
If the register is a destination operand, the resulting value in the two
high-order bytes of the register is implementation dependent.

For the Pentium Pro processor, the two high-order bytes are filled with zeros;
for earlier 32-bit Intel Architecture processors, the two high-order bytes
are undefined.


  Flags affected
 

None.


  Instruction size and timings
 

 operands    bytes   8088    186     286     386     486     Pentium
 reg, reg     2       2       2       2       2       1       1   UV
 mem, reg  2+d(0-2)  13+EA    9       3       2       1       1   UV
 reg, mem  2+d(0-2)  12+EA   12       5       4       1       1   UV
 mem, imm  2+d(0-2)  14+EA   12-13    3       2       1       1   UV*
            +i(1,2)
 reg, imm  2+i(1,2)   4       3-4     2       2       1       1   UV

 acc, mem     3      14       8       5       4       1       1   UV
 mem, acc     3      14       9       3       2       1       1   UV

 * = not pairable if there is a displacement and immediate

  Segment Register Moves  - Real Mode

 operands    bytes   8088    186     286     386     486     Pentium
 seg, r16     2       2       2       2       2       3     2-11   NP
 seg, m16   2+d(0,2) 12+EA    9       5       5       3     3-12   NP
 r16, seg     2       2       2       2       2       3       1    NP
 m16, seg   2+d(0,2) 13+EA   11       3       2       3       1    NP


  Segment Register Moves - Protected Mode

 operands    bytes                   286     386     486     Pentium
 seg, r16     2                      17      18       9     2-11*  NP
 seg, m16   2+d(0,2)                 19      19       9     3-12*  NP

 * = add 8 if new descriptor; add 6 if SS


  Move to/from special registers (386+)

 operands    bytes                           386     486     Pentium
 r32, cr32    3                               6       4       4    NP
 cr32, r32    3                              4/10*   4/16*  12/22* NP
 r32, dr32    3                              14/22*  10      2/12* NP
 dr32, r32    3                              16/22*  11     11/12* NP
 r32, tr32    3                              12      3/4*     -    NP
 tr32, r32    3                              12      4/6*     -    NP

 * = cycles depend on which special register


  Example
 

 mov eax, ebx           ; EAX = EBX (General move)
 mov ds, ax             ; DS = AX   (Segment register move)
 mov cr0, eax           ; CR0 = EAX (Special register move)


 { Back to contents screen:hcContents}


.topic MOVD

  MOVD - Move 32 bits (MMX)
 

  Description
 

Copies doubleword from the source operand (second operand) to the destination
operand (first operand). Source and destination operands can be MMX registers,
memory locations, or 32-bit general-purpose registers; however, data cannot
be transferred from an MMX register to an MMX register, from one memory
location to another memory location, or from one general-purpose register to
another general-purpose register.

When the destination operand is an MMX register, the 32-bit source value is
written to the low-order 32 bits of the 64-bit MMX register and zero-extended
to 64 bits. When the source operand is an MMX register, the low-order 32 bits
of the MMX register are written to the 32-bit general-purpose register or
32-bit memory location selected with the destination operand.
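
A short illustrative sketch of these transfers (not taken from the Intel
manual). 'value' is a hypothetical doubleword variable:

 mov  eax, 0x12345678
 movd mm0, eax          ; MM0 = 0000000012345678h (upper half cleared)
 movd [value], mm0      ; Store the low 32 bits of MM0 at 'value'
 movd ebx, mm0          ; EBX = 12345678h
 emms                   ; Clear the MMX state before any FPU code runs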


  Flags affected
 

None.


  Instruction size and timings
 

Not available.


  Example
 

Not available.


 { Back to contents screen:hcContents}


.topic MOVQ

  MOVQ - Move 64 bits (MMX)
 

  Description
 

Copies quadword from the source operand (second operand) to the destination
operand (first operand). A source or destination operand can be either an
MMX register or a memory location; however, data cannot be transferred from
one memory location to another memory location. Data can be transferred from
one MMX register to another MMX register.


  Flags affected
 

None.


  Instruction size and timings
 

Not available.


  Example
 

Not available.


 { Back to contents screen:hcContents}


.topic MOVS

  MOVS/MOVSB/MOVSW/MOVSD - Move string
 

  Description
 

Moves the byte, word, or doubleword specified with the second operand (source
operand) to the location specified with the first operand (destination operand).
Both the source and destination operands are located in memory. The address
of the source operand is read from the DS:ESI or the DS:SI registers (depending
on the address-size attribute of the instruction, 32 or 16, respectively).
The address of the destination operand is read from the ES:EDI or the ES:DI
registers (again depending on the address-size attribute of the instruction).

The DS segment may be overridden with a segment override prefix, but the ES
segment cannot be overridden.

At the assembly-code level, two forms of this instruction are allowed: the
"explicit-operands" form and the "no-operands" form. The explicit-operands
form (specified with the MOVS mnemonic) is not supported by NASM.

The no-operands form provides "short forms" of the byte, word, and doubleword
versions of the MOVS instructions. Here also DS:(E)SI and ES:(E)DI are
assumed to be the source and destination operands, respectively. The size of
the source and destination operands is selected with the mnemonic: MOVSB
(byte move), MOVSW (word move), or MOVSD (doubleword move).

After the move operation, the (E)SI and (E)DI registers are incremented or
decremented automatically according to the setting of the DF flag in the
EFLAGS register. (If the DF flag is 0, the (E)SI and (E)DI registers are
incremented; if the DF flag is 1, the (E)SI and (E)DI registers are
decremented.) The registers are incremented or decremented by 1 for byte
operations, by 2 for word operations, or by 4 for doubleword operations.

The MOVS, MOVSB, MOVSW, and MOVSD instructions can be preceded by the {REP}
prefix for block moves of (E)CX bytes, words, or doublewords.


  Flags affected
 

None.


  Instruction size and timings
 

 variations  bytes   8088    186     286     386     486     Pentium
 movsb        1      18       9       5       7       7       4   NP
 movsw        1      26       9       5       7       7       4   NP
 movsd        1       -       -       -       7       7       4   NP
 rep movsb    2      9+17n   8+8n    5+4n    7+4n   12+3n*   3+n  NP
 rep movsw    2      9+25n   8+8n    5+4n    7+4n   12+3n*   3+n  NP
 rep movsd    2       -       -       -      7+4n   12+3n*   3+n  NP

 * = 5 if n=0, 13 if n=1 (where n = count of bytes, words or dwords)


  Example
 

 rep movsb              ; Move CX bytes from DS:SI to ES:DI
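
A fuller sketch of a block copy, assuming 16-bit code with DS and ES already
pointing at the segments that hold the data. 'source' and 'dest' are
hypothetical labels and the count is illustrative:

 cld                    ; DF = 0: SI and DI are incremented
 mov si, source         ; DS:SI = first byte to copy
 mov di, dest           ; ES:DI = first byte of the destination
 mov cx, 100            ; Number of bytes to copy
 rep movsb              ; Copy CX bytes; CX = 0 afterwards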


 { Back to contents screen:hcContents}


.topic MOVSX

  MOVSX - Move with sign-extension (386+)
 

  Description
 

Copies the contents of the source operand (register or memory location) to
the destination operand (register) and sign extends the value to 16 or 32
bits. The size of the converted value depends on the operand-size attribute.


  Flags affected
 

None.


  Instruction size and timings
 

 operands  bytes                           386     486     Pentium
 reg, reg   3                               3       3       3   NP
 reg, mem   3+d(0,1,2,4)                    6       3       3   NP

 Note: destination register is 16 or 32 bits; source is 8 or 16 bits


  Example
 

 movsx  ebx, ax         ; EBX = sign extended AX
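
The effect of the extension is easiest to see with a value whose top bit is
set. An illustrative sketch, with {MOVZX} shown for contrast:

 mov   al, 0x80         ; AL  = 80h (-128 signed, 128 unsigned)
 movsx bx, al           ; BX  = FF80h     (sign bit copied upwards)
 movsx ecx, al          ; ECX = FFFFFF80h
 movzx dx, al           ; DX  = 0080h     (upper bits cleared)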


 { Back to contents screen:hcContents}


.topic MOVZX

  MOVZX - Move with zero-extend (386+)
 

  Description
 

Copies the contents of the source operand (register or memory location) to
the destination operand (register) and zero extends the value to 16 or 32
bits. The size of the converted value depends on the operand-size attribute.


  Flags affected
 

None.


  Instruction size and timings
 

 operands  bytes                           386     486     Pentium
 reg, reg   3                               3       3       3   NP
 reg, mem   3+d(0,1,2,4)                    6       3       3   NP

 Note: destination register is 16 or 32 bits; source is 8 or 16 bits


  Example
 

 movzx ebx, ax          ; EBX = zero extended AX


 { Back to contents screen:hcContents}


.topic MUL

  MUL - Unsigned multiply
 

  Description
 

Performs an unsigned multiplication of the first operand (destination operand)
and the second operand (source operand) and stores the result in the destination
operand. The destination operand is an implied operand located in register AL,
AX or EAX (depending on the size of the operand); the source operand is
located in a general-purpose register or a memory location. The action of
this instruction and the location of the result depend on the opcode and
the operand size as shown in the following table.

 Operand Size    Source 1     Source 2      Destination
 Byte            AL           r/m8          AX
 Word            AX           r/m16         DX:AX
 Doubleword      EAX          r/m32         EDX:EAX

The result is stored in register AX, register pair DX:AX, or register pair
EDX:EAX (depending on the operand size), with the high-order bits of the
product contained in register AH, DX, or EDX, respectively. If the high-order
bits of the product are 0, the CF and OF flags are cleared; otherwise, the
flags are set.


  Flags affected
 

The OF and CF flags are cleared to 0 if the upper half of the result is 0;
otherwise, they are set to 1. The SF, ZF, AF, and PF flags are undefined.


  Instruction size and timings
 

 operand     bytes   8088     186    286     386     486     Pentium
 r8           2     70-77    26-28   13      9-14   13-18    11   NP
 r16          2    118-133   35-37   21      9-22   13-26    11   NP
 r32          2       -        -      -      9-38   13-42    10   NP
 mem8    2+d(0-2)  76-83+EA  32-34   16     12-17   13-18    11   NP
 mem16   2+d(0-2) 124-139+EA 41-43   24     12-25   13-26    11   NP
 mem32   2+d(0-2)     -        -      -     12-41   13-42    10   NP


  Example
 

 mul ebx        ; EDX:EAX = EAX * EBX
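
A worked 16-bit case showing how the product is split across DX:AX and how
the flags report a nonzero upper half. The values are illustrative and
'product_needs_dx' is a hypothetical label:

 mov ax, 300
 mov bx, 400
 mul bx                 ; DX:AX = 120000 = 0001:D4C0h
 jc  product_needs_dx   ; CF = OF = 1 because DX <> 0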


 { Back to contents screen:hcContents}


.topic NEG

  NEG - Two's complement negation
 

  Description
 

Replaces the value of the operand (the destination operand) with its two's
complement. (This operation is equivalent to subtracting the operand from 0.)
The destination operand is located in a general-purpose register or a memory
location.


  Flags affected
 

The CF flag is cleared to 0 if the source operand is 0; otherwise it is set
to 1. The OF, SF, ZF, AF, and PF flags are set according to the result.


  Instruction size and timings
 

 operand     bytes   8088    186     286     386     486     Pentium
 reg          2       3       3       2       2       1       1   NP
 mem       2+d(0-2)  24+EA   13       7       6       3       3   NP


  Example
 

 neg eax        ; EAX = 0 - EAX
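
Three illustrative 8-bit cases, covering an ordinary value, zero, and the
one value that has no positive counterpart:

 mov al, 5
 neg al                 ; AL = FBh (-5), CF = 1
 mov al, 0
 neg al                 ; AL = 00h, CF = 0, ZF = 1
 mov al, 0x80
 neg al                 ; AL = 80h (unchanged), CF = 1, OF = 1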


 { Back to contents screen:hcContents}


.topic NOP

  NOP - No operation
 

  Description
 

Performs no operation. This instruction is a one-byte instruction that takes
up space in the instruction stream but does not affect the machine context,
except the EIP register.

The NOP instruction is an alias mnemonic for the XCHG (E)AX, (E)AX instruction.


  Flags affected
 

None.


  Instruction size and timings
 

 bytes   8088    186     286     386     486     Pentium
  1       3       3       3       3       1       1   UV


  Example
 

 nop       ; No operation


 { Back to contents screen:hcContents}


.topic NOT

  NOT - One's complement negation
 

  Description
 

Performs a bitwise NOT operation (each 1 is cleared to 0, and each 0 is set
to 1) on the destination operand and stores the result in the destination
operand location. The destination operand can be a register or a memory
location.


  Flags affected
 

None.


  Instruction size and timings
 

 operands    bytes   8088    186     286     386     486     Pentium
 reg          2       3       3       2       2       1       1   NP
 mem       2+d(0-2)  24+EA   13       7       6       3       3   NP


  Example
 

 not eax        ; Toggle each bit in EAX


 { Back to contents screen:hcContents}


.topic OR

  OR - Logical inclusive OR
 

  Description
 

Performs a bitwise inclusive OR operation between the destination (first)
and source (second) operands and stores the result in the destination
operand location. The source operand can be an immediate, a register, or a
memory location; the destination operand can be a register or a memory
location. (However, two memory operands cannot be used in one instruction.)
Each bit of the result of the OR instruction is 0 if both corresponding bits
of the operands are 0; otherwise, each bit is 1.


  Flags affected
 

The OF and CF flags are cleared; the SF, ZF, and PF flags are set according
to the result. The state of the AF flag is undefined.


  Instruction size and timings
 

 operands     bytes   8088    186     286     386     486     Pentium
 reg, reg      2       3       3       2       2       1       1   UV
 mem, reg   2+d(0,2)  24+EA   10       7       7       3       3   UV
 reg, mem   2+d(0,2)  13+EA   10       7       6       2       2   UV
 reg, imm   2+i(1,2)   4       4       3       2       1       1   UV
 mem, imm   2+d(0,2)  23+EA   16       7       7       3       3   UV*
             +i(1,2)
 acc, imm   1+i(1,2)   4       4       3       2       1       1   UV

 * = not pairable if there is a displacement and immediate


  Example
 

 or eax, ebx    ; Perform logical or storing result in EAX
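
Two common uses, sketched with illustrative values: setting selected bits
with an immediate mask, and ORing a register with itself to test it for zero
without changing it. 'is_zero' is a hypothetical label:

 mov al, 'A'            ; AL = 41h
 or  al, 0x20           ; Set bit 5: AL = 61h ('a')

 or  ebx, ebx           ; EBX unchanged; ZF = 1 only if EBX = 0
 jz  is_zero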


 { Back to contents screen:hcContents}


.topic OUT

  OUT - Output to port
 

  Description
 

Copies the value from the second operand (source operand) to the I/O port
specified with the destination operand (first operand). The source operand
can be register AL, AX, or EAX, depending on the size of the port being
accessed (8, 16, or 32 bits, respectively); the destination operand can be a
byte-immediate or the DX register. Using a byte immediate allows I/O port
addresses 0 to 255 to be accessed; using the DX register as the destination
operand allows I/O ports from 0 to 65,535 to be accessed.

The size of the I/O port being accessed is determined by the opcode for an
8-bit I/O port or by the operand-size attribute of the instruction for a
16- or 32-bit I/O port.

At the machine code level, I/O instructions are shorter when accessing 8-bit
I/O ports. Here, the upper eight bits of the port address will be 0.


  Flags affected
 

None.


  Instruction size and timings
 

 operands    bytes   8088    186     286     386     486     Pentium
 imm8, al     2      14       9       3      10      16      12   NP
 imm8, ax     2      14       9       3      10      16      12   NP
 imm8, eax    2       -       -       -      10      16      12   NP
 dx, al       1      12       7       3      11      16      12   NP
 dx, ax       1      12       7       3      11      16      12   NP
 dx, eax      1       -       -       -      11      16      12   NP

 Protected Mode

 operands    bytes                           386     486     Pentium
 imm8, acc    2                            4/24/24 11/31/29 9/26/24 NP
 dx, acc      1                            5/25/25 10/30/29 9/26/24 NP

 Cycles for: CPL <= IOPL / CPL > IOPL / V86


  Example
 

 out dx, al     ; Send AL to port DX
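
A sketch showing both ways of specifying the port. The port numbers and data
values are illustrative only:

 mov al, 0x20
 out 0x20, al           ; Immediate port number (0-255): AL to port 20h
 mov dx, 0x3F8          ; Port numbers above 255 must be placed in DX
 mov al, 'A'
 out dx, al             ; AL to port 3F8h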


 { Back to contents screen:hcContents}


.topic OUTS

  OUTS - Output string to port
 

  Description
 

Copies data from the source operand (second operand) to the I/O port
specified with the destination operand (first operand). The source operand
is a memory location, the address of which is read from either the DS:ESI or
the DS:SI registers (depending on the address-size attribute of the
instruction, 32 or 16, respectively). (The DS segment may be overridden with
a segment override prefix.) The destination operand is an I/O port address
(from 0 to 65,535) that is read from the DX register. The size of the I/O
port being accessed (that is, the size of the source and destination operands)
is determined by the opcode for an 8-bit I/O port or by the operand-size
attribute of the instruction for a 16- or 32-bit I/O port.

At the assembly-code level, two forms of this instruction are allowed: the
"explicit-operands" form and the "no-operands" form. The explicit-operands
form (specified with the OUTS mnemonic) is not supported by NASM.

The no-operands form provides "short forms" of the byte, word, and doubleword
versions of the OUTS instructions. Here also DS:(E)SI is assumed to be the
source operand and DX is assumed to be the destination operand. The size of
the I/O port is specified with the choice of mnemonic: OUTSB (byte), OUTSW
(word), or OUTSD (doubleword).

After the byte, word, or doubleword is transferred from the memory location
to the I/O port, the (E)SI register is incremented or decremented
automatically according to the setting of the DF flag in the EFLAGS register.
(If the DF flag is 0, the (E)SI register is incremented; if the DF flag is 1,
the (E)SI register is decremented.) The (E)SI register is incremented or
decremented by 1 for byte operations, by 2 for word operations, or by 4 for
doubleword operations.

The OUTS, OUTSB, OUTSW, and OUTSD instructions can be preceded by the {REP}
prefix for block output of (E)CX bytes, words, or doublewords.


  Flags affected
 

None.


  Instruction size and timings
 

 variations  bytes           186     286     386     486     Pentium
 outsb        1              14       5      14      17      13   NP
 outsw        1              14       5      14      17      13   NP
 outsd        1               -       -      14      17      13   NP

 Protected Mode

 bytes                           386     486     Pentium
  1                           8/28/28 10/32/30 10/27/25 NP

 Cycles for: CPL <= IOPL / CPL > IOPL / V86


  Example
 

 rep outsb      ; Output CX bytes from DS:SI to port DX
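
A fuller sketch of the register setup, assuming 16-bit code with DS already
pointing at the data. 'buffer' is a hypothetical label; the count and port
number are illustrative:

 cld                    ; DF = 0: SI is incremented after each byte
 mov si, buffer         ; DS:SI = first byte to send
 mov cx, 8              ; Number of bytes to send
 mov dx, 0x300          ; Port number
 rep outsb              ; Write CX bytes from DS:SI to port DX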


 { Back to contents screen:hcContents}


.topic PACKSSWB

  PACKSSWB/PACKSSDW - Pack with signed saturation (MMX)
 

  Description
 

Packs and saturates signed words into bytes (PACKSSWB) or signed doublewords
into words (PACKSSDW). The PACKSSWB instruction packs 4 signed words from
the destination operand (first operand) and 4 signed words from the source
operand (second operand) into 8 signed bytes in the destination operand. If
the signed value of a word is beyond the range of a signed byte (that is,
greater than 7FH or less than 80H), the saturated byte value of 7FH or 80H,
respectively, is stored into the destination.

The PACKSSDW instruction packs 2 signed doublewords from the destination
operand (first operand) and 2 signed doublewords from the source operand
(second operand) into 4 signed words in the destination operand. If the
signed value of a doubleword is beyond the range of a signed word (that is,
greater than 7FFFH or less than 8000H), the saturated word value of 7FFFH or
8000H, respectively, is stored into the destination.

The destination operand for either the PACKSSWB or PACKSSDW instruction must
be an MMX register; the source operand may be either an MMX register or a
quadword memory location.
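
An illustrative sketch of the saturation behaviour (not taken from the Intel
manual), using the same register as source and destination so that only one
set of input values is needed:

 ; Assume MM0 holds the words 0005h, 0100h, FF00h, FFFFh (low to high)
 packsswb mm0, mm0      ; Low four bytes of MM0 = 05h, 7Fh, 80h, FFh;
                        ; the high four bytes repeat them (same source)

Here 0100h (256) saturates to 7Fh and FF00h (-256) saturates to 80h, while
0005h and FFFFh (-1) fit in a signed byte and are kept as 05h and FFh.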


  Flags affected
 

None.


  Instruction size and timings
 

Not available.

  Example
 

Not available.


 { Back to contents screen:hcContents}


.topic PACKUSWB

  PACKUSWB - Pack with unsigned saturation (MMX)
 

  Description
 

Packs and saturates 4 signed words from the destination operand (first
operand) and 4 signed words from the source operand (second operand) into 8
unsigned bytes in the destination operand. If the signed value of a word is
beyond the range of an unsigned byte (that is, greater than FFH or less than
00H), the saturated byte value of FFH or 00H, respectively, is stored into
the destination.

The destination operand must be an MMX register; the source operand may be
either an MMX register or a quadword memory location.


  Flags affected
 

None.


  Instruction size and timings
 

Not available.

  Example
 

Not available.


 { Back to contents screen:hcContents}


.topic PADDB

  PADDB/PADDW/PADDD - Packed add (MMX)
 

  Description
 

Adds the individual data elements (bytes, words, or doublewords) of the source
operand (second operand) to the individual data elements of the destination
operand (first operand). If the result of an individual addition exceeds the
range for the specified data type (overflows), the result is wrapped around,
meaning that the result is truncated so that only the lower (least significant)
bits of the result are returned (that is, the carry is ignored).

The destination operand must be an MMX register; the source operand can be
either an MMX register or a quadword memory location.

The PADDB instruction adds the bytes of the source operand to the bytes of
the destination operand and stores the results to the destination operand.
When an individual result is too large to be represented in 8 bits, the lower
8 bits of the result are written to the destination operand and therefore the
result wraps around.

The PADDW instruction adds the words of the source operand to the words of the
destination operand and stores the results to the destination operand. When
an individual result is too large to be represented in 16 bits, the lower 16
bits of the result are written to the destination operand and therefore the
result wraps around.

The PADDD instruction adds the doublewords of the source operand to the doublewords
of the destination operand and stores the results to the destination operand.
When an individual result is too large to be represented in 32 bits, the lower
32 bits of the result are written to the destination operand and therefore the
result wraps around.

Note that like the integer ADD instruction, the PADDB, PADDW, and PADDD
instructions can operate on either unsigned or signed (two's complement
notation) packed integers. Unlike the integer instructions, none of the MMX
instructions affect the EFLAGS register. With MMX instructions, there are no
carry or overflow flags to indicate when overflow has occurred, so the software
must control the range of values or else use the "with saturation" MMX
instructions.
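
The difference between wraparound and saturation can be sketched as follows
(an illustration, not taken from the Intel manual; the register contents are
assumed):

 ; Assume every byte of MM0 is 7Fh and every byte of MM1 is 01h
 movq    mm2, mm0
 paddb   mm2, mm1       ; Each byte = 80h (wrapped; -128 if read as signed)
 movq    mm3, mm0
 paddsb  mm3, mm1       ; Each byte = 7Fh (signed saturation)
 movq    mm4, mm0
 paddusb mm4, mm1       ; Each byte = 80h (128 fits in an unsigned byte)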


  Flags affected
 

None.


  Instruction size and timings
 

Not available.

  Example
 

Not available.


 { Back to contents screen:hcContents}


.topic PADDSB

  PADDSB/PADDSW - Packed add with saturation (MMX)
 

  Description
 

Adds the individual signed data elements (bytes or words) of the source
operand (second operand) to the individual signed data elements of the
destination operand (first operand). If the result of an individual addition
exceeds the range for the specified data type, the result is saturated. The
destination operand must be an MMX register; the source operand can be either
an MMX register or a quadword memory location.

The PADDSB instruction adds the signed bytes of the source operand to the
signed bytes of the destination operand and stores the results to the
destination operand. When an individual result is beyond the range of a
signed byte (that is, greater than 7FH or less than 80H), the saturated byte
value of 7FH or 80H, respectively, is written to the destination operand.

The PADDSW instruction adds the signed words of the source operand to the
signed words of the destination operand and stores the results to the
destination operand. When an individual result is beyond the range of a
signed word (that is, greater than 7FFFH or less than 8000H), the saturated
word value of 7FFFH or 8000H, respectively, is written to the destination
operand.


  Flags affected
 

None.


  Instruction size and timings
 

Not available.

  Example
 

Not available.


 { Back to contents screen:hcContents}


.topic PADDUSB

  PADDUSB/PADDUSW - Packed add with unsigned saturation (MMX)
 

  Description
 

Adds the individual unsigned data elements (bytes or words) of the packed
source operand (second operand) to the individual unsigned data elements of
the packed destination operand (first operand). If the result of an
individual addition exceeds the range for the specified unsigned data type,
the result is saturated. The destination operand must be an MMX register; the
source operand can be either an MMX register or a quadword memory location.

The PADDUSB instruction adds the unsigned bytes of the source operand to the
unsigned bytes of the destination operand and stores the results to the
destination operand. When an individual result is beyond the range of an
unsigned byte (that is, greater than FFH), the saturated unsigned byte value
of FFH is written to the destination operand.

The PADDUSW instruction adds the unsigned words of the source operand to the
unsigned words of the destination operand and stores the results to the
destination operand. When an individual result is beyond the range of an
unsigned word (that is, greater than FFFFH), the saturated unsigned word
value of FFFFH is written to the destination operand.


  Flags affected
 

None.


  Instruction size and timings
 

Not available.

  Example
 

Not available.


 { Back to contents screen:hcContents}

.topic PAND

  PAND - Logical AND (MMX)
 

  Description
 

Performs a bitwise logical AND operation on the quadword source (second) and
destination (first) operands and stores the result in the destination operand
location. The source operand can be an MMX register or a quadword memory
location; the destination operand must be an MMX register. Each bit of the
result of the PAND instruction is set to 1 if the corresponding bits of the
operands are both 1; otherwise it is set to zero.


  Flags affected
 

None.


  Instruction size and timings
 

Not available.

  Example
 

Not available.


 { Back to contents screen:hcContents}


.topic PANDN

  PANDN - Logical AND NOT (MMX)
 

  Description
 

Performs a bitwise logical NOT on the quadword destination operand (first
operand). Then, the instruction performs a bitwise logical AND operation on
the inverted destination operand and the quadword source operand (second
operand). Each bit of the result of the AND operation is set to one if the
corresponding bits of the source and inverted destination bits are one;
otherwise it is set to zero. The result is stored in the destination operand
location.

The source operand can be an MMX register or a quadword memory location; the
destination operand must be an MMX register.
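
Because it is the destination, not the source, that is inverted, a small
model may help. The following Python sketch treats each operand as a 64-bit
integer; the names pandn and MASK64 are assumptions of this sketch.

```python
MASK64 = (1 << 64) - 1


def pandn(dest, src):
    """Model PANDN: invert the destination, then AND it with the source."""
    return (~dest & MASK64) & src


# Bits that are set in dest are cleared in the result.
print(hex(pandn(0x00FF00FF00FF00FF, 0xFFFFFFFFFFFFFFFF)))
```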


  Flags affected
 

None.


  Instruction size and timings
 

Not available.

  Example
 

Not available.


 { Back to contents screen:hcContents}


.topic PCMPEQB

  PCMPEQB/PCMPEQW/PCMPEQD - Packed compare for equal (MMX)
 

  Description
 

Compares the individual data elements (bytes, words, or doublewords) in the
destination operand (first operand) to the corresponding data elements in the
source operand (second operand). If a pair of data elements are equal, the
corresponding data element in the destination operand is set to all ones;
otherwise, it is set to all zeros. The destination operand must be an MMX
register; the source operand may be either an MMX register or a 64-bit
memory location.

The PCMPEQB instruction compares the bytes in the destination operand to the
corresponding bytes in the source operand, with the bytes in the destination
operand being set according to the results.

The PCMPEQW instruction compares the words in the destination operand to the
corresponding words in the source operand, with the words in the destination
operand being set according to the results.

The PCMPEQD instruction compares the doublewords in the destination operand
to the corresponding doublewords in the source operand, with the doublewords
in the destination operand being set according to the results.
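
The all-ones/all-zeros result can be sketched in Python as below. The
function name pcmpeq, its bits parameter and the list representation of the
packed elements are hypothetical and serve only to illustrate the rule.

```python
def pcmpeq(dest, src, bits):
    """Model PCMPEQB/W/D (bits = 8, 16 or 32) on lists of element values."""
    ones = (1 << bits) - 1  # FFH, FFFFH or FFFFFFFFH
    return [ones if d == s else 0 for d, s in zip(dest, src)]


print(pcmpeq([1, 2, 3, 4], [1, 0, 3, 0], 16))
```

The resulting element masks are typically combined with PAND, PANDN and POR
to select values without branching.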


  Flags affected
 

None.


  Instruction size and timings
 

Not available.

  Example
 

Not available.


 { Back to contents screen:hcContents}


.topic PCMPGTB

  PCMPGTB/PCMPGTW/PCMPGTD - Packed compare for greater than (MMX)
 

  Description
 

Compares the individual signed data elements (bytes, words, or doublewords) in
the destination operand (first operand) to the corresponding signed data
elements in the source operand (second operand). If a data element in the
destination operand is greater than its corresponding data element in the
source operand, the data element in the destination operand is set to all
ones; otherwise, it is set to all zeros. The destination operand must be an
MMX register; the source operand may be either an MMX register or a 64-bit
memory location.

The PCMPGTB instruction compares the signed bytes in the destination operand
to the corresponding signed bytes in the source operand, with the bytes in
the destination operand being set according to the results.

The PCMPGTW instruction compares the signed words in the destination operand
to the corresponding signed words in the source operand, with the words in
the destination operand being set according to the results.

The PCMPGTD instruction compares the signed doublewords in the destination
operand to the corresponding signed doublewords in the source operand, with
the doublewords in the destination operand being set according to the results.
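
The comparison is signed, so a bit pattern such as FFH counts as -1 rather
than 255. The Python sketch below models this; pcmpgt and to_signed are
names invented for the illustration, and elements are passed as raw
(unsigned) bit patterns.

```python
def to_signed(value, bits):
    """Interpret a raw bit pattern as a two's complement integer."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def pcmpgt(dest, src, bits):
    """Model PCMPGTB/W/D (bits = 8, 16 or 32) on raw element values."""
    ones = (1 << bits) - 1
    return [ones if to_signed(d, bits) > to_signed(s, bits) else 0
            for d, s in zip(dest, src)]


# 01H > FFH is true here because FFH is -1 when read as a signed byte.
print(pcmpgt([0x01, 0xFF, 0x7F, 0x80], [0xFF, 0x01, 0x7F, 0x7F], 8))
```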


  Flags affected
 

None.


  Instruction size and timings
 

Not available.

  Example
 

Not available.


 { Back to contents screen:hcContents}

.topic PMADDWD

  PMADDWD - Packed multiply and add (MMX)
 

  Description
 

Multiplies the individual signed words of the destination operand by the
corresponding signed words of the source operand, producing four signed,
doubleword results. The two doubleword results from the multiplication of
the high-order words are added together and stored in the upper doubleword
of the destination operand; the two doubleword results from the multiplication
of the low-order words are added together and stored in the lower doubleword
of the destination operand. The destination operand must be an MMX register;
the source operand may be either an MMX register or a 64-bit memory location.

A doubleword result of the PMADDWD instruction wraps around to 80000000H only
when all four words that contribute to it (two from the source operand and
two from the destination operand) are 8000H.
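
A rough Python model of the multiply-and-add step, including the single
overflow case, is given below. It assumes each operand is a list of four raw
16-bit words with index 0 as the lowest word; the names pmaddwd and
to_signed16 are hypothetical.

```python
def to_signed16(word):
    """Interpret a raw 16-bit pattern as a two's complement integer."""
    return word - 0x10000 if word & 0x8000 else word


def pmaddwd(dest, src):
    """Model PMADDWD; returns [low doubleword, high doubleword]."""
    products = [to_signed16(d) * to_signed16(s) for d, s in zip(dest, src)]
    low = (products[0] + products[1]) & 0xFFFFFFFF
    high = (products[2] + products[3]) & 0xFFFFFFFF
    return [low, high]


print(pmaddwd([1, 2, 3, 4], [5, 6, 7, 8]))
```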


  Flags affected
 

None.


  Instruction size and timings
 

Not available.

  Example
 

Not available.


 { Back to contents screen:hcContents}


.topic PMULHW

  PMULHW - Packed multiply high (MMX)
 

  Description
 

Multiplies the four signed words of the source operand (second operand) by
the four signed words of the destination operand (first operand), producing
four signed, doubleword, intermediate results. The high-order word of each
intermediate result is then written to its corresponding word location in
the destination operand. The destination operand must be an MMX register; the
source operand may be either an MMX register or a 64-bit memory location.
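
The selection of the high-order word can be sketched in Python as follows.
Operands are assumed to be lists of raw 16-bit words; pmulhw and to_signed16
are names chosen for this sketch.

```python
def to_signed16(word):
    """Interpret a raw 16-bit pattern as a two's complement integer."""
    return word - 0x10000 if word & 0x8000 else word


def pmulhw(dest, src):
    """Model PMULHW: keep bits 16..31 of each signed 32-bit product."""
    return [((to_signed16(d) * to_signed16(s)) >> 16) & 0xFFFF
            for d, s in zip(dest, src)]


print(pmulhw([0x4000, 0xFFFF, 0x8000], [0x0004, 0x0001, 0x8000]))
```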


  Flags affected
 

None.


  Instruction size and timings
 

Not available.

  Example
 

Not available.


 { Back to contents screen:hcContents}

.topic PMULLW

  PMULLW - Packed multiply low (MMX)
 

  Description
 

Multiplies the four signed or unsigned words of the source operand (second
operand) with the four signed or unsigned words of the destination operand
(first operand), producing four doubleword, intermediate results. The
low-order word of each intermediate result is then written to its
corresponding word location in the destination operand. The destination
operand must be an MMX register; the source operand may be either an MMX
register or a 64-bit memory location.
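
The low-order word of a product is the same whether the inputs are read as
signed or unsigned, which is why one instruction serves both cases. A minimal
Python sketch, with the hypothetical name pmullw and operands given as lists
of raw 16-bit words:

```python
def pmullw(dest, src):
    """Model PMULLW: keep the low 16 bits of each product."""
    return [(d * s) & 0xFFFF for d, s in zip(dest, src)]


print(pmullw([0x0100, 0xFFFF, 3], [0x0100, 0x0002, 5]))
```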


  Flags affected
 

None.


  Instruction size and timings
 

Not available.

  Example
 

Not available.


 { Back to contents screen:hcContents}


.topic POP

  POP - Pop a value from the stack
 

  Description
 

Loads the value from the top of the stack to the location specified with the
destination operand and then increments the stack pointer. The destination
operand can be a general-purpose register, memory location, or segment
register.

The address-size attribute of the stack segment determines the stack pointer
size (16 bits or 32 bits - the source address size), and the operand-size
attribute of the current code segment determines the amount the stack pointer
is incremented (2 bytes or 4 bytes). For example, if these address- and
operand-size attributes are 32, the 32-bit ESP register (stack pointer) is
incremented by 4 and, if they are 16, the 16-bit SP register is incremented
by 2. (The B flag in the stack segment's segment descriptor determines the
stack's address-size attribute, and the D flag in the current code segment's
segment descriptor, along with prefixes, determines the operand-size
attribute and also the address-size attribute of the destination operand.)
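
The load-then-increment behavior can be modeled in Python as shown below.
This is a simplified sketch: memory is a flat bytearray, segmentation is
ignored, and the names pop, memory, sp and operand_size are assumptions.

```python
def pop(memory, sp, operand_size):
    """Return (value, new stack pointer) for a 16- or 32-bit POP."""
    nbytes = operand_size // 8
    # Read the little-endian value at the top of the stack.
    value = int.from_bytes(memory[sp:sp + nbytes], "little")
    # The stack pointer then moves up by the operand size in bytes.
    return value, sp + nbytes


memory = bytearray(16)
memory[8:12] = (0x12345678).to_bytes(4, "little")
print(pop(memory, 8, 32))
```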

If the destination operand is one of the segment registers DS, ES, FS, GS,
or SS, the value loaded into the register must be a valid segment selector.
In protected mode, popping a segment selector into a segment register
automatically causes the descriptor information associated with that segment
selector to be loaded into the hidden (shadow) part of the segment register
and causes the selector and the descriptor information to be validated.

A null value (0000-0003) may be popped into the DS, ES, FS, or GS register
without causing a general protection fault. However, any subsequent attempt
to reference a segment whose corresponding segment register is loaded with a
null value causes a general protection exception. In this situation, no
memory reference occurs and the saved value of the segment register is null.

The POP instruction cannot pop a value into the CS register. To load the CS
register from the stack, use the {RET} instruction.

If the ESP register is used as a base register for addressing a destination
operand in memory, the POP instruction computes the effective address of the
operand after it increments the ESP register.

The POP ESP instruction increments the stack pointer (ESP) before data at the
old top of stack is written into the destination.

A POP SS instruction inhibits all interrupts, including the NMI interrupt,
until after execution of the next instruction. This action allows sequential
execution of POP SS and MOV ESP, EBP instructions without the danger of
having an invalid stack during an interrupt. However, use of the {LSS:LDS}
instruction is the preferred method of loading the SS and ESP registers.


  Flags affected
 

None.


  Instruction size and timings
 

 operand     bytes   8088    186     286     386     486     Pentium
 reg          1      12      10       5       4       1       1   UV
 mem       2+d(0-2)  25+EA   20       5       5       6       3   NP
 seg          1      12       8       5       7       3       3   NP
 FS/GS        2       -       -       -       7       3       3   NP

 Protected Mode

 operand     bytes                   286     386     486     Pentium
 CS/DS/ES     1                      20      21       9     3-12  NP
 SS           1                      20      21       9     8-17  NP
 FS/GS        2                       -      21       9     3-12  NP


  Example
 

 pop eax        ; Pop dword off stack into EAX


 { Back to contents screen:hcContents}


.topic POPA

  POPA/POPAD - Pop all general purpose registers (186+/386+)
 

  Description
 

Pops doublewords (POPAD) or words (POPA) from the stack into the general-purpose
registers. The registers are loaded in the following order: EDI, ESI, EBP,
EBX, EDX, ECX, and EAX (if the operand-size attribute is 32) and DI, SI, BP,
BX, DX, CX, and AX (if the operand-size attribute is 16). (These instructions
reverse the operation of the PUSHA/PUSHAD instructions.) The value on the
stack for the ESP or SP register is ignored. Instead, the ESP or SP register
is incremented after each register is loaded.
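
The pop order, including the discarded stack-pointer slot, can be sketched in
Python as below for the 32-bit form. The names popad and POPAD_SLOTS and the
list and dictionary representation are hypothetical.

```python
# Slots in the order they are read from the top of the stack.
# None marks the saved ESP image, which is skipped rather than loaded.
POPAD_SLOTS = ["EDI", "ESI", "EBP", None, "EBX", "EDX", "ECX", "EAX"]


def popad(stack_dwords, esp):
    """Model POPAD: return the register values after the instruction."""
    regs = {}
    for name, value in zip(POPAD_SLOTS, stack_dwords):
        if name is not None:
            regs[name] = value
        esp += 4  # ESP advances past every slot, including the skipped one
    regs["ESP"] = esp
    return regs


print(popad([7, 6, 5, 0xDEAD, 3, 2, 1, 0], 0x1000))
```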

The POPA (pop all) and POPAD (pop all double) mnemonics reference the same
opcode. The POPA instruction is intended for use when the operand-size
attribute is 16 and the POPAD instruction for when the operand-size attribute
is 32. Some assemblers may force the operand size to 16 when POPA is used and
to 32 when POPAD is used (using the operand-size override prefix [66H] if
necessary). Others may treat these mnemonics as synonyms (POPA/POPAD) and
use the current setting of the operand-size attribute to determine the size
of values to be popped from the stack, regardless of the mnemonic used.
(The D flag in the current code segment's segment descriptor determines the
operand-size attribute.)


  Flags affected
 

None.


  Instruction size and timings
 

 variations  bytes           186     286     386     486     Pentium
 popa         1              51      19      24       9       5   NP
 popad        1               -       -      24       9       5   NP

  Example
 

 popa           ; Pop all registers


 { Back to contents screen:hcContents}



.topic POPF

  POPF/POPFD - Pop flags/pop flags double (8086+/386+)
 

  Description
 

Pops a doubleword (POPFD) from the top of the stack (if the current
operand-size attribute is 32) and stores the value in the EFLAGS register or
pops a word from the top of the stack (if the operand-size attribute is 16)
and stores it in the lower 16 bits of the EFLAGS register (that is, the
FLAGS register). (These instructions reverse the operation of the {PUSHF/PUSHFD:PUSHF}
instructions.)

The POPF (pop flags) and POPFD (pop flags double) mnemonics reference the
same opcode. The POPF instruction is intended for use when the operand-size
attribute is 16 and the POPFD instruction for when the operand-size attribute
is 32. Some assemblers may force the operand size to 16 when POPF is used and
to 32 when POPFD is used. Others may treat these mnemonics as synonyms
(POPF/POPFD) and use the current setting of the operand-size attribute to
determine the size of values to be popped from the stack, regardless of the
mnemonic used.

The effect of the POPF/POPFD instructions on the EFLAGS register changes
slightly, depending on the mode of operation of the processor. When the
processor is operating in protected mode at privilege level 0 (or in real-address
mode, which is equivalent to privilege level 0), all the non-reserved flags
in the EFLAGS register except the VIP, VIF, and VM flags can be modified.
The VIP and VIF flags are cleared, and the VM flag is unaffected.

When operating in protected mode, with a privilege level greater than 0, but
less than or equal to IOPL, all the flags can be modified except the IOPL
field and the VIP, VIF, and VM flags. Here, the IOPL flags are unaffected,
the VIP and VIF flags are cleared, and the VM flag is unaffected. The
interrupt flag (IF) is altered only when executing at a level at least as
privileged as the IOPL. If a POPF/POPFD instruction is executed with
insufficient privilege, an exception does not occur, but the privileged bits
do not change.

When operating in virtual-8086 mode, the I/O privilege level (IOPL) must be
equal to 3 to use POPF/POPFD instructions and the VM, RF, IOPL, VIP, and VIF
flags are unaffected. If the IOPL is less than 3, the POPF/POPFD instructions
cause a general-protection exception.


  Flags affected
 

All flags except the reserved bits and the VM bit.


  Instruction size and timings
 

 variations  bytes   8088    186     286     386     486     Pentium
 popf         1      12       8       5       5       9       6   NP
 popfd        1       -       -       -       5       9       6   NP

 Protected Mode

             bytes                   286     386     486     Pentium
 popf         1                       5       5       6       4   NP
 popfd        1                       -       5       6       4   NP


  Example
 

 popfd          ; Pop EFLAGS off the stack


 { Back to contents screen:hcContents}

.topic POR

  POR - Bitwise logical OR (MMX)
 

  Description
 

Performs a bitwise logical OR operation on the quadword source (second) and
destination (first) operands and stores the result in the destination operand
location. The source operand can be an MMX register or a quadword memory
location; the destination operand must be an MMX register. Each bit of the
result is made 0 if the corresponding bits of both operands are 0; otherwise
the bit is set to 1.


  Flags affected
 

None.


  Instruction size and timings
 

Not available.

  Example
 

Not available.


 { Back to contents screen:hcContents}


.topic PSLLW

  PSLLW/PSLLD/PSLLQ - Packed shift left logical (MMX)
 

  Description
 

Shifts the bits in the data elements (words, doublewords, or quadword) in the
destination operand (first operand) to the left by the number of bits
specified in the unsigned count operand (second operand). The result of the
shift operation is written to the destination operand. As the bits in the
data elements are shifted left, the empty low-order bits are cleared (set to
zero). If the value specified by the count operand is greater than 15 (for
words), 31 (for doublewords), or 63 (for a quadword), then the destination
operand is set to all zeros.

The destination operand must be an MMX register; the count operand can be
either an MMX register, a 64-bit memory location, or an 8-bit immediate.

The PSLLW instruction shifts each of the four words of the destination
operand to the left by the number of bits specified in the count operand;
the PSLLD instruction shifts each of the two doublewords of the destination
operand; and the PSLLQ instruction shifts the 64-bit quadword in the
destination operand. As the individual data elements are shifted left, the
empty low-order bit positions are filled with zeros.
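
A short Python model of the left shift, including the out-of-range count
rule, is sketched below. The name psll, its bits parameter and the list
representation are assumptions made for illustration.

```python
def psll(elements, count, bits):
    """Model PSLLW/PSLLD/PSLLQ (bits = 16, 32 or 64)."""
    if count > bits - 1:
        return [0] * len(elements)  # out-of-range count clears everything
    mask = (1 << bits) - 1
    # Bits shifted out of the top of each element are discarded.
    return [(e << count) & mask for e in elements]


print([hex(w) for w in psll([0x0001, 0x8001, 0x00FF, 0x1234], 4, 16)])
```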


  Flags affected
 

None.


  Instruction size and timings
 

Not available.

  Example
 

Not available.


 { Back to contents screen:hcContents}


.topic PSRAW

  PSRAW/PSRAD - Packed shift right arithmetic (MMX)
 

  Description
 

Shifts the bits in the data elements (words or doublewords) in the
destination operand (first operand) to the right by the number of bits
specified in the unsigned count operand (second operand). The result of the
shift operation is written to the destination operand. The empty high-order
bits of each element are filled with the initial value of the sign bit of the
data element. If the value specified by the count operand is greater than 15
(for words) or 31 (for doublewords), each destination data element is filled
with the initial value of the sign bit of the element.

The destination operand must be an MMX register; the count operand (source
operand) can be either an MMX register, a 64-bit memory location, or an 8-bit
immediate.

The PSRAW instruction shifts each of the four words in the destination
operand to the right by the number of bits specified in the count operand;
the PSRAD instruction shifts each of the two doublewords in the destination
operand. As the individual data elements are shifted right, the empty
high-order bit positions are filled with the sign value.
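
The sign-fill behavior can be sketched in Python as follows. Elements are
passed as raw bit patterns; the name psra and its parameters are
hypothetical.

```python
def psra(elements, count, bits):
    """Model PSRAW (bits=16) or PSRAD (bits=32) on raw element values."""
    mask = (1 << bits) - 1
    count = min(count, bits)  # larger counts leave only copies of the sign
    result = []
    for e in elements:
        signed = e - (1 << bits) if e & (1 << (bits - 1)) else e
        # Python's >> on a negative integer is an arithmetic shift.
        result.append((signed >> count) & mask)
    return result


print([hex(w) for w in psra([0x8000, 0x7FFF, 0xFFF0, 0x0010], 4, 16)])
```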


  Flags affected
 

None.


  Instruction size and timings
 

Not available.

  Example
 

Not available.


 { Back to contents screen:hcContents}

.topic PSRLW

  PSRLW/PSRLD/PSRLQ - Packed shift right logical (MMX)
 

  Description
 

Shifts the bits in the data elements (words, doublewords, or quadword) in the
destination operand (first operand) to the right by the number of bits
specified in the unsigned count operand (second operand). The result of the
shift operation is written to the destination operand. As the bits in the
data elements are shifted right, the empty high-order bits are cleared (set
to zero). If the value specified by the count operand is greater than 15
(for words), 31 (for doublewords), or 63 (for a quadword), then the
destination operand is set to all zeros.

The destination operand must be an MMX register; the count operand can be
either an MMX register, a 64-bit memory location, or an 8-bit immediate.

The PSRLW instruction shifts each of the four words of the destination
operand to the right by the number of bits specified in the count operand;
the PSRLD instruction shifts each of the two doublewords of the destination
operand; and the PSRLQ instruction shifts the 64-bit quadword in the
destination operand. As the individual data elements are shifted right, the
empty high-order bit positions are filled with zeros.
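
For comparison with the arithmetic shift, a logical right shift can be
modeled in Python as below. The name psrl and the list representation are
assumptions of this sketch.

```python
def psrl(elements, count, bits):
    """Model PSRLW/PSRLD/PSRLQ (bits = 16, 32 or 64) on raw values."""
    if count > bits - 1:
        return [0] * len(elements)  # out-of-range count clears everything
    # Raw values are non-negative, so >> fills the top bits with zeros.
    return [e >> count for e in elements]


print([hex(w) for w in psrl([0x8000, 0x00FF], 4, 16)])
```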


  Flags affected
 

None.


  Instruction size and timings
 

Not available.

  Example
 

Not available.


 { Back to contents screen:hcContents}


.topic PSUBB

  PSUBB/PSUBW/PSUBD - Packed subtract (MMX)
 

  Description
 

Subtracts the individual data elements (bytes, words, or doublewords) of the
source operand (second operand) from the individual data elements of the
destination operand (first operand). If the result of a subtraction exceeds
the range for the specified data type (overflows), the result is wrapped
around, meaning that the result is truncated so that only the lower (least
significant) bits of the result are returned (that is, the carry is ignored).

The destination operand must be an MMX register; the source operand can be
either an MMX register or a quadword memory location.

The PSUBB instruction subtracts the bytes of the source operand from the
bytes of the destination operand and stores the results to the destination
operand. When an individual result is too large to be represented in 8 bits,
the lower 8 bits of the result are written to the destination operand and
therefore the result wraps around.

The PSUBW instruction subtracts the words of the source operand from the
words of the destination operand and stores the results to the destination
operand. When an individual result is too large to be represented in 16 bits,
the lower 16 bits of the result are written to the destination operand and
therefore the result wraps around.

The PSUBD instruction subtracts the doublewords of the source operand from
the doublewords of the destination operand and stores the results to the
destination operand. When an individual result is too large to be represented
in 32 bits, the lower 32 bits of the result are written to the destination
operand and therefore the result wraps around.

Note that like the integer SUB instruction, the PSUBB, PSUBW, and PSUBD
instructions can operate on either unsigned or signed (two's complement
notation) packed integers. Unlike the integer instructions, none of the MMX
instructions affect the EFLAGS register. With MMX instructions, there are no
carry or overflow flags to indicate when overflow has occurred, so the
software must control the range of values or else use the "with saturation"
MMX instructions.
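
The wraparound behavior amounts to arithmetic modulo 2^bits, which a short
Python sketch can show. The name psub, its bits parameter and the list
representation of the packed elements are hypothetical.

```python
def psub(dest, src, bits):
    """Model PSUBB/PSUBW/PSUBD (bits = 8, 16 or 32) on raw element values."""
    mask = (1 << bits) - 1
    # Keeping only the low bits discards the borrow, so results wrap around.
    return [(d - s) & mask for d, s in zip(dest, src)]


print([hex(b) for b in psub([0x00, 0x80, 0x10], [0x01, 0x01, 0x10], 8)])
```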


  Flags affected
 

None.


  Instruction size and timings
 

Not available.

  Example
 

Not available.


 { Back to contents screen:hcContents}


.topic PSUBSB

  PSUBSB/PSUBSW - Packed subtract with saturation (MMX)
 

  Description
 

Subtracts the individual signed data elements (bytes or words) of the source
operand (second operand) from the individual signed data elements of the
destination operand (first operand). If the result of a subtraction exceeds
the range for the specified data type, the result is saturated. The
destination operand must be an MMX register; the source operand can be either
an MMX register or a quadword memory location.

The PSUBSB instruction subtracts the signed bytes of the source operand from
the signed bytes of the destination operand and stores the results to the
destination operand. When an individual result is beyond the range of a
signed byte (that is, greater than 7FH or less than 80H), the saturated byte
value of 7FH or 80H, respectively, is written to the destination operand.

The PSUBSW instruction subtracts the signed words of the source operand from
the signed words of the destination operand and stores the results to the
destination operand. When an individual result is beyond the range of a
signed word (that is, greater than 7FFFH or less than 8000H), the saturated
word value of 7FFFH or 8000H, respectively, is written to the destination
operand.
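
Signed saturation on subtraction can be sketched in Python as follows, with
elements given as signed Python integers. The name psubs and its bits
parameter are assumptions made for this illustration.

```python
def psubs(dest, src, bits):
    """Model PSUBSB (bits=8) or PSUBSW (bits=16) on signed elements."""
    low = -(1 << (bits - 1))       # 80H or 8000H as a signed value
    high = (1 << (bits - 1)) - 1   # 7FH or 7FFFH
    return [max(low, min(high, d - s)) for d, s in zip(dest, src)]


print(psubs([-128, 127, 10], [1, -1, 3], 8))
```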


  Flags affected
 

None.


  Instruction size and timings
 

Not available.

  Example
 

Not available.


 { Back to contents screen:hcContents}


.topic PSUBUSB

  PSUBUSB/PSUBUSW - Packed subtract with unsigned saturation (MMX)
 

  Description
 

Subtracts the individual unsigned data elements (bytes or words) of the
source operand (second operand) from the individual unsigned data elements
of the destination operand (first operand). If the result of an individual
subtraction exceeds the range for the specified unsigned data type, the
result is saturated. The destination operand must be an MMX register; the
source operand can be either an MMX register or a quadword memory location.

The PSUBUSB instruction subtracts the unsigned bytes of the source operand
from the unsigned bytes of the destination operand and stores the results to
the destination operand. When an individual result is less than zero
(a negative value), the saturated unsigned byte value of 00H is written to
the destination operand.

The PSUBUSW instruction subtracts the unsigned words of the source operand
from the unsigned words of the destination operand and stores the results to
the destination operand. When an individual result is less than zero (a
negative value), the saturated unsigned word value of 0000H is written to
the destination operand.
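
Unsigned saturation on subtraction only ever clamps at zero, as the following
Python sketch illustrates. The name psubus and the list representation are
hypothetical.

```python
def psubus(dest, src):
    """Model PSUBUSB/PSUBUSW on lists of unsigned elements."""
    # A negative difference is replaced by zero.
    return [max(0, d - s) for d, s in zip(dest, src)]


print(psubus([5, 200, 0], [10, 100, 0]))
```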


  Flags affected
 

None.


  Instruction size and timings
 

Not available.

  Example
 

Not available.


 { Back to contents screen:hcContents}


.topic PUNPCKHBW

  PUNPCKHBW/PUNPCKHWD/PUNPCKHDQ - Unpack high packed data (MMX)
 

  Description
 

Unpacks and interleaves the high-order data elements (bytes, words, or
doublewords) of the destination operand (first operand) and source operand
(second operand) into the destination operand. The low-order data elements
are ignored. The destination operand must be an MMX register; the source
operand may be either an MMX register or a 64-bit memory location. When the
source data comes from a memory operand, the full 64-bit operand is accessed
from memory, but the instruction uses only the high-order 32 bits.

The PUNPCKHBW instruction interleaves the four high-order bytes of the source
operand and the four high-order bytes of the destination operand and writes
them to the destination operand.

The PUNPCKHWD instruction interleaves the two high-order words of the source
operand and the two high-order words of the destination operand and writes
them to the destination operand.

The PUNPCKHDQ instruction interleaves the high-order doubleword of the source
operand and the high-order doubleword of the destination operand and writes
them to the destination operand.

If the source operand is all zeros, the result (stored in the destination
operand) contains zero extensions of the high-order data elements from the
original value in the destination operand. With the PUNPCKHBW instruction the
high-order bytes are zero extended (that is, unpacked into unsigned words),
and with the PUNPCKHWD instruction, the high-order words are zero extended
(unpacked into unsigned doublewords).
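
The interleaving order can be sketched in Python as below. Operands are
assumed to be lists of elements with index 0 as the least significant
element; the name punpckh is invented for the sketch.

```python
def punpckh(dest, src):
    """Model PUNPCKHBW/WD/DQ on lists of 8, 4 or 2 elements."""
    half = len(dest) // 2
    result = []
    # Each destination element lands below the matching source element.
    for d, s in zip(dest[half:], src[half:]):
        result += [d, s]
    return result


print(punpckh([0, 1, 2, 3, 4, 5, 6, 7], [10, 11, 12, 13, 14, 15, 16, 17]))
```

With an all-zero source, each pair in the result is a destination element
followed by a zero, which is the zero extension described above.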


  Flags affected
 

None.


  Instruction size and timings
 

Not available.

  Example
 

Not available.


 { Back to contents screen:hcContents}


.topic PUNPCKLBW

  PUNPCKLBW/PUNPCKLWD/PUNPCKLDQ - Unpack low packed data (MMX)
 

  Description
 

Unpacks and interleaves the low-order data elements (bytes, words, or
doublewords) of the destination and source operands into the destination
operand. The destination operand must be an MMX register; the source operand
may be either an MMX register or a memory location. When the source data
comes from an MMX register, the upper 32 bits of the register are ignored.
When the source data comes from memory, only 32 bits are accessed from
memory.

The PUNPCKLBW instruction interleaves the four low-order bytes of the source
operand and the four low-order bytes of the destination operand and writes
them to the destination operand.

The PUNPCKLWD instruction interleaves the two low-order words of the source
operand and the two low-order words of the destination operand and writes
them to the destination operand.

The PUNPCKLDQ instruction interleaves the low-order doubleword of the source
operand and the low-order doubleword of the destination operand and writes
them to the destination operand.

If the source operand is all zeros, the result (stored in the destination
operand) contains zero extensions of the low-order data elements from the
original value in the destination operand. With the PUNPCKLBW instruction the
low-order bytes are zero extended (that is, unpacked into unsigned words),
and with the PUNPCKLWD instruction, the low-order words are zero extended
(unpacked into unsigned doublewords).
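
A Python sketch of the low-order interleave follows. Operands are assumed to
be lists of elements with index 0 as the least significant element; the name
punpckl is hypothetical.

```python
def punpckl(dest, src):
    """Model PUNPCKLBW/WD/DQ on lists of 8, 4 or 2 elements."""
    half = len(dest) // 2
    result = []
    # Each destination element lands below the matching source element.
    for d, s in zip(dest[:half], src[:half]):
        result += [d, s]
    return result


print(punpckl([0, 1, 2, 3, 4, 5, 6, 7], [10, 11, 12, 13, 14, 15, 16, 17]))
```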


  Flags affected
 

None.


  Instruction size and timings
 

Not available.

  Example
 

Not available.


 { Back to contents screen:hcContents}


.topic PUSH

  PUSH - Push word/dword onto the stack
 

  Description
 

Decrements the stack pointer and then stores the source operand on the top of
the stack. The address-size attribute of the stack segment determines the
stack pointer size (16 bits or 32 bits), and the operand-size attribute of
the current code segment determines the amount the stack pointer is
decremented (2 bytes or 4 bytes). For example, if these address- and
operand-size attributes are 32, the 32-bit ESP register (stack pointer) is
decremented by 4 and, if they are 16, the 16-bit SP register is decremented
by 2. (The B flag in the stack segment's segment descriptor determines the
stack's address-size attribute, and the D flag in the current code segment's
segment descriptor, along with prefixes, determines the operand-size
attribute and also the address-size attribute of the source operand.)

Pushing a 16-bit operand when the stack address-size attribute is 32 can
result in a misaligned stack pointer (that is, the stack pointer is not
aligned on a doubleword boundary).
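
The decrement-then-store sequence, and the misalignment just described, can
be modeled in Python as below. This is a simplified sketch: memory is a flat
bytearray, segmentation is ignored, and the names push, memory and sp are
assumptions.

```python
def push(memory, sp, value, operand_size):
    """Store a 16- or 32-bit value and return the new stack pointer."""
    nbytes = operand_size // 8
    sp -= nbytes  # the stack pointer is decremented first
    truncated = value & ((1 << operand_size) - 1)
    memory[sp:sp + nbytes] = truncated.to_bytes(nbytes, "little")
    return sp


memory = bytearray(16)
sp = push(memory, 16, 0x11223344, 32)  # sp is still a multiple of 4
sp = push(memory, sp, 0xABCD, 16)      # sp is no longer a multiple of 4
print(sp, sp % 4)
```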

The PUSH ESP instruction pushes the value of the ESP register as it existed
before the instruction was executed. Thus, if a PUSH instruction uses a
memory operand in which the ESP register is used as a base register for
computing the operand address, the effective address of the operand is
computed before the ESP register is decremented.

In the real-address mode, if the ESP or SP register is 1 when the PUSH
instruction is executed, the processor shuts down due to a lack of stack
space. No exception is generated to indicate this condition.

For processors from the 286 on, the PUSH ESP instruction pushes the value of
the ESP register as it existed before the instruction was executed. (This is
also true in the real-address and virtual-8086 modes.) For the 8086 processor,
the PUSH SP instruction pushes the new value of the SP register (that is, the
value after it has been decremented by 2).


  Flags affected
 

None.


  Instruction size and timings
 

 operand    bytes   8088    186     286     386     486     Pentium
 reg         1      15      10       3       2       1       1   UV
 mem      2+d(0-2)  24+EA   16       5       5       4       2   NP
 seg         1      14       9       3       2       3       1   NP
 imm     1+i(1,2)    -       -       3       2       1       1   NP
 FS/GS       2       -       -       -       2       3       1   NP


  Example
 

 push eax       ; Push EAX onto the stack


 { Back to contents screen:hcContents}


.topic PUSHA

  PUSHA/PUSHAD - Push all general purpose registers (186+/386+)
 

  Description
 

Pushes the contents of the general-purpose registers onto the stack. The
registers are stored on the stack in the following order: EAX, ECX, EDX, EBX,
ESP (original value), EBP, ESI, and EDI (if the current operand-size
attribute is 32) and AX, CX, DX, BX, SP (original value), BP, SI, and DI (if
the operand-size attribute is 16). (These instructions perform the reverse
operation of the {POPA/POPAD:POPA} instructions.) The value pushed for the
ESP or SP register is its value prior to pushing the first register.
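
The point that the original stack pointer is the value stored can be sketched
in Python for the 32-bit form. The names pushad and PUSHAD_ORDER and the
dictionary of register values are hypothetical.

```python
# Push order for the 32-bit form; ESP is pushed fifth.
PUSHAD_ORDER = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def pushad(regs):
    """Model PUSHAD: return (values in push order, new ESP)."""
    original_esp = regs["ESP"]  # captured before anything is pushed
    pushed = [original_esp if name == "ESP" else regs[name]
              for name in PUSHAD_ORDER]
    return pushed, original_esp - 4 * len(PUSHAD_ORDER)


regs = {name: index for index, name in enumerate(PUSHAD_ORDER)}
regs["ESP"] = 0x1000
print(pushad(regs))
```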

The PUSHA (push all) and PUSHAD (push all double) mnemonics reference the
same opcode. The PUSHA instruction is intended for use when the operand-size
attribute is 16 and the PUSHAD instruction for when the operand-size
attribute is 32. Some assemblers may force the operand size to 16 when PUSHA
is used and to 32 when PUSHAD is used. Others may treat these mnemonics as
synonyms (PUSHA/PUSHAD) and use the current setting of the operand-size
attribute to determine the size of values to be pushed onto the stack,
regardless of the mnemonic used.

In the real-address mode, if the ESP or SP register is 1, 3, or 5 when the
PUSHA/PUSHAD instruction is executed, the processor shuts down due to a lack
of stack space. No exception is generated to indicate this condition.


  Flags affected
 

None.


  Instruction size and timings
 

 variations  bytes           186     286     386     486     Pentium
 pusha        1              36      17      18      11       5   NP
 pushad       1               -       -      18      11       5   NP


  Example
 

 pusha          ; Push all general-purpose registers onto the stack


 { Back to contents screen:hcContents}


.topic PUSHF

  PUSHF/PUSHFD - Push flags register onto the stack
 

  Description
 

Decrements the stack pointer by 4 (if the current operand-size attribute is
32) and pushes the entire contents of the EFLAGS register onto the stack, or
decrements the stack pointer by 2 (if the operand-size attribute is 16) and
pushes the lower 16 bits of the EFLAGS register (that is, the FLAGS register)
onto the stack. (These instructions reverse the operation of the POPF/POPFD
instructions.) When copying the entire EFLAGS register to the stack, the RF
and VM flags (bits 16 and 17) are not copied; instead, the values for these
flags are cleared in the EFLAGS image stored on the stack.
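
As a rough C model of the stored image (the function and macro names are
invented for this sketch, and only the two flags named above are cleared):

```c
#include <stdint.h>

#define EFLAGS_RF (1u << 16)  /* resume flag */
#define EFLAGS_VM (1u << 17)  /* virtual-8086 mode flag */

/* Image stored by PUSHFD: EFLAGS with RF and VM cleared. */
uint32_t pushfd_image(uint32_t eflags)
{
    return eflags & ~(EFLAGS_RF | EFLAGS_VM);
}

/* Image stored by PUSHF: the low 16 bits (the FLAGS register). */
uint16_t pushf_image(uint32_t eflags)
{
    return (uint16_t)(eflags & 0xFFFFu);
}
```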

The PUSHF (push flags) and PUSHFD (push flags double) mnemonics reference the
same opcode. The PUSHF instruction is intended for use when the operand-size
attribute is 16 and the PUSHFD instruction for when the operand-size
attribute is 32. Some assemblers may force the operand size to 16 when PUSHF
is used and to 32 when PUSHFD is used. Others may treat these mnemonics as
synonyms (PUSHF/PUSHFD) and use the current setting of the operand-size
attribute to determine the size of values to be pushed onto the stack,
regardless of the mnemonic used.

When in virtual-8086 mode and the I/O privilege level (IOPL) is less than 3,
the PUSHF/PUSHFD instruction causes a general protection exception.

In the real-address mode, if the ESP or SP register is 1 when the
PUSHF/PUSHFD instruction is executed, the processor shuts down due to a lack
of stack space. No exception is generated to indicate this condition.


  Flags affected
 

None.


  Instruction size and timings
 

 variations  bytes   8088    186     286     386     486     Pentium
 pushf        1      14       9       3       4       4       9   NP
 pushfd       1       -       -       -       4       4       9   NP

 Protected Mode

             bytes                   286     386     486     Pentium
 pushf        1                       3       4       3       3   NP
 pushfd       1                       -       4       3       3   NP


  Example
 

 pushf          ; Push EFLAGS onto the stack


 { Back to contents screen:hcContents}


.topic PXOR

  PXOR - Logical exclusive OR (MMX)
 

  Description
 

Performs a bitwise logical exclusive-OR (XOR) operation on the quadword
source (second) and destination (first) operands and stores the result in the
destination operand location. The source operand can be an MMX register or a
quadword memory location; the destination operand must be an MMX register.
Each bit of the result is 1 if the corresponding bits of the two operands are
different; each bit is 0 if the corresponding bits of the operands are the
same.


  Flags affected
 

None.


  Instruction size and timings
 

Not available.


  Example
 

Not available.


 { Back to contents screen:hcContents}


.topic RCL

  RCL/RCR/ROL/ROR - Rotate
 

  Description
 

Shifts (rotates) the bits of the first operand (destination operand) the
number of bit positions specified in the second operand (count operand) and
stores the result in the destination operand. The destination operand can be
a register or a memory location; the count operand is an unsigned integer
that can be an immediate or a value in the CL register. The processor
restricts the count to a number between 0 and 31 by masking all the bits in
the count operand except the 5 least-significant bits.

The rotate left (ROL) and rotate through carry left (RCL) instructions shift
all the bits toward more-significant bit positions, except for the
most-significant bit, which is rotated to the least-significant bit location.

  ROL operation

        +---+       +---------------+
        | C |<--+---| 7 <-------- 0 |<--+
        +---+   |   +---------------+   |
                +-----------------------+

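  RCL operation

            +---+     +---------------+
        +---| C |<----| 7 <-------- 0 |<--+
        |   +---+     +---------------+   |
        +---------------------------------+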

The rotate right (ROR) and rotate through carry right (RCR) instructions
shift all the bits toward less significant bit positions, except for the
least-significant bit, which is rotated to the most-significant bit location.

  ROR operation

            +---------------+       +---+
        +-->| 7 --------> 0 |---+-->| C |
        |   +---------------+   |   +---+
        +-----------------------+


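  RCR operation

            +---------------+     +---+
        +-->| 7 --------> 0 |---->| C |---+
        |   +---------------+     +---+   |
        +---------------------------------+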

The RCL and RCR instructions include the CF flag in the rotation. The RCL
instruction shifts the CF flag into the least-significant bit and shifts the
most-significant bit into the CF flag. The RCR instruction shifts the CF flag
into the most-significant bit and shifts the least-significant bit into the
CF flag. For the ROL and ROR instructions, the original value of the CF flag
is not a part of the result, but the CF flag receives a copy of the bit that
was shifted from one end to the other.
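
The difference between the plain and through-carry forms can be sketched in
C for a single-bit rotate of an 8-bit operand. The function names are
illustrative only, and CF is modelled as an int passed by pointer.

```c
#include <stdint.h>

/* ROL by one bit: bit 7 wraps to bit 0 and is also copied to CF. */
uint8_t rol8_1(uint8_t value, int *cf)
{
    int msb = (value >> 7) & 1;
    *cf = msb;
    return (uint8_t)((value << 1) | msb);
}

/* RCL by one bit: the old CF enters bit 0; bit 7 moves into CF. */
uint8_t rcl8_1(uint8_t value, int *cf)
{
    int msb = (value >> 7) & 1;
    uint8_t result = (uint8_t)((value << 1) | (*cf & 1));
    *cf = msb;
    return result;
}
```

With CF clear, rotating 81h gives 03h under ROL but 02h under RCL; in both
cases CF ends up set.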

The OF flag is defined only for the 1-bit rotates; it is undefined in all
other cases (except that a zero-bit rotate does nothing; that is, it affects
no flags). For left rotates, the OF flag is set to the exclusive OR of the CF
bit (after the rotate) and the most-significant bit of the result. For right
rotates, the OF flag is set to the exclusive OR of the two most-significant
bits of the result.

The 8086 does not mask the rotation count. However, all other processors
(starting with the 286 processor) do mask the rotation count to 5 bits,
resulting in a maximum count of 31. This masking is done in all operating
modes (including the virtual-8086 mode) to reduce the maximum execution time
of the instructions.


  Flags affected
 

The CF flag contains the value of the bit shifted into it. The OF flag is
affected only for single-bit rotates (see above); it is undefined for
multi-bit rotates. The SF, ZF, AF, and PF flags are not affected.


  Instruction size and timings
 

  RCL and RCR

 operands    bytes   8088    186     286     386     486     Pentium
 reg, 1       2       2       2       2       9       3       1   PU
 mem, 1    2+d(0,2)  23+EA   15       7      10       4       3   PU
 reg, cl      2       8+4n    5+n    5+n      9      8-30    7-24 NP
 mem, cl   2+d(0,2) 28+EA+4n 17+n    8+n     10      9-31    9-26 NP
 reg, imm     3       -       5+n    5+n      9      8-30    8-25 NP
 mem, imm  3+d(0,2)   -      17+n    8+n     10      9-31   10-27 NP

  ROL and ROR

 operands    bytes   8088    186     286     386     486     Pentium
 reg, 1       2       2       2       2       3       3       1   PU
 mem, 1    2+d(0,2)  23+EA   15       7       7       4       3   PU
 reg, cl      2       8+4n    5+n    5+n      3       3       4   NP
 mem, cl   2+d(0,2) 28+EA+4n 17+n    8+n      7       4       4   NP
 reg, imm     3       -       5+n    5+n      3       2       1   PU
 mem, imm  3+d(0,2)   -      17+n    8+n      7       4       3   PU*

 * = not pairable if there is a displacement and immediate


  Example
 

 ror eax, 16    ; Rotate EAX right by 16 bits (swaps its two words)


 { Back to contents screen:hcContents}


.topic RDMSR

  RDMSR - Read from Model Specific Register (Pentium+)
 

  Description
 

Loads the contents of a 64-bit model specific register (MSR) specified in the
ECX register into registers EDX:EAX. The EDX register is loaded with the
high-order 32 bits of the MSR and the EAX register is loaded with the
low-order 32 bits. If fewer than 64 bits are implemented in the MSR being read,
the values returned to EDX:EAX in unimplemented bit locations are undefined.
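
A caller that wants the MSR as a single 64-bit quantity has to join the two
halves. A minimal C sketch, with a helper name chosen only for this
example:

```c
#include <stdint.h>

/* Joins the EDX (high) and EAX (low) halves returned by RDMSR. */
uint64_t combine_edx_eax(uint32_t edx, uint32_t eax)
{
    return ((uint64_t)edx << 32) | eax;
}
```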

This instruction must be executed at privilege level 0 or in real-address
mode; otherwise, a general protection exception will be generated. Specifying
a reserved or unimplemented MSR address in ECX will also cause a general
protection exception.

The MSRs control functions for testability, execution tracing,
performance-monitoring and machine check errors.

The {CPUID} instruction should be used to determine whether MSRs are supported
(EDX[5]=1) before using this instruction.

The MSRs and the ability to read them with the RDMSR instruction were
introduced into the Intel Architecture with the Pentium processor. Execution
of this instruction by an Intel Architecture processor earlier than the
Pentium processor results in an invalid opcode exception.


  Flags affected
 

None.


  Instruction size and timings
 

 bytes                                           Pentium
  2                                              20-24 NP


  Example
 

 rdmsr          ; Read MSR addressed by ECX


 { Back to contents screen:hcContents}


.topic RDPMC

  RDPMC - Read Performance-Monitoring Counters (MMX/Pentium Pro+)
 

  Description
 

Loads the contents of the 40-bit performance-monitoring counter specified in
the ECX register into registers EDX:EAX. The EDX register is loaded with the
high-order 8 bits of the counter and the EAX register is loaded with the
low-order 32 bits. The Pentium Pro processor has two performance-monitoring
counters (0 and 1), which are specified by placing 0000H or 0001H,
respectively, in the ECX register.

The RDPMC instruction allows application code running at a privilege level
of 1, 2, or 3 to read the performance-monitoring counters if the PCE flag in
the CR4 register is set. This instruction is provided to allow performance
monitoring by application code without incurring the overhead of a call to
an operating-system procedure.

The performance-monitoring counters are event counters that can be programmed
to count events such as the number of instructions decoded, number of
interrupts received, or number of cache loads.

The RDPMC instruction does not serialize instruction execution. That is, it
does not imply that all the events caused by the preceding instructions have
been completed or that events caused by subsequent instructions have not
begun. If an exact event count is desired, software must use a serializing
instruction (such as the {CPUID} instruction) before and/or after the
execution of the RDPMC instruction.

The RDPMC instruction can execute in 16-bit addressing mode or virtual-8086
mode; however, the full contents of the ECX register are used to determine
the counter to access and a full 40-bit result is returned (the low-order 32
bits in the EAX register and the high-order 8 bits in the EDX register).

The RDPMC instruction was introduced into the Intel Architecture in the
Pentium Pro processor and the Pentium processor with MMX technology. The
other Pentium processors have performance-monitoring counters, but they must
be read with the RDMSR instruction.


  Flags affected
 

None.


  Instruction size and timings
 

Not available.


  Example
 

 rdpmc     ; Read performance-monitoring counter addressed by ECX


 { Back to contents screen:hcContents}


.topic RDTSC

  RDTSC - Read Time-Stamp Counter (Pentium+)
 

  Description
 

Loads the current value of the processor's time-stamp counter into the EDX:EAX
registers. The time-stamp counter is contained in a 64-bit MSR. The high-order
32 bits of the MSR are loaded into the EDX register, and the low-order 32 bits
are loaded into the EAX register. The processor increments the time-stamp
counter MSR every clock cycle and resets it to 0 whenever the processor is
reset.

The time stamp disable (TSD) flag in register CR4 restricts the use of the
RDTSC instruction. When the TSD flag is clear, the RDTSC instruction can be
executed at any privilege level; when the flag is set, the instruction can
only be executed at privilege level 0. The time-stamp counter can also be
read with the {RDMSR} instruction, when executing at privilege level 0.

The RDTSC instruction is not a serializing instruction. Thus, it does not
necessarily wait until all previous instructions have been executed before
reading the counter. Similarly, subsequent instructions may begin execution
before the read operation is performed.


  Flags affected
 

None.


  Instruction size and timings
 

Not available.


  Example
 

 rdtsc          ; EDX:EAX = Time-Stamp Counter value


 { Back to contents screen:hcContents}


.topic REP

  REP/REPE/REPZ/REPNE/REPNZ - Repeat string operation prefix
 

  Description
 

Repeats a string instruction the number of times specified in the count
register ((E)CX) or until the indicated condition of the ZF flag is no longer
met. The REP (repeat), REPE (repeat while equal), REPNE (repeat while not
equal), REPZ (repeat while zero), and REPNZ (repeat while not zero) mnemonics
are prefixes that can be added to one of the string instructions. The REP
prefix can be added to the {INS}, {OUTS}, {MOVS}, {LODS}, and {STOS}
instructions, and the REPE, REPNE, REPZ, and REPNZ prefixes can be added to
the {CMPS} and {SCAS} instructions. (The REPZ and REPNZ prefixes are
synonymous forms of the REPE and REPNE prefixes, respectively.) The behavior
of the REP prefix is undefined when used with non-string instructions.

The REP prefixes apply only to one string instruction at a time. To repeat a
block of instructions, use the {LOOP} instruction or another looping
construct.

All of these repeat prefixes cause the associated instruction to be repeated
until the count in register (E)CX is decremented to 0 (see the following
table). (If the current address-size attribute is 32, register ECX is used as
a counter, and if the address-size attribute is 16, the CX register is used.)
The REPE, REPNE, REPZ, and REPNZ prefixes also check the state of the ZF flag
after each iteration and terminate the repeat loop if the ZF flag is not in
the specified state. When both termination conditions are tested, the cause
of a repeat termination can be determined either by testing the (E)CX register
with a {JECXZ:Jcc} instruction or by testing the ZF flag with a {JZ:Jcc},
{JNZ:Jcc}, or {JNE:Jcc} instruction.

  Repeat Conditions

 Repeat Prefix      Termination Condition 1      Termination Condition 2
 REP                ECX=0                        None
 REPE/REPZ          ECX=0                        ZF=0
 REPNE/REPNZ        ECX=0                        ZF=1
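
The two termination conditions can be illustrated with a C model of REPNE
SCASB. This is a sketch under assumptions: DF is taken to be 0, and the
ScanResult structure and function name are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Result of the modelled scan; names are illustrative only. */
typedef struct {
    size_t count_left;  /* models (E)CX after the loop */
    size_t index;       /* models how far (E)DI advanced */
    int zf;             /* models ZF after the last comparison */
} ScanResult;

/* Models REPNE SCASB with DF=0: stop when the count reaches 0
 * (condition 1) or when a comparison sets ZF=1 (condition 2). */
ScanResult model_repne_scasb(const uint8_t *buf, size_t count,
                             uint8_t al, int zf_in)
{
    ScanResult r = { count, 0, zf_in };
    while (r.count_left != 0) {
        r.zf = (buf[r.index] == al);  /* SCASB compares and sets ZF */
        r.index++;                    /* (E)DI steps past the element */
        r.count_left--;               /* (E)CX is decremented */
        if (r.zf)                     /* REPNE ends once ZF=1 */
            break;
    }
    return r;
}
```

If the match is found in the last element, the count is also 0 on exit, so
only ZF tells the two outcomes apart. When the count starts at 0 no
comparison is made and ZF keeps its incoming value.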


When the REPE/REPZ and REPNE/REPNZ prefixes are used, the ZF flag does not
require initialization because both the {CMPS} and {SCAS} instructions affect
the ZF flag according to the results of the comparisons they make.

A repeating string operation can be suspended by an exception or interrupt.
When this happens, the state of the registers is preserved to allow the
string operation to be resumed upon a return from the exception or interrupt
handler. The source and destination registers point to the next string
elements to be operated on, the EIP register points to the string instruction,
and the ECX register has the value it held following the last successful
iteration of the instruction. This mechanism allows long string operations to
proceed without affecting the interrupt response time of the system.

When a fault occurs during the execution of a {CMPS} or {SCAS} instruction
that is prefixed with REPE or REPNE, the EFLAGS value is restored to the
state prior to the execution of the instruction. Since the {SCAS} and {CMPS}
instructions do not use EFLAGS as an input, the processor can resume the
instruction after the page fault handler.

Use the REP {INS} and REP {OUTS} instructions with caution. Not all I/O ports
can handle the rate at which these instructions execute.

A REP {STOS} instruction is the fastest way to initialize a large block of
memory.


  Flags affected
 

None; however, the {CMPS} and {SCAS} instructions do set the status flags in
the EFLAGS register.


  Instruction size and timings
 

  REP

See {MOVS} and {STOS}.

  REPE/REPZ/REPNE/REPNZ

See {CMPS} and {SCAS}.


  Example
 

 rep movsb      ; Copy (E)CX bytes from DS:(E)SI to ES:(E)DI


 { Back to contents screen:hcContents}


.topic RET

  RET/RETN/RETF - Return from procedure
 

  Description
 

Transfers program control to a return address located on the top of the stack.
The address is usually placed on the stack by a {CALL} instruction, and the
return is made to the instruction that follows the {CALL} instruction.

The optional source operand specifies the number of stack bytes to be released
after the return address is popped; the default is none. This operand can be
used to release parameters from the stack that were passed to the called
procedure and are no longer needed. It must be used when the CALL instruction
used to switch to a new procedure uses a call gate with a non-zero word count
to access the new procedure. Here, the source operand for the RET instruction
must specify the same number of bytes as is specified in the word count field
of the call gate.

The RET instruction can be used to execute three different types of returns:

  Near return (RETN) - a return to a calling procedure within the current
                        code segment (the segment currently pointed to by the
                        CS register), sometimes referred to as an
                        intrasegment return.

  Far return (RETF)  - a return to a calling procedure located in a different
                        segment than the current code segment, sometimes
                        referred to as an intersegment return.

  Inter-privilege-level far return - a far return to a different privilege
                                      level than that of the currently
                                      executing program or procedure.

The inter-privilege-level return type can only be executed in protected mode.

When executing a near return, the processor pops the return instruction
pointer (offset) from the top of the stack into the EIP register and begins
program execution at the new instruction pointer. The CS register is
unchanged.

When executing a far return, the processor pops the return instruction
pointer from the top of the stack into the EIP register, then pops the
segment selector from the top of the stack into the CS register. The
processor then begins program execution in the new code segment at the new
instruction pointer.

The mechanics of an inter-privilege-level far return are similar to an
intersegment return, except that the processor examines the privilege levels
and access rights of the code and stack segments being returned to determine
if the control transfer is allowed to be made. The DS, ES, FS, and GS segment
registers are cleared by the RET instruction during an inter-privilege-level
return if they refer to segments that are not allowed to be accessed at the
new privilege level. Since a stack switch also occurs on an inter-privilege
level return, the ESP and SS registers are loaded from the stack.

If parameters are passed to the called procedure during an inter-privilege
level call, the optional source operand must be used with the RET instruction
to release the parameters on the return. Here, the parameters are released
both from the called procedure's stack and the calling procedure's stack
(that is, the stack being returned to).


  Flags affected
 

None.


  Instruction size and timings
 

 variations/
 operands     bytes   8088    186     286     386     486     Pentium
 retn         1       20      16      11+m    10+m     5       2   NP
 retn imm16   1+d(2)  24      18      11+m    10+m     5       3   NP
 retf         1       34      22      15+m    18+m    13       4   NP
 retf imm16   1+d(2)  33      25      15+m    18+m    14       4   NP


  Example
 

 ret            ; Return to calling code


 { Back to contents screen:hcContents}


.topic RSM

  RSM - Resume from System Management Mode (Pentium+)
 

  Description
 

Returns program control from system management mode (SMM) to the application
program or operating-system procedure that was interrupted when the processor
received an SMM interrupt. The processor's state is restored from the dump
created upon entering SMM. If the processor detects invalid state information
during state restoration, it enters the shutdown state. The following invalid
information can cause a shutdown:

 - Any reserved bit of CR4 is set to 1.
 - Any illegal combination of bits in CR0, such as (PG=1 and PE=0) or
   (NW=1 and CD=0).
 - (Intel Pentium and 486 processors only.) The value stored in the state
   dump base field is not a 32-KByte aligned address.

The contents of the model-specific registers are not affected by a return
from SMM.


  Flags affected
 

All.


  Instruction size and timings
 

 bytes                                           Pentium
  2                                              83   NP


  Example
 

 rsm            ; Resume from System Management Mode


 { Back to contents screen:hcContents}


.topic SAHF

  SAHF - Store AH into flags
 

  Description
 

Loads the SF, ZF, AF, PF, and CF flags of the EFLAGS register with values
from the corresponding bits in the AH register (bits 7, 6, 4, 2, and 0,
respectively). Bits 1, 3, and 5 of register AH are ignored; the corresponding
reserved bits (1, 3, and 5) in the EFLAGS register remain unchanged.
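
Expressed as a bit mask, the five loaded flags occupy bits 7, 6, 4, 2 and 0,
which is the value D5h. The C sketch below models the effect on EFLAGS; the
macro and function names are assumptions for illustration.

```c
#include <stdint.h>

/* Bits 7, 6, 4, 2 and 0: SF, ZF, AF, PF and CF. */
#define SAHF_MASK 0xD5u

/* Models SAHF: copies the five status flags from AH into the low
 * byte of EFLAGS and leaves every other bit unchanged. */
uint32_t model_sahf(uint32_t eflags, uint8_t ah)
{
    return (eflags & ~(uint32_t)SAHF_MASK) | (ah & SAHF_MASK);
}
```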


  Flags affected
 

The SF, ZF, AF, PF, and CF flags are loaded with values from the AH register.
Bits 1, 3, and 5 of the EFLAGS register are unaffected, with the values
remaining 1, 0, and 0, respectively.


  Instruction size and timings
 

 bytes   8088    186     286     386     486     Pentium
  1       4       3       2       3       2       2   NP


  Example
 

 sahf      ; Load SF, ZF, AF, PF and CF from AH


 { Back to contents screen:hcContents}


.topic SAL

  SAL/SAR/SHL/SHR - Shift bits
 

  Description
 

Shifts the bits in the first operand (destination operand) to the left or
right by the number of bits specified in the second operand (count operand).
Bits shifted beyond the destination operand boundary are first shifted into
the CF flag, then discarded. At the end of the shift operation, the CF flag
contains the last bit shifted out of the destination operand.

The destination operand can be a register or a memory location. The count
operand can be an immediate value or register CL. The count is masked to 5
bits, which limits the count range to 0 to 31. A special opcode encoding is
provided for a count of 1.

The shift arithmetic left (SAL) and shift logical left (SHL) instructions
perform the same operation; they shift the bits in the destination operand to
the left (toward more significant bit locations). For each shift count, the
most significant bit of the destination operand is shifted into the CF flag,
and the least significant bit is cleared.

  SAL/SHL operation

        +---+     +---------------+     +---+
        | C |<----| 7 <-------- 0 |<----| 0 |
        +---+     +---------------+     +---+


The shift arithmetic right (SAR) and shift logical right (SHR) instructions
shift the bits of the destination operand to the right (toward less
significant bit locations). For each shift count, the least significant bit
of the destination operand is shifted into the CF flag, and the most
significant bit is either set or cleared depending on the instruction type.
The SHR instruction clears the most significant bit; the SAR instruction
sets or clears it to match the sign of the original value.

  SAR operation

            +---------------+     +---+
        +-->| 7 --------> 0 |---->| C |
        |   +-+-------------+     +---+
        +-----+

  SHR operation

        +---+     +---------------+     +---+
        | 0 |---->| 7 --------> 0 |---->| C |
        +---+     +---------------+     +---+


The SAR and SHR instructions can be used to perform signed or unsigned
division, respectively, of the destination operand by powers of 2. For
example, using the SAR instruction to shift a signed integer 1 bit to the
right divides the value by 2.

Using the SAR instruction to perform a division operation does not produce
the same result as the IDIV instruction. The quotient from the IDIV
instruction is rounded toward zero, whereas the "quotient" of the SAR
instruction is rounded toward negative infinity. This difference is apparent
only for negative numbers. For example, when the IDIV instruction is used to
divide -9 by 4, the result is -2 with a remainder of -1. If the SAR
instruction is used to shift -9 right by two bits, the result is -3 and the
"remainder" is +3; however, the SAR instruction stores only the most
significant bit of the remainder (in the CF flag).
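
The rounding difference can be reproduced with a small C model. The
function names are illustrative. Because right-shifting a negative value is
implementation-defined in C, the sketch shifts the bitwise complement
instead, which is never negative.

```c
#include <stdint.h>

/* Models SAR on a 32-bit value; count must be 0..31, as it is
 * after the processor's 5-bit masking. Rounds toward negative
 * infinity. */
int32_t model_sar32(int32_t value, unsigned count)
{
    if (value >= 0)
        return value >> count;
    return ~(~value >> count);   /* ~value is non-negative here */
}

/* C's / operator truncates toward zero, as IDIV does. */
int32_t model_idiv32(int32_t dividend, int32_t divisor)
{
    return dividend / divisor;
}
```

For -9 and a divisor of 4, model_idiv32 returns -2 while model_sar32 with a
count of 2 returns -3, matching the figures above.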

The OF flag is affected only on 1-bit shifts. For left shifts, the OF flag is
cleared to 0 if the most-significant bit of the result is the same as the CF
flag (that is, the top two bits of the original operand were the same);
otherwise, it is set to 1. For the SAR instruction, the OF flag is cleared
for all 1-bit shifts. For the SHR instruction, the OF flag is set to the
most-significant bit of the original operand.

The 8086 does not mask the shift count. However, all other Intel Architecture
processors (starting with the 286 processor) do mask the shift count to 5 bits,
resulting in a maximum count of 31. This masking is done in all operating
modes (including the virtual-8086 mode) to reduce the maximum execution time
of the instructions.


  Flags affected
 

The CF flag contains the value of the last bit shifted out of the destination
operand; it is undefined for SHL and SHR instructions where the count is
greater than or equal to the size (in bits) of the destination operand. The
OF flag is affected only for 1-bit shifts (see above); otherwise, it is
undefined. The SF, ZF, and PF flags are set according to the result. If the
count is 0, the flags are not affected. For a non-zero count, the AF flag is
undefined.


  Instruction size and timings
 

 operands     bytes   8088    186     286     386     486     Pentium
 reg, 1        2       2       2       2       3       3       1   PU
 mem, 1     2+d(0,2)  23+EA   15       7       7       4       3   PU
 reg, cl       2       8+4n    5+n    5+n      3       3       4   NP
 mem, cl    2+d(0,2) 28+EA+4n 17+n    8+n      7       4       4   NP
 reg, imm      3       -       5+n    5+n      3       2       1   PU
 mem, imm   3+d(0,2)   -      17+n    8+n      7       4       3   PU*

 * = not pairable if there is a displacement and immediate


  Example
 

 shl eax, 1     ; EAX = EAX * 2


 { Back to contents screen:hcContents}


.topic SBB

  SBB - Integer subtraction with borrow
 

  Description
 

Adds the source operand (second operand) and the carry (CF) flag, and
subtracts the result from the destination operand (first operand). The
result of the subtraction is stored in the destination operand. The
destination operand can be a register or a memory location; the source
operand can be an immediate, a register, or a memory location. (However, two
memory operands cannot be used in one instruction.) The state of the CF flag
represents a borrow from a previous subtraction.

When an immediate value is used as an operand, it is sign-extended to the
length of the destination operand format.

The SBB instruction does not distinguish between signed or unsigned operands.
Instead, the processor evaluates the result for both data types and sets the
OF and CF flags to indicate a borrow in the signed or unsigned result,
respectively. The SF flag indicates the sign of the signed result.

The SBB instruction is usually executed as part of a multibyte or multiword
subtraction in which a SUB instruction is followed by an SBB instruction.
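
A multiword subtraction of this kind can be sketched in C, modelling a
64-bit subtract as a SUB on the low dwords followed by an SBB on the high
dwords. All names are hypothetical, and CF is modelled as an int.

```c
#include <stdint.h>

/* Models SUB: dest - src, reporting the borrow in *cf. */
uint32_t model_sub32(uint32_t dest, uint32_t src, int *cf)
{
    *cf = dest < src;
    return dest - src;
}

/* Models SBB: dest - (src + CF), updating *cf with the new borrow. */
uint32_t model_sbb32(uint32_t dest, uint32_t src, int *cf)
{
    uint64_t subtrahend = (uint64_t)src + (uint64_t)(*cf & 1);
    uint32_t result = (uint32_t)(dest - subtrahend);
    *cf = (uint64_t)dest < subtrahend;
    return result;
}

/* 64-bit subtraction from 32-bit halves: SUB on the low words,
 * then SBB on the high words to propagate the borrow. */
uint64_t sub64_by_halves(uint64_t a, uint64_t b)
{
    int cf = 0;
    uint32_t lo = model_sub32((uint32_t)a, (uint32_t)b, &cf);
    uint32_t hi = model_sbb32((uint32_t)(a >> 32), (uint32_t)(b >> 32), &cf);
    return ((uint64_t)hi << 32) | lo;
}
```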


  Flags affected
 

The OF, SF, ZF, AF, PF, and CF flags are set according to the result.


  Instruction size and timings
 

 operands    bytes   8088    186     286     386     486     Pentium
 reg, reg     2       3       3       2       2       1       1   PU
 mem, reg  2+d(0,2)  24+EA   10       7       7       3       3   PU
 reg, mem  2+d(0,2)  13+EA   10       7       6       2       2   PU
 reg, imm  2+i(1,2)   4       4       3       2       1       1   PU
 mem, imm  2+d(0,2)  23+EA   16       7       7       3       3   PU*
            +i(1,2)
 acc, imm  1+i(1,2)   4       4       3       2       1       1   PU

 * = not pairable if there is a displacement and immediate


  Example
 

 sbb eax, ebx   ; EAX = EAX - (EBX + CF)


 { Back to contents screen:hcContents}


.topic SCAS

  SCAS - Scan string
 

  Description
 

Compares the byte, word, or double word specified with the memory operand
with the value in the AL, AX, or EAX register, and sets the status flags in
the EFLAGS register according to the results. The memory operand address is
read from either the ES:EDI or the ES:DI registers (depending on the
address-size attribute of the instruction, 32 or 16, respectively). The ES
segment cannot be overridden with a segment override prefix.

At the assembly-code level, two forms of this instruction are allowed: the
"explicit-operands" form and the "no-operands" form. The explicit-operand
form (specified with the SCAS mnemonic) is not supported by NASM.

The no-operands form provides "short forms" of the byte, word, and doubleword
versions of the SCAS instructions. Here also ES:(E)DI is assumed to be the
memory operand and the AL, AX, or EAX register is assumed to be the register
operand. The size of the two operands is selected with the mnemonic: SCASB
(byte comparison), SCASW (word comparison), or SCASD (doubleword comparison).

After the comparison, the (E)DI register is incremented or decremented
automatically according to the setting of the DF flag in the EFLAGS register.
(If the DF flag is 0, the (E)DI register is incremented; if the DF flag is 1,
the (E)DI register is decremented.) The (E)DI register is incremented or
decremented by 1 for byte operations, by 2 for word operations, or by 4 for
double-word operations.

The SCAS, SCASB, SCASW, and SCASD instructions can be preceded by the {REP}
prefix for block comparisons of ECX bytes, words, or doublewords. More often,
however, these instructions will be used in a LOOP construct that takes some
action based on the setting of the status flags before the next comparison is
made.


  Flags affected
 

The OF, SF, ZF, AF, PF, and CF flags are set according to the temporary
result of the comparison.


  Instruction size and timings
 

 variations  bytes   8088    186     286     386     486     Pentium
 scasb        1      19      15       7       7       6       4   NP
 scasw        1      19      15       7       7       6       4   NP
 scasd        1       -       -       -       7       6       4   NP
 repX scasb   2      9+15n   5+15n   5+8n    5+8n    7+5n*   8+4n NP
 repX scasw   2      9+19n   5+15n   5+8n    5+8n    7+5n*   8+4n NP
 repX scasd   2       -       -       -      5+8n    7+5n*   8+4n NP

 repX = repe or repz or repne or repnz
 * = 5 if n=0 (where n = count of bytes, words or dwords)


  Example
 

 repne scasb    ; Scan for AL; stop on a match or when (E)CX = 0


 { Back to contents screen:hcContents}


.topic SETcc

  SETcc - Set byte on condition (386+)
 

  Description
 

Sets the destination operand to 0 or 1 depending on the settings of the status
flags (CF, SF, OF, ZF, and PF) in the EFLAGS register. The destination operand
points to a byte register or a byte in memory. The condition code suffix (cc)
indicates the condition being tested for.

The terms "above" and "below" are associated with the CF flag and refer to
the relationship between two unsigned integer values. The terms "greater"
and "less" are associated with the SF and OF flags and refer to the
relationship between two signed integer values.

Many of the SETcc instruction opcodes have alternate mnemonics. For example,
the SETG (set byte if greater) and SETNLE (set byte if not less or equal)
instructions both have the same opcode and test for the same condition: ZF
equals 0 and SF equals OF. These alternate mnemonics are provided to make
code more intelligible.

Some languages represent a logical one as an integer with all bits set. This
representation can be obtained by choosing the logically opposite condition
for the SETcc instruction, then decrementing the result. For example, to test
for overflow, use the SETNO instruction, then decrement the result.
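
The overflow case can be modelled in C as follows. This is only a sketch:
OF is passed in as an int and the function names are invented for the
example.

```c
#include <stdint.h>

/* Models SETNO: 1 when OF is clear, 0 when OF is set. */
uint8_t model_setno(int of)
{
    return of ? 0 : 1;
}

/* SETNO followed by DEC: 0xFF when OF was set, 0x00 otherwise. */
uint8_t all_ones_if_overflow(int of)
{
    uint8_t al = model_setno(of);
    al = (uint8_t)(al - 1);   /* DEC AL: 1 -> 0x00, 0 -> 0xFF */
    return al;
}
```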

 Instruction  Description
 SETA         Set byte if above (CF=0 and ZF=0)
 SETAE        Set byte if above or equal (CF=0)
 SETB         Set byte if below (CF=1)
 SETBE        Set byte if below or equal (CF=1 or ZF=1)
 SETC         Set byte if carry (CF=1)
 SETE         Set byte if equal (ZF=1)
 SETG         Set byte if greater (ZF=0 and SF=OF)
 SETGE        Set byte if greater or equal (SF=OF)
 SETL         Set byte if less (SF<>OF)
 SETLE        Set byte if less or equal (ZF=1 or SF<>OF)
 SETNA        Set byte if not above (CF=1 or ZF=1)
 SETNAE       Set byte if not above or equal (CF=1)
 SETNB        Set byte if not below (CF=0)
 SETNBE       Set byte if not below or equal (CF=0 and ZF=0)
 SETNC        Set byte if not carry (CF=0)
 SETNE        Set byte if not equal (ZF=0)
 SETNG        Set byte if not greater (ZF=1 or SF<>OF)
 SETNGE       Set byte if not greater or equal (SF<>OF)
 SETNL        Set byte if not less (SF=OF)
 SETNLE       Set byte if not less or equal (ZF=0 and SF=OF)
 SETNO        Set byte if not overflow (OF=0)
 SETNP        Set byte if not parity (PF=0)
 SETNS        Set byte if not sign (SF=0)
 SETNZ        Set byte if not zero (ZF=0)
 SETO         Set byte if overflow (OF=1)
 SETP         Set byte if parity (PF=1)
 SETPE        Set byte if parity even (PF=1)
 SETPO        Set byte if parity odd (PF=0)
 SETS         Set byte if sign (SF=1)
 SETZ         Set byte if zero (ZF=1)


  Flags affected
 

None.


  Instruction size and timings
 

 operand   bytes                           386     486     Pentium
 r8         3                               4      4/3     1/2  NP
 mem8     3+d(0-2)                          5      3/4     1/2  NP


  Example
 

 setne al       ; AL = 1 if ZF = 0, otherwise AL = 0


 { Back to contents screen:hcContents}


.topic SGDT

  SGDT/SIDT - Store Global/Interrupt Descriptor Table (286+)
 

  Description
 

Stores the contents of the global descriptor table register (GDTR) or the
interrupt descriptor table register (IDTR) in the destination operand. The
destination operand specifies a 6-byte memory location. If the operand-size
attribute is 32 bits, the 16-bit limit field of the register is stored in the
lower 2 bytes of the memory location and the 32-bit base address is stored in
the upper 4 bytes. If the operand-size attribute is 16 bits, the limit is
stored in the lower 2 bytes and the 24-bit base address is stored in the
third, fourth, and fifth bytes, with the sixth byte filled with 0s.
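
The 6-byte layout can be sketched as follows, assuming 32-bit code (so the
operand-size attribute is 32 bits). The buffer name gdt_image and the
procedure name read_gdtr are hypothetical.

```nasm
        bits 32

section .bss
gdt_image:      resb 6                  ; hypothetical 6-byte buffer

section .text
read_gdtr:                              ; hypothetical procedure
        sgdt    [gdt_image]             ; store GDTR: limit, then base
        movzx   eax, word [gdt_image]   ; EAX = 16-bit limit (bytes 0-1)
        mov     edx, [gdt_image+2]      ; EDX = 32-bit base (bytes 2-5)
        ret
```

The same sequence with SIDT in place of SGDT reads the IDTR.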

The SGDT and SIDT instructions are only useful in operating-system software;
however, they can be used in application programs without causing an
exception to be generated.

See {LGDT/LIDT:LGDT} for information on loading the GDTR and IDTR.

The 16-bit forms of the SGDT and SIDT instructions are compatible with the
286 processor, if the upper 8 bits are not referenced. The 286 processor
fills these bits with 1s; the Pentium Pro, Pentium, 486, and 386 processors
fill these bits with 0s.


  Flags affected
 

None.


  Instruction size and timings
 


  SGDT

 operand    bytes                   286     386     486     Pentium
 mem48       5                      11       9      10       4   NP

  SIDT

 operand    bytes                   286     386     486     Pentium
 mem48       5                      12       9      10       4   NP


  Example
 

 sgdt [descriptor+ebx]         ; Store GDTR at address descriptor+EBX


 { Back to contents screen:hcContents}


.topic SHLD

  SHLD - Double precision shift left (386+)
 

  Description
 

Shifts the first operand (destination operand) to the left the number of bits
specified by the third operand (count operand). The second operand (source
operand) provides bits to shift in from the right (starting with bit 0 of the
destination operand). The destination operand can be a register or a memory
location; the source operand is a register. The count operand is an unsigned
integer that can be an immediate byte or the contents of the CL register.
Only bits 0 through 4 of the count are used, which masks the count to a value
between 0 and 31. If the count is greater than the operand size, the result
in the destination operand is undefined.

If the count is 1 or greater, the CF flag is filled with the last bit shifted
out of the destination operand. For a 1-bit shift, the OF flag is set if a
sign change occurred; otherwise, it is cleared. If the count operand is 0,
the flags are not affected.

The SHLD instruction is useful for multiprecision shifts of 64 bits or more.
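
A minimal sketch of such a multiprecision shift, assuming a 64-bit value held
in EDX:EAX (high:low) and a count of 1 to 31 in CL. The register assignment
and the label shl64 are assumptions chosen for illustration.

```nasm
        bits 32

shl64:                          ; hypothetical label
        shld    edx, eax, cl    ; EDX shifted left, low bits filled from top of EAX
        shl     eax, cl         ; low doubleword shifted left, zeros shifted in
        ret
```

SHLD must come first, because it needs the unshifted EAX. Counts of 32 or
more need separate handling, since only bits 0 through 4 of the count are
used.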


  Flags affected
 

If the count is 1 or greater, the CF flag is filled with the last bit shifted
out of the destination operand and the SF, ZF, and PF flags are set according
to the value of the result. For a 1-bit shift, the OF flag is set if a sign
change occurred; otherwise, it is cleared. For shifts greater than 1 bit, the
OF flag is undefined. If a shift occurs, the AF flag is undefined. If the
count operand is 0, the flags are not affected. If the count is greater than
the operand size, the flags are undefined.


  Instruction size and timings
 

 operands        bytes                      386     486     Pentium
 reg, reg, imm    4                          3       2       4   NP
 mem, reg, imm   4+d(0-2)                    7       3       4   NP
 reg, reg, cl     4                          3       3       4   NP
 mem, reg, cl    4+d(0-2)                    7       4       5   NP


  Example
 

 shld eax, ebx, 16      ; Shift EAX 16 bits to the left with new bits coming
                        ; in from the right from EBX


 { Back to contents screen:hcContents}


.topic SHRD

  SHRD - Double precision shift right (386+)
 

  Description
 

Shifts the first operand (destination operand) to the right the number of
bits specified by the third operand (count operand). The second operand
(source operand) provides bits to shift in from the left (starting with the
most significant bit of the destination operand). The destination operand can
be a register or a memory location; the source operand is a register. The
count operand is an unsigned integer that can be an immediate byte or the
contents of the CL register. Only bits 0 through 4 of the count are used,
which masks the count to a value between 0 and 31. If the count is greater
than the operand size, the result in the destination operand is undefined.

If the count is 1 or greater, the CF flag is filled with the last bit shifted
out of the destination operand. For a 1-bit shift, the OF flag is set if a
sign change occurred; otherwise, it is cleared. If the count operand is 0,
the flags are not affected.

The SHRD instruction is useful for multiprecision shifts of 64 bits or more.
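
A minimal sketch of such a multiprecision shift, assuming an unsigned 64-bit
value held in EDX:EAX (high:low) and a count of 1 to 31 in CL. The register
assignment and the label shr64 are assumptions chosen for illustration.

```nasm
        bits 32

shr64:                          ; hypothetical label
        shrd    eax, edx, cl    ; EAX shifted right, high bits filled from bottom of EDX
        shr     edx, cl         ; high doubleword shifted right, zeros shifted in
        ret
```

SHRD must come first, because it needs the unshifted EDX. For a signed value,
SAR would replace SHR on the high doubleword.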


  Flags affected
 

If the count is 1 or greater, the CF flag is filled with the last bit shifted
out of the destination operand and the SF, ZF, and PF flags are set according
to the value of the result. For a 1-bit shift, the OF flag is set if a sign
change occurred; otherwise, it is cleared. For shifts greater than 1 bit, the
OF flag is undefined. If a shift occurs, the AF flag is undefined. If the
count operand is 0, the flags are not affected. If the count is greater than
the operand size, the flags are undefined.


  Instruction size and timings
 

 operands        bytes                      386     486     Pentium
 reg, reg, imm    4                          3       2       4   NP
 mem, reg, imm   4+d(0-2)                    7       3       4   NP
 reg, reg, cl     4                          3       3       4   NP
 mem, reg, cl    4+d(0-2)                    7       4       5   NP


  Example
 

 shrd eax, ebx, 16      ; Shift EAX 16 bits to the right with new bits coming
                        ; in from the left from EBX


 { Back to contents screen:hcContents}


.topic SLDT

  SLDT - Store Local Descriptor Table register (286+)
 

  Description
 

Stores the segment selector from the local descriptor table register (LDTR)
in the destination operand. The destination operand can be a general-purpose
register or a memory location. The segment selector stored with this
instruction points to the segment descriptor (located in the GDT) for the
current LDT. This instruction can only be executed in protected mode.

When the destination operand is a 32-bit register, the 16-bit segment
selector is copied into the low-order 16 bits of the register. The
high-order 16 bits of the register are cleared to 0s for the Pentium Pro
processor and are undefined for Pentium, 486, and 386 processors. When the
destination operand is a memory location, the segment selector is written to
memory as a 16-bit quantity, regardless of the operand size.

The SLDT instruction is only useful in operating-system software; however,
it can be used in application programs.


  Flags affected
 

None.


  Instruction size and timings
 

 operands   bytes                   286     386     486     Pentium
 r16         3                       2       2       2       2   NP
 mem16     3+d(0-2)                  3       2       3       2   NP


  Example
 

 sldt ax        ; Store LDTR segment selector in AX


 { Back to contents screen:hcContents}


.topic SMSW

  SMSW - Store Machine Status Word (286+)
 

  Description
 

Stores the machine status word (bits 0 through 15 of control register CR0)
into the destination operand. The destination operand can be a 16-bit
general-purpose register or a memory location.

When the destination operand is a 32-bit register, the low-order 16 bits of
register CR0 are copied into the low-order 16 bits of the register and the
upper 16 bits of the register are undefined. When the destination operand is
a memory location, the low-order 16 bits of register CR0 are written to memory
as a 16-bit quantity, regardless of the operand size.

The SMSW instruction is only useful in operating-system software; however, it
is not a privileged instruction and can be used in application programs.

This instruction is provided for compatibility with the 286 processor.
Programs and procedures intended to run on the Pentium Pro, Pentium, 486, and
386 processors should use the {MOV} (control registers) instruction to read
the machine status word.


  Flags affected
 

None.


  Instruction size and timings
 

 operands   bytes                   286     386     486     Pentium
 r16         3                       2       2       2       4   NP
 mem16     3+d(0-2)                  3       3       3       4   NP


  Example
 

 smsw ax        ; Store Machine Status Word in AX


 { Back to contents screen:hcContents}


.topic STC

  STC - Set carry flag
 

  Description
 

Sets the CF flag in the EFLAGS register.


  Flags affected
 

The CF flag is set. The OF, ZF, SF, AF, and PF flags are unaffected.


  Instruction size and timings
 

 bytes   8088    186     286     386     486     Pentium
  1       2       2       2       2       2       2   NP


  Example
 

 stc       ; Set carry flag


 { Back to contents screen:hcContents}


.topic STD

  STD - Set direction flag
 

  Description
 

Sets the DF flag in the EFLAGS register. When the DF flag is set to 1, string
operations decrement the index registers (ESI and/or EDI).


  Flags affected
 

The DF flag is set. The CF, OF, ZF, SF, AF, and PF flags are unaffected.


  Instruction size and timings
 

 bytes   8088    186     286     386     486     Pentium
  1       2       2       2       2       2       2   NP


  Example
 

 std       ; Set direction flag


 { Back to contents screen:hcContents}


.topic STI

  STI - Set interrupt flag
 

  Description
 

Sets the interrupt flag (IF) in the EFLAGS register. After the IF flag is set,
the processor begins responding to external, maskable interrupts after the
next instruction is executed. The delayed effect of this instruction is
provided to allow interrupts to be enabled just before returning from a
procedure (or subroutine). For instance, if an STI instruction is followed by
an RET instruction, the RET instruction is allowed to execute before external
interrupts are recognized. This behavior allows external interrupts to be
disabled at the beginning of a procedure and enabled again at the end of the
procedure. If the STI instruction is followed by a CLI instruction (which
clears the IF flag), the effect of the STI instruction is negated.
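
The disable-then-enable pattern described above might look like the sketch
below. It assumes the code is permitted to execute CLI and STI (for example,
real-address mode or sufficient I/O privilege). The names tick_count and
bump_tick_count are hypothetical.

```nasm
        bits 32

section .data
tick_count:     dd 0            ; hypothetical variable shared with an interrupt handler

section .text
bump_tick_count:                ; hypothetical procedure
        cli                     ; disable maskable interrupts
        inc     dword [tick_count]
        sti                     ; IF = 1, but recognition is delayed one instruction
        ret                     ; executes before any pending interrupt is recognized
```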

The IF flag and the STI and CLI instructions have no effect on the generation
of exceptions and NMI interrupts.


  Flags affected
 

The IF flag is set to 1.


  Instruction size and timings
 

 bytes   8088    186     286     386     486     Pentium
  1       2       2       2       3       5       7   NP


  Example
 

 sti       ; Set interrupt flag


 { Back to contents screen:hcContents}


.topic STOS

  STOS - Store string data
 

  Description
 

Stores a byte, word, or doubleword from the AL, AX, or EAX register,
respectively, into the destination operand. The destination operand is a
memory location, the address of which is read from either the ES:EDI or the
ES:DI registers (depending on the address-size attribute of the instruction,
32 or 16, respectively). The ES segment cannot be overridden with a segment
override prefix.

At the assembly-code level, two forms of this instruction are allowed: the
"explicit-operands" form and the "no-operands" form. The explicit-operands
form (specified with the STOS mnemonic) is not supported by NASM.

The no-operands form provides "short forms" of the byte, word, and doubleword
versions of the STOS instructions. Here also ES:(E)DI is assumed to be the
destination operand and the AL, AX, or EAX register is assumed to be the
source operand. The size of the destination and source operands is selected
with the mnemonic: STOSB (byte read from register AL), STOSW (word from AX),
or STOSD (doubleword from EAX).

After the byte, word, or doubleword is transferred from the AL, AX, or EAX
register to the memory location, the (E)DI register is incremented or
decremented automatically according to the setting of the DF flag in the
EFLAGS register. (If the DF flag is 0, the (E)DI register is incremented;
if the DF flag is 1, the (E)DI register is decremented.) The (E)DI register
is incremented or decremented by 1 for byte operations, by 2 for word
operations, or by 4 for doubleword operations.

The STOSB, STOSW, and STOSD instructions can be preceded by the {REP} prefix
for block fills of ECX bytes, words, or doublewords. More often, however,
these instructions are used within a LOOP construct because data needs to be
moved into the AL, AX, or EAX register before it can be stored.
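
A short sketch of the REP form, assuming a flat memory model in which ES
addresses the same memory as DS. The buffer name, its size, and the procedure
name are hypothetical.

```nasm
        bits 32

section .bss
buffer:         resd 64         ; hypothetical 64-doubleword buffer

section .text
clear_buffer:                   ; hypothetical procedure; clobbers EAX, ECX, EDI
        cld                     ; DF = 0, so EDI is incremented
        mov     edi, buffer     ; ES:EDI -> destination
        xor     eax, eax        ; value to store (0)
        mov     ecx, 64         ; number of doublewords
        rep stosd               ; store EAX at ES:EDI, ECX times
        ret
```

Because the same value is stored every time, nothing needs to be reloaded
into EAX between iterations, which is what makes the REP prefix usable here.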


  Flags affected
 

None.


  Instruction size and timings
 

 variations  bytes   8088    186     286     386     486     Pentium
 stosb        1      11      10       3       4       5       3   NP
 stosw        1      15      10       3       4       5       3   NP
 stosd        1       -       -       -       4       5       3   NP
 rep stosb    2      9+10n   6+9n    4+3n    5+5n    7+4n*   3+n  NP
 rep stosw    2      9+14n   6+9n    4+3n    5+5n    7+4n*   3+n  NP
 rep stosd    2       -       -       -      5+5n    7+4n*   3+n  NP

 * = 5 if n=0, 13 if n=1 (where n = count of bytes, words or dwords)


  Example
 

 rep stosb      ; Store AL in ECX consecutive bytes starting at ES:(E)DI


 { Back to contents screen:hcContents}


.topic STR

  STR - Store task register (286+)
 

  Description
 

Stores the segment selector from the task register (TR) in the destination
operand. The destination operand can be a general-purpose register or a
memory location. The segment selector stored with this instruction points to
the task state segment (TSS) for the currently running task.

When the destination operand is a 32-bit register, the 16-bit segment
selector is copied into the lower 16 bits of the register and the upper 16
bits of the register are cleared to 0s. When the destination operand is a
memory location, the segment selector is written to memory as a 16-bit
quantity, regardless of operand size.

The STR instruction is useful only in operating-system software. It can only
be executed in protected mode.


  Flags affected
 

None.


  Instruction size and timings
 

 operand     bytes                   286     386     486     Pentium
 r16          3                       2       2       2       2   NP
 mem16     3+d(0-2)                   3       2       3       2   NP


  Example
 

 str bx    ; Store task register to BX


 { Back to contents screen:hcContents}


.topic SUB

  SUB - Integer subtraction
 

  Description
 

Subtracts the second operand (source operand) from the first operand
(destination operand) and stores the result in the destination operand. The
destination operand can be a register or a memory location; the source
operand can be an immediate, register, or memory location. (However, two
memory operands cannot be used in one instruction.) When an immediate value
is used as an operand, it is sign-extended to the length of the destination
operand format.

The SUB instruction does not distinguish between signed and unsigned
operands. Instead, the processor evaluates the result for both data types and
sets the OF flag to indicate an overflow in the signed result and the CF flag
to indicate a borrow in the unsigned result. The SF flag indicates the sign
of the signed result.
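
The two byte-sized subtractions below illustrate how one result is flagged
for both interpretations. The operand values are chosen for illustration
only.

```nasm
        bits 32

        mov     al, 1
        sub     al, 2           ; AL = 0FFh: CF=1 (unsigned borrow), OF=0, SF=1
        mov     al, 80h
        sub     al, 1           ; AL = 7Fh: CF=0, OF=1 (-128 - 1 overflows), SF=0
```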


  Flags affected
 

The OF, SF, ZF, AF, PF, and CF flags are set according to the result.


  Instruction size and timings
 

 operands    bytes   8088    186     286     386     486     Pentium
 reg, reg     2       3       3       2       2       1       1   UV
 mem, reg  2+d(0,2)  24+EA   10       7       7       3       3   UV
 reg, mem  2+d(0,2)  13+EA   10       7       6       2       2   UV
 reg, imm  2+i(1,2)   4       4       3       2       1       1   UV
 mem, imm  2+d(0,2)  23+EA   16       7       7       3       3   UV*
            +i(1,2)
 acc, imm  1+i(1,2)   4       4       3       2       1       1   UV

 * = not pairable if there is a displacement and immediate


  Example
 

 sub eax, ebx           ; EAX = EAX - EBX


 { Back to contents screen:hcContents}


.topic TEST

  TEST - Logical compare
 

  Description
 

Computes the bit-wise logical AND of the first operand (source 1 operand) and
the second operand (source 2 operand) and sets the SF, ZF, and PF status
flags according to the result. The result is then discarded.
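
A common use is checking a single bit without altering the operand. The
sketch below assumes the value to test is in EAX; the procedure name is_odd
and its return convention are hypothetical.

```nasm
        bits 32

is_odd:                         ; hypothetical procedure: input in EAX
        test    al, 1           ; computes AL AND 1 and sets ZF; AL is unchanged
        jz      .even           ; ZF=1 means bit 0 was clear
        mov     eax, 1          ; return 1: value was odd
        ret
.even:
        xor     eax, eax        ; return 0: value was even
        ret
```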


  Flags affected
 

The OF and CF flags are cleared to 0. The SF, ZF, and PF flags are set
according to the result. The state of the AF flag is undefined.


  Instruction size and timings
 

 operands   bytes   8088    186     286     386     486     Pentium
 reg, reg    2       3       3       2       2       1       1   UV
 mem, reg 2+d(0,2)  13+EA   10       6       5       2       2   UV
 reg, mem 2+d(0,2)  13+EA   10       6       5       2       2   UV
 reg, imm 2+i(1,2)   5       4       3       2       1       1   UV
 mem, imm 2+d(0,2)  11+EA   10       6       5       2       2   UV*
           +i(1,2)
 acc, imm 1+i(1,2)   4       4       3       2       1       1   UV

 * = not pairable if there is a displacement and immediate


  Example
 

 test eax, ebx          ; Test values in EAX and EBX and set flags accordingly


 { Back to contents screen:hcContents}


.topic UD2

  UD2 - Undefined instruction
 

  Description
 

Generates an invalid opcode. This instruction is provided for software testing
to explicitly generate an invalid opcode. The opcode for this instruction is
reserved for this purpose.

Other than raising the invalid opcode exception, this instruction is the same
as the {NOP} instruction.


  Flags affected
 

None.


  Instruction size and timings
 

Not available.


  Example
 

 UD2       ; Raise invalid opcode exception


 { Back to contents screen:hcContents}


.topic VERR

  VERR/VERW - Verify a segment for reading/writing (286+)
 

  Description
 

Verifies whether the code or data segment specified with the source operand
is readable (VERR) or writable (VERW) from the current privilege level (CPL).
The source operand is a 16-bit register or a memory location that contains
the segment selector for the segment to be verified. If the segment is
accessible and readable (VERR) or writable (VERW), the ZF flag is set;
otherwise, the ZF flag is cleared. Code segments are never verified as
writable. This check cannot be performed on system segments.

To set the ZF flag, the following conditions must be met:

  The segment selector is not null.
  The selector must denote a descriptor within the bounds of the descriptor
   table (GDT or LDT).
  The selector must denote the descriptor of a code or data segment (not
   that of a system segment or gate).
  For the VERR instruction, the segment must be readable.
  For the VERW instruction, the segment must be a writable data segment.
  If the segment is not a conforming code segment, the segment's DPL must be
   greater than or equal to (have less or the same privilege as) both the CPL
   and the segment selector's RPL.

The validation performed is the same as is performed when a segment selector
is loaded into the DS, ES, FS, or GS register, and the indicated access (read
or write) is performed. The segment selector's value cannot result in a
protection exception, enabling the software to anticipate possible segment
access problems.


  Flags affected
 

The ZF flag is set to 1 if the segment is accessible and readable (VERR) or
writable (VERW); otherwise, it is cleared to 0.


  Instruction size and timings
 

  VERR

 operand    bytes                   286     386     486     Pentium
 r16         3                      14      10      11       7   NP
 mem16    3+d(0,2)                  16      11      11       7   NP

  VERW

 operand    bytes                   286     386     486     Pentium
 r16         3                      14      15      11       7   NP
 mem16    3+d(0,2)                  16      16      11       7   NP


  Example
 

 verr ax        ; Verify segment given in AX is readable
 verw ax        ; Verify segment given in AX is writable


 { Back to contents screen:hcContents}


.topic WAIT

  WAIT - Wait for coprocessor
 

  Description
 

Causes the processor to check for and handle pending, unmasked, floating-point
exceptions before proceeding.

This instruction is useful for synchronizing exceptions in critical sections
of code. Coding a WAIT instruction after a floating-point instruction ensures
that any unmasked floating-point exceptions the instruction may raise are
handled before the processor can modify the instruction's results.


  Flags affected
 

The C0, C1, C2, and C3 FPU flags are undefined.


  Instruction size and timings
 

  bytes   8088    186     286     386     486     Pentium
   1       4       6       3       6      1-3      1   NP


  Example
 

 wait           ; Wait for coprocessor


 { Back to contents screen:hcContents}


.topic WBINVD

  WBINVD - Write back and invalidate cache (486+)
 

  Description
 

Writes back all modified cache lines in the processor's internal cache to
main memory and invalidates (flushes) the internal caches. The instruction
then issues a special-function bus cycle that directs external caches to also
write back modified data and another bus cycle to indicate that the external
caches should be invalidated.

After executing this instruction, the processor does not wait for the external
caches to complete their write-back and flushing operations before proceeding
with instruction execution. It is the responsibility of hardware to respond
to the cache write-back and flush signals.

The WBINVD instruction is a privileged instruction. When the processor is
running in protected mode, the CPL of a program or procedure must be 0 to
execute this instruction. This instruction is also a serializing instruction.

In situations where cache coherency with main memory is not a concern,
software can use the {INVD} instruction.


  Flags affected
 

None.


  Instruction size and timings
 

 bytes                                   486     Pentium
  2                                       5    2000+  NP


  Example
 

 wbinvd          ; Write back and invalidate cache


 { Back to contents screen:hcContents}


.topic WRMSR

  WRMSR - Write to Model Specific Register (Pentium+)
 

  Description
 

Writes the contents of registers EDX:EAX into the 64-bit model specific
register (MSR) specified in the ECX register. The high-order 32 bits are
copied from EDX and the low-order 32 bits are copied from EAX. Always set the
undefined or reserved bits in an MSR to the values previously read.
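
One way to preserve the bits that were previously read is a read-modify-write
sequence using RDMSR, sketched below. No particular MSR is implied: the
calling convention (MSR index in ECX, bits to set in EBX) and the procedure
name are assumptions.

```nasm
        bits 32

; On entry (assumed convention): ECX = MSR index, EBX = bits to set in the
; low-order doubleword. Must run at privilege level 0 or in real-address mode.
set_msr_bits:                   ; hypothetical procedure
        rdmsr                   ; EDX:EAX = current contents of the MSR in ECX
        or      eax, ebx        ; change only the requested bits
        wrmsr                   ; write EDX:EAX back; other bits keep the values read
        ret
```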

This instruction must be executed at privilege level 0 or in real-address
mode; otherwise, a general protection exception will be generated. Specifying
a reserved or unimplemented MSR address in ECX will also cause a general
protection exception.

When the WRMSR instruction is used to write to an MTRR, the TLBs are
invalidated, including the global entries. (MTRRs are an
implementation-specific feature of the Pentium Pro processor.)

The MSRs control functions for testability, execution tracing,
performance monitoring and machine check errors. The WRMSR instruction is a
serializing instruction.

The {CPUID} instruction should be used to determine whether MSRs are supported
(EDX[5]=1) before using this instruction.

The MSRs and the ability to write them with the WRMSR instruction were
introduced into the Intel Architecture with the Pentium processor. Execution
of this instruction by an Intel Architecture processor earlier than the
Pentium processor results in an invalid opcode exception.


  Flags affected
 

None.


  Instruction size and timings
 

 bytes                                           Pentium
  2                                             30-45 NP


  Example
 

 wrmsr          ; Write EDX:EAX to MSR specified by ECX


 { Back to contents screen:hcContents}


.topic XADD

  XADD - Exchange and add (486+)
 

  Description
 

Exchanges the first operand (destination operand) with the second operand
(source operand), then loads the sum of the two values into the destination
operand. The destination operand can be a register or a memory location; the
source operand is a register.

This instruction can be used with a {LOCK} prefix.
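
With the LOCK prefix and a memory destination, XADD can act as an atomic
fetch-and-add. The sketch below assumes a doubleword counter in memory; the
names counter and fetch_and_increment are hypothetical.

```nasm
        bits 32

section .data
counter:        dd 0            ; hypothetical shared counter

section .text
fetch_and_increment:            ; hypothetical procedure
        mov     eax, 1          ; amount to add
        lock xadd [counter], eax ; EAX = old value, [counter] = old value + 1
        ret
```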

Intel Architecture processors earlier than the 486 processor do not recognize
this instruction. If this instruction is used, you should provide an
equivalent code sequence that runs on earlier processors.


  Flags affected
 

The CF, PF, AF, SF, ZF, and OF flags are set according to the result of the
addition, which is stored in the destination operand.


  Instruction size and timings
 

 operands   bytes                                   486     Pentium
 reg, reg    3                                       3       3   NP
 mem, reg   3+d(0-2)                                 4       4   NP


  Example
 

 xadd eax, ebx          ; EBX = original EAX, EAX = EAX + EBX


 { Back to contents screen:hcContents}


.topic XCHG

  XCHG - Exchange register/memory with register
 

  Description
 

Exchanges the contents of the destination (first) and source (second) operands.
The operands can be two general-purpose registers or a register and a memory
location. If a memory operand is referenced, the processor's locking protocol
is automatically implemented for the duration of the exchange operation,
regardless of the presence or absence of the {LOCK} prefix or of the value of
the IOPL.

This instruction is useful for implementing semaphores or similar data
structures for process synchronization.
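
A simple spin lock is one such use. The sketch below relies on the automatic
locking of XCHG with a memory operand; the lock variable and the procedure
names are hypothetical, and a production lock would normally add back-off or
yielding.

```nasm
        bits 32

section .data
lock_byte:      db 0            ; hypothetical lock: 0 = free, 1 = held

section .text
acquire:                        ; hypothetical procedure; clobbers AL
        mov     al, 1
.spin:
        xchg    al, [lock_byte] ; atomically: AL = old value, lock_byte = 1
        test    al, al
        jnz     .spin           ; old value was 1: already held, try again
        ret

release:                        ; hypothetical procedure
        mov     byte [lock_byte], 0
        ret
```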

The XCHG instruction can also be used instead of the {BSWAP} instruction for
16-bit operands; for example, XCHG AL, AH swaps the two bytes of AX.


  Flags affected
 

None.


  Instruction size and timings
 

 operands   bytes   8088    186     286     386     486     Pentium
 reg, reg    2       4       4       3       3       3       3   NP
 reg, mem  2+d(0-2)  25+EA  17       5       5       5       3   NP
 mem, reg  2+d(0-2)  25+EA  17       5       5       5       3   NP
 acc, reg    1       3       3       3       3       3       2   NP
 reg, acc    1       3       3       3       3       3       2   NP

 Note: in this case acc = AX or EAX only


  Example
 

 xchg ax, dx    ; AX = DX and DX = original AX


 { Back to contents screen:hcContents}


.topic XLAT

  XLAT/XLATB - Table look-up translation
 

  Description
 

Locates a byte entry in a table in memory, using the contents of the AL
register as a table index, then copies the contents of the table entry back
into the AL register. The index in the AL register is treated as an unsigned
integer. The XLAT and XLATB instructions get the base address of the table in
memory from either the DS:EBX or the DS:BX registers (depending on the
address-size attribute of the instruction, 32 or 16, respectively). (The DS
segment may be overridden with a segment override prefix.)

At the assembly-code level, two forms of this instruction are allowed: the
"explicit-operands" form and the "no-operands" form. The explicit-operands
form (specified with the XLAT mnemonic) allows the base address of the table
to be specified explicitly with a symbol. This explicit-operands form is
provided to allow documentation; however, note that the documentation
provided by this form can be misleading. That is, the symbol does not have
to specify the correct base address. The base address is always specified by
the DS:(E)BX registers, which must be loaded correctly before the XLAT
instruction is executed.

The no-operands form (XLATB) provides a "short form" of the XLAT instructions.
Here also the processor assumes that the DS:(E)BX registers contain the base
address of the table.
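
A typical table look-up, sketched under the assumption of 32-bit code with
the table in the segment addressed by DS. The table hex_digits and the
procedure nibble_to_hex are hypothetical.

```nasm
        bits 32

section .data
hex_digits:     db "0123456789ABCDEF"   ; hypothetical 16-entry table

section .text
nibble_to_hex:                  ; hypothetical procedure; clobbers EBX
        and     al, 0Fh         ; keep the index within the table
        mov     ebx, hex_digits ; DS:EBX -> base of the table
        xlatb                   ; AL = byte at DS:EBX + AL
        ret
```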


  Flags affected
 

None.


  Instruction size and timings
 

 bytes   8088    186     286     386     486     Pentium
  1      11      11       5       5       4       4   NP


  Example
 

 xlat      ; Equivalent to {MOV} AL, [BX + AL]


 { Back to contents screen:hcContents}


.topic XOR

  XOR - Logical exclusive OR
 

  Description
 

Performs a bitwise exclusive OR (XOR) operation on the destination (first)
and source (second) operands and stores the result in the destination operand
location. The source operand can be an immediate, a register, or a memory
location; the destination operand can be a register or a memory location.
(However, two memory operands cannot be used in one instruction.) Each bit of
the result is 1 if the corresponding bits of the operands are different; each
bit is 0 if the corresponding bits are the same.


  Flags affected
 

The OF and CF flags are cleared; the SF, ZF, and PF flags are set according
to the result. The state of the AF flag is undefined.


  Instruction size and timings
 

 operands    bytes   8088    186     286     386     486     Pentium
 reg, reg     2       3       3       2       2       1       1   UV
 mem, reg  2+d(0,2)  24+EA   10       7       7       3       3   UV
 reg, mem  2+d(0,2)  13+EA   10       7       6       2       2   UV
 reg, imm  2+i(1,2)   4       4       3       2       1       1   UV
 mem, imm  2+d(0,2)  23+EA   16       7       7       3       3   UV*
            +i(1,2)
 acc, imm  1+i(1,2)   4       4       3       2       1       1   UV

 * = not pairable if there is a displacement and immediate


  Example
 

 xor eax, ebx           ; EAX = EAX xor EBX


 { Back to contents screen:hcContents}

