Programming/Assembly

From Dev Wiki
Revision as of 14:21, 3 February 2020 by Brodriguez (Correct typo)

Syntax Types

For 32 bit (x86) assembly, there are two main syntax types. AT&T is mostly used in Unix environments, while Intel is mostly used in MS-DOS and Windows.
The differences are as follows:

Feature | AT&T | Intel
Signs | Instructions need a size definition suffix (see #Instruction Sizes). Values need a % prefix for registers, and a $ prefix for constants. Ex: %eax | Automatically detects size and type of value. Signs are not needed. Ex: eax
Value Order | Source first, destination second. Ex: mov $5, %eax | Destination first, source second. Ex: mov eax, 5
Value Size | Size suffix (see #Instruction Sizes) must be added to the instruction. Ex: addl %eax, %ebx | Size is automatically derived from the register used. In instances where size is ambiguous, a size keyword (byte, word, dword, qword) must be used. Ex: add eax, ebx
Effective Address | Uses general memory address syntax. Ex: (%ebx, %ecx, 4) | Uses arithmetic expressions in square brackets. Ex: [ebx + ecx*4]
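
To make the two effective address notations concrete, here is a minimal C sketch of the arithmetic that both (%ebx, %ecx, 4) and [ebx + ecx*4] describe. The function name effective_address and its parameters are hypothetical, chosen only for illustration. The displacement term is part of the general addressing form and is simply 0 for the examples above.

```c
#include <stdint.h>

/* Hypothetical helper: models address = base + index * scale + disp,
   with 32-bit wraparound as in 32-bit x86 addressing.
   scale is expected to be 1, 2, 4, or 8. */
uint32_t effective_address(uint32_t base, uint32_t index,
                           uint32_t scale, int32_t disp)
{
    return base + index * scale + (uint32_t)disp;
}
```

With ebx = 0x1000 and ecx = 3, effective_address(0x1000, 3, 4, 0) gives 0x100C, which is where element 3 of an array of 4-byte values starting at 0x1000 would live.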

Registers

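The following registers exist in both 64 and 32 bit assembly. The 64 bit names (and the 8 bit names SPL, BPL, SIL, DIL) can only be used in 64 bit assembly.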

Size | Stack Pointer | Stack Base Pointer | Accumulator | Base | Counter | Data | Source | Destination
64 Bit | RSP | RBP | RAX | RBX | RCX | RDX | RSI | RDI
32 Bit | ESP | EBP | EAX | EBX | ECX | EDX | ESI | EDI
16 Bit | SP | BP | AX | BX | CX | DX | SI | DI
8 Bit | SPL | BPL | AH, AL | BH, BL | CH, CL | DH, DL | SIL | DIL
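
The smaller names in each column are not separate registers. They name the low bits of the same 64 bit register, with AH, BH, CH, and DH naming bits 8 to 15. A minimal C sketch of that relationship for the accumulator follows. A uint64_t stands in for RAX, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helpers: view the narrower register names as slices
   of one 64-bit value standing in for RAX. */
uint32_t eax_of(uint64_t rax) { return (uint32_t)rax; }        /* bits 0-31 */
uint16_t ax_of(uint64_t rax)  { return (uint16_t)rax; }        /* bits 0-15 */
uint8_t  ah_of(uint64_t rax)  { return (uint8_t)(rax >> 8); }  /* bits 8-15 */
uint8_t  al_of(uint64_t rax)  { return (uint8_t)rax; }         /* bits 0-7  */
```

For rax = 0x1122334455667788, these return 0x55667788, 0x7788, 0x77, and 0x88 respectively.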


The following registers only exist in 64 bit assembly.

Size | Temp 1 | Temp 2 | Temp 3 | Temp 4 | Temp 5 | Temp 6 | Temp 7 | Temp 8
64 Bit | R8 | R9 | R10 | R11 | R12 | R13 | R14 | R15
32 Bit | R8D | R9D | R10D | R11D | R12D | R13D | R14D | R15D
16 Bit | R8W | R9W | R10W | R11W | R12W | R13W | R14W | R15W
8 Bit | R8B | R9B | R10B | R11B | R12B | R13B | R14B | R15B

For in-depth details on how registers and function calls are expected to work together, see: https://www.cs.princeton.edu/courses/archive/spring11/cos217/lectures/15AssemblyFunctions.pdf

Instruction Sizes

In AT&T syntax, instructions can have a letter appended to the end of the instruction, indicating the size of the data being referenced. The letters are the following:

  • Byte (b) - A one-byte (8 bit) value.
  • Word (w) - A two-byte (16 bit) value.
  • DoubleWord (l) - A four-byte (32 bit) value. The l stands for "long".
  • QuadWord (q) - An eight-byte (64 bit) value. Only available in 64 bit assembly.
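
The suffix letters map directly onto operand sizes. As a small illustration, the following C sketch returns the size in bytes for each letter. The function name suffix_size is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: maps a size suffix letter to its operand size
   in bytes. Returns 0 for a letter that is not one of the four
   suffixes listed above. */
size_t suffix_size(char suffix)
{
    switch (suffix) {
    case 'b': return 1;  /* byte       */
    case 'w': return 2;  /* word       */
    case 'l': return 4;  /* doubleword */
    case 'q': return 8;  /* quadword   */
    default:  return 0;
    }
}
```

For example, addl ends in l, so suffix_size('l') reports that it works on 4-byte operands such as %eax and %ebx.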

Instructions

For all of the below, letters indicate what kind of value is accepted for each argument. The letters correspond to the following:

  • r - Register
  • m - Memory
  • c - Constant
  • l - Label

All of these instructions are written in Intel syntax format. For reference on how to convert to AT&T, see #Syntax Types.

Data Movement

  • mov <rm>, <rmc> - Copies second value to first value. Memory-to-memory moves are not possible.
  • push <rmc> - Pushes value to stack. Updates stack pointer register (rsp, esp) accordingly. Recall that the stack grows "downward", so this subtracts from the stack pointer value.
  • pop <rm> - Pops from top of stack and puts into location. Similarly to push, this updates stack pointer register accordingly.
  • lea <r>, <m> - "Load effective address". Computes the address described by the second value and places that address, not the memory contents stored at it, into the register of the first value.
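
The way push and pop move the stack pointer can be sketched in C. This is only a model under stated assumptions: a small byte array stands in for stack memory, the sp field stands in for esp, values are 32 bits wide, and there are no overflow checks. The type and function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model: mem stands in for stack memory and sp for esp. */
typedef struct {
    uint8_t  mem[64];
    uint32_t sp;       /* offset into mem; starts at the high end */
} Stack32;

void stack_init(Stack32 *s)
{
    memset(s->mem, 0, sizeof s->mem);
    s->sp = sizeof s->mem;   /* empty stack: sp points just past the top */
}

/* push: subtract the operand size from sp, then store at the new sp. */
void push32(Stack32 *s, uint32_t value)
{
    s->sp -= 4;
    memcpy(&s->mem[s->sp], &value, 4);
}

/* pop: load from sp, then add the operand size back to sp. */
uint32_t pop32(Stack32 *s)
{
    uint32_t value;
    memcpy(&value, &s->mem[s->sp], 4);
    s->sp += 4;
    return value;
}
```

Starting from sp = 64, pushing two values leaves sp at 56, and popping returns them in reverse order while sp climbs back to 64.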

Arithmetic and Logic

  • add <rm>, <rmc> - Adds the two values together. Stores the result in the location of the first value.
  • sub <rm>, <rmc> - Subtracts the second value from the first value. Stores the result in the location of the first value.
  • inc <rm> - Increments the value by one.
  • dec <rm> - Decrements the value by one.
  • imul <r>, <rm> - First syntax for imul. Multiplies the two values together and stores the result in the register of the first value.
  • imul <r>, <rm>, <c> - Second syntax for imul. Multiplies the second and third values together and stores the result in the register of the first value.
  • idiv <rm> - Treats registers edx and eax as a single 64 bit value, edx:eax, with edx as the upper half. Divides this larger value by the passed value. The quotient is stored in eax and the remainder in edx.
  • and <rm>, <rmc> - Performs a bitwise AND operation on the values. Puts the result in the location of the first value.
  • or <rm>, <rmc> - Performs a bitwise OR operation on the values. Puts the result in the location of the first value.
  • xor <rm>, <rmc> - Performs a bitwise XOR operation on the values. Puts the result in the location of the first value.
  • not <rm> - Performs a bitwise NOT (one's complement) on the value, flipping every bit. Two's complement negation is done by the separate neg instruction.
  • shl <rm>, <c> - Shift left. Shifts the first value left by the number of bits given in the second value. Puts the result in the location of the first value.
  • shr <rm>, <c> - Shift right. Shifts the first value right by the number of bits given in the second value. Puts the result in the location of the first value.
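
Since idiv is the least obvious entry in this list, here is a minimal C sketch of the 32 bit form. It is a model, not the instruction itself: the names DivResult and idiv32 are hypothetical, and the sketch assumes the quotient fits in 32 bits (the real instruction raises a divide error when it does not, or when the divisor is 0).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: the two registers idiv writes. */
typedef struct {
    int32_t eax;   /* quotient  */
    int32_t edx;   /* remainder */
} DivResult;

/* Hypothetical model of 32-bit idiv: the dividend is the 64-bit
   value edx:eax, with edx as the upper half. */
DivResult idiv32(int32_t edx, int32_t eax, int32_t divisor)
{
    /* The real instruction faults here; the sketch just asserts. */
    assert(divisor != 0);

    /* Merge edx (high half) and eax (low half) into one signed dividend. */
    int64_t dividend =
        (int64_t)(((uint64_t)(uint32_t)edx << 32) | (uint32_t)eax);

    DivResult r;
    r.eax = (int32_t)(dividend / divisor);  /* truncated toward zero */
    r.edx = (int32_t)(dividend % divisor);  /* same sign as the dividend */
    return r;
}
```

For example, idiv32(0, 17, 5) yields eax = 3 and edx = 2. For a negative dividend such as -7, edx must hold the sign extension of eax (all one bits, i.e. -1), so idiv32(-1, -7, 2) yields eax = -3 and edx = -1. In practice that sign extension is usually done with the cdq instruction just before idiv.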

Control Flow

  • cmp <rm>, <rmc> - Compares two values by subtracting the second from the first and discarding the result. Sets the condition flags in the flags register accordingly.
  • jmp <l> - Aka "jump". Unconditionally moves program execution to the location indicated by the label.
  • je <l> - Jump when equal, based on the condition flags.
  • jne <l> - Jump when not equal, based on the condition flags.
  • jz <l> - Jump when the last result was 0, based on the condition flags. Behaves the same as je.
  • jg <l> - Jump when greater than (signed), based on the condition flags.
  • jge <l> - Jump when greater than or equal (signed), based on the condition flags.
  • jl <l> - Jump when less than (signed), based on the condition flags.
  • jle <l> - Jump when less than or equal (signed), based on the condition flags.
  • call <l> - Pushes the return address (the location of the instruction after the call) onto the stack, then jumps to the location indicated by the label.
  • ret - Pops the return address from the top of the stack, then jumps to it.
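
The link between cmp and the conditional jumps can be sketched in C. The model below computes the zero, sign, and overflow flags that a 32 bit cmp would set, then expresses each signed jump as a test on those flags. The type and function names are hypothetical, and the carry flag (used by unsigned jumps, which this page does not list) is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the flags that a 32-bit "cmp a, b" sets. */
typedef struct {
    bool zf;   /* zero flag     */
    bool sf;   /* sign flag     */
    bool of;   /* overflow flag */
} Flags;

Flags cmp32(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t r  = ua - ub;            /* cmp computes a - b and discards it */
    Flags f;
    f.zf = (r == 0);                  /* set when the operands are equal */
    f.sf = (r >> 31) != 0;            /* top bit of the result */
    /* Signed overflow: operands differ in sign and the result's sign
       differs from the first operand's sign. */
    f.of = (((ua ^ ub) & (ua ^ r)) >> 31) != 0;
    return f;
}

/* Each predicate returns true when the matching jump would be taken. */
bool je_taken(Flags f)  { return f.zf; }
bool jne_taken(Flags f) { return !f.zf; }
bool jg_taken(Flags f)  { return !f.zf && f.sf == f.of; }
bool jge_taken(Flags f) { return f.sf == f.of; }
bool jl_taken(Flags f)  { return f.sf != f.of; }
bool jle_taken(Flags f) { return f.zf || f.sf != f.of; }
```

After cmp32(3, 5), jl_taken and jle_taken are true while jg_taken is false. Comparing the overflow flag with the sign flag is what keeps the signed jumps correct when the subtraction wraps around, for example in cmp32(INT32_MIN, 1).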