Introduction to assembly language

Learn the basic concepts necessary to analyze code in assembly language.


Assembly is the programming language closest to the machine code executed by computer processors. When reverse engineering software, especially when you don’t have access to the original source code, you often end up working with assembly-level code. Understanding assembly lets you see exactly what a program is doing at the hardware level. In this post I summarize what I learned about assembly language from the book Practical Binary Analysis, published by No Starch Press.

Design of a program in assembly

Below is a comparison between C language code and assembly language code.

code in C language

#include <stdio.h>

int main(int argc, char *argv[]){

     printf("Hello world\n");

     return 0;
}

assembly language code

     .file      "hello.c"
     .intel_syntax noprefix
     .section   .rodata
.LC0:
     .string    "Hello world\n"
     .text
     .global    main
     .type      main, @function
main:
     push       rbp
     mov        rbp, rsp
     sub        rsp, 16
     mov        DWORD PTR [rbp-4], edi
     mov        QWORD PTR [rbp-16], rsi
     mov        edi, OFFSET FLAT:.LC0
     call       puts
     mov        eax, 0
     leave
     ret
     .size      main, .-main
     .ident     "GCC: (Ubuntu 5.4.0-6ubuntu1..)"
     .section   .note.GNU-stack,""

The C source code consists of a main function that calls printf to display the message “Hello world” on the screen. (The assembly listing calls puts instead, because the compiler replaced the printf call with the simpler puts, since the string needs no formatting.) At a high level, the corresponding assembly program consists of four types of components: instructions, directives, labels, and comments.

In the assembly source code, the .section .rodata directive tells the assembler to place the content that follows in the .rodata section, which is dedicated to read-only constant data. The .string directive defines an ASCII string. Other directives define other data types, such as .byte, .word, .long, and .quad.

The main function is located in the .text section, which is dedicated to storing code. The .text directive is shorthand for .section .text, and main: introduces a symbolic label for the main function.

After the main label come the instructions that make up main. These instructions can refer symbolically to previously declared data such as .LC0 (the symbolic name that gcc chose for the string “Hello world”).


Assembly language instructions, directives, labels, and comments

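+-------------+-----------------------+-------------------------------------------------------+
| Type        | Example               | Description                                           |
+-------------+-----------------------+-------------------------------------------------------+
| Instruction | mov eax, 0            | eax = 0                                               |
| Directive   | .section .text        | Place the following content in the .text section      |
| Directive   | .string "foobar"      | Define an ASCII string containing "foobar"            |
| Directive   | .long 0x12345678      | Define a doubleword with the value 0x12345678         |
| Label       | foo: .string "foobar" | Define the string "foobar" with the symbolic name foo |
| Comment     | # This is a comment   | A comment                                             |
+-------------+-----------------------+-------------------------------------------------------+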

Instructions are the operations that the CPU executes. Directives are commands that tell the assembler to produce a particular piece of data, place instructions or data in a particular section, and so on. Labels are symbolic names that can be used to refer to instructions or data in the assembly program, and comments are simply text that documents the code for other people.


Separation between code and data

In the assembly source code above, code and data are easy to tell apart because they are placed in different sections. This makes the code easier to inspect, since you can see which bytes are code and which are data. However, this is not always the case: nothing in the x86 architecture prevents code and data from being mixed in the same section, and in practice some compilers and hand-written assembly programs do exactly this.


AT&T Syntax vs Intel Syntax

When analyzing code it is important to determine the syntax we are observing since the way in which we work with operands is different, among other things.

For example, AT&T syntax explicitly prefixes registers with % and constants with $.

The order of the operands also changes: AT&T syntax puts the source first, while Intel syntax puts the destination first:

; at&t syntax
mov     $0x6, %edi

; intel syntax
mov     edi, 0x6

; operation performed
edi = 0x6

Structure of an x86 instruction

At the assembly level, instructions generally have the following form:

mnemonic destination, source

The mnemonic is the representation of a machine instruction, and the source and destination are the operands of the instruction.

Not all instructions have two operands; some have none at all.


Machine-level structure of x86 instructions

The x86 ISA (Instruction Set Architecture) uses variable-length instructions: some instructions consist of a single byte, many take several bytes, and the longest can reach 15 bytes.

In addition to this, instructions can start at any memory address. This means that the CPU does not force any alignment on the code, although some compilers sometimes align code to optimize performance.

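+-----------+-----------+-----------------+-----------+---------------+---------------+
| Prefixes  | Opcode    | Addressing mode | SIB byte  | Displacement  | Immediate     |
+-----------+-----------+-----------------+-----------+---------------+---------------+
| 0-4 bytes | 1-3 bytes | 0-1 bytes       | 0-1 bytes | 0/1/2/4 bytes | 0/1/2/4 bytes |
+-----------+-----------+-----------------+-----------+---------------+---------------+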

An instruction consists of an optional prefix, an opcode, and zero or more operands. All parts are optional except the opcode.

The opcode is the main part of the instruction, while prefixes can modify the behavior of an instruction.
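
To make the variable-length encoding concrete, here is a small C sketch that stores a few hand-assembled x86-64 encodings as byte arrays. The byte values are the usual encodings of these instructions, but the array names and the starts_with_rex helper are made up for this illustration; this is not a decoder.

```c
#include <stdint.h>

/* Hand-assembled x86-64 encodings, shown only to illustrate that
   instruction length varies. The names are hypothetical. */
const uint8_t ENC_RET[]         = { 0xC3 };                         /* ret          */
const uint8_t ENC_PUSH_RBP[]    = { 0x55 };                         /* push rbp     */
const uint8_t ENC_MOV_RBP_RSP[] = { 0x48, 0x89, 0xE5 };             /* mov rbp, rsp */
const uint8_t ENC_MOV_EAX_0[]   = { 0xB8, 0x00, 0x00, 0x00, 0x00 }; /* mov eax, 0   */

/* Returns 1 if the first byte is a REX prefix (0x40-0x4F in 64-bit mode),
   the prefix that selects 64-bit operands in "mov rbp, rsp". */
int starts_with_rex(const uint8_t *insn) {
    return (insn[0] & 0xF0) == 0x40;
}
```

Here ret and push rbp take 1 byte each, mov rbp, rsp takes 3 bytes (a prefix, the opcode, and an addressing mode byte), and mov eax, 0 takes 5 bytes (the opcode followed by a 4-byte immediate).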

Some instructions have implicit operands. That is, they are not shown explicitly in the instruction because they are inherent to the opcode. For example, the destination operand of opcode 0x05 (an add instruction) is always rax (or its eax or ax part, depending on the operand size), and only the source operand (src) is variable and needs to be specified explicitly.

Another example of implicit operands is the push instruction which implicitly updates rsp (stack pointer register).

Instructions can have different types of operands:

  • register operands
  • memory operands
  • immediate operands

Register Operands

Registers are small but very fast pieces of storage located in the CPU. Some registers have special purposes, such as the instruction pointer, which tracks the current address of execution, or the stack pointer, which stores the address of the top of the stack.

General purpose registers:

In the 8086 instruction set, the registers were 16 bits. The x86 ISA extended registers to 32 bits, and x86-64 extended them further to 64 bits. To maintain compatibility, the registers used in new instruction sets are supersets of the old registers.

To specify a register operand in assembly, you use the register name. For example, mov rax, 64 moves the value 64 into the rax register. That example uses the full 64-bit rax register. To use its lower 32 bits, you write eax; to use the lower 16 bits, you write ax. Within ax, ah names the upper 8 bits and al names the lower 8 bits.

x86-64 rax register subdivision

Other registers such as rbx, rcx, and rdx follow the same naming scheme to access their lower parts.
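
The subdivision can be sketched in C by treating a 64-bit integer as the contents of rax and extracting the parts with casts and shifts. The function names below are hypothetical and only mirror the register names; C itself has no access to registers.

```c
#include <stdint.h>

/* Views of a 64-bit value that mirror the rax sub-register names. */
uint32_t view_eax(uint64_t rax) { return (uint32_t)rax; }        /* bits 0..31 */
uint16_t view_ax(uint64_t rax)  { return (uint16_t)rax; }        /* bits 0..15 */
uint8_t  view_ah(uint64_t rax)  { return (uint8_t)(rax >> 8); }  /* bits 8..15 */
uint8_t  view_al(uint64_t rax)  { return (uint8_t)rax; }         /* bits 0..7  */
```

For example, if rax held 0x1122334455667788, then eax would be 0x55667788, ax would be 0x7788, ah would be 0x77, and al would be 0x88.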

Registers r8-r15 were added in x86-64 and are not available in earlier variants of x86.

General Purpose Registers

Other registers

In addition to the registers mentioned above, there are also other registers such as rip (eip in 32-bit x86 and ip in 8086) and rflags (called eflags or flags in the older variants). The rip register always points to the address of the next instruction to be executed and is automatically updated by the CPU. The status flags register is used for comparisons and conditional branches, and tracks things like whether the last operation produced zero or resulted in an overflow.


Memory operands

Memory operands specify a memory address where the CPU should fetch one or more bytes. The x86 ISA supports only one explicit memory operand per instruction. This means that you cannot move bytes from one memory location to another in a single mov instruction. To do this, a register must be used as intermediate storage.

On x86, memory operands are specified as follows:

[base + index*scale + displacement]

Base and index are 64-bit registers, scale is an integer with the value 1, 2, 4, or 8, and displacement is a 32-bit constant or a symbol.

For example, you can use an instruction like mov eax, DWORD PTR [rax*4 + arr] to access an array, where arr is a displacement holding the starting address of the array, rax holds the index of the element you want to access, and each element is 4 bytes wide, which is why rax is multiplied by 4. DWORD PTR tells the assembler that we want to read 4 bytes (a doubleword or DWORD) from memory.
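
As a rough sketch of the address arithmetic, the following C code computes base + index*scale + displacement by hand and shows the C-level equivalent of the array access. The function names effective_address and load_element are assumptions made for this example.

```c
#include <stdint.h>

/* Computes base + index*scale + displacement, the way the CPU forms
   the address of a memory operand. scale should be 1, 2, 4, or 8. */
uint64_t effective_address(uint64_t base, uint64_t index,
                           unsigned scale, int32_t disp) {
    /* the 32-bit displacement is sign-extended to 64 bits */
    return base + index * scale + (uint64_t)(int64_t)disp;
}

/* C view of: mov eax, DWORD PTR [rax*4 + arr]
   The compiler scales the index by sizeof(int32_t) == 4 for us. */
int32_t load_element(const int32_t *arr, uint64_t index) {
    return arr[index];
}
```

With base 0x1000, index 3, scale 4, and displacement 8, the computed address is 0x1000 + 12 + 8 = 0x1014.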


Immediate operands

These operands are simply constants encoded in the instruction itself. For example, in the instruction add rax, 42, the value 42 is an immediate.

On x86, immediates are encoded in little-endian format.
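
To illustrate the byte order, here is a minimal C sketch that writes out the encoding of mov eax, imm32 (opcode 0xB8 followed by a 4-byte immediate). The function name encode_mov_eax_imm32 is hypothetical; the point is only that the least significant byte of the immediate comes first.

```c
#include <stdint.h>

/* Writes the 5-byte encoding of "mov eax, imm32" into out:
   opcode 0xB8, then the immediate in little-endian order. */
void encode_mov_eax_imm32(uint32_t imm, uint8_t out[5]) {
    out[0] = 0xB8;
    for (int i = 0; i < 4; i++) {
        out[1 + i] = (uint8_t)(imm >> (8 * i)); /* least significant byte first */
    }
}
```

For the immediate 0x12345678 the resulting bytes are B8 78 56 34 12, which is why immediates look reversed in a hex dump.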


Common x86 instructions

Online instruction references:

  • http://ref.x86asm.net/
  • https://software.intel.com/en-us/articles/intel-sdm/
+--------------------------------------------------------------------+
|                          Data Transfer                             | 
+----------------------+---------------------------------------------+
|    Instruction       |       Description                           |
+----------------------+---------------------------------------------+
|  mov dst, src        |  dst = src                                  |
|  xchg dst1, dst2     |  swap dst1 and dst2                         |
|  push src            |  push src onto stack and decrement rsp      |
|  pop dst             |  pop value from stack into dst and inc rsp  |
+--------------------------------------------------------------------+
|                 Arithmetic                                         | 
+----------------------+---------------------------------------------+
|    Instruction       |       Description                           |
+----------------------+---------------------------------------------+
|  add dst, src        |  dst += src                                 |
|  sub dst, src        |  dst -= src                                 |
|  inc dst             |  dst += 1                                   |
|  dec dst             |  dst -= 1                                   |
|  neg dst             |  dst = -dst                                 |
|  cmp src1, src2      |  set status flag based on src1 - src2       |
+--------------------------------------------------------------------+
|                 Logical/bitwise                                    | 
+----------------------+---------------------------------------------+
|    Instruction       |       Description                           |
+----------------------+---------------------------------------------+
|  and dst, src        |  dst &= src                                 |
|  or  dst, src        |  dst |= src                                 |
|  xor dst, src        |  dst ^= src                                 |
|  not dst             |  dst = ~dst                                 |
|  test src1, src2     |  set status flag based on src1 & src2       |
+--------------------------------------------------------------------+
|                 unconditional branches                             | 
+----------------------+---------------------------------------------+
|    Instruction       |       Description                           |
+----------------------+---------------------------------------------+
|  jmp addr            |  jump to addr                               |
|  call addr           |  push return address on stack, then call    |
|                      |  function at addr                           |
|  ret                 |  pop return address from stack and return   |
|                      |  to that address                            |
|  syscall             |  Enter the kernel to perform a system call  |
+--------------------------------------------------------------------+
|                 conditional branches                               | 
+----------------------+---------------------------------------------+
|    Instruction       |       Description                           |
+----------------------+---------------------------------------------+
|  je addr/jz addr     |  jump to addr if zero flag is set           |
|                      |  for example, operands were equal on the    |
|                      |  last cmp                                   |
|  ja addr             |  jump if dst is above src in the last cmp   |
|  jb addr             |  jump if dst is below src in the last cmp   |
|  jg addr             |  jump if dst is greater than src in the cmp |
|  jl addr             |  jump if dst is less than  src in the cmp   |
+----------------------+---------------------------------------------+


Comparing operands (status flags)

The cmp instruction subtracts the second operand from the first operand, and according to the result of this operation, sets several status flags in the rflags register that we can work with later. The most important flags are the following:

  • zero flag (ZF): set if the result of the subtraction is zero, which means that the operands have equal values.
  • sign flag (SF): set if the result was negative. If no overflow occurred, this means that in the operation cmp src1, src2 the operand src2 is greater than src1 (as signed values).
  • overflow flag (OF): set if the subtraction resulted in a signed overflow.

The test instruction sets the flags in a similar way, but instead of performing a subtraction it performs a bitwise AND of its operands.
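
The flag updates can be modeled in C. The sketch below assumes 32-bit operands and uses a made-up Flags struct; it also tracks the carry flag (CF), which is not listed above but is the flag that the unsigned jumps ja and jb depend on. Here cmp is modeled as a subtraction and test as a bitwise AND, and in both cases the result itself is discarded.

```c
#include <stdint.h>

/* Hypothetical container for the four flags modeled here. */
typedef struct { int zf, sf, of, cf; } Flags;

/* Models "cmp a, b": computes a - b and keeps only the flags. */
Flags flags_after_cmp(uint32_t a, uint32_t b) {
    uint32_t r = a - b;
    Flags f;
    f.zf = (r == 0);                          /* result is zero      */
    f.sf = (int)(r >> 31);                    /* top bit of result   */
    f.of = (int)(((a ^ b) & (a ^ r)) >> 31);  /* signed overflow     */
    f.cf = (a < b);                           /* unsigned borrow     */
    return f;
}

/* Models "test a, b": computes a & b; OF and CF are always cleared. */
Flags flags_after_test(uint32_t a, uint32_t b) {
    uint32_t r = a & b;
    Flags f = { r == 0, (int)(r >> 31), 0, 0 };
    return f;
}
```

For instance, comparing 5 with 5 sets ZF, while comparing 3 with 5 sets SF and CF because the subtraction wraps around to a negative value.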


Implementing system calls

To make a system call, the syscall instruction is used. Before executing it, the system call must be prepared by selecting its number and setting its arguments as specified by the operating system. For example, to make a read system call on Linux, the value 0 is loaded into rax, the file descriptor into rdi, the buffer address into rsi, and the number of bytes to read into rdx.

section .data
     buffer times 256 db 0 ; reserve a 256-byte buffer
     len equ 256 ; buffer length

section .text
global _start

_start:
     ; Read system call to read from stdin
     mov rax, 0         ; syscall number for read (0)
     mov rdi, 0         ; file descriptor 0 (stdin)
     mov rsi, buffer    ; buffer address
     mov rdx, len       ; number of bytes to read (buffer size)
     syscall            ; makes the read system call
     mov rbx, rax       ; saves the number of bytes read in rbx

     ; write system call to write to stdout
     mov rax, 1         ; syscall number for write (1)
     mov rdi, 1         ; file descriptor 1 (stdout)
     mov rsi, buffer    ; buffer address
     mov rdx, rbx       ; uses the number of bytes read as a limit
     syscall            ; makes the write system call

     ; Exit the program
     mov rax, 60        ; syscall number for exit (60)
     xor rdi, rdi       ; exit code 0
     syscall            ; program ends

The previous example is an assembly program that reads a message from the keyboard and displays it on the screen. To assemble, link, and run it, we use the following commands:

# we use the nasm assembler
$ nasm -f elf64 -o syscall_example.o syscall_example.asm

# we link the program
$ ld -o syscall_example syscall_example.o

# we execute it
$ ./syscall_example
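
For comparison, the same kind of raw system call can be issued from C. This is a minimal sketch that assumes Linux with glibc, whose syscall() wrapper loads the system call number and arguments into the appropriate registers and executes the syscall instruction. The raw_write function name is made up for this example.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Invokes the write system call directly, bypassing the usual write()
   wrapper. Returns the number of bytes written, or -1 on error
   (glibc's syscall() sets errno in that case). */
long raw_write(int fd, const void *buf, size_t len) {
    return syscall(SYS_write, fd, buf, len);
}
```

Calling raw_write(1, "hi\n", 3) writes to stdout in the same way as the write block in the assembly program: SYS_write is 1 on x86-64, the file descriptor is the first argument, and the buffer and length follow.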

Implementing conditional jumps

As mentioned above, conditional jumps work thanks to the status flags, which are modified by instructions like cmp or test. A conditional jump transfers control to a specific address or label if its condition is met; otherwise the jump is not taken and execution continues with the instruction that follows it.

cmp rax, rbx
jb label

In the previous example, a cmp is performed, followed by a jb (jump if below). This means that the jump is taken if rax < rbx (unsigned comparison).

In the following example, test rax, rax sets the zero flag only when rax is 0, so the jnz (jump if not zero) is taken if rax is not equal to 0:

test rax, rax
jnz label
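
The difference between the unsigned jumps (ja, jb) and the signed jumps (jg, jl) matters when the top bit of an operand is set. The following C sketch, with hypothetical function names, expresses each jump condition as the comparison it corresponds to.

```c
#include <stdint.h>

/* cmp rax, rbx ; jb label  -> taken when rax < rbx as unsigned values */
int would_jb(uint64_t rax, uint64_t rbx) {
    return rax < rbx;
}

/* cmp rax, rbx ; jl label  -> taken when rax < rbx as signed values.
   The casts reinterpret the bits as two's complement, which is what
   mainstream compilers do for this conversion. */
int would_jl(uint64_t rax, uint64_t rbx) {
    return (int64_t)rax < (int64_t)rbx;
}

/* test rax, rax ; jnz label -> taken when rax & rax is not zero */
int would_jnz_after_test(uint64_t rax) {
    return (rax & rax) != 0;
}
```

With rax = 0xFFFFFFFFFFFFFFFF and rbx = 1, jb is not taken (as an unsigned number rax is the largest possible value), but jl is taken (as a signed number rax is -1).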

Loading memory addresses

The lea (load effective address) instruction computes the address of a memory operand and stores it in a register, without accessing memory. It is similar to the & operator in C/C++.

lea r12, [rip+0x2000]

In the previous example, the lea instruction loads the address resulting from rip+0x2000 into register r12.
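
The analogy with the & operator can be sketched in C: taking the address of an array element only performs the address arithmetic and never reads the element. The function name address_of_element is an assumption made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the address of arr[i] without reading it. A compiler can
   typically implement this with a single lea, since the result is just
   arr + i*4. */
int32_t *address_of_element(int32_t *arr, size_t i) {
    return &arr[i];
}
```

For an array of 4-byte elements, the address of element 3 lies 12 bytes past the start of the array.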


The stack

The stack is a region of memory reserved for storing data related to function calls, such as the return address, function arguments, and local variables. The stack gets its name from the way it is accessed. Instead of writing data to arbitrary places on the stack, data is written and removed in last-in-first-out (LIFO) order: values are written by pushing them onto the top of the stack and removed by popping them from the top.

As data is pushed onto the stack, the rsp register (which points to the top of the stack) decreases, because the stack grows toward lower memory addresses.

Note that when we perform a push, the value is stored at the lowest address of the stack (the new rsp), and when we perform a pop, rsp is incremented back to the address it had before the push. However, the value that we popped is still in memory. If we have sensitive information on the stack and want to erase it completely, we must overwrite it explicitly, since a pop will not do it.
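
The behavior of push and pop can be modeled with a toy stack in C. This is a simplified sketch: the ToyStack type and its functions are hypothetical, rsp is an index into a small array rather than a real address, and there are no bounds checks.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_SLOTS 8

/* Toy model of a downward-growing stack. rsp is an index into mem;
   the value STACK_SLOTS means the stack is empty. */
typedef struct {
    uint64_t mem[STACK_SLOTS];
    size_t rsp;
} ToyStack;

void toy_init(ToyStack *s) {
    for (size_t i = 0; i < STACK_SLOTS; i++) s->mem[i] = 0;
    s->rsp = STACK_SLOTS;
}

/* push: decrement rsp, then store the value at the new top.
   The caller must not push more than STACK_SLOTS values. */
void toy_push(ToyStack *s, uint64_t value) {
    s->rsp -= 1;
    s->mem[s->rsp] = value;
}

/* pop: load the value at the top, then increment rsp.
   The slot is NOT cleared. The caller must not pop an empty stack. */
uint64_t toy_pop(ToyStack *s) {
    uint64_t value = s->mem[s->rsp];
    s->rsp += 1;
    return value;
}
```

After pushing 0x1111 and 0x2222 and then popping once, rsp has moved back up by one slot, but the slot that held 0x2222 still contains that value until something overwrites it.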


Function calls and function frames

source code in C language

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]){
   printf("%s=%s\n", argv[1], getenv(argv[1]));
  
   return 0;
}

source code in assembly language

Contents of section .rodata:
.LC0:
        .string "%s=%s\n"

Contents of section .text:
main:
        push    rbp
        mov     rbp, rsp
        sub     rsp, 16
        mov     DWORD PTR [rbp-4], edi
        mov     QWORD PTR [rbp-16], rsi
        mov     rax, QWORD PTR [rbp-16]
        add     rax, 8
        mov     rax, QWORD PTR [rax]
        mov     rdi, rax
        call    getenv
        mov     rdx, rax
        mov     rax, QWORD PTR [rbp-16]
        add     rax, 8
        mov     rax, QWORD PTR [rax]
        mov     rsi, rax
        mov     edi, OFFSET FLAT:.LC0
        mov     eax, 0
        call    printf
        mov     eax, 0
        leave
        ret

Two function calls are shown in the C language code. The first is getenv which is used to get the value of an environment variable specified by argv[1] . Then another call is made to the printf function.

Now if we compare the C code with the assembly code, we can observe several things. First, the string “%s=%s\n” is stored as a constant in the .rodata (read-only data) section, and the symbol .LC0 is used to refer to it.

Each function has its own frame on the stack, delimited by rbp pointing to the base of the function frame and rsp pointing to the top of the function frame.

In the assembly code above, the first thing the main function does is execute the prologue, which sets up the function frame.

push rbp        ; saving the contents of rbp on the stack
mov rbp, rsp    ; Now rbp has the same value as rsp

This prologue is so common that x86 has a shorthand instruction, enter, that does the same thing.

On Linux x86-64, the rbx and r12-r15 registers must not be clobbered by a function. That is, if a function uses these registers, it must restore them to their original values before returning (ret). This is achieved by pushing the values of those registers onto the stack at the beginning of the function and popping them to restore the registers before returning.

After executing the prologue, the function decrements the rsp register by 0x10 (16) bytes to reserve space for local variables on the stack (4 bytes for argc, 8 bytes for the argv pointer, and the remaining bytes for padding).

On x86-64 Linux systems, the first 6 integer or pointer arguments to a function are passed in the rdi, rsi, rdx, rcx, r8, and r9 registers. If the function receives more than 6 arguments, or some arguments do not fit in the 64-bit registers, the remaining arguments are pushed onto the stack in reverse order.

; placing arguments in registers and on the stack before calling a function
mov rdi, param1
mov rsi, param2
mov rdx, param3
mov rcx, param4
mov r8, param5
mov r9, param6
push param9
push param8
push param7
...
call function

This can vary depending on the calling convention used. For example, with the cdecl convention (common on 32-bit x86), all arguments are passed on the stack in reverse order without using any registers. Another convention is fastcall, which passes some arguments in registers.


The red zone

The red zone is a 128-byte area (in the x86-64 ABI) below the stack pointer. Programmers and compilers can use this area to store temporary data without needing to modify the stack pointer. This can be useful to optimize certain operations and avoid additional instructions to adjust the stack.

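A key characteristic of the red zone is that the ABI guarantees it will not be overwritten by signal or interrupt handlers, but it is not preserved across function calls: a call pushes a return address just below rsp, and the callee may use that area freely. For this reason the red zone is mostly useful in leaf functions, which do not call other functions. Kernel code is usually compiled without a red zone, because a hardware interrupt that runs on the same stack would overwrite it.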


Preparing arguments and calling functions

In the assembly source code at the beginning of the section Function calls and function frames, after running the function prologue, the following is done:

; the address where the argv array begins is loaded into the rax register
mov rax, QWORD PTR [rbp-16] ; now rax points to argv[0]

; Since the C code uses argv[1], we add 8 bytes to rax
; (8 bytes is the size of a pointer) so that it points to
; argv[1]
add rax, 8 ; now rax points to argv[1]

; rax points to argv[1], but what we want is the value stored there,
; so we load it from memory
mov rax, QWORD PTR [rax] ; now rax holds the value of argv[1]

; Now the value of rax is copied to rdi, since rdi is the register used to
; pass the first argument to a function
mov rdi, rax ; now rdi contains the same value as rax

; The call to the getenv function is made, with rdi (argv[1]) as its argument.
call getenv

Reading values returned by functions

If a function returns a value, this value is stored in rax. In the assembly source code at the beginning of the section Function calls and function frames, the following happens after the call to the getenv function:

call getenv

; the value returned by getenv is copied to the rdx register, since it is the register
; that will carry the third argument to the printf function later
mov rdx, rax

; The process seen above is repeated to store the value of argv[1] in rsi
mov rax, QWORD PTR [rbp-16] ; rax = &argv[0]
add rax, 8 ; rax = &argv[1]
mov rax, QWORD PTR [rax] ; rax = argv[1]
mov rsi, rax ; rsi = rax

; Finally, the address of the constant referenced by .LC0 is stored in edi
mov edi, OFFSET FLAT:.LC0

; When calling a variadic function, the rax register (specifically al) specifies
; the number of floating-point arguments passed in vector registers. In this case
; there are no floating-point arguments, so rax (eax) is set to 0
mov eax, 0

; At this point, the argument registers look like this
; PARAMETER 1: rdi = "%s=%s\n"
; PARAMETER 2: rsi = argv[1]
; PARAMETER 3: rdx = getenv(argv[1])
; now the printf function is called
call printf ; printf(rdi, rsi, rdx); -> printf("%s=%s\n", argv[1], getenv(argv[1]));

Returning from a function

After the printf function call is completed, the value of rax is set to 0 since this register is the one used to return values as mentioned above.

After this, the leave instruction is executed, which is shorthand for the following:

mov rsp, rbp
pop rbp

This is known as the epilogue of the function, and it restores the stack to the state it had before the function set up its frame.

Finally, the ret instruction is executed, which pops the return address from the stack and continues execution at that address.

In this way the execution of the main function ends.


Conditional branches

source code in C language

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]){

    if (argc > 5){
        printf("argc > 5\n");
    }else{
        printf("argc <= 5\n");
    }

    return 0;
}

assembly source code

.LC0:
        .string "argc > 5"
.LC1:
        .string "argc <= 5"
main:
        push    rbp
        mov     rbp, rsp
        sub     rsp, 16
        mov     DWORD PTR [rbp-4], edi
        mov     QWORD PTR [rbp-16], rsi
        cmp     DWORD PTR [rbp-4], 5
        jle     .L2
        mov     edi, OFFSET FLAT:.LC0
        call    puts
        jmp     .L3
.L2:
        mov     edi, OFFSET FLAT:.LC1
        call    puts
.L3:
        mov     eax, 0
        leave
        ret

In the above code, after the prologue the values of argc and argv are stored on the stack. After this, argc is compared with the immediate value 5.

; prologue
push rbp
mov rbp, rsp

; the values of edi and rsi are stored on the stack
mov DWORD PTR [rbp-4], edi ; argc is saved here
mov QWORD PTR [rbp-16], rsi ; argv is saved here

; the comparison is made
cmp DWORD PTR [rbp-4], 5 ; compare(argc, 5)

jle .L2 ; if argc <= 5 then jump to .L2

; If the jump is not taken (argc > 5), execution continues with the code below

; The address of .LC0 is loaded into edi to pass it as an argument
; to the puts function (the compiler replaced printf with puts)
mov edi, OFFSET FLAT:.LC0

call puts ; The puts function is called with the edi register as its argument.
jmp .L3 ; jump to .L3

.L2:
; In case argc is less than or equal to 5, this is the code that will be executed
mov edi, OFFSET FLAT:.LC1 ; the address of "argc <= 5" is stored in edi

call puts

.L3:
mov eax, 0 ; the return value (eax) is set to 0
leave ; the stack frame is torn down
ret ; It returns to the function that called main
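
One way to see how the if/else maps onto jumps is to rewrite the C code with goto so that it follows the same control flow as the assembly. This is only an illustrative sketch: the function classify_argc is made up, and it returns the message instead of printing it.

```c
/* Mirrors the control flow of the assembly: the compiler inverts the
   condition (argc > 5 becomes "jump away if argc <= 5"). */
const char *classify_argc(int argc) {
    const char *msg;

    if (argc <= 5) goto L2;   /* cmp DWORD PTR [rbp-4], 5 ; jle .L2 */
    msg = "argc > 5";         /* mov edi, OFFSET FLAT:.LC0          */
    goto L3;                  /* jmp .L3                            */
L2:
    msg = "argc <= 5";        /* mov edi, OFFSET FLAT:.LC1          */
L3:
    return msg;
}
```

Note that the "then" branch falls through right after the comparison, while the "else" branch lives at the jump target, which is the opposite of how the condition reads in the original C source.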

Loops

source code in C language

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]){

    while (argc > 0){
        printf("%s\n", argv[(unsigned)--argc]);
    }

    return 0;
}

source code in assembly language

main:
        push    rbp
        mov     rbp, rsp
        sub     rsp, 16
        mov     DWORD PTR [rbp-4], edi
        mov     QWORD PTR [rbp-16], rsi
        jmp     .L2
.L3:
        sub     DWORD PTR [rbp-4], 1
        mov     eax, DWORD PTR [rbp-4]
        mov     eax, eax
        lea     rdx, [0+rax*8]
        mov     rax, QWORD PTR [rbp-16]
        add     rax, rdx
        mov     rax, QWORD PTR [rax]
        mov     rdi, rax
        call    puts
.L2:
        cmp     DWORD PTR [rbp-4], 0
        jg      .L3
        mov     eax, 0
        leave
        ret

What the previous C language code does, in summary, is display all the arguments passed via CLI on the screen, from the last to the first in a loop.

$ exec-cpp hello world test 1 2 3
3
2
1
test
world
hello
./prog.out

Now the explanation of the assembly code is as follows:

main:
         ; prologue of the function
         push   rbp
         mov    rbp, rsp
        
         ; reserving memory on the stack
         sub    rsp, 16
        
         ; storing the argc and argv values on the stack
         mov    DWORD PTR [rbp-4], edi
         mov    QWORD PTR [rbp-16], rsi
        
         ; unconditional jump to .L2
         jmp    .L2
.L3:
         ; The immediate value 1 is subtracted from argc
         sub    DWORD PTR [rbp-4], 1
        
         ; the value of argc is stored in eax
         mov    eax, DWORD PTR [rbp-4]
        
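         ; zero-extends eax into rax: writing a 32-bit register clears the
         ; upper 32 bits of rax, which implements the (unsigned) cast
         mov    eax, eax

         ; The effective address of the memory operand [0 + rax * 8],
         ; that is, argc * 8, is stored in rdx.
         lea    rdx, [0+rax*8]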
        
         ; rax = &argv[0]
         mov    rax, QWORD PTR [rbp-16]
        
         ; rax += rdx
         add    rax, rdx
        
         ; rax = argv[argc] (rdx holds the byte offset argc * 8)
         mov    rax, QWORD PTR [rax]
        
         ; rdi = rax
         mov    rdi, rax
        
         ; puts(rdi);
         call   puts
.L2:
         ; comparing argc with the immediate value 0
         cmp    DWORD PTR [rbp-4], 0
        
         ; if argc > 0, then jump to .L3
         jg     .L3
        
         ; This code is reached once argc is no longer greater than 0
         mov    eax, 0
         leave
         ret
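
The loop structure in the assembly (jump to the condition first, with the body placed before it) can be mirrored in C using goto. In this sketch the hypothetical function reverse_args stores each argument in an output array instead of calling puts, so that the order can be checked.

```c
/* Copies argv into out in reverse order, following the same control
   flow as the assembly. Returns the number of entries written.
   out must have room for argc entries. */
int reverse_args(int argc, char *argv[], const char *out[]) {
    int n = 0;

    goto L2;                           /* jmp .L2: check the condition first */
L3:
    argc -= 1;                         /* sub DWORD PTR [rbp-4], 1           */
    out[n++] = argv[(unsigned)argc];   /* load from argv + argc*8            */
L2:
    if (argc > 0) goto L3;             /* cmp DWORD PTR [rbp-4], 0 ; jg .L3  */
    return n;
}
```

Placing the condition at the bottom means each iteration needs only one conditional jump, which is a common way for compilers to lay out while loops.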

This concludes the post. As a final recommendation, it is good practice to compare the C code we write with its assembly counterpart to better understand what happens in our program.


© 2023. All rights reserved.