Introduction
Why This Tutorial?
I have been tutoring a course called ‘Computer Architecture’ here at the Free University of Berlin more than five times over the past three years, and most students had a common problem: getting into writing their very first assembly program. As we use 64-bit Linux, this tutorial will focus on 64-bit ELF.
Why Assembly?
I know of only one reason to write programs in assembly: it’s as close as you can get to the way your CPU works. That is its biggest disadvantage as well: you’re programming from the perspective of a CPU, not that of a human brain. This proves difficult for a wide range of students who are not familiar with CPU architecture. Thus this tutorial will not only try to teach you some assembly basics and NASM syntax, it will additionally try to shed some light on the principles of a computer.
Why NASM?
NASM (Netwide Assembler) is an open source assembler for the 80x86 and x86-64 architectures, and a pretty good one at that. Compared to MASM, TASM or GAS it is rather easy to use and provides a solid amount of syntactic sugar.
How?
When trying something new we usually want to get some positive feedback as soon as possible. The normal approach for an assembly tutorial would be to list all the requirements, work through those step by step, and once they’re met introduce the audience to the actual coding. As I find this highly unsatisfying, I’ll get started right away by having you produce your first few lines of code and provide you with the help necessary to sort out any requirements along the way. For those of you preferring the more classic approach, here is the requirements section.
Some Coding
Hello World In NASM
So let’s get right to it! Here is the code for a simple Hello World program written in NASM. Go ahead and copy and paste the code into a text editor of your choice and save it as hw.asm.
section .data
    msg db "Hello World!", 10   ; db: data byte, 10: ASCII newline
section .text
    global _start
_start:
    mov rax, 1                  ; write
    mov rdi, 1                  ; to stdout
    mov rsi, msg                ; starting at msg
    mov rdx, 13                 ; for len bytes
    syscall
    mov rax, 60                 ; exit
    mov rdi, 0                  ; with success
    syscall
Once you have a folder containing the hw.asm file with the above content opened in a shell of your choice, type
nasm -f elf64 hw.asm && ld hw.o -o hw && ./hw
and hit Return. Now one of two things can happen. It will either work (you’ll see Hello World! written on your console) or it won’t (everything else). I’ll assume it worked; otherwise take a look at the troubleshooting section.
HINT: Typing program1 && program2 in a shell is like typing program1 and hitting Return first and then typing program2 and hitting Return again, except that program2 only runs if program1 succeeded (exited with status 0).
If you have ever written a Hello World program before, you’ll find 13 lines of code a lot for such a simple thing. That is because we’re used to using libraries. If we write assembly, we have to do everything manually (not true by the way, more on that later ;) ). Let’s start in line one.
- Line 1 (section .data): this is where we can use memory with an initial value.
- Line 2 (msg db "Hello World!", 10): we get a piece of memory and name its starting address msg. It is as many data bytes long as the string Hello World! contains (12, that is), plus one more for the ASCII newline, which is the 10. Everything preceded by a ; is a comment.
- Line 3 (section .text): this is where our actual code starts.
- Line 4 (global _start): the label _start shall be visible globally. That’s a good thing, as it is the standard entry point for your Linux programs.
- Line 5 (_start:): create the label _start (don’t forget the colon). It marks the location in your code where execution begins when the program is started.
- Line 6 (mov rax, 1): a line of NASM code starts with a mnemonic (an instruction), followed by a number of operands (the number can be zero :p). Here mov rax, 1 means: move the number 1 into the CPU register RAX.
- Line 7 (mov rdi, 1): as above, just into the CPU register RDI.
- Line 8 (mov rsi, msg): I think you have a solid idea of what this does. Just keep in mind that msg is the address of the point in memory where our text starts.
- Line 9 (mov rdx, 13): yeah, repetitive, ain’t it… We do indeed store a 13 in the CPU register RDX.
- Line 10 (syscall): hands control to the operating system to perform a syscall. Our operating system will now do a couple of things. First it looks for the kind of syscall in RAX. As we put a 1 there, it knows it’s a write syscall. The write syscall needs three additional parameters. The 1 in RDI tells it to write to stdout (your standard output, usually your console), msg in RSI tells it where the text to be written starts, and finally the 13 in RDX tells the write syscall to write the first 13 bytes. So basically we will write Hello World! with an additional ASCII newline character to stdout.
With that knowledge the next three lines are rather easy. All we need to know is that we have to terminate our program manually. A 60 in RAX tells the OS to perform the exit syscall, which expects only one parameter in RDI: a 0 in our case, which is the return value of our program.
HINT: A list of syscalls can be found on Ryan A. Chapman’s blog.
Addition Function In NASM Used In C-Program
So in order to get things done in pure NASM we need to have some knowledge. We need not only the mnemonics but also the syscalls and a proper understanding of our CPU. To make things a little easier one can utilize a higher-level programming language like C. The following example will do an addition in assembly and use that function from a C program to write the result to stdout.
Go ahead and save this as addit.asm,
section .text          ; code
global addit
addit:
    mov rax, rdi       ; param1 in rax
    add rax, rsi       ; result of the addition in rax
    ret
this as addit.c,
#include <stdio.h>

extern int addit(int, int);

int main(void)
{
    printf("%d\n", addit(100, 78));
}
and open the folder where you saved both files in a shell and run this command:
nasm -f elf64 addit.asm && gcc -std=c11 addit.c addit.o -o addit && ./addit
This is a really nice thing to have, as we don’t need to worry about those syscalls. We’ll handle all the syscalls from within C (which has libraries for it) and let our NASM function do the real work.
There are only two things our addit.asm needs to know:
- Where do my parameters come from? The first parameter (the 100) in RDI and the second parameter (the 78) in RSI.
- Where do I need to save the result? In RAX.
This also tells us that add rax, rsi in line 5 computes like this: take the value from RSI, add it to the value in RAX and store the result in RAX. The ret in line 6 ends the function call.
HINT: It is possible to write Mnemonics and Registers in upper- as well as lowercase in NASM.
If we take a look at the C program, the advantage really shows. Most of the functionality our 13-line Hello World NASM program had is found in line 7 of the C program …
Requirements
There are a couple of things you need in order to write code in NASM, assemble and link it, and finally execute it.
- You will need a 64 bit Linux OS; either native, in a VM or through ssh.
- You will need some sort of text editor.
- You will need a Shell.
- You will need to have NASM installed.
- You will need to have ld installed.
- You will need to have GCC installed.
Troubleshooting
If you landed in this section, you’re probably not familiar with Linux. This usually happens to people who have mainly used Windows or macOS. That is the majority of people, even among students of computer science, so you’re in good company ;). But don’t fret, as I have taught hundreds of students starting with little or no experience how to code in NASM. So let us walk you through those requirements.
Operating System
For an OS you’ll need a 64-bit Linux. I recommend Linux Mint 64-bit Cinnamon if you have no idea where to start. As you probably want to read less and do more, you might like this installation tutorial. Here you can get the full User Guide in your language. This will set you up with an OS that feels a lot like Windows, with a lot of graphical tools to use. I will write a few words about Linux distros soon, so keep an eye out.
HINT: A good place to get to know some Linux distros is DistroWatch.
Text Editor
There are some really great text editors out there and I’m not going to judge them here. If you don’t know where to start, I recommend ATOM because it’s open source and intuitive to use if you come from Windows or macOS.
In order to install ATOM in Linux Mint type
sudo add-apt-repository ppa:webupd8team/atom
sudo apt-get update
sudo apt-get install atom
in a shell of your choice.
Shell
As with editors, there are a lot of great shells out there. I myself am using ZSH with some extras I will cover in a future post. For beginners I recommend fish, as it’s easy to set up and use.
Additional Software
Install the required software by typing these lines in a terminal:
sudo apt-get install gcc
sudo apt-get install nasm