Written by: Derick Swanepoel (derick@maple.up.ac.za)
Version 1.0 - 2002-04-19, 01:50am
This Quickstart aims to show you the ropes on Linux assembly as quickly as possible. Basically, it just points out the differences between a Linux and DOS assembly program with just enough explanation not to confuse you. For more detail and why things are the way they are, see the Step-by-Step Guide.
Linux

    section .data
    hello:      db 'Hello world!',10
    helloLen:   equ $-hello

    section .text
    global _start

    _start:
        ; Write 'Hello world!' to the screen
        mov eax,4            ; 'write' system call
        mov ebx,1            ; file descriptor 1 = screen
        mov ecx,hello        ; string to write
        mov edx,helloLen     ; length of string to write
        int 80h              ; call the kernel

        ; Terminate program
        mov eax,1            ; 'exit' system call
        mov ebx,0            ; exit with error code 0
        int 80h              ; call the kernel

    Compiling: nasm -f elf hello.asm
    Linking:   ld -s -o hello hello.o

DOS

    DOSSEG
    .MODEL LARGE
    .STACK 200h

    .DATA
    hello       db 'Hello world!',10,13,'$'
    helloLen    db 14

    .CODE
    ASSUME CS:@CODE, DS:@DATA
    START:
        mov ax,@data
        mov ds,ax

        ; Write 'Hello world!' to the screen
        mov ah,09h           ; 'print' DOS service
        mov dx,offset hello  ; string to write
        int 21h              ; call DOS service

        ; Terminate program
        mov ah,4Ch           ; 'exit' DOS service
        mov al,0             ; exit with error code 0
        int 21h              ; call DOS service
    END START

    Compiling: tasm hello.asm
    Linking:   tlink hello.obj
Let's compare each part of the two programs:
After the string definition

    hello: db 'Hello world!',10

we can put on the next line

    helloLen: equ $-hello

This will make helloLen equal to (position at beginning of line) - (position of hello). If you look at those two lines in the program, you can see this will give us the length of 'Hello world!',10, which is 13 (12 characters plus the linefeed character).
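The same trick works for any number of strings, as long as each equ line directly follows the string it measures. Here is a minimal sketch of that; the labels greeting, farewell and their Len counterparts are made up for this example and do not come from the programs above.

```nasm
; Sketch: measuring two strings at assembly time with $ (labels are illustrative)
section .data
greeting:       db 'Hi!',10             ; 4 bytes: 3 characters plus the linefeed
greetingLen:    equ $-greeting          ; $ is the position right after greeting
farewell:       db 'Bye!',10            ; 5 bytes
farewellLen:    equ $-farewell          ; measured from farewell, not from greeting

section .text
global _start

_start:
    mov eax,4               ; 'write' system call
    mov ebx,1               ; file descriptor 1 = screen
    mov ecx,greeting        ; string to write
    mov edx,greetingLen     ; assembles as the constant 4
    int 80h                 ; call the kernel

    mov eax,4               ; eax was overwritten by the return value, so reload it
    mov ebx,1
    mov ecx,farewell
    mov edx,farewellLen     ; assembles as the constant 5
    int 80h

    mov eax,1               ; 'exit' system call
    mov ebx,0               ; exit with error code 0
    int 80h
```

If the equ for greeting were moved below farewell, $-greeting would count both strings, which is why the length line should stay right after its string.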
Six registers are used for the arguments that a system call takes. The first argument goes in EBX, the second in ECX, then EDX, ESI, EDI, and finally EBP, if there are that many. If there are more than six arguments, EBX must contain the memory location where the list of arguments is stored.
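As a rough illustration of that register order, the sketch below calls open, which takes three arguments (file name, flags, permissions), and then close, which takes one. The file name out.txt is a placeholder, and the octal flag value assumes the usual i386 Linux definitions of O_WRONLY, O_CREAT and O_TRUNC. On a 64-bit host it will probably need ld -m elf_i386 at the link step, since nasm -f elf produces a 32-bit object file.

```nasm
; Sketch: a three-argument syscall (open) followed by a one-argument syscall (close)
section .data
path:   db 'out.txt',0      ; placeholder file name, zero-terminated

section .text
global _start

_start:
    mov eax,5               ; 'open' system call
    mov ebx,path            ; 1st argument: address of the file name
    mov ecx,1101q           ; 2nd argument: O_WRONLY|O_CREAT|O_TRUNC (assumed i386 values)
    mov edx,644q            ; 3rd argument: permissions rw-r--r--
    int 80h                 ; call the kernel, result comes back in eax
    test eax,eax
    js done                 ; negative result means an error, skip the close

    mov ebx,eax             ; 1st argument of 'close': the file descriptor
    mov eax,6               ; 'close' system call
    int 80h                 ; eax is 0 if the close worked

done:
    mov ebx,eax             ; exit status: 0 on success, the error value otherwise
    mov eax,1               ; 'exit' system call
    int 80h
```

Running echo $? after the program shows the low byte of whatever ended up in EBX.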
All the syscalls are listed in /usr/include/asm/unistd.h, together with their numbers. However, for your convenience you can simply find them in this Linux System Call Table, together with some other useful information (e.g. what arguments they take). The syscalls are fully documented in section 2 of the manual pages, so you can just run man 2 write to find out what the write syscall does, what arguments it takes, etc.
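Once you have looked a number up, it can help to give it a name with equ so the code reads more like the manual page. This is only a sketch of one way to do it; the names SYS_WRITE, SYS_EXIT and STDOUT are this example's own choice, and the numbers are the 32-bit x86 ones used throughout this guide.

```nasm
; Sketch: naming syscall numbers with equ (names are illustrative)
SYS_EXIT    equ 1           ; number of the 'exit' syscall
SYS_WRITE   equ 4           ; number of the 'write' syscall
STDOUT      equ 1           ; file descriptor 1 = screen

section .data
msg:    db 'Named syscalls',10
msgLen: equ $-msg

section .text
global _start

_start:
    mov eax,SYS_WRITE       ; same as mov eax,4
    mov ebx,STDOUT          ; 1st argument: file descriptor
    mov ecx,msg             ; 2nd argument: address of the string
    mov edx,msgLen          ; 3rd argument: length of the string
    int 80h                 ; call the kernel

    mov eax,SYS_EXIT        ; same as mov eax,1
    mov ebx,0               ; exit with error code 0
    int 80h
```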
Linux

    section .data
    hello       db 'Hello, world!',10   ; Our dear string
    helloLen    equ $ - hello           ; Length of our dear string

    section .text
    global _start

    _start:
        pop ebx             ; argc (argument count)
        pop ebx             ; argv[0] (argument 0, the program name)
        pop ebx             ; The first real arg, a filename

        mov eax,8           ; The syscall number for creat() (we already have the filename in ebx)
        mov ecx,00644Q      ; Permissions in octal (rw-r--r--)
        int 80h             ; Call the kernel
                            ; Now we have a file descriptor in eax

        test eax,eax        ; Let's make sure the file descriptor is valid
        js skipWrite        ; If the file descriptor has the sign flag
                            ; (which means it's less than 0) there was an oops,
                            ; so skip the writing. Otherwise call the filewrite "procedure"
        call fileWrite

    skipWrite:
        mov ebx,eax         ; If there was an error, save the errno in ebx
        mov eax,1           ; Put the exit syscall number in eax
        int 80h             ; Bail out

    ; proc fileWrite - write a string to a file
    fileWrite:
        mov ebx,eax         ; sys_creat returned file descriptor into eax, now move into ebx
        mov eax,4           ; sys_write
                            ; ebx is already set up
        mov ecx,hello       ; We are putting the ADDRESS of hello in ecx
        mov edx,helloLen    ; This is the VALUE of helloLen because it's a constant (defined with equ)
        int 80h

        mov eax,6           ; sys_close (ebx already contains file descriptor)
        int 80h

        ret
    ; endp fileWrite
DOS

    DOSSEG
    .MODEL LARGE
    .STACK 200h

    .DATA
    filename    db 14 dup (0)
    filehandle  dw ?
    hello       db 'Hello World!',10,13,'$'
    helloLen    db 12

    .CODE
    ASSUME CS:@CODE, DS:@DATA
    START:
        mov AX,@DATA
        mov ES,AX                   ; Point ES to the data segment for now

        mov ah,62h
        int 21h                     ; Get the PSP
        mov ds,bx

        mov bx,81h                  ; Starting at the first printable character
        add bl, byte ptr [ds:80h]   ; Get address of last character
        mov cl, byte ptr [ds:80h]   ; Also put it in CL
        inc cl
        mov [ds:bx], word ptr 0     ; Null terminate the argument

        mov si,81h
        mov di,0                    ; Copy the first argument into the data segment
        rep movsb                   ; into the filename variable

        mov AX,@DATA
        mov DS,AX                   ; Point DS to the data segment, like normal

        call fileCreate
        call fileWrite
        call fileClose

        mov AX,4C00h
        int 21h                     ; Bye-bye!

    proc fileCreate
        mov ah,3Ch                  ; Creat DOS service (yes, it is called 'creat')
        mov cx,0                    ; File attributes
        mov dx,offset filename      ; Put ADDRESS of filename in DX
        int 21h
        mov [filehandle],ax         ; File handle is returned in AX, put in a variable
        ret
    endp fileCreate

    proc fileClose
        mov ah,3Eh
        mov bx,[filehandle]
        int 21h
        ret
    endp fileClose

    proc fileWrite
        mov ah, 40h
        mov bx, [filehandle]
        mov dx, offset hello        ; ADDRESS of string to be written
        xor cx, cx                  ; If I don't do this, things blow up in my face
        mov cl, [helloLen]          ; VALUE of length of string to be written
        int 21h
        ret
    endp fileWrite

    END START
As you can see, the Linux program is much simpler than the DOS one (40 lines in Linux, with liberal commenting, vs. 66 for DOS). Everything makes sense in the Linux program, whereas a lot of the stuff in the DOS one still makes me go "Huh?" Let's check out the differences:
To compile a program with NASM:

    nasm -f elf program.asm

To link the object file produced by NASM into an executable:

    ld -s -o program program.o

The -f elf option tells NASM to compile this in Linux ELF format. This option is necessary because NASM can compile many different formats (even DOS .COM files if you're so inclined).
Writing a useful program with NASM
The NASM documentation
Introduction to UNIX assembly programming
Linux Assembler Tutorial by Robin Miyagi
Section 2 of the manpages