
I need to make a program that outputs a text file with a .dna extension. I don't know if I can really do that, or whether the resulting file will be compatible with what I need to compare it against afterwards. I'm not really sure how to do this. I looked for NASM examples but didn't find much. I have an idea of what I'd need to do, but I don't know what to call to create a file.

Afterwards I'd need to write data into it, and I'm not sure how to go about that either. Could anyone point me to some examples? I just need to see what is required so I can write my own.

Jester
Argus
  • 32 or 64 bit? Also, are you willing to use the C library or just kernel system calls? – Adam D. Ruppe Nov 15 '14 at 22:32
  • 32 bit, and yes I can use libraries. – Argus Nov 15 '14 at 22:35
  • Then just link to libc and use the standard C functions. Also, it doesn't generally make much sense to use file I/O from assembly, maybe you can use C and just write some parts in asm. – Jester Nov 15 '14 at 22:41
  • I'm not even sure how to do that with assembly. I'm sorry, but my professor is a pencil pusher, we haven't seen anything. He just threw this project at us, and I'm pretty sure he doesn't even know how to do it himself. Besides, if I knew how to do that, do you really think I'd ask this? – Argus Nov 15 '14 at 22:45
  • I'm writing up a couple examples now to give you an idea... – Adam D. Ruppe Nov 15 '14 at 22:50

1 Answer


Here's an example using system calls. Basically, you just open the file, write some data to it, then close and exit:

; nasm -f elf file.asm
; ld -m elf_i386 file.o
BITS 32
section .data
        ; don't forget the 0 terminator if it takes a C string!
        filename: db 'test.txt', 0

        ; an error message to be printed with write(). The function doesn't
        ; use a C string so no need for a 0 here, but we do need length.
        error_message: db 'Something went wrong.', 10 ; 10 == \n
        ; this next line means current location minus the error_message location
        ; which works out the message length.
        ; many of the system calls use pointer+length pairs instead of
        ; 0 terminated strings.
        error_message_length: equ $ - error_message

        ; a message we'll write to our file, same as the error message
        hello: db 'Hello, file!', 10 ; the 10 is a newline at the end
        hello_length: equ $ - hello

        fd: dd 0 ; this is like a global int variable in C
        ; global variables are generally a bad idea and there's other
        ; ways to do it, but for simplicity I'm using one here as the
        ; other ways are a bit more work in asm
section .text
        global _start
_start:
        ; first, open or create the file. in C it would be:
        ; // $ man 2 creat
        ; int fd = creat("test.txt", 0644); // the second argument is permission

        ; we get the syscall numbers from /usr/include/asm/unistd_32.h
        mov eax, 8 ; creat
        mov ebx, filename ; first argument
        mov ecx, 644O ; the suffix O means Octal in nasm, like the leading 0 in C. see: http://www.nasm.us/doc/nasmdoc3.html
        int 80h ; calls the kernel

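        test eax, eax ; a raw system call returns a negative errno value on error (the C wrapper is what turns that into -1)
        js error ; so jump if the result is negative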

        mov [fd], eax ; the return value is in eax - the file descriptor

        ; now, we'll write something to the file
        ; // man 2 write
        ; write(fd, hello_pointer, hello_length)
        mov eax, 4 ; write
        mov ebx, [fd]
        mov ecx, hello
        mov edx, hello_length
        int 80h

        test eax, eax ; negative means error, otherwise eax is the number of bytes written
        ; a normal program should also close the file upon write error
        ; since it is open, but since we just terminate, the kernel
        ; will clean up after us
        js error

        ; and now we close the file
        ; // man 2 close
        ; close(fd);

        mov eax, 6 ; close
        mov ebx, [fd]
        int 80h

        ; and now close the program by calling exit(0);
        mov eax, 1 ; exit
        mov ebx, 0 ; return value
        int 80h
error:
        mov eax, 4 ; write
        mov ebx, 1 ; write to stdout - file #1
        mov ecx, error_message ; pointer to the string
        mov edx, error_message_length ; length of the string
        int 80h ; print it

        mov eax, 1 ; exit
        mov ebx, 1 ; return value
        int 80h

The executable will be called a.out if you copied my link command above. The -o option to ld changes that.

We can also call C functions, which helps if you need to write out things like numbers.

; nasm -f elf file.asm
; gcc -m32 file.o -nostdlib -lc # notice that we're using gcc to link, makes things a bit easier
; # the options are: -m32, 32 bit, -nostdlib, don't try to use the C lib cuz it will look for main()
; # and finally, -lc to add back some of the C standard library we want
BITS 32

; docs here: http://www.nasm.us/doc/nasmdoc6.html
; we declare the C functions as external symbols. on Linux (ELF) the names are used as-is;
; a leading underscore is only needed on some other platforms (e.g. 32 bit Windows).
extern fopen
extern fprintf
extern fclose

section .data
        ; don't forget the 0 terminator if it takes a C string!
        filename: db 'test.txt', 0

        filemode: db 'wt', 0 ; the mode for fopen in C

        format_string: db 'Hello with a number! %d is it.', 10, 0 ; new line and 0 terminator

        ; an error message to be printed with write(). The function doesn't
        ; use a C string so no need for a 0 here, but we do need length.
        error_message: db 'Something went wrong.', 10 ; 10 == \n
        ; this next line means current location minus the error_message location
        ; which works out the message length.
        ; many of the system calls use pointer+length pairs instead of
        ; 0 terminated strings.
        error_message_length: equ $ - error_message

        fp: dd 0 ; this is like a global FILE* variable in C (a 4 byte pointer)
        ; global variables are generally a bad idea and there's other
        ; ways to do it, but for simplicity I'm using one here as the
        ; other ways are a bit more work in asm
section .text
        global _start
_start:
        ; first, open or create the file. in C it would be:
        ; FILE* fp = fopen("test.txt", "wt");

        ; arguments for C functions are pushed on to the stack, right to left.
        push filemode ; "wt"
        push filename ; "test.txt"
        call fopen
        add esp, 8 ; we need to clean up our own stack. Since we pushed two four-byte items, we need to pop the 8 bytes back off. Alternatively, we could have called pop twice, but a single add instruction keeps our registers cleaner.

        ; the return value is in eax, store it in our fp variable after checking for errors
        ; in C: if(fp == NULL) goto error;
        cmp eax, 0 ; check for null
        je error
        mov [fp], eax

        ; call fprintf(fp, "format string with %d", 55);
        ; the 55 is just a random number to print

        mov eax, 55
        push eax ; all arguments are pushed, right to left. We want a 4 byte int equal to 55, so eax is it
        push format_string
        mov eax, [fp] ; again using eax as an intermediate to store our 4 bytes as we push to the stack
        push eax
        call fprintf
        add esp, 12 ; three 4-byte arguments this time to clean up

        ; fclose(fp);
        mov eax, [fp] ; again using eax as an intermediate to store our 4 bytes as we push to the stack
        push eax
        call fclose ; this also flushes any buffered data to the file
        add esp, 4 ; one argument to clean up

        ; the rest is unchanged from the above example

        ; and now close the program by calling exit(0);
        mov eax, 1 ; exit
        mov ebx, 0 ; return value
        int 80h
error:
        mov eax, 4 ; write
        mov ebx, 1 ; write to stdout - file #1
        mov ecx, error_message ; pointer to the string
        mov edx, error_message_length ; length of the string
        int 80h ; print it

        mov eax, 1 ; exit
        mov ebx, 1 ; return value
        int 80h

There's a lot more that can be done here, like a few techniques to eliminate those global variables, or better error checking, or even writing a C style main() in assembly. But this should get you started in writing out a text file. Tip: writing to a file works the same way as writing to the screen, you just need to open/create the file first!

BTW don't mix the system calls and the C library functions at the same time. The C library (fprintf etc) buffers data, the system calls don't. If you mix them, the data might end up written to the file in a surprising order.

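The code is similar, but slightly different in 64 bit: x86-64 Linux uses the syscall instruction instead of int 80h, passes the arguments in different registers (rdi, rsi, rdx, ...), and has different system call numbers.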

Finally, this same pattern can be used to translate almost any C code to asm - the C calling convention is the same with different functions, and the linux system call convention with the argument placement etc. follows a consistent pattern too.

Further reading: http://en.wikipedia.org/wiki/X86_calling_conventions#cdecl on the C calling convention

http://docs.cs.up.ac.za/programming/asm/derick_tut/syscalls.html on linux system calls

"What is the purpose of EBP in the following code?" is another SO answer I wrote up a while ago about local variables in asm. It has hints on one way to get rid of that global and describes how the C compiler does it. (The other way to get rid of that global is to keep the fd/fp in a register, pushing it onto the stack and popping it back when you need to free up the register for something else.)

Also see the man pages referenced in the code for each function. From your linux prompt, do things like man 2 write or man 3 fprintf to see more. (System calls are in manual section 2 and C functions are in manual section 3.)

Adam D. Ruppe