I wrote a printint function in 64-bit NASM that prints an integer to STDOUT. It's really slow though, and after doing some benchmarks I determined that converting an integer to a string is the slowest part by far.
My current strategy for converting ints to strings goes like this:
1. If the number is 0, skip to a special case.
2. If the number is negative, print a negative sign, negate it to get a positive integer, and continue.
3. Create a 10-byte buffer (enough to hold the digits of any 32-bit integer) and a pointer to the last byte of the buffer.
4. Check if the number is 0; if it is, we're done.
5. Divide the number by 10 and convert the remainder into ASCII.
6. Put the remainder into the buffer (from back to front).
7. Decrement the buffer pointer.
8. Loop back to step 4.
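To make the strategy above concrete, here is a minimal reference sketch of the same steps, written in C rather than NASM so the logic is easy to check against the assembly. It is only a model, not my actual routine: the name int_to_str, the caller-supplied out buffer (assumed to hold at least 12 bytes) and the NUL terminator are illustrative assumptions, and nothing from the C standard library is called.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the NASM code): writes the decimal
 * form of num into out and returns the number of characters written,
 * not counting the terminating NUL. out is assumed to hold at least
 * 12 bytes: sign + 10 digits + NUL. */
static size_t int_to_str(int32_t num, char *out)
{
    char buf[10];                    /* 10 digits cover any 32-bit value */
    char *end = buf + sizeof buf;    /* one past the last slot */
    char *p = end;                   /* digits are filled back to front */
    size_t len = 0;
    uint32_t n;

    if (num == 0) {                  /* zero is the special case */
        out[0] = '0';
        out[1] = '\0';
        return 1;
    }

    if (num < 0) {
        out[len++] = '-';
        /* Negate in unsigned arithmetic so INT32_MIN is handled too. */
        n = 0u - (uint32_t)num;
    } else {
        n = (uint32_t)num;
    }

    while (n != 0) {                 /* one division per digit */
        *--p = (char)('0' + n % 10); /* remainder -> ASCII digit */
        n /= 10;
    }

    while (p != end)                 /* copy digits out in reading order */
        out[len++] = *p++;
    out[len] = '\0';
    return len;
}
```

The loop performs one division by 10 per digit, so a 32-bit value needs at most 10 divisions.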
I've searched for how other people do this, and their approaches are more or less the same as mine: divide by 10 until the number is 0.
Here's the relevant code:
printint:                       ; num in edi
    push rbp                    ; save base pointer
    mov rbp, rsp                ; set base pointer to the current stack pointer
    sub rsp, 20                 ; reserve 20 bytes of stack space for the buffer
    cmp edi, 0                  ; compare num to 0
    je _printint_zero           ; 0 is special case
    jg _printint_pos            ; flags still set by the cmp above; don't print negative sign if positive
    ; print a negative sign (code not relevant)
    xor edi, -1                 ; convert into positive integer
    add edi, 1
_printint_pos:
    mov rbx, rsp                ; set rbx to point to the end of the buffer
    add rbx, 17
    mov qword [rsp+8], 0        ; clear the buffer
    mov word [rsp+16], 0        ; 10 bytes from [8,18)
_printint_loop:
    cmp edi, 0                  ; compare edi to 0
    je _printint_done           ; if edi == 0 then we are done
    xor edx, edx                ; prepare eax and edx for division
    mov eax, edi
    mov ecx, 10
    div ecx                     ; unsigned divide edx:eax by 10: eax = quotient, edx = remainder
    mov edi, eax                ; move quotient back to edi
    add dl, 48                  ; convert remainder to ascii
    mov byte [rbx], dl          ; move remainder to buffer
    dec rbx                     ; shift 1 position to the left in buffer
    jmp _printint_loop
_printint_done:
    ; print the buffer (code not relevant)
    mov rsp, rbp                ; restore stack and base pointers
    pop rbp
    ret
How can I optimize it so that it can run much faster? Alternatively, is there a significantly better method to convert an integer to a string?
I do not want to use printf or any other function in the C standard library.