I am new to assembly, and want to first try to get an intuitive feel for how printing a string to the terminal would work, without going through the operating system abstraction (Linux or OSX).
(Editor's note: the accepted answer only covers Linux. x86-64 MacOS uses a similar system-calling convention but different call numbers.)
tl;dr How do you write to stdout (print to the terminal) in x86-64 assembly with NASM on OSX, at the lowest level possible (i.e. without syscall)? How is BareMetal OS doing this?
Most examples show something like this:
global start
section .text
start:
  mov rax, 1
  mov rdi, 1
  mov rsi, message
  mov rdx, 13
  syscall
  mov eax, 60
  xor rdi, rdi
  syscall
message:
  db "Hello world", 10
In there, they are using syscall to print the string, which is relying on the operating system.  I am not looking for that, but for how to write a string to stdout directly, at the lowest level possible.
There is this exokernel project, BareMetal OS that I think is doing this. Though since I am new to assembly, I don't know enough yet to figure out how they accomplish this. From what it seems though, the two important files are:
It seems the relevant code to print is this (extracted from those two files):
;
; Display text in terminal.
;
;  IN:  RSI = message location (zero-terminated string)
; OUT:  All registers preserved
;
os_output:
  push rcx
  call os_string_length
  call os_output_chars
  
  pop rcx
  ret
; 
; Displays text.
;
;  IN:  RSI = message location (an ASCII string, not zero-terminated)
; RCX = number of chars to print
; OUT:  All registers preserved
;
os_output_chars:
  push rdi
  push rsi
  push rcx
  push rax
  cld ; Clear the direction flag.. we want to increment through the string
  mov ah, 0x07 ; Store the attribute into AH so STOSW can be used later on
;
; Return length of a string.
;
;  IN:  RSI = string location
; OUT:  RCX = length (not including the NULL terminator)
;
; All other registers preserved
;
os_string_length:
  push rdi
  push rax
  xor ecx, ecx
  xor eax, eax
  mov rdi, rsi
  not rcx
  cld
  repne scasb ; compare byte at RDI to value in AL
  not rcx
  dec rcx
  pop rax
  pop rdi
  ret
But that doesn't look complete to me (though I wouldn't know yet since I'm new).
So my question is, along the lines of that BareMetal OS snippet, how do you write to stdout (print to the terminal) in x86-64 assembly with NASM on OSX?
 
     
    