There's this unanswered question in the Igor Zhirkov's book Low-Level Programming :
"Try to rewrite print_newline without calling print_char or copying its code. Hint: read about tail call optimization.".
I've spent some times reading about tail call optimization, but I can't figure out what would be the proper way of doing it.
Original code (without tail-call): print_newline => print_char => print_string=>string_length=>print_string
string_length:
    xor rax, rax
.loop:
    cmp byte [rdi+rax], 0   ; if rdi =  0 (not an address), then seg fault
    je .end 
    inc rax
    jmp .loop 
.end:
    ret
print_string:
    push rdi
    call string_length
    pop rsi
    mov rdx, rax
    mov rax, 1
    mov rdi, 1
    syscall
    ret
print_char:
    push rdi
    mov rdi, rsp 
    call print_string 
    pop rdi
    ret
print_newline:
    mov rdi, 10
    jmp print_char
What could be a tail-call optimization of that ?
 
    