Passing 128 bit register to C function from Assembly

Question

I am attempting to test passing a floating point value to a C function from assembly on 64-bit Linux. The C file containing my C function looks like this:

#include <stdio.h>

extern void printer(double k){
  printf("%f\n",k);
}

Its expected behavior is to simply print the floating point number passed to it. I am trying to accomplish this from an AT&T-syntax assembly file. If I am not mistaken, in 64-bit Linux, the calling convention is to pass floating point arguments in the XMM registers. My .s file is the following:

.extern printer

        .data
var:
        .double 120.1
        .global main
main:
        movups (var),%xmm0
        call printer
        mov $60,%rax
        syscall

What I'm hoping this could do is have a variable (var) with value 120.1. This is then moved to the xmm0 register, which I expect is what is used to pass the argument k. This understanding of the calling convention is also backed up by the assembly code generated from the C file, a portion of which is below:

printer:
.LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        subq    $16, %rsp
        movsd   %xmm0, -8(%rbp)
        movq    -8(%rbp), %rax
        movq    %rax, -16(%rbp)
        movsd   -16(%rbp), %xmm0
        movl    $.LC0, %edi
        movl    $1, %eax
        call    printf
        leave
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc

My .s file assembles to an executable, but running it only gives a segmentation fault, and doesn't print the floating point value. I can only assume this is because I'm not properly moving the value to xmm0 and/or using the register to pass it to the function. Can somebody explain how I should pass the value to the function?

Likely an alignment issue. Part of the calling convention is that the stack must be 16-byte aligned (or 32-byte) at the point of a function call. The x86-64 C runtime library 16-byte aligns the stack before calling `main`, and the return address then adds 8 bytes. When main's code starts executing, the stack is therefore 8 bytes misaligned because the return address is on it. Pushing any register at the start of main realigns the stack to a 16-byte boundary and should do the trick. Since you are printing a double float, this is almost certainly your issue. – Michael Petch, Jan 30 '18 at 18:19
Alternative to pushing a register is to subtract 8 from RSP. — Michael Petch, Jan 30 '18 at 18:30
@MichaelPetch why would the stack be 8 bytes misaligned at the start of main's code? You state that it's because the return address adds 8 bytes, but which return address are you referring to? As this is before `printer` is called, shouldn't there not be a return address yet? It did work, though. — user2649681, Jan 30 '18 at 18:32
Just **Before** the _C_ runtime does `call main` it is 16-byte aligned. The CALL instruction itself then misaligns it by 8 because the return address is pushed on the stack as part of the CALL instruction itself. This means that when the actual execution of function `main` starts you are now 8 bytes misaligned. So yes, before the call it was aligned, but the CALL instruction misaligns the stack, and you need to realign it again before calling `printer`. The generated code (by the C compiler) for `printer` will do what it must to ensure the stack is still aligned before it actually calls printf — Michael Petch, Jan 30 '18 at 18:40
Also, you have a `double` (as you should) but then use `movups` which is a packed single (float). You should use `movsd` just as the C code does. — Jester, Jan 30 '18 at 18:47
The `p` is for `packed` meaning a vector. You only have one value, called a scalar, hence `s`. Loading a vector would work in this case, it would just access extra — Jester, Jan 30 '18 at 18:52
Wow, I'm surprised how much of a mess gcc's un-optimized asm output is for this function. I understand the spill/reload of `xmm0` because `-O0` has to make it possible for you to change `k` with a debugger when stopped at a breakpoint. But load / store into `rax` makes *no* sense. It seems to be copying it to a different temporary on the stack before loading it into the arg-passing register. The optimized version of that function is *much* easier to read, just 3 instructions: https://godbolt.org/g/Qnd6fT `movl $.LC0, %edi` ; ` movl $1, %eax` ; `jmp printf` — Peter Cordes, Jan 31 '18 at 04:55
And BTW, if you had written your code as `_start`, instead of `main`, it would have been executed with RSP aligned by 16, because the x86-64 System V ABI guarantees that. I spent about 5 minutes carefully reading comments after seeing MichaelPetch's comment before I realized you wrote a `main` instead of the ELF entry point >.< I assumed from using `sys_exit` directly that you weren't using the CRT startup code, or you could have just returned from `main`. (But you should still do that or `call exit`, to make sure the stdio buffer is flushed. Your program breaks if you pipe the output.) — Peter Cordes, Jan 31 '18 at 04:59
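
Putting the comments above together, a corrected caller might look like the following. This is a minimal sketch, not the asker's final code: it assumes the file is linked with the C file through `gcc` (so the CRT startup code calls `main`), the RIP-relative `var(%rip)` addressing is a choice made here so the file also links as a position-independent executable, and the file names in the build comment are hypothetical.

```asm
# Sketch only. Assumed build line with hypothetical file names:
#   gcc caller.s printer.c -o demo
        .extern printer

        .data
var:
        .double 120.1

        .text                       # code belongs in .text, not .data
        .globl main
main:
        subq    $8, %rsp            # on entry RSP is 8 off a 16-byte boundary; realign it
        movsd   var(%rip), %xmm0    # scalar double load into the first FP argument register
        call    printer
        addq    $8, %rsp            # undo the alignment adjustment
        xorl    %eax, %eax          # return value 0
        ret                         # returning from main lets libc flush stdio buffers
```

Returning from `main` instead of issuing the raw exit system call follows the last comment: the output is then flushed even when it is piped rather than written to a terminal.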

score 3 · Answer 1 · answered Jan 30 '18 at 18:20

You have defined main in the data section, which makes it non-executable. Add a .text directive before main.

prl

Although this is correct (that the code should be in `.text`), it won't be his issue. Since he didn't mark the top of his assembly file with `.section .note.GNU-stack,"",@progbits`, the linker will have made the `.data` section executable, since not all the object files were marked otherwise. – Michael Petch Jan 30 '18 at 18:52
His issue is entirely related to a misaligned stack. `printf` will likely be executing some SIMD instructions that require aligned accesses when dealing with floats and doubles. – Michael Petch Jan 30 '18 at 18:53
@MichaelPetch that's probably the loader not the linker, at least here `readelf` says `.data` is not executable but when running the page is indeed executable. – Jester Jan 30 '18 at 19:11
Of course how all the flags are set, how the sections are marked and declared are only a hint. It is up to the loader to decide how to interpret them. That may differ from one Linux to the next. – Michael Petch Jan 30 '18 at 19:32
I tried this code and `.data` was not executable **in the binary**. I think it's the loader that looks at that note and decides it should nevertheless mark it as executable contrary to the actual flags. See [fs/binfmt_elf.c:805](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/binfmt_elf.c#n805) – Jester Jan 30 '18 at 19:33
@Jester: I had removed the other comments (except for the first) just before your last comment. I happened to look in the binary file and it wasn't marked executable, but the GNU stack marking is queried by the loader. So yes, it seems this is left up to the loader. Ultimately though, I'll keep my first comment, including the small error, because this answer likely doesn't change the error the OP is seeing. It is very likely that the OP's code, despite being in the data section, was loaded into an executable page in memory. – Michael Petch Jan 30 '18 at 19:39
The linker marks the program header `STACK` as `rwx` if `.section .note.GNU-stack,"",@progbits` doesn't appear in **all** the object files it loads to build the executable. Otherwise it marks it as `rw-` (non-executable). The loader interprets that flag. This has two side effects: it determines whether the stack should be executable and whether to apply executability to the .data pages as well. – Michael Petch Jan 30 '18 at 19:52
Of course it doesn't look for literal `.section .note.GNU-stack,"",@progbits` . Just that there is a section by that name with progbits set. – Michael Petch Jan 30 '18 at 22:31
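
For illustration, the section layout discussed in this thread could be sketched as below. It is a stripped-down file that only shows where the directives go, based on the behavior described in the comments rather than on a statement of what every toolchain does; `main` here does nothing but return.

```asm
# Layout sketch: data in .data, code in .text, plus the stack note.
        .data
var:
        .double 120.1               # data stays in a writable, non-code section

        .text                       # code goes here so it is mapped executable
        .globl main
main:
        xorl    %eax, %eax          # return 0
        ret

        # Empty marker section: this object does not need an executable stack.
        .section .note.GNU-stack,"",@progbits
```

After linking, `readelf -lW` on the resulting executable lists the `GNU_STACK` program header and its flags, which is one way to check how the linker marked it.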
