Access assembly macro function / directives in a C file

Question

I wondered whether assembly directives like .directive, or macros like %macro my_macro, can be accessed from a C file.

file : macroasm.S

%macro my_macro 1
   mov rsp, 1
%endmacro

Is there any way to invoke the my_macro macro from a C file, compiling the two files with nasm and gcc?

First of all, use `.S` for GAS sources (which use different directives than NASM). Use `.asm` for NASM sources. gcc only knows how to emit GAS syntax, in either AT&T or Intel-syntax mode (`-masm=intel`), so you can only assemble its output with GAS. You can't mix in NASM syntax code or macros, even using a `.include` in inline asm. — Peter Cordes, Sep 07 '18 at 19:55
I once saw an assembly function being called from a C file, so I thought macros were accessible across assembly and C too. — mahmoud adel, Sep 07 '18 at 19:56
You can share preprocessor macros though. Stick them in a separate header file and make sure it compiles both as C and assembly. — Jester, Sep 07 '18 at 19:56
@mahmoudadel: You can of course call functions, because that's a *run-time* connection between code from different object files. A macro has to be compatible with GNU C inline-asm syntax, so it can't be NASM syntax. — Peter Cordes, Sep 07 '18 at 19:58
@PeterCordes So I cannot access assembly macros in C. Thanks, I will try to get the same job done in one C file with inline asm. — mahmoud adel, Sep 07 '18 at 20:01
You can use GAS's `.macro` if you want. (https://sourceware.org/binutils/docs/as/Macro.html). `asm(".include \"macro-defs.S\"");` at the top of a C file should work. — Peter Cordes, Sep 07 '18 at 20:02
@mahmoudadel Functions are compiled into machine code and are thus available from other modules if you make them global. Macros are resolved at preprocessing time and no longer exist in the machine code, so they are of course not available from other modules if you don't make arrangements. — fuz, Sep 07 '18 at 20:49

score 4 · Accepted Answer · answered Sep 07 '18 at 21:53

A macro is a compile-time substitution, unlike a runtime function call. asm and C are different languages, so the only way this question makes sense is for asm macros that you can use from inline-asm.

gcc's asm output has to be assembled by GAS or a compatible assembler that understands GAS directives. (https://sourceware.org/binutils/docs/as/). Inline asm lets you emit hand-written stuff directly into that asm compiler output, becoming part of one complete assembler source file that the compiler feeds to the assembler.

Using NASM syntax like %macro can't work in GNU C inline asm, because an assembler that can assemble regular gcc output won't understand NASM directives.

But you can use GAS .macro if you want. (https://sourceware.org/binutils/docs/as/Macro.html). I wouldn't recommend it; GAS macros aren't very nice to use. The syntax feels clunky compared to NASM. But since you asked, this is how you do it.

`asm(".include \"macro-defs.S\"");` at the top of a C file will let you use those macros from inline asm later in that compilation unit. (Assuming gcc doesn't reorder things in the output asm.)

But of course you have to know what the macro does to be able to write correct constraints for the inline-asm statements, so it's really not super-useful.
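
As a minimal sketch of what those constraints look like (the macro `acc_add3` and the function `add_triple` are made up for illustration, they are not from the question): here the macro is defined in a top-level asm statement rather than in an `.include`d file, and it uses RAX as a scratch register, so the inline-asm statement has to declare `"rax"` and `"cc"` as clobbers. This assumes GCC with GNU as on x86-64; any other toolchain takes the plain C fallback so the file still builds.

```c
#if defined(__GNUC__) && !defined(__clang__) && defined(__x86_64__)
/* Top-level (basic) asm: defines a GAS macro for the rest of this
 * translation unit.  '%' is literal in basic asm, and "\\" in the C
 * string becomes the single '\' that GAS expects before a macro argument.
 * __asm__ is used instead of asm so this also builds with -std=c11. */
__asm__(".macro acc_add3 dst, src\n\t"
        "lea (\\src,\\src,2), %rax\n\t"   /* rax = 3*src: RAX is scratch */
        "add %rax, \\dst\n\t"             /* dst += rax, modifies flags  */
        ".endm");

/* Hypothetical wrapper: returns acc + 3*x using the macro above. */
static long add_triple(long acc, long x)
{
    __asm__("acc_add3 %0, %1"
            : "+r"(acc)        /* read-write register operand           */
            : "r"(x)           /* input register operand                */
            : "rax", "cc");    /* the macro overwrites RAX and flags    */
    return acc;
}
#else
/* Fallback for other compilers or targets: same result in plain C. */
static long add_triple(long acc, long x) { return acc + 3 * x; }
#endif
```

Calling `add_triple(10, 4)` should give 22. Leaving out the `"rax"` clobber would let the compiler pick RAX for one of the operands, and the macro would then silently overwrite it, which is exactly why you need to know what a macro does before wrapping it.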

Example

macro-defs.S (GAS syntax, not NASM). Maybe I should have called it .s, because we only .include it with asm directives, not #include with the C preprocessor. (That would be problematic for C: you can't #include something inside a double-quoted string.) So anyway, we can't use CPP macros here, only asm macros.

    #.altmacro  # needed for some things, makes other things harder
    # https://stackoverflow.com/questions/19776992/gas-altmacro-macro-with-a-percent-sign-in-a-default-parameter-fails-with-oper

# clobbers RDX and RAX
.macro fenced_rdtsc64  dst
    lfence                  # make sure earlier stuff is done
    rdtsc
    lfence                  # don't allow later stuff to start before time is read
    shl $32, %rdx           # allow OoO exec of these with the timed interval
    lea (%rax, %rdx), \dst
.endm

# repeats  pause  n times.  Probably not useful, just a silly example.
# for exponential backoff in a spinloop, you want a *runtime* repeat count.
.macro  pause_n  count
    pause                      # the machine instruction, not a macro
   .if     \count-1
      pause_n  "(\count-1)"    # recursion stands in for NASM %rep (GAS also has .rept)
   .endif
.endm
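
For a fixed repeat count, GAS's `.rept` / `.endr` directives are an alternative to the recursive macro. A hedged sketch, using a made-up `add1_n` macro defined from a top-level asm statement in C so that the effect is observable (all names here are hypothetical; this assumes GCC with GNU as on x86-64, with a plain C fallback elsewhere):

```c
#if defined(__GNUC__) && !defined(__clang__) && defined(__x86_64__)
/* Defines a GAS macro that repeats "add $1, dst" \count times.
 * .rept needs an assemble-time constant, just like the recursive version. */
__asm__(".macro add1_n dst, count\n\t"
        ".rept \\count\n\t"
        "add $1, \\dst\n\t"
        ".endr\n\t"
        ".endm");

/* Hypothetical wrapper: expands to four add instructions on v's register. */
static long bump4(long v)
{
    __asm__("add1_n %0, 4"
            : "+r"(v)          /* read-write register operand */
            :
            : "cc");           /* add modifies the flags      */
    return v;
}
#else
/* Fallback for other compilers or targets: same result in plain C. */
static long bump4(long v) { return v + 4; }
#endif
```

`bump4(10)` should return 14. Swapping the `add` line for `pause` would give the same expansion as `pause_n 4`.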

These macros are usable from foo.S:

.include "macro-defs.S"

 # inefficient: the subtraction really only needs to use the low 32 bits of the count
 # so using a macro that merges the high half is a waste    
.globl foo
foo:
    fenced_rdtsc64  %rcx   # start
    pause_n         4
    fenced_rdtsc64  %rax   # end
    sub             %rcx, %rax
    ret

And via inline-asm from main.c (which also calls foo() the normal way).

#include <stdio.h>

asm(".include \"macro-defs.S\"");

long long foo(void);
int main(void) {
    long long start, end;
    asm volatile("fenced_rdtsc64 %[dst]"
             : [dst]"=r" (start)
             :
             : "rax", "rdx" // forces it to avoid these as output regs, unfortunately
        );

    printf("foo rdtsc ticks: call1 %lld  call2 %lld\n", foo(), foo());

    asm volatile("fenced_rdtsc64 %[dst]"
             : [dst]"=r" (end)
             :
             : "rax", "rdx");
    printf("printf rdtsc ticks: %lld\n", end-start);
}

Compile with gcc -O3 -Wall main.c foo.S (I used gcc7.3, with -fpie being the default).

Running it with for i in {1..50};do ./a.out;done gives output like this (on my i7-6700k, where pause takes ~100 core clock cycles, and hardware P-states ramp up the speed quickly when there's load):

... (variable number of lines before the frequency shift)
foo rdtsc ticks: call1 3006  call2 3014
printf rdtsc ticks: 727810
foo rdtsc ticks: call1 3006  call2 3022
printf rdtsc ticks: 707376
foo rdtsc ticks: call1 3006  call2 3017
printf rdtsc ticks: 746375
foo rdtsc ticks: call1 3006  call2 3029
printf rdtsc ticks: 684239
foo rdtsc ticks: call1 3006  call2 3010
printf rdtsc ticks: 652724
foo rdtsc ticks: call1 616  call2 620    # gcc chose to evaluate from right to left
printf rdtsc ticks: 133282
foo rdtsc ticks: call1 618  call2 618    # so call1 is with it hot in uop cache
printf rdtsc ticks: 133984
foo rdtsc ticks: call1 616  call2 618
printf rdtsc ticks: 133284
foo rdtsc ticks: call1 614  call2 618
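
The right-to-left order noted above is allowed because C leaves the evaluation order of function arguments unspecified. If the order matters, a sketch like this (with a hypothetical `next_call()` standing in for `foo()`) sequences the calls through separate statements:

```c
/* Hypothetical stand-in for foo(): returns 1, 2, 3, ... on successive calls. */
static int calls;
static int next_call(void) { return ++calls; }

/* Each declaration is a full statement, so the first call is guaranteed
 * to finish before the second one starts. */
static int ordered_difference(void)
{
    int first  = next_call();
    int second = next_call();
    return second - first;
}
```

Writing `next_call() - next_call()` directly, or passing both calls as arguments to one `printf`, would leave the compiler free to run them in either order.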

The asm for foo, if we disassemble (with objdump -drwC -Mintel a.out) to see how the macro expanded:

# I maybe should have used AT&T syntax disassembly like the source
# You can do that if you want, on your own desktop, leaving out -Mintel
00000000000006ba <foo>:
 6ba:   0f ae e8                lfence 
 6bd:   0f 31                   rdtsc  
 6bf:   0f ae e8                lfence 
 6c2:   48 c1 e2 20             shl    rdx,0x20
 6c6:   48 8d 0c 10             lea    rcx,[rax+rdx*1]   # macro expanded with RCX
 6ca:   f3 90                   pause      # pause_n   4  expanded to 4 pause instructions
 6cc:   f3 90                   pause  
 6ce:   f3 90                   pause  
 6d0:   f3 90                   pause  
 6d2:   0f ae e8                lfence 
 6d5:   0f 31                   rdtsc  
 6d7:   0f ae e8                lfence 
 6da:   48 c1 e2 20             shl    rdx,0x20
 6de:   48 8d 04 10             lea    rax,[rax+rdx*1]   # macro expanded with RAX
 6e2:   48 29 c8                sub    rax,rcx
 6e5:   c3                      ret

The compiler-generated asm (including our inline asm) is:

0000000000000540 <main>:
 540:   55                      push   rbp
 541:   53                      push   rbx
 542:   48 83 ec 08             sub    rsp,0x8

 546:   0f ae e8                lfence             # first inline asm
 549:   0f 31                   rdtsc  
 54b:   0f ae e8                lfence 
 54e:   48 c1 e2 20             shl    rdx,0x20
 552:   48 8d 1c 10             lea    rbx,[rax+rdx*1]    # The compiler picked RBX for the output operand
                                                 # and substituted fenced_rdtsc64 %rbx into the asm template


 556:   e8 5f 01 00 00          call   6ba <foo>
 55b:   48 89 c5                mov    rbp,rax    # save the return value, not a macro so it couldn't ask for a more convenient register
 55e:   e8 57 01 00 00          call   6ba <foo>
 563:   48 89 ea                mov    rdx,rbp
 566:   48 8d 3d 0b 02 00 00    lea    rdi,[rip+0x20b]        # 778 <_IO_stdin_used+0x8>  # the string literal
 56d:   48 89 c6                mov    rsi,rax
 570:   31 c0                   xor    eax,eax
 572:   e8 b9 ff ff ff          call   530 <printf@plt>

 577:   0f ae e8                lfence         # 2nd inline asm
 57a:   0f 31                   rdtsc  
 57c:   0f ae e8                lfence 
 57f:   48 c1 e2 20             shl    rdx,0x20
 583:   48 8d 34 10             lea    rsi,[rax+rdx*1]   # compiler picked RSI this time
 587:   48 8d 3d 1a 02 00 00    lea    rdi,[rip+0x21a]        # 7a8 <_IO_stdin_used+0x38>
 58e:   48 29 de                sub    rsi,rbx           # where it wanted it as the 2nd arg to printf(.., end-start)
 591:   31 c0                   xor    eax,eax
 593:   e8 98 ff ff ff          call   530 <printf@plt>
 598:   48 83 c4 08             add    rsp,0x8
 59c:   31 c0                   xor    eax,eax
 59e:   5b                      pop    rbx
 59f:   5d                      pop    rbp
 5a0:   c3                      ret
