Do rvalue references have the same overhead as lvalue references?

Question

Consider this example:

#include <utility>

// runtime dominated by argument passing
template <class T>
void foo(T t) {}

int main() {
    int i(0);
    foo<int>(i); // fast -- int is scalar type
    foo<int&>(i); // slow -- lvalue reference overhead
    foo<int&&>(std::move(i)); // ???
}

Is foo<int&&>(std::move(i)) as fast as foo<int>(i), or does it involve pointer overhead like foo<int&>(i)?

EDIT: As suggested, running g++ -S gave me the same 51-line assembly file for foo<int>(i) and foo<int&>(i), but foo<int&&>(std::move(i)) resulted in 71 lines of assembly code (it looks like the difference came from std::move).

EDIT: Thanks to those who recommended g++ -S with different optimization levels -- using -O3 (and making foo noinline) I was able to get output which looks like xaxxon's solution.

Well, semantically a double copy is possible in the case of an rvalue reference, but in real cases I would expect the compiler to use pointers. After all, code with real rvalues (not ones made by std::move), especially big ones such as a std::vector constructed on the fly, would be better off with pass-by-non-const-pointer — Severin Pappadeux, Aug 14 '18 at 03:17
"running g++ -S gave me the same 51-line assembly" - try it with different optimization levels (`-O1` vs `-O2` vs `-O3` vs `-Os`). — Jesper Juhl, Aug 14 '18 at 05:21
I second @JesperJuhl: `-S` is virtually never meaningful without `-Os`, and especially not with `-O0` or `-O3`. Only `-S -Os` produces near-readable assembler code that shows what's actually going on. That said, your template `foo<>` is not even *trying* to actually use its parameter. The optimizer will throw out what you try to look at. For proper analysis, define three non-template functions like `int foo_noref(int arg) { return arg; }` **in a separate file** and compile with `-S -Os`. Then do the same for the calls `void bar_noref() { int i = 0; foo_noref(i); }`. — cmaster - reinstate monica, Aug 14 '18 at 11:42
Yes it adds the pointer, so all uses of the referred object will involve a pointer indirection. On the other hand, if the object you are referencing was bigger, then passing by value could invoke all the cost of making a copy, even if you then only accessed 1 member within the function. Pass the object on to another method by value and you make another copy. Sometimes this is what you want to do, but generally it is good policy to pass simple values by value and larger objects by const ref. — Gem Taylor, Aug 14 '18 at 15:10
@GemTaylor: Well put. I've been looking for some template metaprogramming tool to pass simple values by value and larger objects by const ref for arbitrary types, part of why I brought this question up. — Taylor Nichols, Aug 15 '18 at 19:43
@TaylorNichols When it comes to _TMP_ everything is inlined, and the compiler optimiser should reduce most reference parameters back to the original declaration, so it shouldn't matter whether you use const references or value copies. — Gem Taylor, Aug 15 '18 at 19:48
If it does make a difference, you can always add conditional SFINAE and have 2 versions, but I suspect mainly it won't make much difference. `is_integral` will be your friend here. — Gem Taylor, Aug 15 '18 at 19:51
@GemTaylor: I'm mainly thinking about non-temporary variables, such as class members, which won't get inlined. Also if functions take n parameters I'd have 2^n versions so I'm still considering the cleanest implementation. — Taylor Nichols, Aug 15 '18 at 19:51
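
A rough sketch of the "simple values by value, larger objects by const ref" idea from the comments above. The names param_t and payload_bytes and the two-pointer size threshold are my own assumptions for illustration, not something from the thread; Boost's call_traits<T>::param_type is built on a similar idea. A single alias picks the parameter type, so a function with n parameters needs one declaration rather than 2^n overloads.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical helper: T itself for small, trivially copyable types,
// const T& for everything else. The size threshold is an arbitrary choice.
template <class T>
using param_t = typename std::conditional<
    std::is_trivially_copyable<T>::value && sizeof(T) <= 2 * sizeof(void*),
    T,
    const T&>::type;

// Scalars resolve to by-value, std::string resolves to const reference.
static_assert(std::is_same<param_t<int>, int>::value,
              "int is passed by value");
static_assert(std::is_same<param_t<std::string>, const std::string&>::value,
              "std::string is passed by const reference");

// Hypothetical function using the alias for both parameters.
// param_t<A> is a non-deduced context, so template arguments must be
// spelled out at the call site: payload_bytes<int, std::string>(1, s).
template <class A, class B>
std::size_t payload_bytes(param_t<A> count, param_t<B> text) {
    return sizeof(count) + text.size();
}
```

The cost is that callers lose template argument deduction, which may or may not be acceptable depending on how the functions are used.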

xaxxon · Accepted Answer · 2018-08-14T03:43:23.627

In your specific situation, it's likely they are all the same. The resulting code from godbolt with gcc -O3 is https://godbolt.org/g/XQJ3Z4 for:

#include <utility>

// runtime dominated by argument passing
template <class T>
int foo(T t) { return t;}

int main() {
    int i{0};
    volatile int j;
    j = foo<int>(i); // fast -- int is scalar type
    j = foo<int&>(i); // slow -- lvalue reference overhead
    j = foo<int&&>(std::move(i)); // ???
}

is:

    mov     dword ptr [rsp - 4], 0 // foo<int>(i);
    mov     dword ptr [rsp - 4], 0 // foo<int&>(i);
    mov     dword ptr [rsp - 4], 0 // foo<int&&>(std::move(i)); 
    xor     eax, eax
    ret

The volatile int j is so that the compiler cannot optimize away all the code because it would otherwise know that the results of the calls are discarded and the whole program would optimize to nothing.

HOWEVER, if you force the function not to be inlined, things change a bit. With int __attribute__ ((noinline)) foo(T t) { return t;} the three instantiations compile to:

int foo<int>(int):                           # @int foo<int>(int)
        mov     eax, edi
        ret
int foo<int&>(int&):                          # @int foo<int&>(int&)
        mov     eax, dword ptr [rdi]
        ret
int foo<int&&>(int&&):                          # @int foo<int&&>(int&&)
        mov     eax, dword ptr [rdi]
        ret

above: https://godbolt.org/g/pbZ1BT
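
In the listing above, foo<int> returns the value straight from edi, while both reference versions load it through the pointer held in rdi. For anyone who wants to reproduce this, here is a minimal sketch of the full non-inlined variant. It assumes GCC or Clang (the attribute spelling is compiler-specific), and call_all is a name I made up to hold the three calls.

```cpp
#include <utility>

// noinline is a GCC/Clang attribute; other compilers spell it differently.
template <class T>
int __attribute__((noinline)) foo(T t) { return t; }

// Hypothetical wrapper around the three calls from the answer.
// The volatile sink keeps the results observable to the optimizer.
int call_all() {
    int i{0};
    volatile int j;
    j = foo<int>(i);               // argument travels in a register
    j = foo<int&>(i);              // address of i is passed
    j = foo<int&&>(std::move(i));  // address of i is passed as well
    return j;
}
```

Compiling this with -O3 -S should show three separate foo instantiations rather than three folded stores.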

For questions like these, learn to love https://godbolt.org and https://quick-bench.com/ (quick bench requires you to learn how to properly use Google Benchmark)

I like the `volatile int` trick. Theoretically, could the compiler also optimize away the calls to `foo(i)` because it knows the input is discarded? — Taylor Nichols, Aug 14 '18 at 03:28
@TaylorNichols without the volatile, the whole program is optimized to nothing: https://godbolt.org/g/e3n6BA. With volatile, it means the compiler doesn't know that something doesn't happen between each assignment (something that's not present in the code), so it has to actually do "the right thing" which means setting a value "as if" it had called the function. — xaxxon, Aug 14 '18 at 03:31
That makes sense, so `foo` never even gets called and we just get `j = i;` in each case, hence the `mov` statements. I suppose my question would be more relevant if `foo` was actually called. — Taylor Nichols, Aug 14 '18 at 03:39
well, then it looks like ref vs no-ref are a little different: https://godbolt.org/g/pbZ1BT — xaxxon, Aug 14 '18 at 03:42

score 5 · Answer 2 · edited Jun 20 '20 at 09:12

Efficiency of parameter passing depends on the ABI.

For example, on Linux, the Itanium C++ ABI specifies that references are passed as pointers to the referred-to object:

3.1.2 Reference Parameters

Reference parameters are handled by passing a pointer to the actual parameter.

This is independent of the reference category (rvalue/lvalue reference).

For a broader view, a document on calling conventions from the Technical University of Denmark, which analyzes most compilers, contains this quote:

References are treated as identical to pointers in all respects.

So both rvalue and lvalue references involve pointer overhead on all of these ABIs.
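
As a small illustration of why a pointer is needed at all (a sketch, with function names invented for the example): both kinds of reference parameter refer to the caller's own object, so a non-inlined callee has to be told where that object lives. This checks the language-level aliasing only; it does not inspect the ABI itself.

```cpp
#include <utility>

// Both helpers return the address of the object their parameter refers to.
const int* address_via_lvalue_ref(int& r) { return &r; }
const int* address_via_rvalue_ref(int&& r) { return &r; }  // r is a named lvalue here

// Returns true when both reference kinds alias the caller's variable.
bool references_alias_caller_object() {
    int i = 0;
    const int* a = address_via_lvalue_ref(i);
    const int* b = address_via_rvalue_ref(std::move(i));  // std::move is only a cast
    return a == &i && b == &i;
}
```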

answered Aug 14 '18 at 10:43

Oliv

Thanks -- I was hoping to find some official docs on this. I suppose rvalue references would have to use something like pointers, as moving from an object often involves resetting its variables from a different scope. – Taylor Nichols Aug 15 '18 at 19:49