I've asked a few questions that touched on this issue, but the responses have differed, so I thought it best to ask directly.
Let's say we have the following code:
#include <utility> // std::move

// Silly examples of A and B, don't take them so seriously,
// just keep in mind they're big and not dynamically allocated.
struct A { int x[1000]; A() { for (int i = 0; i != 1000; ++i) { x[i] = i * 2; } } };
struct B { int y[1000]; B() { for (int i = 0; i != 1000; ++i) { y[i] = i * 3; } } };
struct C
{
    A a;
    B b;
};
A create_a() { return A(); }
B create_b() { return B(); }
C create_c(A&& a, B&& b)
{
    C c;
    c.a = std::move(a); // for a plain int array, "moving" still copies every element
    c.b = std::move(b);
    return c;
}
int main()
{
    C x = create_c(create_a(), create_b());
}
Now ideally create_c(A&&, B&&) should be a no-op. Instead of the calling convention being that A and B are created as temporaries and references to them are passed on the stack, A and B should be created and passed in by value, in the place of the return value, c. With NRVO, this would mean creating them directly inside x, with no further work for create_c to do.
This would avoid the need to create copies of A and B.
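To make the cost concrete, here is a minimal sketch that counts the work the code above does. Big, Counts and run_experiment are hypothetical names of mine: Big stands in for both A and B, and it declares only copy operations, which I'm assuming is a fair substitute because moving an int array copies every element anyway.

```cpp
#include <cassert>
#include <utility>

// Hypothetical counters, not part of the original code.
struct Counts { int constructed; int copied; int assigned; };
Counts counts = {0, 0, 0};

// Hypothetical instrumented stand-in for A and B: big, not dynamically allocated.
struct Big {
    int x[1000];
    Big() { ++counts.constructed; for (int i = 0; i != 1000; ++i) { x[i] = i * 2; } }
    Big(const Big& o) { ++counts.copied; for (int i = 0; i != 1000; ++i) { x[i] = o.x[i]; } }
    Big& operator=(const Big& o) {
        ++counts.assigned;
        for (int i = 0; i != 1000; ++i) { x[i] = o.x[i]; }
        return *this;
    }
};

struct C { Big a; Big b; };

Big create_big() { return Big(); }

C create_c(Big&& a, Big&& b) {
    C c;                 // default-constructs c.a and c.b
    c.a = std::move(a);  // Big has no move assignment, so this is a full copy assignment
    c.b = std::move(b);
    return c;
}

// Runs the same call as main() in the question and reports the counters.
Counts run_experiment() {
    counts = Counts{0, 0, 0};
    C x = create_c(create_big(), create_big());
    assert(x.a.x[1] == 2 && x.b.x[999] == 1998); // sanity check on the contents
    return counts;
}
```

Calling run_experiment() from main should report four default constructions (two temporaries plus the two members of c) and two assignments, regardless of optimization level. The copied counter depends on whether the compiler elides the return copies, so I wouldn't rely on its value.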
Is there any way to allow/encourage/force this behavior from a compiler, or do optimizing compilers generally do this anyway? And will this only work when the compiler inlines the functions, or will it also work across compilation units?
(How I think this could work across compilation units...)
If create_a() and create_b() took a hidden parameter of where to place the return value, they could place the results into x directly, which is then passed by reference to create_c() which needs to do nothing and immediately returns.
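For comparison, here is a sketch of a variant where I would expect this in-place construction to happen. It assumes C++17, where (as far as I understand it) copy elision is guaranteed when an object is initialized from a prvalue. Note that it changes create_c to take no reference parameters and to build C by aggregate initialization, so it is not the same interface as above. The last_a_address and last_b_address probes and built_in_place are hypothetical names added only to observe where the objects get constructed.

```cpp
#include <cassert>

// Hypothetical probes: each constructor records where its object was built.
const void* last_a_address = nullptr;
const void* last_b_address = nullptr;

struct A {
    int x[1000];
    A() { last_a_address = this; for (int i = 0; i != 1000; ++i) { x[i] = i * 2; } }
};
struct B {
    int y[1000];
    B() { last_b_address = this; for (int i = 0; i != 1000; ++i) { y[i] = i * 3; } }
};
struct C
{
    A a;
    B b;
};

A create_a() { return A(); }
B create_b() { return B(); }

// Variant of create_c: C is an aggregate, and each member is initialized
// directly from the prvalue returned by create_a() / create_b().
C create_c() { return C{create_a(), create_b()}; }

// Returns true if A and B were constructed directly inside x.a and x.b.
bool built_in_place() {
    C x = create_c();
    return last_a_address == &x.a && last_b_address == &x.b;
}
```

If my reading of C++17 is right, built_in_place() returns true without any inlining, because the caller passes the address of x as the hidden return slot all the way down. Whether the same can be had for the original create_c(A&&, B&&) signature is exactly what I'm asking.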