I have already searched Google and Stack Overflow, and I am aware that a compiler generally cannot assume that a function leaves an argument passed by const reference unmodified, because the function might obtain a non-const reference via const_cast. However, doing so is undefined behavior when the referenced object itself was defined as const. From cppreference:
Modifying a const object through a non-const access path and referring to a volatile object through a non-volatile glvalue results in undefined behavior.
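To make the distinction concrete, here is a minimal sketch. The names bump and well_defined are hypothetical and used only for illustration; they are not part of the code I am asking about.

```cpp
#include <cassert>

// Hypothetical callee: casts away constness and writes through the reference.
void bump(const int& r) {
    const_cast<int&>(r) += 1;
}

int well_defined() {
    int x = 3;          // x itself is NOT defined as const
    bump(x);            // fine: the referenced object is mutable
    assert(x == 4);     // the write through const_cast is visible
    return x;
}

// The same call becomes undefined behavior once the object is const:
//
//     const int y = 3;
//     bump(y);         // UB: modifies an object defined as const
//
// This is why a compiler may assume y is still 3 after the call.
```

The only difference between the two cases is how the object was defined, not how it was passed.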
Consider the following code:
void fun(const int&);

int f1() {
    const int i = 3;
    fun(i);
    return i;
}

static int bar(const int i) {
    fun(i);
    return i;
}

int f2() {
    return bar(3);
}
Both GCC and Clang optimize f1() to return 3 directly: the compilers assume that calling fun(i) does not modify i, since doing so would be undefined behavior. However, neither GCC nor Clang applies the same optimization to f2(); both still generate code that loads the value of i from memory. Below is the code GCC generates for f1() and f2() (from Compiler Explorer):
f1():
    subq $24, %rsp
    leaq 12(%rsp), %rdi
    movl $3, 12(%rsp)
    call fun(int const&)
    movl $3, %eax           # <-- Returns 3 directly.
    addq $24, %rsp
    ret
f2():
    subq $24, %rsp
    leaq 12(%rsp), %rdi
    movl $3, 12(%rsp)
    call fun(int const&)
    movl 12(%rsp), %eax     # <-- Loads the return value from memory.
    addq $24, %rsp
    ret
Although the standard does not require compilers to perform this optimization, I believe they should be able to optimize f2() to return 3 directly as well. In my view, this would produce more efficient code (please correct me if I'm mistaken). Once the compiler inlines the call to bar(3) into f2(), it should be able to deduce that calling fun(i) cannot modify i, because bar's parameter i is itself a const object.
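As a rough sketch of what I expect, this is how f2() would look if bar(3) were inlined at the source level. The name f2_inlined is hypothetical, and the body given to fun here is only a stand-in so the example links; in my real code fun is declared but not defined, so the compiler cannot see it. I do not know whether GCC or Clang keep the parameter's constness in their internal representation after inlining, so this shows the expectation rather than what the compilers actually do.

```cpp
#include <cassert>

// Stand-in definition (an assumption for this sketch): it only reads its
// argument. In the question, fun is opaque to the compiler.
void fun(const int& r) {
    assert(r == 3);
}

// Hypothetical source-level view of f2() after inlining bar(3):
// bar's parameter `const int i` becomes a const local initialized with 3.
int f2_inlined() {
    const int i = 3;
    fun(i);
    return i;   // same shape as f1(), where the compilers do return 3 directly
}
```

Written this way, the function is identical to f1(), which is why I would expect the same optimization to apply.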
Here is another example. When I replace the variable i in f1() with an object of class type, Clang still optimizes the function to return 3 directly, whereas GCC loads the return value from memory:
struct A {
    int i;
};

void fun(const A&);

int f3() {
    const A a{3};
    fun(a);
    return a.i;
}
Here is the code generated by GCC:
f3():
    subq $24, %rsp
    leaq 12(%rsp), %rdi
    movl $3, 12(%rsp)
    call fun(A const&)
    movl 12(%rsp), %eax     # <-- Loads the return value from memory.
    addq $24, %rsp
    ret
And here is the code generated by Clang:
f3():
    pushq %rax
    movl $3, (%rsp)
    movq %rsp, %rdi
    callq fun(A const&)@PLT
    movl $3, %eax           # <-- Returns 3 directly.
    popq %rcx
    retq
Why doesn't GCC optimize the function to return 3 directly? Is it because GCC considers loading the return value from memory to be as efficient as returning a constant?