I wrote this minimal C++ example that takes three numbers a, b and c and a bitmask r. It computes a result L, which should equal c if the second bit of r (bit 1) is set, otherwise b if the first bit of r (bit 0) is set, and a if neither of the two lowest bits is set. I want to optimize it with inline assembly and compile it with GCC (g++). This is my code:
#include <cstdio>
#include <cstdlib>
int main(){
    uint a=1;
    uint b=2;
    uint c=3;
    uint r=1;
    uint L;
    asm(
        "mov %2,%0;"
        "bt $0,%1;"
        "cmovc %3,%0;"
        "bt $1,%1;"
        "cmovc %4,%0;"
        : "=r" (L)
        : "r" (r), "r" (a), "r" (b), "r" (c)
    );
    printf("%u\n",L);
    return 0;
}
In the setup above, L should be equal to b. However, no matter which options I compile it with, the printed value is always 3, i.e. c. Why is that, and how do I write this program correctly?
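For reference, the selection the assembly is meant to perform can be sketched in plain C++ as below. This only illustrates the intended behaviour and is not the assembly version; the names select_ref and check_select_ref are hypothetical and not part of the original code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reference helper: picks c if bit 1 of r is set,
// otherwise b if bit 0 of r is set, otherwise a.
constexpr std::uint32_t select_ref(std::uint32_t a, std::uint32_t b,
                                   std::uint32_t c, std::uint32_t r) {
    return (r & 2u) ? c : ((r & 1u) ? b : a);
}

// Hypothetical sanity check covering all four combinations of the two low bits.
inline void check_select_ref() {
    assert(select_ref(1, 2, 3, 0) == 1);  // neither bit set -> a
    assert(select_ref(1, 2, 3, 1) == 2);  // only bit 0 set  -> b
    assert(select_ref(1, 2, 3, 2) == 3);  // only bit 1 set  -> c
    assert(select_ref(1, 2, 3, 3) == 3);  // both bits set   -> c
}
```

With a=1, b=2, c=3 and r=1 as in the question, this gives 2, which is the value the assembly version is expected to print.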
EDIT: This question is already answered here, but I still want to post my own answer because it may help others. I am writing it here since I am not allowed to post it as an actual answer:
It turns out that the code works fine unless I use the -O3 flag. With -O3, the compiler's register allocation breaks my assembly in the following ways:
In this minimal example, it stores a and r in the same register (both are 1), and then places L in the register that holds a or b (I am not sure which). Either way, the assembly overwrites registers that it still needs.
In my actual code, where I wanted to apply this assembly, L is a reference passed as an argument to a function. There, the compiler placed one of a, b or c in the register used for L as an optimization, completely ignoring that L already has a value.
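This happens because the compiler does not know that it must keep L in a register of its own: I declared it as "=r" (write-only) instead of "+r" (read-write). With a plain "=r", GCC assumes the output is written only after all inputs have been read, so it may give L the same register as an input, while my first mov writes L before r, b and c have been used.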
I also moved r to the output operands, again with "+r". bt only reads r, so this is probably not strictly required, but an operand declared as an output is guaranteed a register separate from the one used for L.
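The constraint fix can be sketched as follows. This sketch uses the early-clobber modifier "=&r" on L, which is GCC's documented way of saying that an output is written before all inputs have been consumed. It is an alternative to the "+r" approach described above, and not necessarily what the linked answer proposes. The names select_asm and check_select_asm are hypothetical, and the "cc" clobber and the plain C++ fallback for non-x86 targets are assumptions added for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns c if bit 1 of r is set, otherwise b if
// bit 0 of r is set, otherwise a.
static inline std::uint32_t select_asm(std::uint32_t a, std::uint32_t b,
                                       std::uint32_t c, std::uint32_t r) {
    std::uint32_t L;
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    asm("mov %2, %0\n\t"    // L = a
        "bt $0, %1\n\t"     // CF = bit 0 of r
        "cmovc %3, %0\n\t"  // if CF is set, L = b
        "bt $1, %1\n\t"     // CF = bit 1 of r
        "cmovc %4, %0"      // if CF is set, L = c
        : "=&r"(L)          // early clobber: L must not share a register with any input
        : "r"(r), "r"(a), "r"(b), "r"(c)
        : "cc");            // bt modifies the flags
#else
    // Assumed fallback so the sketch still builds on other targets.
    L = (r & 2u) ? c : ((r & 1u) ? b : a);
#endif
    return L;
}

// Hypothetical sanity check covering all four combinations of the two low bits.
static inline void check_select_asm() {
    assert(select_asm(1, 2, 3, 0) == 1);  // neither bit set -> a
    assert(select_asm(1, 2, 3, 1) == 2);  // only bit 0 set  -> b
    assert(select_asm(1, 2, 3, 2) == 3);  // only bit 1 set  -> c
    assert(select_asm(1, 2, 3, 3) == 3);  // both bits set   -> c
}
```

Calling select_asm(1, 2, 3, 1) corresponds to the setup in the question and should give 2, i.e. b. Because L is fully overwritten by the first mov, it does not need to be read-write here; the early clobber alone keeps the compiler from reusing an input register for it.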