I recently discovered that GCC 10 no longer uses the mov + mfence sequence for an atomic store and instead relies on the implicit lock of an xchg. Is this sufficient under the memory model to not break anything in multithreaded code?
As an example, I compiled the following code on Godbolt, first with GCC 9.3 and then with GCC 10.2 (optimization level -O2):
#include <stdint.h>
#include <atomic>
std::atomic_int32_t idx;
int32_t increment(void)
{
    return idx = (idx + 1);
}
The results were the following:
GCC 9.3:
increment():
mov eax, DWORD PTR idx[rip]
add eax, 1
mov DWORD PTR idx[rip], eax
mfence
ret
idx:
.zero 4
GCC 10.2:
increment():
mov eax, DWORD PTR idx[rip]
add eax, 1
mov edx, eax
xchg edx, DWORD PTR idx[rip]
ret
idx:
.zero 4
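To make explicit which operation I am asking about, here is a minimal sketch of how I understand the statement above. The function names `increment_explicit` and `increment_rmw` are made up for illustration. As far as I can tell, `idx = (idx + 1)` is a sequentially consistent load followed by a separate sequentially consistent store, not a single atomic read-modify-write, and it is that store that shows up as mov + mfence or as xchg in the listings above.

```cpp
#include <atomic>
#include <cstdint>

std::atomic<std::int32_t> idx{0};

// Hypothetical name: the same operations as `return idx = (idx + 1);`,
// spelled out with explicit memory orders (seq_cst is the default).
std::int32_t increment_explicit() {
    // Plain atomic load; on x86-64 this is an ordinary mov.
    std::int32_t next = idx.load(std::memory_order_seq_cst) + 1;
    // Sequentially consistent store; this is the part the compiler
    // emits as mov + mfence (GCC 9.3) or xchg (GCC 10.2) in my listings.
    idx.store(next, std::memory_order_seq_cst);
    return next;
}

// Hypothetical name, for contrast only: one atomic read-modify-write.
// Unlike the version above, no other thread's update can be lost
// between the read and the write.
std::int32_t increment_rmw() {
    return idx.fetch_add(1, std::memory_order_seq_cst) + 1;
}
```

Both functions return the new value; only the second one is an atomic increment as a whole, but my question is about the store in the first one.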
Could someone enlighten me, or point me to the right section of the programming manual?
With best regards
Edit: OK, the memory-model part is answered by the two linked threads.
But my other question remains: why did this change only now, with GCC 10? The issues mentioned about Skylake etc. are not exactly new either.