I have a question about the definition of the synchronises-with relation in the C++ memory model when relaxed and acquire/release accesses are mixed on one and the same atomic variable. Consider the following example consisting of a global initialiser and three threads:
int x = 0;
std::atomic<int> atm(0);
[thread T1]
x = 42;
atm.store(1, std::memory_order_release);
[thread T2]
if (atm.load(std::memory_order_relaxed) == 1)
    atm.store(2, std::memory_order_relaxed);
[thread T3]
int value = atm.load(std::memory_order_acquire);
assert(value != 1 || x == 42); // Hopefully this is guaranteed to hold.
assert(value != 2 || x == 42); // Does this assert hold necessarily??
My question is whether the second assert in T3 can fail under the C++ memory model. Note that the answer to this SO question suggests that the assert could not fail if T2 used load/acquire and store/release; please correct me if I got this wrong. However, as stated above, the answer seems to depend on how exactly the synchronises-with relation is defined in this case. I was confused by the text on cppreference, and I came up with the following two possible readings.
1. The second assert fails. The store to atm in T1 could be conceptually understood as storing 1_release, where _release is an annotation specifying how the value was stored; along the same lines, the store in T2 could be understood as storing 2_relaxed. Hence, if the load in T3 returns 2, the thread actually read 2_relaxed; thus, the load in T3 does not synchronise-with the store in T1 and there is no guarantee that T3 sees x == 42. However, if the load in T3 returns 1, then 1_release was read, and therefore the load in T3 synchronises-with the store in T1 and T3 is guaranteed to see x == 42.
2. The second assert succeeds. If the load in T3 returns 2, then this load reads a side-effect of the relaxed store in T2; however, this store of T2 is present in the modification order of atm only if the modification order of atm contains a preceding store with release semantics. Therefore, the load/acquire in T3 synchronises-with the store/release of T1 because the latter necessarily precedes the former in the modification order of atm.
At first glance, the answer to this SO question seems to suggest that my reading 1 is correct. However, that answer seems to be different in a subtle way: all stores in the answer are release, and the crux of the question is to see that load/acquire and store/release establish synchronises-with between a pair of threads. In contrast, my question is about how exactly synchronises-with is defined when memory orders are heterogeneous.
I actually hope that reading 2 is correct since this would make reasoning about concurrency easier. Thread T2 does not read or write any memory other than atm; therefore, T2 itself has no synchronisation requirements and should be able to use relaxed memory order. In contrast, T1 publishes x and T3 consumes it -- that is, these two threads communicate with each other, so they should clearly use acquire/release semantics. In other words, if interpretation 1 turns out to be correct, then the code of T2 cannot be written by thinking only about what T2 does; rather, the code of T2 needs to know that it should not "disturb" synchronisation between T1 and T3.
In any case, knowing what exactly is sanctioned by the standard in this case seems absolutely crucial to me.
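For anyone who wants to experiment, here is a minimal, self-contained sketch of the example above. The names Outcome, Tally, run_once and run_many are hypothetical helpers I made up for this harness. One deliberate change, and an assumption on my part: x is a std::atomic<int> accessed with relaxed order rather than a plain int, so that a stale read in the value == 2 case would show up as a counted outcome instead of being a data race. The first assert is kept inside T3; the second one is only tallied, because whether it must hold is exactly what I am asking. Of course, a run on one particular machine cannot settle what the standard permits; it can only fail to exhibit a violation.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// What T3 observed in one run of the three-thread example.
struct Outcome {
    int value;   // result of T3's acquire load of atm
    int seen_x;  // value T3 then read from x
};

// Hypothetical helper: runs T1, T2 and T3 once.
// Assumption: x is atomic (relaxed) here, unlike the plain int in the question,
// so that reading it in the value == 2 case is not a data race.
Outcome run_once() {
    std::atomic<int> x(0);
    std::atomic<int> atm(0);
    Outcome out{0, 0};

    std::thread t1([&] {
        x.store(42, std::memory_order_relaxed);
        atm.store(1, std::memory_order_release);
    });
    std::thread t2([&] {
        if (atm.load(std::memory_order_relaxed) == 1)
            atm.store(2, std::memory_order_relaxed);
    });
    std::thread t3([&] {
        out.value = atm.load(std::memory_order_acquire);
        out.seen_x = x.load(std::memory_order_relaxed);
        // First assert from the question: the acquire load read the value
        // written by T1's release store, so T1's write to x must be visible.
        assert(out.value != 1 || out.seen_x == 42);
    });

    t1.join();
    t2.join();
    t3.join();  // join makes T3's writes to out visible to the caller
    return out;
}

// Counts over many runs; nothing here asserts the second condition.
struct Tally {
    int second_violations;  // runs with value == 2 but seen_x != 42
    int out_of_range;       // runs where value was not 0, 1 or 2
};

Tally run_many(int iterations) {
    Tally t{0, 0};
    for (int i = 0; i < iterations; ++i) {
        Outcome o = run_once();
        if (o.value < 0 || o.value > 2) ++t.out_of_range;
        if (o.value == 2 && o.seen_x != 42) ++t.second_violations;
    }
    return t;
}
```

Calling run_many with some iteration count and inspecting second_violations gives an empirical hint only. On strongly ordered hardware I would expect the count to stay at zero regardless of which reading is correct, so a zero count does not confirm reading 2.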