Assume there are two threads running on two x86 CPUs, CPU0 and CPU1 respectively. The thread running on CPU0 executes the following stores:
A=1
B=1
The cache line containing A is initially owned by CPU1, and the one containing B is owned by CPU0.
I have two questions:
1. If I understand correctly, both stores will be put into CPU0's store buffer. However, for the first store, A=1, the copy of the line in CPU1's cache must be invalidated, while the second store, B=1, can be flushed immediately since CPU0 owns the cache line containing it. I know that x86 CPUs respect store order. Does that mean that B=1 will not be written to the cache before A=1?
2. Assume the following commands are executed on CPU1:
while (B==0);
print A
Is it enough to add only an lfence between the while and print commands on CPU1, without adding an sfence between A=1 and B=1 on CPU0, to always get 1 printed on x86?
while (B==0);
lfence
print A
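
For concreteness, here is a minimal sketch of the scenario in C++ rather than assembly. It assumes A and B are std::atomic<int> variables initialized to 0. The names writer, reader and run_once are hypothetical helpers I made up for illustration. The release/acquire orderings are there to stop the compiler from reordering the accesses. As far as I understand, on x86 they normally compile to plain mov instructions with no lfence or sfence, so this is a way to express the ordering I am asking about, not a claim about what the hardware does internally.

```cpp
#include <atomic>
#include <thread>

// Shared variables from the question, both starting at 0.
std::atomic<int> A{0};
std::atomic<int> B{0};

// Role of CPU0: A=1 followed by B=1.
// The release store to B may not be reordered before the store to A.
void writer() {
    A.store(1, std::memory_order_relaxed);
    B.store(1, std::memory_order_release);
}

// Role of CPU1: spin until B is observed as 1, then read A.
// The acquire load of B may not be reordered after the load of A.
int reader() {
    while (B.load(std::memory_order_acquire) == 0) {
        // busy-wait
    }
    return A.load(std::memory_order_relaxed);
}

// Hypothetical helper: runs both roles once and returns the value of A
// that the reader would have printed.
int run_once() {
    A.store(0);
    B.store(0);
    int seen = -1;
    std::thread t1([&seen] { seen = reader(); });
    std::thread t0(writer);
    t0.join();
    t1.join();
    return seen;
}
```

Calling run_once() in a loop should return 1 every time under the C++ memory model, since the acquire load that sees B==1 synchronizes with the release store. Thread pinning to specific CPUs is not shown, and on Linux the program needs to be linked with -pthread.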