I'm trying to understand how Ordering::SeqCst works. To that end, I have a few code examples where this ordering is required to get a consistent result. In the first example, we just want to increment a counter variable:
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::thread::spawn;

let a: &'static _ = Box::leak(Box::new(AtomicBool::new(false)));
let b: &'static _ = Box::leak(Box::new(AtomicBool::new(false)));
let counter: &'static _ = Box::leak(Box::new(AtomicUsize::new(0)));
let _thread_a = spawn(move || a.store(true, Ordering::Release));
let _thread_b = spawn(move || b.store(true, Ordering::Release));
let thread_1 = spawn(move || {
    while !a.load(Ordering::Acquire) {} // Acquire: nothing after this load can be reordered before it
    if b.load(Ordering::Relaxed) { // my assumption: Acquire is not needed here because of the load above
        counter.fetch_add(1, Ordering::Relaxed);
    }
});
let thread_2 = spawn(move || {
    while !b.load(Ordering::Acquire) {} // Acquire: nothing after this load can be reordered before it
    if a.load(Ordering::Relaxed) { // my assumption: Acquire is not needed here because of the load above
        counter.fetch_add(1, Ordering::Relaxed);
    }
});
thread_1.join().unwrap();
thread_2.join().unwrap();
println!("{}", counter.load(Ordering::Relaxed));
The possible values of counter in this example are 1 or 2, depending on thread scheduling. Surprisingly, 0 is also possible, and I don't understand how.
If thread_1 starts when only a has been set to true by _thread_a, counter is left untouched when thread_1 exits.
If thread_2 then starts after thread_1, counter is incremented once: thread_1 has finished (so we know that a is already true), and thread_2 only has to wait for b to become true.
Or, if thread_2 goes first and only b has been set to true, counter is likewise incremented only once.
It is also possible that _thread_a and _thread_b both run before thread_1 and thread_2, in which case both of them increment counter. That is why 1 and 2 are valid outcomes for counter. But, as I said, 0 is also a valid result unless I replace all loads and stores on a and b with Ordering::SeqCst:
let _thread_a = spawn(move || a.store(true, Ordering::SeqCst));
let _thread_b = spawn(move || b.store(true, Ordering::SeqCst));
let thread_1 = spawn(move || {
    while !a.load(Ordering::SeqCst) {}
    if b.load(Ordering::SeqCst) {
        counter.fetch_add(1, Ordering::Relaxed);
    }
});
let thread_2 = spawn(move || {
    while !b.load(Ordering::SeqCst) {}
    if a.load(Ordering::SeqCst) {
        counter.fetch_add(1, Ordering::Relaxed);
    }
});
thread_1.join().unwrap();
thread_2.join().unwrap();
println!("{}", counter.load(Ordering::SeqCst));
Now 0 isn't possible, but I don't know why.
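To compare the two variants empirically, something like the following harness could be used. It is only a sketch: run_once and tally are names made up for this illustration, and Arc replaces Box::leak so that repeated runs do not leak memory. Not seeing a 0 in a finite number of runs (especially on x86) says nothing about what the memory model allows, so the numbers are a hint, not a proof.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread::spawn;

/// Runs the first example once and returns the final counter value (0, 1, or 2).
/// `store` is used for the flag stores, `spin` for the loads in the wait loops,
/// and `check` for the second load in each reader thread.
/// (Hypothetical helper, not part of the original code.)
fn run_once(store: Ordering, spin: Ordering, check: Ordering) -> usize {
    // Arc instead of Box::leak so that repeated runs do not leak memory.
    let a = Arc::new(AtomicBool::new(false));
    let b = Arc::new(AtomicBool::new(false));
    let counter = Arc::new(AtomicUsize::new(0));

    let thread_a = {
        let a = Arc::clone(&a);
        spawn(move || a.store(true, store))
    };
    let thread_b = {
        let b = Arc::clone(&b);
        spawn(move || b.store(true, store))
    };
    let thread_1 = {
        let (a, b, counter) = (Arc::clone(&a), Arc::clone(&b), Arc::clone(&counter));
        spawn(move || {
            // Wait until a is observed as true, then look at b once.
            while !a.load(spin) {
                std::hint::spin_loop();
            }
            if b.load(check) {
                counter.fetch_add(1, Ordering::Relaxed);
            }
        })
    };
    let thread_2 = {
        let (a, b, counter) = (Arc::clone(&a), Arc::clone(&b), Arc::clone(&counter));
        spawn(move || {
            // Mirror image: wait for b, then look at a once.
            while !b.load(spin) {
                std::hint::spin_loop();
            }
            if a.load(check) {
                counter.fetch_add(1, Ordering::Relaxed);
            }
        })
    };

    thread_a.join().unwrap();
    thread_b.join().unwrap();
    thread_1.join().unwrap();
    thread_2.join().unwrap();
    // join() synchronizes with the finished threads, so Relaxed is enough here.
    counter.load(Ordering::Relaxed)
}

/// Repeats the experiment and counts how often each final value was seen.
/// Index i of the result holds the number of runs that ended with counter == i.
fn tally(runs: usize, store: Ordering, spin: Ordering, check: Ordering) -> [usize; 3] {
    let mut seen = [0usize; 3];
    for _ in 0..runs {
        seen[run_once(store, spin, check)] += 1;
    }
    seen
}

fn main() {
    let runs = 1_000;
    let rel_acq = tally(runs, Ordering::Release, Ordering::Acquire, Ordering::Relaxed);
    let seq_cst = tally(runs, Ordering::SeqCst, Ordering::SeqCst, Ordering::SeqCst);
    println!("Release/Acquire/Relaxed, runs ending in [0, 1, 2]: {:?}", rel_acq);
    println!("SeqCst everywhere,       runs ending in [0, 1, 2]: {:?}", seq_cst);
}
```

Passing the orderings as parameters keeps both variants in one function, so the only difference between the two tallies is the ordering itself.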
The second example was taken from here:
use std::sync::atomic::{AtomicBool, Ordering::SeqCst};
use std::thread;

static A: AtomicBool = AtomicBool::new(false);
static B: AtomicBool = AtomicBool::new(false);
static mut S: String = String::new();
fn main() {
    let a = thread::spawn(|| {
        A.store(true, SeqCst);
        if !B.load(SeqCst) {
            unsafe { S.push('!') };
        }
    });
    let b = thread::spawn(|| {
        B.store(true, SeqCst);
        if !A.load(SeqCst) {
            unsafe { S.push('!') };
        }
    });
    a.join().unwrap();
    b.join().unwrap();
}
Threads a and b could start at the same time and both set A and B, so that neither of them modifies S. Or one of them could start before the other and modify S, in which case the other thread leaves S unmodified. If I understand correctly, there is no possibility of S being modified in parallel by both threads? The only reason Ordering::SeqCst is useful here is to prevent reordering. But if I replace all the orderings like this:
let a = thread::spawn(|| {
    A.store(true, Release); // nothing can be placed before
    if !B.load(Acquire) { // nothing can be reordered after
        unsafe { S.push('!') };
    }
});
    
let b = thread::spawn(|| {
    B.store(true, Release); // nothing can be placed before
    if !A.load(Acquire) { // nothing can be reordered after
        unsafe { S.push('!') };
    }
});
Isn't it the same as the original?
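One way to compare the two versions without touching S at all is to count how many threads enter the guarded branch. The following is a minimal sketch under my own assumptions: entries is a hypothetical helper, and an AtomicUsize stands in for the unsafe push to S. A result of 2 would mean that both threads would have modified S at the same time. With SeqCst that should never happen; for Release/Acquire, not seeing a 2 in some number of runs does not show that it is impossible.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Runs the store-then-load pattern from the second example once and returns
/// how many threads entered the branch that would modify S (0, 1, or 2).
/// (Hypothetical helper; a counter replaces the unsafe `S.push('!')`.)
fn entries(store: Ordering, load: Ordering) -> usize {
    let a = Arc::new(AtomicBool::new(false));
    let b = Arc::new(AtomicBool::new(false));
    let entered = Arc::new(AtomicUsize::new(0));

    let t1 = {
        let (a, b, entered) = (Arc::clone(&a), Arc::clone(&b), Arc::clone(&entered));
        thread::spawn(move || {
            a.store(true, store);
            if !b.load(load) {
                // This is where the original code pushes to S.
                entered.fetch_add(1, Ordering::Relaxed);
            }
        })
    };
    let t2 = {
        let (a, b, entered) = (Arc::clone(&a), Arc::clone(&b), Arc::clone(&entered));
        thread::spawn(move || {
            b.store(true, store);
            if !a.load(load) {
                // This is where the original code pushes to S.
                entered.fetch_add(1, Ordering::Relaxed);
            }
        })
    };

    t1.join().unwrap();
    t2.join().unwrap();
    // join() synchronizes with the finished threads, so Relaxed is enough here.
    entered.load(Ordering::Relaxed)
}

fn main() {
    let runs = 1_000;
    // Largest number of simultaneous "owners" of S seen over all runs.
    let seq_cst_max = (0..runs)
        .map(|_| entries(Ordering::SeqCst, Ordering::SeqCst))
        .max()
        .unwrap_or(0);
    let rel_acq_max = (0..runs)
        .map(|_| entries(Ordering::Release, Ordering::Acquire))
        .max()
        .unwrap_or(0);
    println!("SeqCst:          max threads in branch = {}", seq_cst_max);
    println!("Release/Acquire: max threads in branch = {}", rel_acq_max);
}
```

Counting entries instead of pushing to a static mut String keeps the experiment free of undefined behaviour even if both threads do enter the branch.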
Also, the Rust docs refer to the C++ docs on ordering, where Ordering::SeqCst is described as:
Atomic operations tagged memory_order_seq_cst not only order memory the same way as release/acquire ordering (everything that happened-before a store in one thread becomes a visible side effect in the thread that did a load), but also establish a single total modification order of all atomic operations that are so tagged.
What does a single total modification order look like in a concrete example?
 
     
    