The effect you're trying to create doesn't depend on out-of-order execution.  That's only one of the things that can create memory reordering.  Besides, modern x86 does execute out of order, but it uses its Memory Order Buffer to ensure that stores commit to L1d / become globally visible in program order.  (That's because x86's memory model only allows StoreLoad reordering, not StoreStore.)
Memory-reordering is separate from instruction execution reordering, because even in-order CPUs use a store buffer to avoid stalling on cache-miss stores.
Related questions: "Out-of-order instruction execution: is commit order preserved?" and "Are loads and stores the only instructions that gets reordered?"
A C implementation on an in-order ARM CPU could print either 11 or 33, if x and f ended up in different cache lines.
I assume you compiled with optimization disabled, so your compiler effectively treats all your variables as volatile, i.e. as if you had declared volatile int x, f.  Otherwise the while(f==0); loop would compile to if(f==0) { infloop; }, checking f only once.  (Data-race UB on non-atomic variables is what allows compilers to hoist loads out of loops, but volatile loads always have to be done.  See https://electronics.stackexchange.com/questions/387181/mcu-programming-c-o2-optimization-breaks-while-loop#387478).
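To make that transformation concrete, here is a rough sketch of the two cases. The names f_plain, f_vol, wait_hoisted and wait_volatile are made up for illustration and are not from your code; wait_hoisted is a hand-written version of what an optimizer is allowed to produce, not actual compiler output.

```c
int f_plain;          /* hypothetical plain flag: neither atomic nor volatile */
volatile int f_vol;   /* hypothetical volatile flag */

/* Hand-written equivalent of what an optimizer may make of
 *     while (f_plain == 0);
 * The load is hoisted out of the loop: check once, then spin forever
 * without ever reloading the flag. */
void wait_hoisted(void)
{
    if (f_plain == 0) {
        for (;;) { }  /* never looks at f_plain again */
    }
}

/* With volatile, every iteration has to perform a real load, so a store
 * from another thread is eventually noticed. */
void wait_volatile(void)
{
    while (f_vol == 0) { }
}
```

Note that volatile only forces the loads to happen; ISO C gives it no inter-thread ordering guarantees, so it is not a substitute for atomics.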
The stores in the resulting asm / machine code will appear in C source order.
You're compiling for x86, which has a strong memory model: x86 stores are release stores, and x86 loads are acquire loads.  You don't get sequential consistency, but you get acq_rel for free.  (And with un-optimized code, it happens even if you don't ask for it.)
Thus, when compiled without optimization for x86, your program is equivalent to:
#include <stdatomic.h>

_Atomic int x, f;
int main(){
    ...
    pthread_create(...);   // start the reader thread (arguments elided)
    atomic_store_explicit(&x, 33, memory_order_release);
    atomic_store_explicit(&f, 1, memory_order_release);
    ...
}
And similarly for the load side.  The while(f==0){} is an acquire-load on x86, so having the read side wait until it sees non-zero f guarantees that it also sees x==33.
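If you want that guarantee portably, instead of getting it by accident from x86 and un-optimized code, a minimal sketch with C11 atomics and pthreads could look like the following. The helper name run_message_passing, the plain (non-atomic) payload, and the initial value 11 are my assumptions, not taken from your code. Only the flag has to be atomic here, because the release store and the acquire load on f order the plain accesses to x.

```c
#include <pthread.h>
#include <stdatomic.h>

static int x;          /* payload: plain int, ordered by the flag below */
static atomic_int f;   /* flag: 0 = not ready, 1 = payload published */

/* Reader: spin with acquire loads until the flag is set, then read x. */
static void *reader(void *out)
{
    while (atomic_load_explicit(&f, memory_order_acquire) == 0) { }
    *(int *)out = x;   /* happens-after the writer's x = 33 */
    return NULL;
}

/* Hypothetical helper: runs one writer/reader round and returns the
 * value of x the reader saw, or -1 if the thread could not be created. */
int run_message_passing(void)
{
    int seen = -1;
    pthread_t t;

    x = 11;                                               /* assumed initial value */
    atomic_store_explicit(&f, 0, memory_order_relaxed);   /* reset the flag */
    if (pthread_create(&t, NULL, reader, &seen) != 0)
        return -1;

    x = 33;                                               /* write the payload */
    atomic_store_explicit(&f, 1, memory_order_release);   /* then publish it */

    pthread_join(t, NULL);
    return seen;
}
```

Calling run_message_passing() from main and printing the result should give 33 on any ISA, with or without optimization, because the acquire load that sees f == 1 synchronizes with the release store that wrote it.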
But if you compiled for a weakly-ordered ISA like ARM or PowerPC, the asm-level memory-ordering rules there do allow StoreStore and LoadLoad reordering, so it would be possible for your program to print 11 even when compiled without optimization.
See also https://preshing.com/20120930/weak-vs-strong-memory-models/