Yes, both.
The C++ memory model requires that atomic operations follow certain semantics, which depend on the specified memory ordering parameter.  So the compiler has to emit code which, when executed, behaves according to those semantics.
For example, taking code like:
```cpp
std::atomic<int> x;
int y, tmp;
if (x.load(std::memory_order_acquire) == 5) {
    tmp = y;
}
```
On a typical machine, the compiler would need to:
- Not reorder the loads of `x` and `y` at compile time.  In other words, it should emit a load instruction for `x` and a load instruction for `y`, such that the first is executed before the second in program order.
 
- Ensure that the loads of `x` and `y` become visible in that order.  If the machine is capable of out-of-order execution, speculative loads, or any other feature that could cause two loads to become visible out of program order, then the compiler must emit code that prevents that from happening in this instance.
  What that code looks like depends on the machine in question.  Possibilities include:
  - Nothing special is needed, because the machine doesn't do this particular kind of reordering.  So `x` and `y` will just be loaded by ordinary load instructions, with nothing extra.  This is the case on x86, for instance, where "all loads are acquire".
 
  - Using a special form of the load instruction which inhibits reordering.  For instance, on AArch64, the load of `x` would be done with the `ldapr` or `ldar` instruction instead of the ordinary `ldr`.
 
  - Inserting a special memory barrier instruction between the two loads, like ARM's `dmb`.
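The barrier approach also has a portable analogue at the C++ source level: a relaxed load followed by `std::atomic_thread_fence(std::memory_order_acquire)` provides the same ordering guarantee as an acquire load.  A minimal sketch (the function name is illustrative, not from the question):

```cpp
#include <atomic>

std::atomic<int> x{0};
int y = 0;

int read_with_fence() {
    // A relaxed load imposes no ordering by itself.
    int v = x.load(std::memory_order_relaxed);
    // The acquire fence prevents later loads (such as the load of y)
    // from being reordered before the relaxed load above, much as a
    // hardware barrier instruction like ARM's dmb would.
    std::atomic_thread_fence(std::memory_order_acquire);
    return (v == 5) ? y : -1;
}
```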
 
 
In the vast majority of code, the memory ordering parameter is specified as a compile-time constant, because the programmer knows statically what ordering is required, and so the compiler can emit the instructions appropriate to that particular ordering.
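Because the ordering is a compile-time constant, each call site can compile down to a fixed instruction sequence.  For instance (the AArch64 instructions in the comments follow the typical choices discussed above; exact codegen varies by compiler):

```cpp
#include <atomic>

std::atomic<int> flag{0};

int load_relaxed() { return flag.load(std::memory_order_relaxed); }  // plain ldr
int load_acquire() { return flag.load(std::memory_order_acquire); }  // ldapr or ldar
int load_seq_cst() { return flag.load(); }  // default ordering is seq_cst: ldar
```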
In the unusual case where the ordering parameter is not a compile-time constant, the compiler has to emit code that behaves properly no matter which value is specified at runtime.  Usually the compiler simply treats the ordering parameter as `memory_order_seq_cst`, since that is stronger than all the others: a seq_cst operation satisfies all the semantics required by the weaker orderings (and more besides).  This avoids the cost of testing the value of the ordering parameter at runtime and branching accordingly, a cost which would likely outweigh the potential savings of performing the operation with a weaker ordering.
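For concreteness, the ordering parameter of `std::atomic<T>::load` is an ordinary function argument, so passing a runtime value is perfectly legal C++ (the wrapper name here is illustrative):

```cpp
#include <atomic>

std::atomic<int> x{0};

// A typical compiler does not branch on `order`; it performs the load
// as if seq_cst were requested, which also satisfies the semantics of
// every weaker ordering.
int load_with(std::memory_order order) {
    return x.load(order);
}
```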
But if the compiler did choose to test and branch, it would typically have to assume the "worst case" for the purposes of optimizing surrounding code.  For instance, on AArch64, for `x.load(order)` it might emit a chunk of code like the following:
```
int t;
if (order == std::memory_order_relaxed)
    LDR t, [x]
else if (order == std::memory_order_acquire)
    LDAPR t, [x]
else if (order == std::memory_order_seq_cst)
    LDAR t, [x]
else
    abort();
if (t == 5)
    LDR tmp, [y]
```
However, it would need to ensure that the load of `y` remained at the end of this chunk of code (in program order).  If `order` were equal to `std::memory_order_relaxed`, then it would be okay to execute the load of `y` before the load of `x`, but not if it were `std::memory_order_acquire` or stronger.
On the other hand, it could conceivably emit
```
int t, t2;
if (order == std::memory_order_relaxed) {
    LDR t2, [y]
    LDR t, [x]
} else if (order == std::memory_order_acquire) {
    LDAPR t, [x]
    LDR t2, [y]
} else if (order == std::memory_order_seq_cst) {
    LDAR t, [x]
    LDR t2, [y]
} else {
    abort();
}
if (t == 5)
    tmp = t2;
```
but we are now well outside the range of transformations that a real-world compiler would actually perform.
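To tie this back to the original snippet: the reason the acquire ordering matters at all is that it pairs with a release store in another thread.  A small self-contained demo (the names `writer` and `reader` are mine, not from the question):

```cpp
#include <atomic>
#include <thread>

std::atomic<int> x{0};
int y = 0;

void writer() {
    y = 42;                                 // plain, non-atomic store
    x.store(5, std::memory_order_release);  // publishes the write to y
}

int reader() {
    // Spin until the acquire load observes the release store; at that
    // point the earlier write to y is guaranteed to be visible.
    while (x.load(std::memory_order_acquire) != 5) { }
    return y;
}
```

Running `writer` and `reader` in two threads, `reader` is guaranteed to return 42; with relaxed ordering on both sides, reading a stale `y` would be permitted.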