To flip a bit in an integer, use xor like this:  foo ^= 1.
gcc knows this optimization already for bool, so you can return !status; like a normal person without losing any efficiency.  gcc does compile status ^= 1 to an xor instruction as well.  In fact, all your ideas except the table lookup compile to a single xor instruction with bool input / return value.
Check it out on the Godbolt compiler explorer with gcc -O3, with asm output panes for bool and int.
MYTYPE func4(MYTYPE status) {
    status ^=1;
    return status;
}
  # same code for bool or int
  mov eax, edi
  xor eax, 1
  ret
vs.
MYTYPE func1(MYTYPE status) {
    status = !status;
    return status;
}
  # with -DMYTYPE=bool
  mov eax, edi
  xor eax, 1
  ret
  # with int
  xor eax, eax
  test edi, edi
  sete al
  ret
Semi-related: XOR is add-without-carry.
So if you only care about the low bit, you can copy-and-flip-low with lea eax, [rdi+1].
See Check if a number is even where that's useful in combination with and eax, 1 to get it done in 2 instructions.
Why is bool different from int?
The x86-64 System V ABI requires that callers passing a bool pass a 0 or 1 value, not just any non-zero integer.  Thus, the compiler can assume that about the input.
But with int foo, the C expression !foo requires "booleanizing" the value.  !foo has type _Bool / (aka bool if you #include <stdbool.h>), and converting that back to an integer must produce a value of 0 or 1.  If the compiler doesn't know that foo must be 0 or 1, it can't optimize !foo to foo^=1, and can't realize that foo ^= 1 flips a value between truthy / falsy.  (In the sense that if(foo) means if(foo != 0) in C).
This is why you get test/setcc (zero-extended into a 32-bit int by xor-zeroing a register before the test).
Related:  Boolean values as 8 bit in compilers. Are operations on them inefficient?.  Stuff like (bool1 && bool2) ? x : y isn't always compiled as efficiently as you might hope.  Compilers are pretty good, but do have missed-optimization bugs.
What about that extra mov instruction?
It will go away when inlining, if the compiler doesn't need / want to keep the old un-flipped value around for later.  But in a stand-alone function, the first arg is in edi, and the return value needs to be in eax (in the x86-64 System V calling convention).
Tiny functions like this are a close approximation to what you might get as part of a large function (if this flip couldn't be optimized into something else), but needing the result in a different register is a confounding factor.
x86 doesn't have a copy-and-xor integer instruction, so for a stand-alone function it will take at least a mov to copy from the arg-passing register to eax.
lea is special: it's one of the few integer ALU instructions that can write the result to a different register instead of destroying its input.  lea is a copy-and-shift/add instruction, but there's no copy-and-xor instruction in x86.  Many RISC instruction sets have 3-operand instructions, for example MIPS could do xor $t1, $t2, $t3.
AVX introduced non-destructive versions of vector instructions (saving a lot of movdqa / movups register-copying in a lot of code), but for integer there are only a few new instructions that do different things.  rorx eax, ecx, 16  for example does eax = rotate_right(ecx, 16), and uses the same VEX encoding that non-destructive AVX instructions use.