Whether a name used in a function is a global variable or a local one is determined at compile time, not at run time. Your functions that cause exceptions are trying to have it both ways, to either access a global, or provide their own local variable replacement, but Python's scoping rules don't allow that. It needs to be local or global, and can't be in a nebulous either/or state.
In your Test 1, the function raises an exception because the compiler saw that the code could assign to take_sum as a local variable, and so it makes all the references to take_sum in the code be local. You can no longer look up the global variable take_sum in the normal way once that determination has been made.
A global statement is in effect a compiler directive to change the assumption that an assignment makes a variable local. Subsequent assignments will be made globally, not locally. It's not something that executes at runtime, which is why your other two test cases are so confusing to you.
Test 2 fails because you're trying to tell the compiler that take_sum is a global after it has already seen some of your code make a local assignment to that name. In Test 3, the global statement comes first, so it makes the assignment (in the other branch!) assign to a global variable. It doesn't actually matter that the global statement is in a different branch than the assignment. The compiler interprets the global statement at compile time, not at runtime when the conditional logic of the ifs and elifs gets handled.
It might help your understanding of what is going on to disassemble some of the main functions you've written using the dis.dis function in the standard library. You'll see that there are two different sets of bytecodes used for loading and storing of variables: LOAD_GLOBAL/STORE_GLOBAL for global variables (used in all your functions to get names like print and globals, which are builtins that LOAD_GLOBAL falls back to when the module has no global of that name), and LOAD_FAST/STORE_FAST, which are used for local variables (like a, b and c in take_sum). The compiler behavior I talked about above boils down to which bytecode it chooses for each lookup or assignment.
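If you would rather not read bytecode, the compiler's decision is also recorded on the function's code object, which may be a quicker check. This is a small sketch with hypothetical functions `uses_local` and `uses_global`; `co_varnames` holds the names treated as locals and `co_names` holds the names looked up as globals (or attributes):

```python
def uses_local():
    value = 1          # assignment with no global statement: value is local
    return value

def uses_global():
    global shared_value
    shared_value = 1   # same kind of assignment, but declared global
    return shared_value

# Names the compiler decided are local to each function.
print(uses_local.__code__.co_varnames)
print(uses_global.__code__.co_varnames)

# Names the compiler will look up in the global (or builtin) namespace.
print(uses_global.__code__.co_names)
```

The exact opcode names that dis prints have shifted somewhat between Python versions, so these attributes can be a steadier way to see which side of the local/global split a name landed on.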
If I rename the main function in Test 1 to test1, here's what I get when I disassemble it:
dis.dis(test1)
  2           0 LOAD_CONST               1 ('take_sum')
              2 LOAD_GLOBAL              0 (globals)
              4 CALL_FUNCTION            0
              6 CONTAINS_OP              1
              8 POP_JUMP_IF_FALSE       18
  3          10 LOAD_CONST               2 (<code object <lambda> at 0x0000019022A05F50, file "<ipython-input-23-0cc3c65f7038>", line 3>)
             12 LOAD_CONST               3 ('test1.<locals>.<lambda>')
             14 MAKE_FUNCTION            0
             16 STORE_FAST               0 (take_sum)
  5     >>   18 LOAD_GLOBAL              1 (print)
             20 LOAD_FAST                0 (take_sum)
             22 CALL_FUNCTION            1
             24 POP_TOP
             26 LOAD_CONST               0 (None)
             28 RETURN_VALUE
Disassembly of <code object <lambda> at 0x0000019022A05F50, file "<ipython-input-23-0cc3c65f7038>", line 3>:
  3           0 LOAD_FAST                0 (a)
              2 LOAD_FAST                1 (b)
              4 BINARY_ADD
              6 LOAD_FAST                2 (c)
              8 BINARY_ADD
             10 RETURN_VALUE
Notice that the lookup of take_sum on line 5 is at byte offset 20 in the bytecode, where it uses LOAD_FAST. This is the instruction that raises the UnboundLocalError: when the global function exists, the if branch is skipped, so the local take_sum has never been assigned.
Now, let's look at Test 3:
dis.dis(test3)
  2           0 LOAD_CONST               1 ('take_sum')
              2 LOAD_GLOBAL              0 (globals)
              4 CALL_FUNCTION            0
              6 CONTAINS_OP              0
              8 POP_JUMP_IF_FALSE       12
  3          10 JUMP_FORWARD            18 (to 30)
  5     >>   12 LOAD_CONST               1 ('take_sum')
             14 LOAD_GLOBAL              0 (globals)
             16 CALL_FUNCTION            0
             18 CONTAINS_OP              1
             20 POP_JUMP_IF_FALSE       30
  6          22 LOAD_CONST               2 (<code object <lambda> at 0x0000019022A43500, file "<ipython-input-26-887b66de7e64>", line 6>)
             24 LOAD_CONST               3 ('test3.<locals>.<lambda>')
             26 MAKE_FUNCTION            0
             28 STORE_GLOBAL             1 (take_sum)
  8     >>   30 LOAD_GLOBAL              2 (print)
             32 LOAD_GLOBAL              1 (take_sum)
             34 CALL_FUNCTION            1
             36 POP_TOP
  9          38 LOAD_GLOBAL              2 (print)
             40 LOAD_GLOBAL              1 (take_sum)
             42 LOAD_CONST               4 (1)
             44 LOAD_CONST               5 (2)
             46 LOAD_CONST               6 (3)
             48 CALL_FUNCTION            3
             50 CALL_FUNCTION            1
             52 POP_TOP
             54 LOAD_CONST               0 (None)
             56 RETURN_VALUE
Disassembly of <code object <lambda> at 0x0000019022A43500, file "<ipython-input-26-887b66de7e64>", line 6>:
  6           0 LOAD_FAST                0 (a)
              2 LOAD_FAST                1 (b)
              4 BINARY_ADD
              6 LOAD_FAST                2 (c)
              8 BINARY_ADD
             10 RETURN_VALUE
This time take_sum is looked up at byte offsets 32 and 40 (source lines 8 and 9), and both lookups use LOAD_GLOBAL (which succeeds since there is a global variable of that name).