I am using GCC to compile a simple test program for an ARM Cortex-M4, and the way it optimizes accesses to a global variable confuses me. What rules does GCC follow when optimizing accesses to global variables?
GCC compiler: gcc-arm-none-eabi-8-2019-q3-update/bin/arm-none-eabi-gcc
Optimization level: -Os
My test code:
The following code is in "foo.c". The functions foo1() and foo2() are called in task A, and the function global_cnt_add() is called in task B.
int g_global_cnt = 0;
void dummy_func(void);
void global_cnt_add(void)
{
    g_global_cnt++;
}
int foo1(void)
{
    while (g_global_cnt == 0) {
        // do nothing
    }
    return 0;
}
int foo2(void)
{
    while (g_global_cnt == 0) {
        dummy_func();
    }
    return 0;
}
The function dummy_func() is implemented in "bar.c" as follows:
void dummy_func(void)
{
    // do nothing
}
The assembly code of function foo1() is shown below:
int foo1(void)
{
    while (g_global_cnt == 0) {
  201218:   4b02        ldr r3, [pc, #8]    ; (201224 <foo1+0xc>)
  20121a:   681b        ldr r3, [r3, #0]
  20121c:   b903        cbnz    r3, 201220 <foo1+0x8>
  20121e:   e7fe        b.n 20121e <foo1+0x6>
        // do nothing
    }
    return 0;
}
  201220:   2000        movs    r0, #0
  201222:   4770        bx  lr
  201224:   00204290    .word   0x00204290
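To make the listing above easier to read, here is a rough C-level rendering of what the generated instructions do. This is only a sketch derived by hand from the disassembly; the name foo1_as_compiled is hypothetical and does not exist in the project.

```c
/* Same global as in foo.c (not volatile). */
int g_global_cnt = 0;

/* Hypothetical name: a hand-written C equivalent of the foo1() disassembly. */
int foo1_as_compiled(void)
{
    /* ldr r3, [r3, #0] ; cbnz r3, ...  -> the variable is read exactly once */
    if (g_global_cnt == 0) {
        /* b.n 20121e -> a branch to itself, with no further load */
        for (;;) {
        }
    }
    /* movs r0, #0 ; bx lr */
    return 0;
}
```

If g_global_cnt is already non-zero on entry, the function returns 0; otherwise it spins forever, regardless of what task B does afterwards.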
The assembly code of function foo2() is shown below:
int foo2(void)
{
  201228:   b510        push    {r4, lr}
    while (g_global_cnt == 0) {
  20122a:   4c04        ldr r4, [pc, #16]   ; (20123c <foo2+0x14>)
  20122c:   6823        ldr r3, [r4, #0]
  20122e:   b10b        cbz r3, 201234 <foo2+0xc>
        dummy_func();
    }
    return 0;
}
  201230:   2000        movs    r0, #0
  201232:   bd10        pop {r4, pc}
        dummy_func();
  201234:   f1ff fcb8   bl  400ba8 <dummy_func>
  201238:   e7f8        b.n 20122c <foo2+0x4>
  20123a:   bf00        nop
  20123c:   00204290    .word   0x00204290
In the assembly code of foo1(), the global variable "g_global_cnt" is loaded only once, so the while loop can never exit once it has been entered. The compiler has optimized away the repeated reads of "g_global_cnt"; I know I can add volatile to prevent this optimization.
In the assembly code of foo2(), "g_global_cnt" is loaded and checked on every iteration, so the while loop can exit.
Which GCC optimization rules account for this difference?
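For reference, the volatile workaround mentioned above could look roughly like the sketch below. It reuses the names from foo.c and only adds the qualifier. Because accesses to a volatile object count as observable side effects, the compiler has to emit a load on every iteration. Note that volatile by itself does not make g_global_cnt++ atomic; whether that matters depends on how the tasks are scheduled, which is not shown here.

```c
/* Sketch: same code as foo.c, with the global declared volatile. */
volatile int g_global_cnt = 0;

void global_cnt_add(void)
{
    /* A volatile read followed by a volatile write; not an atomic update. */
    g_global_cnt++;
}

int foo1(void)
{
    /* The variable is re-read from memory on every pass through the loop. */
    while (g_global_cnt == 0) {
        /* do nothing */
    }
    return 0;
}
```

With this change, foo1() in task A returns once global_cnt_add() in task B has made the counter non-zero.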
 
     
    