I have built a simple application using GNU gcc (4.8.1429) for ARM (ATsam4LC4B: Cortex-M4 core, 256K flash) that:
- loads at 0x3F000 (linker option -Ttext=0x3F000) and initializes,
- copies itself (mirrors) to address 0,
- jumps to the mirror Reset_Handler() (near 0) using assembly.
The mirror application then begins to run as expected.
 
The problem arises when execution reaches the mirror's call to main() (flash location 0x532):
The instruction ldr r3, [pc, #32] located there loads r3 with 0x3F565 (thus calling the original main()) instead of the expected 0x565 (which would call the mirror main()).
This happens even if all the registers containing 0x3F... are zeroed before the ldr instruction.
The debugger reports the PC register as 0x534, which is as expected and consistent with the flash region in which the stepping is performed.
Can anybody please explain what is going on?
Thank you in advance.
The program is compiled with [-mthumb -nostdlib -nostartfiles -ffreestanding -msingle-pic-base -mno-pic-data-is-text-relative -mlong-calls -mpoke-function-name] flags and linked with the [-Wl,--gc-sections -mcpu=cortex-m4 -Ttext=0x3f000 -nostdlib -nostartfiles -ffreestanding -Wl,-dead-strip -Wl,-static -Wl,--entry=Reset_Handler -Wl,--cref -mthumb] flags. The linker script defines rom start=0.
The region of interest in the generated .lss file is shown below (omitted lines are marked with ...):
...
void Reset_Handler(void)
{
...
    /* Initialize the relocate segment */
...
    /* Clear the zero segment */
...
    /* Set the vector table base address */
    /* Initialize the C library */
//  __libc_init_array();
    /* Branch to main function */
    main();
   3f532:   4b08        ldr r3, [pc, #32]   ; (3f554 <Reset_Handler+0x5c>)
   3f534:   4798        blx r3
   3f536:   e7fe        b.n 3f536 <Reset_Handler+0x3e>
   3f538:   20000000    .word   0x20000000
   3f53c:   0003f59c    .word   0x0003f59c
   3f540:   20000004    .word   0x20000004
   3f544:   20000004    .word   0x20000004
   3f548:   20000008    .word   0x20000008
   3f54c:   e000ed00    .word   0xe000ed00
   3f550:   0003f000    .word   0x0003f000
   3f554:   0003f565    .word   0x0003f565
   3f558:   6e69616d    .word   0x6e69616d
   3f55c:   00          .byte   0x00
   3f55d:   00          .byte   0x00
   3f55e:   bf00        nop
   3f560:   ff000008    .word   0xff000008
0003f564 <main>:
int main (void)
    {
   3f564:   b510        push    {r4, lr}
...
        asm("bx %0"::"r"(*(unsigned*)0x3F004 - 0x3F000));
   3f580:   4b05        ldr r3, [pc, #20]   ; (3f598 <main+0x34>)
   3f582:   681b        ldr r3, [r3, #0]
   3f584:   f5a3 337c   sub.w   r3, r3, #258048 ; 0x3f000
   3f588:   4718        bx  r3
        }
    return 0;
    }
   3f58a:   2000        movs    r0, #0
   3f58c:   bd10        pop {r4, pc}
   3f58e:   bf00        nop
   3f590:   e0001000    .word   0xe0001000
   3f594:   0003f329    .word   0x0003f329
   3f598:   0003f004    .word   0x0003f004
...