I'm writing a coroutine implementation for C programs, just for fun. I'm on 64-bit Windows (Intel x86-64) and compiling with Code::Blocks 17.12, which uses mingw32-gcc (MinGW.org GCC-6.3.0-1). My current implementation crashes with 0xC0000005 (access violation) when I try to call printf in the coroutine. I'm at a loss as to what the problem is and how to debug it.
I'm getting my inspiration from:
- Fast fibers/coroutines under x64 Windows which links to:
- https://the8bitpimp.wordpress.com/2014/10/21/coroutines-x64-and-visual-studio/
main.c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
extern void swap32(void*, void*);
typedef struct {
    uintptr_t sp;
} Coro;
Coro mainco;
Coro otherco;
uint8_t costack[1024];
void cofunc(Coro* coro)
{
    printf("Hello");
    unsigned i;
    for(i = 10; i < 20; i++)
    {
        printf("coro: %d\n", i);
        swap32(&otherco, &mainco);
    }
}
void prepare(Coro* coro, void* stack, unsigned stacksize, void (*func)(Coro*))
{
    uint32_t* ptr;
    ptr    = (uint32_t*)((uint8_t*)stack + stacksize);
    ptr   -= 7;
    ptr[0] = 1;                 /* edi */
    ptr[1] = 2;                 /* esi */
    ptr[2] = 3;                 /* ebx */
    ptr[3] = (uint32_t)(ptr+5); /* ebp TODO: fix? */
    ptr[4] = (uint32_t)func;
    ptr[5] = (uint32_t)swap32;  /* return instruction address TODO: fix? */
    coro->sp = (uintptr_t)ptr;  /* sp is a uintptr_t, so convert the pointer explicitly */
}
int main()
{
    prepare(&otherco, costack, 1024, cofunc);
    unsigned i;
    for(i = 0; i < 10; i++)
    {
        printf("main:  %d\n", i);
        swap32(&mainco, &otherco);
    }
    return 0;
}
coro.S
Assembly in AT&T syntax.
.global _swap32
// ecx: context for the coroutine that is being paused.
// edx: context for the coroutine that is being resumed.
_swap32:
    mov 4(%esp), %ecx
    mov 8(%esp), %edx
    # Save registers
    push %ebp
    push %ebx
    push %esi
    push %edi
    # Save old stack
    mov %esp, (%ecx)
    # Load new stack
    mov (%edx), %esp
    # Restore registers
    pop %edi
    pop %esi
    pop %ebx
    pop %ebp
    ret
Here is my general understanding:
- When a coroutine yields/pauses, it needs to store its stack pointer, instruction pointer, and registers somewhere. 
- In my implementation, the return address (next instruction pointer) is stored by the C call to swap32, because calling swap32 pushes the return address onto the current stack.
- swap32 saves the current stack pointer, then swaps in the stack pointer of the destination coroutine, so that when ret is finally executed, the saved instruction pointer of the destination coroutine is the one that gets popped and jumped to.
- A coroutine is initialized by setting up a stack frame as if it had called swap32. In my implementation, the register state is saved on the stack. Therefore, the initialized coroutine's stack would look like this (with decreasing addresses):

      function parameters   <--- start of function's stack
      return address        <--- where a normal ebp would point to
      swap32 return address <--- pushed because of call to swap32
      saved context         <--- pushed because swap32 saves context to current stack

  That way, swap32 pops the saved context, followed by ret popping the saved function and resuming the coroutine.
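To make that layout concrete, here is a minimal sketch of how I picture the initial frame, with one named slot per value that swap32 pops. The helper build_initial_frame and the SLOT_* names are hypothetical (they are not in my code above), and I use uintptr_t slots only so the sketch compiles anywhere; the layout itself is only meaningful for a 32-bit cdecl build like mine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Slot order matches the pops in swap32: edi, esi, ebx, ebp, then ret.
   All names here are hypothetical, for illustration only. */
enum {
    SLOT_EDI,
    SLOT_ESI,
    SLOT_EBX,
    SLOT_EBP,
    SLOT_ENTRY,   /* popped by the ret at the end of swap32 */
    SLOT_RETURN,  /* where the entry function would return to */
    SLOT_ARG0,    /* first cdecl argument seen by the entry function */
    SLOT_COUNT
};

/* Writes the initial frame at the top of `stack` and returns a pointer
   to its lowest slot, i.e. the value to store as the saved stack pointer. */
static uintptr_t *build_initial_frame(void *stack, size_t stacksize,
                                      uintptr_t entry, uintptr_t ret,
                                      uintptr_t arg0)
{
    uintptr_t top = (uintptr_t)stack + stacksize;
    uintptr_t *frame;

    /* Room for every slot plus the bytes lost to alignment below. */
    assert(stacksize >= (SLOT_COUNT + 1) * sizeof(uintptr_t));

    /* Align the top of the stack down to a slot boundary. */
    top &= ~(uintptr_t)(sizeof(uintptr_t) - 1);
    frame = (uintptr_t *)top - SLOT_COUNT;

    frame[SLOT_EDI]    = 0;
    frame[SLOT_ESI]    = 0;
    frame[SLOT_EBX]    = 0;
    frame[SLOT_EBP]    = (uintptr_t)&frame[SLOT_RETURN]; /* mirrors ptr+5 in prepare */
    frame[SLOT_ENTRY]  = entry;
    frame[SLOT_RETURN] = ret;
    frame[SLOT_ARG0]   = arg0;
    return frame;
}
```

Compared with my prepare function, the only differences are that the argument slot is filled in explicitly and the top of the stack is aligned before the frame is written.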
What I see is this: if I remove the calls to printf from cofunc, then the coroutine yields back to main successfully. If I add the calls to printf back to cofunc, then my program crashes. So here is where my understanding runs out.
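One check I could add while debugging is to measure how much of the coroutine stack actually gets used. The sketch below assumes the stack size might be a suspect, which I have not confirmed. The names paint_stack, untouched_bytes and STACK_PAINT are hypothetical helpers, not part of my code above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Arbitrary fill byte; any value unlikely to be written by chance works. */
#define STACK_PAINT 0xA5

/* Fill the whole stack buffer with the pattern before the coroutine runs. */
static void paint_stack(uint8_t *stack, size_t size)
{
    memset(stack, STACK_PAINT, size);
}

/* Count bytes at the low end that still hold the pattern. The stack grows
   downward on x86, so the lowest addresses are the last ones to be used. */
static size_t untouched_bytes(const uint8_t *stack, size_t size)
{
    size_t n = 0;
    while (n < size && stack[n] == STACK_PAINT)
        n++;
    return n;
}
```

The idea would be to call paint_stack(costack, sizeof costack) before prepare, run the version without printf, and then look at untouched_bytes(costack, sizeof costack). A result near zero would suggest the stack is nearly exhausted even without printf. The count is only an estimate, since code that skips over bytes or happens to write the fill value makes the stack look less used than it is.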
Question:
Why does it crash with printf and not crash without? Is it because I haven't initialized the coroutine correctly or is there another reason for it to be crashing?
