Assembly Coroutines

Question

I'm trying to create a coroutine implementation, for fun, to be used with C programs. I'm running 64-bit Windows on an Intel x86-64 CPU. I'm compiling with Code::Blocks 17.12, which uses mingw32-gcc (MinGW.org GCC-6.3.0-1). My current implementation crashes with 0xC0000005 (access violation) when I try to call printf in the coroutine. I'm at a loss as to what the problem is and how to debug it.

I'm getting my inspiration from:

main.c

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

extern void swap32(void*, void*);

typedef struct {
    uintptr_t sp;
} Coro;

Coro mainco;
Coro otherco;

uint8_t costack[1024];

void cofunc(Coro* coro)
{
    printf("Hello");

    unsigned i;
    for(i = 10; i < 20; i++)
    {
        printf("coro: %u\n", i);

        swap32(&otherco, &mainco);
    }
}

void prepare(Coro* coro, void* stack, unsigned stacksize, void (*func)(Coro*))
{
    uint32_t* ptr;
    ptr    = (uint32_t*)((uint8_t*)stack + stacksize);
    ptr   -= 7;
    ptr[0] = 1;                 /* edi */
    ptr[1] = 2;                 /* esi */
    ptr[2] = 3;                 /* ebx */
    ptr[3] = (uint32_t)(ptr+5); /* ebp TODO: fix? */
    ptr[4] = (uint32_t)func;
    ptr[5] = (uint32_t)swap32;  /* return instruction address TODO: fix? */
    coro->sp = (uintptr_t)ptr;  /* sp is an integer type, so cast the pointer */
}

int main()
{
    prepare(&otherco, costack, 1024, cofunc);

    unsigned i;
    for(i = 0; i < 10; i++)
    {
        printf("main:  %u\n", i);

        swap32(&mainco, &otherco);
    }

    return 0;
}

coro.S

Assembly in AT&T syntax.

.global _swap32

// ecx: context for the coroutine that is being paused.
// edx: context for the coroutine that is being resumed.
_swap32:
    mov 4(%esp), %ecx
    mov 8(%esp), %edx
    # Save registers
    push %ebp
    push %ebx
    push %esi
    push %edi
    # Save old stack
    mov %esp, (%ecx)
    # Load new stack
    mov (%edx), %esp
    # Restore registers
    pop %edi
    pop %esi
    pop %ebx
    pop %ebp
    ret

Here is my general understanding:

When a coroutine yields/pauses, it needs to store its stack pointer, instruction pointer, and registers somewhere.
In my implementation, the return address (the next instruction pointer) is saved by the C call to swap32 itself, because the call instruction pushes the return address onto the current stack.
swap32 saves the current stack pointer and then loads the destination coroutine's stack pointer, so that when ret finally executes, it jumps to the instruction pointer saved for the destination coroutine.
A coroutine is initialized by setting up a stack frame as if it had called swap32. In my implementation, the register state is saved on the stack. Therefore, the initialized coroutine's stack would look like this (with decreasing addresses):
```
function parameters    <--- start of function's stack
return address         <--- where a normal ebp would point to
swap32 return address  <--- pushed because of call to swap32
saved context          <--- pushed because swap32 saves context to current stack
```
That way, swap32 pops the saved context, and ret then pops the saved function address and starts (or resumes) the coroutine.
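To make the layout above concrete, here is a minimal sketch in C that builds the initial frame with named slots instead of bare indices. The names `build_initial_frame`, `on_return`, and the `SLOT_*` constants are hypothetical and not part of the code in the question. The sketch assumes the pop order used by swap32 (edi, esi, ebx, ebp, then ret) and a cdecl-style callee that finds its return address and first argument directly above the entry slot. It only rounds the stack top down to a word boundary; any stricter ABI alignment is left out. Note that prepare in the question reserves seven words but fills only six, so the word that cofunc reads as its `coro` argument is never set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical slot indices, counted in machine words upward from the
   saved stack pointer. They mirror the pop order in swap32. */
enum {
    SLOT_EDI = 0,  /* popped first */
    SLOT_ESI,
    SLOT_EBX,
    SLOT_EBP,      /* popped last */
    SLOT_ENTRY,    /* consumed by swap32's ret: the coroutine function */
    SLOT_RETURN,   /* where the coroutine function itself would return to */
    SLOT_ARG,      /* first stack argument of the coroutine function */
    SLOT_COUNT
};

/* Build the initial frame at the top of `stack` and return the value to
   store as the coroutine's saved stack pointer. Returns NULL when the
   stack is missing or too small to hold the frame. */
static uintptr_t *build_initial_frame(void *stack, size_t stacksize,
                                      uintptr_t entry, uintptr_t on_return,
                                      uintptr_t arg)
{
    uintptr_t base, top;
    uintptr_t *sp;

    if (stack == NULL)
        return NULL;

    base = (uintptr_t)stack;
    /* Round the top of the stack down to a word boundary. */
    top = (base + stacksize) & ~(uintptr_t)(sizeof(uintptr_t) - 1);
    if (top < base || top - base < SLOT_COUNT * sizeof(uintptr_t))
        return NULL;

    sp = (uintptr_t *)top - SLOT_COUNT;
    sp[SLOT_EDI]    = 0;          /* initial register values are arbitrary */
    sp[SLOT_ESI]    = 0;
    sp[SLOT_EBX]    = 0;
    sp[SLOT_EBP]    = 0;          /* assumption: zero ends frame-pointer chains */
    sp[SLOT_ENTRY]  = entry;
    sp[SLOT_RETURN] = on_return;
    sp[SLOT_ARG]    = arg;
    return sp;
}
```

On a 32-bit build each slot is 4 bytes, which matches the push and pop instructions in swap32; on a 64-bit build the slots would be 8 bytes and the assembly would need to change as well. A call such as `build_initial_frame(costack, sizeof costack, (uintptr_t)cofunc, (uintptr_t)some_exit_handler, (uintptr_t)&otherco)` would then give the value for `otherco.sp`, where `some_exit_handler` is a placeholder for whatever should run if the coroutine function ever returns.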

What I see is this: if I remove the calls to printf from cofunc, then the coroutine yields back to main successfully. If I add the calls to printf back to cofunc, then my program crashes. So here is where my understanding runs out.

Question:

Why does it crash with printf and not crash without? Is it because I haven't initialized the coroutine correctly or is there another reason for it to be crashing?

Just a guess: If this is _64_ bit, are the `mov` instructions in `swap32` correct [and the amount of save/restore space]? You're using (e.g.) `%ecx` instead of `%rcx`, etc. The linked example is for 64 bit [and uses the "r" forms of the registers]. The ABI may be different, notably the "shadow space", for 32 bit mode. Also, I'm not sure if the asm syntax is intel (`mov dst,src`) or AT&T (`mov src,dst`). First line of `swap32` says intel (i.e. save `%ecx` to stack), but the `%esp` move insts say AT&T (i.e. they look reversed) — Craig Estey, Jul 07 '20 at 01:08
It's AT&T syntax. The linked example is 64 bit but I'm building it for 32 bit. I'll clarify `%ecx` and `%edx` in the question. `%ecx` is set to the current context that is being suspended and `%edx` is set to the destination context that is being resumed. For example, `main()` calls `swap32(&mainco, &otherco)` which sets `%ecx` to `&mainco` (current context) and `%edx` to `&otherco` (destination context). — thndrwrks, Jul 07 '20 at 01:59
@thndrwrks You can't just take a 64 bit example and build it for 32 bit because the calling conventions are different. Make sure to pay attention to that. — fuz, Jul 07 '20 at 09:22
I believe stacks have certain alignment requirements on x64 but I don't remember the requirements. Something like a multiple of 16 bytes plus 8? — user253751, Jul 07 '20 at 10:42
@fuz Yeah I've used the linked example as a hint for writing the code in the question for 32-bit. — thndrwrks, Jul 07 '20 at 18:19
@user253751 The program raises the SIGSEGV signal and returns error code 0xC0000005 which is access violation. — thndrwrks, Jul 07 '20 at 18:20
Did you use a debugger to find out which line of (assembly?) code crashes? — user253751, Jul 07 '20 at 18:48
The crashing instruction is `unlock+4062` which is `mov %eax,(%esp)`. I'm also trying to fix the alignment of the stack which is probably wrong in the code listed in the question. — thndrwrks, Jul 07 '20 at 19:04
Should the coroutine's stack be located in a particular section of memory? — thndrwrks, Jul 07 '20 at 20:02
Doh I'm dumb. I think I just figured it out. Alignment was not an issue. The stack for the coroutine was **way** too small. Resizing the coroutine's stack to 8192 bytes causes the program to not crash! — thndrwrks, Jul 07 '20 at 20:19
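The last comment suggests the 1024-byte stack was simply overrun once printf was called. One way to check a guess like that is to paint the coroutine stack with a known byte before use and later count how much of it was overwritten. The sketch below is an illustration only: `stack_paint`, `stack_high_water`, and the `STACK_FILL` value are hypothetical names, not part of the code in the question, and it assumes a stack that grows toward lower addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL 0xA5  /* arbitrary marker byte chosen for this sketch */

/* Fill the whole stack buffer with the marker byte. Call this before the
   initial frame is written into the buffer. */
static void stack_paint(uint8_t *stack, size_t size)
{
    memset(stack, STACK_FILL, size);
}

/* The stack grows downward, so the deepest use is the lowest address that
   no longer holds the marker. Returns the number of bytes between that
   address and the top of the buffer. */
static size_t stack_high_water(const uint8_t *stack, size_t size)
{
    size_t untouched = 0;

    while (untouched < size && stack[untouched] == STACK_FILL)
        untouched++;
    return size - untouched;
}
```

If `stack_high_water` reports the full buffer size after the coroutine has run, the stack was most likely exhausted and should be enlarged. The measurement is a heuristic: a byte that the coroutine happens to overwrite with the marker value is counted as untouched.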
