I had a university supervision assignment to write a function in assembly that
takes three 32-bit unsigned numbers a, b, d and
returns the result (a * b) / d.
The C declaration of this function is:
unsigned int muldiv(const unsigned int a, 
                    const unsigned int b, 
                    const unsigned int d);
Note that we want to be sure that no unnecessary overflow or underflow happens. For example,
if a = 2^31, b = 2^31, d = 2^31,
the answer should be 2^31, despite the fact that a * b would overflow 32 bits. (See the clarification on that below.)
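To make the overflow in this example concrete, here is a minimal sketch in C. The helper names product32, product64 and check_overflow_example are hypothetical and not part of the assignment, and the sketch assumes unsigned int is 32 bits wide, as it is on common x86-64 platforms.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: multiply in 32 bits; the result wraps modulo 2^32. */
uint32_t product32(uint32_t a, uint32_t b) {
    return a * b;
}

/* Hypothetical helper: widen first, so the full 64-bit product is kept. */
uint64_t product64(uint32_t a, uint32_t b) {
    return (uint64_t)a * b;
}

/* Checks the a = b = d = 2^31 case from the text. */
void check_overflow_example(void) {
    const uint32_t x = (uint32_t)1 << 31;
    assert(product32(x, x) == 0);      /* 2^62 mod 2^32 is 0 */
    assert(product64(x, x) / x == x);  /* 2^62 / 2^31 is 2^31 */
}
```

With a = b = 2^31 the 32-bit product wraps to 0, because 2^62 is a multiple of 2^32, while the 64-bit product keeps the value 2^62, and dividing that by d = 2^31 gives the expected 2^31.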
I first wrote a simple function in C which works, compiled it to machine code, disassembled it back to assembly code, and finally removed some instructions that looked unnecessary.
My final piece of assembly code is:
muldiv:
    imulq   %rsi, %rax
    xorl    %edx, %edx
    divq    %rcx
    ret  
This works when built into an executable and checked on several test cases. However, I do not properly understand what is happening in this code.
Hence, could anyone explain to me why this code works (or perhaps it does not?), in particular:
- Why does the divq %rcx instruction use only one register? I assume this is the division part, so how does it know which two arguments to use?
- How does it know that when I call muldiv from another place, the arguments a, b and d are stored in the registers %rsi, %rax, etc., and not somewhere else?
- Why is xorl %edx, %edx necessary? When it is removed, I get a runtime error.
- How does it multiply long long numbers using only one instruction, if the machine can operate only on 32-bit numbers?
Clarification of overflow and underflow: this function should return the result as if we were operating on unsigned 64-bit numbers. The code in C is as follows:
// NOTE: when compiled to assembly code, I removed A LOT of instructions,
// but it still works
unsigned int muldiv(const unsigned int a, 
    const unsigned int b, 
    const unsigned int d) {
    const unsigned long long la = a;
    const unsigned long long lb = b;
    const unsigned long long ld = d;
    const unsigned long long ab = la * lb;
    const unsigned long long ab_over_d = ab / ld;
    return (unsigned int) ab_over_d;
}
And it worked when called like this:
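#include <stdio.h>
#include "muldiv.h"
int main(void) { 
   /* 1u keeps the shift unsigned; (1 << 31) overflows a signed 32-bit int */
   unsigned int a = (1u << 31);
   unsigned int b = (1u << 31);
   unsigned int d = (1u << 31);
   unsigned int result = muldiv(a, b, d);
   printf("%u\n", result);  // prints 2147483648, i.e. (1u << 31), which is correct.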
   return 0;
}
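For reference, the kind of test cases mentioned above could look like the following sketch. The helper check_muldiv and the chosen inputs are hypothetical, and muldiv here is a condensed form of the C version from this question, standing in for the assembly routine. It again assumes a 32-bit unsigned int.

```c
#include <assert.h>
#include <limits.h>

/* Condensed C reference version; the assembly muldiv could be linked in instead. */
unsigned int muldiv(const unsigned int a,
                    const unsigned int b,
                    const unsigned int d) {
    const unsigned long long ab = (unsigned long long)a * b;
    return (unsigned int)(ab / d);
}

/* Hypothetical helper: aborts on the first case that gives a wrong result. */
void check_muldiv(void) {
    assert(muldiv(6u, 7u, 3u) == 14u);
    assert(muldiv(7u, 2u, 4u) == 3u);   /* the quotient is truncated */
    assert(muldiv(0u, 12345u, 7u) == 0u);
    /* The intermediate product 2^62 does not fit in 32 bits. */
    assert(muldiv(1u << 31, 1u << 31, 1u << 31) == (1u << 31));
    /* Largest inputs: (2^32 - 1)^2 still fits in 64 bits. */
    assert(muldiv(UINT_MAX, UINT_MAX, UINT_MAX) == UINT_MAX);
}
```

Calling check_muldiv() from main would run all the cases. None of them passes d = 0, and all of them have a quotient that fits in 32 bits, so the sketch says nothing about those situations.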