I've seen that gcc does a great job in optimizing integer divisions with a constant divisor.
E.g.
uint64_t f(uint64_t x)
{
    return x / 1000;
}
uint64_t f2(uint64_t x)
{
    return x / 10000;
}
becomes:
f:
        push    rbp
        mov     rbp, rsp
        mov     QWORD PTR [rbp-8], rdi
        mov     rax, QWORD PTR [rbp-8]
        shr     rax, 3
        movabs  rdx, 2361183241434822607
        mul     rdx
        mov     rax, rdx
        shr     rax, 4
        pop     rbp
        ret
f2:
        push    rbp
        mov     rbp, rsp
        mov     QWORD PTR [rbp-8], rdi
        mov     rax, QWORD PTR [rbp-8]
        movabs  rdx, 3777893186295716171
        mul     rdx
        mov     rax, rdx
        shr     rax, 11
        pop     rbp
        ret
It seems all divisions can be reduced to a shift + multiplication + shift (not always are all of those operations included).
How is this obtained and how can I get manually the required numbers for those three operations? I specifically need to know this for uint64_t,
Thanks a lot
