Reason why DX and R10 are zero
According to the clone man page, these registers are used only when CLONE_PARENT_SETTID or CLONE_CHILD_SETTID is set:
CLONE_PARENT_SETTID (since Linux 2.5.49)
                Store the child thread ID at the location ptid in the parent's
                memory.  (In Linux 2.5.32-2.5.48 there was a flag CLONE_SETTID
                that did this.)  The store operation completes before clone()
                returns control to user space.
CLONE_CHILD_SETTID (since Linux 2.5.49)
                Store the child thread ID at the location ctid in the child's
                memory.  The store operation completes before clone() returns
                control to user space.
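DX and R10 correspond to ptid and ctid in this man page (Reference).
These flags are not set when runtime.clone() is called from os_linux.go: Source.
The runtime probably does not need the kernel to store the tid because, unlike a library such as pthread, it does not hand the tid to users who might do something complicated with it. The child instead fetches its own tid with gettid, as the runtime.clone source further down shows.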
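To make the flag check concrete, here is a minimal sketch in Go. The flag values are the ones from the kernel header linux/sched.h; the runtimeCloneFlags set is my reconstruction of what os_linux.go passes and should be treated as an assumption, and needsTIDPointers is a hypothetical helper, not a runtime function.

```go
package main

import "fmt"

// Flag values as defined in the Linux kernel header <linux/sched.h>.
const (
	cloneVM           = 0x00000100
	cloneFS           = 0x00000200
	cloneFiles        = 0x00000400
	cloneSighand      = 0x00000800
	cloneThread       = 0x00010000
	cloneSysvsem      = 0x00040000
	cloneParentSetTID = 0x00100000
	cloneChildSetTID  = 0x01000000
)

// runtimeCloneFlags is an ASSUMED reconstruction of the flag set that
// newosproc passes to clone; check os_linux.go for the authoritative list.
const runtimeCloneFlags = cloneVM | cloneFS | cloneFiles |
	cloneSighand | cloneSysvsem | cloneThread

// needsTIDPointers (hypothetical helper) reports whether the kernel would
// look at the ptid (DX) or ctid (R10) arguments for the given flags.
func needsTIDPointers(flags int) bool {
	return flags&(cloneParentSetTID|cloneChildSetTID) != 0
}

func main() {
	fmt.Println("runtime flags need ptid/ctid:", needsTIDPointers(runtimeCloneFlags))
	fmt.Println("with CLONE_PARENT_SETTID:", needsTIDPointers(runtimeCloneFlags|cloneParentSetTID))
}
```

With neither SETTID flag present, the kernel never dereferences ptid or ctid, so passing zero in DX and R10 is harmless.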
What R8, R9, and R12 are used for
In short, R8, R9 and R12 are not used by the system call itself; they are used to set up the new thread after it returns.
R8 and R9 are passed as system call arguments but clone ignores them (see the reasons below), and R12 is preserved across the system call, so it is safe to use all three registers after it. (Reference)
Let's look at the details.
Internally, runtime.clone is called as follows: Source
func newosproc(mp *m) {
    stk := unsafe.Pointer(mp.g0.stack.hi)
    ....
    ret := clone(cloneFlags, stk, unsafe.Pointer(mp), unsafe.Pointer(mp.g0), unsafe.Pointer(funcPC(mstart)))
    ....
}
Reading A Quick Guide to Go's Assembler and the code the OP posted, you can see that R8 is a pointer to mp, R9 is a pointer to mp.g0, and R12 is a pointer to the function you want to call in the cloned thread. (The structures of m and g look like this: Source and this: Source.)
R8 is the clone argument that carries tls (thread-local storage), but it is not used unless CLONE_SETTLS is set: Source
R9 is normally the 6th system call argument, but clone does not use it because it takes only 5 arguments (Source).
R12 is a register that is preserved across the system call.
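The register roles above can be summarized in a small Go sketch. It pairs the x86-64 Linux system call argument registers (in Go assembler naming) with the raw clone arguments in their x86-64 order; roleOf is a hypothetical helper written only for this illustration.

```go
package main

import "fmt"

// syscallArgRegs lists the x86-64 Linux system call argument registers,
// in argument order, using Go assembler register names.
var syscallArgRegs = []string{"DI", "SI", "DX", "R10", "R8", "R9"}

// cloneArgs names the arguments of the raw clone system call on x86-64,
// in order. There are only five, so the sixth register is left over.
var cloneArgs = []string{"flags", "stack", "ptid", "ctid", "tls"}

// roleOf (hypothetical helper) returns the clone argument carried in reg,
// or "" if clone does not read that register at all.
func roleOf(reg string) string {
	for i, r := range syscallArgRegs {
		if r == reg && i < len(cloneArgs) {
			return cloneArgs[i]
		}
	}
	return ""
}

func main() {
	for _, reg := range append(syscallArgRegs, "R12") {
		role := roleOf(reg)
		if role == "" {
			role = "(not a clone argument)"
		}
		fmt.Printf("%-3s -> %s\n", reg, role)
	}
}
```

R8 maps to tls, which the kernel only reads when CLONE_SETTLS is set, while R9 and R12 map to nothing, which is why the runtime is free to carry mp, gp and fn in them.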
Finally, let's look at the source of runtime.clone. The important part comes after the SYSCALL: in the newly created child thread, the code does some setup using R8 and R9, and finally calls R12.
// int32 clone(int32 flags, void *stk, M *mp, G *gp, void (*fn)(void));
TEXT runtime·clone(SB),NOSPLIT,$0
    MOVL    flags+0(FP), DI
    MOVQ    stk+8(FP), SI
    MOVQ    $0, DX
    MOVQ    $0, R10
    // Copy mp, gp, fn off parent stack for use by child.
    // Careful: Linux system call clobbers CX and R11.
    MOVQ    mp+16(FP), R8
    MOVQ    gp+24(FP), R9
    MOVQ    fn+32(FP), R12
    MOVL    $SYS_clone, AX
    SYSCALL
    // In parent, return.
    CMPQ    AX, $0
    JEQ 3(PC)
    MOVL    AX, ret+40(FP)
    RET
    // In child, on new stack.
    MOVQ    SI, SP
    // If g or m are nil, skip Go-related setup.
    CMPQ    R8, $0    // m
    JEQ nog
    CMPQ    R9, $0    // g
    JEQ nog
    // Initialize m->procid to Linux tid
    MOVL    $SYS_gettid, AX
    SYSCALL
    MOVQ    AX, m_procid(R8)
    // Set FS to point at m->tls.
    LEAQ    m_tls(R8), DI
    CALL    runtime·settls(SB)
    // In child, set up new stack
    get_tls(CX)
    MOVQ    R8, g_m(R9)
    MOVQ    R9, g(CX)
    CALL    runtime·stackcheck(SB)
nog:
    // Call fn
    CALL    R12
//(omitted)
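As a reading aid, the control flow after the SYSCALL in the listing above can be modelled in plain Go. This is only a simulation under stated assumptions: the m and g types are hypothetical stand-ins with just the fields touched here, afterSyscall is not a runtime function, and the tid is passed in rather than obtained from gettid.

```go
package main

import "fmt"

// m and g are hypothetical stand-ins for the runtime's M and G structures,
// reduced to the fields that the assembly touches.
type m struct{ procid uint64 }
type g struct{ m *m }

// afterSyscall models the instructions that follow SYSCALL in runtime·clone.
// ret plays the role of AX; r8, r9 and r12 play the roles of the registers.
// tid stands in for the result of the gettid system call.
// It returns a trace of the steps taken.
func afterSyscall(ret int32, r8 *m, r9 *g, r12 func(), tid uint64) []string {
	if ret != 0 {
		// Parent (or error): store AX in the return slot and return.
		return []string{"parent: return"}
	}
	trace := []string{"child: switch to new stack"}
	if r8 != nil && r9 != nil {
		r8.procid = tid // MOVQ AX, m_procid(R8)
		trace = append(trace, "set m.procid", "settls")
		r9.m = r8 // MOVQ R8, g_m(R9)
		trace = append(trace, "link g.m and store g in TLS", "stackcheck")
	}
	r12() // CALL R12
	return append(trace, "call fn")
}

func main() {
	mp, gp := &m{}, &g{}
	steps := afterSyscall(0, mp, gp, func() { fmt.Println("fn running in child") }, 42)
	for _, s := range steps {
		fmt.Println(s)
	}
}
```

The model makes the point of the section visible: nothing in the child path depends on values the kernel returned through R8, R9 or R12; they simply carry mp, gp and fn across the system call.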