movq  [rbp],xmm0 overwrites the saved RBP value that enter pushed.  This would be more obvious if you hadn't used enter, but [rbp+0] is not an address you can use in a function with a stack frame.
([rbp-8] is the highest address you can use for locals.  [rsp] would have worked, because you decremented RSP after enter set RBP=RSP, but you used RBP.)
When execution returns to main, gcc -O0 (anti-optimized for debugging) runs these instructions to store the function return value from xmm0 into stack space for d_2 instead of just passing it directly to printf while it's still in a register:
movq   rax,xmm0
mov    QWORD PTR [rbp-0x8],rax    # Using RBP after you clobbered it.
Un-optimized gcc output is really silly: copying FP data to an integer register instead of storing directly with movsd makes no sense.  But that's not the issue.
RBP holds the IEEE double precision bit-pattern for 1.22 (0x3ff3851eb851eb85) because that's what your func clobbered it with.
The address rbp-8 is not canonical: the high 16 bits don't match bit 47, so it's not a sign-extended 48-bit virtual address.  (See this ASCII-art diagram).
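Using a non-canonical address on current x86-64 hardware raises an exception instead of a page fault. Because the addressing mode uses RBP as the base, the access implicitly goes through the stack segment, so it's #SS(0) rather than the #GP(0) you'd get for other non-canonical accesses (according to Intel's manual entry for mov), and Linux maps #SS to SIGBUS.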
This is why you get a bus error instead of the usual segmentation fault for trying to access unmapped memory with a bogus pointer.
Your code is over-complicated and wrong
In both mainstream x86-64 calling conventions (Linux/OS X use x86-64 System V), double is returned in xmm0. Use addsd xmm0,xmm0 / ret like a normal person, as the answer on the question you linked shows.
func:
    addsd   xmm0,xmm0   ; first FP arg in (low 64 bits of) xmm0
    ret                 ; return value in (low 64 bits of) xmm0
Or if you insist on x87, then look how much code you have to write:
func:
    movsd  [rsp-8], xmm0      ; double arg in xmm0
    fld    qword [rsp-8]
    fadd   st0, st0           ; use x87 regs instead of uselessly loading twice.
    fstp   qword [rsp-8]      ; empty the x87 stack
    movsd  xmm0, [rsp-8]      ; return value in xmm0
    ret
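That uses the 8 bytes below RSP (in the red zone) as scratch space: there is no instruction that moves data directly between SSE2 registers and x87 registers, so you have to store and reload through memory. You need that round trip because the x86-64 calling conventions are designed around SSE2 and pass FP values in xmm registers.  The red zone only exists in x86-64 System V, not in Windows x64. Use sub rsp, 8 / add rsp, 8 around the code (and [rsp] instead of [rsp-8]) if you don't want to use the red zone.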
Don't use x87 in x86-64 unless you need 80-bit floating-point precision.
(enter is slow and not recommended; make a stack frame with push rbp / mov rbp,rsp if you want one.  leave is fine, though.  Making a stack frame is optional; I left that out.)
printf doesn't need "%lf" to print a double, only scanf needs lf.  You can't printf a single-precision float, because C default promotion rules apply to args of variadic functions, and thus any float is promoted to double.
"%lf" works anyway in printf: C99 and later specify that the l modifier has no effect on the %f conversion, and most C implementations (including glibc) silently ignore it.
I mention this in case you later call printf with a "%f" format string from asm and run into How to print a single-precision float with printf.