What is the purpose of xorps on the same register?

Question

I was looking at the following disassembled C++ code:

    auto test2 = convert<years, weeks>(2.0);
00007FF6D6475ECC  mov         eax,16Dh  
00007FF6D6475ED1  xorps       xmm1,xmm1  
00007FF6D6475ED4  cvtsi2sd    xmm1,rax  
00007FF6D6475ED9  mulsd       xmm1,mmword ptr [__real@4000000000000000 (07FF6D64AFE38h)]  
00007FF6D6475EE1  divsd       xmm1,mmword ptr [__real@401c000000000000 (07FF6D64AFE58h)]

and was curious what the point of the xorps xmm1, xmm1 instruction was. It seems like any number XORed with itself would just give 0. If so, what's the purpose of clearing the register?

Note: I'm just asking this out of pure curiosity. I know very little about assembly language.

Yes, quickly setting all bits in xmm1 to 0 is the intention. The cvtsi2sd instruction only assigns bits 0..63 – Hans Passant, Jan 30 '16 at 15:55
@HansPassant ahh, is this an artifact of something like 'a c++ double (64 bits) is actually only a single-precision float on an x64 architecture'? I was kind of wondering as well why the single precision multiply/divide were being used for doubles. – Nicolas Holthaus, Jan 30 '16 at 15:57
@NicolasHolthaus: it's a *scalar* 64b double-precision variable, being operated on in a register wide enough to hold two doubles packed together. (The packed forms would be `mulPd` / `divPd`; the `sd` suffix means scalar double.) – Peter Cordes, Jan 30 '16 at 19:18

score 8 · Accepted Answer · edited May 23 '17 at 12:33

The XMM register is 128 bits wide, and cvtsi2sd writes only the low 64 bits, leaving the rest unchanged. The xorps instruction is therefore used to clear any stale value in the register and to break the dependency on its previous contents, which would otherwise affect subsequent operations.

Basically, the sequence of operations you have is:

    mov         eax, 16Dh       ; load 0x16D (365) into EAX; writing EAX zero-extends into RAX
    xorps       xmm1, xmm1      ; zero all 128 bits of xmm1
    cvtsi2sd    xmm1, rax       ; convert the integer in RAX to a double in the low 64 bits of xmm1
    <do more stuff with xmm1>
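
As a rough sketch of what the whole sequence from the question computes, the C++ below decodes the two constants and repeats the arithmetic. It assumes the `__real@...` names spell out the IEEE-754 bit patterns of the constants. Reading 365 as days per year and 7 as days per week is a guess based on the call `convert<years, weeks>(2.0)`, and the function names are made up for illustration.

```cpp
#include <cstdint>
#include <cstring>

// Reinterprets a 64-bit pattern as an IEEE-754 double.
inline double double_from_bits(std::uint64_t bits) {
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}

// Hypothetical helper: mirrors the disassembly step by step.
inline double years_to_weeks_as_compiled() {
    // mov eax,16Dh : 0x16D == 365 (assumed to be days per year)
    const std::int64_t days_per_year = 0x16D;
    // __real@4000000000000000 decodes to 2.0 (the function argument)
    const double years = double_from_bits(0x4000000000000000ULL);
    // __real@401c000000000000 decodes to 7.0 (assumed to be days per week)
    const double days_per_week = double_from_bits(0x401C000000000000ULL);

    double x = static_cast<double>(days_per_year);  // cvtsi2sd xmm1,rax
    x *= years;                                     // mulsd
    x /= days_per_week;                             // divsd
    return x;
}
```

Under those assumptions, `years_to_weeks_as_compiled()` evaluates 365 * 2.0 / 7.0, which is roughly 104.29.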

Zeroing a register is often needed in assembly when an instruction writes only part of a register and subsequent instructions operate on its full width. Doing xor x, x is one of the usual register-clearing patterns.
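
To see the partial write in isolation, here is a minimal sketch using SSE2 intrinsics (x86/x86-64 only). `_mm_cvtsi32_sd` is the 32-bit-source intrinsic for cvtsi2sd, and `_mm_xor_pd` stands in for the xorps idiom (it is nominally xorpd, and the compiler is free to choose the actual instruction). The `Lanes` struct, the helper names and the stale values 123.0 and 456.0 are invented for the example.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics (x86/x86-64 only)
#include <cstdint>
#include <cstring>

// Raw bit patterns of the two 64-bit lanes of a 128-bit XMM value.
struct Lanes {
    std::uint64_t low;
    std::uint64_t high;
};

inline Lanes lanes_of(__m128d v) {
    Lanes out;
    std::memcpy(&out, &v, sizeof out);  // x86 is little-endian: low lane first
    return out;
}

// Models cvtsi2sd on a register that still holds leftover data.
inline Lanes convert_without_clearing(int value) {
    __m128d stale = _mm_set_pd(123.0, 456.0);   // high = 123.0, low = 456.0
    return lanes_of(_mm_cvtsi32_sd(stale, value));  // only the low lane changes
}

// Models the xor-zeroing idiom followed by cvtsi2sd.
inline Lanes convert_after_clearing(int value) {
    __m128d stale = _mm_set_pd(123.0, 456.0);
    __m128d zero = _mm_xor_pd(stale, stale);    // x XOR x is 0 in every bit
    return lanes_of(_mm_cvtsi32_sd(zero, value));
}
```

`convert_without_clearing(365)` should return a nonzero `high` (the bits of 123.0), while `convert_after_clearing(365)` returns `high == 0`. In both cases `low` holds the bits of 365.0.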

See also this (very exhaustive and great, as per comments) answer for more details on why xor can be preferable to alternatives such as mov x, 0 or and x, 0.

answered Jan 30 '16 at 16:03

Zdeněk Jelínek


Most of the more-subtle reasons for using xor-zeroing only apply to integer registers (see my answer at http://stackoverflow.com/questions/33666617/which-is-best-way-to-set-a-register-to-zero-in-x86-assembly-xor-mov-or-and), but not consuming an execution unit or a physical register file entry (Intel SnB-family) still applies. Also, for a vector reg, there *is* no mov-immediate form. It might be nice if there was a `vpbroadcastd v, imm32`, but there isn't. You're right that `xorps` is better than `psubd same,same` or something, though. Prob. some CPUs don't break dep chains for psub. – Peter Cordes Jan 30 '16 at 19:11
Also, in this case clearing the reg first is probably more to break the dependency on the previous value of `xmm1`. Garbage in the upper 64 bits won't cause slowdowns or faults as long as the code only uses further scalar instructions, not ...PD (packed double) ones. And there is a scalar version of every instruction (except shuffles of course). – Peter Cordes Jan 30 '16 at 19:13
Both of those links just link to the question, when I think you meant to link to some of the answers. – Puppy Jan 30 '16 at 21:09
@Puppy: If you're talking about my link: Yes, I was talking about the answer I wrote on it. I linked to the question because it's short and provides the context that the answer is replying to. Also, I sometimes feel a bit hubristic linking to just my answer and not the OP's question. I don't think the answer Zdeněk picked is the best one, BTW. I had to leave a comment on it. – Peter Cordes Jan 31 '16 at 12:30
@PeterCordes Fixed :) – Zdeněk Jelínek Jan 31 '16 at 12:53
Thanks :) Now if only all the other duplicate "why xor?" questions would also link to my answer there, more stackoverflow readers would be getting the full story. (And voting up my nerd-point totals, but honestly having only half the story in the answers on other questions, with no hints at the deeper picture, is what bothers me more. I only recently learned about xor solving the partial-register problem, for example.) I've left comments in some places on some of the many highly-voted questions. – Peter Cordes Jan 31 '16 at 12:59
