All the top rated answers are not actually definitive "facts"... they are people who are speculating!
You can definitively know for a fact which code takes less assembly instructions to execute because you can look at the output assembly generated by the compiler and see which executes in less assembly instructions!
Here is the c code I compiled with flags "gcc -std=c99 -S -O3 lookingAtAsmOutput.c":
#include <stdio.h>
#include <stdlib.h>
void swap_traditional(int * restrict a, int * restrict b)
{
    int temp = *a;
    *a = *b;
    *b = temp;
}
void swap_xor(int * restrict a, int * restrict b)
{
    *a ^= *b;
    *b ^= *a;
    *a ^= *b;
}
int main() {
    int a = 5;
    int b = 6;
    swap_traditional(&a,&b);
    swap_xor(&a,&b);
}
ASM output for swap_traditional() takes >>> 11 <<< instructions ( not including "leave", "ret", "size"):
.globl swap_traditional
    .type   swap_traditional, @function
swap_traditional:
    pushl   %ebp
    movl    %esp, %ebp
    movl    8(%ebp), %edx
    movl    12(%ebp), %ecx
    pushl   %ebx
    movl    (%edx), %ebx
    movl    (%ecx), %eax
    movl    %ebx, (%ecx)
    movl    %eax, (%edx)
    popl    %ebx
    popl    %ebp
    ret
    .size   swap_traditional, .-swap_traditional
    .p2align 4,,15
ASM output for swap_xor() takes >>> 11 <<< instructions not including "leave" and "ret":
.globl swap_xor
    .type   swap_xor, @function
swap_xor:
    pushl   %ebp
    movl    %esp, %ebp
    movl    8(%ebp), %ecx
    movl    12(%ebp), %edx
    movl    (%ecx), %eax
    xorl    (%edx), %eax
    movl    %eax, (%ecx)
    xorl    (%edx), %eax
    xorl    %eax, (%ecx)
    movl    %eax, (%edx)
    popl    %ebp
    ret
    .size   swap_xor, .-swap_xor
    .p2align 4,,15
Summary of assembly output:
swap_traditional() takes 11 instructions
swap_xor() takes 11 instructions
Conclusion:
Both methods use the same amount of instructions to execute and therefore are approximately the same speed on this hardware platform.
Lesson learned:
When you have small code snippets, looking at the asm output is helpful to rapidly iterate your code and come up with the fastest ( i.e. least instructions ) code.  And you can save time even because you don't have to run the program for each code change. You only need to run the code change at the end with a profiler to show that your code changes are faster.  
I use this method a lot for heavy DSP code that needs speed.