On my x86_64 machine,
I used objdump -d to check the encoding of the following two instructions:
movzbl (%rdi),%eax: encoded in 3 bytes (0f b6 07)
movzbq (%rdi),%rax: encoded in 4 bytes (48 0f b6 07)
Because writing a 32-bit register implicitly zeroes the upper 32 bits of the corresponding 64-bit register,
movzbl achieves the same data movement as movzbq with one fewer byte of encoding (it does not need the 48 REX.W prefix).
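
The equivalence described above can be checked at the C level with a minimal sketch. The function names (zx_via_32, zx_via_64, check_all_bytes) are hypothetical, made up for illustration. The first function widens the byte to 32 bits and then to 64, the second widens it directly to 64 bits; which instruction a compiler actually picks for each is not guaranteed and depends on the compiler and its options.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: zero-extend the byte to 32 bits first, then
   widen to 64 bits. On x86-64 the 32-bit write already clears the
   upper half of the destination register. */
uint64_t zx_via_32(const uint8_t *p) {
    uint32_t lo = *p;
    return lo;
}

/* Hypothetical helper: zero-extend the byte directly to 64 bits. */
uint64_t zx_via_64(const uint8_t *p) {
    return (uint64_t)*p;
}

/* Sanity check: both forms give the same 64-bit value for every byte. */
void check_all_bytes(void) {
    for (int v = 0; v < 256; v++) {
        uint8_t b = (uint8_t)v;
        assert(zx_via_32(&b) == zx_via_64(&b));
        assert(zx_via_64(&b) == (uint64_t)v);
    }
}
```

Compiling this with something like gcc -O2 -c and running objdump -d on the object file shows which encoding was chosen for each function.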
When would a compiler prefer movzbq over movzbl, even though movzbq takes an extra byte?