So I am learning how x86 works and have come across people saying that it is byte-addressable, yet can read words, double words, etc. How does the processor decide which method to use and when? E.g. for accessing the next instruction and when a user wants to read/write to memory, which addressing mode is used?
1 Answer
Every memory access has an operand-size specified by the machine-code instruction.  (Addressing mode isn't the right term: different addressing modes are different ways of specifying the lowest address of the chunk of memory to be accessed, like [rdi] vs. [rdi + rdx*8] vs. [RIP + rel32])
Encoding different operand-sizes is done with prefixes (for 16 vs. 32 vs. 64-bit for integer instructions) or a different opcode for the same mnemonic (8-bit integer). Or with bits in the VEX or EVEX prefix for AVX / AVX512 instructions that can use xmm, ymm, or zmm registers.
Decoding also depends on the current mode, which implies the default operand-size: 32 for 32 and 64-bit mode, or 16 for 16-bit mode.  A 66 operand-size prefix selects the other size: 16 in 32/64-bit mode, or 32 in 16-bit mode.
In 64-bit mode, the .W (width) bit in the REX prefix sets the operand-size to 64-bit.  (And some instructions like push/pop default to 64-bit operand-size with no prefix needed, but most instructions like add/sub/mov still default to 32-bit)
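To make the rules above concrete, here's a minimal sketch of how the operand-size falls out of the mode and prefixes for a typical integer instruction like add/sub/mov.  The function name and parameters are made up for illustration, and it deliberately ignores the special cases (push/pop defaulting to 64-bit, VEX/EVEX-encoded instructions, etc.).

```python
def operand_size(mode_bits, has_66_prefix=False, rex_w=False, is_byte_opcode=False):
    """Operand-size in bits for a typical integer instruction (hypothetical helper).

    mode_bits:      16, 32, or 64 (the current mode)
    has_66_prefix:  a 66 operand-size prefix is present
    rex_w:          REX.W is set (only meaningful in 64-bit mode)
    is_byte_opcode: the opcode is the separate 8-bit form of the mnemonic
    """
    if mode_bits not in (16, 32, 64):
        raise ValueError("mode_bits must be 16, 32, or 64")
    if rex_w and mode_bits != 64:
        raise ValueError("REX prefixes only exist in 64-bit mode")

    if is_byte_opcode:
        return 8    # 8-bit forms have their own opcode; prefixes don't change that
    if rex_w:
        return 64   # REX.W wins even if a 66 prefix is also present

    default = 16 if mode_bits == 16 else 32
    if has_66_prefix:
        return 32 if default == 16 else 16   # 66 selects the non-default size
    return default
```

For example, `operand_size(64)` gives 32 and `operand_size(64, rex_w=True)` gives 64, matching the default and REX.W cases described above.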
There's also a 0x67 address-size prefix which swaps addressing modes to the other size.  (16 vs. 32 or in 64-bit mode 64 -> 32.)
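The address-size rule can be sketched the same way.  This is only an illustration with a made-up function name, and it assumes the default address-size simply matches the current mode.

```python
def address_size(mode_bits, has_67_prefix=False):
    """Address-size in bits (hypothetical helper).

    Assumes the default address-size equals the mode: 16, 32, or 64.
    A 67 prefix switches 16 <-> 32, or 64 -> 32 in 64-bit mode.
    """
    if mode_bits not in (16, 32, 64):
        raise ValueError("mode_bits must be 16, 32, or 64")
    if not has_67_prefix:
        return mode_bits
    return {16: 32, 32: 16, 64: 32}[mode_bits]
```

Note that this is independent of the operand-size: the address-size only controls how the address is computed, not how many bytes are loaded or stored.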
For example, mov [rdi], eax is a dword store, and the machine-code encoding will specify that by using no special prefixes on the opcode for 16/32/64-bit operand-size.  (see https://www.felixcloutier.com/x86/mov for the available encodings.  But note that Intel's manual doesn't mention 66 operand-size prefixes in each entry: it has 2 identical encodings with different sizes.  You have to know which one needs a 66 prefix based on the current mode's default.)
16-bit operand-size like mov [rdi], ax will have the same machine code but with a 66 operand-size prefix.
8-bit operand-size (mov [rdi], al) has its own opcode, no prefixes needed.
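As a rough illustration of the three stores above (plus the REX.W version), here's a toy decoder that looks only at the 66 prefix, an optional REX prefix, and the two store opcodes 88 /r and 89 /r.  It's a sketch, not a real disassembler: the function name is invented, and it doesn't parse ModRM or any other prefixes.

```python
def mov_store_size(code, mode_bits=64):
    """Memory-access size in bytes for MOV r/m, r (opcodes 88 /r and 89 /r only).

    Hypothetical toy decoder: handles an optional 66 prefix and, in 64-bit
    mode, an optional REX prefix.  Everything after the opcode is ignored.
    """
    i = 0
    has_66 = False
    rex_w = False

    # Legacy operand-size prefix(es) come first.
    while i < len(code) and code[i] == 0x66:
        has_66 = True
        i += 1

    # In 64-bit mode, bytes 40..4F are REX prefixes; bit 3 is REX.W.
    if mode_bits == 64 and i < len(code) and 0x40 <= code[i] <= 0x4F:
        rex_w = bool(code[i] & 0x08)
        i += 1

    if i >= len(code):
        raise ValueError("no opcode byte found")
    opcode = code[i]

    if opcode == 0x88:      # MOV r/m8, r8: byte store, has its own opcode
        return 1
    if opcode != 0x89:      # MOV r/m16/32/64, r16/32/64
        raise ValueError("only opcodes 88 and 89 are handled in this sketch")

    if rex_w:
        return 8            # REX.W: qword store
    default = 2 if mode_bits == 16 else 4
    if has_66:
        return 4 if default == 2 else 2
    return default


examples = {
    "mov [rdi], eax": bytes.fromhex("8907"),
    "mov [rdi], ax":  bytes.fromhex("668907"),
    "mov [rdi], al":  bytes.fromhex("8807"),
    "mov [rdi], rax": bytes.fromhex("488907"),
}
for asm, code in examples.items():
    print(f"{asm:16} {code.hex(' '):10} -> {mov_store_size(code)} byte(s)")
```

The same 89 07 bytes decode as a 2-byte store in 16-bit mode (`mov_store_size(bytes.fromhex("8907"), mode_bits=16)`), which is why you need to know the current mode's default to know which encoding needs the 66 prefix.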
movzx / movsx are interesting cases: the memory access size is different from the destination register.  The memory-access size (byte or word) is specified by the opcode.  Operand-size prefixes only affect the destination size.  Except x86-64 63 /r movsxd (dword->qword sign-extension) where a 66 operand-size prefix does shrink the memory-access size down to m16 to match the destination.
Similarly for SIMD instructions; the instruction encoding uniquely determines the memory-access size, along with the registers read or written.
- Can we demonstrate by means of a simple example? If I say MOV AL, [0110], can I safely say it is a "word instruction" because the operand, 0110, is two bytes large? – Dean P Oct 27 '20 at 16:44
- @DeanP: For that instruction, the *address-size* is "word", assuming you're in 16-bit mode so a bare `0110` is interpreted as a 16-bit (hex?) number. The *operand-size* is "byte", because `mov` is transferring 1 byte from memory to a byte register, AL. – Peter Cordes Oct 27 '20 at 21:31