In AT&T assembly syntax, when using the `leaq` instruction, must its first operand be a memory address rather than a register or a constant (prefixed with `$`)? Must its second operand be a register? I got that impression from reading Computer Systems: A Programmer's Perspective, and have never seen an example that contradicts it. Thanks.
-
Yes, that is correct. When in doubt, consult the official instruction set reference. Note that a memory address may of course use a register or a displacement (without the `$` as that would be an immediate). – Jester Oct 25 '18 at 22:11
-
Thanks. I was looking for a reference document on the Internet, but I am not sure where it is. – Oct 25 '18 at 22:12
-
[Intel® 64 and IA-32 Architectures Software Developer Manuals](https://software.intel.com/en-us/articles/intel-sdm) PS: uses intel syntax not at&t so you have to do some mental work :) – Jester Oct 25 '18 at 22:13
-
@Jester Thanks. What kinds of mental work? – Oct 25 '18 at 22:21
-
Translating from intel to at&t. E.g. From `[base+index*scale+displacement]` to `displacement(base, index, scale)` – Jester Oct 25 '18 at 22:37
-
@fuz, you might want to remove that comment and try again. – prl Oct 26 '18 at 03:26
-
Also flipping the order of operands (Intel is *dest, src* while AT&T is *src, dest*). – fuz Oct 26 '18 at 09:13
2 Answers
Yes, that is correct. While a lea with two register operands can technically be encoded, such an encoding is invalid and leads to a #UD exception. See this reference or this one for details.
Even if it was encodable, you'd never want to use it.
If you want to put a constant in a register, you should never use `lea`. `mov $1234, %eax` is shorter and more efficient than `lea 1234, %eax` (absolute address in a disp32 addressing mode).
The only use-case for LEA for static addresses is 64-bit code with RIP-relative addressing modes, like `lea symbol(%rip), %rax` (7 bytes), in cases where `mov $symbol, %eax` (5 bytes) is not usable because you need position-independent code, and/or the address doesn't fit in a 32-bit zero-extended immediate.
See "Difference between movq and movabsq in x86-64" for more about why `mov $symbol, %rdi` is not the best choice.
In 32-bit code, `lea symbol, %edi` is 6 bytes (opcode + ModRM + disp32), and runs only on port 1 or port 5 on Intel Sandybridge-family CPUs. (https://agner.org/optimize/)
`mov $symbol, %edi` is 5 bytes (opcode + imm32 short form with no ModRM byte), and runs on any ALU port.
Same for 16-bit code: `mov $symbol, %di` is 3 bytes, while `lea symbol, %di` is 4 bytes, with the same execution-port differences. (Or in NASM syntax, `lea di, [symbol]` vs. `mov di, symbol`, or `mov di, OFFSET symbol` in GAS `.intel_syntax` or MASM.)
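To make those byte counts concrete, here is a small C sketch that spells out the machine-code bytes of each form. The array names, the placeholder address `0x12345678` (`0x1234` in 16-bit code), and the zero RIP-relative displacement are assumptions for illustration only; a real assembler and linker would fill in the actual address of `symbol`.

```c
#include <assert.h>  /* static_assert (C11) */

/* Placeholder little-endian address bytes stand in for `symbol`. */

/* 32-bit mode: lea symbol, %edi   -> opcode 8D, ModRM 3D, disp32 */
const unsigned char lea_abs32[] = {0x8D, 0x3D, 0x78, 0x56, 0x34, 0x12};
/* 32-bit mode: mov $symbol, %edi  -> opcode BF (B8 + register), imm32, no ModRM */
const unsigned char mov_imm32[] = {0xBF, 0x78, 0x56, 0x34, 0x12};

/* 16-bit mode: lea symbol, %di    -> opcode 8D, ModRM 3E, disp16 */
const unsigned char lea_abs16[] = {0x8D, 0x3E, 0x34, 0x12};
/* 16-bit mode: mov $symbol, %di   -> opcode BF, imm16 */
const unsigned char mov_imm16[] = {0xBF, 0x34, 0x12};

/* 64-bit mode: lea symbol(%rip), %rax -> REX.W 48, opcode 8D, ModRM 05, rel32 */
const unsigned char lea_rip64[] = {0x48, 0x8D, 0x05, 0x00, 0x00, 0x00, 0x00};
/* 64-bit mode: mov $symbol, %eax  -> opcode B8, imm32 */
const unsigned char mov_eax_imm32[] = {0xB8, 0x78, 0x56, 0x34, 0x12};

/* In 16-bit and 32-bit code the ModRM byte makes lea exactly one byte longer than mov. */
static_assert(sizeof lea_abs32 == sizeof mov_imm32 + 1, "32-bit: lea is one byte longer");
static_assert(sizeof lea_abs16 == sizeof mov_imm16 + 1, "16-bit: lea is one byte longer");
```

The extra byte in each `lea` encoding is the ModRM byte that the short-form `mov` to a register does not need.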
LEA is useful with base=register addressing modes, though, like `lea symbol(%rdi), %rax` if the symbol's address fits in a 32-bit sign-extended disp32.
Or for arbitrary shift-and-add usage, like `lea 123(%rdi, %rdi, 2), %eax` to do `eax = 3*edi + 123`. See "Using LEA on values that aren't addresses / pointers?"
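As a rough illustration of the shift-and-add use, the following C sketch models what an AT&T operand of the form `disp(base, index, scale)` evaluates to when the destination is a 32-bit register. The helper names `lea32` and `three_x_plus_123` are hypothetical, and this is only a model of the arithmetic, not how a compiler or CPU implements it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: value of disp(base, index, scale) written to a 32-bit register. */
uint32_t lea32(uint32_t base, uint32_t index, uint32_t scale, int32_t disp)
{
    /* x86 addressing modes can only encode scale factors 1, 2, 4 and 8. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    /* Unsigned arithmetic wraps modulo 2^32, matching truncation to a 32-bit destination. */
    return base + index * scale + (uint32_t)disp;
}

/* Models: lea 123(%rdi, %rdi, 2), %eax   i.e. eax = edi + edi*2 + 123 */
uint32_t three_x_plus_123(uint32_t edi)
{
    return lea32(edi, edi, 2, 123);
}
```

For example, `three_x_plus_123(5)` returns 138. No memory is accessed at any point, which is why LEA works on values that are not pointers.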