
In AT&T assembly syntax, must the first operand of the `leaq` instruction be a memory address, rather than a register or a constant (prefixed with `$`)? Must its second operand be a register? I got that impression from reading Computer Systems: A Programmer's Perspective, and have never seen an example that contradicts my guess. Thanks.

  • Yes, that is correct. When in doubt, consult the official instruction set reference. Note that a memory address may of course use a register or a displacement (without the `$` as that would be an immediate). – Jester Oct 25 '18 at 22:11
  • Thanks. I was looking for a reference document on the Internet, but I am not sure where it is. –  Oct 25 '18 at 22:12
  • [Intel® 64 and IA-32 Architectures Software Developer Manuals](https://software.intel.com/en-us/articles/intel-sdm) PS: uses intel syntax not at&t so you have to do some mental work :) – Jester Oct 25 '18 at 22:13
  • @Jester Thanks. What kinds of mental work? –  Oct 25 '18 at 22:21
  • Translating from intel to at&t. E.g. From `[base+index*scale+displacement]` to `displacement(base, index, scale)` – Jester Oct 25 '18 at 22:37
  • @fuz, you might want to remove that comment and try again. – prl Oct 26 '18 at 03:26
  • Also flipping the order of operands (Intel is *dest, src* while AT&T is *src, dest*). – fuz Oct 26 '18 at 09:13
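To make the Intel-to-AT&T translation from the comments concrete, here is a minimal sketch in C with GCC/Clang inline asm. The helper name `lea_base_index_scale` and the constants 4 and 8 are made up for illustration; the `#else` branch is a portable fallback that does the same arithmetic when the code is not built for x86-64.

```c
#include <stdint.h>

/* Hypothetical helper: computes base + index*4 + 8 without touching memory. */
static inline uint64_t lea_base_index_scale(uint64_t base, uint64_t index)
{
#if defined(__x86_64__) && defined(__GNUC__)
    uint64_t result;
    /* AT&T:  leaq 8(%rdi,%rsi,4), %rax      operand order: src, dest
       Intel: lea  rax, [rdi + rsi*4 + 8]    operand order: dest, src
       The compiler substitutes whichever registers it picks for %0..%2. */
    __asm__ ("leaq 8(%1,%2,4), %0"
             : "=r"(result)
             : "r"(base), "r"(index));
    return result;
#else
    /* Fallback for other targets: same arithmetic in plain C. */
    return base + index * 4 + 8;
#endif
}
```

Calling `lea_base_index_scale(100, 3)` should give 100 + 3*4 + 8. Note that the displacement 8 has no `$` prefix, since it is part of the memory operand and not an immediate.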

2 Answers

Yes, that is correct. A `lea` with two register operands (ModRM `mod` = 11) can technically be encoded, but that encoding is invalid and raises a #UD exception. See this reference or this one for details.

fuz

Even if it were encodable, you'd never want to use it.

If you want to put a constant in a register, you should never use `lea`. `mov $1234, %eax` is shorter and more efficient than `lea 1234, %eax` (an absolute address in a disp32 addressing mode).

The only use-case for `lea` with static addresses is 64-bit code with RIP-relative addressing modes, like `lea symbol(%rip), %rax` (7 bytes), in cases where `mov $symbol, %eax` (5 bytes) is not usable because you need position-independent code, and/or the address doesn't fit in a 32-bit zero-extended immediate.

See "Difference between movq and movabsq in x86-64" for more about why `mov $symbol, %rdi` is not the best choice.
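As a rough illustration of the RIP-relative case, the sketch below asks GCC/Clang to emit a `leaq` on a static object. The names `symbol` and `address_of_symbol` are hypothetical. Because the operand uses the `"m"` constraint, the compiler chooses the addressing mode itself; on x86-64 targets that is typically `symbol(%rip)`, but that is an assumption about the toolchain, not something this code enforces.

```c
static int symbol;  /* hypothetical static object whose address we want */

static inline void *address_of_symbol(void)
{
#if defined(__x86_64__) && defined(__GNUC__)
    void *addr;
    /* %1 is printed as the memory operand for `symbol`
       (typically symbol(%rip) in 64-bit code), so this becomes
       something like: leaq symbol(%rip), %rax */
    __asm__ ("leaq %1, %0" : "=r"(addr) : "m"(symbol));
    return addr;
#else
    /* Fallback for other targets: let the compiler take the address. */
    return &symbol;
#endif
}
```

The result should equal `&symbol`. Inspecting the build with `objdump -d` is the way to check which addressing mode the compiler actually emitted.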


In 32-bit code, `lea symbol, %edi` is 6 bytes (opcode + ModRM + disp32), and runs only on port 1 or port 5 on Intel Sandybridge-family CPUs. (https://agner.org/optimize/)

`mov $symbol, %edi` is 5 bytes (opcode + imm32, a short form with no ModRM byte), and runs on any ALU port.

Same for 16-bit code: `mov $symbol, %di` is 3 bytes, while `lea symbol, %di` is 4 bytes, with the same execution-port differences. (Or in NASM syntax, `lea di, [symbol]` vs. `mov di, symbol`, or `mov di, OFFSET symbol` in GAS `.intel_syntax` or MASM.)
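The byte counts above can be spelled out as machine-code arrays. This is a hand-assembled sketch, not assembler output: the addresses 0x12345678 and 0x1234 are placeholders standing in for `symbol`, and the array names are made up. Assembling the instructions with `as` and checking them with `objdump -d` would confirm the bytes.

```c
/* 32-bit mode: mov $symbol, %edi  ->  opcode B8+7 = BF, then imm32 (little-endian) */
static const unsigned char mov_imm32_edi[] = { 0xBF, 0x78, 0x56, 0x34, 0x12 };

/* 32-bit mode: lea symbol, %edi  ->  opcode 8D, ModRM 0x3D
   (mod=00, reg=111 edi, rm=101 meaning disp32 with no base), then disp32 */
static const unsigned char lea_disp32_edi[] = { 0x8D, 0x3D, 0x78, 0x56, 0x34, 0x12 };

/* 16-bit mode: mov $symbol, %di  ->  opcode BF, then imm16 */
static const unsigned char mov_imm16_di[] = { 0xBF, 0x34, 0x12 };

/* 16-bit mode: lea symbol, %di  ->  opcode 8D, ModRM 0x3E
   (mod=00, reg=111 di, rm=110 meaning disp16 with no base), then disp16 */
static const unsigned char lea_disp16_di[] = { 0x8D, 0x3E, 0x34, 0x12 };
```

In both modes the extra byte in the `lea` form is the ModRM byte, which the `mov`-immediate short form avoids by folding the register number into the opcode.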


`lea` is useful with addressing modes that include a base register, though, like `lea symbol(%rdi), %rax` if the address fits in a sign-extended disp32.

Or for arbitrary shift-and-add arithmetic, like `lea 123(%rdi, %rdi, 2), %eax` to compute eax = 3*edi + 123. See also "Using LEA on values that aren't addresses / pointers?"
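The shift-and-add trick can be tried from C with GCC/Clang inline asm. This is a minimal sketch under those assumptions; the function name `times3_plus123` is hypothetical, and the `#else` branch does the same wrapping 32-bit arithmetic on targets that are not x86-64.

```c
#include <stdint.h>

/* Hypothetical helper: returns 3*x + 123, truncated to 32 bits. */
static inline uint32_t times3_plus123(uint32_t x)
{
#if defined(__x86_64__) && defined(__GNUC__)
    uint32_t result;
    uint64_t wide = x;  /* 64-bit input so %1 names a 64-bit register */
    /* Becomes e.g. leal 123(%rdi,%rdi,2), %eax
       address = base + index*2 + 123 = 3*x + 123; no memory is accessed,
       and only the low 32 bits are written to the destination. */
    __asm__ ("leal 123(%1,%1,2), %0" : "=r"(result) : "r"(wide));
    return result;
#else
    /* Fallback for other targets: unsigned arithmetic wraps the same way. */
    return x * 3u + 123u;
#endif
}
```

For example, `times3_plus123(5)` should give 3*5 + 123. Unlike `add` or `imul`, `lea` does not modify the flags, which is part of why compilers like it for this kind of arithmetic.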

Peter Cordes