I have problems executing the code below

unique proc
    
    invoke lstrlen, esi
    cmp eax, 1
    jle quit
    mov ebx, 0; previous iterator
    mov edx, 0; next iterator
    dec eax
    mov ecx, eax
    inc eax
next:
    inc edx
    cmp [esi][ebx], [esi][edx]
    je skip
    cmp 
    inc ebx
    cmp ebx, edx
    dec ebx
    je skip
    inc ebx
    mov [esi][ebx], [esi][edx]
skip:
    loop next
    mov [esi][edx], '0'
quit:
    ret

unique endp

I am using indirect addressing here, so I expect

cmp [esi][ebx], [esi][edx]

to be replaced with

cmp ds:[esi][ebx], ds:[esi][edx]

Where am I wrong here?

Maxim Masiutin
alexander.sivak
  • You cannot have more than one memory operand in the same instruction. – fuz Oct 30 '20 at 14:30
  • @fuz, how do you get an element if you have an address in register ebx, and a shift in register edx? – alexander.sivak Oct 30 '20 at 14:36
  • You can use a `mov` instruction like this: `mov al, [esi][ebx]`. Though it's slightly faster to use a `movzx` instead: `movzx eax, [esi][ebx]`. Most instructions support up to one memory operand. Your comparison then looks like this: `movzx eax, [esi][ebx]; cmp al, [esi][edx]`. – fuz Oct 30 '20 at 14:51
  • The attempt to use two memory operands is basically a duplicate of [Why isn't movl from memory to memory allowed?](https://stackoverflow.com/q/33794169). My answer there does happen to show `[reg+reg]` syntax. Also [Referencing the contents of a memory location. (x86 addressing modes)](https://stackoverflow.com/q/34058101) for addressing mode syntax. – Peter Cordes Oct 31 '20 at 01:37

1 Answer

Conventional instructions are limited to one memory operand

You tagged your question x86, so you are using the Intel x86 instruction set.

An x86 instruction can take several operands, separated by commas in assembly language. An operand can be an immediate, when a constant expression is encoded directly in the instruction; a register, when the value is in a processor register; or a memory operand, when the value is in memory.

You cannot use two memory operands in a single cmp instruction, so you need to split each such instruction in your code into two instructions, each with one memory operand and one register operand. The first instruction loads the value from memory into a register, and the second compares that register with the value at the other memory location.

For example, instead of a single instruction

cmp [esi][ebx], [esi][edx]

use two instructions:

mov al, [esi+edx]
cmp [esi+ebx], al
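
The same restriction applies to the mov [esi][ebx], [esi][edx] line in your code, and the final store of '0' needs an explicit size because neither of its operands is a register. A minimal sketch in MASM syntax, assuming al is free to use as a scratch register at that point (an assumption; your routine does not say which registers must be preserved):

```asm
    ; Copy one byte from [esi+edx] to [esi+ebx] through a register,
    ; because mov cannot take two memory operands either.
    mov al, [esi+edx]            ; load the source byte into AL
    mov [esi+ebx], al            ; store it; AL implies a byte-sized store

    ; An immediate store has no register to imply the operand size,
    ; so the size has to be written explicitly.
    mov byte ptr [esi+edx], '0'
```

If the last store is meant to write a string terminator rather than the digit character, the immediate would be 0 instead of '0'.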

String instructions take two memory operands addressed by index registers

The string instructions, such as cmpsb and movsb, are an exception: they technically take two memory operands. However, their addressing is fixed. The first address always comes from 'esi' and the second from 'edi' (the register size may differ), and you cannot use other registers.

At the assembly code level, two forms of this instruction are allowed: the no-operand form (e.g. cmpsb) and the explicit operand form, which names both memory addresses, i.e. cmps byte ptr ds:[esi], byte ptr es:[edi]. The explicit operand form is provided for documentation only, and it can mislead: the operands you write do not have to match the addresses that are actually used. If you write 'eax' rather than 'esi', some assemblers, like Turbo Assembler Version 5.4, may ignore the error and use 'esi' anyway, because the index registers are implied by the instruction opcode.

The first memory address is always specified by DS:(RSI/ESI/SI), although you can override its segment register. The second memory address is always specified by ES:(RDI/EDI/DI), and its segment register cannot be overridden.

Besides that, you have to set up the direction flag before the instruction: cld makes the index registers increase after the operation, and std makes them decrease. Only the comparison of the two memory operands updates the flags, not the increase or decrease of the index registers.

Please note that not all assemblers support the explicit operand form; the Netwide Assembler, for example, gives an error on any instance of it. Turbo Assembler ignores the index registers that you write in the explicit form, but it still checks the segment registers: if you specify a segment register other than ES for the second memory address, it gives the error "Can't override ES segment".
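
For illustration, here is how the comparison from the question could be written with cmpsb. This is only a sketch in MASM syntax, and it assumes a flat 32-bit memory model in which DS and ES refer to the same memory, and that 'skip' is the label from your routine. Because cmpsb modifies both index registers, they are saved and restored around it:

```asm
    push esi                 ; cmpsb changes ESI, so keep the string base
    push edi                 ; EDI is used as the second pointer
    cld                      ; DF = 0: ESI and EDI increase after the compare
    lea edi, [esi+edx]       ; second operand, read through ES:[EDI]
    lea esi, [esi+ebx]       ; first operand, read through DS:[ESI] (overwrites ESI last)
    cmpsb                    ; sets flags from byte [esi] - byte [edi]
    pop edi                  ; pop and lea leave the flags unchanged
    pop esi
    je skip                  ; still tests the result of cmpsb
```

For a single comparison like this one, the mov/cmp pair shown earlier is shorter; cmpsb pays off mainly when both pointers walk through memory together.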

Maxim Masiutin
  • `cmpsb` can take a segment override prefix to use a segment other than `ds` for the "source" operand. Writing of which, `cmpsb` actually acts like a (nonexistent) `cmp byte [si], byte [es:di]` (plus the index register increment or decrement without affecting flags). That is, the byte pointed to by `(r/e)si` is actually treated as the "destination operand" of the comparison. – ecm Oct 31 '20 at 16:49
  • @ecm Thank you very much for pointing that out. I have updated the reply accordingly. – Maxim Masiutin Oct 31 '20 at 20:11