Retrieve/save DWORD from data segment 8086

Question

I have a problem to solve, and I have reached the point where I don't know what else to try. I have this data segment:

data segment
    a db 12h, 34h, 56h, 78h, 9Ah
    b dd 2 dup(?)
data ends

I want to write code so that b ends up holding the same value as a, meaning:

b dd 12345678h, 9A000000h

This is where I got stuck. This is for the 8086 CPU, and data is stored and retrieved in little-endian order. When a word is loaded from the data segment into a register such as AX, the byte at the higher address goes into the high byte (AH) and the byte at the lower address goes into the low byte (AL).

What I've tried:

lea si, a    ; load offset of a into si
lea di, d    ; load offset of d into di
mov cx, 2    ; initialize counter with 2

repeta:
mov dx, [si] ; copy the data starting the offset si to dx
             ; would be cool to get the dword at offset si
             ; into DX:AX, but looks like it doesn't

mov [di], dx ; reverse it into the other variable d
             ; and save DX:AX to offset di
inc si
inc di
loop repeta  ; cx!=0 go to repeta

First of all, I am not sure whether the code needs to change along the way, but I can read the data at offset si in either little-endian or big-endian order by loading the bytes into the 8-bit registers DL and DH, so that shouldn't be a problem.

The problem is that I don't know how to get the doubleword starting at offset a into a register pair such as DX:AX, which is what is usually used for this kind of thing. My current code only ever loads a word into DX, while AX remains unchanged.

So the question would be, how to retrieve into two registers a dword from the data segment, in such fashion, that I am able to also save it back in the data segment as a doubleword.

Any suggestions would be appreciated.

score 1 · Answer 1 · answered Feb 06 '15 at 16:15

You need an additional instruction to accomplish this:

mov ax,[si+2]

Igor Popov

score 1 · Answer 2 · edited Jun 20 '16 at 00:17

So the question would be, how to retrieve into two registers a dword from the data segment, in such fashion, that I am able to also save it back in the data segment as a doubleword.

lea     si, a
mov     ax,[si]
mov     dx,[si+2]

Hereafter DX:AX contains the dword.
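The question also asks how to save the dword back. A minimal sketch in the same MASM/TASM syntax, assuming DS already points at the data segment and DX:AX holds the dword loaded above (b is the destination label from the question):

```asm
; Sketch only: assumes DS is set up and DX:AX was loaded from a as shown above.
lea     di, b
mov     [di], ax        ; low word goes to the lower address
mov     [di+2], dx      ; high word goes to the following two bytes
```

Because the load and the store both use little-endian order, the four bytes arrive in b in the same order they had in a.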

Also, if somebody who knows could comment: related to this question, I guess you can't load a dword directly into two registers like I thought; in other words, lea dx, a will never take the dword into DX:AX, and lea ax, a will never take a dword into AX:DX. If I'm wrong, please correct me.

You are correct that lea dx, a will never take the dword into DX:AX, but the reason is that LEA does not operate on the data stored at an address; it operates on the address itself.
Both lea dx, a and lea ax, a put the address of the a label in a register.
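The difference can be sketched by putting the address forms next to a real memory load. This assumes MASM/TASM syntax, the a label from the question, and DS already pointing at its segment:

```asm
; Sketch only: a is the byte array from the question (12h, 34h, 56h, ...).
lea     dx, a               ; DX = offset of a; memory is not read
mov     dx, offset a        ; same result, with the offset encoded as an immediate
mov     dx, word ptr [a]    ; DX = 3412h, the word stored at a
```

The word ptr override is needed in the last line only because a is declared with db.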

I guess you can't load a dword directly into two registers like I thought

There are actually two instructions in 8086 assembly that do just that: LDS and LES. I'll use the latter to show how it works.

lea     si, a
push    es
les     ax,[si]
mov     dx,es
pop     es

Hereafter DX:AX contains the dword.

This is obviously much slower than two separate loads, but it is atomic with respect to interrupts on the same CPU: if you're reading a dword that could be modified by an interrupt handler, an interrupt between two separate mov loads could cause "tearing", whereas les completes either entirely before or entirely after an interrupt.

score 0 · Answer 3 · edited Jun 19 '16 at 19:05

This code shows how to fill b from the values in a. The expected result is b dd 12345678h, 9A000000h.

.model small
.stack 100h
.data
a   db 12h, 34h, 56h, 78h, 9Ah
b   dd 2 dup(?)

.code
main    proc
    mov ax, @data
    mov ds, ax

    ; first dword: copy a[0..3] into b[3..0], reversing the byte order
    mov cx, 4
    mov si, offset a
    mov di, offset b+3

m1: mov al, [si]
    mov [di], al
    inc si
    dec di
    loop m1

    ; second dword: a[4] becomes the most significant byte, the rest is zero
    mov si, offset a+4
    mov di, offset b+7
    mov al, [si]
    mov [di], al
    dec di
    mov byte ptr [di], 0
    dec di
    mov byte ptr [di], 0
    dec di
    mov byte ptr [di], 0

    mov ax, 4c00h           ; exit to DOS
    int 21h
main    endp
end main

edited Jun 19 '16 at 19:05

answered Jun 12 '16 at 10:45

Wojciech Galazka

Can you explain how the code solves the OP's problem? – Håken Lid Jun 12 '16 at 13:16

score 0 · Answer 4 · edited May 23 '17 at 11:58

First of all, your biggest problem is that the value you're looking for in b is NOT the same as a. x86 is little-endian, so a memcpy from a to b (or any other byte-at-a-time copy without byte-swapping) would actually produce:

a   db 12h, 34h, 56h, 78h,   9Ah, 0,0,0  ; added padding 

b   dd 78563412h,            0000009Ah

Your b dd 12345678h, 9A000000h has the first dword endian-swapped, and the 5th byte of a as the MSB of the 2nd dword in b, not the LSB.
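If the byte-swapped value really is what's wanted, one possible sketch for the first dword uses xchg on the register halves. It assumes MASM/TASM syntax, the a and b labels from the question, and DS already pointing at their segment:

```asm
; Sketch only: builds b = 12345678h from the bytes 12h, 34h, 56h, 78h at a.
mov     ax, word ptr [a]        ; bytes 12h, 34h load as AX = 3412h
xchg    al, ah                  ; AX = 1234h
mov     dx, word ptr [a+2]      ; bytes 56h, 78h load as DX = 7856h
xchg    dl, dh                  ; DX = 5678h
mov     word ptr [b], dx        ; low word of 12345678h
mov     word ptr [b+2], ax      ; high word of 12345678h
```

Afterwards the bytes at b are 78h, 56h, 34h, 12h, which is how a little-endian machine stores dd 12345678h.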

Copying 5 bytes from a to b leaves the last 3 bytes of b uninitialized. (In Unix, .bss space is zero-initialized. I assume this happens for dup(?) space in MASM/TASM, but if not, whatever garbage was there before will still be there.)

If you copy 8 bytes from a to b, the three bytes after the 9A will be read from the start of b if they end up in the same section (rather than b going into bss). Perhaps this is why you used an org directive to separate them in your answer.

If you don't have any special reason to want to copy a dword all at once, then in 8086 code you should just use rep movsw, or normal mov instructions, like

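mov   ax, word ptr [a]     ; If your addresses are static, might as well just use
mov   dx, word ptr [a+2]   ; absolute addressing, esp. in 16-bit code where it's only 2 bytes

mov   word ptr [b], ax     ; word ptr because a is declared db and b is declared dd
mov   word ptr [b+2], dx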

Note that your loop with si and di only increments them by 1, but you load/store two bytes. Unaligned overlapping loads/stores work, but you're doing redundant work.

For your case, you have 5 bytes to copy. You could use rep movsb with cx=5. 8086 of course doesn't support movsd or movsq, and rep startup overhead makes it inefficient for small copies.
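A minimal sketch of the rep movsb version, assuming MASM/TASM syntax, the a and b labels from the question, and DS already pointing at their segment. Note that movsb copies from DS:SI to ES:DI, so ES has to be set as well:

```asm
; Sketch only: copies the 5 bytes of a to the start of b.
push    ds
pop     es              ; ES = DS, because MOVSB writes to ES:DI
cld                     ; clear the direction flag so SI and DI count upward
mov     si, offset a
mov     di, offset b
mov     cx, 5           ; number of bytes to copy
rep     movsb           ; copy CX bytes from DS:SI to ES:DI
```

The last three bytes of b are left untouched by this copy.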

If you do care about doing both loads at once, e.g. from a dword that an interrupt handler can modify:

On a single-core CPU, we don't have to worry about memory being modified by other concurrent threads. However, an interrupt (maybe triggering a context switch to another thread) could arrive between any two instructions, but not in the middle of a single instruction. (This is the big difference between single-core and multi-core atomicity: on a multi-core system, a single instruction is not automatically atomic with respect to the other cores.)

So, if you're loading a dword that can be modified asynchronously (e.g. by an interrupt handler), and you want to load both halves of it at once, you need to get both halves with a single instruction.

Do not use this if you're just writing normal single-threaded programs without interrupt handlers.

One way is with Sep Roland's les trick (see his answer), but that leaves ES temporarily set to something weird, which might be a problem depending on your interrupt handler.

Another way uses the x87 FPU (not guaranteed to be present in an 8086 system), which can copy in 32- or 64-bit chunks, e.g.

fild   dword ptr [a]    ; load 32bits as an integer
fistp  dword ptr [d]    ; store as the same integer
; also works with qword ptr

; or store to the stack and then load into dx:ax with two mov instructions
; your own stack memory is private, so you don't need atomic ops there

x87's internal 80-bit FP format can exactly represent every 64-bit integer, so this works on any possible bit-pattern. (fld/fstp wouldn't, because fld requires a valid IEEE double-precision floating point representation, unlike fild.)

Even on 8086, it will be atomic with respect to interrupts. fild dword is atomic for aligned loads on 486 and later hardware.

gcc actually uses this to implement C++11 std::atomic<uint64_t> loads/stores in 32-bit mode (since the ISA guarantees that naturally-aligned loads/stores of 64-bit and smaller values are atomic, on P5 and later).

gcc used to bounce std::atomic<double> values around with fild/fstp when SSE2 wasn't available, but that was fixed after I reported it. (I noticed the issue while answering Deoptimizing a program for the pipeline in Intel Sandybridge-family CPUs)

See Agner Fog's Optimizing Assembly guide for other useful tricks. (And also the x86 tag wiki).

A FILD/FISTP would be about an order of magnitude slower than two pairs of 16-bit MOVs on a 16-bit Intel CPU (eg. '286/'287). — Ross Ridge, Jun 19 '16 at 20:14
@RossRidge: updated to point out when this might actually be useful. — Peter Cordes, Jun 20 '16 at 00:37

score -1 · Answer 5 · answered Feb 06 '15 at 16:34

I guess I found my own answer.

Here is my code:

assume cs:code, ds:data
data segment
    a db 12h, 34h, 56h, 78h, 9Ah, 0BCh
    org 20h                             ; make sure I'm not overwriting a
    d dd 2 dup(11111111h)               ; this will be overwritten
data ends

code segment
start:
    mov ax, data
    mov ds, ax


    lea si, a
    lea di, d
    mov cx, 4

    repeta:
    mov dx, [si]
    inc si
    inc si

    mov [di], dx
    inc di
    inc di

    loop repeta


    mov ax, 4C00h
    int 21h
code ends
end start

This works. You can see in the data segment that the doublewords that held 11111111h are overwritten, so that d looks like this (in hex): 12 34 56 78 9A BC 00 00

Note: if the data is already laid out in the data segment, you don't have to worry about little endian, because although a word is loaded in "reverse order", it is restored to its original form when written back (12 34, not 34 12). With that said, you can simply use mov dx, [si].

Also, if somebody who knows could comment: related to this question, I guess you can't load a dword directly into two registers like I thought; in other words, lea dx, a will never take the dword into DX:AX, and lea ax, a will never take a dword into AX:DX. If I'm wrong, please correct me.

When using little endian, `db 12h, 34h, 56h, 78h` is NOT the same as `dd 12345678h`. `dd 12345678h` in little endian is equivalent to `db 78h, 56h, 34h, 12h`. Your assignment wants you to show that you understand little endian vs. big endian and how to convert between the two. — Mike Nakis, Feb 06 '15 at 16:41
Well, user user1812076, did you turn in your assignment? Was that the right way to solve the problem? And also, about the last thing you are asking, you are right, LEA will only affect the register which was specified as an operand, no other registers. — Mike Nakis, Feb 09 '15 at 10:58
This may have solved your problem, but it doesn't answer your question as asked. — Ross Ridge, Jun 19 '16 at 20:16
