I'll try to explain it by reversing the code back into C.
Intel's Instruction Set Reference (Volume 2 of Software Developer's Manual) is invaluable for this kind of reverse engineering.
REPNE SCASB
The logic for REPNE and SCASB combined:
while (ecx != 0) {
temp = al - *(BYTE *)edi;
SetStatusFlags(temp);
if (DF == 0) // DF = Direction Flag
edi = edi + 1;
else
edi = edi - 1;
ecx = ecx - 1;
if (ZF == 1) break;
}
Or more simply:
while (ecx != 0) {
ZF = (al == *(BYTE *)edi);
if (DF == 0)
edi++;
else
edi--;
ecx--;
if (ZF) break;
}
String Length
However, the above is insufficient to explain how it computes the length of a string. Based on the presence of the not ecx in your question, I'm assuming the snippet belongs to this idiom (or similar) for computing string length using REPNE SCASB:
sub ecx, ecx
sub al, al
not ecx
cld
repne scasb
not ecx
dec ecx
Translating to C and using our logic from the previous section, we get:
ecx = (unsigned)-1;
al = 0;
DF = 0;
while (ecx != 0) {
ZF = (al == *(BYTE *)edi);
if (DF == 0)
edi++;
else
edi--;
ecx--;
if (ZF) break;
}
ecx = ~ecx;
ecx--;
Simplifying using al = 0 and DF = 0:
ecx = (unsigned)-1;
while (ecx != 0) {
ZF = (0 == *(BYTE *)edi);
edi++;
ecx--;
if (ZF) break;
}
ecx = ~ecx;
ecx--;
Things to note:
- in two's complement notation, flipping the bits of
ecx is equivalent to -1 - ecx.
- in the loop,
ecx is decremented before the loop breaks, so it decrements by length(edi) + 1 in total.
ecx can never be zero in the loop, since the string would have to occupy the entire address space.
So after the loop above, ecx contains -1 - (length(edi) + 1) which is the same as -(length(edi) + 2), which we flip the bits to give length(edi) + 1, and finally decrement to give length(edi).
Or rearranging the loop and simplifying:
const char *s = edi;
size_t c = (size_t)-1; // c == -1
while (*s++ != '\0') c--; // c == -1 - length(s)
c = ~c; // c == length(s)
And inverting the count:
size_t c = 0;
while (*s++ != '\0') c++;
which is the strlen function from C:
size_t strlen(const char *s) {
size_t c = 0;
while (*s++ != '\0') c++;
return c;
}