I'm not familiar with the implementation of C++ compilers and I'm writing a C++ snippet like this (for learning):
#include <vector>
void vector8_inc(std::vector<unsigned char>& v) {
for (std::size_t i = 0; i < v.size(); i++) {
v[i]++;
}
}
void vector32_inc(std::vector<unsigned int>& v) {
for (std::size_t i = 0; i < v.size(); i++) {
v[i]++;
}
}
int main() {
std::vector<unsigned char> my{ 10,11,2,3 };
vector8_inc(my);
std::vector<unsigned int> my2{ 10,11,2,3 };
vector32_inc(my2);
}
By generating its assembly codes and disassemble its binary, I found the segments for
vector8_inc and vector32_inc:
g++ -S test.cpp -o test.a -O2
_Z11vector8_incRSt6vectorIhSaIhEE:
.LFB853:
.cfi_startproc
endbr64
movq (%rdi), %rdx
cmpq 8(%rdi), %rdx
je .L1
xorl %eax, %eax
.p2align 4,,10
.p2align 3
.L3:
addb $1, (%rdx,%rax)
movq (%rdi), %rdx
addq $1, %rax
movq 8(%rdi), %rcx
subq %rdx, %rcx // line C: calculate size of the vector
cmpq %rcx, %rax
jb .L3
.L1:
ret
.cfi_endproc
......
_Z12vector32_incRSt6vectorIjSaIjEE:
.LFB854:
.cfi_startproc
endbr64
movq (%rdi), %rax // line A
movq 8(%rdi), %rdx
subq %rax, %rdx // line C': calculate size of the vector
movq %rdx, %rcx
shrq $2, %rcx // Line B: ← What is this used for?
je .L6
addq %rax, %rdx
.p2align 4,,10
.p2align 3
.L8:
addl $1, (%rax)
addq $4, %rax
cmpq %rax, %rdx
jne .L8
.L6:
ret
.cfi_endproc
......
but in fact, these two functions are "inlined" within main(), and the instructions shown above will never be executed but compiled and loaded into memory (I verified this by adding a breakpoint to line A and checking the memory):
main() contains duplicate logic of the above segment
So here are my questions:
- What are these two segments used for and why are they compiled out?
_Z12vector32_incRSt6vectorIjSaIjEE,_Z11vector8_incRSt6vectorIhSaIhEE - Does the instruction (Line B) in
_Z12vector32_incRSt6vectorIjSaIjEE(i.e.shrq $2, %rcx) means 'to shift size integer of the vector right by 2 bits'? If it does, what's this used for? - Why is
size()computed every turn of the loop invector8_inc, but invector32_incit's computed before executing the loop? (see Line C and Line C', under-O2optimization level ofg++)