For example, say we have a CPU with AVX512BW support, and I want to run three kinds of SIMD string-length functions on it.
- The first function loads 16 bytes (AVX) of the string at a time and searches them for the null terminator, repeating until a null terminator is found.
- The second function loads 32 bytes (AVX2) of the string at a time and searches them for the null terminator, repeating until a null terminator is found.
- The third function loads 64 bytes (AVX512BW) of the string at a time and searches them for the null terminator, repeating until a null terminator is found.
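To make the first function concrete, here is a rough sketch of the 16-byte variant in C intrinsics. It is only an illustration under assumptions, not a tuned implementation: `strlen_16` is a hypothetical name, the intrinsics are the 128-bit ones from `<emmintrin.h>`, `__builtin_ctz` assumes GCC or Clang, and the aligned loads deliberately read a few bytes outside the string (harmless on x86 because an aligned 16-byte load cannot cross a page, but a sanitizer would complain). The 32-byte and 64-byte functions would follow the same load, compare, find-first-zero pattern with wider registers.

```c
#include <emmintrin.h>  /* 128-bit integer intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: string length, scanning 16 bytes per step. */
static size_t strlen_16(const char *s)
{
    const __m128i zero = _mm_setzero_si128();

    /* Round the pointer down to a 16-byte boundary so that every
       load is aligned and therefore never crosses a page. */
    uintptr_t addr = (uintptr_t)s;
    size_t misalign = addr & 15;
    const char *p = (const char *)(addr - misalign);

    /* First chunk: compare all 16 bytes with zero, then discard the
       mask bits that belong to bytes located before the string. */
    __m128i chunk = _mm_load_si128((const __m128i *)p);
    unsigned mask = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(chunk, zero));
    mask >>= misalign;
    if (mask != 0)
        return (size_t)__builtin_ctz(mask);

    /* Remaining chunks: keep going until one contains a zero byte. */
    for (;;) {
        p += 16;
        chunk = _mm_load_si128((const __m128i *)p);
        mask = (unsigned)_mm_movemask_epi8(_mm_cmpeq_epi8(chunk, zero));
        if (mask != 0)
            return (size_t)(p - s) + (size_t)__builtin_ctz(mask);
    }
}
```

Calling `strlen_16("hello")` should give the same result as `strlen("hello")`. Which machine instruction the compiler emits for the load depends on the target flags it is given.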
What I can't work out is whether, on an AVX-512 CPU, all three functions must use AVX-512 instructions, or whether each one can simply use the instructions of its own SIMD extension.
For example, for the first function, do I have to use vmovdqa or vmovdqa16?
And for the second function, do I have to use vmovdqa or vmovdqa32?
Why do instructions such as vmovdqa16 and vmovdqa32 exist at all when we can just use the AVX or AVX2 instructions?
Is it possible to use AVX and AVX2 instructions in an AVX-512 function, or must we use the AVX-512 versions of those instructions?