I have this C function:
#include <stddef.h>
size_t findChar(unsigned int length, char*  __attribute__((aligned(16))) restrict string) {
    for (size_t i = 0; i < length; i += 2) {
        if (string[i] == '[' || string[i] == ' ') {
            return i;
        }
    }
    return -1;
}
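It checks every other character of a string and returns the first index at which the character is '[' or ' ' (a space). If there is no match it returns -1, which converts to SIZE_MAX because the return type is size_t. With x86-64 GCC 10.2 -O3 -march=skylake -mtune=skylake, this is the assembly output: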
findChar:
        mov     edi, edi
        test    rdi, rdi
        je      .L4
        xor     eax, eax
.L3:
        movzx   edx, BYTE PTR [rsi+rax]
        cmp     dl, 91
        je      .L1
        cmp     dl, 32
        je      .L1
        add     rax, 2
        cmp     rax, rdi
        jb      .L3
.L4:
        mov     rax, -1
.L1:
        ret
It seems like it could be optimized significantly, because I see multiple branches. How can I write my C so that the compiler optimizes it with SIMD, string instructions, and/or vectorization?
How do I write my code to signal to the compiler that this code can be optimized?
Interactive assembly output on Godbolt: https://godbolt.org/z/W19Gz8x73
Changing it to a VLA with an explicitly declared length doesn't help much: https://godbolt.org/z/bb5fzbdM1
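For readers who don't want to open the link, here is a minimal sketch of what a VLA-style parameter with an explicit length can look like. It illustrates the idea and is not necessarily the exact code behind the Godbolt link: the name findCharVla is made up, the aligned(16) attribute is dropped to keep it portable, and `static length` is my assumption about how the length would be declared.

```c
#include <stddef.h>

/* Illustrative variant (hypothetical name): the array parameter declares its
   own length. `static length` promises that the caller passes at least
   `length` valid chars, so this sketch should not be called with length 0. */
size_t findCharVla(unsigned int length,
                   const char string[restrict static length]) {
    /* Same logic as the original: look at every other character. */
    for (size_t i = 0; i < length; i += 2) {
        if (string[i] == '[' || string[i] == ' ') {
            return i;
        }
    }
    return (size_t)-1; /* not found */
}
```

The loop still has a data-dependent early exit, which is presumably why this variant does not change the generated code much.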
This is a version of the code modified so that the function can only return once every 100 characters: https://godbolt.org/z/h8MjbP1cf
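To make the "return only once per 100 characters" idea concrete, here is a rough sketch of one way to structure it. This is an assumption about the general shape, not a copy of the linked code: BLOCK and findCharBlocked are made-up names, and the aligned(16) attribute is omitted. The inner loop has no early exit and keeps the smallest matching index; the function only decides whether to return after each block.

```c
#include <stddef.h>

#define BLOCK 100 /* hypothetical block size; even, so every block starts on an even index */

size_t findCharBlocked(unsigned int length, const char *restrict string) {
    for (size_t base = 0; base < length; base += BLOCK) {
        /* Clamp the block so we never read at or past `length`. */
        size_t end = (base + BLOCK < length) ? base + BLOCK : length;
        size_t first = (size_t)-1;

        /* No early exit here: the trip count is known before the loop starts,
           and the body is a plain min-reduction over candidate indices. */
        for (size_t i = base; i < end; i += 2) {
            int hit = (string[i] == '[') | (string[i] == ' ');
            size_t candidate = hit ? i : (size_t)-1;
            if (candidate < first) {
                first = candidate;
            }
        }

        /* The only data-dependent return: once per block. */
        if (first != (size_t)-1) {
            return first;
        }
    }
    return (size_t)-1; /* not found */
}
```

Note that this version keeps reading to the end of the current block even after a match, so it is only valid when all `length` characters are readable. Whether GCC actually vectorizes the inner loop needs to be checked in the compiler output; I am not claiming that it does.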
 
    