Is there any way to get libc6's regex functions regcomp and regexec to work properly with multi-byte characters?
For instance, if my pattern is the UTF-8 string 猫机+猫, matching it against the UTF-8 encoded string 猫机机机猫 fails, when it should succeed.
I think this is because the byte representation of the character 机 is \xe6\x9c\xba, so the + only applies to the last byte and matches one or more occurrences of \xba. I can make this instance work by putting parentheses around each multi-byte character in the pattern, but since this is for an application I can't require users to do that.
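A minimal sketch of the behaviour described above, under some assumptions: glibc's regcomp/regexec, the REG_EXTENDED flag (so that + is a quantifier), and a program that never calls setlocale and therefore runs in the default "C" locale. The helper name ere_matches and the macros CAT and JI are made up for this illustration; the UTF-8 bytes are written as escapes so the result does not depend on the source file's encoding.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* UTF-8 byte sequences, spelled out explicitly. */
#define CAT "\xe7\x8c\xab" /* 猫 */
#define JI  "\xe6\x9c\xba" /* 机 */

/*
 * Hypothetical helper (not part of libc): compiles `pattern` as a POSIX
 * extended regex and tests it against `text`.
 * Returns 1 on match, 0 on no match, -1 if the pattern fails to compile.
 * Assumes the process is still in the default "C" locale.
 */
int ere_matches(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && text != NULL);

    /* REG_NOSUB: only a yes/no answer is needed, no match offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

Under those assumptions, ere_matches(CAT JI "+" CAT, CAT JI JI JI CAT) returns 0, because the + quantifies only the byte \xba, while the parenthesized pattern CAT "(" JI ")+" CAT returns 1 on the same input.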
Is there a way to flag a pattern, or the string being matched, as containing UTF-8 characters? Perhaps by telling libc to store the pattern as wchar instead of char?