My idea was to write a Hangman-like game in C. I want it to handle German words with umlauts (e.g. ä, ü, ö) and also Greek words (which consist entirely of non-ASCII characters).
My compiler and my terminal can handle Unicode well. Displaying the strings works well.
But how should I perform operations on these strings? For German I could maybe special-case the 6 upper- and lowercase accented characters in my functions. But for Greek that seems impossible.
I wrote this test code. It prints each string, its length according to strlen (which is of course wrong as a character count, because strlen counts bytes and each non-ASCII character here takes two bytes in UTF-8), and the value of each individual char of the string, both as a character and in hex.
#include <stdio.h>
#include <string.h>

int main(void) {
    printf("123456789\n");
    char aTestString[] = "cheese";
    // strlen returns size_t, so the matching format specifier is %zu
    printf("%s has %zu characters\n", aTestString, strlen(aTestString));

    for (size_t i = 0; i < strlen(aTestString); i++) {
        printf("( %c )", aTestString[i]);   // char as a character
        printf("[ %02X ]", aTestString[i]); // char in hexadecimal
    }

    printf("\n123456789\n");
    char aTestString2[] = "Käse";
    printf("%s has %zu characters\n", aTestString2, strlen(aTestString2));

    for (size_t i = 0; i < strlen(aTestString2); i++) {
        printf("( %c )", aTestString2[i]);   // char as a character
        // plain char is signed here, so bytes >= 0x80 print as FFFFFFxx
        printf("[ %02X ]", aTestString2[i]); // char in hexadecimal
    }

    printf("\n123456789\n");
    char aTestString3[] = "λόγος";
    printf("%s has %zu characters\n", aTestString3, strlen(aTestString3));

    for (size_t i = 0; i < strlen(aTestString3); i++) {
        printf("( %c )", aTestString3[i]);   // char as a character
        printf("[ %02X ]", aTestString3[i]); // char in hexadecimal
    }
}
For example, what is the recommended way to count the Unicode characters, or to check whether a specific Unicode character (that is, code point) is in the string? I am quite sure there must be some simple solution, because such characters are often used in passwords, for example.
Here is the output of the test program:
123456789
cheese has 6 characters
( c )[ 63 ]( h )[ 68 ]( e )[ 65 ]( e )[ 65 ]( s )[ 73 ]( e )[ 65 ]
123456789
Käse has 5 characters
( K )[ 4B ](  )[ FFFFFFC3 ](  )[ FFFFFFA4 ]( s )[ 73 ]( e )[ 65 ]
123456789
λόγος has 10 characters
(  )[ FFFFFFCE ](  )[ FFFFFFBB ](  )[ FFFFFFCF ](  )[ FFFFFF8C ](  )[ FFFFFFCE ](  )[ FFFFFFB3 ](  )[ FFFFFFCE ](  )[ FFFFFFBF ](  )[ FFFFFFCF ](  )[ FFFFFF82 ]