So I know about String#codePointAt(int), but it's indexed by the char offset, not by the codepoint offset.
I'm thinking about trying something like:
- using String#charAt(int) to get the char at an index
- testing whether the char is in the high-surrogates range
  - if so, use String#codePointAt(int) to get the codepoint, and increment the index by 2
  - if not, use the given char value as the codepoint, and increment the index by 1
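
In code, the steps above would look roughly like the sketch below. The class name CodePointWalk and the method name codePoints are made up for illustration. One deviation from the steps as written: instead of always advancing by 2 after a high surrogate, the sketch advances by Character.charCount(cp), so an unpaired high surrogate (where codePointAt just returns that char) moves the index by 1 rather than skipping the next char.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, named only for this example.
final class CodePointWalk {

    // Walks the string by char index and collects one int per code point.
    static List<Integer> codePoints(String s) {
        List<Integer> out = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                // codePointAt combines a valid surrogate pair into one code point;
                // for an unpaired high surrogate it returns the char value itself.
                int cp = s.codePointAt(i);
                out.add(cp);
                // charCount is 2 for a combined pair, 1 for an unpaired surrogate.
                i += Character.charCount(cp);
            } else {
                // Not a high surrogate: use the char value as the code point.
                out.add((int) c);
                i += 1;
            }
        }
        return out;
    }
}
```

Calling CodePointWalk.codePoints("a\uD83D\uDE00b") should give three entries, with the surrogate pair in the middle combined into the single code point U+1F600.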
But my concerns are:
- I'm not sure whether codepoints which are naturally in the high-surrogates range will be stored as two char values or one
- this seems like an awfully expensive way to iterate through characters
- someone must have come up with something better.