I have written this method to reverse a string:
    // Requires: using System.Collections.Generic; using System.Globalization; using System.Linq;
    public string Reverse(string s)
    {
        if (string.IsNullOrEmpty(s))
            return s;

        TextElementEnumerator enumerator =
            StringInfo.GetTextElementEnumerator(s);
        var elements = new List<char>();
        while (enumerator.MoveNext())
        {
            var cs = enumerator.GetTextElement().ToCharArray();
            if (cs.Length > 1)
            {
                // Pre-reverse multi-char elements so the final Reverse() restores their internal order.
                elements.AddRange(cs.Reverse());
            }
            else
            {
                elements.AddRange(cs);
            }
        }

        elements.Reverse();
        return string.Concat(elements);
    }
Now, I don't want to start a discussion about how this code could be made more efficient or how there are one-liners that I could use instead. I'm aware that you can perform XORs and all sorts of other things to potentially improve this code. If I want to refactor the code later, I can do that easily, as I have unit tests.
Currently, this correctly reverses BMP strings (including strings with accents like "Les Misérables") and strings that contain combining characters such as "Les Mise\u0301rables".
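To illustrate why the combining-character case works, here is a minimal sketch of how the text element enumerator groups "e" and U+0301 into a single element of two chars. The class and method names (`TextElementDemo`, `ElementLengths`) are made up for this illustration and are not part of my actual code.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class TextElementDemo
{
    // Hypothetical helper: returns the length in chars of each text element in s.
    public static List<int> ElementLengths(string s)
    {
        var lengths = new List<int>();
        TextElementEnumerator enumerator = StringInfo.GetTextElementEnumerator(s);
        while (enumerator.MoveNext())
        {
            lengths.Add(enumerator.GetTextElement().Length);
        }
        return lengths;
    }

    public static void Main()
    {
        // 'M', 'i', 's' are one char each; "e" + U+0301 is one element of two chars.
        Console.WriteLine(string.Join(", ", ElementLengths("Mise\u0301"))); // 1, 1, 1, 2
    }
}
```

Because the last element spans two chars, the method above reverses it internally first, and the final reversal of the whole list puts the accent back after its base letter.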
My test that contains surrogate pairs works if they are expressed like this:
Assert.AreEqual("", _stringOperations.Reverse(""));
But if I express surrogate pairs like this:
Assert.AreEqual("\u10000", _stringOperations.Reverse("\u10000"));
then the test fails. Is there an air-tight implementation that supports surrogate pairs as well?
If I have made any mistakes above, please do point them out, as I'm no Unicode expert.
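One thing that may be worth checking in the failing test is how the literal itself is parsed. In C#, the `\u` escape takes exactly four hex digits, so "\u10000" is U+1000 followed by the digit '0', not a surrogate pair; the eight-digit `\U` escape (or an explicit high/low pair) is the form that encodes U+10000. The sketch below only inspects the literals; `EscapeDemo` and `IsSingleSupplementaryCodePoint` are hypothetical names introduced for illustration.

```csharp
using System;

public static class EscapeDemo
{
    // Hypothetical helper: true when s consists of exactly one surrogate pair.
    public static bool IsSingleSupplementaryCodePoint(string s)
    {
        return s.Length == 2 && char.IsSurrogatePair(s, 0);
    }

    public static void Main()
    {
        string fourDigit = "\u10000";         // U+1000 followed by '0'
        string eightDigit = "\U00010000";     // U+10000, stored as a surrogate pair
        string explicitPair = "\uD800\uDC00"; // the same code point, written as high + low surrogate

        Console.WriteLine(IsSingleSupplementaryCodePoint(fourDigit));  // False
        Console.WriteLine(IsSingleSupplementaryCodePoint(eightDigit)); // True
        Console.WriteLine(eightDigit == explicitPair);                 // True
        Console.WriteLine(fourDigit[1]);                               // 0
    }
}
```

If that reading is right, reversing "\u10000" would give "0\u1000", which would explain the failed assertion without the method mishandling surrogate pairs.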