As of C++11, there are additional standard codecvt specialisations and types, intended for converting between various UTF-x and UCSx character sequences; one of these may suit your needs.
In <locale>:
std::codecvt<char16_t, char, std::mbstate_t>: Converts between UTF-16 and UTF-8.
std::codecvt<char32_t, char, std::mbstate_t>: Converts between UTF-32 and UTF-8.
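As a rough sketch of how the <locale> specialisations can be used directly, the following pulls the facet out of the classic locale and calls its in() member. The helper name utf8_to_utf16_facet is made up for illustration, and error handling is reduced to a single exception.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of the standard library): decodes UTF-8 into
// UTF-16 with the std::codecvt<char16_t, char, std::mbstate_t> facet.
std::u16string utf8_to_utf16_facet(const std::string& utf8) {
    using Facet = std::codecvt<char16_t, char, std::mbstate_t>;
    // The classic "C" locale always carries the standard codecvt facets.
    const Facet& facet = std::use_facet<Facet>(std::locale::classic());

    std::mbstate_t state{};
    // UTF-16 never needs more code units than UTF-8 needs bytes.
    std::u16string out(utf8.size(), u'\0');

    const char* from_next = nullptr;
    char16_t* to_next = nullptr;
    const std::codecvt_base::result r = facet.in(
        state,
        utf8.data(), utf8.data() + utf8.size(), from_next,
        &out[0], &out[0] + out.size(), to_next);

    if (r != std::codecvt_base::ok) {
        throw std::runtime_error("UTF-8 to UTF-16 conversion failed");
    }
    // On success the whole input has been consumed.
    assert(from_next == utf8.data() + utf8.size());

    // Trim the buffer to the number of code units actually written.
    out.resize(static_cast<std::size_t>(to_next - &out[0]));
    return out;
}
```

Calling utf8_to_utf16_facet("abc") should give u"abc". Note that these two specialisations are deprecated as of C++20, so a recent compiler may warn about them.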
In <codecvt>:
std::codecvt_utf8_utf16<typename Elem>: Converts between UTF-8 and UTF-16, where UTF-16 code units are stored as the specified Elem (note that if char32_t is specified, only one 16-bit code unit will be stored per char32_t).
- Has two additional, defaulted template parameters (unsigned long MaxCode = 0x10ffff and std::codecvt_mode Mode = (std::codecvt_mode)0), and inherits from std::codecvt<Elem, char, std::mbstate_t>.
std::codecvt_utf8<typename Elem>: Converts between UTF-8 and either UCS2 or UCS4, depending on Elem (UCS2 for char16_t, UCS4 for char32_t, platform-dependent for wchar_t).
- Has two additional, defaulted template parameters (unsigned long MaxCode = 0x10ffff and std::codecvt_mode Mode = (std::codecvt_mode)0), and inherits from std::codecvt<Elem, char, std::mbstate_t>.
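std::codecvt_utf16<typename Elem>: Converts between UTF-16 and either UCS2 or UCS4, depending on Elem (UCS2 for char16_t, UCS4 for char32_t, platform-dependent for wchar_t). Here the UTF-16 side is a byte stream stored in char, big-endian unless std::little_endian is set in Mode.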
- Has two additional, defaulted template parameters (unsigned long MaxCode = 0x10ffff and std::codecvt_mode Mode = (std::codecvt_mode)0), and inherits from std::codecvt<Elem, char, std::mbstate_t>.
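The <codecvt> facets are usually easiest to drive through std::wstring_convert (from <locale>), which isn't mentioned above but is the standard wrapper for them. A minimal sketch, assuming a pre-C++26 toolchain; the function names utf8_to_utf16 and utf16_to_utf8 are just for illustration:

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Illustrative helper: UTF-8 bytes -> UTF-16 code units.
// from_bytes() throws std::range_error if the input is not valid UTF-8.
std::u16string utf8_to_utf16(const std::string& utf8) {
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> conv;
    return conv.from_bytes(utf8);
}

// Illustrative helper: UTF-16 code units -> UTF-8 bytes.
// to_bytes() throws std::range_error on malformed UTF-16 (e.g. a lone surrogate).
std::string utf16_to_utf8(const std::u16string& utf16) {
    std::wstring_convert<std::codecvt_utf8_utf16<char16_t>, char16_t> conv;
    return conv.to_bytes(utf16);
}
```

Bear in mind that both std::wstring_convert and the <codecvt> header were deprecated in C++17, so newer compilers may emit deprecation warnings for this.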
codecvt_utf8 and codecvt_utf16 will convert between the specified UTF and either UCS2 or UCS4, depending on the size of Elem. Therefore, wchar_t will specify UCS2 on systems where it's 16- to 31-bit (such as Windows, where it's 16-bit), or UCS4 on systems where it's at least 32-bit (such as Linux, where it's 32-bit), regardless of whether wchar_t strings actually use that encoding; on platforms that use different encodings for wchar_t strings, this will understandably cause problems if you aren't careful.
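To make the UCS2/UCS4 distinction concrete, here is a small sketch that checks whether a UTF-8 string can be decoded by codecvt_utf8<Elem> for a given Elem. The name decodes_with_codecvt_utf8 is hypothetical, and it assumes the implementation reports out-of-range code points as a conversion error (which is what I'd expect from libstdc++ and libc++; other standard libraries may behave differently).

```cpp
#include <codecvt>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns true if codecvt_utf8<Elem> accepts `utf8`.
// With a 16-bit Elem the facet targets UCS2, so code points above U+FFFF
// cannot be represented; with a 32-bit Elem it targets UCS4.
template <typename Elem>
bool decodes_with_codecvt_utf8(const std::string& utf8) {
    std::wstring_convert<std::codecvt_utf8<Elem>, Elem> conv;
    try {
        const std::basic_string<Elem> decoded = conv.from_bytes(utf8);
        (void)decoded;  // only success or failure matters here
        return true;
    } catch (const std::range_error&) {
        // wstring_convert signals a failed conversion with std::range_error.
        return false;
    }
}
```

For U+1F600 ("\xF0\x9F\x98\x80" in UTF-8), this should succeed with char32_t and fail with char16_t; with wchar_t the outcome depends on the platform's wchar_t size, which is exactly the trap described above.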
For more information, see the corresponding pages on cppreference.
Note that libstdc++ only gained the <codecvt> header in GCC 5. If you're using an older version of GCC, or Clang with an older libstdc++, you may have to switch to libc++ if you want to use it.
Note that versions of Visual Studio prior to 2015 don't actually support char16_t and char32_t; if these types exist on earlier versions, they are typedefs for unsigned short and unsigned int, respectively. Also note that older versions of Visual Studio can sometimes have trouble converting strings between UTF encodings, and that Visual Studio 2015 has a glitch that prevents codecvt from working properly with char16_t and char32_t, requiring the use of same-sized integral types instead.