12

We know there is an application called AppLocale, which can change the code page of non-Unicode applications, to solve text display problems.

But I have a program whose text is encoded in UTF-8 and should be displayed as UTF-8. Instead, Windows interprets it using the native code page, which makes the text unreadable. It seems odd that the list of locales covers almost every country and region but does not include UTF-8. I think this is a bug: the programmers probably worked in English and never tested how non-English text is displayed. I don't expect the vendor to fix it, so I want to fix it myself.

Is it possible to set the non-Unicode output to UTF-8 using software like AppLocale? Is the default non-Unicode output the native code page? How can I set the native code page to UTF-8?

phuclv

3 Answers

13

Application-Wide

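Microsoft has also added the ability for programs to use the UTF-8 locale without setting the UTF-8 beta flag described below. You can use the /execution-charset:utf-8 or /utf-8 option when compiling with MSVC, or set the activeCodePage property in the application manifest (the appxmanifest for packaged apps). Note that the compiler options control how string literals are encoded in the binary, while activeCodePage changes the code page the process runs with.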
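One way to check which code page a process actually ended up with is to ask Windows for the active ANSI code page. The following is only a minimal sketch, assuming a Windows build where GetACP() from the Windows headers is available. The helper names and the -1 "not built for Windows" result are made up for illustration and are not part of any Microsoft API.

```c
#include <assert.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* 65001 is the code page identifier Windows uses for UTF-8 (CP_UTF8). */
#define UTF8_CODE_PAGE 65001u

/* Hypothetical helper: 1 if the code page identifier is UTF-8, else 0. */
int is_utf8_code_page(unsigned int cp)
{
    return cp == UTF8_CODE_PAGE;
}

/* Hypothetical helper: 1 if the process ANSI code page is UTF-8,
   0 if it is some other code page, -1 when not built for Windows. */
int process_acp_is_utf8(void)
{
#ifdef _WIN32
    return is_utf8_code_page(GetACP());
#else
    return -1;
#endif
}

/* Turn the status from process_acp_is_utf8() into a short label. */
const char *acp_status_text(int status)
{
    assert(status >= -1 && status <= 1);
    if (status == 1) return "active code page is UTF-8";
    if (status == 0) return "active code page is not UTF-8";
    return "not a Windows build, cannot query the code page";
}
```

Calling acp_status_text(process_acp_is_utf8()) at startup and logging the result should show whether the manifest or the OS-wide setting took effect for that process.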

You can also use UTF-8 locale in older Windows versions by linking with the appropriate C runtime:

Starting in Windows 10 build 17134 (April 2018 Update), the Universal C Runtime supports using a UTF-8 code page. This means that char strings passed to C runtime functions will expect strings in the UTF-8 encoding. To enable UTF-8 mode, use "UTF-8" as the code page when using setlocale. For example, setlocale(LC_ALL, ".utf8") will use the current default Windows ANSI code page (ACP) for the locale and UTF-8 for the code page.

To use this feature on an OS prior to Windows 10, such as Windows 7, you must use app-local deployment or link statically using version 17134 of the Windows SDK or later. For Windows 10 operating systems prior to 17134, only static linking is supported.

This is according to UTF-8 Support - Microsoft Docs.
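As a rough sketch of the setlocale call quoted above: the wrapper names below are hypothetical, and whether ".UTF8" is accepted depends on the C runtime the program is linked against (a runtime without this support is expected to return NULL and leave the locale unchanged).

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: try to switch every locale category to `name`.
   Returns 1 if the C runtime accepted the name, 0 if it rejected it. */
int try_set_locale(const char *name)
{
    assert(name != NULL);
    return setlocale(LC_ALL, name) != NULL;
}

/* Hypothetical helper: request the UTF-8 code page using the ".UTF8"
   spelling from the Microsoft documentation quoted above. */
int enable_utf8_locale(void)
{
    return try_set_locale(".UTF8");
}
```

A program would call enable_utf8_locale() once at startup and fall back to its existing behaviour when it returns 0.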

OS-Wide

Previously it was not possible because Microsoft claimed a UTF-8 locale might break some functions (a possible example is _mbsrev) as they were written to assume multi-byte encodings used no more than 2 bytes per character. Thus, until now, code pages with more bytes, such as GB 18030 (cp54936) and UTF-8, could not be set as the locale.

This is according to Unicode in Microsoft Windows - Wikipedia.
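To illustrate why a two-bytes-per-character assumption breaks under UTF-8, here is a small sketch that reads the sequence length from the lead byte. The function names are invented for this example, and the check is deliberately simplified: it does not reject overlong or out-of-range lead bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Length in bytes of the UTF-8 sequence that starts with `lead`,
   or 0 if `lead` is a continuation byte or not a lead byte at all. */
size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80) return 1;             /* 0xxxxxxx: ASCII */
    if ((lead & 0xE0) == 0xC0) return 2;   /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3;   /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0) return 4;   /* 11110xxx */
    return 0;
}

/* Longest sequence length found in a NUL-terminated UTF-8 string.
   Stray continuation bytes are skipped one at a time. */
size_t utf8_longest_sequence(const char *s)
{
    size_t longest = 0;
    assert(s != NULL);
    while (*s != '\0') {
        size_t n = utf8_seq_len((unsigned char)*s);
        size_t i;
        if (n == 0) {
            s++;
            continue;
        }
        if (n > longest) longest = n;
        /* Advance past the sequence without running over the terminator. */
        for (i = 0; i < n && *s != '\0'; i++) s++;
    }
    return longest;
}
```

Any result above 2 from utf8_longest_sequence() marks text that code written for double-byte code pages would mishandle.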

However, since Windows 10 Insider build 17035 there has been a "Beta: Use Unicode UTF-8 for worldwide language support" checkbox that sets the locale code page to UTF-8:

[Screenshot: the "Beta: Use Unicode UTF-8 for worldwide language support" checkbox]

See also:

  1. Changing ansi and OEM code page in Windows
  2. Windows 10 Insider Preview Build 17035 Supports UTF-8 as ANSI

That said, the support is still buggy at this point:

  1. Freeze issue in Windows 10 1803 when use UTF-8 as default code page
  2. when unicode beta support in windows 10 is turned on, add-ons fail to install
  3. UTF-8 support for single byte character sets is beta in Windows and likely breaks a lot of applications not expecting this
  4. Build fail with internal error in MSVC
ᄂ ᄀ
phuclv
0

From what I read about the Microsoft AppLocale tool on Wikipedia, the tool can NOT change your code page to UTF-8. It only works with non-Unicode applications, but UTF-8 is part of the Unicode standard.

Under the hood, Unicode processing of non-ASCII characters differs greatly from non-Unicode processing. So while it is possible to switch between non-Unicode code pages (this is what AppLocale does), it is NOT possible to switch between Unicode and non-Unicode unless the producer modifies the application.

miroxlav
0

Just to mention it here: Windows 10 build 17133 now has a beta option to use UTF-8 for worldwide language support. As of now it does not help with non-Unicode programs for me, even though it sits in the same pop-up where I change the locale for non-Unicode programs.

So maybe they are working on something that removes the need to change the locale for non-Unicode programs.

hippietrail