
I recently switched from OpenSuSE to Arch Linux. Files with Unicode characters in their names used to display fine, but after the switch I just get mojibake. For example, in my music library Queensrÿche appears as Queensrÿche.

This occurs on the console as well.

I piled on to a relevant thread in the Arch Linux forums, but haven't gotten an answer.

phuclv
Nathan

1 Answer


It's possible that your locale is configured improperly; the most likely reason is that, even though the file names are stored in UTF-8, your terminal (I'm guessing Konsole) still expects a legacy ISO-8859-* encoding.
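To illustrate the kind of mismatch described above, here is a minimal sketch that feeds the UTF-8 bytes of a file name through iconv as if they were ISO-8859-1. The helper name misread_as_latin1 is made up for this example, and it assumes iconv is installed; it only imitates what a misconfigured terminal would show and does not diagnose anything.

```shell
#!/bin/sh
# Hypothetical helper (not part of any tool): treat the UTF-8 bytes of $1 as
# ISO-8859-1 and re-encode them as UTF-8, imitating what a terminal that
# expects ISO-8859-1 would display.
misread_as_latin1() {
    printf '%s' "$1" | iconv -f ISO-8859-1 -t UTF-8
}

# "ÿ" is stored as the two bytes 0xC3 0xBF in UTF-8 (octal 303 277).
name=$(printf 'Queensr\303\277che')

misread_as_latin1 "$name"   # prints QueensrÃ¿che
echo
```

Each byte of the two-byte sequence is shown as its own ISO-8859-1 character, which is why one accented letter turns into two unrelated symbols.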

I don't know the rest, but here are a few steps to ensure the basic configuration is correct.

This script may also help.

  1. Edit /etc/locale.gen and ensure that your preferred .UTF-8 locale (e.g. en_US.UTF-8) is uncommented.

    (By default, Arch does not enable any locales.)

  2. Run locale-gen to generate the locales, if they haven't been generated yet.

    (Currently generated locales are listed by locale -a.)

  3. Edit /etc/locale.conf and add LANG=en_US.UTF-8.

    (The LOCALE= variable in /etc/rc.conf does the same thing, but is, in a way, deprecated in favor of locale.conf.)

  4. Log out completely, then log in again, in order to refresh the environment variables.

  5. Run env | grep -E '^(LANG|LC_)' | sort to see what locale settings are in your shell's environment.

    Run tr \\0 \\n < /proc/$PPID/environ | grep -E '^(LANG|LC_)' | sort to see the terminal's environment.

    Both commands should produce identical output. If they differ, both must at least show a LANG value ending in ".UTF-8". (".UTF-8" and ".utf8" can be considered identical.) Also, neither command should list LC_ALL, because it overrides LANG and every other LC_* variable.
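The checks in the steps above can be roughly automated. The following is only a sketch: the function names is_utf8_locale and check_locale_env are invented for this example, and the pattern ignores locale modifiers such as @euro. It inspects the current shell's environment only, not the terminal's.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a locale name ends in .UTF-8 or .utf8
# (any letter case). Modifiers such as "@euro" are not handled.
is_utf8_locale() {
    case "$1" in
        *.[Uu][Tt][Ff]-8|*.[Uu][Tt][Ff]8) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical helper: report on LANG and LC_ALL in the current environment.
# Returns 0 only if LANG is a UTF-8 locale and LC_ALL is unset or empty.
check_locale_env() {
    rc=0
    if is_utf8_locale "${LANG:-}"; then
        echo "LANG=${LANG} looks like a UTF-8 locale"
    else
        echo "LANG=${LANG:-<unset>} is not a UTF-8 locale"
        rc=1
    fi
    if [ -n "${LC_ALL:-}" ]; then
        echo "LC_ALL=${LC_ALL} is set and overrides LANG"
        rc=1
    fi
    return "$rc"
}

# List the UTF-8 locales that have been generated so far (step 2).
locale -a 2>/dev/null | grep -Ei '\.utf-?8$' || echo "no UTF-8 locale generated"

# Check the variables in this shell (step 5).
check_locale_env || echo "locale environment needs attention"
```

If the script reports a problem even after editing /etc/locale.conf, the usual cause is that the session has not been restarted yet (step 4).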

grawity