10

I spend a lot of my time sshed into various machines, all of which are different (some are embedded, some run Linux, some run BSD, &c.). On my own local machines, however, I use OS X, which of course has a userland based on BSD. My locale on those machines is set to en_GB.UTF-8, which is one of the available options:

% echo `sw_vers`
ProductName: Mac OS X ProductVersion: 10.8.2 BuildVersion: 12C60
% locale -a | grep -i 'en_gb.utf'
en_GB.UTF-8

Several of the more-capable Linux systems I use appear to have an equivalent option, but I note that on Linux the name is slightly different:

% lsb_release -d
Description: Debian GNU/Linux 6.0.3 (squeeze)
% locale -a | grep -i 'en_gb.utf' 
en_GB.utf8

This makes me wonder: When I ssh into a Linux machine from my Mac, and it forwards all of my LC_* variables with that 'UTF-8' suffix, does that Linux machine even understand what is being asked of it? Or is it just falling back to some other locale?

Here is an example of what I'm referring to:

% ssh -v odin
...
debug1: Entering interactive session.
debug1: Sending environment.
debug1: Sending env LC_ALL = en_GB.UTF-8
debug1: Sending env LC_COLLATE = en_GB.UTF-8
debug1: Sending env LC_CTYPE = en_GB.UTF-8
debug1: Sending env LC_MESSAGES = en_GB.UTF-8
debug1: Sending env LC_MONETARY = en_GB.UTF-8
debug1: Sending env LC_NUMERIC = en_GB.UTF-8
debug1: Sending env LC_TIME = en_GB.UTF-8
debug1: Sending env LANG = en_GB.UTF-8
odin:~ % locale | tail -1  # locale is set to .UTF-8 without error...
LC_ALL=en_GB.UTF-8
odin:~ % locale -a | grep 'en_GB.UTF-8'  # ... even though .UTF-8 isn't an option
odin:~ % 

In either case, what is the mechanism behind its behaviour, and is it dependent on any particular set-up (e.g., will I see the same behaviour on a BusyBox-based system as on a GNU-based one)?

Giacomo1968
  • 58,727
kine
  • 1,849

2 Answers

1

It's an interesting question, but I think there may be a misconception in there about how variables are set up. When a secure shell session is initiated (ssh remotehost), what happens at the other end is an instantiation of a new shell with a separate environment. That is a fancy way of saying that the server starts a fresh shell. That new shell may or may not be configured with the same locale as your original local shell.

For example:

geee: ~
$ echo `locale |grep LANG` :: `date`
LANG=en_US.UTF-8 :: Mon Dec 3 07:04:00 CET 2012

$ ssh flode
flode: ~
$ echo `locale |grep LANG` :: `date`
LANG=nb_NO.UTF-8 LANGUAGE=nb_NO.UTF-8 :: ma. 03. des. 06:59:33 +0100 2012

To demonstrate this, I set the locale on the remote shell to Norwegian by adding the following lines to the ~/.bash_profile file:

export     LANG=nb_NO.UTF-8
export LANGUAGE=nb_NO.UTF-8
export   LC_ALL=nb_NO.UTF-8

Similarly, you will have to set up the environment on the remote shell to do the same. Of course, other shells read different startup files such as ~/.zprofile for the Z shell.

The misconception I suspected is the idea that the local variables (settings) are forwarded; they are in no way forwarded. The remote shell has its own settings. To list the locales available on the remote host, be it a minimalistic BusyBox shell or a full-blown GNU OS, use the locale command with the -a switch (as noted in the question). Any of the printed lines may be used as a locale setting for that environment.
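A minimal sketch of how one might check a requested name against that list. It assumes the only difference between spellings is in the codeset part (UTF-8 versus utf8, as seen in the question), and it compares names after lowercasing the codeset and dropping punctuation. As far as I know glibc applies a similar normalization when it looks up a locale, but other C libraries may not, so treat that as an assumption. The function names normalize_locale_name and locale_is_listed are hypothetical, and modifiers such as @euro are not handled.

```shell
# Normalize the codeset part of a locale name for comparison purposes:
# "en_GB.UTF-8" becomes "en_GB.utf8". Names without a dot are printed unchanged.
# Assumption: no "@modifier" suffix is present.
normalize_locale_name() {
    case $1 in
        *.*)
            lang=${1%%.*}      # part before the first dot, e.g. en_GB
            codeset=${1#*.}    # part after the first dot, e.g. UTF-8
            codeset=$(printf '%s' "$codeset" | tr -cd '[:alnum:]' | tr '[:upper:]' '[:lower:]')
            printf '%s.%s\n' "$lang" "$codeset"
            ;;
        *)
            printf '%s\n' "$1"
            ;;
    esac
}

# Read locale names on stdin (one per line, as printed by `locale -a`) and
# return 0 if one of them matches the requested name after normalization.
locale_is_listed() {
    wanted=$(normalize_locale_name "$1")
    while IFS= read -r candidate; do
        [ "$(normalize_locale_name "$candidate")" = "$wanted" ] && return 0
    done
    return 1
}
```

On the remote host this could be used as `locale -a | locale_is_listed en_GB.UTF-8 && echo listed`. A match here only says the name appears in the list under some spelling; it does not prove how the C library on that host resolves the name.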

As for the first question, the default locale that any shell starts with is usually configured in a central place such as /etc/profile. Most login shells read this file on startup.
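To see where a locale is actually being set on a given host, one option is to search the startup files for assignments to LANG, LANGUAGE, or LC_* variables. This is only a sketch: the function name find_locale_assignments is hypothetical, the list of files to search is up to you, and it only catches plain `NAME=value` or `export NAME=value` lines, not values set indirectly.

```shell
# Print "file:line:text" for every line in the given files that assigns
# LANG, LANGUAGE, or an LC_* variable (optionally preceded by "export").
# /dev/null is passed as an extra file so grep always prints the file name;
# unreadable or missing files are silently skipped.
find_locale_assignments() {
    grep -nE '^[[:space:]]*(export[[:space:]]+)?(LANG|LANGUAGE|LC_[A-Z]+)=' /dev/null "$@" 2>/dev/null
}
```

For example: `find_locale_assignments /etc/profile ~/.bash_profile ~/.zprofile`.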

1

Is the name for UTF-8 support slightly different on different systems for the following command as well?

LC_ALL='' locale charmap  # UTF-8 (on Mac OS X 10.6.8)

If you encounter weird locale-related issues, it may help to tell the SSH client not to send those LC_* variables by commenting out SendEnv LANG LC_* in /etc/ssh_config (see, for example, "Fixing Mac OS X Lion's SSH UTF-8 Issues" and "Terminal in OS X Lion: can't write åäö on remote machine").
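A minimal sketch of that edit, assuming the line appears in the file as `SendEnv LANG LC_*` (possibly indented). The function name comment_out_sendenv is hypothetical. It prints the modified configuration to standard output instead of changing the file in place, so the result can be reviewed first.

```shell
# Print the given ssh client config with any active "SendEnv ... LC_* ..." line
# commented out. Indentation is preserved; lines that are already commented
# are left alone because they do not start with "SendEnv".
comment_out_sendenv() {
    sed 's/^\([[:space:]]*\)\(SendEnv[[:space:]].*LC_\*.*\)$/\1# \2/' "$1"
}
```

For example, `comment_out_sendenv /etc/ssh_config > /tmp/ssh_config.new` writes a candidate file that can be compared with the original before copying it into place with the appropriate privileges.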

Another approach is this:

# from: http://mod16.org/hurfdurf/?p=189
tjac wrote:
Actually the real problem that's causing this is that Mac OS 10.7 sets totally 
non-standard locale values, at least when you tweak some of the formats in
SysPrefs/Language&Text as I did.

If you type "locale" on your Mac terminal you should see pretty much the same as on 
other Unices (e.g. lots of en_US.UTF-8s if you prefer US English), but you don't. 
If these garbled settings get transferred to other Unix hosts by the SendEnv option 
they naturally do not know what's going on.

So if you want to fix it cleanly to allow for sshing to all kinds of remote hosts,
including those with older character sets, put the following lines in your 
~/.bash_profile on your Mac client machine.

export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8

Monday, September 12, 2011 at 22:54 #
karly
  • 11