
I'm opening an SSH session from Fedora to Raspberry Pi OS. Accented and special characters are replaced with question marks. Preferably I would like to learn to solve this without changing the server's charmap to UTF-8.

The client's charmap :

$ locale -k charmap
charmap="UTF-8"

The server's charmap :

$ locale -k charmap
charmap="ISO-8859-15"

Weirdly enough, that charmap is not listed by localectl on the server:

$ localectl list-locales
C.UTF-8

The client's localectl list-locales only contains locales ending in .UTF-8.

The client is set to send its locale :

/etc/ssh/ssh_config :

SendEnv LANG LC_*

Setting the client to charmap="ISO-8859-15" does not solve the problem.

When the client has charmap="ISO-8859-15", some characters are incorrectly displayed even in a local terminal.

Both systems have the following value for $TERM :

xterm-256color

These solutions didn't solve it for me:

https://superuser.com/questions/1788839/wrong-character-encoding-in-ssh-session-but-not-for-all-connectios

ssh and character encoding

Kusalananda
asked Aug 28 at 19:05
  • UTF-8 is the modern correct solution. Can you edit your question to explain why you don't want to take the path of least resistance? Commented Aug 28 at 19:34
  • To learn how to do it, and because someday I may have this problem on a server I don't fully control. Commented Aug 28 at 19:50
  • What happens when you change the locale on the client to match the server's before you connect? Commented Aug 28 at 19:53
  • It doesn't solve it. Both systems now have locale -k charmap output charmap="ISO-8859-15" (after logging out / in) but special letters are still weird. Commented Aug 28 at 20:39
  • After changing the locale, certain special characters can't be displayed in the terminal locally now. Commented Aug 28 at 21:07

1 Answer

If locale -k charmap reports ISO-8859-15 on the remote system, then LC_ALL, or LC_CTYPE if LC_ALL is not set, or LANG if both are unset, is set to a locale that uses ISO-8859-15 as its charmap. Running locale without arguments gives a summary of the locale settings.
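
That precedence can be sketched as a small shell function. This is only an illustration: effective_ctype is a hypothetical helper name, not a standard command, and the final fallback to C is an assumption, since the default used when nothing is set is implementation-defined.

```shell
# Print the locale name that decides the charmap for the current environment.
# A variable that is set but empty counts as unset.
effective_ctype() {
  if [ -n "${LC_ALL:-}" ]; then
    printf '%s\n' "$LC_ALL"        # LC_ALL overrides everything else
  elif [ -n "${LC_CTYPE:-}" ]; then
    printf '%s\n' "$LC_CTYPE"      # then the category-specific variable
  elif [ -n "${LANG:-}" ]; then
    printf '%s\n' "$LANG"          # then the general default
  else
    printf '%s\n' C                # assumed fallback when nothing is set
  fi
}
```

Running effective_ctype in the ssh session, next to locale charmap, shows which of the three variables is responsible for the charmap you get.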

The reason you don't see that locale in the output of systemd's localectl is that, since that commit, by default it only lists locales whose name ends in .UTF-8 or contains .UTF-8@ (whether or not they actually use UTF-8 as charmap, though it would be misleading to name a locale with .UTF-8 if it didn't).

Use locale -a or SYSTEMD_LIST_NON_UTF8_LOCALES=1 localectl list-locales [1] to list all locales regardless of their name.

$ localectl list-locales
C.UTF-8
da_DK.UTF-8
en_DK.UTF-8
[...]
$ SYSTEMD_LIST_NON_UTF8_LOCALES=1 localectl list-locales
C.UTF-8
C.iso88591
da_DK.UTF-8
el_GR
el_GR.iso88597
[...]
$ locale -a
C
C.iso88591
C.utf8
da_DK.utf8
el_GR
el_GR.iso88597
[...]

The system's default locale is one thing, but any process can use a different locale. Different users on a system can typically very well use different locales depending on which region/culture they're from, what language they speak, etc.

If the locale in use in your environment on the client machine is one that uses the UTF-8 charmap (such as en_GB.UTF-8), then that is most likely also the charmap in use in your terminal emulator. For instance, when you type é, it will send bytes 0xc3 and 0xa9, which are the UTF-8 encoding of the é character, and if it receives those 2 bytes, it will display an é.

So you should not use a locale that uses another charmap, locally or remotely over ssh, as that won't work properly. For instance, software running in a locale where the charmap is ISO-8859-15 would interpret those two 0xc3 0xa9 bytes sent upon pressing é as the Ã and © characters, as those bytes are the encoding of those characters in that charmap.
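
To see that misinterpretation without touching any locale settings, one can decode the two bytes as ISO-8859-15 by hand. This is a sketch that assumes an iconv that knows both charmaps (glibc's does); the variable name misread is only for illustration.

```shell
# 0xc3 0xa9 (octal 303 251) is what a UTF-8 terminal sends for é.
# Read those bytes as ISO-8859-15, then convert to UTF-8 so that a UTF-8
# terminal can show what the remote software believes it received.
misread=$(printf '\303\251' | iconv -f ISO-8859-15 -t UTF-8)
printf '%s\n' "$misread"    # two characters instead of one é
```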

So you should update the locale settings in your ssh session so that at least the LC_CTYPE category is assigned a locale that uses UTF-8.

Using the SendEnv LANG LC_* ssh client configuration directive is one way to achieve it, but the server must be configured with the corresponding AcceptEnv LANG LC_* if it's also an OpenSSH implementation (usually in /etc/sshd_config or /etc/ssh/sshd_config or files in a directory with a .d suffix), and the locale settings must not be overridden by session initialisation files such as /etc/profile and ~/.profile for Bourne-like shells.
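
One way to check both ends is to look at the effective configuration that OpenSSH reports with ssh -G and sshd -T. A minimal sketch follows; env_patterns is a hypothetical helper name, user@host is a placeholder, and the helper assumes the whitespace-separated "keyword value" lines that those two dumps print.

```shell
# env_patterns KEYWORD: read ssh/sshd configuration text on stdin and print
# the patterns listed after KEYWORD (matched case-insensitively), one per line.
env_patterns() {
  awk -v kw="$1" 'tolower($1) == tolower(kw) { for (i = 2; i <= NF; i++) print $i }'
}

# On the client: which variables would be sent to this host?
#   ssh -G user@host | env_patterns sendenv
# On the server (usually needs root): which variables are accepted?
#   sshd -T | env_patterns acceptenv
```

If LANG and LC_* show up on the client side but not on the server side, the locale variables are dropped by sshd before your shell starts.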

Here, it looks like the remote system supports a C.UTF-8 locale, so you can do: export LC_CTYPE=C.UTF-8; unset -v LC_ALL (assuming a Korn/POSIX-like shell, you may need to adapt to the syntax of the login shell of the remote user) in the ssh session to update the locale to use the UTF-8 charmap.

If it didn't, another approach would be to either reconfigure your terminal to use ISO-8859-15 instead of UTF-8 for its encoding, or use luit, which can transcode characters on the fly. That is, call:

luit -encoding ISO-8859-15 -- ssh user@host

(or call luit in a locale that uses ISO-8859-15 as the charmap matching that of the remote system).

Example (here simulating your ssh with a new shell in which I set the locale to en_GB.iso885915):

$ luit -encoding ISO-8859-15 -- zsh
$ export LANG=en_GB.iso885915
$ printf %s 'é' | od -An -vtx1
 e9

Upon pressing é, my terminal sent 0xc3 0xa9, which luit translated to 0xe9. The shell line editor echoed it as 0xe9, which luit translated back to 0xc3 0xa9 so that the terminal displays é. As you can see, od correctly received the 0xe9 byte, which is the correct encoding of é in the ISO-8859-15 charmap.
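
The two translations that luit performs in that example can be imitated with iconv, as a rough sketch. luit itself works on the live terminal stream, which iconv does not do, and the variable names here are only for illustration.

```shell
# Keyboard to application: the UTF-8 bytes for é (0xc3 0xa9) become ISO-8859-15.
to_app=$(printf '\303\251' | iconv -f UTF-8 -t ISO-8859-15 | od -An -vtx1)
# Application to screen: the ISO-8859-15 byte 0xe9 (octal 351) becomes UTF-8 again.
to_screen=$(printf '\351' | iconv -f ISO-8859-15 -t UTF-8 | od -An -vtx1)
printf 'to application:%s\nto screen:%s\n' "$to_app" "$to_screen"
```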

In any case, the value of $TERM is not relevant. It's set by terminal emulators to tell applications what kind of emulation the terminal implements. xterm-256color means your terminal emulator claims to implement an emulation compatible with that of xterm when supporting 256 colours. It doesn't say anything about character encoding. The ssh client will pass $TERM to the server when in interactive mode (like when in rlogin mode when not specifying a remote command to run, or with -t), and the ssh server will then propagate it to the remote shell or command regardless of AcceptEnv.


[1] That SYSTEMD_LIST_NON_UTF8_LOCALES environment variable is only documented in doc/ENVIRONMENT.md in the source code, which Debian-based systems include as /usr/share/doc/systemd/ENVIRONMENT.md.gz.

answered Aug 29 at 9:13
