I've just arrived at my new lab in Japan, and the server I can use has only Japanese locales. A call to locale -a returns:
C
POSIX
ja_JP
ja_JP.eucjp
ja_JP.ujis
ja_JP.utf8
japanese
japanese.euc
So I changed my environment variables, and now my locale is set to ja_JP.utf8, which should support Unicode just fine. A call to locale now returns (changed from eucjp):
LANG=ja_JP.utf8
LANGUAGE=
LC_CTYPE="ja_JP.utf8"
LC_NUMERIC="ja_JP.utf8"
LC_TIME="ja_JP.utf8"
LC_COLLATE="ja_JP.utf8"
LC_MONETARY="ja_JP.utf8"
LC_MESSAGES="ja_JP.utf8"
LC_PAPER="ja_JP.utf8"
LC_NAME="ja_JP.utf8"
LC_ADDRESS="ja_JP.utf8"
LC_TELEPHONE="ja_JP.utf8"
LC_MEASUREMENT="ja_JP.utf8"
LC_IDENTIFICATION="ja_JP.utf8"
LC_ALL=
I can read files containing Japanese characters in Unicode just fine, whether I'm using less, emacs or vim, and whether I'm connecting from PuTTY or from a remote xterm under Cygwin. Other Unicode characters also seem to display fine.
But here comes the problem: when I type something in Japanese, it goes wrong. I use IRC, and while I can read any Japanese character perfectly well, anything I type is received as garbage by other people. I'm using the configuration found here: http://xkr47.outerspace.dyndns.org/howtos/irssi-utf-8-guide.txt
I'm getting these results for
/set charset
term_charset = utf-8
recode_out_default_charset = ISO-8859-15
and /set recode
recode = ON
recode_autodetect_utf8 = ON
recode_fallback = ISO-8859-15
recode_out_default_charset = ISO-8859-15
recode_transliterate = ON
If you have suggestions, please try to think of a way that doesn't require root rights, since it would take forever for the administrator to actually do something on the server. I've searched a lot online about locales, but I didn't find anything about this problem.
Which IRC client are you using? If you run cat > testfile.txt, does it store the typed text correctly? – user1686 – 2015-04-23T07:23:38.987
I'm running irssi. I'm using the same config file as on my other server where everything works fine. I can't try cat now but I will do this tomorrow when I get back in the lab. – meneldal – 2015-04-23T11:24:34.483
With cat I'm getting a text file I can read on both Linux and Windows. Notepad++ says it's encoded in UTF-8 without BOM. Native Notepad also opens it fine. Documents I create with nano also use this encoding. – meneldal – 2015-04-24T01:29:29.360
So your terminal is working fine, but Irssi is interpreting things weirdly. Could you check what /exec locale outputs, as well as /set charset and /set recode? – user1686 – 2015-04-24T05:42:57.477
I'm getting LANG=ja_JP.UTF-8 (and the other lines the same, LC_ALL not set), term_charset = utf-8 and recode_out_default_charset = ISO-8859-15, and for the last one recode = ON, recode_autodetect_utf8 = ON, recode_fallback = ISO-8859-15, recode_out_default_charset = ISO-8859-15, recode_transliterate = ON – meneldal – 2015-04-24T05:48:13.517
By the way, since I noticed I had LANG=ja_JP.UTF-8 instead of LANG=ja_JP.utf8, I quit that console again and logged out completely, so now it's also LANG=ja_JP.utf8, but it's still not working. I see my own message correctly, but people don't receive it right, while the messages other people send me work fine (I'm actually testing by sending messages from my other server, including characters not present in Shift JIS or EUC-JP). – meneldal – 2015-04-24T05:56:59.320
It's most likely caused by recode_out_default_charset telling Irssi to convert everything to ISO-8859-15. Fix that setting. – user1686 – 2015-04-24T06:10:12.590
What should I put there instead? I'm using the recommended settings from the irssi FAQ so I assume it should be working. – meneldal – 2015-04-24T06:12:14.830
Uh, UTF-8, what else. – user1686 – 2015-04-24T06:12:43.487
Thank you, that does fix the irssi problem. I'm pretty sure I pasted the settings from the same place though. – meneldal – 2015-04-24T06:18:00.937
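
The diagnosis in the comments can be illustrated with a minimal Python sketch. It assumes nothing about irssi's internals: Python's built-in codecs merely stand in for the outgoing recode step, and the sample string and helper names (TEXT, can_encode, lossy_encode) are made up for this example. It shows that ISO-8859-15 has no code points for Japanese characters, so any conversion to it has to drop or replace them, whereas UTF-8 round-trips the text intact.

```python
# Illustration only: Python codecs stand in for the client's outgoing
# recode step. This is not irssi code, and all names here are hypothetical.
TEXT = "こんにちは"  # sample Japanese input chosen for this example


def can_encode(text: str, charset: str) -> bool:
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def lossy_encode(text: str, charset: str) -> bytes:
    """Encode text, substituting '?' for characters the charset lacks."""
    return text.encode(charset, errors="replace")


if __name__ == "__main__":
    # Compare the charset from the question with the one from the fix.
    for charset in ("iso-8859-15", "utf-8"):
        print(charset, can_encode(TEXT, charset), lossy_encode(TEXT, charset))
```

This matches the symptom described in the thread: incoming text displays correctly because the terminal charset is UTF-8, while outgoing text is damaged because the target charset cannot represent it.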