Questions tagged [utf-8]

UTF-8 is a variable-width character encoding of the Unicode character set in which each code point is represented by one to four bytes. Unlike some other encodings such as UTF-16, UTF-8 is backward compatible with 7-bit ASCII, so it can be processed to some degree by applications that are only aware of bytes.
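
As a rough illustration of the byte lengths and ASCII compatibility described above, here is a minimal Python 3 sketch. The sample characters and the helper name `utf8_length` are chosen only for this example.

```python
# Sample characters picked for illustration: ASCII, accented Latin,
# the euro sign, and a supplementary-plane emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]


def utf8_length(ch: str) -> int:
    """Return how many bytes UTF-8 uses to encode a single character."""
    return len(ch.encode("utf-8"))


for ch in samples:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} -> {utf8_length(ch)} byte(s): {encoded.hex()}")

# Pure ASCII text yields exactly the same bytes under both encodings,
# which is why byte-oriented tools can often pass UTF-8 through untouched.
ascii_text = "plain ASCII"
same_bytes = ascii_text.encode("utf-8") == ascii_text.encode("ascii")
print("ASCII bytes identical:", same_bytes)
```

The euro sign here encodes to the bytes `e2 82 ac`, the same sequence used in the `less` question further down the list.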

Full support of UTF-8 for searching, collation, word parsing, and so on does require support for Unicode concepts such as characters, normalisation, and supplementary characters. Many application and OS problems with "special characters", such as accented European letters or the ideographs used in Japanese and Chinese, derive from mismatched character encodings.
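
Two of the pitfalls mentioned above can be sketched in a few lines of Python 3. This is only an illustration under simple assumptions: the character "é" is an arbitrary example, and Latin-1 stands in for whichever wrong encoding a misconfigured tool might assume.

```python
import unicodedata

# The same visible character can be stored composed (NFC) or decomposed (NFD).
nfc = unicodedata.normalize("NFC", "\u00e9")  # one code point: U+00E9
nfd = unicodedata.normalize("NFD", "\u00e9")  # two code points: 'e' + U+0301

print(len(nfc), nfc.encode("utf-8").hex())  # composed form
print(len(nfd), nfd.encode("utf-8").hex())  # decomposed form

# A naive comparison sees two different strings; normalising first fixes that.
naive_equal = nfc == nfd
normalised_equal = unicodedata.normalize("NFC", nfd) == nfc

# Mismatched encodings: UTF-8 bytes read as Latin-1 turn into mojibake.
mojibake = "\u00e9".encode("utf-8").decode("latin-1")
print(repr(mojibake))
```

The NFC and NFD forms are the ones at issue in the rsync filename question listed below, where filenames that look identical differ at the byte level.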

103 questions
39
votes
5 answers

How to make the 'less' command handle UTF-8?

On my Mac terminal, printing UTF-8 works in general, but less doesn't display it correctly. So this works correctly: $ echo -e '\xe2\x82\xac' € but piping it into less gives something like this: $ echo -e '\xe2\x82\xac' | less …
user9474
  • 2,368
  • 2
  • 24
  • 26
27
votes
5 answers

Converting UTF-8 NFD filenames to UTF-8 NFC, in either rsync or afpd

I have a home file server running FreeNAS 8. A few days ago I used rsync to upload my entire iTunes library from my Mac so that I could load my library over the network instead of off a slow USB drive. This mostly worked, and iTunes runs much better…
Twipped
  • 613
  • 2
  • 7
  • 10
25
votes
5 answers

How to get system locale in Windows 7 cmd?

How can I get the system locale in Windows 7? I mean something like: cs_CZ.UTF-8 I tried writing "locale" in the command line, but that doesn't work in Windows. Any suggestions?
Richard Knop
  • 1,089
  • 2
  • 19
  • 33
24
votes
7 answers

How to find out if a terminal supports UTF-8

I'm setting up the CPAN module for Perl on CentOS 5, and one of the questions is 'Does your terminal support UTF-8?' (paraphrased). How do I find out?
Whatsit
  • 465
  • 2
  • 5
  • 9
16
votes
2 answers

Command to create MySQL database with Character set UTF-8

I use create database dbname; to create a database, but I want it to be created with the UTF-8 character set. Does anyone know what command to use?
Komputer
13
votes
2 answers

Is there a MySQL performance benchmark to measure the impact of utf8_unicode_ci versus utf8_general_ci?

I read here and there that using the utf8_unicode_ci collation ensures a better treatment of Unicode text (for example, it knows how to expand characters such as 'œ' into 'oe' for searching and ordering) compared to the default utf8_general_ci…
MiniQuark
  • 3,695
  • 2
  • 20
  • 23
10
votes
4 answers

Change filesystem encoding to UTF-8 in Ubuntu

How to find out what charset encoding is used by current file system and how to change it to UTF-8? EDIT: Here is the output of mount: /dev/sdb6 on / type ext3 (rw,relatime,errors=remount-ro) tmpfs on /lib/init/rw type tmpfs…
jack
  • 1,705
  • 5
  • 21
  • 24
10
votes
1 answer

Is the Ext3 filename limited to 255 symbols or 255 bytes?

I cannot save a file with a name containing more than 127 Cyrillic UTF-8 symbols on my Ext3 filesystem. It is possible to save files with names containing up to 255 English UTF-8 symbols though. So is there a limit on a number of bytes containing the…
v_2e
  • 329
  • 3
  • 11
10
votes
1 answer

Can PuTTY be configured to display the following UTF-8 characters?

I'd like to be able to render the characters as seen in this tweet: I saved the tweet's JSON data and wrote a one-liner python script for testing. python -c 'import json,urllib; print…
sente
  • 263
  • 1
  • 2
  • 10
8
votes
5 answers

Debian, How to convert filesystem from ISO-8859-1 into UTF-8?

I have an old PC that is running Debian stable and is in need of an upgrade. The problem is that it is using latin1 (ISO-8859-1) for everything, and since the rest of the world has moved to UTF-8 I plan to convert this computer as well. And for…
Johan
  • 795
  • 2
  • 7
  • 13
8
votes
5 answers

Best way to make sure a MySQL database is fully in UTF8

After some problems with UTF8 and non-UTF8 strings, we're standardising on UTF8. One thing I need to do is check that everything is in UTF8 in the MySQL database. What do I need to check? Server default character set; default character set of each…
Amandasaurus
  • 30,211
  • 62
  • 184
  • 246
8
votes
3 answers

Lighttpd sending wrong headers for UTF-8 content

Ubuntu/Lighttpd is not serving my UTF-8 encoded files with the correct Content-Type header. It's sending Content-Type: text/html rather than Content-Type: text/html; charset=UTF-8. How do I configure Lighttpd to send the correct headers? I didn't…
sourcenouveau
  • 489
  • 1
  • 5
  • 18
7
votes
1 answer

autoindex list UTF-8 charset in Nginx

My nginx autoindex page does not display UTF-8 characters correctly. I have set charset utf-8; in the server block of my nginx.conf file, but that doesn't seem to fix the problem.
Dara Ardalan
  • 81
  • 1
  • 6
7
votes
0 answers

How to re-compile iconv on Linux (Ubuntu 14.04 LTS) with pseudo-charset UTF8-MAC?

I was googling around for hours to find a solution for my problem and I couldn't get it to work: I have to rsync a file structure on an ext4-formatted drive to an HFS+-formatted drive. Folder and file names can contain German umlauts (äöüß) and the…
7
votes
5 answers

Are there any disadvantages of using UTF8 in an oracle database?

We are ordering a configured Oracle database and they are asking us what character encoding we would like to have. The application (in Java) is in English only but users are from different parts of the world. Are there any motivations…
user22463