10

How can I find out which character encoding the current file system uses, and how can I change it to UTF-8?

EDIT:

Here is the output of mount:

/dev/sdb6 on / type ext3 (rw,relatime,errors=remount-ro)
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
/proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
varrun on /var/run type tmpfs (rw,nosuid,mode=0755)
varlock on /var/lock type tmpfs (rw,noexec,nosuid,nodev,mode=1777)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
fusectl on /sys/fs/fuse/connections type fusectl (rw)
lrm on /lib/modules/2.6.27-11-generic/volatile type tmpfs (rw,mode=755)
securityfs on /sys/kernel/security type securityfs (rw)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,noexec,nosuid,nodev)
gvfs-fuse-daemon on /root/.gvfs type fuse.gvfs-fuse-daemon (rw,nosuid,nodev)

Here is the output of "cat /etc/fstab":

# /etc/fstab: static file system information.
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
proc            /proc           proc    defaults        0       0
# /dev/sdb7
UUID=50d660f1-1948-41e1-96af-3cb9bca338dd /               ext3    relatime,errors=remount-ro 0       0
# /dev/sdb8
UUID=efaee412-8e29-4f65-927d-f57252451088 none            swap    sw              0       0
jack

4 Answers

6

On Unix-like systems, the encoding of file names is not set at the filesystem level, but rather in the user environment. Check the output of locale and look at the part after the dot: for example, in my case LANG=en_US.UTF-8, so the file names in my environment are interpreted as UTF-8. This is the default setting in Ubuntu.
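As a rough illustration of "the part after the dot", here is a minimal Python sketch. The helper names `charset_from_locale` and `effective_locale` are made up for this example, and the LC_ALL, LC_CTYPE, LANG precedence is the usual convention rather than anything specific to this question. Note that recent Python versions may report UTF-8 for file names even under the C locale, so treat the last two lines as what Python uses, not as a property of the disk.

```python
import locale
import os
import sys


def charset_from_locale(value):
    """Return the part after the dot in a locale string, or None.

    "en_US.UTF-8" gives "UTF-8"; "C" has no dot and gives None.
    """
    if "." not in value:
        return None
    charset = value.split(".", 1)[1]
    # Drop an optional modifier such as "@euro".
    return charset.split("@", 1)[0]


def effective_locale(environ):
    """Pick the first non-empty value of LC_ALL, LC_CTYPE, LANG."""
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        if environ.get(name):
            return environ[name]
    return None


if __name__ == "__main__":
    value = effective_locale(os.environ)
    print("locale setting:", value)
    print("charset from it:", charset_from_locale(value) if value else None)
    # What Python itself uses to encode and decode file names.
    print("file name encoding:", sys.getfilesystemencoding())
    print("preferred text encoding:", locale.getpreferredencoding(False))
```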

The answer from Dennis Williamson is relevant for special filesystem types that require translation. I won't go into that, because your output of mount and cat /etc/fstab shows that it does not apply in your case.

Amir
  • 1
    system locale is already en_US.UTF-8 – jack Nov 22 '09 at 11:35
  • 1
    I don't think this is true. My LANG=en_US.UTF8 but Ubuntu creates files as us_ascii. Ubuntu doesn't seem to do anything with encoding at the file system level, unfortunately. – onknows Jun 12 '15 at 13:26
3

Ubuntu uses UTF-8 encoding by default and it seems you haven't changed it. You could have file names with a different encoding. In that case, you could use convmv to fix that.
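To see whether any file names are in a different encoding before reaching for convmv, a small Python sketch like the following could list the names that are not valid UTF-8. The function names are hypothetical and the starting directory "." is only a placeholder.

```python
import os


def is_valid_utf8(raw_name):
    """Return True if the raw file name bytes decode cleanly as UTF-8."""
    try:
        raw_name.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def find_non_utf8_names(top):
    """Yield raw byte paths under `top` whose last component is not UTF-8."""
    # Passing bytes makes os.walk return bytes, i.e. the names exactly
    # as stored on disk, with no decoding applied.
    for dirpath, dirnames, filenames in os.walk(os.fsencode(top)):
        for name in dirnames + filenames:
            if not is_valid_utf8(name):
                yield os.path.join(dirpath, name)


if __name__ == "__main__":
    for path in find_non_utf8_names("."):
        print(path)
```

If this prints anything, and assuming the old names were Latin-1 (an assumption, check your own data), something like `convmv -f latin1 -t utf8 -r <dir>` shows what would be renamed; convmv only performs the renames when `--notest` is added.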

raphink
3

You don't say which filesystem. However, you can look at the output of mount, which on one of my systems currently shows an iso9660 filesystem and a couple of vfat ones that are mounted as utf8. You can also look at the contents of /etc/fstab, which is where these options are already set or where you would set them. See man mount, which shows that NTFS and jfs are two more filesystems that have that option.
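For anyone who wants to scan mount output for such options, here is a hedged Python sketch. The list in `CHARSET_OPTIONS` is an assumption based on option names that appear in man mount for some filesystem types, and the vfat line in the demo is invented for illustration; only the ext3 line comes from the question.

```python
# Option names that carry a charset on some filesystem types (assumed list;
# check `man mount` for the filesystems you actually use).
CHARSET_OPTIONS = ("iocharset", "nls", "codepage", "utf8")


def parse_mount_line(line):
    """Split 'DEV on DIR type FSTYPE (opt,opt=val)' into its parts."""
    device, rest = line.split(" on ", 1)
    mount_point, rest = rest.split(" type ", 1)
    fstype, _, opts = rest.partition(" (")
    options = opts.rstrip(")").split(",") if opts else []
    return device, mount_point, fstype.strip(), options


def charset_options(options):
    """Return only the options that relate to file name encoding."""
    return [o for o in options if o.split("=", 1)[0] in CHARSET_OPTIONS]


if __name__ == "__main__":
    sample = [
        # Taken from the mount output in the question.
        "/dev/sdb6 on / type ext3 (rw,relatime,errors=remount-ro)",
        # Hypothetical line, not from the question.
        "/dev/sdc1 on /media/usb type vfat (rw,iocharset=utf8)",
    ]
    for line in sample:
        device, mount_point, fstype, options = parse_mount_line(line)
        print(mount_point, fstype, charset_options(options))
```

On the ext3 line from the question this finds no charset option, which matches the point that ext3 does no file name translation.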

Dennis Williamson
  • @Dennis, I posted the output of mount and "cat /etc/fstab". It looks like there is no charset encoding information there. – jack Nov 22 '09 at 10:14
  • You still don't say which filesystem/device or what specific problem you're trying to solve. As **Amir** said, you're apparently not using one of the filesystems I mentioned and you say `locale` is already correct. What is the issue? – Dennis Williamson Nov 22 '09 at 13:35
  • @Dennis, you said your mount output shows you have an iso9660 filesystem. I have posted my mount output. I didn't see anything related to the "filesystem" you mentioned in the output. Could you please help me figure it out? – jack Nov 27 '09 at 03:06
  • The filesystem is what's listed after the word "type" in the output of `mount` or under the "type" column in `/etc/fstab`. In the output you show, "ext3" is an example. The iso9660 filesystem refers to a CD-ROM. What specifically is the problem you are trying to solve? – Dennis Williamson Nov 27 '09 at 06:55
-1

In short, you can't really.

There are two separate things: the encoding of the file names, and the encoding of the data in the files. In both cases the filesystem just stores the raw bytes. It's up to the user to make sure those bytes are in the encoding the user wants.
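A small Python sketch of what "raw bytes" means here; the sample word is arbitrary and nothing below touches the disk.

```python
text = "café"

utf8_bytes = text.encode("utf-8")      # b'caf\xc3\xa9' (5 bytes)
latin1_bytes = text.encode("latin-1")  # b'caf\xe9' (4 bytes)

# A filesystem such as ext3 would store either byte string unchanged;
# nothing on disk records which encoding produced it.
print(utf8_bytes, latin1_bytes)

# Reading UTF-8 bytes with the wrong encoding gives mojibake, not an error.
misread = utf8_bytes.decode("latin-1")  # 'cafÃ©'

# Reading Latin-1 bytes as UTF-8 fails outright.
try:
    latin1_bytes.decode("utf-8")
    latin1_is_utf8 = True
except UnicodeDecodeError:
    latin1_is_utf8 = False
print("Latin-1 bytes valid as UTF-8:", latin1_is_utf8)
```

The same applies to file names: the name is just the byte string, and the locale decides how those bytes are displayed.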

Amandasaurus