Rsync Character set problems

6

3

I'm attempting to back up a Windows box to a Linux box (Ubuntu 9.10) using rsync on the Linux box, and I get "file has vanished" errors for files with unusual characters in their names. I get a similar error ("no such file or directory") if I use "cp" instead of rsync. The source is a share on an English-language Windows box.

One of the characters is the apostrophe character.

I've been playing around with various --iconv options but haven't been able to solve the problem. Suggestions?

Nerdfest

Posted 2010-01-06T03:26:54.350

Reputation: 808

it's probably not the real apostrophe; it's probably a "smart" single-quote character. – quack quixote – 2010-01-06T12:08:52.563

Answers

7

You're mounting the share from Windows on Linux, then using rsync to copy files locally. How do you mount the share?

Windows should be storing filenames in UTF8 or UTF16, but you need to tell Linux that so it can mount the share correctly. Use a mount option like utf8/utf16, or iocharset=utf8/iocharset=utf16 in your mount command:

mount -t cifs -o utf16,other,options,here //server/share /path/to/mount/point
              ^^^^^^^^
                   |
                   -- if utf16 doesn't help, try iocharset=utf16
                      utf8 or iocharset=utf8 may also work

Commenters indicate that UTF16 is more likely to be what Windows actually uses, although the asker reported that iocharset=utf8 was the option that worked.
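A minimal sketch of the mount-then-copy sequence, assuming the share is mounted with iocharset=utf8 (the variant the asker reported as working in the comments). The server, share, mount point, and backup paths are placeholders, credential options are omitted, and `run` is a hypothetical dry-run helper that only prints each command unless RUN=1 is set:

```shell
#!/bin/sh
# Placeholder values (assumptions): substitute your own.
SHARE="//server/share"
MOUNT_POINT="/path/to/mount/point"
DEST="/path/to/backup"

# Hypothetical helper: print the command unless RUN=1, so the sketch is safe to try.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Mount the share, asking the CIFS client to convert filenames to and from UTF-8 locally.
run mount -t cifs -o iocharset=utf8 "$SHARE" "$MOUNT_POINT"

# Copy from the mounted share; the trailing slashes copy the directory contents.
run rsync -a "$MOUNT_POINT/" "$DEST/"
```

Running it as-is prints the two commands for review; running it as root with RUN=1 executes them. Listing the mount point before starting rsync is a quick way to check that the problem filenames now display correctly.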

quack quixote

Posted 2010-01-06T03:26:54.350

Reputation: 37 382

According to the mount man page (Mount options for ntfs), nls=name is the new name for the option formerly called iocharset. – kasi – 2018-03-02T12:52:39.627

1

mhh .. claiming ntfs is storing the filenames as utf8 is kind of bold. the m$-api mostly does its thing in utf16 (http://bit.ly/7c8SeR and http://bit.ly/8hK2rs) but the general direction of your answer is correct :) – akira – 2010-01-06T13:29:07.373

you might be right about that. all i can say is, i've never had a filename translation problem that mounting as UTF8 didn't fix, but then i'm not using an incredibly large set of non-ascii characters. – quack quixote – 2010-01-06T14:01:30.347

and i think: just replace your utf8 in your answer with utf16 and that should be the solution to the problem of the question :) – akira – 2010-01-06T16:25:59.167

I'll try both when I get home and let you know, thanks for the tips so far. – Nerdfest – 2010-01-06T16:59:03.467

options utf8 and utf16 had no effect. iocharset=utf16 generated an error stating "cannot access a needed shared library". iocharset=utf8 seems to have worked perfectly. Thanks very much for your help! – Nerdfest – 2010-01-06T22:06:54.827

0

One way to get around this: for the few directories that contain filenames with special characters, zip or tar each directory, then run rsync with an exclusion for that directory so that the zip/tar file is copied instead.
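A rough sketch of the rsync side of this workaround, assuming the archive has already been created next to the problem directory (for example as problem-dir.zip on the Windows side, since the original names may not be readable through the mount). The paths and the directory name are placeholders, and `run` is a hypothetical dry-run helper that only prints the command unless RUN=1 is set:

```shell
#!/bin/sh
# Placeholder values (assumptions): substitute your own.
SRC="/path/to/mount/point"
DEST="/path/to/backup"
PROBLEM_DIR="problem-dir"   # directory at the top of SRC that was archived separately

# Hypothetical helper: print the command unless RUN=1, so the sketch is safe to try.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# The leading slash anchors the pattern at the top of the transfer and the
# trailing slash restricts it to directories, so problem-dir.zip is still copied.
run rsync -a --exclude="/$PROBLEM_DIR/" "$SRC/" "$DEST/"
```

The archive file itself needs no special handling: it sits beside the excluded directory and is picked up by the normal transfer.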

nik

Posted 2010-01-06T03:26:54.350

Reputation: 50 788