
I have been searching for hours for a solution to my problem and couldn't get it to work:

I have to rsync a file structure from an ext4-formatted drive to an HFS+-formatted drive. Folder and file names can contain German umlauts (äöüß), and the handling of UTF-8 differs between OS X and Linux: OS X filesystems use Unicode Normalization Form D (NFD), whereas names on Linux are normally in Form C (NFC).
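To make the difference concrete, here is a minimal Python sketch. The file name is a made-up example, and Python's `unicodedata` implements standard NFD, which is close to but not necessarily identical to the variant HFS+ applies.

```python
import unicodedata

# Hypothetical file name, written with escapes so the source encoding
# of this snippet cannot change the result: "Übergröße.txt"
name_nfc = unicodedata.normalize("NFC", "\u00dcbergr\u00f6\u00dfe.txt")
name_nfd = unicodedata.normalize("NFD", name_nfc)

# Same visible text, but NFD splits Ü and ö into base letter + U+0308.
print(len(name_nfc), len(name_nfd))  # 13 15
print(name_nfc.encode("utf-8"))
print(name_nfd.encode("utf-8"))

# A plain comparison treats the two spellings as different names.
print(name_nfc == name_nfd)  # False
```

Both strings render identically in a terminal, which is why the mismatch is easy to miss.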

That behaviour causes files with umlauts in their names to be deleted and re-synced, which produces a lot of unnecessary overhead, especially if you rsync with the --backup option.
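The effect can be modelled roughly as follows. This is only a simplified sketch of a name-by-name comparison, not rsync's actual algorithm; `plan_sync` and the file names are hypothetical.

```python
import unicodedata

def plan_sync(source_names, target_names, normalize=False):
    """Return (to_transfer, to_delete) for a simple name-based comparison."""
    if normalize:
        key = lambda n: unicodedata.normalize("NFC", n)
    else:
        key = lambda n: n
    src = {key(n) for n in source_names}
    dst = {key(n) for n in target_names}
    return sorted(src - dst), sorted(dst - src)

# Names as stored on the ext4 side (NFC) ...
source = ["M\u00fcller.pdf", "notes.txt"]
# ... and the same names as an NFD filesystem would hand them back.
target = [unicodedata.normalize("NFD", n) for n in source]

# Without conversion the umlaut file shows up as both "new" and "extraneous".
print(plan_sync(source, target))
# With both sides normalized to one form there is nothing to do.
print(plan_sync(source, target, normalize=True))
```

The normalized comparison plays the role that --iconv is meant to play in rsync: both sides are brought to one form before names are matched.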

A solution to prevent that behaviour is to use --iconv=UTF8,UTF8-MAC, but this only works with a newer iconv library such as the one on a Mac. The current iconv library on Ubuntu 14.04 doesn't support the pseudo-charset UTF8-MAC:


root@ubuntu:~/wartung# iconv -l
...
UTF-7, UTF-8, UTF-16, UTF-16BE, UTF-16LE, UTF-32, UTF-32BE, UTF-32LE, UTF7, UTF8, UTF16, UTF16BE, UTF16LE, UTF32, UTF32BE, UTF32LE, ...

OS X with the latest rsync from Homebrew does support it:


bash-3.2$ iconv -l
ANSI_X3.4-1968 ANSI_X3.4-1986 ASCII CP367 IBM367 ISO-IR-6 ISO646-US ISO_646.IRV:1991 US US-ASCII CSASCII
UTF-8 UTF8
**UTF-8-MAC UTF8-MAC**, ...

Related questions:
rsync --iconv option on Mac not working (sync from remote Linux server to local Mac)
Converting UTF-8 NFD filenames to UTF-8 NFC, in either rsync or afpd

EDIT
With some fiddling I was able to recompile libiconv for Ubuntu:

sudo -i
# Get the libiconv sources
wget https://ftp.gnu.org/gnu/libiconv/libiconv-1.14.tar.gz
tar -xzvf ./libiconv-1.14.tar.gz -C /usr/src && cd /usr/src/libiconv-1.14

# Get the patch for the Makefile
wget https://raw.githubusercontent.com/Homebrew/patches/9be2793af/libiconv/patch-Makefile.devel
patch -p1 ./Makefile.devel < patch-Makefile.devel

# Get the patch for the translation file
wget https://raw.githubusercontent.com/Homebrew/patches/9be2793af/libiconv/patch-utf8mac.diff
patch -p1 < ./patch-utf8mac.diff

# Replace utf8mac.h file
rm lib/utf8mac.h && cd lib
wget "http://opensource.apple.com/source/libiconv/libiconv-9/libiconv/lib/utf8mac.h?txt" -O utf8mac.h

# Append flags.h with utf8mac
echo "#define ei_utf8mac_oflags (HAVE_ACCENTS | HAVE_QUOTATION_MARKS | HAVE_HANGUL_JAMO)" >> flags.h

# Edit stdio.in.h to prevent gcc errors caused by the insecure 'gets' function
cd ../srclib
sed -i -- 's/(gets/(fgets/g' ./stdio.in.h

# compile & install ...
cd ..
./configure
make -f ./Makefile.devel
make
checkinstall

Up to that step everything works well. Additionally, I had to register the path to the shared library:

touch /etc/ld.so.conf.d/libiconv.conf
echo "/usr/local/lib" > /etc/ld.so.conf.d/libiconv.conf
ldconfig

Now the pseudo-charset is available in Ubuntu too:

root@ubuntu:/# iconv -l
ANSI_X3.4-1968 ANSI_X3.4-1986 ASCII CP367 IBM367 ISO-IR-6 ISO646-US ISO_646.IRV:1991 US US-ASCII CSASCII
UTF-8
**UTF-8-MAC UTF8-MAC**, ...

But surprise, surprise: it's not available in rsync?!

root@ubuntu:/# rsync --iconv=UTF8,UTF8-MAC --force --ignore-errors --delete --numeric-ids --archive --hard-links --sparse --backup --backup-dir=/path/to/TEMP/ /path/SOURCE/ /path/TARGET/
iconv_open("UTF-8", "UTF8-MAC") failed
rsync error: requested action not supported (code 4) at rsync.c(121) [Receiver=3.1.0]
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(226) [sender=3.1.0]

When I test iconv on a document, the pseudo-charset works. What's wrong with rsync? Thanks for your help!

  • You have to recompile rsync against your custom libiconv. But I think your libiconv doesn't work (utf8mac.h may be incorrect) and gives this error: "receiving incremental file list. ABORTING due to invalid path from sender: ⸀吀攀洀瀀漀爀愀爀礀䤀琀攀洀猀⼀昀漀氀搀攀爀猀⸀㔀 ㄀/吀攀洀瀀漀爀愀爀礀䤀琀攀洀猀" – kiribu Dec 21 '16 at 23:24
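One way to check the comment's first point is to ask each library directly whether it accepts the conversion pair. The following is a rough sketch for Linux with glibc, using `ctypes`. It assumes that the stock rsync binary resolves `iconv_open` from the C library, while GNU libiconv exports the same function as `libiconv_open`. The helper `iconv_supports` and the library file name in the commented-out lines are assumptions for illustration.

```python
import ctypes

def iconv_supports(lib, to_code, from_code, symbol="iconv_open"):
    """Return True if lib's iconv_open-style function accepts the pair."""
    open_fn = getattr(lib, symbol)
    open_fn.restype = ctypes.c_void_p
    open_fn.argtypes = [ctypes.c_char_p, ctypes.c_char_p]
    handle = open_fn(to_code.encode(), from_code.encode())
    # iconv_open signals failure by returning (iconv_t)-1.
    if handle == ctypes.c_void_p(-1).value:
        return False
    close_fn = getattr(lib, symbol.replace("open", "close"))
    close_fn.argtypes = [ctypes.c_void_p]
    close_fn(handle)
    return True

# Symbols already loaded into this process, i.e. the C library's iconv.
libc = ctypes.CDLL(None)
print(iconv_supports(libc, "UTF-8", "UTF8-MAC"))

# The self-built library would have to be probed separately
# (path and file name are assumptions, adjust to your installation):
# custom = ctypes.CDLL("/usr/local/lib/libiconv.so.2")
# print(iconv_supports(custom, "UTF-8", "UTF8-MAC", symbol="libiconv_open"))
```

If the first call reports False while the second reports True, the iconv command-line tool and rsync are simply using different converters, which would match the error shown in the question.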
