Summary:
I use the following setup to back up data on a Synology NAS to a remote disk via rsync.
- First backup, local: rsync started on the Synology NAS, writing to the disk (mounted on the NAS).
- Future backups, remote: rsync started on a Mac, writing to the same disk, now mounted on the Mac at a remote location.
Problem: I get two copies of all data (files or folders) with special characters.
Question: Is there a way to use the same basic process (first backup locally via NAS, rest via Mac, using rsync) without the above problem?
State of play: What follows below is rather long and includes two edits, but although the problem is further analysed, it is not yet solved.
Full Description:
I have had a setup running for ages, ever since I managed to solve an initial special-character problem (detailed here). Since then I have used the "--iconv=utf-8-mac,utf-8" option for rsync on the Mac. The setup is this:
Location 1: Synology NAS
Location 2 (in a galaxy far far away): Mac with external disk (Mac OS Journaled)
Task: rsync job on the Mac pulling folders to the external disk (location 2) from the NAS (location 1).
I now plan to set up a new disk (also Mac OS Journaled) on Location 2. Since there is around 2TB of data to transfer, I did the following:
1. Location 1: Plugged the new disk into the NAS thanks to the wonders of USB.
2. Location 1: Pushed the data to the new disk with an rsync job on the NAS.
3. Travelled to the far away galaxy that we here call Location 2.
4. Location 2: Started a limited rsync pull job from the Mac, now with the new disk plugged in.
Problem: For some reason, step (4) did not finish in 2 seconds with no changes at all. Instead it started to complain that “file has vanished: …[file location specified]” for a large number of files, and then it started to copy folders and files to the disk, even though they were already there! 70 GB later, as far as I could tell, it had made a completely redundant copy of every folder with special characters in its name (and of every file with special characters in its name inside folders without special characters in their names). For example:
drwxrwxrwx 5 _unknown _unknown 170 Aug 7 2013 Pippi Långstrump-Pippi i Söderhavet
drwxrwxrwx 5 _unknown _unknown 170 Aug 7 2013 Pippi Långstrump-Pippi i Söderhavet
These two folders seem completely identical, yet they are listed alongside each other as two distinct folders. If I use the Mac GUI, I can enter each of them and see that they contain the same (qualitatively identical) three tracks (I do not even know how to separate them using the command line, but with the GUI I can visually see that I ‘enter’ different folders). And they are not merely virtual, since the total size of the subset of the data went from 64 GB to 82 GB.
What has happened? To my untrained eye, it seems as if the rsync process started on the Mac cannot ‘see’ that the source files on the NAS are already present on the target disk, and so puts them there again. When the Mac terminal displays the file and folder names, it evidently shows the same symbols, but it must still treat the names as different ‘underneath’, since otherwise the file system would not allow both to exist.
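One way to tell such look-alike names apart without the GUI is to compare their Unicode normalization forms. The following is a minimal sketch in Python (my choice for illustration; nothing in the setup above uses it). The helper names `find_lookalikes` and `describe` are hypothetical, and the example names are built in memory rather than read from the disk.

```python
import unicodedata
from collections import defaultdict


def find_lookalikes(names):
    """Group names that become identical after NFC normalization.

    Returns a dict mapping the NFC form to the list of distinct
    original names that collapse onto it (only groups of 2 or more).
    """
    groups = defaultdict(list)
    for name in names:
        groups[unicodedata.normalize("NFC", name)].append(name)
    return {key: variants for key, variants in groups.items() if len(variants) > 1}


def describe(name):
    """Label a name as 'NFC', 'NFD', 'both' (e.g. plain ASCII) or 'mixed'."""
    is_nfc = unicodedata.normalize("NFC", name) == name
    is_nfd = unicodedata.normalize("NFD", name) == name
    if is_nfc and is_nfd:
        return "both"
    if is_nfc:
        return "NFC"
    if is_nfd:
        return "NFD"
    return "mixed"


if __name__ == "__main__":
    # Illustrative names only: "o" + combining diaeresis vs. precomposed "ö".
    sample = ["Ronja Ro\u0308vardotter", "Ronja R\u00f6vardotter", "Plain"]
    for key, variants in find_lookalikes(sample).items():
        for variant in variants:
            print(describe(variant), variant.encode("utf-8"))
```

To inspect a real folder, the list passed in could come from `os.listdir(path)`, although what that call returns on a given file system is something I have not verified for this disk.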
Now, this is not all. When I try to get the system to keep only one of each special-character folder/file with the --delete option, everything just happens all over again. Folders are indeed deleted, but new ones are copied, and in the end I am still left with duplicates and 82 GB in the subset instead of 64 GB.
What is going on and what can I do about it?
EDIT 11 Sept: The wise Tomáš Pospíšek (well acquainted with special characters, I presume ;) advised me to go “under the hood”, and so I used his command (on Ronja instead of Pippi, since I had too many different Pippi folders). A simple “ls -l” gave me:
drwxrwxrwx 2 _unknown _unknown 68 Aug 7 2013 Ronja Rövardotter
drwxrwxrwx 2 _unknown _unknown 68 Aug 7 2013 Ronja Rövardotter
whereas
sh-3.2# ls -l Ronja* | hexdump -C
resulted in:
00000000 52 6f 6e 6a 61 20 52 6f cc 88 76 61 72 64 6f 74 |Ronja Ro..vardot|
00000010 74 65 72 3a 0a 0a 52 6f 6e 6a 61 20 52 c3 b6 76 |ter:..Ronja R..v|
00000020 61 72 64 6f 74 74 65 72 3a 0a |ardotter:.|
or, if I sort it out a bit:
52 6f 6e 6a 61 20 52 6f cc 88 76 61 72 64 6f 74 74 65 72 3a 0a 0a |Ronja Ro..vardotter:..
52 6f 6e 6a 61 20 52 c3 b6 76 61 72 64 6f 74 74 65 72 3a 0a |Ronja R..vardotter:.
In other words, they are not identical, only superficially displayed as such.
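The two byte sequences match the two Unicode normalization forms of "ö": decomposed (an "o" followed by the combining diaeresis U+0308, bytes cc 88) and precomposed (the single code point U+00F6, bytes c3 b6). A small Python sketch reproduces both from the same visible name; Python is my choice for illustration here, and `hex_bytes` is a hypothetical helper.

```python
import unicodedata

# The same visible name, written with an escape so the source encoding
# of this snippet cannot influence the result.
name = "Ronja R\u00f6vardotter"

nfd = unicodedata.normalize("NFD", name)  # decomposed: "o" + U+0308
nfc = unicodedata.normalize("NFC", name)  # precomposed: U+00F6


def hex_bytes(text):
    """Return the UTF-8 bytes of text as space-separated hex pairs."""
    return " ".join(f"{byte:02x}" for byte in text.encode("utf-8"))


print(hex_bytes(nfd))  # contains "6f cc 88", as in the first folder name
print(hex_bytes(nfc))  # contains "c3 b6", as in the second folder name
print(nfd == nfc)      # the strings differ although they render alike
```

This only shows that the two names are different encodings of the same text; it does not by itself establish which program wrote which form to the disk.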
Thanks for that. But what should I do about it? Is there a way to format the disk (e.g. case-sensitive Journaled) so that both the NAS and my Mac can write to the disk properly? Or is there no way around biting the bullet, i.e. connecting the disk to a local Mac (location 1) and doing the first backup via ethernet? That would take forever compared to the USB 3 connection, but at least the backup would be "Mac interpreted" both locally and remotely. What do you suggest?
EDIT 14 Sept: The helpful Tomáš further suggested (via comment below) that I should try to rsync a single file with special characters in the name, to see what happens then (and he suggested a workaround). Unfortunately, what happens is that I am left with two files on the destination disk with seemingly identical names which, when hexdumped, turn out to be encoded differently. My problem then was that I could not seem to delete both files properly. That is, when I deleted them with “rm” so that no files were visible (“ls -l” did not list them), I could still see the files (or folders; same there) in the Mac Finder. This happened even after I rebooted the system, so somehow the file information was still there for the Finder to display, even though the files did not turn up in a command-line listing.
At this point, I sort of threw in the towel and went for the cowardly solution of simply erasing the disk and going back to pulling the data at the initial site (location 1) through a Mac with the same rsync command. That took much longer, transfer-wise, but directly ‘solved’ the problem. I now have it all set up, working like clockwork.
Still, the problem as such is not yet solved. That is, I would like to know how to:
- push data from a Synology NAS to an external disk (Mac OS Journaled, mounted on the NAS) with an rsync process started on the NAS
- back up that data at an external site using an rsync command on a Mac to which the disk is mounted.
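One possible direction, offered only as an untested sketch, would be to bring all names on the disk into a single normalization form after the NAS push and before the first Mac pull. The Python helper below, `plan_renames`, is hypothetical and only computes which names would change; it renames nothing. It assumes that the NAS wrote precomposed (NFC) names and that the Mac side expects decomposed ones. As far as I know, HFS+ uses its own variant of decomposition, so plain NFD is only an approximation of it.

```python
import unicodedata


def plan_renames(names, target_form="NFD"):
    """Compute a rename plan; nothing is renamed here.

    Returns (plan, conflicts):
      plan      - (old, new) pairs where new is the target_form of old
      conflicts - (old, new) pairs where new already exists in names,
                  i.e. the look-alike duplicates that need a manual decision
    """
    existing = set(names)
    plan, conflicts = [], []
    for old in names:
        new = unicodedata.normalize(target_form, old)
        if new == old:
            continue  # already in the target form
        if new in existing:
            conflicts.append((old, new))
        else:
            plan.append((old, new))
    return plan, conflicts


if __name__ == "__main__":
    # Illustrative names only; a real run would start from a directory listing.
    plan, conflicts = plan_renames(["Ronja R\u00f6vardotter", "Plain"])
    for old, new in plan:
        print(old.encode("utf-8"), "->", new.encode("utf-8"))
```

Whether renaming on the mounted disk actually behaves as hoped, on either the NAS or the Mac, is exactly the part I cannot confirm.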
If anyone knows, answer the question and I will mark you as the problem-solver (and a hero) right away!
NOTE: This question has become a little “what I did last summer” tale of its own, so I have rewritten the summary above to give potential problem solvers a chance of knowing what the core question is.