
We are using rsync to transfer millions of files from a Windows (NTFS/CYGWIN) server to a Linux (RHEL) server. We would like to force all file and directory names on the Linux box to be lower case.

Is there a way to make rsync automagically convert all file and directory names to lower case? For example, let's say the source file system had a file named:

/foo/BAR.gziP

Rsync would create (on the destination system)

/foo/bar.gzip

Obviously, with NTFS being a case-insensitive file system, there cannot be any conflicts...

Failing the availability of an rsync option, is there an enhanced build or some other way to achieve this effect? Perhaps a mount option on CYGWIN? Perhaps a similar mount option on Linux?

It's RHEL, in case that matters.

ewwhite
SvrGuy
  • Is this a once off transfer or will it be done regularly? – mgorven Apr 14 '12 at 16:59
  • If none of the other answers meet your requirements, as a last resort the rsync source code could be modified to lower-case destination filenames when the files and directories are created on the server. – Brian Swift Apr 16 '12 at 04:13

3 Answers

2

You can change the case of the resulting filenames on the target server after the rsync. I wouldn't attempt to do this mid-transfer (in case you need to restart the copy). As for making the change on the Linux side, you'd need to determine whether there are any conflicts, and whether the directory names' case needs to change as well. Will all names be unique? If so, an appropriate find script coupled with the tr or rename command could do the job...
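To answer the "any conflicts?" question before renaming anything, one option is to lower-case the list of paths and look for duplicates. This is only a minimal sketch, assuming GNU find and that no file name contains a newline; `find_case_collisions` is a made-up helper name, not something from this thread.

```shell
#!/bin/sh
# find_case_collisions is a hypothetical helper name.
# Prints each lower-cased relative path that two or more existing
# paths under "$1" would map to. No output means no collisions.
find_case_collisions() {
    ( cd "$1" && find . -mindepth 1 ) |
        tr '[:upper:]' '[:lower:]' |
        sort |
        uniq -d
}
```

Run it as `find_case_collisions /path/to/dest`. If it prints nothing, no two paths differ only by case.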

# Examples - Don't run directly
`rename 'y/A-Z/a-z/' *` # would change case on files within a directory (Perl rename syntax; the util-linux rename shipped with RHEL takes different arguments).
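For a whole tree, a depth-first `find` avoids renaming a directory before its contents have been handled. The following is only a sketch under some assumptions: GNU find (for `-mindepth`), ASCII letters only (GNU `tr` works byte by byte), no newlines in file names, and `lowercase_tree` is a made-up helper name. It uses `tr` and `mv` rather than `rename`, since the `rename` on RHEL comes from util-linux and does not take Perl expressions.

```shell
#!/bin/sh
# lowercase_tree is a hypothetical helper name, not something from the thread.
# Lower-cases the name of every file and directory below "$1".
lowercase_tree() {
    # -depth lists a directory's contents before the directory itself, so a
    # parent is renamed only after everything beneath it has been handled.
    # -mindepth 1 (GNU find) leaves the top-level directory name alone.
    find "$1" -mindepth 1 -depth -exec sh -c '
        for path do
            dir=${path%/*}
            base=${path##*/}
            lower=$(printf "%s" "$base" | tr "[:upper:]" "[:lower:]")
            [ "$base" = "$lower" ] && continue
            # Never overwrite: report names that would collide and move on.
            if [ -e "$dir/$lower" ] || [ -L "$dir/$lower" ]; then
                echo "skipped (target exists): $path" >&2
                continue
            fi
            mv -- "$path" "$dir/$lower"
        done
    ' sh {} +
}
```

Call it as `lowercase_tree /path/to/dest`. It starts one `mv` per renamed entry, so on millions of files it will take a while; try it on a copy first.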
ewwhite
  • Couple of issues: (1) there can't possibly be any conflicts, because the source of the data is a case-insensitive file system (2) the script would impose a *huge* performance penalty -- there are **millions** of files being copied. How long does it take to rename ~12 million files? Also, while the rename script is running **the service being migrated is down**. The hope was that there was an inline way to do this... – SvrGuy Apr 14 '12 at 17:45
  • I'd still do it afterwards. Script execution time depends on filesystem and design choices. I've done something similar, renaming 4 million files on an XFS filesystem using a find script that recursed into the directories and applied changes with `xargs`. But perhaps there will be some better examples of an efficient script posted in other answers. – ewwhite Apr 14 '12 at 18:03
  • Could you do the change prior to the transfer? A PowerShell or VBS bit of code would make this simple, and you could have it done before starting the transfer, assuming there aren't file locking or naming issues to deal with. Though in Windows I don't "think" there would be. – MikeAWood Apr 16 '12 at 05:42
2

You can mount a case-insensitive file system. Look at this post.

Also, this page suggests creating a disk image of type FAT32 and mounting it. The created fs will be case-insensitive, like any Windows partition.

Using such a solution will eliminate the need to convert all these millions of files to lower-case.

Khaled
2

Not the most elegant solution, but you can use LD_PRELOAD to override the relevant system calls and force everything to lowercase. I thought it would be fun, so I did a little proof of concept and...

> ls in out
in:
CyltApJik  keumyomDu  LidusIcweo  spydjiPa  SycsEyror  tusUngEg

out:
> rsync -av in/ --rsync-path='env LD_PRELOAD=$PWD/lowercase.so rsync' localhost:out/ 
sending incremental file list
./
CyltApJik
LidusIcweo
SycsEyror
keumyomDu
spydjiPa
tusUngEg

sent 372 bytes  received 129 bytes  1002.00 bytes/sec
total size is 0  speedup is 0.00

> ls out
cyltapjik  keumyomdu  lidusicweo  spydjipa  sycseyror  tusungeg

And here is the sample, which may take a few iterations to become good enough to sync the whole thing.

> cat lowercase.c
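#define _GNU_SOURCE     /* for RTLD_NEXT; must come before every #include */
#include <ctype.h>
#include <dlfcn.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

static int (*real_lstat) (const char *, struct stat *) = NULL;
static int (*real_rename)(const char *, const char *)  = NULL;

/* Return a newly allocated, lower-cased copy of string (at most 2048 bytes).
   The caller must free() it. Returns NULL if the allocation fails. */
static char * lowered(const char * string)
{
        char * low = strndup(string, 2048);
        char * c;
        if (low == NULL) return NULL;
        for (c = low; *c; c++) {
                *c = tolower((unsigned char) *c);
        }
        return low;
}

/* If lstat fails on the path as given, retry with the lower-cased path. */
int lstat(const char * path, struct stat * buf)
{
        int ret = 0;
        int saved;
        char * low;
        if (real_lstat == NULL) {
                real_lstat = dlsym(RTLD_NEXT, "lstat");
        }
        ret = real_lstat(path, buf);
        if (ret == 0) return ret;
        low = lowered(path);
        if (low == NULL) return ret;
        ret = real_lstat(low, buf);
        saved = errno;          /* keep the errno from lstat across free() */
        free(low);
        errno = saved;
        return ret;
}

/* Lower-case the destination path of every rename(). */
int rename(const char * oldpath, const char * newpath)
{
        int ret;
        int saved;
        char * low = lowered(newpath);
        if (low == NULL) return -1;
        if (real_rename == NULL) {
                real_rename = dlsym(RTLD_NEXT, "rename");
        }
        ret = real_rename(oldpath, low);
        saved = errno;          /* keep the errno from rename across free() */
        free(low);
        errno = saved;
        return ret;
}
> gcc -fPIC -shared -o lowercase.so lowercase.c -ldl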
chutz
  • `#include <ctype.h>` is missing in `lowercase.c`; without it `tolower()` cannot be found and compilation fails. On the other hand, I wanted to change the case to uppercase (using `toupper()`), but this didn’t help me. The culprit, however, might be elsewhere; I wanted to change the case of files on an SMB remote share (shared from MS Windows) mounted on a Linux server. Even `mv "$path/a" "$path/A"` fails with an error that the files are the same file. It seems like I need to rename the file in two steps (with a temp file), unfortunately. – tukusejssirs Jul 27 '22 at 13:15
  • Thank you. Updated the code in the header, and it seems to work still. – chutz Aug 17 '22 at 06:50