I need to chown 1.5 million files on a drive. I'm currently doing:

sudo chown -R www-data:www-data /root-of-device

but it takes an awfully long time to run. I was wondering if there was some sort of superfast low-level way to chown every file on the drive.

Aidan Kane
  • How about setting the setgid bit on the folder (in some situations)? chown www-data:www-data /var/www && chmod -R ug+rwxs /var/www – Andy Sep 22 '17 at 00:42

3 Answers


Use xargs -P or GNU parallel to speed things up considerably.
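
For example, a sketch of the complete pipeline (the -P4 process count is an assumption; tune it to your filesystem and hardware):

find /root-of-device -print0 | sudo xargs -0 -P4 chown www-data:www-data

-print0 and -0 keep filenames containing spaces or newlines intact, and -P4 runs four chown processes in parallel.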

pfo
  • 1
    On a single filesystem, the speedup will be less than you think; potentially it could be *slower* because of contention between the multiple threads of execution accessing the same filesystem. – geekosaur Apr 10 '12 at 21:24
  • 1
    That should not be the case in XFS, as it supports parallel meta data operation according to the number of allocation groups created at formatting time. A parallel chmod will be faster than the sequential version. – pfo Apr 10 '12 at 21:59
  • I tried xargs -P4 (the number of allocation groups on my XFS filesystem) and it ran slightly faster than regular chown. I ran it a few times, though, and I think the difference in timings was just AWS EBS I/O fluctuation. – Aidan Kane Apr 11 '12 at 13:36
  • So what's the complete pipe using `chown ...` and `xargs -P`? – Chris F Nov 13 '20 at 01:19

Unfortunately I do not think there is such a thing, but I would be pleasantly surprised if there were. You could write your own implementation in C and optimise it heavily, but its success depends on how well optimised chown is to begin with, and as one of the core utilities it is likely already well optimised. In any case, you are most likely bound by I/O speed.

I have had some success avoiding the limitations of ls and rm by piping the results of find to xargs when a directory contains a very large number of files, e.g.:

find /path -print0 | xargs -0 rm

So, a wild guess: maybe this can speed up chown too, if chown is slower at recursively scanning a filesystem than find is:

find /path -print0 | sudo xargs -0 chown www-data:www-data
aseq
  • Ha! I don't fancy my chances of coming up with a better implementation, to be honest. It's an Amazon EBS device, which means I/O performance is a bit unstable. – Aidan Kane Apr 10 '12 at 21:09
  • I updated my answer with a few suggestions that may help. – aseq Apr 10 '12 at 21:13
  • 1
    In theory you could abuse `xfs_metadump` and `xfs_db` to get something slightly faster; in practice it's not worth the effort or the potential failure modes. – geekosaur Apr 10 '12 at 21:21
  • Yeah, I think whichever way you look at it, it will take a long time... – aseq Apr 10 '12 at 23:32
  • 1
    Thanks for the hints. Playing around with various things it really looks like I'm not going to do much better than regular chown. – Aidan Kane Apr 11 '12 at 13:38

I'm using Amazon EC2 as well and had this issue. Two things:

Fixing the current situation: you will have to live with the slowness. Consider running the command inside "screen" or something similar so that the process can continue in the background even if you disconnect, as sketched below.
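
A minimal sketch (the session name chown-job is just an example):

screen -S chown-job
sudo chown -R www-data:www-data /root-of-device

Detach with Ctrl-A d and the chown keeps running; reattach later with screen -r chown-job to check on it.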

Fixing the future situation: look upstream and see how the files are being created in the first place. Since you mentioned www-data, I assume the consumer of the files is Apache. If files are being dropped in by another program (NFS, Samba, SSH, etc.), make sure those programs set the owner and group to www-data:www-data, or lean on the filesystem as sketched below.
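
One partial shortcut (a sketch echoing the setgid suggestion in the comments above; /var/www is an assumed document root): the setgid bit on a directory makes newly created files inherit its group, though the owning user still comes from the creating process:

sudo chgrp -R www-data /var/www
sudo find /var/www -type d -exec chmod g+s {} +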

lsu_guy