2

Possible Duplicate:
Doing an rm -rf on a massive directory tree takes hours

I'm running a simulation program on a computing cluster (Scientific Linux) that generates hundreds of thousands of atomic coordinate files. But I'm having a problem deleting the files because rm -rf never completes and neither does

find . -name '*' | xargs rm

Isn't there a way to just unlink this directory from the directory tree? The storage unit is used by hundreds of other people, so reformatting is not an option.

Thanks

Nick
  • If you have lots of files it just takes time. For example I have a test system for this with a directory containing 900k files. The solutions in @ewwhite's answer (I recognise those commands Ed!) take ~80 minutes to complete. – user9517 Nov 11 '12 at 08:36

4 Answers

3

Method 1

Assuming those files are meant to be created and just need to be removed after use.

If possible, have all those files, and only those files, created in a standalone partition or disk. When it is time to delete them, unmount the partition and format it. Formatting EXT4 (unlike EXT2) takes only a few seconds.

Make sure you are not saving information/report/etc in the same location.

You can mount a new partition or a new disk to the original location, either directly or with the -o bind option.
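As a rough sketch of that cycle (the device name /dev/sdb1 and the paths are placeholders, adjust to your setup):

# unmount the scratch partition, recreate the file system, remount
umount /data/sim
mkfs.ext4 -q /dev/sdb1        # seconds on ext4, instead of hours of rm
mount /dev/sdb1 /data/sim

# or, if the simulation expects its output path elsewhere, bind-mount it there
mount -o bind /data/sim /home/nick/simulation/output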

Method 2

Thinking a bit outside the box: instead of individual files, put all that data into a database table, then drop the whole table after use.
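As a rough illustration with SQLite (database and table names are made up), the whole run's output becomes one file and one statement to discard:

sqlite3 run42.db 'CREATE TABLE coords (step INTEGER, atom INTEGER, x REAL, y REAL, z REAL);'
# ... the simulation inserts rows here instead of writing one file per frame ...
sqlite3 run42.db 'DROP TABLE coords;'   # or simply: rm run42.db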

John Siu
  • Making a new ext file system doesn't just take a few seconds: for any decently-sized disc it takes many minutes, and for large RAID arrays can take over an hour. – MadHatter Nov 11 '12 at 08:16
  • More arguments for the use of XFS :) – ewwhite Nov 11 '12 at 08:27
  • It is correct if you are referring to ext2, but not ext4. By the way, how much disk space are we talking here, regarding those atomic files? – John Siu Nov 11 '12 at 08:28
2

I typically use something like:

find ./directoryname -type f -name '*file-pattern*' -exec rm {} +

It is also possible to use the -delete flag to the find command.

find ./directoryname -type f -name '*file-pattern*' -delete

Is the generation of these files a problem/bug? Is there anything at the application level that can help?

ewwhite
2

My guess is you're running across a strange filetype that's blocking rm from completing. Try something like

find . \( -type d -o -type f \) -print0 | xargs -0 rm -rf --
Rob Paisley
2

Just unlinking the directory would be perfectly possible if you didn't mind not getting the free space back, and all the files reappearing in /lost+found at the next fsck.

Removing the files isn't the time-consuming bit; it's all the file system maintenance code tidying up behind the scenes, and that takes an extra-long time for millions of small files. It takes even longer if they are in a flat, wide file structure instead of a deep, thin one (i.e. many files in few directories rather than many files spread across many nested directories). As you've noticed, in some cases this can take longer than simply recreating the file system.

If this were my issue, I'd make a custom partition to keep those files in, and in addition, I'd probably use tmpfs, which is better-designed for the storage of temporary files anyway, and will cut down the file system re-creation time.
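For example (the mount point and size limit are assumptions, size it to your RAM/swap):

# one-off mount; everything under /scratch vanishes on unmount or reboot
mount -t tmpfs -o size=16G tmpfs /scratch

# or make it permanent via /etc/fstab
# tmpfs  /scratch  tmpfs  size=16G,mode=1777  0  0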

MadHatter