I want to archive files (with tar) that are below 3 MB in size, but I also want to retain the directories in which those files exist (so I cannot just use the find command). I only want to skip the files that are above 3 MB. How can this be done?

MikeyB
nixnotwin

3 Answers

Simpler than you think:

$ tar cf small-archive.tar /big/tree --exclude-from <(find /big/tree -size +3M)
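As a rough, self-contained sketch of the same idea, the snippet below builds a throwaway tree and writes the exclude list to an ordinary file instead of using process substitution. The directory and file names (tree, sub, small.txt, big.bin, exclude.list) are made up for illustration, and GNU tar and find are assumed.

```shell
# Scratch tree: one small file and one 4 MiB file in a nested directory.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p tree/sub
echo "small" > tree/sub/small.txt
dd if=/dev/zero of=tree/sub/big.bin bs=1024 count=4096 2>/dev/null

# Write the names of the oversized files to a list, one per line.
find tree -type f -size +3M > exclude.list

# Archive the whole tree, skipping every name that appears in the list.
tar -cf small-archive.tar --exclude-from=exclude.list tree

# Show what ended up in the archive.
tar -tf small-archive.tar
```

The listing should contain the directories and small.txt but not big.bin. Note that each line of the list is treated as a pattern, so names containing wildcard characters may need extra care.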

On a semi-related note (regarding your statement that you can't use find): to get a listing of all files and directories under a path, minus files larger than 3 MiB, use:

$ find . -size -3M -o -type d

You could then do:

$ tar cf small-archive.tar --no-recursion --files-from <(find /big/tree -size -3M -o -type d)

But I'd prefer the first one, as it's simpler, clearly expresses what you want, and will lead to fewer surprises.
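One of those surprises is worth illustrating: find rounds sizes up to whole units before comparing, so with GNU find "-size -3M" only matches files of 2 MiB or less, while "! -size +3M" keeps everything up to 3 MiB. The sketch below shows the difference on a scratch tree; all file names are hypothetical.

```shell
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p tree/empty
echo "small" > tree/small.txt
dd if=/dev/zero of=tree/mid.bin bs=1024 count=2560 2>/dev/null   # 2.5 MiB
dd if=/dev/zero of=tree/big.bin bs=1024 count=4096 2>/dev/null   # 4 MiB

# "-size -3M": fewer than 3 rounded-up MiB units, so mid.bin (rounds to 3) is skipped.
find tree -type f -size -3M | sort > strict.list

# "! -size +3M": not more than 3 MiB, so mid.bin is kept.
find tree -type f ! -size +3M | sort > inclusive.list

# The inclusive test combined with directories, as in the listing command.
find tree \( -type d -o ! -size +3M \) | sort > with-dirs.list
cat with-dirs.list
```

If the cut-off has to be exact, a byte count such as "-size -3145728c" avoids the rounding altogether.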

MikeyB

If a filename contains square brackets, on some systems it has to be excluded explicitly, because tar treats each line of the exclude list as a glob pattern rather than a literal name. For example:

$ mkdir test
$ echo "abcde123456" > ./test/a[b].txt
$ echo "1" > ./test/a1.txt
$ ls -la ./test
total 16
drwxrwxr-x 2 user user 4096 Jan 10 16:38 .
drwx------ 4 user user 4096 Jan 10 16:38 ..
-rw-rw-r-- 1 user user    2 Jan 10 16:38 a1.txt
-rw-rw-r-- 1 user user   12 Jan 10 16:38 a[b].txt
$ tar -zcvpf a.tar.gz ./test
./test/
./test/a[b].txt
./test/a1.txt
$ tar -zcvpf a3.tar.gz ./test --exclude-from <(find ./test -type f -size +3c)
./test/
./test/a[b].txt
./test/a1.txt
$ tar -zcvpf ax.tar.gz ./test --exclude-from <(find ./test -type f -size +3c) --exclude '*\[*'
./test/
./test/a1.txt
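The reason the bracketed name slips through can be sketched with the shell's own pattern matching, on the assumption that tar's exclude patterns follow the same bracket rules: [b] is a bracket expression meaning "the letter b", so the pattern a[b].txt matches ab.txt and not the literal name a[b].txt. The glob_match helper below is hypothetical and only exists for this illustration.

```shell
# Hypothetical helper: report whether NAME ($1) matches PATTERN ($2)
# under shell glob rules.
glob_match() {
  # $2 is left unquoted on purpose so that it is treated as a pattern.
  case $1 in
    $2) echo "match" ;;
    *)  echo "no match" ;;
  esac
}

# The literal file name does not match itself when used as a pattern.
glob_match 'a[b].txt' 'a[b].txt'

# The name the pattern really describes.
glob_match 'ab.txt' 'a[b].txt'
```

This is why the transcript above needs the extra --exclude with an escaped bracket.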

If you're trying to do this on a server via SSH, it will not work because of this. To work around it, you can use pipes and xargs:

find /path/to/dir -type f -size -3M | xargs tar cf archive.tar
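A possible variant, assuming GNU tar and find, pipes the list straight into tar instead of going through xargs. xargs may split a long list across several tar invocations, each of which would overwrite the archive, and the "-type f" filter drops empty directories. The sketch below uses NUL-terminated names so that spaces and brackets are taken literally; the paths are made up for illustration.

```shell
work=$(mktemp -d)
cd "$work" || exit 1
mkdir -p dir/empty "dir/with space"
echo "1" > "dir/with space/a[b].txt"
echo "2" > dir/a1.txt
dd if=/dev/zero of=dir/big.bin bs=1024 count=4096 2>/dev/null   # 4 MiB

# find emits NUL-terminated names; tar reads them from stdin (-T -).
# --no-recursion stops tar from re-adding everything under each listed directory.
find dir \( -type d -o -type f ! -size +3M \) -print0 |
  tar -cf archive.tar --null --no-recursion -T -

# Show what ended up in the archive.
tar -tf archive.tar
```

Because names read with -T are file names rather than exclude patterns, the bracketed file is archived as-is, and only big.bin is left out.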
morganbaz