I have a simple question, but I can't find an answer to it. I want to create a tar archive while excluding some files by means of a regular expression.

An example of a file to exclude: 68x640X480.jpg

I have tried this with no luck:

tar cvf test.tar --exclude=[0-9]+x[0-9X]+\.jpg /data/foto

Can anybody help?

user9517
Frodik

2 Answers


You can use some additional tools like find and egrep:

find directory/ -type f -print | egrep -v '[0-9]+x[0-9X]+\.jpg' | tar cvfz directory.tar.gz -T -

The drawback of this method is that it does not work for every possible file name: the list is newline-delimited, so names containing newlines or other control characters are misread. Another option is to use the built-in exclude functionality of tar:

tar -czvf directory.tar.gz --exclude='*x*X*.jpg' directory

Unfortunately, the second method supports only shell-style wildcards, not regular expressions.
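If unusual file names are a concern, the first method can be made more robust by letting find apply the regular expression itself and passing the names NUL-terminated. The following is only a sketch, assuming GNU find (for `-regextype` and `-regex`) and GNU tar (for `--null`); the temporary tree and the name `holiday.jpg` are made up for the demonstration. Note that `-regex` matches the whole path, so the pattern is anchored to the base name with a leading `.*/`, which is slightly stricter than the unanchored egrep filter above.

```shell
#!/bin/sh
set -eu

# Hypothetical demo tree in a temporary directory.
workdir=$(mktemp -d)
mkdir -p "$workdir/directory"
touch "$workdir/directory/68x640X480.jpg" "$workdir/directory/holiday.jpg"
cd "$workdir"

# GNU find matches -regex against the whole path, hence the leading '.*/'.
# -print0 and --null pass names NUL-terminated, so names containing
# newlines or other control characters are not split or misread.
find directory -type f -regextype posix-extended \
    ! -regex '.*/[0-9]+x[0-9X]+\.jpg' -print0 |
  tar --null -T - -czf directory.tar.gz

# List the archive contents to check what was included.
tar -tzf directory.tar.gz
```

The `--null` option has to come before `-T -` so that tar already expects NUL-terminated names when it starts reading the list.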

Vladimir Blaskov
  • Thanks, this is what I was looking for. Can you please add a note about which file names wouldn't work, e.g. which characters cause problems? – Frodik Oct 12 '11 at 09:46
  • You shouldn't worry too much about that - most file names work perfectly fine with that solution. The problem is that UNIX/Linux file names can include pretty much everything, even control characters - such obscure combinations will not work with the first solution. – Vladimir Blaskov Oct 12 '11 at 10:15
  • A nice read related to UNIX/Linux/POSIX file names: http://www.dwheeler.com/essays/fixing-unix-linux-filenames.html – Vladimir Blaskov Oct 12 '11 at 10:16
  • What does the "-T -" at the end do? – MBasith May 31 '21 at 18:12
  • @MBasith the `-T` option enables `tar` to read the files to archive from another file (`--files-from=FILE`). The `-` (dash) refers in this case to standard input. This is useful when the file list needs to be generated from another process and piped into `tar`. – AndOs Dec 08 '21 at 20:50

You could also try cpio:

https://www.gnu.org/software/cpio/manual/cpio.html

It reads a list of file names and archives those files. You can generate the list with sed, as shown below.

ls |sed   '/[0-9]*x[0-9]*X.*/d' >/tmp/files

You can then use it as the input to cpio.
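As a rough sketch of that last step, assuming cpio is installed: in copy-out mode (`-o`) cpio reads file names from standard input and writes the archive to standard output. The temporary directory and the file names below are made up for the demonstration, and the list and archive are kept in separate temporary files so they do not end up in the listing themselves.

```shell
#!/bin/sh
set -eu

# Skip the demonstration if cpio is not available on this system.
command -v cpio >/dev/null 2>&1 || { echo "cpio is not installed" >&2; exit 0; }

# Hypothetical demo files in a temporary directory.
workdir=$(mktemp -d)
list=$(mktemp)
archive=$(mktemp)
cd "$workdir"
touch 68x640X480.jpg holiday.jpg

# Build the file list, dropping names that match the pattern.
ls | sed '/[0-9]*x[0-9]*X.*/d' > "$list"

# Copy-out mode: names on stdin, archive on stdout.
cpio -o < "$list" > "$archive"

# List the archive contents to check what was included.
cpio -it < "$archive"
```

Like the egrep pipeline in the other answer, this list is newline-delimited, so it has the same limitation with unusual file names.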

Daniel F
nitins