"tar -czf" for first 5 thousand files

2

1

I have a directory with too many files in it.

I want to compress the first 5,000 files in that directory into file.tar.gz, then files 5001 - 10000 into another archive, and so on.

How can I do this?

Captain

Posted 2010-12-20T05:25:06.937

Reputation:

Are the first five thousand named differently than the next? – None – 2010-12-20T05:38:48.077

Yes, very different names. – None – 2010-12-20T05:49:49.860

Have you tried using a regular expression to match the first 5k? Then maybe a simple Perl or Python script to do the legwork. – None – 2010-12-20T17:40:35.407

Answers

0

Use ls to generate the list of names and head and tail to filter them. Here's a one-liner that does it in a loop. You'll need to know the number of files in the directory (ls | wc -l will tell you).

for ii in $(seq 1 5000 NUMBER_OF_FILES) ; do echo $ii ; ls | tail -n +$ii | head -n 5000 | tar -f ../ARCHIVE_FILE_$ii.tar.gz -czv -T - ; done

Replace the bits in capitals with what you want.

Optimal Cynic

Posted 2010-12-20T05:25:06.937

Reputation: 316

Useless Use Of ls Award goes to... – Hello71 – 2011-05-28T02:10:30.727
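For what it's worth, here is a minimal sketch of the same batching without parsing ls output. The function name batch_tar, its two arguments and the archive naming are made up for illustration. It assumes a tar that supports -T (GNU tar and bsdtar do) and file names without newlines or backslashes. Only regular files are archived, and each name is written as ./NAME so that a name starting with '-' is not mistaken for an option.

```shell
# Sketch only: batch_tar and its arguments are hypothetical names.
# Usage: batch_tar BATCH_SIZE OUTPUT_PREFIX   (run inside the directory to archive)
batch_tar() {
    batch_size=$1
    out_prefix=$2
    list=$(mktemp) || return 1   # temporary list file, kept outside the directory
    count=0
    n=0
    for f in *; do
        [ -f "$f" ] || continue              # regular files only, skip directories
        printf './%s\n' "$f" >> "$list"      # ./ prefix protects names starting with '-'
        count=$((count + 1))
        if [ "$count" -eq "$batch_size" ]; then
            n=$((n + 1))
            tar -czf "${out_prefix}_${n}.tar.gz" -T "$list"
            : > "$list"                      # empty the list for the next batch
            count=0
        fi
    done
    # Archive the final, partial batch if there is one.
    if [ "$count" -gt 0 ]; then
        n=$((n + 1))
        tar -czf "${out_prefix}_${n}.tar.gz" -T "$list"
    fi
    rm -f "$list"
}
```

Something like `cd /path/to/dir && batch_tar 5000 ../ARCHIVE_FILE` should then produce ../ARCHIVE_FILE_1.tar.gz, ../ARCHIVE_FILE_2.tar.gz and so on. Writing the archives outside the directory keeps them out of the glob if the function is run again.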

0

This script appends the files one at a time to numbered archives. Replace ARCHIVE_NAME and '5000' with your own values.

$ COUNT_MOD=0; for i in *; do tar -r -f ARCHIVE_NAME$((COUNT_MOD / 5000)).tar -- "$i"; ((COUNT_MOD++)) ; done

This script is not optimized, so there are a few rules:

  1. ARCHIVE_NAME# must not exist when the script starts, so if anything fails, run 'rm ARCHIVE_NAME*' before retrying.
  2. The script counts a directory as one entry, but 'tar' does not. Tar descends into the directory and adds all of its files recursively, so you might end up with more than 5000 files in an archive.
  3. Compressed archives cannot be updated, so I left out '-z', sorry :-)
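Rules 2 and 3 can be worked around with a small variation. The following is a rough sketch, not a tested replacement: the function name append_in_batches and its arguments are hypothetical. It skips anything that is not a regular file, so tar never recurses into a directory, and it runs gzip on the finished archives once all appending is done. It assumes that tar -r creates a missing archive, as GNU tar does, and rule 1 still applies.

```shell
# Sketch only: append_in_batches and its arguments are hypothetical names.
# Usage: append_in_batches BATCH_SIZE ARCHIVE_NAME   (run inside the directory)
append_in_batches() {
    batch_size=$1
    archive_name=$2
    count=0
    for f in *; do
        [ -f "$f" ] || continue   # skip directories, so every entry is exactly one file
        # Integer division picks the archive number: 0 for the first batch, then 1, ...
        tar -r -f "${archive_name}$((count / batch_size)).tar" "./$f"
        count=$((count + 1))
    done
    # Appending is finished, so the archives can be compressed now.
    for t in "${archive_name}"*.tar; do
        [ -f "$t" ] && gzip "$t"
    done
    return 0
}
```

For example, `append_in_batches 5000 ../ARCHIVE_NAME` would leave ../ARCHIVE_NAME0.tar.gz, ../ARCHIVE_NAME1.tar.gz and so on. It still starts one tar process per file, so expect it to be slow on a very large directory.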

karatedog

Posted 2010-12-20T05:25:06.937

Reputation: 809

You could have used for i in * instead. – Wuffers – 2011-05-28T02:48:40.500

0

You could build a set of list files, each naming 5000 files, and pass them to tar with the -T option. Something like this might work:

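# Write the file names to list files tarlistaa, tarlistab, ... (5000 names each).
ls -1 | split -l 5000 - tarlist
count=0
for f in tarlist*
do
    # Archive the files named in this list.
    tar -czf save.$count.tar.gz -T "$f"
    count=$((count + 1))
done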
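Whichever variant you use, it may be worth checking afterwards that every archive holds the expected number of entries. A small sketch, where count_archived is a hypothetical helper name and the archives are assumed to be gzip-compressed:

```shell
# Sketch only: count_archived is a hypothetical helper.
# Usage: count_archived ARCHIVE.tar.gz [ARCHIVE.tar.gz ...]
count_archived() {
    total=0
    for a in "$@"; do
        # tar -t lists one entry per line; tr strips the padding some wc versions add.
        n=$(tar -tzf "$a" | wc -l | tr -d ' ')
        echo "$a: $n"
        total=$((total + n))
    done
    echo "total: $total"
}
```

Running `count_archived save.*.tar.gz` should report 5000 entries for every archive except possibly the last, and the total can be compared with the number of files in the directory.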

Shannon Nelson

Posted 2010-12-20T05:25:06.937

Reputation: 1 287