I receive anywhere from 4 to 100 very large (~20 GB) tar archive files every day. In the past I have been concatenating them by looping through each of the archives I see on the file system and doing something like this:
/bin/tar --concatenate --file=allTars.tar receivedTar.tar
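Spelled out, the loop is roughly this (the glob and paths are illustrative, not my exact script):

for t in receivedTar*.tar; do
    /bin/tar --concatenate --file=allTars.tar "$t"
done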
The problem with this, however, is that as I concatenate more and more tar files, tar must read to the end of allTars.tar before it can begin appending again. Sometimes it takes over 20 minutes just to start adding another tar file. That is too slow, and I am missing an agreed-upon delivery time for the complete allTars.tar.
I also tried handing my tar command a list of files like so:
/bin/tar --concatenate --file=allTars.tar receivedTar1.tar receivedTar2.tar receivedTar3.tar ...etc
This gave very odd results. allTars.tar would be the expected size (i.e. close to the sizes of all the receivedTar.tar files added together), but files seemed to be overwritten when allTars.tar was unpacked.
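One quick way to see that kind of collision (a hypothetical check, not part of my original workflow) is to list the archive and look for duplicate member paths, since tar extracts entries in order and a later entry with the same name overwrites an earlier one:

tar -tf allTars.tar | sort | uniq -d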
Is there any way to concatenate all these tar files in one command (or otherwise avoid reading to the end of the archive being appended to every time) and have them unpack correctly, with all files and data intact?
How do you receive the files? By network? I'd reduce any unneeded copy operations; best would be to move each file to a directory tree on the same partition. – ott-- – 2015-07-16T18:33:30.597
What version of tar are you using, 1.28? – cybernard – 2015-07-17T04:19:59.297
Can't you just make a new tar ball of tar balls (a nested tar ball) rather than concatenating them? It'll make extraction very slow, but this doesn't sound like your problem... – Colin – 2015-07-16T14:52:51.950
Unfortunately, no. Our client is very particular about how the tar ball unpacks. – Jeff Hall – 2015-07-16T14:56:16.790
Have you tried untarring each source file as its own background thread with & (in parallel), then cat'ing before re-tarring and zipping? Going to need a huge swap file tho! – Colin – 2015-07-16T15:03:00.930

Yes, I am using tar version 1.28. – Jeff Hall – 2015-07-20T18:50:03.080
Unpacking and then re-tarring all the files ended up being slightly too slow as well. I ended up just using cat to run them all together and convinced our clients to use the "-i" command-line option when unpacking. – Jeff Hall – 2015-07-20T18:52:27.190
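For anyone who lands here later, a minimal sketch of that workaround (file names illustrative): GNU tar pads each archive with zero-filled blocks at end-of-archive, plain cat preserves that padding, and -i (--ignore-zeros) tells the reading tar to skip past those embedded markers.

# Append new archives without re-reading allTars.tar each time:
cat receivedTar1.tar receivedTar2.tar receivedTar3.tar >> allTars.tar

# Unpack with --ignore-zeros so tar reads past the end-of-archive
# blocks between the concatenated archives:
tar -x -i -f allTars.tar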