I have a folder with about 20K files. The files are named according to the pattern xy_\d{1,5}_\d{4}\.abc, e.g. xy_12345_1234.abc. I wanted to compress the first 10K of them using this command:
ls | sort -n -k1.4,1.9 | head -n10000 | xargs tar -czf xy_0_10000.tar.gz
However, the resulting archive contained only about 2K files, even though
ls | sort -n -k1.4,1.9 | head -n10000 | wc -l
returns 10000, as expected.
It seems to me that I am misunderstanding something basic here...
I am using zsh 5.0.2 on Linux Mint 17.1, GNU tar 1.27.1
EDIT:
Forking, as suggested by @Archemar, sounds very plausible, with the last fork overwriting the resulting file: the archive contains only the 'tail' of the list, files 7773 to 9999.
Result of xargs --show-limit:
Your environment variables take up 3973 bytes
POSIX upper limit on argument length (this system): 2091131
POSIX smallest allowable upper limit on argument length (all systems): 4096
Maximum length of command we could actually use: 2087158
Size of command buffer we are actually using: 131072
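The batching suspected above can be reproduced on a small scale. The following is only a sketch: it uses -n 2 to force tiny batches (an assumption made so the effect shows up with six made-up names; in the real case the split would be driven by the 131072-byte command buffer) and echo stands in for tar.

```shell
# Six made-up names stand in for the real file list.
# -n 2 makes xargs start a new command after every two arguments.
batches=$(printf '%s\n' a b c d e f | xargs -n 2 echo)
printf '%s\n' "$batches"

# Each output line corresponds to one separate invocation of the command.
invocations=$(printf '%s\n' "$batches" | wc -l)
echo "invocations: $invocations"
```

If the command were tar -czf with a fixed archive name instead of echo, each invocation would presumably create the archive from scratch, which would match the observation that only the last batch of files survives.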
Replacing -c with -r or -u did not work in my case. The error message was tar: Cannot update compressed archives
Using both -r and -u is invalid and fails with tar: You may not specify more than one '-Acdtrux', '--delete' or '--test-label' option
Replacing -c with -a seems to be invalid as well and fails with a similar message, tar: You must specify one of the '-Acdtrux', '--delete' or '--test-label' options, though I don't recognize the issue: azf and Acdtrux seem disjoint to me.
EDIT 2:
-T looks like a good way, I have also found an example here.
However when I try
ls | sort -n -k1.4,1.9 | head -n10000 | tar -czf xy_0_10000.tar.gz -T -
I get
tar: option requires an argument -- 'T'
Well, perhaps the filenames don't reach tar? But it looks like they do, because when I execute
ls | sort -n -k1.4,1.9 | head -n10000 | tar --null -czf xy_0_10000.tar.gz -T -
I get
tar: xy_0_.ab\nxy_1_...<the rest of filenames separated by literal \n>...998.ab
Cannot stat: File name too long
So why is tar not seeing the filenames?
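For comparison, here is a minimal sketch of the same pipeline in a throwaway directory, assuming GNU tar and a handful of hypothetical file names. It uses the long spelling --files-from=- of -T -; whether that avoids the error above on the setup described in the question is not something confirmed here.

```shell
# Work in a throwaway directory with a few hypothetical files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch xy_1_0001.abc xy_2_0001.abc xy_10_0001.abc xy_20_0001.abc xy_100_0001.abc

# The archive lives outside the directory so that ls does not list it.
archive=$(mktemp)

# Same sort key as in the question; keep the first 3 names and let tar
# read them from stdin, one name per line.
ls | sort -n -k1.4,1.9 | head -n 3 | tar -czf "$archive" --files-from=-

# List what ended up in the archive.
members=$(tar -tzf "$archive")
printf '%s\n' "$members"
```

Because tar is started exactly once and reads the list itself, the result does not depend on how many names fit on one command line.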
and if you try a instead of c, in the tar command? – Olivier Dulac – 2015-09-22T16:12:23.313
Relevant: Don't parse the output of ls – 8bittree – 2015-09-22T17:35:25.780
OP's files do not have tricky names. – Archemar – 2015-09-23T08:42:10.827
@8bittree - well as a general advice for robust shell scripts, yes. but what do you suggest instead for working with lists of files with the regular one-off oneliners? – kostja – 2015-09-23T11:39:56.443
@Archemar True, but future people coming here for help might have tricky file names, and the OP may do something similar in the future with tricky file names. Might as well learn the safe way now. – 8bittree – 2015-09-23T12:30:36.867
@kostja I'd use find, which has a -print0 option to use a null byte as the delimiter instead of a newline. sort can handle that with the -z flag. head, unfortunately, does not understand null byte delimiters, but this answer has a solution using tr to swap \n and \0 before and after head. tar has --null -T - to read null delimited file names from stdin. – 8bittree – 2015-09-23T13:08:08.033
@8bittree - cool, this works as well :) Probably I will ignore your (still valid and reasonable) advice for most of what I do on the command line because I mostly do simple oneliners not meant for sharing and the find/nullbyte solution has some added churn. Until I run into an error because of that and learn the hard way :) Thank you. – kostja – 2015-09-23T17:40:50.770
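The null-delimited approach described in the comments could look roughly like the sketch below. It assumes GNU find, sort, tr and tar; the file names and the limit of 3 are made up for illustration, and the sort key is shifted by two characters relative to the question because find prefixes each name with ./

```shell
# Throwaway directory with hypothetical files, one containing a space.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch xy_1_0001.abc xy_2_0001.abc xy_10_0001.abc 'xy_3_0001 copy.abc'

# The archive lives outside the directory being searched.
archive=$(mktemp)

# find and sort exchange NUL-terminated names; tr swaps NUL and newline
# around head (which only understands newlines) and swaps them back
# afterwards, so tar --null receives NUL-terminated names on stdin.
find . -maxdepth 1 -type f -name 'xy_*' -print0 |
  sort -z -n -k1.6,1.11 |
  tr '\000\n' '\n\000' |
  head -n 3 |
  tr '\000\n' '\n\000' |
  tar --null -czf "$archive" -T -

# List what ended up in the archive.
members=$(tar -tzf "$archive")
printf '%s\n' "$members"
```

The two tr calls are the "added churn" mentioned above; they are only needed because head counts newline-terminated records.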