Argument list too long for xargs/exec

2

0

I'm working on a CentOS server and I have to move around and cat together millions of files. I've tried many incarnations of something like the below, but all of them fail with an argument list too long error.

command:

find ./ -iname out.* -type f -exec mv {} /home/user/trash
find ./paramsFile.* -exec cat > parameters.txt 

error:

-bash: /usr/bin/find: Argument list too long
-bash: /bin/cat: Argument list too long

or

echo ./out.* | xargs -I '{}' mv /home/user/trash
(echo ./paramsFile.* | xargs cat) > parameters.txt  

error:

xargs: argument line too long
xargs: argument line too long              

The second command also never finished. I've heard some things about globbing, but I'm not sure I understand it completely. Any hints or suggestions are welcome!

ohblahitsme

Posted 2013-07-10T05:05:47.513

Reputation: 133

Answers

4

There are several mistakes. Escape (or quote) the * so the shell does not expand it before find runs. Put {} between quotes (for filename safety), and end the -exec with \;.

find ./ -iname out.\* -type f -exec mv "{}" /home/user/trash \;
find . -name paramsFile.\* -exec cat "{}" \; >> parameters.txt

The problem here is that the shell expands the unquoted * to every matching file in your directory before the command even starts, so the argument list exceeds the system limit and you get the error. If find does the matching instead of the shell, xargs receives individual filenames that it can use to construct command lines of the correct length.
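A small sketch of the difference, run in a throwaway directory (the directory and the three file names are made up for illustration). It counts how many arguments a command would receive with and without quoting, then lets find do the matching itself. The last line prints the byte limit on the argument list of an executed program, which is what "Argument list too long" refers to.

```shell
# Hypothetical scratch directory, created only for this demonstration.
demo=$(mktemp -d)
touch "$demo/out.1" "$demo/out.2" "$demo/out.3"

# Unquoted glob: the shell substitutes every matching name,
# so the command sees one argument per file.
set -- "$demo"/out.*
unquoted=$#

# Quoted glob: the pattern is passed through literally as a single argument.
set -- "$demo/out.*"
quoted=$#

echo "unquoted glob: $unquoted argument(s); quoted glob: $quoted argument(s)"

# With the pattern intact, find performs the matching on its own.
found=$(find "$demo" -name 'out.*' -type f | wc -l)
echo "find matched $found file(s)"

# Limit (in bytes) on the arguments plus environment passed to a new program.
getconf ARG_MAX
```

With millions of files the unquoted form is the one that overflows that limit, while the quoted form always passes a single short argument.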

Bernhard

Posted 2013-07-10T05:05:47.513

Reputation: 1 017

You don't have to put {} between quotes for filename security! Without quotes is perfect unless your shell does weird stuff with it. – gniourf_gniourf – 2013-07-10T06:39:01.323

1

Try this:

find . -iname 'out.*' -type f -exec mv '{}' /home/user/trash \;
find . -name 'paramsFile.*' -print0 | xargs -0 cat >> parameters.txt

The redirection applies to xargs as a whole, so every cat it starts (there will be several if you really have a huge number of files) writes to the same output file, and no call overwrites the output of an earlier one. Because >> appends, make sure parameters.txt starts out empty (or delete it first).
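A minimal sketch of that batching on a handful of files, assuming a scratch directory and made-up file contents. The -n 2 option is only there to force several cat invocations on a tiny input; with millions of files xargs splits the list by itself. The second command shows the same idea using find's own "-exec ... {} +" form, which also batches file names.

```shell
# Hypothetical scratch directory with five small parameter files.
work=$(mktemp -d)
for i in 1 2 3 4 5; do
    echo "param $i" > "$work/paramsFile.$i"
done

# -n 2 makes xargs start a new cat for every two file names, imitating the
# splitting that happens automatically when the argument list gets too long.
# The redirection is attached to xargs, so all batches append to one file.
find "$work" -name 'paramsFile.*' -print0 |
    xargs -0 -n 2 cat >> "$work/parameters.txt"

lines=$(wc -l < "$work/parameters.txt")
echo "collected $lines line(s) via xargs"

# Alternative without xargs: "+" tells find to pass as many names
# per cat invocation as fit on a command line.
find "$work" -name 'paramsFile.*' -exec cat {} + >> "$work/parameters2.txt"
```

The order in which find lists the files is not sorted, so the concatenated result may not follow the numeric suffixes.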

jjlin

Posted 2013-07-10T05:05:47.513

Reputation: 12 964

Thanks for the answer! It seems to still be too long for find and cat. – ohblahitsme – 2013-07-10T05:44:51.380

Oh, I guess I didn't really read that carefully. I updated the answer with the other call. Also, +1 to @Bernhard for bringing up some good points that I forgot about. – jjlin – 2013-07-10T06:20:43.177

0

I don't have a box handy at the moment to give a very good (i.e., tested) answer, but I think this is a good use for parallel.

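If I understand right, the unquoted out.* in your command

find ./ -iname out.* -type f -exec mv "{}" /home/user/trash

is expanded by the shell before find starts, so find is launched with one huge argument list:

find ./ -iname out.1 out.2 out.3 out.4 ... out.10100942 -type f -exec mv {} /home/user/trash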

Instead, something like

find ./ -iname 'out.*' -type f | parallel mv "{}" /home/user/trash

will execute millions of smaller commands:

mv out.1 /home/user/trash
mv out.2 /home/user/trash
...

You might want to look into some of parallel's options, specifically -j and -i so you don't unexpectedly overload your server.
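If parallel is not installed, roughly the same one-mv-per-file behaviour can be sketched with xargs. This is a substitute technique, not parallel itself. The two directories below are temporary stand-ins (the second plays the role of /home/user/trash), and -P 2 is an arbitrary cap on simultaneous processes, comparable in spirit to limiting jobs with parallel's -j.

```shell
# Hypothetical directories for the demonstration.
src=$(mktemp -d)
trash=$(mktemp -d)   # stands in for /home/user/trash
touch "$src/out.1" "$src/out.2" "$src/out.3" "$src/out.4"

# -print0 / -0 : pass file names NUL-separated, safe for odd characters
# -I {}        : run one mv per file name, substituting it where {} appears
# -P 2         : run at most two mv processes at the same time
find "$src" -iname 'out.*' -type f -print0 |
    xargs -0 -P 2 -I {} mv {} "$trash"

moved=$(find "$trash" -type f | wc -l)
echo "moved $moved file(s)"
```

Starting one process per file is slow for millions of files, so a low -P value mainly keeps the load on the server predictable rather than making the job fast.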

PS. Follow @Bernhard's advice: whenever you use a shell variable, especially one holding user input, quote it! Write "{}", not {}.

djeikyb

Posted 2013-07-10T05:05:47.513

Reputation: 891