I have 100,000 URLs of small files to download. Would like to use 10 threads and pipelining is a must. I concatenate the result to one file. My current approach is:
cat URLS | xargs -P 10 -- curl >> OUTPUT
Is there a better option that will show progress of the whole operation? Must work from the command line.
"Would like to use 10 threads and pipelining is a must. I concatenate the result to one file." So the order doesn't matter? – Bobby – 2013-08-16T13:21:41.627
Use GNU parallel, it will even keep the order of the output. If you tag your question accordingly, you might be lucky and the author might chime in ;-)
– Adrian Frühwirth – 2013-08-16T14:37:08.193
Order is not an issue. Tagged gnu-parallel, good idea. Is it possible to use parallel and still get the pipelining in curl? – William Entriken – 2013-08-16T15:45:45.280
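A minimal sketch of what the answer suggests, assuming GNU parallel is installed and `URLS` holds one URL per line (`--bar`, `-j`, and `-m` are all real GNU parallel options; the exact batching behavior is worth checking against your version):

```shell
# -j10   : run 10 download jobs at once
# --bar  : progress bar for the whole run (on stderr)
# -m     : pass each curl invocation a batch of URLs, so curl can
#          reuse connections to the same host -- the closest thing
#          to pipelining you get from the command line here
# parallel buffers each job's stdout, so OUTPUT is not interleaved
# mid-file the way a raw `xargs -P` pipeline can be.
parallel -j10 --bar -m curl -s < URLS >> OUTPUT
```

Add `-k` if you ever do want the output in the same order as the URL list.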
Don't you get the files intermingled when you do that? Unless your webserver is single-threaded, I don't see how you would avoid having two processes writing simultaneously to your output file. – rici – 2013-08-16T16:30:17.983
Mangling and jumbling are not a problem for me. – William Entriken – 2013-08-16T20:18:51.540
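On the intermingling concern above: GNU parallel's default grouping buffers each job's complete output before printing it, so even when order is scrambled, no single download is split apart by another job's bytes. A quick demo (assumes GNU parallel; the job names are made up for illustration):

```shell
# Three jobs each print two lines. With parallel's default --group,
# the two lines of any one job always appear together, even though
# the jobs themselves may finish in any order.
seq 3 | parallel -j3 'printf "job {} line 1\njob {} line 2\n"'
```

This is the property plain `xargs -P` does not give you: there, concurrent curl processes write to the shared output file whenever their buffers flush.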