If I want to process a large number of files with a command "do_something" that can only use one core, what's the best way to use all available cores, assuming each file can be processed independently?
At the moment I do something like this:
#!/bin/zsh
TASK_LIMIT=8
TASKS=0
for i in *(.)    # zsh glob qualifier: plain files only
{
  do_something "$i" &
  TASKS=$(($TASKS + 1))
  if [[ $TASKS -ge $TASK_LIMIT ]]; then
    wait         # blocks until ALL background jobs finish
    TASKS=0
  fi
}
wait
Obviously, this is not efficient, because after reaching $TASK_LIMIT it waits until all the "do_something" instances finish. For example, in my real script I use about 500% of my 8-core CPU instead of >700%.
Running without $TASK_LIMIT is not an option because "do_something" may consume a lot of memory.
Ideally, the script should keep the number of parallel tasks at $TASK_LIMIT: for example, if task 1 of 8 has finished and there is at least one more file to process, the script should start the next "do_something" instead of waiting for the remaining 7 tasks to finish. Is there a way to achieve this in zsh or bash?
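A minimal sketch of that behaviour, assuming bash >= 4.3 for `wait -n`: it blocks until ANY one background job exits, so a new task starts as soon as a slot frees up instead of after the whole batch drains. Here `do_something` and the 0.05 s sleep are stand-ins for the real single-core command.

```shell
#!/bin/bash
TASK_LIMIT=8
TASKS=0
workdir=$(mktemp -d)

do_something() {               # placeholder for the real command
  sleep 0.05
  touch "$workdir/$1.done"     # record completion for the final tally
}

for i in $(seq 1 12); do
  do_something "$i" &
  TASKS=$((TASKS + 1))
  if (( TASKS >= TASK_LIMIT )); then
    wait -n                    # returns when ONE job exits, not all of them
    TASKS=$((TASKS - 1))
  fi
done
wait                           # drain the remaining jobs

done_count=$(ls "$workdir" | wc -l | tr -d ' ')
rm -rf "$workdir"
echo "processed $done_count files"
```

For what it's worth, `xargs -P "$TASK_LIMIT" -n 1 do_something` (GNU xargs) or GNU `parallel` give the same keep-N-running behaviour without hand-rolled job control.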
hint: use `trap` to catch SIGCHLD in monitor mode. – Keith – 2012-10-26T16:22:43.527
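A hedged sketch of the refill-one-slot idea for older shells without `wait -n`: poll the job table with `jobs -rp` and launch a new task as soon as the running count drops below $TASK_LIMIT. A SIGCHLD trap, as the comment suggests, could replace the polling sleep, but re-counting via `jobs -rp` sidesteps the races of maintaining a counter inside the trap handler. Again, `do_something` and the sleeps are stand-ins for the real workload.

```shell
#!/bin/bash
TASK_LIMIT=4
jobdir=$(mktemp -d)

do_something() {               # placeholder for the real command
  sleep 0.05
  touch "$jobdir/$1.done"      # record completion for the final tally
}

for i in $(seq 1 10); do
  # Block while all TASK_LIMIT slots are busy; re-check as jobs finish.
  while (( $(jobs -rp | wc -l) >= TASK_LIMIT )); do
    sleep 0.01
  done
  do_something "$i" &
done
wait                           # drain the last batch

finished=$(ls "$jobdir" | wc -l | tr -d ' ')
rm -rf "$jobdir"
echo "finished $finished tasks"
```

Note that bash special-cases `jobs` so it reports the parent shell's job table even inside command substitution, which is what makes the `while` condition work.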