Bash script 'while read' loop causes 'broken pipe' error when run with GNU Parallel


According to the GNU Parallel mailing list this is not a GNU Parallel-specific problem. They suggested that I post my problem here.

The error I'm getting is a "broken pipe" error, but I should first explain the context of my problem and what triggers it. The error appears whenever I run a bash script containing a 'while read' loop through GNU Parallel.

I have a basic bash script like this:

#!/bin/bash
# linkcheck.sh

while read domain
do
    host "$domain"
done

Assume that I want to pipe in a large list of domains (say, 250 MB).

cat urllist | ./linkcheck.sh

Running the host command on 250 MB worth of URLs is rather slow. To speed things up, I want to break the input into chunks before piping it, and then run multiple jobs in parallel. GNU Parallel is capable of doing this.

cat urllist | parallel --pipe -j0 parallel ./linkcheck.sh {}

{} is substituted by the contents of urllist line by line. Assume that my system's default setup is capable of running roughly 500 jobs per instance of parallel. To get around this limitation we can parallelize Parallel itself:

cat urllist | parallel -j10 --pipe parallel -j0 ./linkcheck.sh {}

This will run roughly 5,000 jobs. It will also, sadly, cause the "broken pipe" error (bash FAQ). Yet the script starts working if I remove the while read loop and take input directly from whatever is fed into {}, e.g.,

#!/bin/bash
# linkchecker.sh

domain="$1"
host "$domain"

Why will it not work with a while read loop? Is it safe to simply ignore the SIGPIPE signal to suppress the "broken pipe" message, or could that have side effects such as data corruption?
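
For reference, ignoring the signal from inside the script would look roughly like this (an untested sketch; with SIGPIPE ignored, a write to a closed pipe fails with an EPIPE error instead of killing the process, so output may be lost but nothing gets written incorrectly):

#!/bin/bash
# linkcheck.sh with SIGPIPE ignored -- untested sketch
trap '' PIPE    # the shell and its children ignore SIGPIPE

while read domain
do
    host "$domain"
done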

Thanks for reading.

Joe White

Posted 2012-09-27T22:35:53.873

Reputation: 23

Answers


So, did

cat urllist | parallel --pipe -j0 parallel ./linkcheck.sh {}

work correctly?  I believe part of your problem may be that you left out the second --pipe, as in

cat urllist | parallel -j10 --pipe parallel -j0 --pipe ./linkcheck.sh {}
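
If it helps to see what --pipe does, here is a tiny demonstration (assuming GNU Parallel; with --pipe, -N4 caps each chunk at four records):

seq 10 | parallel --pipe -N4 wc -l
# prints 4, 4 and 2 (in some order; add -k to keep input order) --
# each wc -l receives one chunk of lines on its stdin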

BTW, you never need to say

cat one_file | some_command

You can always change this to

some_command < one_file

resulting in one fewer process (and one fewer pipe).  (It may be appropriate/necessary to use cat when you have multiple input files.)
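
For example (file names here are just placeholders):

./linkcheck.sh < urllist              # same effect, one fewer process
cat part1 part2 | ./linkcheck.sh      # here cat genuinely concatenates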

Scott

Posted 2012-09-27T22:35:53.873

Reputation: 17 653

Thanks. Missing out the second --pipe was my problem. I could've sworn I'd already tried that, but obviously not. – Joe White – 2012-09-28T00:31:01.167

@Joe: Also, if you’re using --pipe, I suspect that {} is unnecessary/ignored. – Scott – 2012-09-28T14:48:13.303


It appears to me that the error may be arising from a race condition: there is a window between forking a child to run another copy of linkcheck.sh while the pipe is still open, and the moment that child actually tries to read. In that window, another copy may already have read EOF and the pipe may have closed.
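
A minimal way to see the same mechanism outside Parallel, assuming bash and coreutils: head exits after one line and closes its end of the pipe, so the next write by yes raises SIGPIPE, which kills it with status 141 (128 + signal 13).

yes | head -n 1
echo "${PIPESTATUS[0]}"    # prints 141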

Nicole Hamilton

Posted 2012-09-27T22:35:53.873

Reputation: 8 987

Yes I think you are correct. Thanks for your input. – Joe White – 2012-09-28T00:32:17.717