Running several asynchronous tasks and getting their outputs and exit codes in bash

I have to run a bunch of commands asynchronously and, as soon as one finishes, perform actions according to its exit code and output. Note that in my real use case I can't predict how long any of these tasks will run.

To solve this problem, I ended up with the following algorithm:

For each task to be run:
    Run the task asynchronously;
    Append the task to the list of running tasks.
End For.

While there still are tasks in the list of running tasks:
    For each task in the list of running tasks:
        If the task has ended:
            Retrieve the task's exit code and output;
            Remove the task from the list of running tasks.
        End If.
    End For.
End While.

This gives me the following bash script:

  1 #!/bin/bash
  2 
  3 # bg.sh
  4 
  5 # Executing commands asynchronously, retrieving their exit codes and outputs upon completion.
  6 
  7 asynch_cmds=
  8 
  9 echo -e "Asynchronous commands:\nPID    FD"
 10 
 11 for i in {1..10}; do
 12         exec {fd}< <(sleep $(( i * 2 )) && echo $RANDOM && exit $i) # Dummy asynchronous task, standard output's stream is redirected to the current shell
 13         asynch_cmds+="$!:$fd " # Append the task's PID and FD to the list of running tasks
 14         
 15         echo "$!        $fd"
 16 done    
 17 
 18 echo -e "\nExit codes and outputs:\nPID       FD      EXIT    OUTPUT"
 19 
 20 while [[ ${#asynch_cmds} -gt 0 ]]; do # While the list of running tasks isn't empty
 21         
 22         for asynch_cmd in $asynch_cmds; do # For each task in the list
 23                 
 24                 pid=${asynch_cmd%:*} # Task's PID
 25                 fd=${asynch_cmd#*:} # Task's FD
 26                 
 27                 if ! kill -0 $pid 2>/dev/null; then # If the task ended
 28                         
 29                         wait $pid # Retrieving the task's exit code
 30                         echo -n "$pid   $fd     $?      "
 31                         
 32                         echo "$(cat <&$fd)" # Retrieving the task's output
 33                         
 34                         asynch_cmds=${asynch_cmds/$asynch_cmd /} # Removing the task from the list
 35                 fi
 36         done
 37 done

The output tells me that wait fails when trying to retrieve the exit code of each task, except the last one to be run:

Asynchronous commands:
PID     FD
4348    10
4349    11
4351    12
4353    13
4355    14
4357    15
4359    16
4361    17
4363    18
4365    19

Exit codes and outputs:
PID     FD  EXIT OUTPUT
./bg.sh: line 29: wait: pid 4348 is not a child of this shell
4348    10  127  16010
./bg.sh: line 29: wait: pid 4349 is not a child of this shell
4349    11  127  8341
./bg.sh: line 29: wait: pid 4351 is not a child of this shell
4351    12  127  13814
./bg.sh: line 29: wait: pid 4353 is not a child of this shell
4353    13  127  3775
./bg.sh: line 29: wait: pid 4355 is not a child of this shell
4355    14  127  2309
./bg.sh: line 29: wait: pid 4357 is not a child of this shell
4357    15  127  32203
./bg.sh: line 29: wait: pid 4359 is not a child of this shell
4359    16  127  5907
./bg.sh: line 29: wait: pid 4361 is not a child of this shell
4361    17  127  31849
./bg.sh: line 29: wait: pid 4363 is not a child of this shell
4363    18  127  28920
4365    19  10   28810

The output of the commands is retrieved flawlessly, but I don't understand where the "is not a child of this shell" error comes from. I must be doing something wrong, since wait is able to get the exit code of the last command run asynchronously.

Does anyone know where this error comes from? Is my solution to this problem flawed, or am I misunderstanding the behavior of bash?

Informancien

Posted 2019-09-13T13:37:03.563

Reputation: 175

None of those commands were run in the background, so they were never children of this shell. – l0b0 – 2019-09-13T14:20:49.473

Answers

The error message appears because the process with the PID you're trying to wait for was never run as an asynchronous process. Why exec … results in $! being populated is beyond my knowledge. You can test with an unreachable PID:

$ wait $(($(cat /proc/sys/kernel/pid_max) + 1))
bash: wait: pid 32769 is not a child of this shell
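
To see the two cases side by side, here is a small sketch that is not part of the original test: it waits on a job started with & after that job has already exited, then waits on a PID that is not a child of the shell. The subshell task, the variable names and the use of $$ (the shell's own PID) as a convenient non-child PID are assumptions made for illustration.

```bash
#!/bin/bash
# A job started with '&' is a child that the shell tracks, so `wait` can
# still return its exit status after the job has finished.
( sleep 0.2; exit 3 ) &
child_pid=$!
sleep 0.5                      # give the job time to finish first
wait "$child_pid"
child_status=$?
echo "background job: $child_status"

# The shell's own PID is never one of its children, so `wait` reports
# "not a child of this shell" and returns 127.
wait "$$" 2>/dev/null
other_status=$?
echo "non-child PID: $other_status"
```

The first wait should report 3 and the second 127, the same exit code shown in the EXIT column of the question's output.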

l0b0

Posted 2019-09-13T13:37:03.563

Reputation: 6 306

Isn't wait supposed to be able to grab the exit code of finished jobs? Running this sequence of commands in an interactive shell works flawlessly: echo <(sleep 5 && exit 3), sleep 10, wait $! and then echo $?. The process substitution's job has clearly already finished when wait is called. I'm having a hard time understanding how "finished" processes behave. – Informancien – 2019-09-13T14:10:22.043

From the bash manpage under Process Substitution: "The process list is run asynchronously, and its input or output appears as a filename." If the manual is to be trusted, the command is in fact run asynchronously, thus populating $!. Moreover, that doesn't explain why wait is able to grab the exit code of the last command. – Informancien – 2019-09-13T14:26:49.290
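
Whatever the explanation for the process substitution case, one way to avoid depending on it is to start each task with & and capture its output in a temporary file. Below is a minimal sketch of the question's polling algorithm under that assumption. The run_task helper, the temp-file layout and the array names are hypothetical, and the associative arrays require bash 4 or later.

```bash
#!/bin/bash
# Sketch only: each task is started with '&', which makes it a child job that
# `wait` can reap. run_task, the temp files and the array names are
# hypothetical choices for illustration.

run_task() {                 # dummy task: prints one line, exits with status $1
    sleep "0.$1"
    echo "output of task $1"
    return "$1"
}

tmpdir=$(mktemp -d)
declare -A out_of=()         # PID -> file capturing that task's stdout
declare -A status_of=()      # PID -> exit code, filled in as tasks finish

for i in 1 2 3; do
    run_task "$i" >"$tmpdir/$i.out" &
    out_of[$!]="$tmpdir/$i.out"
done

while (( ${#out_of[@]} > 0 )); do
    for pid in "${!out_of[@]}"; do
        if ! kill -0 "$pid" 2>/dev/null; then    # the task has ended
            wait "$pid"                           # yields the task's exit code
            status_of[$pid]=$?
            printf '%s\t%s\t%s\n' "$pid" "${status_of[$pid]}" "$(<"${out_of[$pid]}")"
            unset "out_of[$pid]"
        fi
    done
    sleep 0.1                # poll at intervals instead of spinning
done

rm -rf "$tmpdir"
```

Each finished task produces one tab-separated line with its PID, exit code and output. The kill -0 check carries a caveat: if a PID were reused by an unrelated process after the task ended, the check would wrongly treat the task as still running.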