
When using the bpipe plugin with Bareos Backup, I get this error:

Fatal error: bpipe-fd: Pipe read error: ERR=Error 0

It appears after the following command, run from a shell script, has finished:

zfs send -R dpool/some_dataset

The error occurs only if dpool/some_dataset has child datasets (for example dpool/some_dataset/child1). The -R option includes the children in the zfs send stream. This is my script, zfs_create_send_snapshot.sh:

#!/bin/sh
#
# create recursive ZFS snapshot for given dataset and pipe
# replication stream to stdout, then delete snapshot

if [ -z "$1" ]; then
  (>&2 echo "ERROR: missing dataset name argument")
  exit 1
fi

DDD=`date +%y%m%d%H%M`
SNAPNAME=$1@bareos_${DDD}
(>&2 echo "creating ZFS snapshot ${SNAPNAME}")

zfs snapshot -r ${SNAPNAME}
(>&2 echo "sending ZFS snapshot ${SNAPNAME}")
zfs send -R ${SNAPNAME}
RC=$?
(>&2 echo "deleting ZFS snapshot ${SNAPNAME}")
(>&2 zfs destroy -r ${SNAPNAME})
exit ${RC}

It is run from a Bareos FileSet like this:

Plugin = "bpipe:file=/tmp/zfs_snap.bin:reader=/etc/bareos/zfs_create_send_snapshot.sh dpool/some_dataset:writer=/etc/bareos/writer.sh /tmp/zfs_snap.bin"

The job fails with the pipe read error above only if dpool/some_dataset has child ZFS datasets; otherwise everything is fine. The failure also seems to be only a side effect: the backup job writes the complete ZFS snapshot stream to tape and only then ends with the error. This happens on OpenIndiana/illumos, with a recent Bareos client 17.2 compiled from the git sources.

NorbertM

1 Answer


I found a workaround for the problem (or bug) by searching for "How to echo an EOF in bash?".

For some obscure reason, the zfs_create_send_snapshot.sh script terminates its stdout differently depending on whether zfs send sends more than one filesystem. Very strange.

Adding exec 1>&- (which closes stdout) right after zfs send in the script solves the issue:

#!/bin/sh
#
# create recursive ZFS snapshot for given dataset and pipe
# replication stream to stdout, then delete snapshot

if [ -z "$1" ]; then
  (>&2 echo "ERROR: missing dataset name argument")
  exit 1
fi

DDD=`date +%y%m%d%H%M`
SNAPNAME=$1@bareos_${DDD}
(>&2 echo "creating ZFS snapshot ${SNAPNAME}")

zfs snapshot -r ${SNAPNAME}
(>&2 echo "sending ZFS snapshot ${SNAPNAME}")
zfs send -R ${SNAPNAME}
RC=$?
# close stdout so the pipe reader sees EOF before the cleanup runs
exec 1>&-
(>&2 echo "deleting ZFS snapshot ${SNAPNAME}")
(>&2 zfs destroy -r ${SNAPNAME})
exit ${RC}
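
The ordering used in the script can be tried without ZFS or Bareos. The following is only a sketch under assumptions: fake_send is a hypothetical stand-in for zfs send -R (it just prints a payload and returns a chosen status), and run_reader is a made-up wrapper that repeats the same steps: send, remember the exit code, close stdout, clean up on stderr, exit with the saved code.

```shell
#!/bin/sh
# Hypothetical stand-in for `zfs send -R`: writes a payload, returns a status.
fake_send() {
  printf 'stream-bytes'
  return 3
}

# Same ordering as the script above. The body is a subshell, so closing
# stdout here does not affect the calling shell.
run_reader() (
  fake_send
  rc=$?
  exec 1>&-                      # the consumer of the pipe sees EOF here
  echo "cleanup after EOF" >&2   # stderr is still open and usable
  exit "$rc"
)

# The command substitution plays the role of the process reading the pipe.
payload=$(run_reader 2>/dev/null)
status=$?
echo "payload=$payload status=$status"
```

The consumer receives only what was written before stdout was closed, and the exit code saved in rc still reaches the caller, which is the property the RC variable in the real script relies on.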

BTW: can someone point me to a good reference for this strange exec 1>&- syntax?
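
As far as I can tell, n>&- is the ordinary POSIX shell redirection for closing file descriptor n, and exec with only redirections and no command applies them to the current shell; both are described in the redirection sections of the POSIX Shell Command Language and of the sh and bash manuals. It has to be written without a space after the 1, otherwise exec would try to run a command named 1. A small demonstration follows; fd 3 is just a side channel chosen for this example so the result can still be reported after fd 1 is gone.

```shell
#!/bin/sh
# Everything runs inside a command substitution, so only the subshell's
# stdout is closed; the calling shell keeps its own stdout.
result=$(
  exec 3>&1                           # duplicate the capture pipe onto fd 3
  exec 1>&-                           # close fd 1 (stdout) from here on
  echo "never delivered" 2>/dev/null  # this write has nowhere to go
  echo "echo exit status: $?" >&3     # report via fd 3 instead
)
printf '%s\n' "$result"
```

The text written after the close never reaches the reader, and the shell normally reports a non-zero status for that echo because the write fails.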

While digging for the root cause, I found this snippet in the Bareos FD sources (bareos/core/src/plugins/filed/bpipe-fd.cc):

   case IO_READ:
      if (!p_ctx->pfd) {
        Jmsg(ctx, M_FATAL, "bpipe-fd: Logic error: NULL read FD\n");
        Dmsg(ctx, debuglevel, "bpipe-fd: Logic error: NULL read FD\n");
        return bRC_Error;
      }
      io->status = fread(io->buf, 1, io->count, p_ctx->pfd->rfd);
      if (io->status == 0 && ferror(p_ctx->pfd->rfd)) {
        io->io_errno = errno;
        Jmsg(ctx, M_FATAL, "bpipe-fd: Pipe read error: ERR=%s\n",
             strerror(io->io_errno));
        Dmsg(ctx, debuglevel, "bpipe-fd: Pipe read error: ERR=%s\n",
             strerror(io->io_errno));
        return bRC_Error;
      }
      break;

Perhaps a feof() check is missing? Just a guess...

NorbertM