Count number of bytes piped from one process to another

Question

I'm running a shell script that pipes data from one process to another

process_a | process_b

Does anyone know a way to find out how many bytes were passed between the two programs? The only solution I can think of at the moment would be to write a small C program that reads from stdin, writes to stdout and counts all of the data transferred, storing the count in an environment variable, like:

process_a | count_bytes | process_b

Does anyone have a neater solution?

score 33 · Answer 1 · answered Dec 18 '09 at 09:26

Use pv the pipe viewer. It's a great tool. Once you know about it you'll never know how you lived without it.

It can also show you a progress bar and the transfer speed.

Amandasaurus


In my searching I had come across this, but I need it to set a variable with the number of bytes transferred so that I can use it in another process. – Simon Hodgson Dec 18 '09 at 09:30
Usage example: `cat file | pv -b` will return the size of file. – rodorgas Sep 27 '18 at 23:16

Phil P · Accepted Answer · 2009-12-18T12:11:49.473

18

Pipe through dd. dd's default input is stdin and its default output is stdout; when it finishes copying, it reports on stderr how much data it transferred.

If you want to capture the output of dd and the other programs already talk to stderr, then use another file descriptor, e.g.:

$ exec 4>~/fred
$ input-command | dd 2>&4 | output-command
$ exec 4>&-
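
To end up with the count in a shell variable, dd's report can be parsed afterwards. This is only a minimal sketch: it assumes dd's summary line starts with the byte count followed by the word "bytes" (GNU and BSD dd do this in the C locale, but the format is not standardized), and the printf and cat commands and the temporary file are hypothetical stand-ins for input-command, output-command and ~/fred.

```shell
#!/bin/sh
# Hypothetical scratch file that receives dd's statistics (stderr).
stats=$(mktemp)

# printf stands in for input-command, cat for output-command.
# LC_ALL=C keeps dd's report untranslated so it can be parsed.
printf 'hello world\n' | LC_ALL=C dd 2>"$stats" | cat >/dev/null

# Assumed summary format: "<count> bytes ..." on one line of the report.
count=$(awk '$2 == "bytes" { print $1; exit }' "$stats")
rm -f "$stats"

echo "$count"
```

Because the count lives in a file until the pipeline has finished, this also works when the pipeline stages run in subshells.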

answered Dec 18 '09 at 12:04


Couldn't you skip the `exec` and just output to the file directly? `input-command | dd 2>~/fred | output-command` – Dennis Williamson Dec 18 '09 at 15:34
Uh, yes. I was apparently having one of "those" moments, sorry. – Phil P Dec 21 '09 at 02:28

score 9 · Answer 3 · edited Mar 20 '17 at 10:16

process_a | tee >(process_b) | wc --bytes might work. You can then redirect wc's count to wherever you need it. If process_b writes anything to stdout/stderr you will probably need to redirect that somewhere, if only to /dev/null.

For a slightly contrived example:

filestore:~# cat document.odt | tee >(dd of=/dev/null 2>/dev/null) | wc --bytes
4295

By way of explanation: tee lets you direct output to multiple files (plus stdout), and the >() construct is bash's "process substitution", which in this case makes a process look like a write-only file, so you can redirect to processes as well as files. The same approach lets tee send output to many processes at once.

answered Dec 18 '09 at 10:22

David Spillett


I like this solution. Sadly the shell I'm using (BusyBox) doesn't appear to support the >() notation, but it does provide a way of doing what I'm after. – Simon Hodgson Dec 21 '09 at 09:04
Aye, you need a pretty complete bash to have that feature - it is the sort of thing that isn't commonly used so gets stripped out of cut-down shells (even those with a target of being more-or-less bash compatible) like busybox in order to save space. – David Spillett Dec 21 '09 at 12:43
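
For shells without >(), the same tee split can be approximated with a named pipe. This is a sketch under assumptions rather than a tested BusyBox recipe: it assumes mkfifo, mktemp -d, tee and wc are available, and the process_a and process_b functions are hypothetical stand-ins for the real commands.

```shell
#!/bin/sh
# Hypothetical stand-ins for the real producer and consumer.
process_a() { printf 'hello world\n'; }
process_b() { cat >/dev/null; }

# Private temporary directory holding the named pipe.
dir=$(mktemp -d)
mkfifo "$dir/pipe"

# The consumer reads from the FIFO in the background.
process_b <"$dir/pipe" &

# tee copies the stream into the FIFO and on to wc, which counts it.
# tr strips the padding that some wc implementations print.
count=$(process_a | tee "$dir/pipe" | wc -c | tr -d ' ')

wait                 # let process_b finish before cleaning up
rm -rf "$dir"

echo "$count"
```

The count ends up in an ordinary variable of the calling shell, which is what the question asks for.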

Claudio · Answer 4 · 2016-03-20T13:39:54.377

I know I'm late to the party, but I believe I have a good answer which can enhance this useful thread.
This is a mix of @Phil P's and @David Spillett's answers, but:

unlike @Phil P's, it avoids creating a new file
unlike @David Spillett's, it maintains the pipeline structure

The byte count is printed to stdout, along with any output of process_b.
You can use a prefix to identify the line containing the byte count when working with the output (Bytes: in the example).

exec 3>&1
process_a | tee >({ echo -n 'Bytes:'; wc -c; } >&3) | process_b
exec 3>&-

WARNING:
Do not rely on the order of the lines in the output
The order is unpredictable and it can always differ, even when calling the same script with the same parameters!

Sadly, it is still a bash-only construct... – Mikhail T. Jun 26 '17 at 18:43
