How do I count the bytes outputted by another program in Bash/Linux?

2

Say I have a program that writes to a bunch of places on the filesystem. It runs from a single executable. I want to determine, at any point in its run (it runs for a long time), how many bytes it has written to the disk.

Most people seem to like tools like pv for this task, but it won't work in my case, because the executable in question writes to many different places on the filesystem. If I were to write my_exec | pv <whatever> | cat or whatnot, my_exec would just write out a big blob of data, without sorting it into folders as it should.

Similarly, stuff like iotop isn't what I'm after, as I would like to be able to attach/detach a "watcher" to my IO heavy process.

I'm aware the question seems confusing; perhaps an example would help. What I'd like to do is something like this:

my_exec &
exec_pid=$!   # $! is the PID of the most recent background job

mystery_command ${exec_pid} # continuously writes out the number of bytes  
                            # written to disk by my_exec since the invocation  
                            # of mystery_command

Or, alternatively, something that wraps/watches another arbitrary command, like this:

{ my_exec; } | mystery_command # my_exec will still write to folders as it
                               # should, but mystery_command will continuously
                               # output the number of bytes written to disk by
                               # the attached {} group.

Zac B

Posted 2012-08-29T22:43:49.003

Reputation: 2 653

Answers

1

If your my_exec program doesn't write to the screen or to any log files (output to stdout, stderr, etc. would skew the wchar count), why not just look directly at Linux's own wchar counter for the process:

grep wchar /proc/${exec_pid}/io

Again, wchar includes ALL bytes the process writes to any file, and everything is a file in UNIX, including /dev/null. But if the program is silent except for its data files, you'd get an accurate, or very nearly accurate (±1 byte), count.

If the program does write to the screen or to log files in addition to its data files, then it's going to be hard to separate the data-file bytes from the rest without adding counters directly to my_exec's code (which is what I'd do if I had the my_exec source anyway; funnelling all I/O through a single counter would be the minimum).
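Worth knowing: the same /proc file has more than one write counter. wchar is the sum of bytes the process handed to write-style system calls (terminal output included, whether or not anything physically hit the disk), while write_bytes is the kernel's attempt to count bytes the process caused to be sent to the storage layer. Here is a small sketch for pulling one field out of that file. The function name io_field is made up for illustration, and exec_pid is assumed to hold the PID of my_exec.

```shell
#!/bin/bash
# io_field FILE FIELD
# Print the numeric value of FIELD (e.g. wchar, write_bytes) from a file
# in /proc/<pid>/io format, where each line looks like "name: value".
io_field() {
    awk -v key="$2:" '$1 == key { print $2 }' "$1"
}

# Example calls (exec_pid is an assumed variable holding the PID of my_exec):
#   io_field "/proc/${exec_pid}/io" wchar         # bytes passed to write calls
#   io_field "/proc/${exec_pid}/io" write_bytes   # bytes sent to the storage layer
```

Matching on the whole field name keeps write_bytes from also matching cancelled_write_bytes, which a plain grep write_bytes would do.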

The wchar count gives you the total since the process started. Calculating the amount since the last check is a matter of storing the last seen value in a temp file or variable, some simple arithmetic, etc. A quick and dirty bash script with no error checking, conciseness, fancy params, etc.:

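#!/bin/bash
# One param: PID of the running process to watch.

COUNTFILE="/tmp/counter"   # last total seen; persists between runs
WAIT=2                     # seconds between samples

if [ -r "$COUNTFILE" ]; then
        LCOUNT=$(cat "$COUNTFILE")
else
        LCOUNT=0
fi

cd "/proc/${1:?PID required}" || exit 1
# Keep sampling until the io file can no longer be read (process exited).
while TCOUNT=$(awk '$1 == "wchar:" { print $2 }' io 2>/dev/null) && [ -n "$TCOUNT" ]; do
        MYSTAMP=$(date)
        NCOUNT=$((TCOUNT - LCOUNT))
        # Pass the timestamp as an argument, not as part of the format string.
        printf '%s: %9s bytes total, %9s bytes new\n' "$MYSTAMP" "$TCOUNT" "$NCOUNT"
        LCOUNT=$TCOUNT
        echo "$LCOUNT" >"$COUNTFILE"
        sleep "$WAIT"
done

# don't remove count file
exit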
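
The script above picks up from whatever total was last saved in the count file. If you'd rather see bytes written since the watcher itself was started (which is closer to what the question asks for), you could take a baseline at startup and skip the temp file. A rough sketch along those lines; the function name watch_written and its arguments are invented for this example, not part of any tool.

```shell
#!/bin/bash
# watch_written IO_FILE [INTERVAL] [ITERATIONS]
# Prints how much wchar has grown since this function was called.
# IO_FILE is a path such as /proc/<pid>/io. ITERATIONS=0 (the default)
# means keep going until the file can no longer be read.
watch_written() {
    local io_file=$1 interval=${2:-2} iterations=${3:-0}
    local baseline current i=0
    # Baseline: the counter value at the moment the watcher starts.
    baseline=$(awk '$1 == "wchar:" { print $2 }' "$io_file" 2>/dev/null) || return 1
    [ -n "$baseline" ] || return 1
    while [ "$iterations" -eq 0 ] || [ "$i" -lt "$iterations" ]; do
        current=$(awk '$1 == "wchar:" { print $2 }' "$io_file" 2>/dev/null)
        [ -n "$current" ] || break    # file unreadable: the process has gone
        printf '%s bytes written since start of watch\n' "$((current - baseline))"
        i=$((i + 1))
        sleep "$interval"
    done
}

# Example call (exec_pid is an assumed variable holding the PID of my_exec):
#   watch_written "/proc/${exec_pid}/io" 2
```

Stopping the watcher with Ctrl-C and starting it again later just resets the baseline, so it can be attached and detached without touching my_exec.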

Hope that helps.

zenfridge

Posted 2012-08-29T22:43:49.003

Reputation: 13

0

Have you tried strace? To attach it to an already running process, just enter

strace -p $PID
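
To turn that into a byte count, one option is to trace only write calls into a log file and add up their return values, since a successful write returns the number of bytes written. A sketch under some assumptions: the function name sum_written and the log path /tmp/my_exec.trace are made up here, and only plain write calls are traced, so programs that use pwrite64, writev, and so on would need those names added to the -e trace= list. Note too that strace slows the traced process down, and that this counts writes to every file descriptor, stdout and stderr included.

```shell
#!/bin/bash
# sum_written: read strace output on stdin and print the total of the
# return values of successful calls (lines ending in "= <number>").
# Failed calls end in an errno description and are skipped, as are
# signal and exit notices.
sum_written() {
    awk 'NF >= 2 && $(NF-1) == "=" && $NF ~ /^[0-9]+$/ { total += $NF }
         END { print total + 0 }'
}

# Example (PID is assumed to hold the PID of my_exec; the log path is hypothetical):
#   strace -f -e trace=write -o /tmp/my_exec.trace -p "$PID"   # Ctrl-C to detach
#   sum_written < /tmp/my_exec.trace
```

Writing the trace to a file with -o means the sum can be taken at any time from another terminal while strace is still attached.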

choroba

Posted 2012-08-29T22:43:49.003

Reputation: 14 741