0

When wanting to quickly take a look at the distribution of a sequence of values, the stem-and-leaf plot is an incredibly simple, yet powerful tool. It takes a few minutes to teach the computer to draw it, or you can trivially do it by hand.

The only issue is that it does not preserve the order of values, which sometimes contains useful information. I have been trying to think of an equally simple way to plot a timeseries with preserved order, but failing to come up with something.

The obvious solution of doing a regular timeseries chart with time on the X axis and values on the Y axis suffers from the problem that it needs quite a bit of preparatory work before getting into the actual rendering. It is nowhere near as conceptually simple as the stem-and-leaf plot.

Is there something that is? Or is what I'm asking for impossible?

Oh, and an important requirement I almost forgot: I would like this to be easily printable on a line-buffered terminal...


The reason I ask here is because my primary use case for this are health metrics and other samples from servers. When ruling out causes for a malfunctioning system, it'd be nice to quickly get intuition for how some subsystem has behaved over time.

kqr
  • 91
  • 1
  • 8

1 Answers1

0

I'm not sure this is entirely what I'm looking for, since it is a bit complex, but it has satisfied my needs so far:

I started working with temperatures, which were all conveniently in the 25–60 °C range, so I could simply replicate a string of * to create a sort of bar plot of them:

$ cat temps2.txt | perl -pe 'print "*" x $_ . " ";'
******************************************** 44.0
*************************************************** 51.0
******************************************* 43.0
********************************************* 45.0
************************************** 38.0
**************************************** 40.0
*********************************** 35.0
************************************ 36.0
******************************** 32.0
******************************** 32.0
******************************* 31.0
******************************* 31.0
******************************** 32.0
******************************** 32.0
******************************* 31.0
************************************ 36.0
******************************** 32.0
************************************ 36.0
******************************* 31.0
*********************************** 35.0
************************************ 36.0
************************************ 36.0
********************************* 33.0
******************************* 31.0
******************************** 32.0
******************************* 31.0
********************************* 33.0
******************************** 32.0
******************************** 32.0
************************************ 36.0

Of course, this only works with values that are in a convenient range, but it is stupidly effective when they are – and when they are not, one can just add some arithmetic manipulations to the $_ variable indicating the number of repetitions.

For example, the average processor run queue length each second (which for me is in the range 0–8) can be multiplied by 10 to get shifts visible in the output:

$ cat runq.txt | perl -pe 'print "*" x ($_ * 10) . " ";'
 0
 0
 0
 0
******************** 2
********** 1
******************** 2
**************************************** 4
****************************** 3
**************************************** 4
****************************** 3
****************************** 3
******************** 2
******************** 2
********** 1
********** 1
********** 1
************************************************************ 6
********** 1
********** 1
********** 1
 0
 0

This would absolutely have satisfied my needs.


Of course, me being me, I took this way overboard and created a big script that includes automatic calculation and updating of the coordinate transformations, and also streaming computation of system average and natural process limits:

$ cat temps2.txt | ./limits.pl
----------------------------------------------------------------
X:    51.0           [            | *         ]
X:    43.0         [           * |            ]
X:    45.0            [         |          ]
X:    38.0          [      *   |          ]
X:    40.0           [      *  |        ]
X:    35.0           [   *    |        ]
X:    36.0           [    *  |        ]
X:    32.0           [ *     |       ]
X:    32.0           [ *    |      ]
X:    31.0           [*     |     ]
X:    31.0           [*    |     ]
X:    32.0           [ *   |     ]
X:    32.0           [ *   |    ]
X:    31.0           [*   |    ]
X:    36.0           [    *     ]
X:    32.0           [ *  |     ]
X:    36.0          [     |     ]
X:    31.0          [ *   |     ]
X:    35.0          [    *|     ]
X:    36.0          [     |    ]
X:    36.0          [     *    ]
X:    33.0          [   * |    ]
X:    31.0          [ *   |    ]
X:    32.0          [  * |     ]
X:    31.0          [ *  |    ]
X:    33.0          [   *|    ]
X:    32.0          [  * |    ]
X:    32.0          [  * |    ]
X:    36.0          [    |*   ]
UPL=42.1
Xbar=35.2
LPL=28.2

Un-clean source code to this script attached as well. This is the first draft so please excuse bad code.

#!/usr/bin/env perl

use v5.26;
use strict;
use warnings;
use List::Util qw( min max );

my $max_width = 52;

my $n = 0;
my $xbar = 0;
my $mrbar = 0;
my $lpl;
my $upl;

sub print_values {
    print "\n";
    printf "UPL=%.1f\n", $upl;
    printf "Xbar=%.1f\n", $xbar;
    printf "LPL=%.1f\n", $lpl;
}

$SIG{INT} = \&print_values;

my $min_y;
my $max_y;

my $xprev;
while (my $x = <>) {
    $n++;
    $xbar *= $n - 1;
    $xbar += $x;
    $xbar /= $n;

    if (defined($xprev)) {
        my $mr = abs ($x - $xprev);
        $mrbar *= $n - 2;
        $mrbar += $mr;
        $mrbar /= $n - 1;

        $lpl = $xbar - $mrbar * 2.66;
        $upl = $xbar + $mrbar * 2.66;

        my $space_changed;

        # If any point is about to be drawn outside of the screen space, expand
        # the space to include the currently drawn points and then some.
        if (min($lpl, $x) < $min_y or max($upl, $x) > $max_y) {
            my $min_diff = abs($min_y - min($lpl, $x));
            my $max_diff = abs($max_y - max($upl, $x));
            # Change min and max values in slightly larger steps to avoid
            # changing the space too often with a drifting process.
            $min_y -= $min_diff * 2;
            $max_y += $max_diff * 2;
            $space_changed = 1;
        }
        if ($min_y == $max_y) {
            $max_y = $min_y + 1;
        }

        my %screen_coords;
        $screen_coords{lpl} = $lpl;
        $screen_coords{upl} = $upl;
        $screen_coords{xbar} = $xbar;
        $screen_coords{x} = $x;

        # Transform the recorded values to the screen space.
        for my $coord (keys %screen_coords) {
            # Set offset to 0.
            $screen_coords{$coord} -= $min_y;
            # Divide by range to scale down to 0–1.
            $screen_coords{$coord} /= ($max_y - $min_y);
            # Scale up again to proper width.
            $screen_coords{$coord} *= ($max_width - 1);
        }

        # Render the recorded values into an array of characters.
        my @characters = split('', ' ' x $max_width);
        $characters[$screen_coords{xbar}] = '|';
        $characters[$screen_coords{lpl}] = '[';
        $characters[$screen_coords{upl}] = ']';
        $characters[$screen_coords{x}] = '*';

        # Print a separator whenever the space needs to be expanded.
        if ($space_changed) {
            printf ('-' x ($max_width + 12) . "\n");
        }

        printf "X: %7.1f %s\n", $x, join('', @characters);
    } else {
        $min_y = $x;
        $max_y = $x;
    }

    $xprev = $x;
}

print_values;
kqr
  • 91
  • 1
  • 8