Is using cat -v an appropriate way to sanitize untrusted text?

Question

Terminals tend to trust whatever is written to them via stdout/stderr, which makes printing attacker-controlled data risky. Is cat -v an effective way to sanitize untrusted data before it reaches the terminal? My threat model assumes that my terminal emulator, terminal multiplexer, or the Linux VT subsystem may be vulnerable to arbitrary code execution if made to print malicious data. Attacker-controlled data can come from foreign sources, such as a website, or from a less-privileged user, for example when root views a file that is owned and writable by a less-privileged, potentially compromised user.

Some example uses:

cat -v /home/lesseruser/evil.txt
curl -I example.com | cat -v
curl -s example.com/evil.txt | cat -v
telnet 198.51.100.98 23 | cat -v

From section 3.1 of the coreutils GNU Info page:

-v
--show-nonprinting
    Display control characters except for LFD and TAB using ^ notation
    and precede characters that have the high bit set with M-.

From the source code for src/cat.c, in coreutils version 8.28:

if (show_nonprinting)
  {
    while (true)
      {
        if (ch >= 32)
          {
            if (ch < 127)
              *bpout++ = ch;
            else if (ch == 127)
              {
                *bpout++ = '^';
                *bpout++ = '?';
              }
            else
              {
                *bpout++ = 'M';
                *bpout++ = '-';
                if (ch >= 128 + 32)
                  {
                    if (ch < 128 + 127)
                      *bpout++ = ch - 128;
                    else
                      {
                        *bpout++ = '^';
                        *bpout++ = '?';
                      }
                  }
                else
                  {
                    *bpout++ = '^';
                    *bpout++ = ch - 128 + 64;
                  }
              }
          }
        else if (ch == '\t' && !show_tabs)
          *bpout++ = '\t';
        else if (ch == '\n')
          {
            newlines = -1;
            break;
          }
        else
          {
            *bpout++ = '^';
            *bpout++ = ch + 64;
          }

        ch = *bpin++;
      }
  }
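
The branches of the loop above can be exercised from the shell. This is only a minimal sketch, assuming GNU cat and a printf that accepts octal escapes; `demo_bytes` is a hypothetical helper name, not part of coreutils:

```shell
# Hypothetical helper: emit one byte for each branch of the loop above.
demo_bytes() {
    # ESC [ 3 1 m (a colour sequence), DEL (0x7f), 8-bit CSI (0x9b),
    # 0xe9, 0xff, then a TAB, the word "end", and a newline.
    printf '\033[31m\177\233\351\377\tend\n'
}

# Every control or high-bit byte should come out in ^ / M- notation;
# TAB and newline pass through unchanged (no -T or -E given).
demo_bytes | cat -v
```

Following the source, this should print `^[[31m^?M-^[M-iM-^?`, then a literal tab and `end`. Note that the 8-bit CSI byte 0x9b is rewritten to `M-^[` rather than passed through.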

Is using cat -v to sanitize untrusted input before printing useful for my threat model?

score 2 · Accepted Answer · answered Dec 21 '17 at 06:39

It seems like a reasonable belt-and-suspenders security measure, though you'll probably get better answers over at unix.stackexchange.com. That said, with my tinfoil (and highly imaginative) hat on, here is what could still go wrong:

There could be a vulnerability in cat you're not aware of, either in the code you cited (presumably ch is an unsigned 8-bit type, but what if it isn't and ch is negative?) or in some other section, for example where it loads the file.
Your version of cat could be compromised to include a vulnerability.
There could be a vulnerability in the filesystem, in the other tools, or in the shell's pipeline that is exposed by loading the file.
There could be a vulnerability in your shell when passing the filename to cat (perhaps the filename is crafted to expose the bug).
You, or the user of whatever script or tool you are providing, could (through forgetfulness or trickery) be using a non-ASCII charset that includes control characters in the {32..126} space.
There could still be a vulnerability in the shell when printing the output (I'm imagining some really weird bug like 2^n consecutive tab characters overflowing something).
There could be a vulnerability in one of the shared libraries or interpreter that your, or a future, version of cat loads.
There could be a vulnerability in the method you use to invoke cat (for example if you write a shell script that examines the file first).
The file could contain wrong information, written in a convincing way so you believe it.
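
The charset item in the list can be made concrete: cat -v works on bytes, so what it restricts is the set of bytes in its output, not how the terminal renders them. A rough sketch, assuming GNU cat and tr; `unsafe_byte_count` is a hypothetical helper that counts bytes outside TAB, LF, and 0x20 through 0x7E:

```shell
# Hypothetical helper: count bytes on stdin that are NOT tab, newline,
# or printable ASCII (octal 040-176). LC_ALL=C makes tr work on raw bytes.
unsafe_byte_count() {
    LC_ALL=C tr -d '\011\012\040-\176' | wc -c | tr -d ' '
}

# Raw input: ESC [ 2 J, an 8-bit CSI (0x9b), and UTF-8 "é" (0xc3 0xa9).
printf '\033[2J\233\303\251\n' | unsafe_byte_count

# The same input after cat -v.
printf '\033[2J\233\303\251\n' | cat -v | unsafe_byte_count
```

The first count should be 4 and the second 0. The price is that legitimate UTF-8 is mangled as well (`é` becomes `M-CM-)`), and the result only helps if the terminal treats bytes 0x20 through 0x7E as plain printable characters, which is exactly the assumption the charset item questions.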

`ch` is indeed `unsigned char`. The rest of the risks you mentioned seem to be out-of-scope (like compromising `cat` itself, or a vulnerability in another command before or after it), though they are certainly true and can apply to some threat models. As for the `strings` bug you linked, that was due to it being linked with `libbfd`, which hasn't been the case for a long time. I get your point though. — forest, Dec 21 '17 at 06:54

score 0 · Answer 2 · answered Dec 31 '17 at 13:36

Sending the intr, susp, eof, or quit characters (by default ^C, ^Z, ^D, and ^\; see "stty -a") will break out of "cat" and most other things at the shell.

This example sends ^Z (susp):

<pre style='background-color: #EEEEEE'>
# List mount points.
mount
<span style='font-size: 0px'>
&#26;
printf "\e[0;1;31mYOU HAVE BEEN OWNED.\e[0m\n"
</span>
# List USB devices.
lsusb
</pre>

David A

I don't see how that would work with `cat -v`, since those are all non-printing characters. – forest Dec 31 '17 at 14:22

Ahh... if you make sure you save it as a file first, cat will work fine (as will hexdump, less, and xxd). If you paste it *into* cat at the terminal, it will break out of it. – David A Dec 31 '17 at 14:55
I don't think less will always work well, due to the ubiquity of lesspipe, LESSOPEN, and LESSCLOSE. – forest Dec 31 '17 at 23:22
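
The file-versus-paste distinction drawn in these comments can be checked without a terminal. A small sketch, assuming GNU cat; the script text mirrors the example in the answer, with the payload line shortened, and `payload` is a hypothetical helper name. When the bytes arrive through a pipe or a file, the tty line discipline is not involved, so the ^Z byte (octal 032) is ordinary data:

```shell
# Hypothetical helper: the pasted script from the answer, with the hidden
# ^Z byte (octal 032) on its own line between the visible commands.
payload() {
    printf '# List mount points.\nmount\n\032\nprintf "owned"\n# List USB devices.\nlsusb\n'
}

# Through a pipe, cat -v sees the byte as data and renders it as ^Z,
# which also makes the hidden lines visible.
payload | cat -v
```

The third line of the output should be the two characters `^Z`, followed by the previously hidden printf line. Pasting the same text into a terminal is a different path: there the byte is keyboard input and the susp handling described in the answer applies.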
