Detect if your program has been mutated

16

0

Write a program that terminates without an error.

If any single byte is substituted by any other byte the program should output

CORRUPTED
  • Do not read your source code from a file
  • Your program should not produce any other output

This is so the shortest answer in bytes wins.

Edit: removed "NOT CORRUPTED" requirement

noɥʇʎԀʎzɐɹƆ

Posted 2017-01-22T01:59:34.580

Reputation: 1 316

[tag:radiation-hardening] has a host of similar questions, but I haven't found any that are quite like this one. – FryAmTheEggman – 2017-01-22T02:13:01.827

7To the downvoters: I suspect this is possible (if very difficult), if you pick the right language. Please don't close or delete the question unless you think there's something wrong with it other than being impossible. – None – 2017-01-22T02:20:52.983

7What does changed mean? Substituted by another byte? – Dennis – 2017-01-22T03:18:01.717

4@ais523 FWIW I downvoted the challenge because it seems hastily written, not because I think it's too difficult. – Dennis – 2017-01-22T05:07:26.943

@Dennis sandboxed! – noɥʇʎԀʎzɐɹƆ – 2017-01-22T14:56:53.377

Ignoring for a moment that the challenge you sandboxed is substantially different from the challenge you posted here, sandbox or no sandbox makes no difference. I'm sure you could find ways to make the challenge clearer by writing more than three sentences, just as you could have made your comment clearer by writing more than one word. – Dennis – 2017-01-22T17:06:07.940

@Dennis what is unclear? – noɥʇʎԀʎzɐɹƆ – 2017-01-22T17:53:38.980

5It's not that anything is unclear, but it could be made clearer. You could clarify if a full program is required, add an example program and illustrate all possible modifications, mention how single-byte substitutions would affect a UTF-8 encoded file, add a script that can be used to test submissions, mention that the program should not take input, etc. – Dennis – 2017-01-22T18:03:14.783

Answers

30

A Pear Tree, 76 bytes

$@='NOT ';print"$@CORRUPTED"__DATA__ =®®”print"$@CORRUPTED"__DATA__ =®®”Ê®›~

This program contains a few stray octets that aren't valid UTF-8. As such, it's shown as it looks in Windows-1252. (By default, if A Pear Tree sees a non-ASCII octet in a string literal or the like it treats it as an opaque object and doesn't try to understand it beyond being aware of what its character code is; this behaviour can be changed via an encoding declaration but the program doesn't have one. So the program is logically in "unspecified ASCII-compatible character set". All the non-ASCII octets are in comments anyway, so it doesn't really matter.)

Explanation

A Pear Tree checksums the program, looking for the longest substring that has a CRC-32 of 00000000. (If there's a tie, it picks the octetbetically first.) Then the program gets rotated to put it at the start. Finally, the program gets interpreted as a language that's almost a superset of Perl, defining a few things that are undefined in Perl to work the same way as in Python (and with a few minor changes, e.g. print prints a final newline in A Pear Tree but not in Perl). This mechanism (and the language as a whole) was designed for and problems; this isn't the former, but it's definitely the latter.

In this program, we have two notable substrings that CRC-32 to 00000000; the entire program does, and so does print"$@CORRUPTED"__DATA__ =®® by itself (which appears twice). As such, if the program is uncorrupted, it'll set $@ to NOT and then print it followed by CORRUPTED. If the program is corrupted, then the CRC-32 of the program as a whole will fail to match, but one of the shorter sections will remain uncorrupted. Whichever one is rotated to the start of the program will just print CORRUPTED, as $@ will be the null string.

Once the string's been printed, __DATA__ is used to prevent the rest of the program running. (It crosses my mind writing this that __END__ could be used instead, which would clearly save two bytes. But I may as well post this version now, because I've spent a bunch of time verifying it, and a modified version would have to be re-verified due to the CRC changes; and I haven't put a huge amount of effort into golfing the "payload" yet, so I want to see if anyone has other improvements in the comments that I can incorporate at the same time. Note that # doesn't work in the situation where a character is corrupted into a newline.)

You might be wondering how I managed to control the CRC-32 of my code in the first place. This is a fairly simple mathematical trick based on the way that CRC-32 is defined: you take the CRC-32 of the code, write it in little-endian order (the reverse of the byte order that's normally used by CRC-32 calculation programs), and XOR with 9D 0A D9 6D. Then you append that to the program, and you'll have a program with a CRC-32 of 0. (As the simplest example possible, the null string has a CRC-32 of 0, thus 9D 0A D9 6D also has a CRC-32 of 0.)

Verification

A Pear Tree can handle most sorts of mutations, but I'm assuming "changed" means "replaced with an arbitrary octet". It's theoretically possible (although unlikely in a program this short) that there could be a hash collision somewhere that lead to the wrong program running, so I had to check via brute force that all possible octet substitutions would leave the program working correctly. Here's the verification script (written in Perl) that I used:

use 5.010;
use IPC::Run qw/run/;
use warnings;
use strict;
undef $/;
$| = 1;
my $program = <>;
for my $x (0 .. (length $program - 1)) {
    for my $a (0 .. 255) {
        print "$x $a    \r";
        my $p = $program;
        substr $p, $x, 1, chr $a;
        $p eq $program and next;
        alarm 4;
        run [$^X, '-M5.010', 'apeartree.pl'], '<', \$p, '>', \my $out, '2>', \my $err;
        if ($out ne "CORRUPTED\n") {
            print "Failed mutating $x to $a\n";
            print "Output: {{{\n$out}}}\n";
            print "Errors: {{{\n$err}}}\n";
            exit;
        }
    }
}

say "All OK!    ";

user62131

Posted 2017-01-22T01:59:34.580

Reputation:

An n-bit CRC will detect any single error burst not longer than n bits. Hash collisions are impossible in the given case, there's no need for brute-force verification. – Rainer P. – 2017-01-22T15:08:28.033

@RainerP.: I'm aware that a mutation will prevent the CRC for the portions which originally have a CRC of 0 matching. However, there's the possibility that it could introduce a new substring of the code which has a CRC of 0; the purpose of the brute-forcing is to guarantee that that didn't happen. – None – 2017-01-22T16:45:43.447