8086 machine code (MS-DOS .COM), score 26 402
The file is 52 947 bytes long. The length was a product of what produced benign instruction sleds, and the maximum file size for .COM files (65 280 bytes -- 65 264 in case an interrupt happens before you can stop it).
Here in binary form, with repeating lines snipped out for obvious reasons:
00000000 : E9 67 67 67 67 67 67 67 67 67 67 67 67 67 67 67 : .ggggggggggggggg
00000010 : 67 67 67 67 67 67 67 67 67 67 67 67 67 67 67 67 : gggggggggggggggg
<...>
00006720 : 67 67 67 67 67 90 D1 EE AD 93 AC E8 00 00 5A 2C : ggggg.........Z,
00006730 : 30 01 C2 3C 04 75 03 83 C2 02 D1 E0 D1 E0 01 C2 : 0..<.u..........
00006740 : 83 C2 1B B4 09 CD 21 CD 20 43 6F 64 65 24 47 6F : ......!. Code$Go
00006750 : 6C 66 24 61 6E 64 24 24 43 6F 64 69 6E 67 24 43 : lf$and$$Coding$C
00006760 : 68 61 6C 6C 65 6E 67 65 73 24 EB 9F 9F 9F 9F 9F : hallenges$......
00006770 : 9F 9F 9F 9F 9F 9F 9F 9F 9F 9F 9F 9F 9F 9F 9F 9F : ................
<...>
0000CE80 : 9F 9F 9F 9F 9F 9F 9F 9F 9F 9F 9F 9F 9F 9F 9F D1 : ................
0000CE90 : EE AD 93 AC E8 00 00 5A 2C 30 01 C2 3C 04 75 03 : .......Z,0..<.u.
0000CEA0 : 83 C2 02 D1 E0 D1 E0 01 C2 83 C2 1B B4 09 CD 21 : ...............!
0000CEB0 : CD 20 43 6F 64 65 24 47 6F 6C 66 24 61 6E 64 24 : . Code$Golf$and$
0000CEC0 : 24 43 6F 64 69 6E 67 24 43 68 61 6C 6C 65 6E 67 : $Coding$Challeng
0000CED0 : 65 73 24 : es$
In readable form:
org 0x100
cpu 8086
SLED_LEN equ (0x6767 - (part1_end - part1) + 2)
db 0xe9
times SLED_LEN db 0x67
part1:
nop
shr si, 1 ; SI = 0x80
lodsw ; SI = 0x82
xchg bx, ax ; AX = 0
lodsb ; AL = first character of command line argument
; Load DX with IP, since we only know our strings' relative position
call near .nextline
.nextline:
pop dx
sub al, '0'
add dx, ax
cmp al, 4
jne .skip
add dx, 2
.skip:
shl ax, 1
shl ax, 1
add dx, ax
add dx, 0x1b
mov ah, 0x09
int 0x21
int 0x20
p1_msg0 db "Code$"
p1_msg1 db "Golf$"
p1_msg2 db "and$$"
p1_msg3 db "Coding$"
p1_msg4 db "Challenges$"
part1_end:
part2:
db 0xeb
times SLED_LEN db 0x9f
shr si, 1 ; SI = 0x80
lodsw ; SI = 0x82
xchg bx, ax ; AX = 0
lodsb ; AL = first character of command line argument
; Load DX with IP, since we only know our strings' relative position
call near .nextline
.nextline:
pop dx
sub al, '0'
add dx, ax
cmp al, 4
jne .skip
add dx, 2
.skip:
shl ax, 1
shl ax, 1
add dx, ax
add dx, 0x1b
mov ah, 0x09
int 0x21
int 0x20
p2_msg0 db "Code$"
p2_msg1 db "Golf$"
p2_msg2 db "and$$"
p2_msg3 db "Coding$"
p2_msg4 db "Challenges$"
Rundown
This is a slightly adapted variation on a previous answer of mine.
The active part is duplicated so that there is always one untouched by radiation. We select the healthy version by way of jumps. Each jump is a near jump, and so is only three bytes long, where two last bytes are the displacement (i.e. distance to jump, with sign determining direction). This displacement is replicated over and over, forming a long NOP sled of sorts.
We can divide the code into four parts which could be irradiated: jump 1, code 1, jump 2, and code 2. The idea is to make sure a clean code part is always used. If one of the code parts is irradiated, the other must be chosen, but if one of the jumps is irradiated, both code parts will be clean, so it will not matter which one is chosen.
The reason for having two jump parts is to detect irradiation in the first part by jumping over it. If the first code part is irradiated, it means we will arrive off mark, somewhere in the second NOP sled. If we make sure that such a botched landing selects code 2, and a proper landing selects code 1, we're golden.
For both jumps, we duplicate the displacement word, making each jump part one opcode followed by a massive repetition of the displacement. This ensures that irradiation somewhere in the sled will still make the jump valid, as long as at least two bytes of the sled remains. Irradiation in the first byte will stop the jump from happening at all, since the following sled will form a completely different instruction.
Take the first jump:
E9 67 67 67 67 ... jmp +0x6767 / dw 0x6767 ...
If either of the 0x67
bytes are removed, it will still jump to the same place, as long as at least one word remains. If the 0xE9
byte is removed, we will instead end up with
67 67 67 67...
each of which is a address-size override prefix. The CPU will gladly chomp them all down, and then we fall through to code 1, which must be clean, since the damage was in jump 1.
If the jump is taken, we land at the second jump:
EB 9F 9F 9F 9F ... jmp -0x61 / db 0x9f
If this byte sequence is intact, and we land right on the mark, that means that code 1 was clean, and this instruction jumps back to that part. The replicated displacement byte guarantees this, even if it is one of these displacement bytes that were damaged.
The astute reader will notice that this instruction jumps too far back, but that's alright, since if we hit the jump, then part1's NOP sled must be wholly intact, making a slightly longer jump backwards okay.
If we either land off the mark (because of a damaged code 1 or jump 1) or the 0xEB
byte is the damaged one, the remaining bytes will also here be benign:
9F 9F 9F 9F ... lahf / lahf
Whichever the case, if we end up executing that sled of instructions, we know that either jump 1, code 1, or jump 2 were irradiated, which makes a fall-through to code 2 safe.
Testing
Testing proved problematic. I've done some tests on certain corner-cases and a few random ones, but I have to find a better way for exhaustive testing. It should not be a problem up to the above-mentioned score.
The following program was used to automatically create all versions of the .COM file for a certain gap size. It also creates a BAT file that can be run in the target environment, which runs each irradiated binary, and pipes their outputs to separate text files. Comparing the output files to validate is easy enough, but DOSBox does not have fc
, so it was not added to the BAT file.
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv)
{
FILE *fin, *fout, *fbat;
int fsize;
char *data;
int gapsize = 1;
if (!(fin = fopen(argv[1], "rb")))
{
fprintf(stderr, "Could not open input file \"%s\".\n", argv[1]);
exit(1);
}
if (!(fbat = fopen("tester.bat", "w")))
{
fprintf(stderr, "Could not create BAT test file.\n");
exit(2);
}
if (argc > 2)
{
gapsize = atoi(argv[2]);
}
fseek(fin, 0L, SEEK_END);
fsize = ftell(fin);
fseek(fin, 0L, SEEK_SET);
if (!(data = malloc(fsize)))
{
fprintf(stderr, "Could not allocate memory.\n");
exit(3);
}
fread(data, 1, fsize, fin);
fprintf(fbat, "@echo off\n");
for (int i = 0; i <= fsize - gapsize; i++)
{
char fname[512];
sprintf(fname, "%06d.com", i);
for (int j = 0; j < 5; j++)
fprintf(fbat, "%s %d >> %06d.txt\n", fname, j, i);
fout = fopen(fname, "wb");
fwrite(data, 1, i, fout);
fwrite(data + i + gapsize, 1, fsize - i - gapsize, fout);
fclose(fout);
}
free(data);
fclose(fin);
fclose(fbat);
}
Must we write a full program, or is a function acceptable? – Robin Ryder – 2019-11-22T19:56:13.970
@RobinRyder As per default, either a full program or function is acceptable – caird coinheringaahing – 2019-11-22T19:59:07.027
It's worth an ask, but would the downvoter mind explaining their vote? I can only improve, and any and all feedback would help me do that. – caird coinheringaahing – 2019-11-23T02:28:32.937
1I haven't downvoted, but are you sure the scoring system can't be abused for an infinite score? (in Lenguage, a + is, as far as I remember, decremented into ], so it won't work, but most other length encodings work; I'm trying to produce an arbitrarily extendable Befunge solution now) – my pronoun is monicareinstate – 2019-11-23T02:33:28.437