Lenguage (with stty +brkint -ignbrk), score 255, 3890951 bytes
The program consists of 3890951 NUL bytes (making this one of the shortest Lenguage programs ever; 3890951 bytes is easily small enough to fit on my disk, so I actually ran this in a Lenguage interpreter). The OP wanted the Lenguage/Unary solution to be golfed, so here we go. (Note that Unary would be much longer because it requires the use of 0 rather than allowing the use of NUL.)
Note that Lenguage does not, despite what its documentation implies, act like brainfuck does; I/O works entirely differently (something that I noticed when testing this program). In particular, Lenguage refuses to take input from anything other than a terminal, so the bytes that are being filtered out are the raw bytes sent over the terminal connection (note also that as it's filtering out the raw bytes, you won't see the keys you type at all). In practice, this means that the program will absorb any sort of input sent to it except for the NUL byte (typically typed as Ctrl-@), which will be echoed literally (at which point, the vast majority of terminals will ignore it, as the NUL byte is the terminal equivalent of a NOP instruction). To verify that the program works, it's simplest to modify the Lenguage interpreter to output in decimal, which makes the echo actually visible.
What happens at EOF? Well, if the terminal is sending a series of bytes, there's no way to send EOF; all 256 possible bytes are interpreted literally, and there's nothing else you can insert into the terminal stream. However, if you happen to be using an old-fashioned serial terminal, you can press the "break" button on your terminal to purposely send misencoded data, allowing for a 257th possible code; this "break" is the only plausible equivalent of an EOF, as it's sent out-of-band and indicates something other than valid data. If your terminal configuration has the "interrupt-on-break" flag set (and the Lenguage interpreter does not, as far as I can tell, alter that setting), sending the break signal will cause the Lenguage interpreter to crash, which conveniently acts as a way to implement the desired EOF behaviour. I'm not sure whether this is a default setting (because nobody actually uses serial terminals nowadays, it basically never comes up), so I mentioned it in the header as part of the specification of the language interpreter being used.
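For reference, here is a minimal sketch of what the header's terminal setting amounts to, using Python's standard termios module rather than the stty command. The helper names are made up for illustration, and this only shows the flag manipulation (BRKINT set, IGNBRK cleared); it is not something the Lenguage interpreter itself does.

```python
import termios

def break_interrupt_iflag(iflag: int) -> int:
    """Return the input flags with BRKINT set and IGNBRK cleared."""
    return (iflag | termios.BRKINT) & ~termios.IGNBRK

def enable_break_interrupt(fd: int) -> None:
    """Apply the setting to the terminal open on fd (fd must be a tty)."""
    attrs = termios.tcgetattr(fd)
    attrs[0] = break_interrupt_iflag(attrs[0])  # index 0 holds the input flags
    termios.tcsetattr(fd, termios.TCSANOW, attrs)
```

With IGNBRK clear and BRKINT set, a break condition on the line is turned into an interrupt signal for the foreground process, which is what kills the interpreter here.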
Explanation
  1110110101111100000111
001                       Initialise tape element 0 to -1
   110               111  While tape element 0 is nonzero:
      110   111             While tape element 0 is nonzero:
         101                  Read a byte from the terminal into tape element 0
               100          Output tape element 0
                  000       Add 1 to tape element 0
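The breakdown above can be reproduced mechanically. The following is a rough sketch, assuming the usual Lenguage convention (program length written in binary, left-padded with zeroes to a multiple of three bits, each 3-bit group mapped to one brainfuck command); the function name is my own.

```python
# Assumed Lenguage command table: each 3-bit group is one brainfuck command.
COMMANDS = {
    "000": "+", "001": "-", "010": ">", "011": "<",
    "100": ".", "101": ",", "110": "[", "111": "]",
}

def lenguage_to_brainfuck(length: int) -> str:
    """Decode a Lenguage program, given only its length in bytes."""
    bits = format(length, "b")
    # Left-pad with zeroes to a whole number of 3-bit groups.
    bits = bits.zfill(-(-len(bits) // 3) * 3)
    return "".join(COMMANDS[bits[i:i + 3]] for i in range(0, len(bits), 3))

print(format(3890951, "b"))            # 1110110101111100000111
print(lenguage_to_brainfuck(3890951))  # -[[,].+]
```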
The inner loop will only exit when NUL is typed at the terminal; after this, we immediately echo the character typed (i.e. the NUL). Adding 1 at this point will ensure that tape element 0 is nonzero again, so the outer loop cannot exit at all (until a break input crashes the interpreter), and we'll fall back into the inner loop.
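To make the control flow concrete, here is a small simulation of the two loops in Python. It assumes 8-bit wrapping cells (so -1 is stored as 255), and it treats running out of input as a stand-in for the break that kills the interpreter; the real program has no such clean exit.

```python
def run_filter(terminal_bytes: bytes) -> bytes:
    """Simulate -[[,].+] on a finite stream of raw terminal bytes."""
    stream = iter(terminal_bytes)
    output = bytearray()
    cell = (0 - 1) % 256                # '-' : tape element 0 starts at -1
    try:
        while cell != 0:                # '[' : outer loop
            while cell != 0:            # '[' : inner loop
                cell = next(stream)     # ',' : read one raw byte
            output.append(cell)         # '.' : only reached when cell is NUL
            cell = (cell + 1) % 256     # '+' : nonzero again, outer loop continues
    except StopIteration:
        pass                            # models the crash on break
    return bytes(output)

print(run_filter(b"abc\x00def\x00\x00"))  # b'\x00\x00\x00'
```

Every non-NUL byte is swallowed by the inner loop; each NUL falls through to the output command exactly once.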
It's golfiest to use subtraction to enter the outer loop, but addition to continue looping around it; addition has a shorter encoding, but cannot appear at the start of the program (as the encoding would look like leading zeroes and thus be ignored).
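The size argument can be checked with a quick sketch that goes the other way, from brainfuck to Lenguage program length. It assumes the same 3-bit command table as the standard Lenguage description; the function name is hypothetical.

```python
# Assumed Lenguage command table, brainfuck command -> 3-bit group.
BITS = {"+": "000", "-": "001", ">": "010", "<": "011",
        ".": "100", ",": "101", "[": "110", "]": "111"}

def lenguage_length(brainfuck: str) -> int:
    """Length in bytes of the Lenguage program encoding this brainfuck."""
    return int("".join(BITS[c] for c in brainfuck), 2)

print(lenguage_length("-[[,].+]"))  # 3890951
print(lenguage_length("-[[,].-]"))  # 3890959
# A leading '+' contributes only leading zeroes, so it vanishes entirely:
print(lenguage_length("+[[,].+]") == lenguage_length("[[,].+]"))  # True
```

Using subtraction inside the loop costs 8 extra bytes, while a leading addition cannot be represented at all, since the length would be the same as that of the program without it.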
If the code accepts a Unicode string must we filter byte-by-byte - for example if the current highest voted entry (Japt by Luis felipe De jesus Munoz) entry is fed "Ŧ" (U+0166) should it yield "f" (U+0066)? Or is there a guarantee that we only receive the first 256 Unicode characters in the input? – Jonathan Allan – 2019-02-25T13:51:18.133
Is it OK if the function doesn't work if the string contains null bytes (\0)? As C uses null terminated strings, a function cannot know whether it's the end of the string or just a null byte. – wastl – 2019-02-25T15:18:01.330
@wastl By default you may assume you know the length of the input. Note that this is a bit controversial for other languages, however. – FryAmTheEggman – 2019-02-25T19:42:21.970
"Bytes mean octal bytes." Do you mean Octets? Octal is a way of writing numbers, three bits at a time so it doesn't always line up with byte boundaries. – Kevin – 2019-02-26T00:31:30.640
I suspect he means "octets" – Jasen – 2019-02-26T05:38:15.937
seems to be just begging for an answer in brainfuck – Jasen – 2019-02-26T05:41:49.000
I'm surprised no one has come up with a brainfuck solution yet... – cmaster - reinstate monica – 2019-03-03T00:16:30.753
@cmaster For people who cannot see deleted posts: There are a couple of brainfuck attempts, both failing because they cannot distinguish a NUL byte from EOF. – Ørjan Johansen – 2019-03-05T01:11:47.990
@ØrjanJohansen I've added a rule that can let you ignore a byte (for EOF purposes). This shouldn't be too abusable, since it also decreases your max score, and it makes the question more inclusive. – Jo King – 2019-03-05T05:06:48.740
Possible dupe. – Shaggy – 2019-03-23T14:43:55.773