15
1
Challenge
Write a program that compresses and decompresses ASCII text losslessly. It should be specialized to work well with palindromes, including case-insensitive and punctuation-insensitive palindromes. The best compression with the smallest source wins.
Scoring
total_bytes_saved / sqrt(program_size)
- Highest score wins
total_bytes_saved
is how many bytes smaller the compressed strings are than the originals, total across the test cases below. program_size
is the size in bytes of the source code of both the compression and decompression programs. Code shared between the two need only be counted once.
For instance, if there are 10 test cases and a 100 byte program saved 5 bytes on 7 test cases, 10 each on 2 of them, but the last test case was 2 bytes longer, the solution would score 5.3. ((7 * 5 + 10 * 2 - 2) / sqrt(100) = 5.3
)
Test Cases
tacocat
toohottohoot
todderasesareddot
amanaplanacanalpanama
wasitacaroracatisaw?
Bob
IManAmRegalAGermanAmI
DogeeseseeGod
A Santa at NASA
Go hang a salami! I'm a lasagna hog.
Rules
- Standard loopholes apply.
- The compression must work on all printable ASCII (bytes 32-126, inclusive) text strings, not just palindromes. It doesn't actually have to save space for any inputs, however.
- Output can be any sequence of bytes or characters, regardless of its implementation or internal representation (strings, lists, and arrays are all fair game, for instance). If encoding to UTF-8, count bytes, not characters. Wide strings (e.g. UTF-16 or UTF-32) are not allowed unless the only codepoints possibly used are between 0 and 255.
- Compression/Decompression builtins are not allowed.
For the sake of our own enjoyment, post the compressed strings with your source code.
UPDATE 1: Scoring changed from total_bytes_saved / program_size
to total_bytes_saved / sqrt(program_size)
in order to give more weight to better compression and less weight to aggressive golfing. Adjust your scores accordingly.
UPDATE 2: fixed wasitacaroraratisaw?
to be wasitacaroracatisaw?
2If case, punctuation and spacing is removed from the the input, is it guaranteed that inputs will be strict palindromes? Edit: nevermind - I see the
wasitacaroraratisaw?
is a counterexample to that – Digital Trauma – 2018-01-29T23:57:57.3702Which range of ASCII characters are we supposed to support in the input? Is it
[32-126]
? – Arnauld – 2018-01-30T07:14:11.2971Yeah I don't think the
1000 *
part is really needed, and no I don't think it will make the score feel more "satisfying" ;) – Erik the Outgolfer – 2018-01-30T11:31:37.0931Can we use compression/decompression built-ins? – Lynn – 2018-01-30T13:37:54.140
@Arnauld: I want to say 0-127, but just the printables sounds fine. Gives you an extra 32 values you can use for compression control codes. So yeah, [32-126] – Beefster – 2018-01-30T19:06:55.420
@Lynn I'm going to say no on that. – Beefster – 2018-01-30T19:09:25.660
@DLosc: I didn't realize I didn't have an even-length palindrome in there. Good catch. (if only someone had noticed in the sandbox) – Beefster – 2018-01-31T00:14:54.940
4With so few inputs, there is not much scope for doing anything clever. It would be nice to have at least a few times more. – user1502040 – 2018-01-31T08:16:13.120
1Here are a few more test cases I think would be a good addition–two longer all-letter palindromes, a long mixed palindrome, a very long, very mixed palindrome, and a short non-palindrome (to ensure entries can work on non-palindromes). – ETHproductions – 2018-01-31T15:31:12.813
I'm thinking I need to change the scoring to give more weight to the bytes saved. Maybe I should sqrt the program length... – Beefster – 2018-01-31T18:11:51.410
I like this scoring system much better. (Also, I only noticed now that
wasitacaroraratisaw?
is not a palindrome... sneaky sneaky!) – ETHproductions – 2018-01-31T19:20:52.153@ETHproductions Oh. You're right. I actually missed that. It actually should be
wasitacaroracatisaw?
I remembered the palindrome wrong. – Beefster – 2018-01-31T20:08:20.290Incidentally, DigitalTrauma noticed that one and I didn't realize it wasn't actually a palindrome even after he pointed it out. – Beefster – 2018-01-31T20:12:14.413
@Beefster Oh :P Compressed 6 more bytes for free, thanks! – ETHproductions – 2018-01-31T23:31:56.443
1
A man, a plan, a God's 'nam. Tables, nitrate, tar, tinsel, Batman's dog: Anal Panama
-- 1632 – SIGSTACKFAULT – 2018-02-01T22:02:22.897@Blacksilver my own solution saved 22 bytes on that one. – Beefster – 2018-02-01T23:38:37.390