Challenge:
Write a program that takes an array of 24 4-letter strings and outputs a losslessly compressed representation of the array. You should also provide a program to decompress your representations.
Input:
The input consists of 4-letter words, each using every letter of [G,T,A,C] exactly once (e.g. GTAC, ATGC, CATG, etc.). There are always exactly 24 words, so the full input is something like:
GTAC CATG TACG GACT GTAC CATG TACG GACT GTAC CATG TACG GACT GTAC CATG TACG GACT GTAC CATG TACG GACT GTAC CATG TACG GACT
So there are \$4!=24\$ possibilities for each word, and 24 words in total.
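As a rough sanity check on that count, here is a minimal sketch of how many output characters a fixed-length encoding would need. It assumes "printable ASCII" means the 95 characters from space to `~`; the helper name `min_fixed_length` is made up for illustration and is not part of the challenge.

```python
from math import factorial

WORDS_PER_INPUT = 24
word_choices = factorial(4)                      # 24 orderings of G, T, A, C
total_inputs = word_choices ** WORDS_PER_INPUT   # 24**24 distinct inputs


def min_fixed_length(alphabet_size, n_inputs):
    """Smallest L such that alphabet_size**L >= n_inputs (hypothetical helper)."""
    length = 0
    capacity = 1
    while capacity < n_inputs:
        capacity *= alphabet_size
        length += 1
    return length


# Assumption: 95 printable ASCII characters (0x20 through 0x7E).
print(min_fixed_length(95, total_inputs))
```

Under that assumption the result is 17, since 95^16 is smaller than 24^24 while 95^17 is larger. Allowing shorter strings as well does not change the worst case, because all strings of length 16 or less still number fewer than 24^24.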
Output:
The output should be a printable ASCII string that can be decompressed back into the input. Since this is lossless compression, each output must correspond to exactly one input.
Example:
>>> compress(["GTAC", "CATG", "TACG", "GACT", "GTAC", "CATG", "TACG", "GACT", "GTAC", "CATG", "TACG", "GACT", "GTAC", "CATG", "TACG", "GACT", "GTAC", "CATG", "TACG", "GACT", "GTAC", "CATG", "TACG", "GACT"])
"3Ax/8TC+drkuB"
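For illustration only, one possible shape for such a pair of functions is sketched below. It treats each word as a base-24 digit and rewrites the resulting integer in base 95. The word ordering, the 95-character alphabet (space through `~`), and the fixed output length of 17 are all assumptions of this sketch, and it will not reproduce the example string above.

```python
from itertools import permutations

# All 24 orderings of the four letters, in a fixed (sorted) order.
WORDS = ["".join(p) for p in permutations("ACGT")]
INDEX = {w: i for i, w in enumerate(WORDS)}

# Assumption: printable ASCII is space (32) through tilde (126), 95 characters.
ALPHABET = [chr(c) for c in range(32, 127)]
OUT_LEN = 17  # smallest L with 95**L >= 24**24


def compress(words):
    """Pack 24 words into one integer (base 24), then write it in base 95."""
    n = 0
    for w in words:
        n = n * 24 + INDEX[w]
    chars = []
    for _ in range(OUT_LEN):
        n, r = divmod(n, 95)
        chars.append(ALPHABET[r])
    return "".join(reversed(chars))


def decompress(text):
    """Invert compress: read the base-95 integer, then peel off base-24 digits."""
    n = 0
    for ch in text:
        n = n * 95 + (ord(ch) - 32)
    words = []
    for _ in range(24):
        n, r = divmod(n, 24)
        words.append(WORDS[r])
    return words[::-1]


sample = ["GTAC", "CATG", "TACG", "GACT"] * 6
assert decompress(compress(sample)) == sample
```

Because the alphabet here includes the space character, an output may begin or end with spaces; a real entry would need to make sure those survive however the string is printed and read back.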
Winning Criterion
Your code should minimize the output length. The program with the shortest output wins. If the output length varies with the input, the longest possible output is the score.
Edit: Based on helpful feedback comments, I would like to amend the criterion to add:
If two entries have the same results, the shortest code measured in characters wins. If an entry is length dependent, a length of 1 million words is the standard.
However, I fear the community angrily telling me I didn't say it right and downvoting me to the abyss, so I leave it to community editors to see whether it fits code golf traditions.
I tried to reword this and clean it up a bit. I did change one thing: now, the output is scored by the maximum length instead of the average, for ease of scoring. I hope this is okay with you. If you want to do average, you should pick a specific set of inputs to test it on. – Rɪᴋᴇʀ – 2019-05-17T02:19:55.337
You also changed the output from alphanumeric to the entirety of printable ASCII, but the one existing answer didn't comply with that to begin with. – Unrelated String – 2019-05-17T02:30:04.190
Despite the edits for clarity, this remains a challenge that seems pretty boring. An optimal solution is to map the n'th possible input onto the n'th ASCII string sorted lexicographically, so anyone can win by coding that or anything achieving the same maximum length. In fact, the existing solution does exactly this, using alphanumeric strings as the old rule required, which could be easily adjusted to all printable ASCII. I think a better direction would be to make the challenge code golf and require the optimal output length. – xnor – 2019-05-17T02:31:39.790
@xnor a good comment. If optimal solutions are known, then code golf should be the standard. – AwokeKnowing – 2019-05-17T14:21:55.900