Compress to Impress

8

Inspired by both the challenge "Unique is Cheap" by @Laikoni, where the score is based on the challenge itself, as well as the JavaScript (ES6) answer of @ETHproductions for the "Palindrome Compression" challenge, where he uses a pretty cool compression method for the palindrome-flag, upper/lower case indication, and letters.

Challenge:

You will create two programs/function: a compress program/function, and a decompress program/function.

Compress program/function:

Given the characters used in your own source code (both the compress and decompress source code combined) as only possible input, use any kind of bit-compression method of your choice and output the resulting 0s and 1s of the bit-compression of this input.

The amount of bits (0s and 1s) outputted must be as short as possible, and this amount will be the score of your answer.

The idea is to have a nice balance between the different kind of characters used in your own source code, the size of your programs/functions, and the type of bit-compression you've used. Or to quote @RobertFraser in this comment:

This is a great example of the fundamentals of engineering. Taking a problem description, thinking about different ways to solve it, and making tradeoffs between requirements (ie how many bits to dedicate to various styles), etc

Challenge rules:

  • The compressor and decompressor program/function should be in the same programming language.
  • You must provide both a compression and decompression program/function, and the amount of 0s and 1s your compression program outputs for both programs/functions combined (concatted) as input will be your score.
  • The compression must - obviously - work for all characters used in your source code of both your compression and decompression program/function, in any order or amount. (It can have undefined behavior for any characters not in your source code.)
  • Input-type doesn't necessarily have to be a string. It could also be a list/array/stream of characters, and it can be as program argument, STDIN, etc. Your call.
    Same applies to the output. Can be returned from a function or printed to STDOUT. Can be a single string, integer-array, etc.
  • Your compression program must output at least one 0 or 1 (so an empty cat program for both the compression and decompression program isn't possible).
  • Your source code may not only contain 0s and 1s already, excluding no-ops (to prevent programming languages that print their own source code by default, where both the compression and decompression programs can be a single 0 or 1, and the no-ops part is to prevent this behavior with an unused comment; sorry binary-based programming languages which only use 0s and 1s as source code).
  • Although you'll have to support an input of 0 or more of the character-pool used in your programs/functions, you don't have to support an empty input. So you can assume every input will be at least 1 character.
  • A possible input can be larger than your compression and decompression sizes combined.
  • If for whatever reason your compression method is order-depended and has a shorter output when your input is DecompressionProgramCompressionProgram instead of CompressionProgramDecompressionProgram, you are allowed to choose either order of concatting of your programs/function for your score.

Example:

Let's say the compression program is ABC, and the decompression program is 123aBc. For any input containing zero or more of the character-pool 123ABCab, it should be able to correctly compress those characters to 0s and 1s, and be able to decompress these 0s and 1s back into the correct characters. Some valid example inputs for these two programs can be: ABC123aBc (of course); A; 1Ca; 22222b1121221b; etc.

General rules:

  • Standard rules apply for your answer, so you are allowed to use STDIN/STDOUT, functions/method with the proper parameters, full programs. Your call.
  • Default Loopholes are forbidden.
  • If possible, please add a link with a test for your code.
  • Also, please add an explanation if necessary.

Example of an answer:

Java 8, score 1440 bits, 180 (87 + 93) bytes

Here a very bad implementation in Java 8, where each character is simply printed as 8-bit binary String.

Compression function:

Input provided as java.util.stream.IntStream.

s->s.forEach(c->System.out.print("".format("%8s",Long.toString(c,2)).replace(' ','0')))

Try it online.

Decompression function:

Input provided as String.

s->{for(Integer i=0;i<s.length();System.out.printf("%c",i.parseInt(s.substring(i,i+=8),2)));}

Try it online.

Kevin Cruijssen

Posted 2018-03-05T10:38:04.810

Reputation: 67 575

This is not a quine test, so can I read my source code? – Weijun Zhou – 2018-03-05T16:22:23.523

@WeijunZhou Depends on what you use it for, but I probably allow it in most cases. I guess you want to read your source code and then map easy of it's characters with a look-up table, or something like that? – Kevin Cruijssen – 2018-03-05T16:53:13.423

Yes, something similar is on my mind. – Weijun Zhou – 2018-03-05T16:54:26.577

1I may have missed something, but here's a possible loophole in the rules: compressor->if input==compr_code+decompr_code then return 0 else return binary charcodes of input, decompressor->if input==0 then return compr_code+decompr_code else convert binary codes to characters and return them. Final score: 1 (lowest possible). The programs won't be trivial to write, but with some tricks from quines they surely are doable. – Leo – 2018-03-06T08:15:07.610

@Leo Hmm, you're indeed right that the current rules allow such an answer (unintentionally of course). Not sure how to fix it, though.. I could state you must provide the mapping you've used for each individual character, which must correspond / be the same for the compressor + decompressor input as well as any random configuration of these characters. But in that case I might limit possible answers who want to compress pairs of characters, or whatever they come up with. Any suggestions on a rule to close this loop hole, but still leave some flexibility for creativity?.. – Kevin Cruijssen – 2018-03-06T08:27:56.240

One possibility is requiring that adding a character to any input should increase the length of the compressed output by no more than 8 bits. But beware that even this change could invalidate some "honest" answers – Leo – 2018-03-06T09:34:31.480

Answers

2

Jelly, score 350, 39 + 30 = 69 bytes

Compressor: Jelly, 39 bytes

“iЀ;⁴⁾¤8Bz0ZUṖFs5Ḅị”;⁾“”¤iЀ;⁴BUz0ZUṖF

Try it online!

Decompressor: Jelly, 30 bytes

s5Ḅị“iЀ;⁴⁾¤8Bz0ZUṖFs5Ḅị”;⁾“”¤

Try it online!

HyperNeutrino

Posted 2018-03-05T10:38:04.810

Reputation: 26 575

@KevinCruijssen whoops. fixed, thanks – HyperNeutrino – 2018-03-05T15:40:35.930

1@KevinCruijssen oh I'm bad lol. thanks for the fix! – HyperNeutrino – 2018-03-05T16:11:30.193

2

Java (OpenJDK 9), score 702 bits, 88 (42 + 46) bytes

Compressor, 42 bytes

$->new java.math.BigInteger($).toString(2)

Try it online!

Decompressor, 46 bytes

$->new java.math.BigInteger($,2).toByteArray()

Try it online!

Olivier Grégoire

Posted 2018-03-05T10:38:04.810

Reputation: 10 647

Sorry, I'm playing the low byte-cost (hum, for Java, I mean) rather than the hard compression :S – Olivier Grégoire – 2018-03-05T15:44:15.980

Didn't even knew it was possible to put any String in the BigInteger constructor.. Figured it would give a NumberFormatException or something.. I did knew about the .toString(2). Nice golf of my example answer I guess. ;) – Kevin Cruijssen – 2018-03-05T16:03:01.967

@KevinCruijssen Indeed, as it currently is, it's more a golf of your example than anything else. 703 bits means 88 bytes. I couldn't compress anything in less than 88 bytes so far, but I'm still trying to ;-) – Olivier Grégoire – 2018-03-05T16:10:10.513

0

Pyt, score 179 bits, 9+10=19 bytes

Compressor:

←6!58*-Ĩɓ

Try it online!

Decompressor:

←2Ĩ6!58*-Ṕ

Try it online!

mudkip201

Posted 2018-03-05T10:38:04.810

Reputation: 833

1Decompressor still seems to throw an error – Jo King – 2018-03-06T00:30:50.840

That's strange. It works when I run it locally. – mudkip201 – 2018-03-06T01:51:56.907