Write two programs that compress and decompress data

8

1

Challenge:

Create a program that compresses a semi-random string, and another program that decompresses it. The question is indeed quite similar to this one from 2012, but the answers will most likely be very different, and I would therefore claim that this is not a duplicate.

The functions should be tested on 3 control strings that are provided at the bottom.

The following rules are for both programs:

The input strings can be taken as a function argument or as user input. The compressed string should either be printed or stored in an accessible variable. Accessible means that at the end of the program, it can be printed / displayed using disp(str), echo(str), or the equivalent in your language.

If it's not printed automatically, a command that prints the result should be added at the end of the program, but it will not be included in the byte count. It's OK to print more than the result, as long as it's obvious what the result is. So, for instance in MATLAB, simply omitting the ; at the end is OK.

Compressing a string of maximum length should take no more than 2 minutes on any modern laptop. The same goes for decompression.

The programs may be in different languages if, for some reason, someone wants to do that.

The strings:

In order to help you create an algorithm, an explanation of how the strings are made up follows:

First, a few definitions. All lists and vectors are zero-indexed using brackets []. Parentheses (n) are used to construct a string/vector with n elements.

c(1) = 1 random printable ascii-character (from 32-126, Space - Tilde)
c(n) = n random printable ascii-characters in a string (array of chars ++)
a*c(1) = 1 random printable ascii-character repeated a times
r(1) = 1 random integer
r(n) = n random integers (vector, string, list, whatever...)
c(1) + 2*c(1) + c(3) = 1 random character followed by a random character repeated 2 
                       times followed by 3 random characters 

The string will be made up as follows:

 N = 4       // Random integer (4 in the following example)
 a = r(N)    // N random integers, in this example N = 4
 string = a[0]*c(1) + c(a[1]) + a[2]*c(1) + c(a[3])

Note: repeated calls to c(1) will give different values each time.

As an example:

N = 4
a = (5,  3,  7,  4)
string: ttttti(vAAAAAAA=ycf

5 times t (random character), followed by i(v (3 random characters), followed by 7 times A (random character) followed by =ycf (4 random characters).

For the purpose of this challenge, you may assume that 10 < N < 50. Every second random number in a is larger than 50 and less than 500, while the other random numbers can be from 1 to 200. As an example:

N = 14
a = (67, 48, 151, 2, 51, 144, 290, 23, 394, 88, 132, 53, 77, 31) 
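The construction described above can be sketched in Python as a generator for additional test strings. This is an illustration only, not the generator used for the challenge: the names `random_lengths` and `build_string` are hypothetical, and the choice that the even-indexed entries of a are the ones between 50 and 500 is an assumption inferred from the example vector.

```python
import random

# Printable ASCII, Space (32) through Tilde (126).
PRINTABLE = [chr(i) for i in range(32, 127)]


def random_lengths(rng):
    """Draw the vector a, assuming 10 < N < 50.

    Assumption: even indices hold the repetition counts (50 < a[i] < 500),
    odd indices hold the random-segment lengths (1..200).
    """
    n = rng.randint(11, 49)
    return [
        rng.randint(51, 499) if i % 2 == 0 else rng.randint(1, 200)
        for i in range(n)
    ]


def build_string(a, rng):
    """Build a[0]*c(1) + c(a[1]) + a[2]*c(1) + c(a[3]) + ..."""
    parts = []
    for i, length in enumerate(a):
        if i % 2 == 0:
            # One random character repeated `length` times.
            parts.append(rng.choice(PRINTABLE) * length)
        else:
            # `length` independently chosen random characters.
            parts.append("".join(rng.choice(PRINTABLE) for _ in range(length)))
    return "".join(parts)


if __name__ == "__main__":
    rng = random.Random()
    print(build_string([5, 3, 7, 4], rng))
```

Passing a = [5, 3, 7, 4] reproduces the shape of the small example: 5 repeats, 3 random characters, 7 repeats, 4 random characters.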

The score will be the combined length (bytes) of the two programs, multiplied by the compression rate squared.

The compression rate is the size of the compressed data divided by the size of the original data. The average rate for all three strings is used.

score = (Bytes in program 1 + Bytes in program 2)*(Compression rate)^2
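As a sanity check on the formula, here is a minimal Python sketch of the score calculation. It assumes "average rate" means the mean of the three per-string rates (not total compressed size over total original size); the function names are hypothetical.

```python
def compression_rate(original_sizes, compressed_sizes):
    """Mean of the per-string rates compressed/original (assumed reading)."""
    rates = [c / o for o, c in zip(original_sizes, compressed_sizes)]
    return sum(rates) / len(rates)


def score(bytes_program_1, bytes_program_2, original_sizes, compressed_sizes):
    """(Bytes in program 1 + Bytes in program 2) * (Compression rate)^2."""
    rate = compression_rate(original_sizes, compressed_sizes)
    return (bytes_program_1 + bytes_program_2) * rate ** 2


if __name__ == "__main__":
    originals = [5022, 2299, 10179]  # lengths of the three test strings
    compressed = [2296, 1208, 4916]  # sizes reported in the Awk answer
    print(round(score(199, 115, originals, compressed), 1))
```

With the program sizes and compressed sizes reported in the Awk answer, this reading gives roughly 74.9, which agrees with the score stated there.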

The winner will be the one with the lowest score two weeks from today.

Test strings:

String 1 (5022 chars):

TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTX_w}yo7}vWL$Y@qNR*Xxqt|oqmwr4+32ejdnaKdEf1<a?<iEKswv)HcNyF/pGc).SPpCF-j$& 1**(NNZ.>Zy0e-`a)i$1Z,X[hcR5JX18wG|`9:H;Qi&nluCKC:b! Q+)i77B28/j/4ZYT1=FN!>DR7'yyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyeDW,FmJh.%AgO&<CIO|Z>gSmszi/I?nL3P8)se$cbNit%['G<X9VW/)+Xg%$Y}E98\X o;y<Jf8(,8=i`v\e B\7\?<\!Pht(U7FFg\!\L_&bh=G*IJLPLpKGc@ 3j9E%{z^+'3bFmM3q"|c2Gt#ed%-U+y?<bB'/[I]o}bmyE=Y$h!oo/H,9$&^*7Rbzd.L;KGN-Wllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllllgk4D:*\(Kt);&^0:RL.KB)IqS79Xj)c8qhf5+S=Up%y0xj%1lA=C<.^F*!UuE2u4wbZ[1#?Q)wz*E;;_5 w\{VUBqH}0(tE& HV(4eZ}S@7xi_s]nzwtP2$8_v`)BDFEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEzgGFo9b8`U':3H<;;K)D'B4:L'}7x;3d]^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^*'wt#^or$m_F{@D"`[n?x1Ow3H1bh$5Z@yzRJ4=my&%X+bc6Or/Bw`Zx,VO{Ss10}[fKFLX}Rh9W?_k7)\&j\`Z.BABUy'q8\VP5D_n-f|v3Y$cLe;;7r{5lD@uc?r/c+&O=0{Hr!5&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&9<OIozM4dNlw9N-MUW<kwD/E]XB^1/(?)?C4x)%p,K)p#<cG&PMV"10"&+vN-/oKw9FsubG=*&c'A)a Tu)uZD,S{c|<QO}w+[Pdc=$}3f(!73W?Ko!z:gPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP(!B"n}ydQP W^]2!$,0,ym! 
cVy4U>hmsNbdU}b-T`n'B^:L#Z}pI5l+46(1LCS>:BAp8+?[ ?}}1mtpo3\[{I]!7T33333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333FzE klH&6Cj[VPd: HB\e9FvH_./lxP*Z\LD-,Y``IegX+=1T_:B>VJ{Ikq>'_>k5>rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr"EI5K,%OB??_{"fNG>Ql6"jJ4m[S{I_/`P000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000=K#Gc-0ai'N"zDO[roJAJOPPY!%C#+J7"xd0V^teUZX$QW!<\s 3kuuXS'W`F mUvkzr7R ET"(2Y9c}M-a&shkT9j>*x+KDprC'9WFXl(`I{AfsCffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffvKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKK#v>boc"..................................................................................................................................................................................................a#,uAOemp[b5CPOzI85g:a[|:]<Ss=`JuIB]+Sg$b'>PJ=%:zM#I$,YM1eX)Ja=P5x^WhuVt1?$ZU5qoM68P?n;T]R-RZ0PMH^pS%W*so-v!=2Z=9J^p,j$4)"'mXvWFF]IQN^MqG:^Lr&V?is6A%N${wNjCXpJE+F^wBG4@c`c^/CU-}8TIYJHu$|KGq=\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\l7G7DK9Nq+'{=>.^a"I<ytX0(HsP'x:I4enw5'^kjQ{ZQta\FL|zOC2C[d4y\z8'z<OgHw3+XZ_nSq@B9m)Yu"|JkOTP*L3T"t\<'sh,y*{0%*NBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBl,OuBGl;X(Yxx._o0Jv8a_]`]j=u6-W^Ve%&meh`]PmR}c>3CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCP+GZ-jP@U#$K.}zTy^J(@9"LZ<,Dm}LkKn'>>ZBn:fn?o_o>LT1{2{t0r4$M-GnV;?/M^P-#uzJ=PnBhYo<,uyXNJ#yiZ;R29
ta5 >.D0_$\BWWO%3=|#W:c8^VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV%'r`CdgOv(WZ_y2*/sW|$Mmut0CX>Mw+109!Ky$;o`eKqd1D2Kh9x=y8{;(p)xpuIVT+9JS<T>/UIWB< T5$hs|V.$(>$J6j}@\WtWM3\>dvc{O!<(mzw@<xeRkhCIE7L;z7_OFx|nbxfIxu|hhBiN!d"`5;vxnpk3juf;J2})#r!]AFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF<7#0%Uj,b<WrAK?I%kPx![bJBF}RE'j>`f>U]*f%gDY?aa]O3>sL.V\.3#u/%O;xHIl<A4#6zO}umALe*B5P'*`kkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkINNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNIv%x1QT@:A`TeVc"AVnFzfPGN%^</a=G=#P/G/oAS^ZPI-8yhu0T8>V5kF80Gh;QU=SC>ymTH{Onh/)[kN+:y .iRj[yK!V HDFW<<fU&zmm2.OY-H^Gf)yH{R%>5DNI]'AX7-kpEJr`+IM-cUn S{co^]ir%J,(P/[q1 h},R",d\Kg%(*HpGDEq`=ubhTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTgeQ_!6|Qj$L77777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777777>JV8&V]xq4k#U)5e^8VTJPRzD)HeT6STV:WgqwBbF61R/{_x=diD{<5jKf/Yds7.;Eu}[bYDyA1wRA{-S:1l[%5dHHVOgWMQBy">VO^fJ4yn>oN.,1LEzxT.)>cHk!PbB|$#."Jg^;8}\% D>*8e))=OnSNhRQ

String 2 (2299 chars):

VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV`]e!oO{=i&}\o8^('MDPC`VCI`(@3AFa"A3Dhc<h2Rhc99F^$<LpAOdzC6Y%dTm:!iHH@&&OCV?y)Vv [`wq=?0-YjXSPx1t3k&=>(6^EW?%pH3y6Rb8="2tG%$Jo6A<X^nS4K\v@nZ(Bi1jCW4?p]aIv}<26gXQ%'GKa*<$aPOnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn.$$}.HK{It43ltY"H&:VcTkq+C3.g2VB`Ui-P=8I^%9\TN5=[&@;YOR0`[sssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssZz_qnx9wqa'caEfvlp,;c0't70I8'>|W=SHQx\{#ed9)WFM!:l[24.qU,UV^0gARZYW4}n<.6HJdK:{]8,QOO]KbZ'Tugd9{9>X1q.-[adNHmMP*+"<]XIIf>7>Rp/,sQ0QHTO$jduG3O>AV/,GY++$AOBDepNz9qIPzr\G$.NtKLD=j8?8ia*@y34GgmtM%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%]"0'-roq!u`Aw/EYlk< R>-AsSmJ3D w}Pp;Rpi`r755VI,Ao(uVA%)v0]WC/XW{-v7k+37y5QQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQQ`9q;^dn#byX+NvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvaY#. 
xKADce75y7E]`(m'cAg.$N5j{,u!v%Pc(6D?"axU*VQ,n^bWomxD.`LA:I1nvX=^VGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGtOnQL:PKI':C}q.:6|aHiYII*C7{kGa,9aSE8D}h#S%[^:P&e^:kazo,"W]e--\bW=],xD44\@,Z;Q7$RPKA0b6yO_7&h}b/c4@nE.CvI}0.-ySF2zWy_3gwpmcqWZZ'Y))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))X']&p]icj|RFj'ci2*1jXg%* Eu&9QZEWwGpen1Pa,Gti_?FUL)&=r<YL-Th"]f%jV<Kx);@L^)mw'g S(nry%kbZp pGD]R@j3{idSHH<!X{%(/T,ow]$259a P-6_AX*o?4g^>(n<v^:/U@cmh9nOG|ot=8Rw5FfvU/'IGD(gm++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++K#&U4RN$dR*rv1p5t`<n7XpvpVz#uncF647sssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssssYl?CCkw#e?#be5,c0GttH}N j:$5AHa Xz[<z-=CdX)@}fHmk7L-k&hZHOP5o9^yU%%:g|TD1b=7G !HKGMN|}/l:}2Ia^fffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff56&4i{38sY*]<DTT#:5>RJm*\c&|kM+K\s^"055%FL5Fl&X|{q+4N6t^(<\gt@;v?z}xly?Bi_!mSA+8r/6n4)Kdh4)P8'|oK&-7tFNO:]mr$nl6L1jr):uC(vh`Ei)19MfumB<VtL]"Vc

String 3 (10179 chars):

"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""":=P,W$WS.vBP9d89L65VeuKY27|*-Ih1x/nY}p09Sq$PQ%4z!l*)@wmbP3b;C3E8#*?2-T`W"0&::+kA:`.OrjmD-u(oDpE:{x0 ttb13yO"q6U:1N@/[R2g}#y}d,7)_STJ^0hb]4]hSd9%L#]Bimmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm+HuOde&d,uX?B??|AJ)h.}D<HouV7NXUP0!.,HmhBaar7(c.)A#%8aPc{g8iE.hw0H)P5B^zQT('wG7Vu(|M>lo.5EM3Z/o&[Pfd}A{Vsi,+lAlam*K}69zlNWJv)|u0<e#+:l![((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((G#zx B)=N\[hRY8Dm`x@SPG=3ZRJ^SkywE:U/[\Z?q[fjY9gxO.]TiH)xKw}%*Tb[JhlD]2D4:(CE:#Zre/}9:z2*G)u)O,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,It)UnqhJ_XIzXNlT%C*Mq%R6KNTg=Q(4G)}=Q2Fp.A>Tuz?y4,wvh%,qQS5gb`6E>^6M]1FV/*4s%LDf%bwEr!oH.G/////////////////////////////////////////////////////////////////////////////////////////////////////JfITfs\Bp{8uJmAE@qW>QT!"R\q\q}2Rwo3333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333q Kxp"zbN(am%qRC"bc#]zzD,sr]=H)yMS"0X&_/|yqs!kZ)`MY[C%3j`6$D!(Qk4x."e,m]                                                                                                                                     
;B|S)E{BQT:%eSr0j7Gqx0&u7c7.lT]P?.&&mV2ZT^.k^"0)4K*#E4d!z/$#[-?zi8a(S?iIU9|q?lfy3T"}fh)oWfO5^{sAXu`CB*LStW5(KCe3I]-|oE^>!VL<#Pd$PLDBqZfQ)QPCWja7&7HtM7uM9*..................................................................................................................................................................v/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////\sh*_l7Tk`b7KQueol'sWCC,5|\=H>v0I4c\PPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPPP4q.r!#H?yY/-t2:iDRwn,TwSR6 v@lBv\jjO!Kz+1NT>ksfbl=hq)/y1q<}3"kYHLJ&&m'NS'lhqaPYJW3}3Qjh)|ZnAQvb^v=6TIw%Ry{!M,aBIzd9QLhB%cjXOoc*C\S0!(f :NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNv(9%9@j.1yF&fw@L9edFP3#7tuHlxpCk8]hD.ky{V6bS#?rEz5PERx_"V:\TN {2_"pE/T:X<#&<\V%IQISzo6w5vO%lvPx)|wO!"+>\t-SzP\wWXFQrmXp`JGJ/sIoVyns2u=yU`26&A1]vrpEXHD/K1HjnN4-t";S*****************************************************************************************************************************************************************a&.z "aV6]K"JhyO;/`UOpAkb}zP53=$AD:Io3lkjdUMjw1w!rL2Za3Dk:,`]AsG!L[3e^ECxxqx[I_{|qe)z;zZ#V&HZ:J4g+2U>}y!Oazq`qY_]'n=egXV9*IaFbtRZGVQG!ojmkTkStrbzFnd|7Keu 
[7$2f7Npb]ne=wuAg\9*4Rd/cqcDApSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS[OTjad;#l+*>m&fFSS{rc|xb?MTV$Z2I>)l8QfAUB"wNHhMNpvaz_TnvM:2Ck>*2=jWS2)a/$dP\,"pB9#L^1lSir+m%oG@YA/G#M3T10gF+xdJDqe"8888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888888":6Y<zLVYm@aQ-.\-u\M_.7<}L`(w!+7lm@<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<aJHBA#?>9cSxN5HZ:$Z{Kr15x>cZ?!NACujxBVAu;->*BxS)wu dA:2#Sq:FaGgr5,mKLv&_ms"s1:NXEVVX)`y#ekL4H$;%{xrU$ai&9J5C8eqyQE(E|;+3tf^csoENQJG#}X81VE2m1xY 2SI d[*Eozzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzzz7/+M6f *n*GBp_/+rpBX@]%OpIgfkF":Cc"xzIGYw5H{LVBWFKg\1j{S duWiWB[-%'z7)NGE)QR >"t4#N1{(EntG)$u^o/J(*IaBAB<D\iE?gWj)ccWk-[OToXKjQZTWIji%^] ,Z#:_5Rsk e`s-bxrLW|(,c!>mmin\w5lcLO`````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````+b"0@z2a@N^IO}K{uefxE%pgz'!7GoJ2mLaOZh3#I$tSEZ/=x9S6]Y0[5T@qRftKeXuBFr_ \)-?-2">0r*=MB'hhf4x"%g8Y\[kmPx><6ejGL; s:yp@5af+rg=/`W(xm,{OKLzj dY/3PC)Ea]LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLcqGx?cJ"y,>w5jkIbja_qUNv#-r_G 
86[090^^F@x|4(8@dWy`"MnL5<+Mf{IsyVO32xy@ZKS4.o2T`G$gNJbCddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddddd:>WH*?#n+7oFZQ\gLEzvR8LS6u&0^pR[;n`x&D9m"YFf````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````````G=Z^o74rpXU<`',;JPl|}/>8f<ir3M6;&O)Y4rV5T2^+AJ>QvlE([bozzb,ifw=BG7+P((lS$g{c!N*&_K<);rx,2&9@7QJi6Q*(.qFQys>EB['1a4(`?htRwJ"a+<j6*>.Lm}F)`:M3;=^n+Bp&\HZI[sv1FGI93\6VVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVVV O}h4W!dP?}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}}Z'lC(E *d#-Ub?y". r:D{n--gCk&=IpmT{\I-;2vQ18xtU'2u1<qAN<IF!YEwx)ZKvY_)D%CU;l1bC%$>u|W.(2=dC!{_e?bc32]DYC?m}<{vdZKnZ <.VhO?EYwINo7lS36Ir1p4Q%cG*7RX#)iVIu..............................................................................NuF1uxKoCH5[cn,,7uj "6aoUK_7hV|lb;u8Piim!{f4 dASS'mtQ&L_-jZ=a?=YwO%Q?y240o".T*]k",5]S*f(P?Rp?T1=V7[^T=w8_h>(:O%f?iZuaJ=3d`f[dI;8S@'Gz]zn%{_OAw7&T%-44444444444444444444444444444444444444444444444444444444444444444444uSz;a(-U)o.xYDgr0,Kyo+jsMrB^oQJ=V}U`nEB>Eqo}S]sUil-sAurUOxxU.+S#f.lvO>Q_SN71{(3"eXt.%$z2Y%Gk8WFTCBz;\`2-eis.*pQ"q^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^vbAbVM q5#3gqC[FPb%7[jHR(o!jK[B)-!FD+.6Pu-(-ccyOsQPKug;]M8R5c'IK>O4EuxI8nr#Ab\EXRK+fP'KHb@-J=51g?/?<H\-BWq1s-Rb:5TuB5kb;eG3Sp/]h5Y{E+T\#i\.&+Q#%*G&O1o-sRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR'%*P|wj/I!ThsIYvh&#$W3\.]4[RTuwFOF#8y+TA7%B@u1T)l!:VFW;i.-{w6Gebq3JaMn7X9cwj3eg8Kf!HaME[<f2*w',ji0t0sDj>n+,=WN 
*k#[a3XKg!"0Mg$`vP\Fvjup*@8TV@ly{k{4Z2[@a:N)SWQ?Lxxv}dJ?WUmFYPBcG>Z;r_[6Kw$n2\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\'ja3@b''Ol'{i3K`wB5(kIOVg9'/R`I6=la;cV=ajQbf1+o3J&B{>(hUQE+g.YMv#q3&i5s/2mB|U^h2.A3U;(rxB{6DQb)`_/Sx/:!URO&49eSAm'B\GP8)9RdPO9FrqGI\^1j*'7Rgfpi&zgX52GU%%h<)h1)KG/Uu:un{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{W02fvj%BMP=H:dSNif}w[lZ@gF,O]w3O R"+g3G[7i*8y,n>1]u 5G!)nulS^S064#?y=/E1_QBDM`i`M5kzH0au24xQNB^u4;4ipll}IP1%V3yEC+2Oq83Y$iezSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSqQd$I49 o;w_%&90L/ckdHY8TeRWLFZNvi3aGwN3&HRuU$Vgt9(_R\FmT9}Aj#VQp"oUoXW=s*vS6SKP ]x<[IA2M2I`2Vy=a3&Jc,n:}qTboygX6pp,L?\ff{zE#9D-?-jgPrKwd6V{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{{jB8OKx6j]=oPzb;pL(<A8`%g7+<O6*)W1m8(SiC+4n(775\g8$[?I\`Cffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff)/.%\4TZ1%Eq!chTOC|#Tbx(m}"@u>Bw&L*hz=Px4?yM}`4s>uGL,U0x@-JPIc.Kq-tx/h:Qr7r*t:6>#q 6O+doZQvl#kr:]VM `z%(&<`yhME|B;2Tjm$^N^,0\h)rVEVT\rp@T>>0U:KoFAsZ'_rZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZaI}`I/T;Qk/dyY_96VxX6=EGn%{'uzDF}.k!\}O^NG1p7PI<_C'/3%d70'@;\6n)wydLj}bZfWP9 
zei[;J:^;BRwKcBFdFld3IRrRY9oBJ<#ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ/e{hh'&wP3jknjfma}v:*__SLUIg\@*_`m:fcbfj((6:1)..?,Xx6<%bi.9Z>)xC`Jbwv#mo_[O0z8>@Exm^>b.&2)[Eyg\Y{UZA9:+SuwhG(<w***********************************************************************************************************************************************************************s4NBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBJ_ h\ xgC46py]Wh'|,LnG2dgl1\*ZG2HWA0Z%26_zB(Y{:;2zxS?>5NJrX8j^*9bt$UQ\PD95El;e'EU6K.[:a&zn]$]`g^Wl;Q(o9oWI3HwTj}NR_:OAdJ@!M8#twm6+tN!*%ldWyOZpnGBeOyz\CiH9w>FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFVRA"MafX,*ZvJvhH{]BrDX">v}pMz2ze.q#$U1c_ XA207Zof^cE,(R`LgHdYY&etOFaShx`F])18.O#\go/-Q#!%6%O0)W`v$fX)7VSTUYau9TC|%a_'bI&i1i7Z3,om_'9(2m-ihn jCh5VrJPD9gm@EoAP*[.!5e{5BJ_>_ut^Hu\:^kJS0IOn+ 555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555555gp_;VbIVS>7)S?V.!Qgx7T/Ik33tV.M>/[l/8HiS1L1tX<P_2,rR(R&+-#^rl=b8GgJP^y:!/`+-:_sWyzD56jqu0/N-)].j`*q9$csBoIl CyC]3j+^DT-Ra|LN*TesCL*-Z;OdO^m#{rp8rAaFB.N-''n:\p?O5bPgT{eD9^3[S7&E,n%/VFiz>K(VXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXEFm-T"2Ny{`4Jt7kv2fTpcsn=U.XCpppppppppppppppppppppppppppppppppppppppppppppppppppHr5aL@FX.UBLt$Um68vRs;Fw)Ymm>=^;++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++JczE.=iK'CWH`]_Xf3G*Nb*ExFAKl;|ssZsBC 2s4jz=GR9y>`X8(M;2/BI3h75[[Yfr]txi5}i{np 
"p}H*.&3o(e>6I)`/;]`[7Oq{=Y4_2jUl}M)jWn&aX&h'dOUrZ),5>Rr2J<UF&{.vJNB?v{Hyp8sK`\J+;%_WTm__________________________________________________________________________________________________________________________3|+lq,Eb6.3HKu}g;La3'x$%?4tgQhIsR'9 c@i<5(LW'$[)YBE rA}BcWFWswnk_h(Z_ a`tQ)IorJ(!>0=c)/-t.n8rL|Php0!tjtI^r=GMN)GU4k?E oQc4|x#y;AU2hW

Note that the strings can be much longer than this, but there's a character limit (30000) on posts, so I had to restrict it. The functions should work for codes up to the maximum size.

Stewie Griffin

Posted 2015-09-23T15:21:38.303

Reputation: 43 471

Will N always be even? (It is in all four of your examples.) – Martin Ender – 2015-09-23T15:42:54.710

@MartinBüttner, No, it can also be odd. That's just a coincidence. If it's odd, the string should end on repeated letters. – Stewie Griffin – 2015-09-23T15:45:53.090

"The compression rate is the size of the original data divided by the size of the compressed data." Did you mean the opposite of that? Otherwise, better compression gives a larger score. – Martin Ender – 2015-09-23T16:07:11.697

Answers

2

CJam, 40 bytes + 13 bytes, rate 0.48697, score 12.5682

Just a baseline solution which compresses the long runs.

Compression (Test it here with length calculation):

qe`{(S*\_{0=K>}#)_{(/(e~\L*}:F&N\}h;;_,F

Decompression (Test it here):

r{~l1>(@*\r}h

The lengths of the three test strings compressed are 2296, 1208 and 4917 respectively. This score could probably be vastly improved by making use of base encoding.
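One possible reading of "base encoding": the strings use only the 95 printable ASCII characters, so a text can be treated as one large base-95 number and stored as raw bytes, at a cost of about log2(95)/8 ≈ 0.82 bytes per character. The Python sketch below illustrates that idea only. It is not the CJam author's method, `pack` and `unpack` are hypothetical names, and it assumes the compressed output is allowed to contain arbitrary byte values.

```python
def pack(text):
    """Pack printable ASCII (32..126) into bytes via base-95 conversion."""
    n = 1  # sentinel leading digit so leading spaces (digit 0) survive
    for ch in text:
        n = n * 95 + (ord(ch) - 32)
    return n.to_bytes((n.bit_length() + 7) // 8, "big")


def unpack(data):
    """Inverse of pack: peel off base-95 digits until only the sentinel is left."""
    n = int.from_bytes(data, "big")
    chars = []
    while n > 1:
        n, digit = divmod(n, 95)
        chars.append(chr(digit + 32))
    return "".join(reversed(chars))


if __name__ == "__main__":
    sample = "  i(v=ycf X_w}yo7}vWL$Y@qNR*Xxqt|oqmwr4+32ejdnaKdEf1"
    packed = pack(sample)
    print(len(sample), len(packed), unpack(packed) == sample)
```

Applied to the random segments that run-length encoding cannot shrink, this kind of packing would save a little under 18% of their size, at the price of extra program bytes.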

Martin Ender

Posted 2015-09-23T15:21:38.303

Reputation: 184 808

2

Awk, 74.9 = (199 + 115) * 0.48853^2

The compression replaces each run of a character repeated more than 4 times with the character and the run length, enclosed in tabs. For example, &&&&& becomes \t&5\t, while dddd remains dddd.

The decompression script uses the tabs as the record separator and restores the repeated characters.

Compression

{split($0,s,"");for(i=1;i<=length($0);i++){c=s[i];f=s[i+1];if(c==p||c==f){n++}else{printf("%s",c)}if(n>1&&c!=s[i+1]){if(n>4){printf("\t%s%d\t",c,n)}else{for(j=0;j<n;j++){printf("%s",c)}};n=0}p=s[i]}}

Decompression

BEGIN{RS="\t"}{if($0~/^.[0-9]+$/){for(i=0;i<int(substr($0,2));i++)printf("%s",substr($0,1,1))}else printf("%s",$0)}

(Note that this script would be more efficient if the substring calculations were first stored in variables, but code golfing often trades efficiency for bytes.)
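For readers who don't speak Awk, the tab-delimited scheme described above can be sketched in Python. This is an ungolfed illustration of the same encoding, not a translation of the scripts: `compress` and `decompress` are hypothetical names, and the decoder matches whole `\t<char><count>\t` markers with a regular expression instead of splitting on tabs.

```python
import re


def compress(text):
    """Replace each run of 5 or more equal characters with \\t<char><count>\\t."""
    return re.sub(
        r"(.)\1{4,}",
        lambda m: f"\t{m.group(1)}{len(m.group(0))}\t",
        text,
    )


def decompress(packed):
    """Expand every \\t<char><count>\\t marker back into the repeated character."""
    return re.sub(
        r"\t(.)(\d+)\t",
        lambda m: m.group(1) * int(m.group(2)),
        packed,
    )


if __name__ == "__main__":
    original = "ttttti(vAAAAAAA=ycf"
    packed = compress(original)
    print(repr(packed))
    print(decompress(packed) == original)
```

The scheme relies on the input being printable ASCII, so a tab can never occur in the data itself and is safe to use as a delimiter.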

Test

$ for s in string1 string2 string3; do cat $s.txt|awk -f compress.awk >$s.compressed.txt; done
$ for s in string1 string2 string3; do cat $s.compressed.txt |awk -f uncompress.awk >$s.uncompressed.txt; done

$ wc -c string[1-3].txt string[1-3].uncompressed.txt string[1-3].compressed.txt
 5022 string1.txt
 2299 string2.txt
10179 string3.txt
 5022 string1.uncompressed.txt
 2299 string2.uncompressed.txt
10179 string3.uncompressed.txt
 2296 string1.compressed.txt
 1208 string2.compressed.txt
 4916 string3.compressed.txt
43420 total

$ md5sum string[1-3].[ut]*xt
ea7076dd2f24545e2b1d1a680b33e054 *string1.txt
ea7076dd2f24545e2b1d1a680b33e054 *string1.uncompressed.txt
dd69a92cb06fa5e1d49b371efb425e12 *string2.txt
dd69a92cb06fa5e1d49b371efb425e12 *string2.uncompressed.txt
9e6eaf10867da7d0a8d220d429cc579c *string3.txt
9e6eaf10867da7d0a8d220d429cc579c *string3.uncompressed.txt

LukStorms

Posted 2015-09-23T15:21:38.303

Reputation: 1 776

1

Perl, 113 bytes + 80 bytes, rate 0.497325, score 47.735

This is my first golf ever, and a first draft. For now all it does is count the length of repeated sequences and replace the repetitions with an integer representing the number of repetitions. E.g. "aaaaa" → "a{{5}}"

Compression:

$d=<>;push@d,$1while$d=~/((.)\2*)/g;map{$l=length;($o)=/(.)*/;$_="$o\{\{$l\}\}"if$l>4;}@d;print length(join'',@d);

Decompression:

map{($o)=/^(.)/;$_="$o"x$lif($l)=/\{\{([0-9]+)\}\}/;}@d;print length(join'',@d);

Double curlies ({{ }}) are probably redundant, but I want to be on the safe side. Compressed lengths are 2340, 1230 and 4998 respectively.

Eirik Birkeland

Posted 2015-09-23T15:21:38.303

Reputation: 111

-1

PowerShell 5 (invalid), 37 bytes + 35 bytes, rate 0.43121, score 13.3878

This entry is currently invalid and theoretical only: I don't have access to a machine equipped with PowerShell 5 to verify it, and I'm not sure whether this counts as "using an external source." Treat it as a theoretical "what-if" scenario rather than an actual submission.

Compression (the Get-Content displays the results and doesn't add to the byte count)

$args|sc .\t;compress-archive .\t .\c
Get-Content .\c -Raw

Gets command-line input, uses Set-Content to store that as a file .\t, then uses Compress-Archive to zip it to .\c

Decompression (again Get-Content doesn't count)

$args|sc .\c;expand-archive .\c .\t
Get-Content .\t -Raw

PowerShell 5, introduced with Windows 10, includes a new feature that lets you use the built-in-to-Windows zip/unzip functionality. Previously, you would need to create a new shell and explicitly execute a zip.exe command with appropriate command-line arguments - yuck. Now, it's just a simple command away.

Note that this is likely also not valid if you're expecting to copy-paste the string output from the compression algorithm into the decompression algorithm, as the PowerShell console doesn't handle non-ASCII characters very well ... Piping from one to the other should work OK, though.
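To get a rough feel for what a general-purpose compressor does with this kind of data without PowerShell 5, one can use DEFLATE through Python's standard `zlib` module. This is a swapped-in technique, not the Compress-Archive approach itself: a zip archive adds container overhead, so the sizes will not match the rate quoted above, and `deflate` / `inflate` are hypothetical helper names.

```python
import zlib


def deflate(text):
    """Compress printable-ASCII text with zlib at the highest compression level."""
    return zlib.compress(text.encode("ascii"), 9)


def inflate(data):
    """Inverse of deflate."""
    return zlib.decompress(data).decode("ascii")


if __name__ == "__main__":
    sample = "T" * 81 + "X_w}yo7}vWL$Y@qNR*Xxqt|oqmwr4+32" + "y" * 90
    packed = deflate(sample)
    print(len(sample), len(packed), inflate(packed) == sample)
```

As with the PowerShell version, the output is binary, so it has to be passed between the two programs as bytes rather than copy-pasted as text.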

AdmBorkBork

Posted 2015-09-23T15:21:38.303

Reputation: 41 581

PowerShell's pipes can be a bit odd too. I don't know about PS5, but certainly with PS2 I've had trouble with it converting the data in the pipes to UTF-16. – Peter Taylor – 2015-09-23T20:02:34.787