Perl, 291 chars (originally 438)
Inspired by Jeff Burdges's use of DEFLATE compression, Ventero's compressed Ruby code and J B's use of Lingua::EN::Numbers, I managed to compress my entry down to 291 chars (well, bytes) including decompression code. Since the program contains some non-printable characters, I've provided it in MIME Base64 format:
dXNlIENvbXByZXNzOjpabGliO2V2YWwgdW5jb21wcmVzcyAneNolkMFqAkEMhu8+RVgELdaIXmXB
S2/FFyhF4k7cHTqTsclMZd++M3pJvo+QH5JiDJ9exkKrj/PqXOKV1bod77qj9b2UeGBZ7w/bpd9s
3rCDruf3uWtwS3qS/vfROy0xsho+oWbB3d+b19YsJHWGhIHp5eQ8GzqSoWkk/xxHH36a24OkuT38
K21kNm77ND81BceCWtlgoBAq4NWrM7gpyzDhxGKQi+bA6NIfG5K4/mg0d0kgTwwdvi67JHVeKKyX
l3acoxnSDYZJveVIBnGGrIUh1BQYqZacIDKc5Gvpt1vEk3wT3EmzejcyeIGqTApZmRftR7BH3B8W
/5Aze7In
To decode the program, you can use the following helper Perl script:
use MIME::Base64;
print decode_base64 $_ while <>;
Save the output in a file named 12days.pl and run it with perl -M5.01 12days.pl. As noted, you need to have the Lingua::EN::Numbers module installed for the code to work.
In case you're wondering, the readable part of the code simply looks like this:
use Compress::Zlib;eval uncompress '...'
where the ... stands for 254 bytes of RFC 1950 compressed Perl code. Uncompressed, the code is 361 chars long and looks like this:
use Lingua'EN'Numbers"/e/";s==num2en(12-$i++)." "=e,y"." "for@n=qw=drummers.drumming pipers.piping lords.a.leaping ladies.dancing maids.a.milking swans.a.swimming geese.a.laying golden.rings calling.birds french.hens turtle.doves.and=;say"on the ".num2en_ordinal($_)." day of christmas my true love gave to me @n[$i--..@n]a partridge in a pear tree
"for 1..12
Writing this code was a weird kind of golfing exercise: it turns out that maximizing repetition and minimizing the number of distinct characters used are much more important than minimizing raw character count when the relevant metric is size after compression.
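As a rough illustration of that metric (not the code actually used for this entry), candidate spellings can be ranked by their size after compression. The sketch below assumes Compress::Zlib's compress at its maximum level as a stand-in for KZIP; compressed_size, best_variant and the two sample strings are made up for the example:

```perl
use strict;
use warnings;
use Compress::Zlib;    # exports compress and Z_BEST_COMPRESSION by default

# Size in bytes of the RFC 1950 stream zlib produces at maximum level.
sub compressed_size {
    my ($code) = @_;
    my $z = compress($code, Z_BEST_COMPRESSION);
    defined $z or die "compress failed";
    return length $z;
}

# Return the candidate with the smallest compressed size (first one wins ties).
sub best_variant {
    my @candidates = @_;
    my ($best, $best_size);
    for my $c (@candidates) {
        my $size = compressed_size($c);
        ($best, $best_size) = ($c, $size)
            if !defined $best_size || $size < $best_size;
    }
    return ($best, $best_size);
}

# Two hypothetical spellings of the same list: one reuses a single
# separator and delimiter, the other mixes several.
my @variants = (
    'qw=drummers.drumming pipers.piping lords.a.leaping=',
    'qw(drummers_drumming pipers-piping lords-a-leaping)',
);
my ($winner, $size) = best_variant(@variants);
print "$size bytes: $winner\n";
```

On snippets this short the differences are only a byte or two, which is why an exhaustive search over variants is more reliable than intuition.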
To squeeze out the last few chars, I wrote a simple program to try small variations of this code to find the one that compresses best. For compression, I used Ken Silverman's KZIP utility, which usually yields better compression ratios (at the cost of speed) than standard Zlib even at the maximum compression settings. Of course, since KZIP only creates ZIP archives, I then had to extract the raw DEFLATE stream from the archive and wrap it in an RFC 1950 header and checksum. Here's the code I used for that:
use Compress::Zlib;
use 5.010;
@c = qw(e i n s);
@q = qw( " );
@p = qw( = @ ; , );
@n = ('\n',"\n");
$best = 999;
for$A(qw(e n .)){ for$B(@q){ for$C(@q,@p){ for$D(@p){ for$E(@q,@p){ for$F(qw(- _ . N E)){ for$G("-","-"eq$F?():$F){ for$H(@c){ for$I(@c,@p){ for$N(@n){ for$X(11,"\@$I"){ for$Y('$"','" "',$F=~/\w/?$F:()){ for$Z('".num2en_ordinal($_)."'){
$M="Lingua'EN'Numbers";
$code = q!use MB/A/B;sDDnum2en(12-$H++).YDe,yCFC Cfor@I=qwEdrummersFdrumming pipersFpiping lordsGaGleaping ladiesFdancing maidsGaGmilking swansGaGswimming geeseGaGlaying goldenFrings callingFbirds frenchFhens turtleFdovesFandE;say"on the Z day of christmas my true love gave to me @I[$H--..X]a partridge in a pear treeN"for 1..12!.$/;
$code =~ s/[A-Z]/${$&}/g;
open PL, ">12days.pl" and print PL $code and close PL or die $!;
$output = `kzipmix-20091108-linux/kzip -b0 -y 12days.pl.zip 12days.pl`;
($len) = ($output =~ /KSflating\s+(\d\d\d)/) or die $output;
open ZIP, "<12days.pl.zip" and $zip = join("", <ZIP>) and close ZIP or die $!;
($dfl) = ($zip =~ /12days\.pl(.{$len})/s) or die "Z $len: $code";
$dfl = "x\xDA$dfl" . pack N, adler32($code);
$dfl =~ s/\\(?=[\\'])|'/\\$&/g;
next if $best <= length $dfl;
$best = length $dfl;
$bestcode = $code;
warn "$A$B$C$D$E$F$G$H$I $X $Y $best: $bestcode\n";
open PL, ">12days_best.pl" and print PL "use Compress::Zlib;eval uncompress '$dfl'" and close PL or die $!;
}}}}}}
print STDERR "$A$B$C$D$E$F\r";
}}}}}}}
If this looks like a horrible kluge, it's because that's exactly what it is.
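The wrapping step buried in that script can be isolated as a minimal sketch. It assumes zlib's own raw DEFLATE output as a stand-in for the stream extracted from the KZIP archive; wrap_rfc1950 and raw_deflate are hypothetical helper names, not part of the original script:

```perl
use strict;
use warnings;
use Compress::Zlib;    # exports deflateInit, uncompress, adler32, MAX_WBITS, Z_OK

# Wrap a raw DEFLATE stream in an RFC 1950 (zlib) container: the two
# header bytes "x\xDA", the raw stream, then the big-endian Adler-32
# checksum of the *uncompressed* data.
sub wrap_rfc1950 {
    my ($raw, $plain) = @_;
    return "x\xDA" . $raw . pack("N", adler32($plain));
}

# Stand-in for KZIP (an assumption for this sketch): a negative
# WindowBits value makes zlib emit a raw DEFLATE stream with no header
# and no trailer.
sub raw_deflate {
    my ($plain) = @_;
    my ($d, $status) = deflateInit(
        -Level      => Z_BEST_COMPRESSION,
        -WindowBits => -MAX_WBITS,
    );
    die "deflateInit failed: $status" unless $status == Z_OK;
    my ($body, $s1) = $d->deflate($plain);
    die "deflate failed: $s1" unless $s1 == Z_OK;
    my ($tail, $s2) = $d->flush();
    die "flush failed: $s2" unless $s2 == Z_OK;
    return $body . $tail;
}

my $plain = "a partridge in a pear tree\n" x 12;
my $z     = wrap_rfc1950(raw_deflate($plain), $plain);

# uncompress() verifies both the header and the checksum.
my $back = uncompress($z);
die "round trip failed" unless defined $back && $back eq $plain;
printf "%d bytes plain, %d bytes wrapped\n", length($plain), length($z);
```

The container adds six bytes to the raw stream: two for the header and four for the checksum.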
For historical interest, here's my original 438-char solution, which generates nicer output, including line breaks and punctuation:
y/_/ /,s/G/ing/for@l=qw(twelve_drummers_drummG eleven_pipers_pipG ten_lords-a-leapG nine_ladies_dancG eight_maids-a-milkG seven_swans-a-swimmG six_geese-a-layG five_golden_rGs four_callG_birds three_french_hens two_turtle_doves);s/e?t? .*/th/,s/vt/ft/for@n=@l;@n[9..11]=qw(third second first);say map("\u$_,\n","\nOn the $n[11-$_] day of Christmas,\nMy true love gave to me",@l[-$_..-1]),$_?"And a":A," partridge in a pear tree."for 0..11
Highlights of this version include the pair of regexps s/e?t? .*/th/,s/vt/ft/, which construct the ordinals for 4 to 12 from the cardinals at the beginning of the gift lines.
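To see what the two substitutions do, here is a small sketch that applies them to a few gift lines as they look after the y/_/ /,s/G/ing/ expansion; ordinal_from_gift is a hypothetical helper name used only for illustration:

```perl
use strict;
use warnings;

# Turn a gift line such as "five golden rings" into its ordinal.
sub ordinal_from_gift {
    my ($line) = @_;
    # Drop an optional final "e" or "t" of the number word together with
    # everything from the first space onwards, then append "th".
    $line =~ s/e?t? .*/th/;
    # "twelvth" and "fivth" need their "v" turned into "f".
    $line =~ s/vt/ft/;
    return $line;
}

print ordinal_from_gift($_), "\n"
    for "twelve drummers drumming", "nine ladies dancing",
        "eight maids-a-milking", "five golden rings";
```

This should print twelfth, ninth, eighth and fifth. The lines for three and two do not fit the pattern (they would come out as "threth" and "twoth"), and there is no gift line for one, hence the explicit @n[9..11] assignment in the golfed code.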
This code can, of course, also be compressed using the Zlib trick described above, but it turns out that simply compressing the output is more efficient, yielding the following 338-byte program (in Base64 format, again):
dXNlIENvbXByZXNzOjpabGliO3NheSB1bmNvbXByZXNzICd42uWTwU7DMAyG730KP8DGOyA0bsCB
vYBp3MYicSo7W9e3xx3ijCIQDHZIUjn683+/k3ZPAjUSDKxWIeACZYC7qGw1o226hwWqHghSORKM
6FMtkGnT3cKEWpXDSMACCBOhQlWim+7jUKO+SGg5dT8XqAetiSD4nrmPBMDPvXywtllF18OgJH2E
SGJfcR+Ky2KL/b0roMeUWEZ4cXb7biQeGol4LZQUSECdyn4A0vjUBvnMXCcYiYy2uE24ONcvgdOR
pBF9lYDNKObwNnPOTnc5kYjH2JZotyogI4c1Ueb06myXH1S48eYeWbyKgclcJr2D/dnwtfXZ7km8
qOeUiXBysP/VEUrt//LurIGJXCdSWxeHu4JW1ZnS0Ph8XOKloIecSe39w/murYdvbRU+Qyc=
I also have a 312-byte gzip archive of the lyrics, constructed from the same DEFLATE stream. I suppose you could call it a "zcat script". :)
Can you provide the full version of each line? I'm used to "my true love gave to me" and the use of different versions might affect the solutions. – Matthew Read – 2011-12-07T20:48:30.473
Updated complete lyrics. – macek – 2011-12-07T20:56:48.663
Is that a "you can drop sentence capitals" or a "the whole text is case insensitive" kind of not worrying about capitalization? – J B – 2011-12-07T23:01:51.540
Also, in the line of ignoring punctuation, can we interchange punctuation for whitespace (and reciprocally)? – J B – 2011-12-07T23:03:17.497
JB, I hope my edit provides some more clarification for you – macek – 2011-12-07T23:34:05.033
Related : http://www.dezert-rose.com/humor/christmas/12daysreply.html – Jeff Burdges – 2011-12-08T04:31:36.687
@macek: better, but the latent side of my question was: can I print hyphens instead of spaces as well? – J B – 2011-12-08T12:10:26.553
J B, you can print hyphens, too, yes. – macek – 2011-12-08T16:13:49.090
What about line breaks within the verses? Some solutions print the whole verse on a single line. Could I do the same? Or is whitespace even free-form as long as there is a blank line between verses and no blank line within verses? – Joey – 2011-12-08T16:20:50.267
Does "short" count bytes or characters? (e.g., Unicode could be used to encode roughly 4 chars [A-Z\-] of the poem in one char.) – Bruno Le Floch – 2011-12-08T17:08:41.140
Bruno, usually characters are counted. Also note that Unicode has non-characters which must not appear in interchange and plenty of holes in the code space. You can't just take four bytes and call it UCS-4. It will probably make most tools rightfully throw up. (Also four bytes won't fit, not even 7-bit per byte as Unicode is a 21-bit code.) – Joey – 2011-12-08T17:12:39.383
Unicode-compliant tools are supposed to cope with any unassigned code point, so you've got roughly 1000000 code points (with a little care). That's more than 27^4, hence four letters+space. – Bruno Le Floch – 2011-12-08T17:47:21.900