JavaScript (ES6), 6038 bytes (previously 6138)
These functions encode and decode any printable ASCII character, as required by the challenge rules. However, they rely on a static Huffman code that is optimized for the example text: the 39 characters in its table take 3 to 12 bits each, and any other character falls back to a 19-bit escape code. They should therefore work pretty well on other English text, as long as there aren't too many capital letters, digits or miscellaneous symbols, but they will perform poorly on random input.
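To give a rough feel for why random input fares badly, here is a small sketch that estimates the encoded size of a string. It assumes the code lengths implied by the group string used in the golfed functions (group `i` holds the characters coded on `i + 3` bits) and the 19-bit fallback for everything else. The names `GROUPS`, `ESCAPE_BITS`, `codeLength` and `estimateBits` are illustrative only and not part of the submission.

```javascript
// Same group string as in the golfed encoder/decoder:
// group i contains the characters that get a code of i + 3 bits.
const GROUPS = ` e|ahinost|cdlr|.Ibfmuwy|',gpv|"k|;?ATqx|FHW|j|:`.split('|');

// Characters missing from the table use twelve 1 bits + 7 bits of ASCII.
const ESCAPE_BITS = 19;

// Number of bits used for a single character.
function codeLength(ch) {
  const i = GROUPS.findIndex(group => group.includes(ch));
  return i < 0 ? ESCAPE_BITS : i + 3;
}

// Total number of bits for a whole string (before padding to an even count).
function estimateBits(text) {
  let total = 0;
  for (const ch of text) total += codeLength(ch);
  return total;
}

console.log(estimateBits('the tree')); // 29 bits for 8 common characters
console.log(estimateBits('QZ#7'));     // 76 bits for 4 characters outside the table
```

Since every pair of bits is emitted as two nucleotides, the DNA length is this bit count rounded up to the next even number.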
Edit: saved 94 bytes by rethinking the decoder logic and applying all the hints provided by Neil.
Encoder (281 bytes, previously 305)
s=>(n=0,C={},` e|ahinost|cdlr|.Ibfmuwy|',gpv|"k|;?ATqx|FHW|j|:`.split`|`.map((g,i)=>([...g].map(c=>C[c]=('00'+(n++).toString(2)).slice(-i-3)),n*=2)),s=s.replace(/./g,c=>C[c]||(524160|c.charCodeAt()).toString(2)),s.length&1&&(s+=0),s.replace(/../g,c=>'ATCG'[c=+`0b${c}`]+'TAGC'[c]))
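An ungolfed sketch of the same logic may make the encoder easier to follow. It is a readable approximation rather than the scored code, and the names `GROUPS`, `buildEncodeTable` and `encode` are illustrative. It assumes printable ASCII input, as the challenge does.

```javascript
const GROUPS = ` e|ahinost|cdlr|.Ibfmuwy|',gpv|"k|;?ATqx|FHW|j|:`.split('|');

// Assign consecutive code values inside each group; the counter is doubled
// between groups because the next group uses one more bit.
function buildEncodeTable() {
  const table = new Map();
  let next = 0;
  GROUPS.forEach((group, i) => {
    const length = i + 3;
    for (const ch of group) {
      table.set(ch, next.toString(2).padStart(length, '0'));
      next++;
    }
    next *= 2;
  });
  return table;
}

function encode(text) {
  const table = buildEncodeTable();
  let bits = '';
  for (const ch of text) {
    // Fallback: twelve 1 bits followed by the 7-bit ASCII code (19 bits).
    bits += table.get(ch) ||
      '111111111111' + ch.charCodeAt(0).toString(2).padStart(7, '0');
  }
  // Pad to an even number of bits so they can be consumed two at a time.
  if (bits.length % 2) bits += '0';

  // Each 2-bit value selects a nucleotide, followed by its complement:
  // 00 -> AT, 01 -> TA, 10 -> CG, 11 -> GC.
  let dna = '';
  for (let k = 0; k < bits.length; k += 2) {
    const v = parseInt(bits.slice(k, k + 2), 2);
    dna += 'ATCG'[v] + 'TAGC'[v];
  }
  return dna;
}

console.log(encode('I have a')); // GCTATAATATCGCGCGTAGCGCATATCGATTAAT
```

The printed string matches the first 34 nucleotides of the DNA sequence given for the example text.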
Decoder (315 bytes, previously 391)
s=>(n=N=0,C={},` e|ahinost|cdlr|.Ibfmuwy|',gpv|"k|;?ATqx|FHW|j|:`.split`|`.map((g,i)=>([...g].map(c=>(C[j=i+3]=(C[j]||{}))[n++]=c),n*=2)),s=s.replace(/../g,c=>''+((x='ATCG'.search(c[0]))>>1)+(x&1)),eval(`for(r='';N<s.length-1;){for(i=j=0;!(f=C[j]&&C[j][i])&&j++<19;i+=+s[N++]+i);r+=f||String.fromCharCode(i&127)}`))
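The decoder can be sketched in ungolfed form as well. This version keys the table by bit string instead of by length and value as the golfed code does, which is a simplification chosen for readability; `GROUPS`, `buildDecodeTable` and `decode` are illustrative names. It assumes well-formed input produced by the encoder.

```javascript
const GROUPS = ` e|ahinost|cdlr|.Ibfmuwy|',gpv|"k|;?ATqx|FHW|j|:`.split('|');

// Map each bit string to its character, using the same numbering
// scheme as the encoder (group i holds the codes of i + 3 bits).
function buildDecodeTable() {
  const table = new Map();
  let next = 0;
  GROUPS.forEach((group, i) => {
    const length = i + 3;
    for (const ch of group) {
      table.set(next.toString(2).padStart(length, '0'), ch);
      next++;
    }
    next *= 2;
  });
  return table;
}

function decode(dna) {
  const table = buildDecodeTable();

  // Only the first nucleotide of each pair carries information;
  // the second one is its complement.
  let bits = '';
  for (let k = 0; k < dna.length; k += 2) {
    bits += 'ATCG'.indexOf(dna[k]).toString(2).padStart(2, '0');
  }

  let text = '';
  let pos = 0;
  // Stop when at most one bit is left: that bit can only be padding.
  while (pos < bits.length - 1) {
    let code = '';
    let ch;
    // Read bit by bit until a table code matches or 19 bits were read.
    while (ch === undefined && code.length < 19 && pos < bits.length) {
      code += bits[pos++];
      ch = table.get(code);
    }
    // No match after 19 bits: the low 7 bits are a raw ASCII code.
    if (ch === undefined) ch = String.fromCharCode(parseInt(code, 2) & 127);
    text += ch;
  }
  return text;
}

console.log(decode('GCTATAATATCGCGCGTAGCGCATATCGATTAAT')); // I have a
```

Because no table code starts with twelve 1 bits, reading on to 19 bits is unambiguous for the fallback case.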
DNA sequence (5442 bytes)
Below is the DNA sequence which is generated for the example text.
GCTATAATATCGCGCGTAGCGCATATCGATTAATATTACGGCGCCGTATACGATCGGCGCTAGCATTAGCTAATCGGCATTAGCCGATCGTAATATCGATGCCGATTAATGCATGCTAATGCTAATGCTAATATTAATTAGCCGGCCGATTATATAATCGTAATTAATGCATTAGCATATTACGCGTACGGCCGATATGCATCGATCGCGTAATGCGCCGGCATCGGCCGATTAATATTAGCGCATTACGATGCGCTAATATGCCGCGTATATACGCGGCATCGCGATGCTATAATTATAGCCGATTAGCGCGCATTATAATATTAATGCGCTATACGTAATCGTAATTAGCTAATGCTATAATCGCGATGCGCCGATTAGCATGCGCTACGATGCCGCGATGCCGATGCATTACGCGATATGCGCGCGCTAATGCGCCGATGCATTACGATATATCGGCATTACGATCGGCCGATGCCGTAGCGCTACGATTAATATTACGGCGCCGATCGATGCCGCGATGCCGTAATATCGATGCGCTAGCATTAATCGCGTAGCTACGATGCGCCGCGGCATTAATTAATTAGCGCTACGATTATACGATGCCGCGATTACGGCATTATAATGCCGTACGCGTACGGCTAGCGCCGTAGCATATATTACGCGCGATATGCTAATGCGCCGTAGCGCCGCGATATCGATGCGCTAGCATTACGCGGCGCCGATGCATTACGATATATCGTAGCCGCGGCATCGTAATGCCGCGATATGCGCGCTACGCGCGTATAGCATATCGCGTAATTAATCGCGTAGCTAGCATCGATGCGCCGCGGCTATAATATCGTAATCGATTAATTAGCATATCGTACGTACGCGTACGCGTACGCGATTATACGTAATTAGCATTAATCGTAATCGATTATACGATGCCGCGATTACGGCATTATAATGCCGTACGCGTACGGCTAGCGCCGTAGCATATATCGCGTATATACGCGTAATATGCTAATCGATGCTACGGCCGTACGCGATTAGCTAGCATTAGCATCGATTAATCGTAATATCGATATCGTACGGCATGCATTATAGCCGCGTACGCGTACGCGATTATAATCGTAGCGCTACGTAATTATAATCGCGGCTAATCGATTAATGCATTACGATATATCGTAGCCGGCTAATGCATGCTAATATTAATTAGCCGGCCGATTACGCGCGATTACGGCATTACGGCTAATTAGCATATTACGTAATATCGATATCGGCGCGCATGCCGATGCATATATCGCGTATATACGTAGCGCGCTATAGCCGTAGCGCCGCGATATCGATGCGCTAGCATTACGCGCGATCGCGTATATACGTAGCGCGCCGGCATTATAATCGCGCGTATAATATTATAATGCGCCGATCGTAATTAGCGCTACGGCATGCGCTAGCATTAATTACGGCCGATTAGCGCCGTACGCGCGCGGCCGGCGCTAATATTAGCGCGCCGATGCTACGTACGTACGCGATTAATTACGGCCGATTAATGCATTACGATGCGCATCGATCGCGTATAATCGATGCTACGATCGCGTAGCATGCTATAGCTACGATCGCGTATATAATCGCGATATCGCGTAATTAATCGTAATGCATCGATTACGCGTAATATCGTAGCGCATTAATTACGGCATATCGTACGGCTACGATATCGATCGCGCGATATTAATTATAATCGCGTAGCATCGATGCGCTACGTACGATGCGCTAGCCGATATCGATTAATTAGCCGGCCGATCGCGCGATATTAGCATATTAATTATATAATTAATTAGCCGTAATTACGCGCGATGCTACGATGCCGATTACGATGCGCGCATATGCCGCGATATGCGCGCTATACGATCGCGTATACGATGCCGTAGCGCTAATCGCGATGCTATAATTAGCATATCGTAGCTACGATTAGCCGATCGCGATTACGGCATTAATTAGCGCGCATGCCGTATACG
CGCGATCGATTAATCGTAATTACGTAATGCCGGCCGGCATGCCGTACGGCCGATTAATATGCATGCTAATCGCGTACGCGTACGCGGCATCGTACGATGCATTAGCTACGATTAATCGTAATATCGCGTAATATGCTAATCGATGCTAATGCTAATGCTAATATTACGCGCGATCGGCATCGATGCCGATTAATGCGCTAGCGCCGGCGCATCGTACGGCATGCATCGTATAATTAATTATAATCGCGTAATTACGGCATTATAATGCCGTACGCGGCCGGCATTAATTACGGCCGATTAATATTACGGCGCCGATCGATGCCGCGATGCCGTAGCTAATATTAGCGCCGCGCGCGATTATAATCGCGTAATTAATCGCGTAGCATATTAATTATAATGCTAGCATATTAGCGCATCGATGCTATAATTAATCGTAATCGATGCCGATGCCGTACGGCATCGCGATGCCGATCGATGCATCGTAATATCGTACGGCTAATTAGCATGCTAATATCGCGTATAATCGATGCTAGCGCATTAATTAGCTAATTAGCATCGATCGCGTATATAATTAGCATATCGCGTAATTAATCGTAATGCATGCCGCGATATGCTATAATTATACGCGATGCCGTAGCATTATAGCATATGCTAGCATATCGTAGCCGCGTACGTAGCATCGATCGCGTATAATCGATCGGCATTAGCATTACGATCGTAATATGCATGCCGATCGCGTATAATGCCGTAATGCGCCGTAATTATAATCGCGTAATTATACGCGATGCCGATGCGCTAGCCGATTACGCGGCATCGTATAATTACGGCCGATTAATCGGCTATAATGCTAATATGCGCATCGATTACGTAGCCGTATACGCGGCCGTAGCGCATCGATGCCGCGTATATACGCGGCATCGCGATTAATGCATTAATGCATATATTATATAATGCGCCGATTAATATCGATATGCTACGATCGCGTAGCATGCTATAGCTAGCCGCGATATGCTATAATTAGCATATTATAATTAGCATATGCTATATAGCCGATCGTAATATGCGCATTATAATATGCGCGCGCGCTAGCATGCATGCTAATATGCTACGATCGCGTAGCATGCTATAGCTACGATTAATCGCGATTATAATCGCGGCTAATCGATCGGCCGGCTAGCATATTATAGCCGTATACGCGATTAGCGCGCATCGATTAATCGCGATTAATATGCCGTAATTATACGATCGGCGCTAATGCTAGCATATTACGCGATGCCGTAGCGCGCATATATCGCGTATAATGCCGTAATGCGCCGATCGTAATATCGTACGATCGTACGATATTACGGCATTATAATGCCGTACGCGGCCGGCATATCGTATAATATCGTAGCCGATTAATGCATTACGATATGCCGTAATTATAGCTACGGCCGATATCGGCGCATCGGCTAATATGCGCATGCGCCGTAATTATAATCGCGTAATATGCATGCCGGCCGTAGCATCGATCGTACGCGGCATGCGCATGCTACGCGCGGCCGTAGCATCGTAGCGCATCGATTAATGCATTAATGCATATATCGCGTATAATCGATGCGCTAGCCGTACGATCGGCATTACGTACGTAATGCATGCCGCGATATGCGCGCTACGCGCGTAATTACGGCCGCGTATACGCGCGATTATAATCGCGCGTATAATATCGCGTATAATCGATCGGCTAATTACGATCGATGCATGCATCGATTACGTAGCATTATAATCGCGTAATTACGGCGCCGATCGATGCCGCGATGCCGTAATATTAGCGCCGTAATTACGATGCGCCGATTACGGCCGATTACGTAGCATTAATTACGTACGGCCGTAGCATCGATCGCGCGATATATCGTATATATATACGTATAATCGGCTATAATATTACGTAGCCGTAATGCTACGCGCGCGTAATTATATAATATATGCGCTAGCATTACGATGCATATGCATGCCGCGTA
TAATTAATATGCTATAATATTACGCGTAATATGCATGCGCTAATTAGCATCGTACGTACGCGTACGTAGCGCGCTATAGCGCCGATATATGCTATAATATGCCGATATCGCGATGCGCATCGATCGCGTATATAATCGCGATATGCATGCGCATCGTACGGCTATATAATCGATCGGCATCGATGCCGATCGTAATCGTAATTATAATCGCGTAATTATACGCGATGCATTAATTACGTAGCTAATATTACGCGGCTAATATTAATCGGCGCTAGCCGTAATATCGATATGCGCGCCGTAGCATCGTACGTACGCGTACGCGATTAGCGCGCGCGCGCCGATTATAGCCGATATGCATCGATCGCGTATATACGCGTAATATCGATTACGTACGCGTATAATGCTAATGCTATACGATTAATCGTATAGCCGTAATCGATTAATGCATTAATGCATATATATGCGCGCGCTATACGCGTACGCGATATGCATGCCGATCGCGTATAATCGATGCATTAATTAGCTAATTAGCATCGATGCTAGCCGATGCATGCGCATTAATGCGCGCCGTAATTAGCGCGCGCATCGGCGCTACGATTACGCGTAATATGCTATAATATTAATATGCATGCTAATCGCGTACGCGTACGCGGCTAGCGCCGTAATTAGCGCCGCGGCATTACGATATTAGCGCTACGGCATGCGCTAGCCGTAATTAATTACGGCCGATTACGTAGCCGCGATGCCGTAATGCATGCTAATGCATGCGCGCCGCGATTAGCGCGCATGCCGTAATGCATGCTAATGCTAATATGCGCATCGATGCCGCGTATATACGCGGCATCGCGATCGCGTATAATCGATCGTACGGCATGCATTATAGCCGGCATTAATTAGCGCTACGGCGCATTAGCTATACGATATGCTAGCGCGCTAATTAATTAATATGCGCCGATGCCGGCATATCGTATAGCCGGCGCATCGATCGCGCGATATTATAATCGCGTAATATTAGCGCGCCGGCTACGTACGCGCGATGCGCATATTATAGCCGCGGCGCATCGATCGCGTATAATCGATGCCGATGCCGGCCGTACGCGATGCCGTAGCCGGCATATCGATGCGCTAGCATTATAATCGCGTAATATCGTAGCTAATTAATTAATTACGGCCGATTAATATTACGGCGCCGATCGATGCCGCGATGCCGTAGCTAATATTACGCGGCTAATATCGATTAGCGCATTAGCTACGATTAATCGGCGCTAGCCGTAGCTAATATTACGCGCGATCGGCGCATATGCGCGCCGATCGCGATTAGCATCGGCGCTAGCATGCCGTACGTACGCGTAATTAGCCGGCCGATTATACGATGCCGCGATATGCTATAATATCGTAGCCGTAGCTACGCGCGGCATCGCGTATACGCGCGCGTAGCTAAT
Demo
The snippet below includes some demonstration code.
let e =
s=>(n=0,C={},` e|ahinost|cdlr|.Ibfmuwy|',gpv|"k|;?ATqx|FHW|j|:`.split`|`.map((g,i)=>([...g].map(c=>C[c]=('00'+(n++).toString(2)).slice(-i-3)),n*=2)),s=s.replace(/./g,c=>C[c]||(524160|c.charCodeAt()).toString(2)),s.length&1&&(s+=0),s.replace(/../g,c=>'ATCG'[c=+`0b${c}`]+'TAGC'[c]))
let d =
s=>(n=N=0,C={},` e|ahinost|cdlr|.Ibfmuwy|',gpv|"k|;?ATqx|FHW|j|:`.split`|`.map((g,i)=>([...g].map(c=>(C[j=i+3]=(C[j]||{}))[n++]=c),n*=2)),s=s.replace(/../g,c=>''+((x='ATCG'.search(c[0]))>>1)+(x&1)),eval(`for(r='';N<s.length-1;){for(i=j=0;!(f=C[j]&&C[j][i])&&j++<19;i+=+s[N++]+i);r+=f||String.fromCharCode(i&127)}`))
function encode() {
var txt, dna;
if(txt = document.getElementsByTagName('textarea')[0].value) {
dna = e(txt);
document.getElementsByTagName('textarea')[0].value = '';
document.getElementsByTagName('textarea')[1].value = dna;
document.getElementsByTagName('div')[0].innerHTML = 'DNA length: ' + dna.length + ' (' + (dna.length / txt.length).toFixed(2) + ' nucleotides per character)';
}
}
function decode() {
var txt, dna;
if(dna = document.getElementsByTagName('textarea')[1].value) {
txt = d(dna);
document.getElementsByTagName('textarea')[0].value = txt;
document.getElementsByTagName('textarea')[1].value = '';
document.getElementsByTagName('div')[0].innerHTML = 'Text length: ' + txt.length;
}
}
textarea {font-size:10px;font-family:Arial;width:400px;height:70px}
<textarea>I have a friend who's an artist and has sometimes taken a view which I don't agree with very well. He'll hold up a flower and say "look how beautiful it is," and I'll agree. Then he says "I as an artist can see how beautiful this is but you as a scientist take this all apart and it becomes a dull thing," and I think that he's kind of nutty. First of all, the beauty that he sees is available to other people and to me too, I believe. Although I may not be quite as refined aesthetically as he is ... I can appreciate the beauty of a flower. At the same time, I see much more about the flower than he sees. I could imagine the cells in there, the complicated actions inside, which also have a beauty. I mean it's not just beauty at this dimension, at one centimeter; there's also beauty at smaller dimensions, the inner structure, also the processes. The fact that the colors in the flower evolved in order to attract insects to pollinate it is interesting; it means that insects can see the color. It adds a question: does this aesthetic sense also exist in the lower forms? Why is it aesthetic? All kinds of interesting questions which the science knowledge only adds to the excitement, the mystery and the awe of a flower. It only adds. I don't understand how it subtracts.</textarea><br>
<button onclick="encode()">Text -> DNA</button><button onclick="decode()">DNA -> Text</button><br><textarea></textarea><div></div>
Why is the encoder not included in the byte-count? – Leaky Nun – 2016-08-30T08:47:01.370
@LeakyNun I don't know, I guess I didn't think about both of them. It's been included now – Beta Decay – 2016-08-30T08:50:53.473
Any valid program in Brainfuck is an equivalent valid program in DNA#. – Leaky Nun – 2016-08-30T08:53:31.550
@LeakyNun I know but I still want to see it in DNA# – Beta Decay – 2016-08-30T08:54:12.693
Are only alphanumeric characters, dot, space, comma and quote valid, or is any printable ASCII character allowed in the input? – Sefa – 2016-08-30T09:00:08.373
If you're including the encoder in the score then you should remove the tag [tag:kolmogorov-complexity] – Peter Taylor – 2016-08-30T09:44:58.990
@PeterTaylor Oh, okay – Beta Decay – 2016-08-30T09:45:26.707
What's the maximum input length? – Mast – 2016-08-30T12:11:23.603
@Mast There isn't a maximum, as long as your program will handle it – Beta Decay – 2016-08-30T12:41:11.093
Any source for a DNA# interpreter? The linked one on esolang seems to be broken for me – Aaron – 2016-08-30T20:03:56.387
@Aaron Not that I know of, but I'll have a look around – Beta Decay – 2016-08-30T21:07:12.960
@βετѧΛєҫαγ: This challenge gives me an idea for another DNA-related one (namely, restriction mapping)! Unmodified, it would probably be too complicated for a classical code golf task, however. – Tim Čas – 2016-08-30T23:45:01.763
May the encoder and the decoder have some code in common? Or are they supposed to be separate standalone modules? – Arnauld – 2016-08-31T09:05:54.700
@Arnauld They have to be completely standalone – Beta Decay – 2016-08-31T09:07:04.017
I'm working on a simple Python interpreter for DNA#, although I've decided to make one small alteration to the language. Where's the best place to post the script? – Aaron – 2016-09-01T19:34:59.390
@Aaron Anywhere really. Github, Pastebin... – Beta Decay – 2016-09-01T19:36:50.263