Monkeys at a Typewriter

8

1

Sometimes when you press a key on a keyboard, the letter doesn't always display on screen. Whether this is because of a dodgy connection or otherwise, you have decided to write a script to control the probability of a letter displaying on screen when the corresponding key is pressed.

One day, you decide to buy a monkey and sit it down at a keyboard. Being curious, you decide to find out what the key probabilities are to help the monkey to write Hamlet in its entirety.

Your challenge is to calculate the probabilities for each character so that the passage is typed out in the least number of characters.

Here's a list of all of the characters you should consolidate:

qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM1234567890!"':;.()?,

Newlines and spaces are included in the above list. You must only use the characters in the list, discounting all other characters (remove them from the passage).

The program must be a program with the passage supplied via STDIN. The output must be to STDOUT.

Since this is a Rosetta Stone challenge, you must write as many programs in different languages as possible.

To win, you need to have the shortest code in the most number of languages.

Test case 1

Shall I compare thee to a summer's day?
Thou art more lovely and more temperate:
Rough winds do shake the darling buds of May,
And summer's lease hath all too short a date

Answer:

{
'\n': 0.017543859649122806,
' ': 0.16959064327485379,
'!': 0.0,
'"': 0.0,
"'": 0.011695906432748537,
'(': 0.0,
')': 0.0,
',': 0.0058479532163742687,
'.': 0.0,
'0': 0.0,
'1': 0.0,
'2': 0.0,
'3': 0.0,
'4': 0.0,
'5': 0.0,
'6': 0.0,
'7': 0.0,
'8': 0.0,
'9': 0.0,
':': 0.0058479532163742687,
';': 0.0,
'?': 0.0058479532163742687,
'A': 0.0058479532163742687,
'B': 0.0,
'C': 0.0,
'D': 0.0,
'E': 0.0,
'F': 0.0,
'G': 0.0,
'H': 0.0,
'I': 0.0058479532163742687,
'J': 0.0,
'K': 0.0,
'L': 0.0,
'M': 0.0058479532163742687,
'N': 0.0,
'O': 0.0,
'P': 0.0,
'Q': 0.0,
'R': 0.0058479532163742687,
'S': 0.0058479532163742687,
'T': 0.0058479532163742687,
'U': 0.0,
'V': 0.0,
'W': 0.0,
'X': 0.0,
'Y': 0.0,
'Z': 0.0,
'a': 0.08771929824561403,
'b': 0.0058479532163742687,
'c': 0.0058479532163742687,
'd': 0.046783625730994149,
'e': 0.093567251461988299,
'f': 0.0058479532163742687,
'g': 0.011695906432748537,
'h': 0.052631578947368418,
'i': 0.011695906432748537,
'j': 0.0,
'k': 0.0058479532163742687,
'l': 0.046783625730994149,
'm': 0.046783625730994149,
'n': 0.023391812865497075,
'o': 0.070175438596491224,
'p': 0.011695906432748537,
'q': 0.0,
'r': 0.052631578947368418,
's': 0.052631578947368418,
't': 0.058479532163742687,
'u': 0.029239766081871343,
'v': 0.0058479532163742687,
'w': 0.0058479532163742687,
'x': 0.0,
'y': 0.017543859649122806,
'z': 0.0
}

Test case 2

Four score and seven years ago our fathers brought forth on this continent a new nation, conceived in liberty, and dedicated to the proposition that all men are created equal.

Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure. We are met on a great battlefield of that war. We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live. It is altogether fitting and proper that we should do this.

Answer:

{
'\n': 0.0036036036036036037,
' ': 0.18018018018018017,
'!': 0.0,
'"': 0.0,
"'": 0.0,
'(': 0.0,
')': 0.0,
',': 0.010810810810810811,
'.': 0.0090090090090090089,
'0': 0.0,
'1': 0.0,
'2': 0.0,
'3': 0.0,
'4': 0.0,
'5': 0.0,
'6': 0.0,
'7': 0.0,
'8': 0.0,
'9': 0.0,
':': 0.0,
';': 0.0,
'?': 0.0,
'A': 0.0,
'B': 0.0,
'C': 0.0,
'D': 0.0,
'E': 0.0,
'F': 0.0018018018018018018,
'G': 0.0,
'H': 0.0,
'I': 0.0018018018018018018,
'J': 0.0,
'K': 0.0,
'L': 0.0,
'M': 0.0,
'N': 0.0018018018018018018,
'O': 0.0,
'P': 0.0,
'Q': 0.0,
'R': 0.0,
'S': 0.0,
'T': 0.0,
'U': 0.0,
'V': 0.0,
'W': 0.0036036036036036037,
'X': 0.0,
'Y': 0.0,
'Z': 0.0,
'a': 0.082882882882882883,
'b': 0.0054054054054054057,
'c': 0.025225225225225224,
'd': 0.03783783783783784,
'e': 0.10270270270270271,
'f': 0.016216216216216217,
'g': 0.023423423423423424,
'h': 0.041441441441441441,
'i': 0.057657657657657659,
'j': 0.0,
'k': 0.0,
'l': 0.027027027027027029,
'm': 0.0072072072072072073,
'n': 0.063063063063063057,
'o': 0.066666666666666666,
'p': 0.010810810810810811,
'q': 0.0018018018018018018,
'r': 0.050450450450450449,
's': 0.028828828828828829,
't': 0.093693693693693694,
'u': 0.010810810810810811,
'v': 0.014414414414414415,
'w': 0.014414414414414415,
'x': 0.0,
'y': 0.0054054054054054057,
'z': 0.0
}

See the theory here.

Leaderboard

C - 371 - Gerwin Dox
Java - 788 - Luminous

Beta Decay

Posted 2014-10-31T08:18:53.013

Reputation: 21 478

18English is not my mother tongue so maybe I'm lacking comprehension skills but after reading the challenge twice, I still don't understand what you are asking. – Michael M. – 2014-10-31T10:20:44.623

@Mig I think we have to find the probability that the input text is displayed when using only random characters. For instance, if the set of characters is a-z, we have 1 chance out of 26^4 that "toto" will be displayed with only random characters – Ethiraric – 2014-10-31T10:48:14.973

1From what I understand, we're supposed to map each printable character to a probability of that character not displaying when the corresponding key is pressed. The probabilities have to be tweaked so that when pressing keys at random (monkey on the keyboard) the (full?) text of Hamlet shows up as soon as possible. Did I get that right ? – karhell – 2014-10-31T10:51:20.523

2Are you asking us to count the number of times each character occurs and divide by the total number of characters, effectively giving the distribution of characters in the text? Obviously if the monkey typed with these probabilities he would have greatest chances of success. If that's not it, I have no idea what you are asking for. – Level River St – 2014-10-31T11:27:00.490

@karhell Yes, that is correct – Beta Decay – 2014-10-31T12:17:37.633

3Program or function? Input format: argument,stdin or file? Output format: array or stdout? And most importantly, do we have to include only the required symbols, or can we include everything from !(33) to z(122) or ~ (126)? When we divide by the total number of characters, is it OK to simply divide by the length of the input, or do we have to exclude the characters that are not in the list (space etc.)? – Level River St – 2014-10-31T12:36:56.877

2What exactly is the winning criteria? There have been a few different scoring mechanisms for [rosetta-stone] in the past, and there isn't a default as far as I'm aware. – Geobits – 2014-10-31T12:59:52.283

1Going off of what karhell said, we're suppose to imagine a monkey randomly typing on a keyboard. Now considering that sometimes the pressed keys don't display, what is the shortest amount of pressed keys the monkey would press in order to write the given string? Are the probabilities static? Should I assume the monkey wouldn't actually type out the given string even if it's supposedly typing at random? – Luminous – 2014-10-31T15:36:04.150

How do you take the hold off? I wanna submit an answer... :/ – Luminous – 2014-10-31T18:40:57.733

@Luminous Above my username at the bottom of the question, you should see the reopen link. Click that to vote to take the hold off. – Beta Decay – 2014-10-31T19:29:07.423

I'm sorry I can't. Takes 3000 rep to cast a reopen vote. – Luminous – 2014-10-31T19:47:45.480

@luminous Actually, it's 500, but oh well – Beta Decay – 2014-10-31T20:14:32.187

I think showing the expected output for the two examples would help clarify the question. – hmatt1 – 2014-10-31T21:28:00.407

Your second test case gives lower case 'e' a probability greater than 1. – Nathaniel – 2014-11-06T09:43:36.677

@Nathaniel Huh? How? – Beta Decay – 2014-11-06T10:55:48.577

@BetaDecay I have no idea, but it's right there in the output you quoted: 'e': 0.10270270270270271,. – Nathaniel – 2014-11-06T13:43:09.617

@Nathaniel That isn't more than one... – Beta Decay – 2014-11-06T14:44:45.913

@Nathaniel 100 x 0.10270270270270271 = 10.270270... That's around 10%. – Luminous – 2014-11-06T15:19:14.560

1I am confused, on the test output it seems like those are ratios of the specific characters in the input text. But the question is about best probabilities for minimum number of key presses (you say characters above but this does not make sense to me as number of characters is constant). So even if correlating occurrence ratio with those probabilities is good idea surely normalizing them such that the highest probability is 1 will be better. – nutki – 2014-11-06T15:22:46.197

Also why the leader board lists code lengths while the rules are rosetta-stone? The two first answers don't even do the same thing. – nutki – 2014-11-06T15:24:58.727

@nutki Re. the leaderboard, it's the number of languages you have the shortest code in. Currently the leaderboard is showing who has the shortest code in each language. – Beta Decay – 2014-11-06T17:38:01.897

Oops, I somehow consistently misread that as 1.027... Sorry for my silly mistake. – Nathaniel – 2014-11-06T23:37:44.090

Answers

1

Java 788 743

Edit

Found so much I could golf down. Replaced whole words w/ letters, made method for putting keys into the HashMap, and moved all integer variables to one declaration.


Not expecting this to beat any records, but oh well. This version REQUIRES you to enter how many lines of input your giving it since java likes to block on any methods that takes input.

So example would be:

4
Shall I compare thee to a summer's day?
Thou art more lovely and more temperate:
Rough winds do shake the darling buds of May,
And summer's lease hath all too short a date

import java.util.*;class A{static HashMap<Character,Double>m=new HashMap<Character,Double>();public static void main(String[]g){Scanner s=new Scanner(System.in);int n,k,e,i='@';for(;i++<'[';)g((char)i);for(i='`';i++<'{';)g((char)i);for(i='/';i++<':';)g((char)i);char[]r={' ','!','"','\'',':',';','.','(',')','?',','};p(r);e=s.nextInt();s.nextLine();String l="";n=0;k=0;while(k<e){l=s.nextLine();for(i=0;i<l.length();i++){char c=l.charAt(i);m.put(c,m.get(c)+1);n++;}k++;n++;}m.put('\n',(double)(e-1));for(Iterator b=m.entrySet().iterator();b.hasNext();){Map.Entry p=(Map.Entry)b.next();System.out.println(p.getKey()+" = "+((double)p.getValue())/(n-1));b.remove();}}static void p(char[]a){for(char b:a)g(b);}static void g(char a){m.put(a,0.0);}}

UnGolfed

import java.util.*;
class A{
static HashMap<Character,Double>m=new HashMap<Character,Double>();
public static void main(String[]g){
    Scanner s=new Scanner(System.in);
    int n,k,e,i='@';
    for(;i++<'[';)g((char)i);
    for(i='`';i++<'{';)g((char)i);
    for(i='/';i++<':';)g((char)i);
    char[]r={' ','!','"','\'',':',';','.','(',')','?',','};
    p(r);
    e=s.nextInt();
    s.nextLine();
    String l="";
    n=0;
    k=0;
    while(k<e){
        l=s.nextLine();
        for(i=0;i<l.length();i++){
            char c=l.charAt(i);
            m.put(c,m.get(c)+1);
            n++;
        }
        k++;
        n++;
    }
    m.put('\n',(double)(e-1));
    for(Iterator b=m.entrySet().iterator();b.hasNext();){
        Map.Entry p=(Map.Entry)b.next();
        System.out.println(p.getKey()+" = "+((double)p.getValue())/(n-1));
        b.remove();
    }
}
static void p(char[]a){
    for(char b:a)g(b);
}
static void g(char a){
    m.put(a,0.0);
}

}

Luminous

Posted 2014-10-31T08:18:53.013

Reputation: 309

0

C, 324 bytes (can someone check that for me?)

Maybe I need to resharpen my method, it's a bit long.

main(){char*d="qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM1234567890!\"':;.()?,\n ",*p,*s,*c=calloc(1,1000),*r=c;int n=-1;do{scanf("%c",++r);for(p=d;*p;p++)n+=*p==*r?1:0;}while(*((short*)(r-1))!=2570);p=d;for(;*p;p++){int q=0;for(s=c;*(s+1);s++)q+=(*s==*p?1:0);printf("'%c':%f\n",*p-10?*p:244,(float)q/n);}free(c);}

Also, I can prove that it works.

Gerwin

Posted 2014-10-31T08:18:53.013

Reputation: 126

I count 371 bytes – Beta Decay – 2014-11-06T19:26:11.850

Microsoft word says 321(without spaces, of course) – Gerwin – 2014-11-06T21:59:46.443

Well, you need some spaces: intn=-1 does not work. – urzeit – 2014-11-07T14:01:49.147

I only need 3 of them :) – Gerwin – 2014-11-07T18:26:20.747