8
1
Sometimes when you press a key on a keyboard, the letter doesn't always display on screen. Whether this is because of a dodgy connection or otherwise, you have decided to write a script to control the probability of a letter displaying on screen when the corresponding key is pressed.
One day, you decide to buy a monkey and sit it down at a keyboard. Being curious, you decide to find out what the key probabilities are to help the monkey to write Hamlet in its entirety.
Your challenge is to calculate the probabilities for each character so that the passage is typed out in the least number of characters.
Here's a list of all of the characters you should consolidate:
qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM1234567890!"':;.()?,
Newlines and spaces are included in the above list. You must only use the characters in the list, discounting all other characters (remove them from the passage).
The program must be a program with the passage supplied via STDIN. The output must be to STDOUT.
Since this is a Rosetta Stone challenge, you must write as many programs in different languages as possible.
To win, you need to have the shortest code in the most number of languages.
Test case 1
Shall I compare thee to a summer's day?
Thou art more lovely and more temperate:
Rough winds do shake the darling buds of May,
And summer's lease hath all too short a date
Answer:
{
'\n': 0.017543859649122806,
' ': 0.16959064327485379,
'!': 0.0,
'"': 0.0,
"'": 0.011695906432748537,
'(': 0.0,
')': 0.0,
',': 0.0058479532163742687,
'.': 0.0,
'0': 0.0,
'1': 0.0,
'2': 0.0,
'3': 0.0,
'4': 0.0,
'5': 0.0,
'6': 0.0,
'7': 0.0,
'8': 0.0,
'9': 0.0,
':': 0.0058479532163742687,
';': 0.0,
'?': 0.0058479532163742687,
'A': 0.0058479532163742687,
'B': 0.0,
'C': 0.0,
'D': 0.0,
'E': 0.0,
'F': 0.0,
'G': 0.0,
'H': 0.0,
'I': 0.0058479532163742687,
'J': 0.0,
'K': 0.0,
'L': 0.0,
'M': 0.0058479532163742687,
'N': 0.0,
'O': 0.0,
'P': 0.0,
'Q': 0.0,
'R': 0.0058479532163742687,
'S': 0.0058479532163742687,
'T': 0.0058479532163742687,
'U': 0.0,
'V': 0.0,
'W': 0.0,
'X': 0.0,
'Y': 0.0,
'Z': 0.0,
'a': 0.08771929824561403,
'b': 0.0058479532163742687,
'c': 0.0058479532163742687,
'd': 0.046783625730994149,
'e': 0.093567251461988299,
'f': 0.0058479532163742687,
'g': 0.011695906432748537,
'h': 0.052631578947368418,
'i': 0.011695906432748537,
'j': 0.0,
'k': 0.0058479532163742687,
'l': 0.046783625730994149,
'm': 0.046783625730994149,
'n': 0.023391812865497075,
'o': 0.070175438596491224,
'p': 0.011695906432748537,
'q': 0.0,
'r': 0.052631578947368418,
's': 0.052631578947368418,
't': 0.058479532163742687,
'u': 0.029239766081871343,
'v': 0.0058479532163742687,
'w': 0.0058479532163742687,
'x': 0.0,
'y': 0.017543859649122806,
'z': 0.0
}
Test case 2
Four score and seven years ago our fathers brought forth on this continent a new nation, conceived in liberty, and dedicated to the proposition that all men are created equal.
Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure. We are met on a great battlefield of that war. We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live. It is altogether fitting and proper that we should do this.
Answer:
{
'\n': 0.0036036036036036037,
' ': 0.18018018018018017,
'!': 0.0,
'"': 0.0,
"'": 0.0,
'(': 0.0,
')': 0.0,
',': 0.010810810810810811,
'.': 0.0090090090090090089,
'0': 0.0,
'1': 0.0,
'2': 0.0,
'3': 0.0,
'4': 0.0,
'5': 0.0,
'6': 0.0,
'7': 0.0,
'8': 0.0,
'9': 0.0,
':': 0.0,
';': 0.0,
'?': 0.0,
'A': 0.0,
'B': 0.0,
'C': 0.0,
'D': 0.0,
'E': 0.0,
'F': 0.0018018018018018018,
'G': 0.0,
'H': 0.0,
'I': 0.0018018018018018018,
'J': 0.0,
'K': 0.0,
'L': 0.0,
'M': 0.0,
'N': 0.0018018018018018018,
'O': 0.0,
'P': 0.0,
'Q': 0.0,
'R': 0.0,
'S': 0.0,
'T': 0.0,
'U': 0.0,
'V': 0.0,
'W': 0.0036036036036036037,
'X': 0.0,
'Y': 0.0,
'Z': 0.0,
'a': 0.082882882882882883,
'b': 0.0054054054054054057,
'c': 0.025225225225225224,
'd': 0.03783783783783784,
'e': 0.10270270270270271,
'f': 0.016216216216216217,
'g': 0.023423423423423424,
'h': 0.041441441441441441,
'i': 0.057657657657657659,
'j': 0.0,
'k': 0.0,
'l': 0.027027027027027029,
'm': 0.0072072072072072073,
'n': 0.063063063063063057,
'o': 0.066666666666666666,
'p': 0.010810810810810811,
'q': 0.0018018018018018018,
'r': 0.050450450450450449,
's': 0.028828828828828829,
't': 0.093693693693693694,
'u': 0.010810810810810811,
'v': 0.014414414414414415,
'w': 0.014414414414414415,
'x': 0.0,
'y': 0.0054054054054054057,
'z': 0.0
}
See the theory here.
Leaderboard
C - 371 - Gerwin Dox
Java - 788 - Luminous
18English is not my mother tongue so maybe I'm lacking comprehension skills but after reading the challenge twice, I still don't understand what you are asking. – Michael M. – 2014-10-31T10:20:44.623
@Mig I think we have to find the probability that the input text is displayed when using only random characters. For instance, if the set of characters is a-z, we have 1 chance out of 26^4 that "toto" will be displayed with only random characters – Ethiraric – 2014-10-31T10:48:14.973
1From what I understand, we're supposed to map each printable character to a probability of that character not displaying when the corresponding key is pressed. The probabilities have to be tweaked so that when pressing keys at random (monkey on the keyboard) the (full?) text of Hamlet shows up as soon as possible. Did I get that right ? – karhell – 2014-10-31T10:51:20.523
2Are you asking us to count the number of times each character occurs and divide by the total number of characters, effectively giving the distribution of characters in the text? Obviously if the monkey typed with these probabilities he would have greatest chances of success. If that's not it, I have no idea what you are asking for. – Level River St – 2014-10-31T11:27:00.490
@karhell Yes, that is correct – Beta Decay – 2014-10-31T12:17:37.633
3Program or function? Input format: argument,stdin or file? Output format: array or stdout? And most importantly, do we have to include only the required symbols, or can we include everything from
!(33) toz(122) or~(126)? When we divide by the total number of characters, is it OK to simply divide by the length of the input, or do we have to exclude the characters that are not in the list (space etc.)? – Level River St – 2014-10-31T12:36:56.8772What exactly is the winning criteria? There have been a few different scoring mechanisms for [rosetta-stone] in the past, and there isn't a default as far as I'm aware. – Geobits – 2014-10-31T12:59:52.283
1Going off of what karhell said, we're suppose to imagine a monkey randomly typing on a keyboard. Now considering that sometimes the pressed keys don't display, what is the shortest amount of pressed keys the monkey would press in order to write the given string? Are the probabilities static? Should I assume the monkey wouldn't actually type out the given string even if it's supposedly typing at random? – Luminous – 2014-10-31T15:36:04.150
How do you take the hold off? I wanna submit an answer... :/ – Luminous – 2014-10-31T18:40:57.733
@Luminous Above my username at the bottom of the question, you should see the
reopenlink. Click that to vote to take the hold off. – Beta Decay – 2014-10-31T19:29:07.423I'm sorry I can't. Takes 3000 rep to cast a reopen vote. – Luminous – 2014-10-31T19:47:45.480
@luminous Actually, it's 500, but oh well – Beta Decay – 2014-10-31T20:14:32.187
I think showing the expected output for the two examples would help clarify the question. – hmatt1 – 2014-10-31T21:28:00.407
Your second test case gives lower case 'e' a probability greater than 1. – Nathaniel – 2014-11-06T09:43:36.677
@Nathaniel Huh? How? – Beta Decay – 2014-11-06T10:55:48.577
@BetaDecay I have no idea, but it's right there in the output you quoted:
'e': 0.10270270270270271,. – Nathaniel – 2014-11-06T13:43:09.617@Nathaniel That isn't more than one... – Beta Decay – 2014-11-06T14:44:45.913
@Nathaniel
100 x 0.10270270270270271 = 10.270270...That's around 10%. – Luminous – 2014-11-06T15:19:14.5601I am confused, on the test output it seems like those are ratios of the specific characters in the input text. But the question is about best probabilities for minimum number of key presses (you say characters above but this does not make sense to me as number of characters is constant). So even if correlating occurrence ratio with those probabilities is good idea surely normalizing them such that the highest probability is 1 will be better. – nutki – 2014-11-06T15:22:46.197
Also why the leader board lists code lengths while the rules are rosetta-stone? The two first answers don't even do the same thing. – nutki – 2014-11-06T15:24:58.727
@nutki Re. the leaderboard, it's the number of languages you have the shortest code in. Currently the leaderboard is showing who has the shortest code in each language. – Beta Decay – 2014-11-06T17:38:01.897
Oops, I somehow consistently misread that as 1.027... Sorry for my silly mistake. – Nathaniel – 2014-11-06T23:37:44.090