Write a program or function that estimates the Shannon entropy of a given string.
If a string has n characters, d distinct characters, $x_i$ is the i-th distinct character, and $P(x_i)$ is the probability of that character occurring in the string, then our Shannon entropy estimate for that string is given by:

$$H = -n\sum_{i=1}^{d} P(x_i)\log_2 P(x_i)$$
For the estimation in this challenge we assume that the probability of a character occurring in a string is simply the number of times it occurs divided by the total number of characters.
Your answer must be accurate to at least 3 digits after the period.
Test cases:
"This is a test.", 45.094
"00001111", 8.000
"cwmfjordbankglyphsvextquiz", 122.211
" ", 0.0
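
For reference, here is a minimal ungolfed sketch in Python of the estimate described above. It assumes the total-bits form (the per-character entropy multiplied by n), which is what the test cases imply. The function name `shannon_entropy_estimate` is made up for illustration and is not part of the challenge.

```python
from collections import Counter
from math import log2


def shannon_entropy_estimate(s: str) -> float:
    """Estimate the total Shannon entropy of s in bits (hypothetical helper)."""
    n = len(s)
    counts = Counter(s)  # occurrences of each distinct character
    # With P(x) = c / n, the term -n * P(x) * log2(P(x)) simplifies to
    # c * log2(n / c), so the estimate is a sum over the character counts.
    return sum(c * log2(n / c) for c in counts.values())


if __name__ == "__main__":
    for text in ["This is a test.", "00001111", "cwmfjordbankglyphsvextquiz", " "]:
        print(repr(text), format(shannon_entropy_estimate(text), ".3f"))
```

A string with only one distinct character gives 0, since log2(1) = 0 for its single term.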
Opposed to my usual challenges, this one looks complicated, but is actually quite simple :) – orlp – 2016-04-25T17:28:45.297
Related: http://codegolf.stackexchange.com/q/24316 – msh210 – 2016-04-25T17:56:39.067
Is it safe to assume printable ASCII for the input string? – AdmBorkBork – 2016-04-25T18:00:07.240
@TimmyD No. Any string that your language's string type supports. – orlp – 2016-04-25T18:02:34.223
Unfortunately, Mathematica's Entropy counts bits per character, not total for the string; oh well... – 2012rcampion – 2016-04-26T02:47:48.203