Morse code to standard output

13

2

This question involves taking input in Morse code as . (period) and - (minus symbol), with spaces to separate the input. Your task is to convert the code to standard output. You can assume that the only input contains character symbols found in the International Morse Code alphabet, found here: http://en.wikipedia.org/wiki/Morse_code#Letters.2C_numbers.2C_punctuation.

All output should use lowercase letters. A double space should be interpreted as a word space.

Sample input:

. -..- .- -- .--. .-.. . .-.-.-  ... --- ...

Output:

example. sos

Shortest code after two weeks wins.

user10766

Posted 2014-01-09T04:14:21.530

Reputation:

You say only 'character symbols' is that characters and symbols? – Sinkingpoint – 2014-01-09T04:59:32.393

@Quirliom All the "symbols" in that link are characters. Anything that you can put in a String is a character (well, basically). However, that part of the question is basically saying that every bit of morse will be valid. – Justin – 2014-01-09T05:03:53.793

@Quirliom Yes, every Morse 'character', such as .- for 'a' and . for 'e' is valid. No non-Morse characters need be handled. – None – 2014-01-09T05:58:08.117

What about letter space and word space ? One space for the former and two (or more) for the latter ? – Paul R – 2014-01-09T10:30:48.010

Slighly (un)related: http://stackoverflow.com/questions/1352587/code-golf-morse-code

– javatarz – 2014-01-09T12:07:57.793

@PaulR Yes, that sounds good. – None – 2014-01-09T15:59:49.623

Answers

8

Mathematica 62

Mathematica allows us to cheat

f=ToLowerCase@StringDrop[WolframAlpha[". .- "<>#,"Result"],2]&

f@"."
f@". -..- .- -- .--. .-.. . .-.-.-"
f@".... .- ...- .  -.-- --- ..-  -- --- --- . -..  - --- -.. .- -.-- ..--.."

e

example.

have you mooed today?

The first two symbols . and .- are necessary to interpret small codes correctly.

ybeltukov

Posted 2014-01-09T04:14:21.530

Reputation: 1 841

This is missing the conversion to lower case. – Peter Taylor – 2014-01-10T21:48:28.100

@PeterTaylor It can be easily modified to f=ToLowerCase@StringDrop[WolframAlpha[". .- "<>#,"Result"],2]& for lower case. – ybeltukov – 2014-01-10T22:44:09.270

Doesn't using the Wolfram Alpha api require an application id? If so, shouldn't that add to the character count? Nevertheless very clever solution. – Björn Lindqvist – 2014-01-13T13:34:49.760

@BjörnLindqvist Just evaluate exactly this command in Mathematica, it forks fine. – ybeltukov – 2014-01-13T14:50:42.697

23

Drat, I was hoping to get here before the GolfScripters arrived :-(

Anyhoo...

C: 228 characters:

char n,t,m[9],*c=" etianmsurwdkgohvf l pjbxcyzq  54 3   2& +    16=/   ( 7   8 90    $       ?_    \"  .    @   '  -        ;! )     ,    :";
main(){while(scanf("%s",m)>0){for(t=m[6]=n=0;m[n];n++)t+=t+1+(m[n]&1);putchar(c[t]);}}

I thought I'd add an explanation of how this works.

The input data is parsed according to the tree data in *c, which can be expanded as follows (using · to represent a vacant node):

                     dot <-- (start) --> dash
                e                               t
        i               a               n               m
    s       u       r       w       d       k       g       o
  h   v   f   ·   l   ·   p   j   b   x   c   y   z   q   ·   ·
 5 4 · 3 · · · 2 & · + · · · · 1 6 = / · · · ( · 7 · · · 8 · 9 0
····$·······?_····"··.····@···'··-········;!·)·····,····:·······

Starting at the top of the tree, work your way down while moving to the left for a dot and to the right for a dash. Then output whatever character you happen to be at when the input string ends (i.e., when a whitespace character is encountered). So for example, three dots and a dash will take you to v via e, i and s. Instead of explicitly checking for dots (ASCII \x2e) and dashes (ASCII \x2d), we only need to check the last bit (m[n]&1), which is 0 for . and 1 for -.

Six rows are enough to encode everything except $, which has 7 dot/dashes: ...-..-, but since the input data is guaranteed to be valid, this can easily be fixed by truncating the input at 6 characters (m[6]=0) and interpreting ...-.. as $ instead. We can also cut away the last 7 bytes from the tree data, since they are all empty and aren't needed if the input is valid.

r3mainer

Posted 2014-01-09T04:14:21.530

Reputation: 19 135

1Nice trick to discard the last character of 6-character codes and shorten the lookup table. – Peter Taylor – 2014-01-09T11:48:36.507

2I am up voting as much for the clarity of the discussion as for the quality of the algorithm. Good work. – Michael Stern – 2014-01-09T16:39:47.413

See if you could shave off a few chars by processing character-by-character instead of reading a whole string in. c could be inlined. Perhaps you could use modulo & an offset to try to squish the higher values together; this is what I do in my solution. Anyway, nice work! – FireFly – 2014-01-13T13:36:31.303

8

GolfScript (116 113 97 chars)

This includes non-printable characters used in a lookup table, so I'm giving it as xxd output:

0000000: 6e2d 2720 272f 7b60 7b5c 6261 7365 2035
0000010: 3925 2210 a9cd 238d 57aa 8c17 d25c d31b
0000020: 432d 783e 277a 3823 e146 e833 6423 23ac
0000030: e72a 39d5 021c 4e33 3b30 3138 dc51 2044
0000040: 3aa7 d001 df4b 2032 333f 36ae 51c3 223d
0000050: 7d2b 5b35 2d34 5d2f 2b32 3333 257d 256e
0000060: 2b

This decodes to a program equivalent to

n-' '/{`{\base 59%"\x10\xA9\xCD#\x8DW\xAA\x8C\x17\xD2\\\xD3\eC-x>'z8#\xE1F\xE83d##\xAC\xE7*9\xD5\x02\x1CN3;018\xDCQ D:\xA7\xD0\x01\xDFK 23?6\xAEQ\xC3"=}+[5-4]/+233%}%n+

which is essentially

n-' '/{`{\base 59%"MAGIC STRING"=}+[5-4]/+233%}%n+

This uses a (non-minimal) perfect hash based on the core idea of An optimal algorithm for generating minimal perfect hash functions; Czech, Havas and Majewski; 1992. Their basic idea is that you use two hash functions, f1 and f2, together with a lookup table g, and the perfect hash is (g[f1(str)] + g[f2(str)]) % m (where m is the number of strings we wish to distinguish); the clever bit is the way they build g. Consider all of the values f1(str) and f2(str) for strings str of interest as nodes in an undirected graph, and add an edge between f1(str) and f2(str) for each string. They require not only that each edge be distinct, but that the graph be acyclic; then it is just a DFS to assign weights to the nodes (i.e. to populate the lookup table g) such that each edge has the required sum.

Czech et al generate random functions f1 and f2 which are expressed via lookup tables, but that's clearly no good: I searched for a suitable hash using simple base conversions with two distinct bases from -10 to 9. I also relaxed the acyclic requirement. I didn't want to assign the strings to values from 0 to 54, but to the corresponding ASCII codes, so rather than taking (g[f1(str)] + g[f2(str)]) % m I wanted (g[f1(str)] + g[f2(str)]) % N for some N > 'z'. But that allows freedom to try various N and see whether any of them allow a valid lookup table g, regardless of whether there are cycles. Unlike Czech et al I don't care if the search for the perfect hash function is O(n^4).

The graph generated by -4base and 5base mod 59 is:

Graph rendered by dot with some minor tweaks

which is fairly nice apart from the biggest connected component, which has three cycles of length 1. We have to go up to N=233 before we can find a g which is consistent.

Peter Taylor

Posted 2014-01-09T04:14:21.530

Reputation: 41 901

Other possible encodings for the lookup table: difference encoding isn't going to help, because there isn't the structure. It might be possible to exploit the non-repetition of values by encoding as a permutation, but either the gaps need to be handled separately (54 output characters => 30 bytes of entropy, plus decoding; the runs need at least 15 bytes if encoded as a straight base conversion; might just be possible to improve on the current 92 bytes total) or we're permuting 138 items (more than 98 bytes of entropy, plus decoding). – Peter Taylor – 2014-01-10T16:40:38.020

Since it's a non-prefix code we can't easily try to palm the hard work off to a zlib implementation. – Peter Taylor – 2014-01-10T16:41:32.650

4

R, 145 bytes

Translated a dot to a 2, a dash to a 1 and interpreting the number in ternary and taking the mod 89, which gives a unique number we can use in a hash table. The presence of a 13 (111 base-3) means adding 1 because ASCII 13 doesn't work in TIO.

cat(c(letters,0:9,".")[match(strtoi(chartr(".-","12",scan(,"",t=scan(,""))),3)%%89+1,utf8ToInt('DG,)62	5N*EHMAI.%"!4=@'))],sep='')

Try it online!

R, 236 bytes (non-competing)

This is not going to be competitive, but it lets us show off something interesting in R: store the Morse code tree inside a quoted language structure m and retrieve it from the code of dots and dashes very simply using the fact that [[ can be applied recursively to lists. For example m[[c(2,2,3,2)]] retrieves dot, dot, dash, dot or "f".

m=quote(.(e(i(s(h(5,4),v(,3)),u(f,M(,2))),a(r(l,.(.(,.),)),w(p,j(,1)))),t(n(d(b(6),x),k(c,y)),m(g(z(7),q),o(D(8),S(9,0))))))
for(w in scan(,"",t=scan(,"")))
cat(chartr("MDS","-. ","if"(is.symbol(z<-m[[(utf8ToInt(w)==45)+2]]),z,z[[1]])))

Try it online!

J.Doe

Posted 2014-01-09T04:14:21.530

Reputation: 2 379

4

C, 169 chars

I couldn't find a better hash function..

(I posted the unminified code but counted it minified; to minify just do :%s/ //g | %j! in vim, then put the space in the string literal back.)

c, v = 1;

main() {
  while (c = getchar(), ~c)
    v = c < 33? putchar(
      "& etianmsurwdkgohvf.l.pjbxcyzq..54.3.;!2).+...,16=/:..(.7.?_8.9o\"...$...@...'..-"[v < 64? (v != 40)*v : v % 51 + 33]
    ), 1 : v * 2 + c % 2;
}

Test run

(morse.in is just the whole alphabet in morse on separate lines):

% clang morse.c && ./a.out </tmp/morse.in
abcdefghijklmnopqrstuvwxyzO123456789.,?'!/()&:;=+-_"$@
% ./a.out <<<'. -..- .- -- .--. .-.. . .-.-.-  ... --- ...'
example. sos

Explanation

This one is fairly straightforward. c < 33 finds a whitespace/separator character (, \n, EOF, ...). c % 2 translates a dot or dash into a bit. The idea is to create a unique number for each character simply by interpreting it as a binary number (after prefixing it with 1 to deal with the variable length) (this interpretation is the v*2 + c%2 part). I then get a 137-char LUT, which I compacted by hashing the resulting value (v < 64? v : v % 51 + 33, constants found via trial-and-error and by looking at the distribution and try to find a huge gap). Unfortunately this hash function has a single collision, which is why I have to special-case the 40 → '&' mapping.

FireFly

Posted 2014-01-09T04:14:21.530

Reputation: 7 107

1

Powershell, 193 bytes

$n=1
-join("$args "|% t*y|%{if($_-32){$n=$n*2+($_-ne'.')}else{("  etianmsurwdkgohvf l pjbxcyzq  54 3   2& +~16=/   ( 7   8 90~~~?~ `"  .~@   '  -~~;! )~ ,~:~~~~$"-replace'~','    ')[$n]
$n=1}})

Less Golfed Test script:

$f = {

$n=1
-join(
    "$args "|% t*y|%{
        if($_-32){
            $n=$n*2+($_-ne'.')
        }else{
            ("  etianmsurwdkgohvf l pjbxcyzq  54 3   2& +~16=/   ( 7   8 90~~~?~ `"  .~@   '  -~~;! )~ ,~:~~~~$"-replace'~','    ')[$n]
            $n=1
        }
    }
)

}

@(
    ,("example. sos",". -..- .- -- .--. .-.. . .-.-.-  ... --- ...")
    ,("0123456789abcdefghijklmnopqrstuvwxyz","----- .---- ..--- ...-- ....- ..... -.... --... ---.. ----. .- -... -.-. -.. . ..-. --. .... .. .--- -.- .-.. -- -. --- .--. --.- .-. ... - ..- ...- .-- -..- -.-- --..")
    ,("hello world", ".... . .-.. .-.. ---  .-- --- .-. .-.. -..")
) | % {
    $expected,$s = $_
    $result = &$f $s
    "$($result-eq$expected): $result"
}

Output:

True: example. sos
True: 0123456789abcdefghijklmnopqrstuvwxyz
True: hello world

mazzy

Posted 2014-01-09T04:14:21.530

Reputation: 4 832

0

JavaScript (165 bytes, only implementing four planes.)

n=''.replace(/\./g,1).replace(/-/g,0).split(' ')
l='|te|mnai|ogkdwrus|cöqzycxbjpälüfvh'.split('|')
r=''
for(i in n){r+=l[n[i].length][parseInt(n[i],2)]}
alert(r)

The input should be assigned to n, execute the following code to get the output:

n='. -..- .- -- .--. .-.. .'.replace(/\./g,1).replace(/-/g,0).split(' ')
l='|te|mnai|ogkdwrus|cöqzycxbjpälüfvh'.split('|')
r=''
for(i in n) {r+=l[n[i].length][parseInt(n[i],2)]}
alert(r)

aularon

Posted 2014-01-09T04:14:21.530

Reputation: 109

Not only does this appear to be an incomplete implementation, but it doesn't even work. Fiddle + Chrome gives error Cannot read property '42' of undefined, and IdeOne also reports an error (although without a useful message). – Peter Taylor – 2014-01-10T21:47:56.287

Try fixing it :) – Timtech – 2014-01-10T22:46:00.927

@PeterTaylor It is stated that it only support four planes, i.e. uptill 4 characters long morse codes, thus it won't accept . -..- .- -- .--. .-.. . .-.-.- as input, as last code is a 6 chars long. In the example script I omit it and go with . -..- .- -- .--. .-.., which alerts(example). – aularon – 2014-01-10T22:51:47.243

Here is a fiddle with the second block code: http://jsfiddle.net/aularon/AHY4e/1/

– aularon – 2014-01-10T22:58:22.043