How fast can I say your program?

26

2

I recently decided to download some dictation software, in order to help with my writing. However, it doesn't work very well when I'm coding, since I have to change from saying words to symbols and back again. It's even worse when I'm coding in an esoteric language which is all symbols.

In order to make my use of the dictation program more consistent, I decided to switch it over to character mode, where I just say the name of each character instead. Problem solved! Though this does delay my novel's release date a little bit...

So, assuming that the longer the name of a character, the longer it takes to say, how long will it take me to spell out some of my programs/sentences?

Specifications

Given a string consisting of only printable ASCII, return the sum of the lengths of each character's Unicode name. For example, / is called SOLIDUS, which has 7 characters, and A is LATIN CAPITAL LETTER A, which has 22 characters.

But remember, I have to say your programs out loud to execute them, so their score will be based on how long it takes me to say them, i.e. as the sum of the lengths of each character's unicode name.
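As a minimal, ungolfed sketch of the task (this is not the reference program mentioned in the rules), Python's standard `unicodedata` module can look up each name. The function name `spoken_length` is only an illustrative choice.

```python
import unicodedata


def spoken_length(text: str) -> int:
    """Sum of the lengths of the Unicode names of every character in text."""
    return sum(len(unicodedata.name(ch)) for ch in text)


print(spoken_length("/"))  # SOLIDUS -> 7
print(spoken_length("A"))  # LATIN CAPITAL LETTER A -> 22
```

The same function can be pointed at a program's own source text to estimate its score, provided the source is read as characters rather than bytes.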

Test Cases:

In format input => output with no trailing/leading spaces in input.

A      => 22
/      => 7
Once upon a time...           => 304
slurp.uninames>>.comb.sum.say => 530
JoKing => 124
!" #$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~        =>  1591
Double-check your \s on the last test case ;)   => 755
<say "<$_>~~.EVAL">~~.EVAL     => 388
,[.,]  => 58
19     => 19

Rules:

  • Input to your program will only consist of printable ASCII characters, that is, the codepoints 32 (space) to 126 (tilde).
    • For convenience sake, here is the list of lengths of the characters you have to handle: [5,16,14,11,11,12,9,10,16,17,8,9,5,12,9,7,10,9,9,11,10,10,9,11,11,10,5,9,14,11,17,13,13,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,19,15,20,17,8,12,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,18,13,19,5]
  • Here is a reference program you can use to score your program.
    • Peter Taylor has pointed out that the reference program normalises some Unicode characters. It should still work for most solutions, but feel free to correct it if you need to.
  • Since you're saying what the characters actually look like, your solution will be scored by the characters that are displayed, not the bytes involved. This is directed at languages with custom encodings.
    • You can assume that I've memorised the entire Unicode library and can say whatever strange characters you use.
  • Sorry Rogem, but answers have to be composed of displayable characters. Unprintables are fine, I just have to be able to read the characters out loud.
  • Whatever you do, do not use ﯹ in your program.
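The normalisation caveat in the rules can be reproduced with a small sketch. It assumes the normalisation in question behaves like NFC (the thread does not say which form the reference program applies), and the helper name `normalised_name` is made up for illustration.

```python
import unicodedata

OHM_SIGN = "\u2126"     # used as a variable name in one of the C# answers
KELVIN_SIGN = "\u212a"  # mentioned in the comments on that answer


def normalised_name(ch: str) -> str:
    """Name of the character after NFC normalisation (assumed form)."""
    return unicodedata.name(unicodedata.normalize("NFC", ch))


for ch in (OHM_SIGN, KELVIN_SIGN):
    raw = unicodedata.name(ch)
    norm = normalised_name(ch)
    print(f"{raw} ({len(raw)}) -> {norm} ({len(norm)})")
```

A scorer that normalises its input first would therefore charge these characters at the length of the longer name, which is why scores from different scoring programs can disagree.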

Jo King

Posted 2019-02-16T01:08:47.323

Reputation: 38 234

9

ARABIC LIGATURE UIGHUR KIRGHIZ YEH WITH HAMZA ABOVE WITH ALEF MAKSURA ISOLATED FORM this will be the full name of my child – Quintec – 2019-02-16T02:08:40.360

1

This program scores 6 in word mode: Try it online!

– Neil – 2019-02-16T15:21:37.417

Answers

13

Java 8, score 846 838 822 816

ௐ->ௐ.map(ˇ->Character.getName(ˇ).length()).sum()

-8 score thanks to @tsh replacing the _1 with .
-22 score thanks to @ASCII-only replacing the with ˇ and $ with .

Try it online.

Explanation:

The ௐ and ˇ are used instead of the s and c I would normally use, because lowercase letters all score 20 (e.g. LATIN SMALL LETTER S), whereas ௐ (TAMIL OM) scores 8 and ˇ (CARON) scores 5.

ௐ->                         // Method with IntStream parameter and integer return-type
  ௐ.map(ˇ->                 //  Map each character to:
      Character.getName(ˇ)  //   Get the name of the character
               .length())   //   Get the length of that name
   .sum()                   //  And after the map: sum all lengths together,
                            //  and return it as result
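To see how much the identifier choice matters, here is a rough Python sketch that prints the name length of a few candidate identifiers and then scores the Java source itself. The two special characters are written as escapes (U+0BD0 and U+02C7) so they survive copy-paste; the helper `score` is an illustrative name, not part of the answer.

```python
import unicodedata


def score(source: str) -> int:
    """Sum of Unicode name lengths over every character of the source."""
    return sum(len(unicodedata.name(ch)) for ch in source)


TAMIL_OM = "\u0bd0"
CARON = "\u02c7"

# Cost of a single occurrence of each candidate identifier character.
for ch in ("s", "c", "$", "_", TAMIL_OM, CARON):
    name = unicodedata.name(ch)
    print(f"U+{ord(ch):04X} {name}: {len(name)}")

java_answer = (
    f"{TAMIL_OM}->{TAMIL_OM}.map({CARON}->Character.getName({CARON}).length()).sum()"
)
print(score(java_answer))  # 816, matching the score in the heading
```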

Kevin Cruijssen

Posted 2019-02-16T01:08:47.323

Reputation: 67 575

1

I like how this Java stuff beats the 05AB1E answer both in terms of bytes and in terms of score... – Erik the Outgolfer – 2019-02-16T13:13:45.993

@EriktheOutgolfer Ikr. Builtins ftw I guess. ;) – Kevin Cruijssen – 2019-02-16T15:09:42.013

@KevinCruijssen It does save a couple bytes that you don't have to push compressed integer 87235805968599116032550323044578484972930006625267106917841 :P – Quintec – 2019-02-17T17:29:33.170

1

Use instead of _1 would save some points. – tsh – 2019-02-20T08:20:24.277

@tsh Thanks! Completely forgot other 'weird' characters besides regular letters and _$ are possible as well for variables. – Kevin Cruijssen – 2019-02-20T08:37:10.563

Remember _ is shorter >_> also the ohm sign Peter Taylor used if that's valid in Java – ASCII-only – 2019-02-20T14:55:32.603

@ASCII-only Just _ isn't allowed in Java, why do you think I had _1 in the first place.. But your tip on Peter Taylor's answer regarding ˇ (CARON) does work, so thanks! If you know another one shorter than 11 to replace the $, let me know. :) – Kevin Cruijssen – 2019-02-20T17:21:07.463

1

@KevinCruijssen Peter Taylor's Ω (OHM SIGN) is 8 characters long. Also haha I wasn't aware it was not valid, just assumed since it's valid in C# and Peter used _1 too (program to find short variable names, the box character can't be used)

– ASCII-only – 2019-02-21T11:04:00.270

@ASCII-only Thanks. I've decided to use the ௐ (TAMIL OM) since it's also 8 bytes. – Kevin Cruijssen – 2019-02-21T11:17:28.647

11

Perl 6, score 337

+*.uninames.&[~].ords

Try it online!

nwellnhof

Posted 2019-02-16T01:08:47.323

Reputation: 10 037

That .&[~] instead of .join is a neat trick. I can never remember what reduction operators can be applied through that method – Jo King – 2019-02-18T01:11:06.600

7

Japt v2.0a1 -x, Score 926 908 875 865 829 791 789

Takes input as an array of characters.

®cg`061742//0.450./..//.2/5117385`c+51 r\A_p26}  n# 

Try it or run all test cases on TIO

(APOSTROPHE is omitted from the 6th test case on TIO as Japt can't handle both single and double quotes in the same input string)


Explanation

®cg`...`c+51 r\A_p26}  n#      :Implicit input of character array
®                              :Map
 c                             :  Character code
  g                            :  Index into (0-based, with wrapping)
   `...`                       :    The string described below
        c+51                   :    Increment the codepoint of each by 51 (="8cKidj55gebbc9agh895c97a99baa9bba59ebhddMjfkh")
                               :    (Space closes the above method)
             r                 :    Replace
              \A               :      RegEx /[A-Z]/g
                _              :      Pass each match through a function
                 p26           :        Repeat 26 times
                    }          :      End function
                               :    (Space closes the replace method)
                               :  (Space closes the indexing method)
                       n       :  Convert to integer
                        #      :    From base 32 (note the trailing space)
                               :Implicitly reduce by addition and output
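The decoding half of the explanation above can be mirrored in Python as a rough sketch. It starts from the string quoted after the `c+51` step, so the unprintable characters of the real source are not needed; `name_length` and `total` are illustrative names.

```python
import re

# The string as it looks after every codepoint has been increased by 51.
PACKED = "8cKidj55gebbc9agh895c97a99baa9bba59ebhddMjfkh"

# Each uppercase letter stands for a run of 26 identical digits.
EXPANDED = re.sub(r"[A-Z]", lambda m: m.group() * 26, PACKED)


def name_length(ch: str) -> int:
    """Index with wrapping by character code, then read the digit in base 32."""
    digit = EXPANDED[ord(ch) % len(EXPANDED)]
    return int(digit, 32)  # int() accepts upper- and lowercase digits


def total(text: str) -> int:
    return sum(name_length(ch) for ch in text)


print(len(EXPANDED))  # 95, one digit per printable ASCII character
print(total("Once upon a time..."))
```

Because the expanded string has exactly 95 entries, the wrapping index puts codepoint 95 (the underscore) at position 0 and the space at position 32, so no subtraction of 32 is required.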

Building The String

(Scores include the steps and extra characters needed to reverse each modification)

  1. The array gave a baseline score of 2161.
  2. Converting each to a single character in a base >=23 and joining to a string scored 1832.
  3. Replacing both runs of m and k with a single, uppercase character scored 963.
  4. There were still too many expensive letters so next I tried to get rid of them by reducing the codepoints of all characters. 5 was the character with the lowest codepoint (53) so I started with 52, which scored 756.
  5. After trying all numbers that would leave no letters in the string, 51 gave the best score of 738
  6. Finally, replacing the quotation marks with slightly cheaper backticks gave a score of 734. Backticks in Japt are usually used to enclose and decompress a compressed string but, luckily, none of the characters in this string are contained in Shoco's library

The final string, therefore, contains the characters at the following codepoints:

[5,48,24,54,49,55,2,2,52,50,47,47,48,6,46,52,53,5,6,2,48,6,4,46,6,6,47,46,46,6,47,47,46,2,6,50,47,53,49,49,26,55,51,56,53]
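As a quick check of the codepoint shift, this sketch adds 51 back to each codepoint in the list above and confirms that the shifted version contains no letters at all. The variable names are illustrative.

```python
SHIFTED = [5, 48, 24, 54, 49, 55, 2, 2, 52, 50, 47, 47, 48, 6, 46, 52, 53,
           5, 6, 2, 48, 6, 4, 46, 6, 6, 47, 46, 46, 6, 47, 47, 46, 2, 6,
           50, 47, 53, 49, 49, 26, 55, 51, 56, 53]
OFFSET = 51

# Undo the shift to recover the base-32 digit string.
restored = "".join(chr(cp + OFFSET) for cp in SHIFTED)
print(restored)

# The shifted codepoints are all below 'A' (65), so no letter is ever spoken.
print(any(chr(cp).isalpha() for cp in SHIFTED))
```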

Shaggy

Posted 2019-02-16T01:08:47.323

Reputation: 24 623

4

05AB1E, score 963

Îv•Fδà‚<0?9½mΣ@×ƶC₁vc-™uΔ_ε'•21вεD3‹i22α₂и}}˜yÇ32-è+

Try it online or verify all test cases.

Explanation:

Î               # Push 0 and the input-string
 v              # Loop `y` over the characters of this input-string:
  •Fδà‚<0?9½mΣ@×ƶC₁vc-™uΔ_ε'•
               '#  Push compressed integer 87235805968599116032550323044578484972930006625267106917841
   21в          #  Converted to Base-21 as list: [5,16,14,11,11,12,9,10,16,17,8,9,5,12,9,7,10,9,9,11,10,10,9,11,11,10,5,9,14,11,17,13,13,0,19,15,20,17,8,12,2,18,13,19,5]
    ε           #  Map over this list:
     D3‹i       #   If the value is smaller than 3:
         22α    #    Take the absolute difference of this value with 22
            ₂и  #    Repeat it 26 times as list
    }}          #  Close the if-statement and map
      ˜         #  Flatten the list
       yÇ       #  Get the unicode value of the current character
         32-    #  Subtract 32
            è   #  Index it into the list of integers
             +  #  And add it to the sum
                # (and output the sum implicitly as result after the loop)

See this 05AB1E tip of mine (sections How to compress large integers? and How to compress integer lists?) to understand why •Fδà‚<0?9½mΣ@×ƶC₁vc-™uΔ_ε'•21в is [5,16,14,11,11,12,9,10,16,17,8,9,5,12,9,7,10,9,9,11,10,10,9,11,11,10,5,9,14,11,17,13,13,0,19,15,20,17,8,12,2,18,13,19,5].
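The same decoding can be sketched in Python, assuming the compressed integer quoted in the explanation is transcribed correctly. The helper names `to_base`, `build_table` and `total` are made up for this sketch.

```python
COMPRESSED = 87235805968599116032550323044578484972930006625267106917841


def to_base(n: int, base: int) -> list:
    """Digits of n in the given base, most significant first."""
    digits = []
    while n:
        n, r = divmod(n, base)
        digits.append(r)
    return digits[::-1]


def build_table() -> list:
    table = []
    for d in to_base(COMPRESSED, 21):
        if d < 3:
            # 0 and 2 are placeholders for the 26 uppercase (22) and
            # 26 lowercase (20) letters: |d - 22| repeated 26 times.
            table.extend([abs(d - 22)] * 26)
        else:
            table.append(d)
    return table


TABLE = build_table()


def total(text: str) -> int:
    return sum(TABLE[ord(ch) - 32] for ch in text)


print(len(TABLE))
print(total("Once upon a time..."))
```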

Kevin Cruijssen

Posted 2019-02-16T01:08:47.323

Reputation: 67 575

4

R; Score: 3330 1586 1443

Also challenging in R due to lack of built-ins.

Well the code is now mostly @Giuseppe's but that's alright. I was able to make a small edit to golf further by replacing the * with the ~, and the s with the dot.

Thanks to @Nick Kennedy for getting this down to 1443 by using arcane magic "a UTF8 encoded version of the number sequence"

function(.)sum((c(+",752230178/0,30.1002110221,052844",61~26,+":6;8/3",59~26,+"94:,")-39)[+.-31]);`+`=utf8ToInt;`~`=rep

Try it online
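For readers who don't speak R, here is a rough Python port of the same idea: the table is stored as characters whose codepoints are the name lengths plus 39, with the two runs of letters produced by repetition. Note that R indexes from 1 (hence `-31` in the original), while this sketch subtracts 32.

```python
OFFSET = 39


def decode(s: str) -> list:
    """Turn each character back into a name length."""
    return [ord(ch) - OFFSET for ch in s]


TABLE = (
    decode(",752230178/0,30.1002110221,052844")  # space .. '@'
    + [61 - OFFSET] * 26                         # 'A' .. 'Z'
    + decode(":6;8/3")                           # '[' .. '`'
    + [59 - OFFSET] * 26                         # 'a' .. 'z'
    + decode("94:,")                             # '{' .. '~'
)


def total(text: str) -> int:
    return sum(TABLE[ord(ch) - 32] for ch in text)


print(total("JoKing"))
```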

CT Hall

Posted 2019-02-16T01:08:47.323

Reputation: 591

1769 points -- makes a minimal attempt to compress the values... – Giuseppe – 2019-02-20T01:37:28.780

2

also, utf8ToInt is a super helpful command for golfing :-) I haven't been on PPCG for a month or so, so it's nice to see new people golfing in R! – Giuseppe – 2019-02-20T01:38:56.140

Ah, I had a way to compress it, but was not aware of utf8ToInt. I'll have to work on this later tonight/tomorrow. – CT Hall – 2019-02-20T01:49:42.897

1617 -- I'm just making random edits instead of doing my actual work :-( – Giuseppe – 2019-02-20T01:58:23.177

Well I appreciate it. What does the footer do over at TIO? – CT Hall – 2019-02-20T03:07:40.300

1

It's just more lines of code under the program/snippet that doesn't affect bytecount - useful to do some tests in – ASCII-only – 2019-02-20T14:58:47.633

Also see the C# solution, I'm sure you can do something similar, just gotta do asc first, the lookup table logic should be mostly the same? – ASCII-only – 2019-02-20T16:14:27.373

if packages are allowed function(.)sum(nchar(u_char_name(as.u_char(utf8ToInt(.)))));library(Unicode) is 1388. – CT Hall – 2019-02-22T00:35:42.787

4

C# (Visual C# Interactive Compiler) (score 1627 1116 1096 1037 1019 902)

Ω=>Ω.Sum(ˇ=>(31&-ˇ)>5&ˇ>62?22-ˇ/91*2:"♁♌♊♇♇♈♅♆♌♍♄♅♁♈♅♃♆♅♅♇♆♆♅♇♇♆♁♅♊♇♍♉♉♏♋♐♍♄♈♎♉♏♁"[ˇ-6-ˇ/33*26]-9788)

This uses no built-in database: just some special-casing for letters and a lookup table.

Online test suite.

It can't score itself, because most of the characters are not in range, including the variables CARON and OHM SIGN and the zodiac symbols used to encode the lookup table.
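The special-casing and the lookup can be followed more easily in a Python port. This is only a sketch: the symbol string is rebuilt from its numeric values (codepoint minus 9788) rather than pasted, so it does not depend on the symbols surviving copy-paste, and `name_length` is an illustrative name.

```python
BASE = 9788  # each symbol's codepoint minus BASE is a name length

# Values of the lookup string, in order.
VALUES = [5, 16, 14, 11, 11, 12, 9, 10, 16, 17, 8, 9, 5, 12, 9, 7,
          10, 9, 9, 11, 10, 10, 9, 11, 11, 10, 5, 9, 14, 11, 17, 13, 13,
          19, 15, 20, 17, 8, 12, 18, 13, 19, 5]
SYMBOLS = "".join(chr(BASE + v) for v in VALUES)


def name_length(ch: str) -> int:
    c = ord(ch)
    # Letters: c mod 32 lies in 1..26 and c is above '>'.
    if (31 & -c) > 5 and c > 62:
        return 22 - c // 91 * 2  # 22 for uppercase, 20 for lowercase
    # Everything else: fold the three non-letter ranges onto one string.
    return ord(SYMBOLS[c - 6 - c // 33 * 26]) - BASE


def total(text: str) -> int:
    return sum(name_length(ch) for ch in text)


print(total("Once upon a time..."))
```

The index expression sends the space and the colon to the same position, which works because both names (SPACE and COLON) have length 5.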

Thanks to ASCII-only for many suggestions.

Peter Taylor

Posted 2019-02-16T01:08:47.323

Reputation: 41 901

What scoring program did you use – ASCII-only – 2019-02-20T13:52:04.650

tio.run/##NZDrdpNAFIX/z1OMY0wgwhCg0NAEanpT22jVaquGiAQnYbgMhIHWGJO36gv0xSJrBf@cdS57r72@E3A54HR3UbFgyMuCsoVEWenM7d3To@08PeKbKhU82xG2suDJdgd2xLauikPNbLc9R9eONU32FFPtakeI5CyOExrly5CShC4iSuMoonEcEcriZZryrFyGSZFygiZebdP1rmZOZcsUdwMwzwriB6Fw7xfQh5RBRh4m0zVAIyRBRYGaBpCybw8BumYBgVWesVpb0pRgjPc3vXcAEE@qIscVo8xPCXccHGTpDPMqxdxf7XWG3gPoMruqifcLVauNz1wEn7detDuC2H0pyVjpqZp@YJiHfetoMLSd41ejk9Oz84vXb95eXo3fvb/@8PHTzecvt3dfv32fuO70h/fTnwW/yHwR0ihOUpbly4KX1f3D79Wf9d/NtskyLBWgs6yaJUQOQhLEcJVVBXRdDmukMiQw8XkJS1KXwOcEDsQG3TAAGtYU0EXDludst/j8djR20f@u@UK/D5A0wdK0oa1H1WrCLbARwV1BSzKmjAgttPYly9hAaDsQrueCL26QONj9Aw – ASCII-only – 2019-02-20T13:56:35.000

This should be 1590 whereas yours seems to score 1735 on the reference implementation – ASCII-only – 2019-02-20T13:57:10.157

tio.run/##NZDrdpNAFIX/z1OMY0wgwsDMAIEkUNOb2karVls1RCQ4abBhiAy0xpi8VV@gLxZZK/HPWeey99rrO4nUE5luTyuR9GVZpOJGS0UZTP3t44MfPD7gyypTIj9QNroS6X4LttQmI2qfOs1mFDB6QKkeGQ5p0y7SOjalzCQd1zA1ZmJimpQQk1KimTZ1Lavr9FyDeVZXQ6OotjHWps5YZ5667YFpXvA4mSl3cQFjmAoo@P1ovAJogDRoGJBSgIxd2wHoQiQcVotc1NoyzTjGeHdjpgWQnFfFAlciFXHGZRDgJM8mWFYZlvFyp7OZCdBZfl4T7xaE1sYnIYJPG8@aLUVtP9d0bJiEMst2Oq7X7fX94ODF4PDo@OT05avXZ@fDN28v3r3/cPnx09X15y9fR2E4/hZ9jyfJDz69maU/b@eZyBe/CllWd/e/l39Wf9ebfZbtEYCO82oy53oy48ktXOZVAcNQwhqpnHE4j2UJS16XJJYc9tQ9um0D1K8pYIj6jSjYbPDJ1WAYov/d/guuC5A2wtp4T1uPxNuHe2CtgusiLfkwFVxpoFWsefYaQj@AcDVVYnWN1N72Hw – ASCII-only – 2019-02-20T14:18:01.627

Should score 1120? – ASCII-only – 2019-02-20T14:18:19.150

2

@ASCII-only, I used the Python answer below; the Java answer also gives 1627. The problem seems to be that the reference solution is buggy: Ω is U+2126, OHM SIGN, not GREEK CAPITAL LETTER OMEGA.

– Peter Taylor – 2019-02-20T14:18:32.017

What's the micro one then? (Also replaced micro sign with ohm sign, 1116 now) – ASCII-only – 2019-02-20T14:23:52.340

Aargh, I missed a subtraction of 32 to fix the index in my program to find the best offset for the lookup table. µ is MICRO SIGN. – Peter Taylor – 2019-02-20T14:36:27.690

Score 1055 I think? – ASCII-only – 2019-02-20T14:51:00.267

@ASCII-only, I've made further savings by using & instead of && and by reworking the comparison against 26 into a comparison against 5. – Peter Taylor – 2019-02-20T15:01:50.387

Let us continue this discussion in chat.

– ASCII-only – 2019-02-20T15:05:52.403

1

Score 5 name: ˇ, no other names shorter than 8 that C# accepts, also not verified with Java program – ASCII-only – 2019-02-20T15:34:50.073

Why do you use Ω?.. Its name is GREEK CAPITAL LETTER OMEGA, which is even longer than just a letter.. The challenge description has a Perl 6 program to get your score. – Kevin Cruijssen – 2019-02-20T17:17:24.903

1

@Kevin, as per my earlier comment the reference implementation is buggy. I think it's applying normalisation to turn the source character OHM SIGN into GREEK CAPITAL LETTER OMEGA. – Peter Taylor – 2019-02-20T18:27:03.473

@PeterTaylor Ah ok. I also just noticed the same thing with the KELVIN SIGN which it changes to LATIN CAPITAL LETTER K.. Nice score for an overall pretty verbose language without builtins for the challenge nor compression btw, +1 from me! – Kevin Cruijssen – 2019-02-20T18:30:27.457

I'm honestly surprised how close this comes to the builtin-based Java answer (and even the Perl answer). Granted, letters are very, very expensive using this scoring method which explains the closeness. Also @PeterTaylor >64 -> >62 for -1 score should work? – ASCII-only – 2019-02-22T10:15:25.143

@ASCII-only, now beating Python... – Peter Taylor – 2019-02-22T13:01:54.407

Haha, of course there's a better offset. I thought about it but doubted there would be a contiguous block of characters with names that short, I guess I just forgot you don't need a very wide range since the range of values being encoded is pretty small – ASCII-only – 2019-02-22T13:17:51.240

3

Python 3, Score of 993

lambda _:len(''.join(map(__import__('unicodedata').name,_)))

Try it online!

Below 1000 now, any tips still appreciated.

-16 thanks to Kirill L.

nedla2004

Posted 2019-02-16T01:08:47.323

Reputation: 521

2

993 – Kirill L. – 2019-02-16T12:21:49.017

1

You can replace _ by ˇ for 987. – Lynn – 2019-02-21T21:40:29.883

2

Perl 5 -pl, score 723

s,\N{OX}*.,_charnames'viacode ord$&,ge,$_=y,,,c

Try it online!

Explanation

s,        ,                        ,ge  # Replace globally
  \N{OX}*   # zero or more OX characters , loads the
            # _charnames module as side effect,
         .  # any character
           _charnames'viacode ord$&  # with its Unicode character name
                                     # (using old package delimiter).
                                      ,$_=y,,,c  # Set $_ to its length

nwellnhof

Posted 2019-02-16T01:08:47.323

Reputation: 10 037

2

Attache, 1934

Sum@{ToBase[FromBase[Ords@"!ZByru=#9fBYb$a3Si0^pU,ZP#3$cd'(c-_lhu]h(]5;!W|?M4:<_^sU;N&XFN`t:u"-32,95],23][Ords@_-32]}

Try it online!

Simple compression and indexing.
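The technique (read the table as base-23 digits, re-express the resulting number in base 95 as printable characters, and reverse that at run time) can be sketched in Python. This is a round-trip illustration under my own assumptions; the packed string it produces is not claimed to be identical to the one in the answer, and all helper names are made up.

```python
LENGTHS = (
    [5, 16, 14, 11, 11, 12, 9, 10, 16, 17, 8, 9, 5, 12, 9, 7,
     10, 9, 9, 11, 10, 10, 9, 11, 11, 10, 5, 9, 14, 11, 17, 13, 13]
    + [22] * 26
    + [19, 15, 20, 17, 8, 12]
    + [20] * 26
    + [18, 13, 19, 5]
)


def from_digits(digits, base):
    n = 0
    for d in digits:
        n = n * base + d
    return n


def to_digits(n, base):
    digits = []
    while n:
        n, r = divmod(n, base)
        digits.append(r)
    return digits[::-1]


def pack(lengths):
    """Base-23 digits -> integer -> base-95 digits -> printable ASCII."""
    n = from_digits(lengths, 23)
    return "".join(chr(d + 32) for d in to_digits(n, 95))


def unpack(packed):
    """Reverse of pack: printable ASCII -> integer -> base-23 digits."""
    n = from_digits([ord(ch) - 32 for ch in packed], 95)
    return to_digits(n, 23)


PACKED = pack(LENGTHS)
TABLE = unpack(PACKED)


def total(text):
    return sum(TABLE[ord(ch) - 32] for ch in text)


print(len(PACKED), total("Once upon a time..."))
```

Base 23 is the smallest base that fits the largest length (22). The round trip relies on the first length being non-zero, since leading zero digits would be lost.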

Conor O'Brien

Posted 2019-02-16T01:08:47.323

Reputation: 36 228

:P looks like using a smarter lookup (see C# answer) would help with score. Or even just using a charset that doesn't contain letters to compress – ASCII-only – 2019-02-22T10:13:10.617

1

C# (Visual C# Interactive Compiler), Score: 4007 3988 3759 3551 2551

ˇ=>ˇ.Sum(_=>new[]{5,16,14,11,11,12,9,10,16,17,8,9,5,12,9,7,10,9,9,11,10,10,9,11,11,10,5,9,14,11,17,13,13,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,19,15,20,17,8,12,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,18,13,19,5}[_-32])

I feel crushed by Peter Taylor's solution up above. Thanks to Peter Taylor for pointing out a simple lookup table was better than my previous dictionary solution.

Try it online!

Embodiment of Ignorance

Posted 2019-02-16T01:08:47.323

Reputation: 7 014

This is considerably worse than a direct lookup table: _1=>_1.Select(_2=>new int[]{5,16,14,11,11,12,9,10,16,17,8,9,5,12,9,7,10,9,9,11,10,10,9,11,11,10,5,9,14,11,17,13,13,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,19,15,20,17,8,12,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,18,13,19,5}[_2-32]).Sum() scores 2786. – Peter Taylor – 2019-02-20T12:06:36.137