Supreme Sum String

15

1

Given an input string, return the word whose Unicode characters have the highest sum.

Rules

  • The input should be separated by whitespace
  • The value of each word is the sum of the UTF-16 code units of its characters
  • The output should be the first word with the highest value (in case of duplicate sums)
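
As a rough, ungolfed reference for the rules above, here is a minimal Python sketch. It assumes "UTF-16 code" means UTF-16 code units, and the names `utf16_sum` and `supreme_word` are made up for illustration.

```python
def utf16_sum(word: str) -> int:
    """Sum the UTF-16 code units of `word`."""
    data = word.encode("utf-16-le")  # two bytes per code unit, no BOM
    return sum(int.from_bytes(data[i:i + 2], "little")
               for i in range(0, len(data), 2))


def supreme_word(text: str) -> str:
    """Return the first whitespace-separated word with the highest sum."""
    # str.split() with no argument splits on any run of whitespace,
    # and max() returns the first maximal item, which settles ties.
    return max(text.split(), key=utf16_sum)


print(supreme_word("hello world"))   # world
print(supreme_word("àà as a test"))  # àà
```

Note that `max()` raises `ValueError` on an input with no words at all, a case the rules do not cover.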

Examples

Input: "a b c d e"
Output: "e"

Input: "hello world"
Output: "world"

Input: "this is a test"
Output: "test"

Input: "àà as a test"
Output: "àà"

Input: "α ää"
Output: "α"

Input: " 隣隣隣"
Output: "隣隣隣"

Input: "    ️  "
Output: "️"

This is code golf, so the shortest answer wins! Good luck :)

GammaGames

Posted 2018-10-05T15:48:09.353

Reputation: 995

Will there always be at least one space (at least 2 words)? – Emigna – 2018-10-05T16:13:57.003

If there's only one word just output the single word, since it's the max – GammaGames – 2018-10-05T16:41:55.740

2This would have been more interesting with ASCII instead of Unicode, because more languages could have participated. Requiring Unicode support doesn't seem to add anything to the challenge – Luis Mendo – 2018-10-05T17:05:54.550

1I mostly used Unicode because it has emojis lol – GammaGames – 2018-10-05T17:45:55.260

2Since many of the current answers seem to use the sum of UTF-8 or UTF-32 code units, you should add some additional test cases. For example "α ää" yields different results with UTF-8 (383 < 718) and UTF-16 (945 > 456). – nwellnhof – 2018-10-05T18:04:09.180
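
The numbers in the comment above can be checked with a short Python sketch. The helper names `utf8_sum` and `utf16_sum` are hypothetical, chosen only for this illustration.

```python
def utf8_sum(word: str) -> int:
    """Sum of the UTF-8 bytes of `word`."""
    return sum(word.encode("utf-8"))


def utf16_sum(word: str) -> int:
    """Sum of the UTF-16 code units of `word`."""
    data = word.encode("utf-16-le")
    return sum(int.from_bytes(data[i:i + 2], "little")
               for i in range(0, len(data), 2))


for word in ("α", "ää"):
    print(word, utf8_sum(word), utf16_sum(word))
# α 383 945
# ää 718 456
```

With UTF-8 byte sums "ää" wins, while with UTF-16 code units "α" wins, so this input separates the two interpretations.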

" 隣隣隣" could be used to weed out answers that simply add codepoints (UTF-32). Also, do you really mean to sum UTF-16 code units (16-bit numbers) or something else? – nwellnhof – 2018-10-05T18:10:21.853

I mean the number given by JavaScript's charCodeAt function, which returns a UTF-16 code unit according to the documentation. That's what I used when I was testing how feasible the challenge was. I'll add the other test cases though! – GammaGames – 2018-10-05T18:23:34.333

When you say separated by whitespace... is input separated by newline allowed? – JayCe – 2018-10-05T18:37:07.817

1Yeah, newlines are allowed. Tabs too! – GammaGames – 2018-10-05T22:37:14.003

Can we take input as an array/list of words? – Shaggy – 2018-10-06T00:31:34.237

It cannot be an array, it has to be a string of words with any whitespace characters in between – GammaGames – 2018-10-06T03:01:12.343

Answers

3

Jelly, 7 bytes

ḲOS$ÐṀḢ

Try it online!

ḲOS$ÐṀḢ
Ḳ        Split input on spaces
    ÐṀ   Give words that have maximum of:
   $       Monad:
 O           ord(each character)
  S          sum
      Ḣ  First word that gives the max ord-sum.

dylnan

Posted 2018-10-05T15:48:09.353

Reputation: 4 993

If the spec is relaxed to input being allowed as a list of words then O§MḢị – Jonathan Allan – 2018-10-05T18:44:02.483

@JonathanAllan Where did OP say that was allowed? – dylnan – 2018-10-05T19:05:45.637

didn't just if... – Jonathan Allan – 2018-10-05T19:20:52.080

@JonathanAllan Ah, gotcha. – dylnan – 2018-10-05T19:39:20.213

If you can take a list of words, each on their own line, that would be valid too. I don't know Jelly so I don't know if the 'whitespace' flexibility helps much – GammaGames – 2018-10-05T22:50:55.607

1@GammaGames It would help if I could take a list of strings, e.g. ["abc", "def"]. But at this point there are a lot of answers so I don't recommend adding new methods of input – dylnan – 2018-10-05T23:45:46.197

That's why I haven't added it, though I do see why it would have been nice – GammaGames – 2018-10-05T23:53:02.637

As of now, you have the lowest answer! Congrats! – GammaGames – 2018-10-08T03:02:56.583

7

Perl 6, 34 bytes

*.words.max(*.encode('utf16').sum)

Try it online!

Sean

Posted 2018-10-05T15:48:09.353

Reputation: 4 136

6

R, 77 69 59 58 56 44 bytes

A group effort now.

'^'=mapply
sort(-sum^utf8ToInt^scan(,""))[1]

Try it online!

Convert to code points, sum each word, negate, (stably) sort, return first element.
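
The same idea can be sketched in Python (not R). This is only an illustration of the negate-and-stable-sort trick; `first_after_negated_sort` is a made-up name, and it sums code points as described above, which matches UTF-16 code units only for characters inside the Basic Multilingual Plane.

```python
def first_after_negated_sort(text: str) -> str:
    words = text.split()
    # Sorting ascending on the negated sum puts the largest sum first.
    # sorted() is stable, so tied words keep their input order.
    ranked = sorted(words, key=lambda w: -sum(map(ord, w)))
    return ranked[0]


print(first_after_negated_sort("àà as a test"))  # àà
```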

Technically the return value is a "named vector" whose value is the sum and name is the winning word, but this seems to follow the rules. If you want to return the winning word as a string, you'd have to spend 7 more bytes and wrap the above in names().

ngm

Posted 2018-10-05T15:48:09.353

Reputation: 3 974

Is there a reason there's spaces in front of the word? When I run it on" ️ " it prints out " ️ " (with a bunch of spaces in front of it) – GammaGames – 2018-10-05T17:43:33.200

2@GammaGames the output is what is called a "named vector" in R. In this case the value is the sum of the code points of the winning word, and the name is printed along with it, which in this case is the winning word itself. The name is right-aligned to the number below it. – ngm – 2018-10-05T17:46:02.763

Oh, neat! It looks like it does follow the rules, so I'll allow it. Cool entry! – GammaGames – 2018-10-05T17:56:52.230

sort(-sapply(...)) is shorter by 3 bytes. – Giuseppe – 2018-10-05T18:05:40.220

oh and you can apply scan() twice instead of el(strsplit()) to make this 69 bytes – Giuseppe – 2018-10-05T18:11:30.920

Map for 64 bytes, I am sure it can be improved further. tio – JayCe – 2018-10-05T18:22:17.537

oh, and the double scan can be shortened to scan(,"",t=scan(,"")) for -1 byte. Try it online! – Giuseppe – 2018-10-05T18:29:46.840

3@JayCe mapply does the unlist for free. – ngm – 2018-10-05T18:29:56.110

Newline is white space (I think) so one scan should be sufficient. Checking with OP. – JayCe – 2018-10-05T18:36:10.833

With some rebinding of ^ I've got your amazing answer down to 56 bytes – J.Doe – 2018-10-05T18:41:35.050

5

05AB1E, 8 bytes

ð¡RΣÇO}θ

Try it online!

Explanation

ð¡          # split input on spaces
  R         # reverse the resulting list
   Σ  }     # sort by
    ÇO      # sum of character codes
       θ    # take the last
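
To see why the reversal matters, here is a small Python sketch of the same steps. It assumes the sort is stable and sums code points via `ord`; the function name `last_after_sort` is hypothetical.

```python
def last_after_sort(text: str, reverse_first: bool = True) -> str:
    words = text.split(" ")
    if reverse_first:
        words = words[::-1]  # later words now come first
    # Stable ascending sort: among equal sums, input order is kept,
    # so after the reversal the earliest original word ends up last.
    return sorted(words, key=lambda w: sum(map(ord, w)))[-1]


print(last_after_sort("àà as a test"))                       # àà
print(last_after_sort("àà as a test", reverse_first=False))  # test
```

Both "àà" and "test" sum to 448, so without the reversal the later word would be returned.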

Emigna

Posted 2018-10-05T15:48:09.353

Reputation: 50 798

Wow, I'm always amazed by the answers made in dedicated golfing languages! – GammaGames – 2018-10-05T16:49:37.087

Why do you need to reverse the resulting list? It's gonna get sorted anyways right? Or does the R actually reverse the list after it's sorted? – FireCubez – 2018-10-06T15:53:24.367

@FireCubez For test case àà as a test the àà and test have the same largest unicode sum. So without the reverse test would be output instead of àà. Btw, Emigna, use # to save a byte. ;) EDIT: Never mind. I see it doesn't wrap the input in a list for single word inputs.. That's unfortunate. – Kevin Cruijssen – 2018-10-06T20:31:36.557

4

JavaScript (ES6), 81 bytes

s=>s.split` `.map(m=s=>m=[...s].map(c=>t+=c.charCodeAt(),t=0)&&t<=m?m:(r=s,t))&&r

Try it online!

Arnauld

Posted 2018-10-05T15:48:09.353

Reputation: 111 334

That's way better than the code I came up with when I was writing the challenge, mine was ~200 chars long! – GammaGames – 2018-10-05T16:46:05.197

72 bytes – guest271314 – 2018-10-05T23:14:56.550

@guest271314 doesn't work for the second last test case and some extreme cases like f(" 龘龘龘龘龘") – Shieru Asakoto – 2018-10-06T00:25:47.543

Oh nvm 隣(\uf9f1) was the one in CJK Compatibility Ideograph block instead lol. Thought it was 隣(\u96a3), the one in CJK Unified Ideograph block. – Shieru Asakoto – 2018-10-06T01:36:40.527

Apparently 龘龘龘龘龘 would be the expected result I'd say. – Shieru Asakoto – 2018-10-06T01:41:44.570

@ShieruAsakoto The sum of "龘龘龘龘龘" is 204280 and the sum of "" is 55357, correct? – guest271314 – 2018-10-06T07:52:43.943

@guest271314 the sum of should be 112191 because of UTF-16 encoding – Shieru Asakoto – 2018-10-06T08:07:36.477
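
The figures in this thread can be reproduced with a short Python sketch. The character U+1F602 is an assumption picked purely for illustration, because its two surrogates add up to the 112191 quoted above. `utf16_units` is a made-up helper name.

```python
def utf16_units(text: str) -> list[int]:
    """Return the UTF-16 code units of `text` as integers."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


emoji = "\U0001F602"  # assumed example character outside the BMP
print(utf16_units(emoji))            # [55357, 56834]
print(sum(utf16_units(emoji)))       # 112191
print(sum(utf16_units("龘" * 5)))    # 204280
```

A character outside the Basic Multilingual Plane is stored as a surrogate pair, so summing only the first code unit (55357) undercounts it.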

4

jq, 61 43 57 37 characters

(57 39 53 33 characters code + 4 characters command line options)

./" "|reverse|max_by(explode|add)

Sample run:

bash-4.4$ jq -Rr './" "|reverse|max_by(explode|add)' <<< 'àà as a test'
àà

Try it online!

manatwork

Posted 2018-10-05T15:48:09.353

Reputation: 17 865

Indeed. Missed that case. ☹ Thanks, @nimi. – manatwork – 2018-10-05T17:00:51.290

4

PowerShell, 74 52 bytes

(-split$args|sort{$r=0;$_|% t*y|%{$r+=$_};$r}-u)[-1]

Try it online!

Thanks to mazzy for a whopping -22 bytes.

-splits the input $args on whitespace, pipes that into sort with a particular sorting mechanism {...} and the -unique flag.

Here we're taking the current word $_, changing it toCharArray, then for each letter we're adding it into our $result. That turns the string into a number based on its UTF-16 representation.

For once, PowerShell having all strings be UTF-16 in the background is a life-saver!

We then encapsulate those results in (...) to transform them into an array and take the last [-1] one, i.e., the largest result that's the closest to the start of the sentence. This works because of the -unique flag, i.e., if there's a later element that has the same value, it's discarded. That word is left on the pipeline and output is implicit.
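
A minimal Python sketch of the behaviour described above, assuming that duplicates by sort key are dropped in favour of the earliest word. It models the explanation, not Sort-Object itself, and the names `utf16_sum` and `last_unique_by_sum` are hypothetical.

```python
def utf16_sum(word: str) -> int:
    """Sum of the UTF-16 code units of `word`."""
    data = word.encode("utf-16-le")
    return sum(int.from_bytes(data[i:i + 2], "little")
               for i in range(0, len(data), 2))


def last_unique_by_sum(text: str) -> str:
    first_by_sum: dict[int, str] = {}
    for word in text.split():
        # setdefault keeps the first word seen for each sum,
        # so later words with the same sum are discarded.
        first_by_sum.setdefault(utf16_sum(word), word)
    # The largest key corresponds to the last element after sorting.
    return first_by_sum[max(first_by_sum)]


print(last_unique_by_sum("àà as a test"))  # àà
```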

AdmBorkBork

Posted 2018-10-05T15:48:09.353

Reputation: 41 581

It's smart. Thanks. Two questions: why not sort -u instead of a reverse? And is + enough to convert to a number? (-split$args|sort{($_|% t*y|%{+$_})-join"+"|iex} -u)[-1] – mazzy – 2018-10-06T07:00:26.790

more golf: (-split$args|sort{$r=0;$_|% t*y|%{$r+=$_};$r}-u)[-1] :) – mazzy – 2018-10-06T07:12:12.640

@mazzy Yes, thanks! – AdmBorkBork – 2018-10-08T13:50:53.123

4

Pyth, 8 bytes

h.MsCMZc

Test suite

I know there's already a Pyth answer but I feel like this uses a pretty different approach and also it's waaaay shorter

Explanation:
h.MsCMZc  | Full code
h.MsCMZcQ | with implicit variables added
----------+------------------------------------
h         | The first element of
       cQ | the input chopped at whitespace
 .M       | with the maximal value for
   s      | the sum of
    CMZ   | the Unicode value of each character

hakr14

Posted 2018-10-05T15:48:09.353

Reputation: 1 295

Wow, that's really precise! Thanks for the explanation! – GammaGames – 2018-10-06T03:03:22.463

3

Python 3, 55 52 bytes

lambda s:max(s.split(),key=lambda w:sum(map(ord,w)))

Try it online!

  • -3 bytes thanks to Gigaflop for pointing out that no argument is needed in the split method.

dylnan

Posted 2018-10-05T15:48:09.353

Reputation: 4 993

You can save 3 bytes by passing no args to split(), as it splits on any group of whitespace. – Gigaflop – 2018-10-05T18:17:30.903
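
A quick illustration of the difference, using an arbitrary sample string:

```python
text = " a\tb\nc  d "

# No argument: split on any run of whitespace, dropping empty strings.
print(text.split())     # ['a', 'b', 'c', 'd']

# Explicit single space: tabs and newlines are not separators,
# and leading, trailing or doubled spaces produce empty strings.
print(text.split(" "))  # ['', 'a\tb\nc', '', 'd', '']
```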

2

Japt -h, 8 bytes

@Emigna's approach

¸w ñ_¬xc

Try it online!


Another Approach

Japt -g, 8 bytes

¸ñ@-X¬xc

Try it online!

Luis felipe De jesus Munoz

Posted 2018-10-05T15:48:09.353

Reputation: 9 639

Identical to what I was about to post. The need for the reversal annoys me; would've preferred if we could output any of the words in the case of a tie. – Shaggy – 2018-10-05T18:24:59.930

@Shaggy if that was possible, I have a 6 bytes answer for it – Luis felipe De jesus Munoz – 2018-10-05T18:27:07.300

Same 6-byter I started with before spotting that requirement in the spec. – Shaggy – 2018-10-05T18:28:13.373

I'm sorry! Originally when I sandboxed the challenge I figured it could output any of the answers, but I changed it after a little feedback so it was more consistent – GammaGames – 2018-10-05T22:43:51.427

2

MATLAB, 57 bytes

s=strsplit(input('','s'));[Y I]=max(cellfun(@sum,s));s(I)

In my MATLAB R2016a all tests pass, except that emojis are not rendered properly. The characters themselves are returned correctly.

aaaaa says reinstate Monica

Posted 2018-10-05T15:48:09.353

Reputation: 381

2

Java (JDK), 117 97 84 bytes

-13 bytes thanks @Nevay. Apparently I didn't know I can also use var in Java.

s->{var b="";for(var a:s.split(" "))b=a.chars().sum()>b.chars().sum()?a:b;return b;}
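
The running-maximum logic of the lambda above can be sketched in Python as follows. This is an ungolfed illustration with hypothetical names (`utf16_sum`, `running_max`). Unlike the Java version it caches the best sum instead of recomputing it on every iteration.

```python
def utf16_sum(word: str) -> int:
    """Sum of the UTF-16 code units of `word`."""
    data = word.encode("utf-16-le")
    return sum(int.from_bytes(data[i:i + 2], "little")
               for i in range(0, len(data), 2))


def running_max(text: str) -> str:
    best, best_sum = "", 0
    for word in text.split(" "):
        s = utf16_sum(word)
        # Strict '>' keeps the earlier word when two sums are equal.
        # Empty pieces from extra spaces sum to 0 and never win.
        if s > best_sum:
            best, best_sum = word, s
    return best


print(running_max("àà as a test"))  # àà
```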

Try it online!

Shieru Asakoto

Posted 2018-10-05T15:48:09.353

Reputation: 4 445

-13 bytes: s->{var b="";for(var a:s.split(" "))b=a.chars().sum()>b.chars().sum()?a:b;return b;} – Nevay – 2018-10-06T18:39:01.170

1

Ruby, 45 characters

->s{s.split.max_by{|w|w.codepoints.reduce:+}}

Sample run:

irb(main):001:0> ->s{s.split.max_by{|w|w.codepoints.reduce:+}}['àà as a test']
=> "àà"

Try it online!

Ruby 2.4, 40 characters

->s{s.split.max_by{|w|w.codepoints.sum}}

(Untested.)

manatwork

Posted 2018-10-05T15:48:09.353

Reputation: 17 865

1

Pyth, 33 bytes

FHmCdmcd)Kczd aYu+GHmCdH0)@KxYeSY

Try it online!

There is almost certainly a better way to do this, but I spent too much time on it so this will do.

FH  #For every array of letters in 
  mCd   #the array of arrays of letters [['w', 'o', 'r', 'l', 'd'], ['h', 'e', 'l', 'l', 'o']]
     mcd)   #wrap that in another array [["hello"], ["world"]]
         Kczd   #split input(z) on spaces ["hello", "world"] and assign it to K for later
              aY     #append to list Y... " " silences the prints from the for loop.
                u+GH    #reduce the list of numbers by summing them    
                    mCdH    #convert each letter in the array to its int counterpart
                        0)    #the zero for the accumulator and close for loop
                          @K    #get by index the word from K
                            xY   #find the index in Y of that number
                              eSY   #sort Y, get the last (largest) number

I would have passed a reduce into another map instead of using the for loop, but I couldn't get that to work.

Tryer

Posted 2018-10-05T15:48:09.353

Reputation: 71

Oh boy, a pyth answer! Thanks for the explanation, nice entry! – GammaGames – 2018-10-05T22:41:50.043

1

Charcoal, 20 bytes

≔⪪S θ≔EθΣEι℅λη§θ⌕η⌈η

Try it online! Link is to verbose version of code. Explanation:

≔⪪S θ

Split the input string on spaces and assign to q.

≔EθΣEι℅λη

Calculate the sum of the ordinals of the characters in each word and assign to h.

§θ⌕η⌈η

Find the index of the highest sum and print the word at that index.
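
The three steps above roughly correspond to this Python sketch. It assumes the ordinals are code points, as returned by `ord`, and `word_at_max_index` is a made-up name.

```python
def word_at_max_index(text: str) -> str:
    words = text.split(" ")                        # split on spaces
    sums = [sum(map(ord, w)) for w in words]       # ordinal sum per word
    # list.index returns the first position of the maximum,
    # so the earliest word wins when sums are tied.
    return words[sums.index(max(sums))]


print(word_at_max_index("àà as a test"))  # àà
```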

Neil

Posted 2018-10-05T15:48:09.353

Reputation: 95 035

1

PowerShell, 66 bytes

Straightforward. See AdmBorkBork's answer for a smarter use of PowerShell.

-split$args|%{$s=0
$_|% t*y|%{$s+=$_}
if($s-gt$x){$w=$_;$x=$s}}
$w

Note! To work correctly with Unicode, save your script file in UTF-16 or UTF-8 with BOM encoding.

Test script:

$f = {

-split$args|%{$s=0         # split the argument string on whitespace; for each word
$_|% t*y|%{$s+=$_}         # $s is the sum of the word's Unicode char codes
if($s-gt$x){$w=$_;$x=$s}}  # if $s is greater than the previous best, store the word and sum
$w                         # return word from stored variable

}

@(
    ,("a b c d e", "e")

    ,("hello world", "world")

    ,("this is a test", "test")

    ,("àà as a test", "àà")

    ,("α ää", "α")

    ,(" 隣隣隣", "隣隣隣")

    ,("    ️  ", "️")
) | % {
    $s,$e=$_
    $r=&$f $s
    "$($r-eq$e): $r"
}

Output:

True: e
True: world
True: test
True: àà
True: α
True: 隣隣隣
True: ️

mazzy

Posted 2018-10-05T15:48:09.353

Reputation: 4 832