Supreme Sum String

15

1

Given an input string, return the word whose characters have the highest sum of Unicode (UTF-16) code values.

Rules

  • The input should be separated by whitespace
  • The value of each word is the sum of the UTF-16 code units of its characters
  • The output should be the first word with the highest value (in case of duplicate sums)

Examples

Input: "a b c d e"
Output: "e"

Input: "hello world"
Output: "world"

Input: "this is a test"
Output: "test"

Input: "àà as a test"
Output: "àà"

Input: "α ää"
Output: "α"

Input: " 隣隣隣"
Output: "隣隣隣"

Input: "    ️  "
Output: "️"
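For reference, here is a minimal ungolfed sketch of the rules in Python. The names utf16_units and supreme_word are made up for this illustration and do not come from the challenge. The sketch assumes that "UTF-16 code" means one 16-bit code unit, as JavaScript's charCodeAt returns. The accented letters are written as escapes to make clear that the precomposed forms are meant.

```python
def utf16_units(word):
    """Return the UTF-16 code units of `word` as a list of integers."""
    data = word.encode("utf-16-le")  # little-endian, no byte-order mark
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def supreme_word(text):
    """Return the first whitespace-separated word with the largest unit sum."""
    # str.split() with no argument splits on any run of whitespace.
    # max() returns the first maximal item, which settles ties as the rules ask.
    return max(text.split(), key=lambda w: sum(utf16_units(w)))


examples = [
    "a b c d e",
    "hello world",
    "this is a test",
    "\u00e0\u00e0 as a test",  # "àà as a test"
    "\u03b1 \u00e4\u00e4",     # "α ää"
]
for text in examples:
    print(repr(text), "->", supreme_word(text))
```

This version raises ValueError on input that contains no word at all, a case the challenge does not define.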

This is code golf, so the shortest answer wins! Good luck :)

GammaGames

Posted 7 years ago

Reputation: 995

Will there always be at least one space (at least 2 words)? – Emigna – 7 years ago

If there's only one word just output the single word, since it's the max – GammaGames – 7 years ago

This would have been more interesting with ASCII instead of Unicode, because more languages could have participated. Requiring Unicode support doesn't seem to add anything to the challenge – Luis Mendo – 7 years ago

I mostly used Unicode because it has emojis lol – GammaGames – 7 years ago

Since many of the current answers seem to use the sum of UTF-8 or UTF-32 code units, you should add some additional test cases. For example "α ää" yields different results with UTF-8 (383 < 718) and UTF-16 (945 > 456). – nwellnhof – 7 years ago

" 隣隣隣" could be used to weed out answers that simply add codepoints (UTF-32). Also, do you really mean to sum UTF-16 code units (16-bit numbers) or something else? – nwellnhof – 7 years ago

I mean the number given by JavaScript's charCodeAt function, which is a UTF-16 code unit according to the documentation. That's what I used when I was testing how feasible the challenge was. I'll add the other test cases though! – GammaGames – 7 years ago
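The figures quoted for "α ää" can be reproduced with a short Python sketch. The helper unit_sum is a hypothetical name for this illustration. It sums the code units a word has in a given encoding, taking one byte per unit for UTF-8, two for UTF-16 and four for UTF-32.

```python
UNIT_WIDTH = {"utf-8": 1, "utf-16-le": 2, "utf-32-le": 4}  # bytes per code unit


def unit_sum(word, encoding):
    """Sum the code units of `word` in the given encoding."""
    width = UNIT_WIDTH[encoding]
    data = word.encode(encoding)
    return sum(
        int.from_bytes(data[i:i + width], "little")
        for i in range(0, len(data), width)
    )


alpha = "\u03b1"          # α
a_uml = "\u00e4\u00e4"    # ää (precomposed)
for encoding in UNIT_WIDTH:
    print(encoding, unit_sum(alpha, encoding), unit_sum(a_uml, encoding))
```

Under UTF-8 the second word wins; under UTF-16 and UTF-32 the first one does, which is why this test case separates the approaches.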

When you say separated by whitespace... is input separated by newline allowed? – JayCe – 7 years ago

Yeah, newlines are allowed. Tabs too! – GammaGames – 7 years ago

Can we take input as an array/list of words? – Shaggy – 7 years ago

It cannot be an array, it has to be a string of words with any whitespace characters in between – GammaGames – 7 years ago

Answers

3

Jelly, 7 bytes

ḲOS$ÐṀḢ

Try it online!

ḲOS$ÐṀḢ
Ḳ        Split input on spaces
    ÐṀ   Give words that have maximum of:
   $       Monad:
 O           ord(each character)
  S          sum
      Ḣ  First word that gives the max ord-sum.
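The same three steps can be sketched in Python for readers who do not know Jelly. This is only a rough analogue: the function names are invented here, and ord returns code points, which match UTF-16 code units only for characters inside the Basic Multilingual Plane.

```python
def ord_sum(word):
    """Sum of the code points of `word` (the O and S steps)."""
    return sum(ord(ch) for ch in word)


def first_of_maximal(text):
    words = text.split(" ")                              # split on spaces
    best = max(ord_sum(w) for w in words)                # largest ord-sum
    maximal = [w for w in words if ord_sum(w) == best]   # every word reaching it
    return maximal[0]                                    # head of that list


print(first_of_maximal("hello world"))
```

Collecting all maximal words first and then taking the head is what makes the earliest of several tied words win.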

dylnan

Posted 7 years ago

Reputation: 4 993

If the spec is relaxed to input being allowed as a list of words then O§MḢị – Jonathan Allan – 7 years ago

@JonathanAllan Where did OP say that was allowed? – dylnan – 7 years ago

didn't just if... – Jonathan Allan – 7 years ago

@JonathanAllan Ah, gotcha. – dylnan – 7 years ago

If you can take a list of words, each on their own line, that would be valid too. I don't know Jelly so I don't know if the 'whitespace' flexibility helps much – GammaGames – 7 years ago

@GammaGames It would help if I could take a list of strings, e.g. ["abc", "def"]. But at this point there are a lot of answers so I don't recommend adding new methods of input – dylnan – 7 years ago

That's why I haven't added it, though I do see why it would have been nice – GammaGames – 7 years ago

As of now, you have the lowest answer! Congrats! – GammaGames – 7 years ago

7

Perl 6, 34 bytes

*.words.max(*.encode('utf16').sum)

Try it online!

Sean

Posted 7 years ago

Reputation: 4 136

6

R, 77 69 59 58 56 44 bytes

A group effort now.

'^'=mapply
sort(-sum^utf8ToInt^scan(,""))[1]

Try it online!

Convert to code points, sum each word, negate, (stably) sort, return first element.
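A Python sketch of the same idea, for comparison. It is not a translation of the R code, and the name negated_sort_first is hypothetical. It relies on Python's sorted being stable, so words with equal sums keep their original order.

```python
def negated_sort_first(text):
    words = text.split()
    # Negating the sum turns "largest" into "smallest", so the winner sorts
    # to the front. Stability keeps the earliest of any tied words first.
    ranked = sorted(words, key=lambda w: -sum(map(ord, w)))
    return ranked[0]


print(negated_sort_first("this is a test"))
```

Sorting in descending order with reverse=True would give the same answer here, because Python preserves the original order of equal elements in that mode too.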

Technically the return value is a "named vector" whose value is the sum and name is the winning word, but this seems to follow the rules. If you want to return the winning word as a string, you'd have to spend 7 more bytes and wrap the above in names().

ngm

Posted 7 years ago

Reputation: 3 974

Is there a reason there's spaces in front of the word? When I run it on" ️ " it prints out " ️ " (with a bunch of spaces in front of it) – GammaGames – 7 years ago

@GammaGames the output is what is called a "named vector" in R. In this case the value is the sum of the code points of the winning word, and the name is printed along with it, which in this case is the winning word itself. The name is right-aligned to the number below it. – ngm – 7 years ago

Oh, neat! It looks like it does follow the rules, so I'll allow it. Cool entry! – GammaGames – 7 years ago

sort(-sapply(...)) is shorter by 3 bytes. – Giuseppe – 7 years ago

oh and you can apply scan() twice instead of el(strsplit()) to make this 69 bytes – Giuseppe – 7 years ago

Map for 64 bytes, I am sure it can be improved further. tio – JayCe – 7 years ago

oh, and the double scan can be shortened to scan(,"",t=scan(,"")) for -1 byte. Try it online! – Giuseppe – 7 years ago

@JayCe mapply does the unlist for free. – ngm – 7 years ago

Newline is white space (I think) so one scan should be sufficient. Checking with OP. – JayCe – 7 years ago

With some rebinding of ^ I've got your amazing answer down to 56 bytes – J.Doe – 7 years ago

5

05AB1E, 8 bytes

ð¡RΣÇO}θ

Try it online!

Explanation

ð¡          # split input on spaces
  R         # reverse the resulting list
   Σ  }     # sort by
    ÇO      # sum of character codes
       θ    # take the last

Emigna

Posted 7 years ago

Reputation: 50 798

Wow, I'm always amazed by the answers made in dedicated golfing languages! – GammaGames – 7 years ago

Why do you need to reverse the resulting list? It's gonna get sorted anyways right? Or does the R actually reverse the list after it's sorted? – FireCubez – 7 years ago

@FireCubez For test case àà as a test the àà and test have the same largest unicode sum. So without the reverse test would be output instead of àà. Btw, Emigna, use # to save a byte. ;) EDIT: Never mind. I see it doesn't wrap the input in a list for single word inputs.. That's unfortunate. – Kevin Cruijssen – 7 years ago
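The effect of the reversal can be seen in a small Python sketch. It assumes, as the comment implies, that the sort is stable; Python's sorted is. The helper names are invented for this example.

```python
def char_sum(word):
    # Sum of code points; equal to the UTF-16 sum for BMP-only words.
    return sum(map(ord, word))


def sort_then_last(words):
    # A stable ascending sort leaves tied words in their incoming order,
    # so the last element is the tied word that came latest in `words`.
    return sorted(words, key=char_sum)[-1]


words = "\u00e0\u00e0 as a test".split(" ")   # "àà as a test"
print(sort_then_last(words))        # without the reversal
print(sort_then_last(words[::-1]))  # with the reversal
```

Without the reversal the later tied word ends up last; reversing first makes the word that appeared earliest in the input come out last instead.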

4

JavaScript (ES6), 81 bytes

s=>s.split` `.map(m=s=>m=[...s].map(c=>t+=c.charCodeAt(),t=0)&&t<=m?m:(r=s,t))&&r

Try it online!

Arnauld

Posted 7 years ago

Reputation: 111 334

That's way better than the code I came up with when I was writing the challenge, mine was ~200 chars long! – GammaGames – 7 years ago

72 bytes – guest271314 – 7 years ago

@guest271314 doesn't work for the second last test case and some extreme cases like f(" 龘龘龘龘龘") – Shieru Asakoto – 7 years ago

Oh nvm 隣(\uf9f1) was the one in CJK Compatibility Ideograph block instead lol. Thought it was 隣(\u96a3), the one in CJK Unified Ideograph block. – Shieru Asakoto – 7 years ago

Apparently 龘龘龘龘龘 would be the expected result I'd say. – Shieru Asakoto – 7 years ago

@ShieruAsakoto The sum of "龘龘龘龘龘" is 204280 and the sum of "" is 55357, correct? – guest271314 – 7 years ago

@guest271314 the sum of should be 112191 because of UTF-16 encoding – Shieru Asakoto – 7 years ago
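The gap between 55357 and 112191 comes from surrogate pairs: a character outside the Basic Multilingual Plane occupies two UTF-16 code units, and both count towards the sum. The Python sketch below uses U+1F602 purely as an assumed stand-in, chosen because its two code units add up to the 112191 quoted above; which character the comments actually refer to is not stated here.

```python
import struct


def utf16_units(word):
    """Return the UTF-16 code units of `word` as a tuple of integers."""
    data = word.encode("utf-16-le")
    return struct.unpack("<%dH" % (len(data) // 2), data)


astral = "\U0001F602"   # assumed example character outside the BMP
bmp_word = "\u9f98" * 5  # five copies of U+9F98

print(ord(astral))               # a single code point
print(utf16_units(astral))       # high and low surrogate
print(sum(utf16_units(astral)))
print(sum(utf16_units(bmp_word)))
```

Summing only the first unit gives the high surrogate, 55357, which is the figure obtained when a character is read with a single charCodeAt call.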

4

jq, 61 43 57 37 characters

(57 39 53 33 characters code + 4 characters command line options)

./" "|reverse|max_by(explode|add)

Sample run:

bash-4.4$ jq -Rr './" "|reverse|max_by(explode|add)' <<< 'àà as a test'
àà

Try it online!

manatwork

Posted 7 years ago

Reputation: 17 865

Indeed. Missed that case. ☹ Thanks, @nimi. – manatwork – 7 years ago

4

PowerShell, 74 52 bytes

(-split$args|sort{$r=0;$_|% t*y|%{$r+=$_};$r}-u)[-1]

Try it online!

Thanks to mazzy for a whopping -22 bytes.

-splits the input $args on whitespace, pipes that into sort with a particular sorting mechanism {...} and the -unique flag.

Here we're taking the current word $_, converting it with toCharArray (matched by the t*y wildcard), then adding each character into our running total $r. That turns the string into a number based on its UTF-16 representation.

For once, PowerShell having all strings be UTF-16 in the background is a life-saver!

We then encapsulate those results in (...) to transform them into an array and take the last [-1] one, i.e., the largest result that's the closest to the start of the sentence. This works because of the -unique flag, i.e., if there's a later element that has the same value, it's discarded. That word is left on the pipeline and output is implicit.
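A loose Python analogue of the "sort, drop later duplicates, take the last" idea is sketched below. It does not model how Sort-Object works internally; it only mirrors the behaviour described above. The names utf16_sum and sort_unique_last are hypothetical.

```python
def utf16_sum(word):
    """Sum of UTF-16 code units, computed from the little-endian bytes."""
    data = word.encode("utf-16-le")
    # Each unit is low_byte + 256 * high_byte, so the two sums can be split.
    return sum(data[0::2]) + 256 * sum(data[1::2])


def sort_unique_last(text):
    first_seen = {}
    for word in text.split():
        # setdefault keeps the earliest word for each distinct sum,
        # so a later word with the same value is discarded.
        first_seen.setdefault(utf16_sum(word), word)
    ranked = [first_seen[s] for s in sorted(first_seen)]
    return ranked[-1]  # the largest remaining sum


print(sort_unique_last("this is a test"))
```

Because duplicates are dropped before the last element is taken, no reversal of the word list is needed to break ties in favour of the earlier word.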

AdmBorkBork

Posted 7 years ago

Reputation: 41 581

it's smart. Thanks. 2 moments: why not sort -u instead a reverse? can be enough + to convert in the number? (-split$args|sort{($_|% t*y|%{+$_})-join"+"|iex} -u)[-1] – mazzy – 7 years ago

more golf: (-split$args|sort{$r=0;$_|% t*y|%{$r+=$_};$r}-u)[-1] :) – mazzy – 7 years ago

@mazzy Yes, thanks! – AdmBorkBork – 7 years ago

4

Pyth, 8 bytes

h.MsCMZc

Test suite

I know there's already a Pyth answer but I feel like this uses a pretty different approach and also it's waaaay shorter

Explanation:
h.MsCMZc  | Full code
h.MsCMZcQ | with implicit variables added
----------+------------------------------------
h         | The first element of
       cQ | the input chopped at whitespace
 .M       | with the maximal value for
   s      | the sum of
    CMZ   | the Unicode value of each character

hakr14

Posted 7 years ago

Reputation: 1 295

Wow, that's really precise! Thanks for the explanation! – GammaGames – 7 years ago

3

Python 3, 55 52 bytes

lambda s:max(s.split(),key=lambda w:sum(map(ord,w)))

Try it online!

  • -3 bytes thanks to Gigaflop for pointing out that no argument is needed in the split method.

dylnan

Posted 7 years ago

Reputation: 4 993

You can save 3 bytes by passing no args to split(), as it splits on any group of whitespace. – Gigaflop – 7 years ago

2

Japt -h, 8 bytes

@Emigna's approach

¸w ñ_¬xc

Try it online!


Another Approach

Japt -g, 8 bytes

¸ñ@-X¬xc

Try it online!

Luis felipe De jesus Munoz

Posted 7 years ago

Reputation: 9 639

Identical to what I was about to post. The need for the reversal annoys me; would've preferred if we could output any of the words in the case of a tie. – Shaggy – 7 years ago

@Shaggy if that was possible, I have a 6 bytes answer for it – Luis felipe De jesus Munoz – 7 years ago

Same 6-byter I started with before spotting that requirement in the spec. – Shaggy – 7 years ago

I'm sorry! Originally when I sandboxed the challenge I figured it could output any of the answers, but I changed it after a little feedback so it was more consistent – GammaGames – 7 years ago

2

MATLAB, 57 bytes

s=strsplit(input('','s'));[Y I]=max(cellfun(@sum,s));s(I)

In my MATLAB R2016a all tests pass, except that emojis are not rendered properly. The characters themselves are returned correctly.

aaaaa says reinstate Monica

Posted 7 years ago

Reputation: 381

2

Java (JDK), 117 97 84 bytes

-13 bytes thanks @Nevay. Apparently I didn't know I can also use var in Java.

s->{var b="";for(var a:s.split(" "))b=a.chars().sum()>b.chars().sum()?a:b;return b;}

Try it online!

Shieru Asakoto

Posted 7 years ago

Reputation: 4 445

-13 bytes: s->{var b="";for(var a:s.split(" "))b=a.chars().sum()>b.chars().sum()?a:b;return b;} – Nevay – 7 years ago

1

Ruby, 45 characters

->s{s.split.max_by{|w|w.codepoints.reduce:+}}

Sample run:

irb(main):001:0> ->s{s.split.max_by{|w|w.codepoints.reduce:+}}['àà as a test']
=> "àà"

Try it online!

Ruby 2.4, 40 characters

->s{s.split.max_by{|w|w.codepoints.sum}}

(Untested.)

manatwork

Posted 7 years ago

Reputation: 17 865

1

Pyth, 33 bytes

FHmCdmcd)Kczd aYu+GHmCdH0)@KxYeSY

Try it online!

There is almost certainly a better way to do this, but I spent too much time on it so this will do.

FH  #For every array of letters in 
  mCd   #the array of arrays of letters [['w', 'o', 'r', 'l', 'd'], ['h', 'e', 'l', 'l', 'o']]
     mcd)   #wrap that in another array [["hello"], ["world"]]
         Kczd   #split input(z) on spaces ["hello", "world"] and assign it to K for later
              aY     #append to list Y... " " silences the prints from the for loop.
                u+GH    #reduce the list of numbers by summing them    
                    mCdH    #convert each letter in the array to its int counterpart
                        0)    #the zero for the accumulator and close for loop
                          @K    #get by index the word from K
                            xY   #find the index in Y of that number
                              eSY   #sort Y, get the last (largest) number

I would have passed a reduce into another map instead of using the for loop, but I couldn't get that to work.

Tryer

Posted 7 years ago

Reputation: 71

Oh boy, a pyth answer! Thanks for the explanation, nice entry! – GammaGames – 7 years ago

1

Charcoal, 20 bytes

≔⪪S θ≔EθΣEι℅λη§θ⌕η⌈η

Try it online! Link is to verbose version of code. Explanation:

≔⪪S θ

Split the input string on spaces and assign to q.

≔EθΣEι℅λη

Calculate the sum of the ordinals of the characters in each word and assign to h.

§θ⌕η⌈η

Find the index of the highest sum and print the word at that index.
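The three steps map onto Python fairly directly. The sketch below is an approximation for illustration, with an invented function name, and it sums code points via ord rather than reproducing Charcoal's own semantics.

```python
def word_at_index_of_max(text):
    words = text.split(" ")                       # step 1: split on spaces
    sums = [sum(map(ord, w)) for w in words]      # step 2: ordinal sum per word
    # step 3: list.index returns the first position of the maximum,
    # so the earliest word wins when sums are tied.
    return words[sums.index(max(sums))]


print(word_at_index_of_max("hello world"))
```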

Neil

Posted 7 years ago

Reputation: 95 035

1

PowerShell, 66 bytes

Straightforward. See AdmBorkBork's answer for a smarter use of PowerShell.

-split$args|%{$s=0
$_|% t*y|%{$s+=$_}
if($s-gt$x){$w=$_;$x=$s}}
$w
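The loop above keeps a running best and replaces it only on a strictly greater sum. A Python sketch of that pattern follows; the names are hypothetical and the starting sum of 0 is an assumption standing in for the uninitialised $x.

```python
def running_best(text):
    best_word, best_sum = "", 0
    for word in text.split():
        data = word.encode("utf-16-le")
        # Sum of 16-bit units: low bytes plus 256 times the high bytes.
        total = sum(data[0::2]) + 256 * sum(data[1::2])
        if total > best_sum:  # strict comparison: a tie keeps the earlier word
            best_word, best_sum = word, total
    return best_word


print(running_best("a b c d e"))
```

Using >= instead of > would hand ties to the later word, which the rules do not allow.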

Note: to work correctly with Unicode, save your script file in UTF-16 or in UTF-8 with BOM encoding.

Test script:

$f = {

-split$args|%{$s=0         # split the argument string on whitespace; for each word
$_|% t*y|%{$s+=$_}         # let $s be the sum of its UTF-16 char codes
if($s-gt$x){$w=$_;$x=$s}}  # if $s is greater than the previous best, store the word and its sum
$w                         # return the stored word

}

@(
    ,("a b c d e", "e")

    ,("hello world", "world")

    ,("this is a test", "test")

    ,("àà as a test", "àà")

    ,("α ää", "α")

    ,(" 隣隣隣", "隣隣隣")

    ,("    ️  ", "️")
) | % {
    $s,$e=$_
    $r=&$f $s
    "$($r-eq$e): $r"
}

Output:

True: e
True: world
True: test
True: àà
True: α
True: 隣隣隣
True: ️

mazzy

Posted 7 years ago

Reputation: 4 832