Word with greatest repeat of letters

8

0

Recently there was a question on Stack Overflow where the OP was trying to write a function to find the word in a string that has the most repeated letters. It is of course not hard to write one up in seconds, and I wrote one in Javascript as short as I can for fun. But I'm not an expert in code golfing, so I wonder how much shorter can this simple program be!


Challenge

Write a program or function that takes in a string of words and return or print the word with the most repeating single letters.

Rules:

  • Choose the word with the greatest number of repeating single letters (see examples below)

  • If no word has repeating letters, return -1.

  • If two words have the same maximum number of repeating letters, choose the one closer to the start of the string.

  • The shortest submission in bytes wins.

Input

Take as input a string consisting of one or more space-delimited words. Input can be from STDIN (or closest alternative), command line parameters, or function arguments.

Output

Print the output to STDOUT to return it.

Examples

Consider the string aaabbb cccc. This contains two words: aaabbb and cccc. The word aaabbb has 3 a's and 3 b's, and cccc has 4 c's. So the maximum number of repeated letters in aaabbb is 3 and the maximum in cccc is 4. We want to choose the word with the maximum number of repeated single letters, so the output for aaabbb cccc should be cccc.

Other test cases:

Today, is the greatest day ever!  --> greatest
This is a great day               --> -1
aaabbb cccc                       --> cccc

Derek 朕會功夫

Posted 2015-07-20T06:50:05.387

Reputation: 199

What if more than one words have the same # of repeated letters? Do we print any or all? – Maltysen – 2015-07-20T06:54:20.710

@Maltysen Look at the first example - see ever – isaacg – 2015-07-20T06:57:29.937

@isaacg but greatest has two repeating, t and e. – Maltysen – 2015-07-20T06:58:47.470

2I'm not sure what is meant by the number of repeated letters. I assume aabb has 2 repeated letters. Would aaaabb be considered to have 4 repeated letters (2nd, 3rd, 4th a, 2nd b) or 2 repeated letters (a and b). – feersum – 2015-07-20T06:59:02.643

@feersum aaaabb would count as 4. Pick the maximum of the numbers of repetitions in each word. – Derek 朕會功夫 – 2015-07-20T07:02:38.940

@Maltysen Only print one of them if there are more than one word with the same number of repeated letters. – Derek 朕會功夫 – 2015-07-20T07:05:53.743

@Derek朕會功夫 To clarify, by repetition do you simply mean a letter that has appeared previously in the word? – isaacg – 2015-07-20T07:33:05.610

@isaacg I mean, for example aaaabb would be chosen over aabbb since the former has 4 repetitions of "a" which is more than 3 repetitions of "b" in the latter. – Derek 朕會功夫 – 2015-07-20T07:35:54.817

So, which would win, aaabbb or cccc - do the 2nd and 3rd a's and 2nd and 3rd b's win, or does the 4 times repeated c win? – isaacg – 2015-07-20T07:44:12.447

@isaacg cccc wins, since although aaabbb has 3 a's and 3 b's, cccc has 4 c's which is greater than both 3's. – Derek 朕會功夫 – 2015-07-20T07:45:47.653

@Derek朕會功夫 I'm still not totally sure on the rules for greatest repeats. Could you please write out the conditions explicitly? – xnor – 2015-07-20T10:03:37.940

The way I understand it based on the examples and comments, I think it would be clearer to say "most repeated letter" instead of "most repeated letters" in the question, because only a single letter counts. Or maybe "the word that contains the highest frequency of a single letter". – Reto Koradi – 2015-07-20T14:46:36.607

Will the words contain any non-alphabetical characters (other than ending punctuation)? – kirbyfan64sos – 2015-07-20T17:15:14.233

@xnor I added some clarifications in the question. Sorry for the confusion! – Derek 朕會功夫 – 2015-07-20T19:44:19.460

@kirbyfan64sos You can assume punctuation marks will not repeat in the same word, ie "a,,,b" will not happen. – Derek 朕會功夫 – 2015-07-20T19:46:50.273

I made edits to the wording and formatting of the challenge. If my edits do not match your original intent, please roll back. Note that it was unclear what should happen in the case of multiple output candidates, so I added the specification from the original SO question. – Alex A. – 2015-07-21T17:49:03.893

1Note that the original question is from Coderbyte. I looked on their website for copyright information (since this is a reproduction of Letter Count I) but I couldn't find anything. – Alex A. – 2015-07-21T17:54:30.570

Answers

7

C - GCC - 159 145 135 Bytes

x=0,w=0,i,j,c;main(int _,char**z){while(--_)for(j=i=0;_[z][i];j=++i,x=c>x?w=_,c:x)for(c=0;j--;)c+=z[_][i]==z[_][j];puts(x?z[w]:"-1");}

Will update when I can truncate it a bit

How to compile: gcc -w cg.c

How to run: ./a.out word1 word2 aass ddaaa ssdddd

Output: ssdddd

No-Output/No-match: -1

First off, I cheated a bit, by taking in the words as program arguments, but getting GCC to parse the words and give me a word count for free was too much reward to pass up.

How does it work?

We have 3 nested loops. The first incrementing through each word, the next two are mimicking a bubble sort to compare values. The first letter is never compared with itself, and each subsequent letter is compared to each previous letter. Whenever a word has more same letters than a previous word, the word is stored in x, and the count of how many same letters is also stored.

GCC Abuse

We use implicit global auto int delcarations for our integers. We allow argv to be an int instead of a char (currently a char, this is a TODO). We make use of default functions such as puts and getchar. We also make use of the comma operator to overload our trinary (conditional) operator.

Want 2 less bytes?

Replace "-1" with *z and name the file -1

Program unobfuscated and untested:

int main(int argc, char * argv[])
{
    int word = 0           // Current most repeat letters word
    int count = 0;        // Current most repeat letters
    int i, j, increment; // Counter variables

    while(--argc != 0) // While we have words from program parameters
    {
        for(j = i = 0; argv[argc][i] != '\0'; j = ++i) // Iterate through each letter until end of word
        {
            for(increment = 0; j > 0; j--) // Iterative backwards through each letter
            {
                if(argv[argc][i] == argv[argc][j])
                {
                    increment++;
                }
            }
            if(increment > count) // New greatest lettered word
            {
                word = argc;
                count = increment;
            }
        }
    }

    if(word == 0)
    {
        puts("-1");
    }
    else
    {
        puts(argv[word]);
    }

    return 0;
}

Jake

Posted 2015-07-20T06:50:05.387

Reputation: 71

1Welcome to PPCG! Nice golfing; if you get time, I'd love to see an explanation of how your code works. – Toby Speight – 2015-07-21T13:52:01.230

Updated OP with an explanation – Jake – 2015-07-21T14:12:52.190

No GCC abuse, just some old style but regular C – edc65 – 2015-07-22T07:18:55.387

thats a record! congrats – Abr001am – 2015-07-25T08:41:14.960

4

Pyth, 14 bytes

eoeS/LNN+czd_1

Demonstration. Test harness.

eoeS/LNN+czd_1
                  Implicit: z = input(), d = ' '
         czd      chop z on delimeter d.
        +   _1    Add an -1 to the end of the list.
 o                Order the list by
  eS              the maximum of
    /LN           the occurence count in the word
       N          of letters in the word.
                  (If we are mapping over -1, this will give -1/-1 = 1)
                  Since Python's sort is stable, this will leave -1 at the end if
                  all words have maximum repetition count 1.
e                 Take the final element of the resulting list and print it.

isaacg

Posted 2015-07-20T06:50:05.387

Reputation: 39 268

Still doesn't seem to be working. As per feersum's comment "aaaabb" should have 4 repeats and thus more than "greatest". But https://pyth.herokuapp.com/?code=eo.-N%7BNc%2Bz%22+-1&input=Today%2C+is+the+aaaabb+greatest+day+ever%21&debug=0 still gives greatest.

– Maltysen – 2015-07-20T07:45:25.073

Putting an l before .- seems to do it. – Maltysen – 2015-07-20T07:46:21.200

@Maltysen Right, I was silly and attempted to use string sorting. – isaacg – 2015-07-20T07:46:37.993

This is short and beautiful. – Derek 朕會功夫 – 2015-07-23T02:19:31.410

@Derek朕會功夫 Thanks! Consider upvoting. – isaacg – 2015-07-23T02:33:44.103

@isaacg I have upvoted every (working) answers as they were posted actually :) – Derek 朕會功夫 – 2015-07-23T05:32:30.350

@Derek朕會功夫 Ah, well thank you again then. – isaacg – 2015-07-23T05:32:57.223

2

Haskell, 100 bytes

import Data.List
f=g.last.sort.map((,)=<<last.sort.map length.group.sort).words
g(1,_)="-1"
g(_,w)=w

Usage example: f "Today, is the greatest day ever!" -> "greatest"

How it works:

                                                words  -- split input at spaces into list of words
          map(                                 )       -- for each word
              (,)=<<                                   -- build a pair, where the second element is the word itself
                                                       -- and the first element is made by
                                           sort        -- sort the letters
                                      group            -- group equal letters
                            map length                 -- take length of each group
                        sort                           -- sort the lengths
                    last                               -- take the last
                                                       -- now we have a list of (l,w) pairs where l is how often the most frequent letter occurs for word w
     sort                                              -- sort the list
 last                                                  -- take last element
g                                                      -- call g which checks the "-1" case 

Haskell, 79 77 bytes (untested)

import Data.List
last.sortOn(last.sort.map length.group.sort).words.(++" -1")

This uses sortOn from Data.List v4.8.0.0, which I don't have installed, so I cannot test it.

nimi

Posted 2015-07-20T06:50:05.387

Reputation: 34 639

2

K, 35 bytes (missing -1 requirement, just noticed, but time to sleep)

{1#w[>{|/{#:&:x=y}[x]'x}'w:" "\:x]}

how it works:

1 take

1#

of the words indexed by

w[

the indices to put in ascending order

>

the maximum

|/

count where

#:&:

string = char

x=y

for each char in word

'x

for each word where w (words) are

'w:

the string split by spaces

\:x

Obviously I sacrificed a bit of precision for expressiveness and legibility in the English explanation. I hope it was interesting.

protist

Posted 2015-07-20T06:50:05.387

Reputation: 570

1I think you can follow the requirement by adding ,-1 to the end of the function. – kirbyfan64sos – 2015-07-20T17:21:24.250

You should be able to leave off the colons for #:&: since from context they should parse as their monadic forms. You could also use @ to index instead of brackets, and first (*) instead of 1#. – JohnE – 2015-07-21T19:12:54.557

2

Python 2, 97 77 Bytes

Fairly straightforward solution, just maps the input (surrounded by quotes) to a tuple containing the word and the number of repeated characters. Gets the maximum, and prints out the word if the most repeated letter is repeated at all, otherwise it prints -1.

I saved 20(!) bytes by rearranging the order of input so that I didn't need a key for finding the max.

j=max((max(map(x.count,x)),x)for x in input().split())
print[-1,j[1]][j[0]>1]

Kade

Posted 2015-07-20T06:50:05.387

Reputation: 7 463

2

CJam, 25 bytes

lS/{_$e`$W=0=(\}%2/$W=~W?

Try it online

Explanation:

lS/   Get input and split at spaces.
{     Start of loop over all words.
  _     Copy, need to keep original word.
  $     Sort letters.
  e`    RLE.
  $     Sort RLE result. Sort is increasing by count.
  W=    Get last count/letter pair, which corresponds to largest count.
  0=    Extract count from pair.
  (     Decrement count, so that it has a falsy value when the count is 1.
  \     Swap count and word, so that we can sort the pairs by count.
}%    End of loop over words.
2/    Split list into pairs, which are the count/word pairs.
$     Sort the pairs.
W=    Get last one, which is the largest count.
~     Unwrap the pair.
W?    Output word if count is truthy, -1 otherwise.

Reto Koradi

Posted 2015-07-20T06:50:05.387

Reputation: 4 870

1

SWI-Prolog, 158 154 149 bytes

a(A,R):-split_string(A," ","",B),findall(X:Z,(member(Z,B),string_codes(Z,D),length(D,L),sort(D,E),length(E,M),X is M-L,X<0),S),sort(S,[_:R|_]);R= -1.

Example: a("Today, is the greatest day ever!",R). outputs R = "greatest" ..

Fatalize

Posted 2015-07-20T06:50:05.387

Reputation: 32 976

1

JavaScript, 86 111 108 bytes

s=>(l=(x=s.split` `,r=x.map(l=>(/(.)\1+/.exec(l)||[''])[0].length),x)[r.indexOf(k=Math.max(...r))],k<2?-1:l)

Definitely golf-able, the whole -1 thing added about 20 bytes.

Downgoat

Posted 2015-07-20T06:50:05.387

Reputation: 27 116

1

R, 107 bytes

w=scan(,"");l=sapply(w,function(x)max(table(strsplit(x,"")[[1]])));cat(ifelse(max(l)>1,w[which.max(l)],-1))

This reads from STDIN and prints to STDOUT.

Ungolfed + explanation:

# Read a string and split it into a vector on spaces
w <- scan(, "")

# Get the maximum number of letter repeats in each element of w
l <- sapply(w, function(x) max(table(strsplit(x, "")[[1]])))

# If the largest number in l is 1, print -1, otherwise get the word
cat(ifelse(max(l) > 1, w[which.max(l)], -1)

Alex A.

Posted 2015-07-20T06:50:05.387

Reputation: 23 761

1

C# 166 Bytes

string a(string i){var z=i.Split(' ');int g=1,c=0;var m="-1";foreach(var w in z)foreach(var l in w.Distinct()){c=w.Where(x=>x==l).Count();if(c>g){g=c;m=w;}}return m;}

Straightforward coding. Nothing Special in here.

C# sucks for code golfing :-/

Stephan Schinkel

Posted 2015-07-20T06:50:05.387

Reputation: 596

1

JavaScript (ES7?), 99

Using array comprehension, that is implemented in Firefox but no more included in EcmaScript 6.

Test using the snippet below (Firefox only)

f=s=>s.split(' ').map(w=>[for(c of(l=[m=0],w))(n=l[c]=-~l[c])>m?m=n:0]&&m>x&&(x=m,v=w),x=1,v=-1)&&v

// TEST
out=x=>O.innerHTML+=x+'\n';

test=x=>out(x+' -> '+f(x))

;["aaabbb cccc","This is a great day","Today, is the greatest  day ever a!"]
.forEach(t=>test(t));
<pre id=O></pre>
Try:<input id=I><button onclick='test(I.value),I.value=""'>-></button>

Ungolfed And more compatible

function f(s)
{
  v = -1;
  x = 1;
  s.split(' ')
  .forEach(function(w){
    l=[];
    m=0;
    w.split('').forEach(function(c){
      n=l[c]=-~l[c];
      if (n>m) m=n;
    })
    if (m>x) x=m,v=w;
  })
  return v;
}

// TEST
out=function(x) { O.innerHTML+=x+'\n' }

test=function(x) { out(x+' -> '+f(x)) }

;["aaabbb cccc","This is a great day","Today, is the greatest  day ever a!"]
.forEach(function(t) { test(t)} );
<pre id=O></pre>
Try:<input id=I><button onclick='test(I.value),I.value=""'>-></button>

edc65

Posted 2015-07-20T06:50:05.387

Reputation: 31 086

1

C (167)

double m,k,n=k=2,*O,V[256];char*f(char*c,char*s){return*c?((*O=!((n+=!(*c*=*c!=32))>1.1/modf(*(O=V+*c),&m))*m+1/n+1)>k)?f(c+!!(k=*O),c):f(c+1,s):s?!*s?s+1:f(c,s-1):0;}

TRY IT

HOW DOES THIS WORK?

  • the function is inner recursive one inside another recursive function, the insider function retrieves the beginning of a string where an inclusive character is returned by first function.
  • maximum number is given by hashing characters over an ascii table.

Abr001am

Posted 2015-07-20T06:50:05.387

Reputation: 862

1

Pure Bash (no external commands) 129 bytes

This is quite long, but still compares favourably with some of the other longer entries.

m=1
w=-1
for b;{
declare -A a
for((i=0;i<${#b};i++));{
c=${b:$i:1}
let a[$c]++
d=${a[$c]}
((d>m))&&w=$b&&m=$d
}
unset a
}
echo $w

I'm not entirely happy with some of the constucts, having to use that inner for loop is annoying. Any suggestions?

Muzer

Posted 2015-07-20T06:50:05.387

Reputation: 413

1

Python, 58

max('-1',*input().split(),key=lambda w:len(w)-len(set(w)))

TheCrypt

Posted 2015-07-20T06:50:05.387

Reputation: 141

1

Q (44 bytes)

{w l?0|/l:{max(#:)'[(=)x]}'[w:" "vs"-1 ",x]}

ungolfed

{
    words:" " vs "-1 ",x;
    counts:{max count each group x} each words;
    : words counts ? max counts;
}

scottstein37

Posted 2015-07-20T06:50:05.387

Reputation: 181

1

Haskell 96 Bytes


r o=(last.sort.map length$(group.sort)o,o)
w p=n$maximum(map(r)(words p))
n (1,_)="-1"
n (_,w)=w

```


r is a function that takes a word and return a tuple (n,w) where n is the number of occurrence of the most occurring char in the word w. For instance let x="norep", y="dnredundant", then r x=(1,norep), r y=(3,ndredundant)

w is a function that takes a string containing a number of space-separated words and:

  1. Split the list on space words p

  2. foreach word create a list of (n,w)

  3. take the tuple that has the greatest n (occurrence counter)

  4. If it has n equals to 1 simply return the string -1, the word itself (stored in the second component of the tuple) otherwise.

For instance take p="Today, is the greatest day ever!"

  1. produces ["Today,","is","the","greatest","day","ever!"]

  2. [(1,"Today,"),(1,"is"),(1,"the"),(2,"greatest"),(1,"day"),(2,"ever!")]

  3. (2,"greatest")

  4. 2 != 1 then greatest is the solution!

Davide Spataro

Posted 2015-07-20T06:50:05.387

Reputation: 211