Write a program that finds the most occurring paired letter in a string

20

6

The program must output the letter that is paired the most. For example, if your program was given the following string:

"Sally's friend Bobby searched for seashells."

it must output L because "ll" occurs twice, which is more frequent than the other pair "bb".

Rules:

  • If more than one letter has 1st place for occurrences, then output all of them in alphabetical order (e.g. "Sally's friends Jimmy and Bobby rummaged for seashells." should output both L AND M [or "LM" if you please] because they both occur more frequently than other pairs.)
  • Letters which are tripled, quadrupled, etc. count as one pair (e.g. "lll" in "willless" is counted as only one pair of L.)
  • Letter pairs must be in one word (e.g. "Sally's sociable friends Sammy and Bobby searched for fabulous seashells." should output L and not S because despite "ss" having more occurrences than "ll", they are separated by spaces.)
  • Count only letters from the English alphabet
  • Case does not matter (e.g. "Ss" is the same as "SS" or "ss", and all are counted as one pair of S.)

You may read your input from wherever you please. Shortest code wins.

ayane

Posted 2015-07-01T22:28:18.607

Reputation: 929

2Can we assume that only letters will occur in pairs or could the input contain double spaces or double ' etc? – Martin Ender – 2015-07-01T22:37:59.707

1Can we assume that at least one letter appears twice? – Martin Ender – 2015-07-01T22:40:20.540

@MartinBüttner Yes, you can assume at least one letter pair occurs. However, other characters may appear in pairs as well. Only count letters. – ayane – 2015-07-01T23:24:10.193

Even if there is only one pair, can I still print it in a list like ['l']? – Maltysen – 2015-07-02T01:28:09.100

@Maltysen Yes, you may do so. – ayane – 2015-07-02T02:18:41.747

Fun challenge. My main suggestion would be to add some more tests. For example: Mix of upper/lower case for repeated letters. Tripled/quadrupled letters. Non-letter character is the most repeated. – Reto Koradi – 2015-07-02T06:13:15.417

@AlexA. It says "output all of them in alphabetical order". – Reto Koradi – 2015-07-02T21:06:07.990

@RetoKoradi: Sorry, missed that. Thanks. – Alex A. – 2015-07-02T21:10:49.757

Answers

6

Pyth, 26 25 24 16 15 bytes

.M/sfthTrrz08ZG

Try it online: Demonstration

Explanation:

.M/sfthTrrz08ZG   implicit: z = input string
         rz0      convert z to lower-char
        r   8     run-length-encoding (into tuples [count, char])
    f             filter for tuples T, which satisfy:
     thT            T[0]-1 != 0 (count > 1)
   s              join to a string
.M            G   find the elements Z of "abcd...z", which produce the highest value:
  /...........Z       count of Z in ...

Jakube

Posted 2015-07-01T22:28:18.607

Reputation: 21 462

1eC -> s saves one byte. – isaacg – 2015-07-05T05:13:31.220

Do you know any good resources which I could use to learn Pyth? – Beta Decay – 2015-07-05T09:42:57.937

@BetaDecay You can find a tutorial about Pyth at http://pyth.readthedocs.org It doesn't cover all functionalities and tricks, but it is a good start. And if you have any questions, just ask in the chat.

– Jakube – 2015-07-05T11:44:26.737

7

Bash + GNU coreutils, 133

grep -Eo '([A-Z])\1+'<<<"${1^^}"|cut -c1|sort|uniq -c|sort -rn|while read n l
do((a-n&&a-0))&&exit||echo $l&&a=$n
done|sort|tr -d \\n

Testcases:

$ for t in "Sally's friend Bobby searched for seashells." \
> "Sally's friends Jimmy and Bobby rummaged for seashells." \
> "willless" \
> "Sally's sociable friends Sammy and Bobby searched for fabulous seashells." \
> "11ss11aa"
> do
> ./mostpaired.sh "$t"
> echo
> done
L
LM
LS
L
AS
$ 

Digital Trauma

Posted 2015-07-01T22:28:18.607

Reputation: 64 644

Does it only count letters? (testcase: 11ss11aa -> SA) – edc65 – 2015-07-02T04:03:26.070

@edc65 There I fixed it ;-). Actually, 11ss11aa -> AS :) – Digital Trauma – 2015-07-02T04:30:23.770

I think that your sort -r needs to be sort -rn if you have 10 or more paired letters. – Toby Speight – 2015-07-02T13:06:19.677

@TobySpeight. Yes. Fixed. – Digital Trauma – 2015-07-02T15:16:12.197

can make it shorter with AWK instead of while:

awk '!n{n=$1};n==$1'|grep -o .$
 – Nik O'Lai  – 2015-07-03T12:49:59.973

5

CJam, 29 27 bytes

leue`{2a>},s_el-$e`$z~\)-,>

Thanks to @Optimizer for golfing off 2 bytes!

Try it online in the CJam interpreter.

How it works

leu    e# Read a line from STDIN and covert to uppercase.
e`     e# Perform run-length encoding.
       e# Example: "AABBBC!!" -> [[2 'A] [3 'B] [1 'C] [2 '!]]
{2a>}, e# Filter out all pairs that are less of equal to [2].
s      e# Stringify.
       e# Example: [[2 'A] [3 'B] [2 '!]] -> "2A3B2!"
_el    e# Push a copy of the string and convert to lowercase.
       e# Example: "2A3B2!" -> "2a3b2!"
-      e# Remove all character from the second string from the first.
       e# Example: "2A3B2!" "2a3b2!" - -> "AB"
$e`$   e# Sort, perform run-length encoding and sort again.
       e# Example: "ABACABDBC" -> "AAABBBCCD"
       e#                      -> [[3 'A] [3 'B] [2 'C] [1 'D]]
                               -> [[1 'D] [2 'C] [3 'A] [3 'B]]
z~     e# Zip and dump.
       e# Example: [[1 'D] [2 'C] [3 'A] [3 'B]] -> [1 2 3 3] ['D 'C 'A 'B]
\)     e# Pop out the last element from the first array.
       e# Example: [1 2 3 3] -> [1 2 3] 3
-      e# Remove all occurrences of the popped element from the array.
       e# Example: [1 2 3] 3 -> [1 2]
,      e# Compute the length of the remainder.
>      e# Skip that many elements from the character array.

Dennis

Posted 2015-07-01T22:28:18.607

Reputation: 196 637

z~\)-,> should work as far as I can see. – Optimizer – 2015-07-02T21:05:03.537

@Optimizer: Shorter and a lot more intuitive. Thanks! – Dennis – 2015-07-02T21:07:10.073

4

Pyth - 23 22 21 20 bytes

Uses regexp substitution to replace all of two or greater of alphabet to a temp value, and uses .Maximal to get all the have the highest occurrence. Thanks to @Jakube for pointing out the redundancy of sorting and saving a byte.

.M/:rz0+Z"{2,}"KC0KG

Takes input from stdin and outputs like ['l', 'm'] to stdout.

.M        G         Values which yield maximal amount over lowercase alphabet
 /                  Count
  :                 Regexp substitution
   rz0              Lowercased input
   +Z               String concatenate current loop var         
    "{2,}"          Regexp 2 or more of previous group
   KCZ              Inline assign null byte to K and use value
  K                 Count K which is null byte

Try it online here.

Maltysen

Posted 2015-07-01T22:28:18.607

Reputation: 25 023

4

C, 155

Something different, no regexps.

m=1,i=1,n[91];
main(c,a)char**a;
{
  char*t=a[1];
  for(;c=*t++;)(c&=95)>64&&c<91&&(c-(*t&95)?i=1:(c=(n[c]+=i),i=0,m=m<c?c:m));
  for(c=0;++c<91;)n[c]-m||putchar(c);
}

edc65

Posted 2015-07-01T22:28:18.607

Reputation: 31 086

3

Python 2, 132 143 bytes

def f(x):import re;x=re.findall(r'(.)\1+',x.upper());s={l:x.count(l)for l in x};print "".join(sorted([l for l in s if s[l]==max(s.values())]))

Example run:

f("Sally's friends Jimmy and Bobby rummaged for seashells.")
LM

heo

Posted 2015-07-01T22:28:18.607

Reputation: 91

1Probably it doesn't fulfil "Letters which are tripled, quadrupled, etc. count as one pair" – Ginden – 2015-07-02T13:27:09.163

You're right! i've tried to fix it. thanks for pointing that out :) – heo – 2015-07-02T13:51:29.090

2

R, 105 bytes

cat(substr(names(b<-table(regmatches(s<-toupper(readline()),gregexpr("([A-Z])\\1+",s))))[b==max(b)],1,1))

This reads a line of text from STDIN and prints a space delimited list of the most common paired letters to STDOUT.

Ungolfed + explanation:

# Read a string from STDIN, convert to uppercase
s <- toupper(readline())

# Get each match of the regex /([A-Z])\1+/
matches <- regmatches(s, gregexpr("([A-Z])\\1+", s))

# Compute the frequency of each match
freq <- table(matches)

# Get the matches with the highest frequency
highest <- names(freq)[freq == max(freq)]

# Each element of highest is the literal pair, so take the first character
first <- substr(highest, 1, 1)

# Print to STDOUT
cat(first)

Examples:

> (code)
Sally's friends Jimmy and Bobby rummaged for seashells.
L M

> (code)
Sally's friend Bobby searched for seashells.
L

> (code)
Sally's sociable friends Sammy and Bobby searched for fabulous seashells.
L

> (code)
11ss11nn
N S

You can try it online!

Alex A.

Posted 2015-07-01T22:28:18.607

Reputation: 23 761

You can probably get rid of the toupper if you ignore case and use perl in your gregexpr. eg cat(substr(names(b<-table(regmatches(s<-readline(),gregexpr("(\\w)\\1+",s,T,T))))[b==max(b)],1,1)) – MickyT – 2015-07-02T02:05:08.590

@MickyT: I had thought about that, but it looks like the OP wants the output to be uppercase, so it'd have to use toupper to ensure that anyway. – Alex A. – 2015-07-02T02:07:46.883

That's ashame, I missed that when I read the question. – MickyT – 2015-07-02T02:45:56.470

Tried the fiddle, but it seems to run forever with no outpuy in firefox. Testcase : 11ss11nn? – edc65 – 2015-07-02T04:19:42.340

@edc65 It's a problem with R-Fiddle; nothing at all works. I contacted their admin to report the issue. Fixed my regex and now your test works as expected, but it cost me 2 bytes. Thanks for pointing this stuff out, I appreciate it! – Alex A. – 2015-07-02T14:26:08.523

2

CJam, 37 bytes

leue`{~_el-\(e&},1f=$e`$_W=0=f-{,1=},

Try it online

Without regular expression support, I'm afraid it will be tough to compete with Pyth. This is the best I came up with on a first pass.

Explanation:

l     Get input.
eu    Convert it to upper case, since case does not matter.
e`    Run length encoding, to split into groups of same characters.
{     Start of block for filtering.
  ~     Unpack the length/letter pair.
  _     Copy the letter.
  el    Change copy to lower case.
  -     Subtract to compare. If result is non-zero, this is a letter.
  \     Swap count to top.
  (     Decrement to get truthy value for count > 1.
  e&    Logical and: It's a letter, and count is > 1.
},    End of filter.
1f=   Don't need the counts anymore, filter out the letters only from the RLE pairs.
$     Sort them, so that multiples of the same letter are sequential.
e`    RLE again, to count how many multiples of each letter we had.
$     And sort again, to get the count/letter pairs in order of incrementing count.
_     Copy list.
W=0=  Pick out count of last element, which is the highest count.
f-    Remove count from pairs that have the highest count. This leaves them
      as one member lists with letter only, while others still have count/letter.
{     Start block for filter.
  ,1=   Check for list length one.
},    End filter.

Reto Koradi

Posted 2015-07-01T22:28:18.607

Reputation: 4 870

2

Q (66)

Relatively readable to boot:

{where g=max g:.Q.A#count each group y where not differ y:upper x}

skeevey

Posted 2015-07-01T22:28:18.607

Reputation: 4 139

2

Python 2, 185 159 153

i=input().lower()
d=sorted({l:len(i.split(l+l))for l in map(chr,range(97,123))}.items(),None,lambda x:x[1],1)
print sorted(c for c,k in d if k==d[0][1])

Takes input as a quoted string.

Daniel Wakefield

Posted 2015-07-01T22:28:18.607

Reputation: 484

2

Ruby, 60

f=->s{(?a..?z).group_by{|l|s.scan(/#{l*2}+/i).size}.max[1]}

p f["Sally's friends Jimmy and Bobby rummaged for seashells."]

group_by creates a hash (dictionary) structure where the keys are the output of the block and the values are lists of letters that result in each key. In this case, the keys are counts of 2+ runs of a letter, case-insensitive. max compares each [key,value] tuple lexicographically, so it just finds the maximum key. Then [1] returns the value list part of the tuple.

histocrat

Posted 2015-07-01T22:28:18.607

Reputation: 20 600

2

C# 160 Bytes

Where s is the input:

char? d(string s){s=s.ToUpper();return s.Select((x,i)=>new{y=x,z=i==0?(char?)null:s[i-1]}).Where(x=>x.y==x.z).GroupBy(x=>x.z).OrderBy(x=>x.Count()).Last().Key;}

DLeh

Posted 2015-07-01T22:28:18.607

Reputation: 1 111

1

rs, 146 bytes

[^A-Za-z]/
*(.)\1+/\1\1
*(.)(?!\1)/
$/#
+*(.)#(?!.*?\1)/#\1
+*(.)(.*)#(.*)\1/\2#\3\1\1
#/
*(.)(\1*)/\1(_)^^((^^\1\2))
([^_])(_+)(?!_)(?=.*\2_)/
_/

Try it! Please! It took me forever to make the buttons even with the output box on that page...

Well, this was fairly...crazy. The logic here is kind of weird; I'll only post an explanation if someone asks. (Of course, I also said that for an INTERCAL answer whose explanation was requested...that I never explained... ;)

kirbyfan64sos

Posted 2015-07-01T22:28:18.607

Reputation: 8 730

I like the interpreter, but you might want to put the debug checkbox on the same line as the buttons or something. It kinda looks weird all the way up their. Still cool! +1 – Maltysen – 2015-07-02T02:58:37.717

Tried it (errors...) http://i.stack.imgur.com/mTioT.png

– edc65 – 2015-07-02T04:34:12.717

@Maltysen I'll consider that. Thanks! – kirbyfan64sos – 2015-07-02T14:01:33.983

@edc65 Damn...what was the complete error message? Sounds like it could be a PyPy.js bug. Or just the fact I never tested this on Firefox... – kirbyfan64sos – 2015-07-02T14:02:37.703

1

JavaScript 156 153

var x=prompt(),f={},a=0,o
x.toUpperCase().replace(/([A-Z])\1+/g,function(m,s){m=f[s]=-~f[s]
if(a==m)o.push(s)
if(a<m)a=m,o=[s]})
alert(o.sort().join(""))

wolfhammer

Posted 2015-07-01T22:28:18.607

Reputation: 1 219

f[s]?f[s]+1:1 -> -~f[s] – edc65 – 2015-07-02T03:51:45.710

It fails with non-letters: Count only letters from the English alphabet – edc65 – 2015-07-02T04:00:38.640

Thanks @edc65. I've added the shortcut and A-Z checking. – wolfhammer – 2015-07-02T09:07:03.157

1Your exact code, streamlned and ES6: f=x=>{x.toUpperCase(f={},a=0,o).replace(/([A-Z])\1+/g,(m,s)=>a<(m=f[s]=-~f[s])?(a=m,o=[s]):a>m?0:o.push(s));alert(o.sort().join'')} (the last 2 '' are really backticks, ` – edc65 – 2015-07-02T09:35:33.107

1

Bash + textutils (grep, sed), 111 chars

fold -1<<<$s|uniq -iD|sort -f|uniq -ic|sort -rn|grep -i [A-Z]|sed -n '1h;G;s/\(\s*\S\+\s\)\(.\)\n\1./\2/p'|sort

Bash + awk (instead of sed), 97 chars

fold -1<<<$s|uniq -iD|sort -f|uniq -ic|sort -rn|grep -i [A-Z]|awk '!n{n=$1};n==$1{print $2}'|sort

to test it, first assign s

s="Sally's friends Jimmy ää and Bobby rummaged ää for seashells."

Nik O'Lai

Posted 2015-07-01T22:28:18.607

Reputation: 585

0

R, 98 bytes

Very similar to Alex's solution, but uses a substitution rather than a match to determine consecutive letters. Scan is used to get the input and also to split the substitution result on spaces.

cat(names(a<-table(scan(,'',t=gsub('([A-z]?)(\\1?)[^A-z]*','\\U\\2 ',scan(,''),T,T))))[a==max(a)])

A couple of tests

> cat(names(a<-table(scan(,'',t=gsub('([A-z]?)(\\1?)[^A-z]*','\\U\\2 ',scan(,''),T,T))))[a==max(a)])
1: 11 was a race horse, 22 was one too. 11 won one race and 22 one won too.
19: 
Read 18 items
Read 2 items
O
> cat(names(a<-table(scan(,'',t=gsub('([A-z]?)(\\1?)[^A-z]*','\\U\\2 ',scan(,''),T,T))))[a==max(a)])
1: Sally's friends Jimmy and Bobby rummaged for seashells.
9: 
Read 8 items
Read 5 items
L M
> cat(names(a<-table(scan(,'',t=gsub('([A-z]?)(\\1?)[^A-z]*','\\U\\2 ',scan(,''),T,T))))[a==max(a)])
1: 11ss11nn
2: 
Read 1 item
Read 2 items
N S
> 

MickyT

Posted 2015-07-01T22:28:18.607

Reputation: 11 735