Simple portmanteau with longest overlap

4

I've been using a library called Djangular recently, which integrates Django and Angular.

After a while I thought 'Wouldn't it be fun to programmatically generate portmanteau?'

Here are some examples (i.e. test cases):

Django + Angular -> Djangular
Camel + Leopard -> Cameleopard
Spoon + Fork -> Spoork
Ham + Hammer -> Hammer

The general rule is to find the first common letter working backwards through the first word, and working forwards through the last word. Not all portmanteaus can be found this way though...

I've deliberately picked 'simple' ones to generate (for instance, we're ignoring things like Zebra + Donkey = Zedonk, which is too arbitrary to generate programmatically; ignoring such cases is the correct behaviour).

Ties can be resolved either way. Suppose you have cart + ante: both carte (since t is the last common letter in the first word) and cante (since a is the first common letter of the last word) are valid. Unfun loopholes are not allowed (including pulling 'solved' examples off Wikipedia).
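
A minimal Python sketch of the two readings of the rule, assuming lowercase input and at least one shared letter (otherwise it raises `ValueError`). The function names `by_first_word` and `by_second_word` are made up for illustration and are not part of the challenge.

```python
def by_first_word(a, b):
    """Split at the last letter of `a` that also occurs in `b`."""
    # Rightmost position in `a` whose letter appears somewhere in `b`.
    i = max(k for k, ch in enumerate(a) if ch in b)
    # Resume `b` just after the first occurrence of that letter.
    j = b.index(a[i])
    return a[:i + 1] + b[j + 1:]


def by_second_word(a, b):
    """Split at the first letter of `b` that also occurs in `a`."""
    # Leftmost position in `b` whose letter appears somewhere in `a`.
    j = min(k for k, ch in enumerate(b) if ch in a)
    # Cut `a` just before the last occurrence of that letter.
    i = a.rindex(b[j])
    return a[:i] + b[j:]


for a, b in [("django", "angular"), ("camel", "leopard"),
             ("spoon", "fork"), ("ham", "hammer"), ("cart", "ante")]:
    print(a, "+", b, "->", by_first_word(a, b), "/", by_second_word(a, b))
```

Both functions agree on the four test cases and differ on cart + ante (carte versus cante), which is the tie described above.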

This is code golf, so shortest answer wins. I'll pick the winner a month from posting. Please post your code's output for all the test cases.

Pureferret

Posted 2015-08-07T06:46:49.397

Reputation: 960

Could you elaborate on the rules for double letters? It seems that they are omitted if they appear in the first word but are kept if they appear in the second? Or is it that double vowels are omitted but double consonants are kept? – Martin Ender – 2015-08-07T07:20:32.130

Why isn't the double letter in hammer shortened to hamer? – Beta Decay – 2015-08-07T08:24:57.840

Shouldn't it be Cameopard? e is the first common letter in the first word. – Sean Latham – 2015-08-07T09:00:53.593

Do I understand right that you want to get the longest portmanteau by finding a pair of matching letters in the two words that come as close as possible in the concatenated string? – xnor – 2015-08-07T09:06:53.860

@sp3000 is it not obvious they have to be common? Also I'll try and clarify double letters. – Pureferret – 2015-08-07T12:08:05.857

2Read literally, "latest in first word, earliest in last word" would mean Spoon + Fork -> Spoork if we take the second o to be the latest in the first word. Similarly, the common t would mean Croissant + Donut -> Croissant. Another example, with earliest in last word this time, would be mock + cocktail = mococktail. I'm not sure if I'm just missing something major or... – Sp3000 – 2015-08-07T12:29:47.143

@Sp3000 The "that are closest" part of the definition seems the most descriptive (even though there's an error in the sentence, it says "first" twice instead of "first" and "last"). The rest just makes it more confusing. By that definition, for Croissant and Donut, the solution would be Croissanut, since there's only 3 letters between the two n's of CroissantDonut. – Reto Koradi – 2015-08-07T14:50:15.280

@RetoKoradi I realise now that was a bad example. I've made edits appropriately – Pureferret – 2015-08-07T16:12:56.000

@Sp3000 both should be valid. I'll add that in. – Pureferret – 2015-08-07T16:25:03.853

Going back to mock + cocktail, is it mocktail (k is last common letter of first) or mococktail (c is first common letter of second)? Or does it not matter? – Sp3000 – 2015-08-07T16:30:01.660

2This question is still very confusing. The title asks for the longest overlap, and I'm not sure what the body of the question is asking for, but whatever it is seems to favour shorter overlaps. – Peter Taylor – 2015-08-07T17:28:31.867

1Perhaps we could use a system like this for choosing which overlap is best: the final 'score' of each option is calculated with (final word length + letters overlapped) * (1 - abs(amount of first word used - amount of second word used)). For example, django + angular = djangular uses 5/6 of django and all of angular, a difference of 1/6. Multiply (1 - 1/6) by (the final length of 9 + the overlap of 3), and you get a total score of 10, by far the highest possible. I'd be glad to clarify any questions about this system! – ETHproductions – 2015-08-07T18:02:04.283
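
A rough Python sketch of the scoring idea in the comment above, reading the formula as a product, which is the reading that reproduces the quoted score of 10. The function name `score` and its parameters are assumptions chosen for illustration; the comment does not specify how the "amount used" would be measured in general.

```python
from fractions import Fraction


def score(first_len, first_used, second_len, second_used, overlap):
    """Score one candidate portmanteau (hypothetical helper)."""
    # The overlapping letters are shared, so count them only once.
    final_len = first_used + second_used - overlap
    # Difference between the fractions of each word that survive.
    balance = abs(Fraction(first_used, first_len)
                  - Fraction(second_used, second_len))
    return (final_len + overlap) * (1 - balance)


# django + angular -> djangular:
# "djang" (5 of 6 letters) and "angular" (7 of 7) share "ang".
print(score(6, 5, 7, 7, 3))  # 10
```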

1I still don't understand this part: The general rule is to find the first common letter working backwards on first word, and working forwards on the last word. Does one of them take priority? – xnor – 2015-08-07T22:04:32.173

@xnor nope, neither takes precedence. 'Ties can be resolved eitherway...' – Pureferret – 2015-08-08T14:46:42.887

Answers

3

Python 3, 128 79 bytes

Let me know if I got anything wrong; critiques are always welcome.

def f(a,b):
    c=len(a)-1-a[::-1].find(b[1])
    if a[-1]==b[0]:return a[:-1]+b
    if a[c]==a[c-1]:c-=1
    return a[:c]+b[1:]

Called with f(word 1,word 2) and returns the portmanteau. Will throw an error if the inputs aren't compatible. Works for all test cases.

Edit:

If I've read the edits to this question correctly, this updated code should work

f=lambda a,b:a[:-1]+b if a[-1]==b[0] else a[:len(a)-1-a[::-1].find(b[1])]+b[1:]

cole

Posted 2015-08-07T06:46:49.397

Reputation: 3 526

2

Ruby, 53 49 bytes

Work in progress; I don't have much time atm, will fix later:

f=->s,t{s+"_"+t=~/(.*?)(.+)(.*?)_.*?\2/;$1+$2+$'}

Doesn't give the expected result for camel + leopard:

f["django","angular"] -> "djangular"
f["camel","leopard"] -> "card"
f["croissant","donut"] -> "cronut"
f["spoon","fork"] -> "spork"
f["ham","hammer"] -> "hammer"

daniero

Posted 2015-08-07T06:46:49.397

Reputation: 17 193

2

CoffeeScript, 111 bytes/characters

Source:

l="toLowerCase";f=(a,b)->return a.substr(0,i)+b.substr(j)[l]()for c,j in b when(i=a[l]().lastIndexOf(c[l]()))>0

Try it online!

To use:

alert("
#{f("Django","Angular")}
#{f("Camel","Leopard")}
#{f("Spoon","Fork")}
#{f("Ham","Hammer")}
")

I hoped it would be shorter, but.... ok, still better than PHP :D

The examples result in (using the alert statement above):

Djangular Cameleopard Spoork Hammer

Explanation (might not be valid CoffeeScript):

l="toLowerCase"; # Keeps code shorter (and DRY :D )
f=(a,b)->
  return a.substr(0,i)+b.substr(j)[l]() # Return the result
     for c,j in b # Postfix for loop
       when(i=a[l]().lastIndexOf(c[l]()))>0 
         # Run the return statement only when the 
         # character `c` is found (also stores its last position in `i`)

Bojidar Marinov

Posted 2015-08-07T06:46:49.397

Reputation: 209

2

Haskell, 61 bytes

Doesn't handle case, but that doesn't seem to be a big deal? Tell me if it is.

a!b=maybe(init a!b)((a++).(`drop`b))$last a`lookup`zip b[1..]

Then "Django"!"Angular" is "Djangular".
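
For readers who don't speak golfed Haskell, here is a hedged Python port of what the one-liner appears to do: look up the last letter of the first word in the second word, and if it isn't there, drop that letter and try again. The name `portmanteau` is made up for this sketch, and like the original it is case-sensitive.

```python
def portmanteau(a, b):
    """Trim `a` from the right until its last letter occurs in `b`."""
    if not a:
        # The Haskell version would fail with an empty-list error here.
        raise ValueError("no common letter")
    i = b.find(a[-1])
    if i < 0:
        # Last letter of `a` is not in `b`: drop it and retry (init a ! b).
        return portmanteau(a[:-1], b)
    # Keep all of `a`, then `b` after the first occurrence of that letter.
    return a + b[i + 1:]


print(portmanteau("Django", "Angular"))  # Djangular
```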

Lynn

Posted 2015-08-07T06:46:49.397

Reputation: 55 648

Oh my, I am horrible at this. Agree that case makes this challenge "ugly", so 61 vs my 99... I learn to use lookup now, and infixes. And I just confused myself cause infix operator notation in Haskell ^= code in markdown. – Leif Willerts – 2015-08-20T18:19:30.287

1

PHP (270 bytes)

function portmanteau($s1, $s2){ $r = ''; $y = false; foreach(str_split($s1) as $c1){ $r .= $c1; foreach(str_split($s2) as $c2){ if($y){ $r.= $c2; } if(strtolower($c1)==strtolower($c2)){ $y = true; continue; } } if($y){ break; } } if(!$y) $r .= $s2; return ucfirst($r); }

Ungolfed:

function portmanteau($str1, $str2){
    $resultingString = '';

    $letterMatched = false;
    foreach(str_split($str1) as $char){
        $resultingString .= $char;

        foreach(str_split($str2) as $char2){
            if($letterMatched){
                $resultingString.= $char2;
            }

            if(strtolower($char)==strtolower($char2)){
                $letterMatched = true;
                continue;
            }
        }
        if($letterMatched){
            break;
        }
    }
    if(!$letterMatched) $resultingString .= $str2;

    return ucfirst($resultingString);
}

Usage:

echo portmanteau('Croissant', 'Donut');

Tested and works with the examples in the question.

Dzhuneyt

Posted 2015-08-07T06:46:49.397

Reputation: 231

1What will it output ? Coissonut ? – The random guy – 2015-08-07T11:20:16.963

No, "Cronut". It iterates/loops the characters in the first word first, then the second word. – Dzhuneyt – 2015-08-08T18:36:50.273

1

Haskell, 148 bytes

C/case isn't restored though. Without case this was 99...

import Data.Char
o a b|a==""=b|0<1=a
p""b=""
p a""=""
p a b|last a==head b=a++tail b|0<1=o(p(init a)b)(p a(tail b))
z a b=p(t a)(t b)
t=map toLower

Leif Willerts

Posted 2015-08-07T06:46:49.397

Reputation: 1 060

You can make o, p, and z infix operators (try !, #, %) to save some bytes everywhere (as a!b=... is shorter than o a b=...). – Lynn – 2015-08-20T16:39:04.287

1

Perl, 138

I'm sure someone can optimize this further:

($a,$b,$l,$r)=(@ARGV,0);for$A(split//,$a){$r.=$A;for$B(split//,$b){$r.=$B if$l;if($A=~/$B/i){$l=1;next;}}last if$l;}$r.=$b if!$l;

138 w/ optimizations and output:

($a,$b,$l,$r)=(@ARGV,0);for$A($a=~/./g){$r.=$A;for$B($b=~/./g){$r.=$B if$l;if($A=~/$B/i){$l=1;next;}}last if$l;}$r.=$b if!$l;print"$r\n";

aquaone

Posted 2015-08-07T06:46:49.397

Reputation: 11

0

rs, 24 bytes

(.*)(.).* .*?(\2.*)/\1\3

This builds on the initial post from @refi64; here's the live demo and all test cases. The only improvement is that you can spare the \2 back reference by including it in the third group.

I didn't want to pollute with an answer that's actually a comment, but some person on Stack Exchange decided that you can't comment when you're a newcomer and that edits should be refused, because that should obviously be a comment. Thus, sorry for that misplaced content.

Cyril Lemaire

Posted 2015-08-07T06:46:49.397

Reputation: 1

2Hello, and welcome to PPCG! I'll comment for you if you want, and when you get 50 rep, you can comment too. However, Stack Exchange has a Be Nice policy, and unfortunately calling someone retarded is against the policy. – NoOneIsHere – 2017-07-28T18:01:47.010

P.S. You can comment on your own posts. – NoOneIsHere – 2017-07-29T06:39:57.047

xD They thought it all through! Thanks for the comment :D – Cyril Lemaire – 2017-07-29T08:35:23.570

0

rs, 26 bytes

(.*)(.).* .*?\2(.*)/\1\2\3

Live demo and all test cases.

So far this is the shortest answer!

The underlying logic is pretty simple. It just matches greedily on the first word (to get the last common character) and non-greedily on the second (to get the first common character). It then replaces the string with everything but the text in between the two characters.
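
The greedy/non-greedy behaviour described above can be tried out with Python's `re` module. This is only a sketch: it assumes Python's backtracking matches rs closely enough for this pattern, that the two words are lowercase and separated by a single space, and the name `portmanteau` is made up for illustration.

```python
import re

# Same pattern and replacement as the rs answer.
PATTERN = re.compile(r"(.*)(.).* .*?\2(.*)")


def portmanteau(words):
    """Apply the substitution to a string such as "django angular"."""
    # Group 1 is greedy, so group 2 lands on the last shared letter of
    # word one; `.*?` is lazy, so `\2` lands on its first occurrence in
    # word two. Everything between the two letters is dropped.
    return PATTERN.sub(r"\1\2\3", words)


for words in ["django angular", "camel leopard", "spoon fork", "ham hammer"]:
    print(words, "->", portmanteau(words))
```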

kirbyfan64sos

Posted 2015-08-07T06:46:49.397

Reputation: 8 730

@CyrilLemaire posted this: (.*)(.).* .*?(\2.*)/\1\3 (but does not have enough rep to comment) – NoOneIsHere – 2017-07-28T18:02:41.850