Generate a Portmantout!

16

2

Background

Three years ago, this guy Tom Murphy got it into his head to extend the idea of a portmanteau to all words in a language and called this a portmantout (portmanteau plus tout [French for all]). Defining English as a list of 108,709 words, he managed to find a sequence of 611,820 letters with the following two properties:

  • Every English word is contained in the string.
  • Some neighborhood containing any two adjacent letters in the string is an English word.

Here's a link to a page on which this portmantout can be found (along with a video explanation).

A portmantout

The first of the two properties of a portmantout is easy to understand. The second may require some explanation.

Basically, words must overlap. "golfcode" will never appear in a portmantout of English, as there is no word there that contains the "fc". However, you might find "codegolf" in a portmantout, for "ego" bridges the gap (and all other pairs of letters are in either "code" or "golf").

Your task:

Write a program or function that takes a list of strings and returns any portmantout of the list.

This Python 3 code will verify a portmantout.

Test cases

All lists are unordered; that is,

{"code", "ego", "golf"} -> "codegolf"
{"more", "elm", "maniac"} -> "morelmaniac" or "morelmorelmaniac" or "morelmorelmorelmaniac" or...
    Would a morelmaniac be some sort of mycologist?
{"ab", "bc", "cd", "de", "ef", "fg", "gh", "hi", "ij", "jk", "kl", "lm", "mn", "no", "op", "pq", "qr", "rs", "st", "tu", "uv", "vw", "wx", "xy", "yz", "za"} -> "abcdefghijklmnopqrstuvwxyza" or "rstuvwxyzabcdefghijklmnopqrstuvwxyzabcdef" or any 27+ letters in order

And why not? The massive one on Murphy's site, if your code executes within reasonable time.

Rules

  • Your code must halt.
  • You need not return the same portmantout with each execution.
  • You may assume all strings consist of only lowercase letters a through z.
  • If no portmantout is possible, your program may do anything. Ex: {"most", "short", "lists"}
  • Standard rules for I/O and loopholes apply.

This is , so the shortest solution (in bytes) in each language wins! Happy golfing!

Khuldraeseth na'Barya

Posted 2018-06-06T23:01:33.827

Reputation: 2 608

Sandbox – Khuldraeseth na'Barya – 2018-06-06T23:02:15.377

1Maybe some test cases? – Adám – 2018-06-06T23:20:48.430

{"sic", "bar", "rabbits", "cradle"} -> "barabbitsicradle" {"mauve", "elated", "cast", "electric", "tame"} -> "mauvelectricastamelated" (more test cases) – sundar - Reinstate Monica – 2018-06-07T00:19:24.380

2Yeah, maybe a testcase where a word needs to be used twice – ASCII-only – 2018-06-07T01:04:33.830

Can we return all possible portmantouts? – Erik the Outgolfer – 2018-06-07T08:30:35.193

Will we ever get one word entirely contained in another (e.g. meow and homeowner)? – None – 2018-06-07T14:57:14.077

Is it okay if the program halts with probability 1, but could theoretically run forever? – None – 2018-06-07T15:01:57.687

2Will we ever get 1-letter words? – None – 2018-06-07T15:11:36.160

how do you verify that it returns a different one?

https://www.random.org/analysis/dilbert.jpg

– fortran – 2018-06-07T16:56:28.847

Related, not similar & old enough to be dead. – Magic Octopus Urn – 2018-06-07T18:24:08.577

@fortran The answer poster must prove that their answer is valid, not the community. – user202729 – 2018-08-03T03:28:48.223

@Mnemonic (1) I think there is a meta post about that, that says "work with probability 1 for all possible input". (2) The question doesn't specify that, so I don't see why not. Given that it's guaranteed that there is a portmantout, programs can simply filter out those 1-character ones.

– user202729 – 2018-08-03T03:31:10.453

@user202729 if you have to explain a joke, it's not funny anymore – fortran – 2018-08-07T09:17:50.467

Answers

3

Python 2, 204 202 bytes

def f(l,s=''):
 if all(w in s for w in l):return s
 for i,w in enumerate(l):
	a=next((s+w[i:]for i in range(len(w)-1,0,-1)if s[-i:]==w[:i]),0)if s else w;x=a and f(l[:i]+l[i+1:]+[l[i]],a)
	if x:return x

Try it online!


Saved

  • -2 bytes, thanks to recursive

TFeld

Posted 2018-06-06T23:01:33.827

Reputation: 19 246

You can use tabs on the last two lines to save 2 bytes. – recursive – 2018-06-07T16:03:28.953

200 bytes. – Jonathan Frech – 2018-06-08T19:58:28.733

This doesn't produce the correct output for ["ab", "ba", "ca"]. My solution has the same bug. – recursive – 2018-06-10T00:35:57.210

1

Pyth, 39 bytes

JQW}_1mxbdQ=+J|=b*qKOQ<=T+OP._KOJlKTY)b

Try it here

Explanation

JQW}_1mxbdQ=+J|=b*qKOQ<=T+OP._KOJlKTY)b
JQ                                        Get a copy of the input.
  W}_1mxbdQ                          )    While there are words in the input
                                          that aren't in b (initially space)...
                   KOQ    OP._KOJ         ... get a random input word, a random
                                          prefix, and a random joined word...
                       =T+                ... stick them together...
                  q   <          lKT      ... and check if joining them together
                                          is valid...
               =b*                        ... then update b accordingly...
           =+J|                     Y     ... and stick the new word into J.
                                      b   Output the final result.

user48543

Posted 2018-06-06T23:01:33.827

Reputation:

1

Stax, 39 36 bytes

ä▬│•.k=╠lƒ☺╜00║¿~,▓╕╠7ÉΔB<e┼>☼Θ²└ô┴\

Run and debug it

Runs all the test cases deterministically in about a second.

This is a recursive algorithm.

  • Start with each word of input as a candidate
  • At each step, order the words by the number of times they occur as substrings of the candidate.
  • For each word that's compatible with the end of the current candidate, join the word to form a new candidate, and make a recursive call.

Here's the program unpacked, ungolfed, and commented.

FG              for each word in input, call target block
}               unbalanced closing brace represents call target
  x{[#o         sort input words by their number of occurrences in the current candidate
  Y             store it in register Y
  h[#{,}M       if all of the words occur at least once, pop from input stack
                input stack is empty, so this causes immediate termination,
                followed by implicitly printing the top of the main stack
  yF            for each word in register y, do the following
    [n|]_1T|[|& intersect the suffixes of the candidate with prefixes of the current word
    z]+h        get the first fragment in the intersection, or a blank array
    Y           store it in register Y
    %t+         join the candidate with the current word, eliminating the duplicate fragment
    y{G}M       if the fragment was non-blank, recursively call to the call target
    d           pop the top of stack

Run this one

Edit: This fails for a class of inputs that has a loop, like ["ab", "ba", "ca"], as do most of the other posted answers.

recursive

Posted 2018-06-06T23:01:33.827

Reputation: 8 616

0

JavaScript (ES6), 138 130 bytes

f=a=>a[1]?a.map((c,i)=>a.map((w,j,[...b])=>i!=j&&!/0/.test(m=(c+0+w).split(/(.+)0\1/).join``)?t=f(b,b[i]=m,b.splice(j,1)):0))&&t:a

Returns an error for lists that cannot be completely portmantouted.

Ungolfed:

f = a =>
  a[1] ?                                        //if more than one element ...
    a.map((c, i)=>                              // for each element
      a.map((w, j, [...b])=>                    //  for each element
        i != j &&                               //   if not the same element
        !/0/.test(m=(c+0+w).split(/(.+)0\1/).join``) &&  //   and the elements overlap
        (t = f(b,                               //   and f recursed is true when
               b[i] = m,    //    replacing the ith element with the 2-element portmanteau
               b.splice(j, 1)                   //    and removing the jth element
              )
        )
      )
    ) &&
    t :                                         //return the recursed function value
    a                                           //else return a

f=a=>a[1]?a.map((c,i)=>a.map((w,j,[...b])=>i!=j&&!/0/.test(m=(c+0+w).split(/(.+)0\1/).join``)?t=f(b,b[i]=m,b.splice(j,1)):0))&&t:a

console.log(f(["sic", "bar", "rabbits", "cradle"])+'')             // "barabbitsicradle"
console.log(f(["mauve", "elated", "cast", "electric", "tame"])+'') // "mauvelectricastamelated"

console.log(f(["more", "elm", "maniac"])+'')                       // "morelmaniac"
console.log(f(["elm", "more", "maniac"])+'')                       // "morelmaniac"
console.log(f(["more", "maniac", "elm"])+'')                       // "morelmaniac"

console.log(f(["ego", "code", "golf"])+'')                         // "codegolf"
console.log(f(["code", "ego", "golf"])+'')                         // "codegolf"
console.log(f(["golf", "ego", "code"])+'')                         // "codegolf"

console.log(f(['ab', 'bc', 'cd', 'de'])+'');                       // "abcde"
console.log(f(['de', 'bc', 'cd', 'ab'])+'');                       // "abcde"

console.log(f(["ab", "ba", "ca"])+'');                             // "caba"

The code is excruciatingly slow on the full alphabet example (not included for that reason in the above Snippet).

That is remedied by changing the maps to somes, for the loss of 2 bytes:

f=a=>a[1]?a.some((c,i)=>a.((w,j,[...b])=>i!=j&&!/0/.test(m=(c+0+w).split(/(.+)0\1/).join``)?t=f(b,b[i]=m,b.splice(j,1)):0))&&t:a

f=a=>a[1]?a.some((c,i)=>a.some((w,j,[...b])=>i!=j&&!/0/.test(m=(c+0+w).split(/(.+)0\1/).join``)?t=f(b,b[i]=m,b.splice(j,1)):0))&&t:a

console.log(f(["ab", "bc", "cd", "de", "ef", "fg", "gh", "hi", "ij", "jk", "kl", "lm", "mn", "no", "op", "pq", "qr", "rs", "st", "tu", "uv", "vw", "wx", "xy", "yz", "za"])+'');

console.log(f(["ab", "ba", "ca"])+'');                             // "caba"

Rick Hitchcock

Posted 2018-06-06T23:01:33.827

Reputation: 2 461

1It seems I've made a mistake. I can't reproduce the behavior I thought I saw yesterday. Sorry for my confusion and wasting your time. I'll delete my comments on the subject, as they're all wrong and misleading. – recursive – 2018-06-10T15:22:40.117