Here's one for all you wordsmiths out there! Write a program or function which takes a list of words and produces a list of all possible concatenative decompositions for each word. For example:
(Note: This is only a small sampling for illustrative purposes. Actual output is far more voluminous.)
afterglow = after + glow
afterglow = aft + erg + low
alienation = a + lie + nation
alienation = a + lien + at + i + on
alienation = a + lien + at + ion
alienation = alien + at + i + on
alienation = alien + at + ion
archer = arc + her
assassinate = ass + as + sin + ate
assassinate = ass + ass + in + ate
assassinate = assassin + ate
backpedalled = back + pedal + led
backpedalled = back + pedalled
backpedalled = backpedal + led
goatskin = go + at + skin
goatskin = goat + skin
goatskin = goats + kin
hospitable = ho + spit + able
temporally = tempo + rally
windowed = win + do + wed
windowed = wind + owed
weatherproof = we + at + her + pro + of
yeasty = ye + a + sty
Ok, you get the idea. :-)
Rules
- Use any programming language of your choosing. Shortest code by character count for each language wins. This means there is one winner for each language used. The overall winner will be simply the shortest code of all submitted.
- The input list can be a text file, standard input, or any list structure your language provides (list, array, dictionary, set, etc.). The words can be English or any other natural language. (If the list is English words, you'll want to ignore or pre-filter-out single-letter items except for "a" and "i". Similarly, for other languages, you'll want to ignore nonsensical items if they appear in the file.)
- The output list can be a text file, standard output, or any list structure your language uses.
- You can use any input dictionary you like, but you'll probably want to use one that provides sensible words rather than one that provides too many obscure, arcane, or obnubilated words. This is the file I used: The Corncob list of more than 58000 English words
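For anyone who wants a non-golfed reference point first, here is a minimal sketch of one way to approach the task. It is not the program that generated the sample above. The names `usable_words` and `decompositions` are made up for illustration, and it assumes a decomposition needs at least two parts, as in the examples.

```python
from functools import lru_cache


def usable_words(words):
    """Normalize the input and apply the English-specific pre-filter:
    single-letter items are dropped except for "a" and "i"."""
    cleaned = (w.strip().lower() for w in words)
    return {w for w in cleaned if len(w) > 1 or w in ("a", "i")}


def decompositions(word, vocabulary):
    """Return every way to write `word` as two or more vocabulary words."""

    @lru_cache(maxsize=None)
    def splits(start):
        # All ways to split word[start:] into one or more vocabulary words.
        if start == len(word):
            return ((),)  # one way to split the empty remainder: no parts
        found = []
        for end in range(start + 1, len(word) + 1):
            piece = word[start:end]
            if piece in vocabulary:
                for rest in splits(end):
                    found.append((piece,) + rest)
        return tuple(found)

    # The word on its own is not counted as a decomposition (assumption).
    return [parts for parts in splits(0) if len(parts) > 1]


if __name__ == "__main__":
    sample = ["after", "glow", "aft", "erg", "low", "afterglow",
              "arc", "her", "archer", "b"]
    vocab = usable_words(sample)
    for word in sorted(vocab):
        for parts in decompositions(word, vocab):
            print(word, "=", " + ".join(parts))
```

The memoized helper keeps the same suffix from being re-split over and over, which matters once the word list has tens of thousands of entries.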
Questions
This challenge is primarily about writing the code to accomplish the task, but it's also fun to comb through the results...
- What subwords occur most commonly?
- What word can be decomposed into the greatest number of subwords?
- What word can be decomposed in the most different ways?
- What words are composed of the largest subwords?
- What decompositions did you find that were the most amusing?
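If you'd like to answer the first four questions mechanically, here is a rough sketch. It assumes the decompositions have already been collected into a dict mapping each word to a list of tuples of subwords; that shape and the name `summarize` are made up for illustration. "Largest subwords" is read here as "the decomposition whose shortest part is longest", which is only one possible interpretation.

```python
from collections import Counter


def summarize(results):
    """Summarize a non-empty dict of word -> list of decompositions,
    where each decomposition is a tuple of subwords (hypothetical shape)."""
    all_splits = [(word, parts)
                  for word, splits in results.items()
                  for parts in splits]
    subword_counts = Counter(part for _, parts in all_splits for part in parts)
    return {
        # Subwords that occur most often across all decompositions.
        "most_common_subwords": subword_counts.most_common(5),
        # The single decomposition with the greatest number of parts.
        "most_subwords": max(all_splits, key=lambda wp: len(wp[1])),
        # The word with the greatest number of distinct decompositions.
        "most_ways": max(results, key=lambda w: len(results[w])),
        # Assumed reading: the decomposition whose shortest part is longest.
        "largest_subwords": max(all_splits,
                                key=lambda wp: min(len(p) for p in wp[1])),
    }


if __name__ == "__main__":
    # A few entries taken from the sample list in the question.
    sample = {
        "afterglow": [("after", "glow"), ("aft", "erg", "low")],
        "alienation": [("a", "lie", "nation"), ("a", "lien", "at", "i", "on"),
                       ("a", "lien", "at", "ion"), ("alien", "at", "i", "on"),
                       ("alien", "at", "ion")],
        "archer": [("arc", "her")],
        "temporally": [("tempo", "rally")],
    }
    for question, answer in summarize(sample).items():
        print(question, "->", answer)
```

Ties are broken by whichever entry comes first, so on a real word list you would probably want to print the top few candidates for each question rather than a single winner.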
@Geobits: Ah, thank you! I missed two decompositions of "alienation" when I cut & pasted that. Fixed now. In terms of the others, the list above is only a small sampling. My test program generated tens of thousands of answers when given the Corncob list. – Todd Lehman – 2014-09-02T00:56:06.043
"What subwords occur most commonly?" Gonna throw a wild guess out there and say 'a' might be near the top. – Sellyme – 2014-09-02T04:20:27.850
@SebastianLamerichs — I dunno... Might be, might not be. :) – Todd Lehman – 2014-09-02T04:57:08.497
@ToddLehman that sentence contains exactly 0 subwords, so 'a' is still equal first :P – Sellyme – 2014-09-02T04:58:35.387
@SebastianLamerichs if you were referring to Todd's response to you, "dunno" can be split into "dun" + "no". ;) – i alarmed alien – 2014-09-02T12:14:29.007
@ialarmedalien Is "dunno" in the word list, though? It doesn't count if it's not a word itself. – Sellyme – 2014-09-02T12:26:28.377
@SebastianLamerichs I dunno. Depends on the word list used! – i alarmed alien – 2014-09-02T13:01:13.667
What if I use Chinese as input text… – Ray – 2014-09-02T15:02:27.437
@Ray — I suppose that might result in very few decompositions... unless you figure out a way to decompose the glyphs! – Todd Lehman – 2014-09-02T17:09:15.760