There is a sentence with many cats:
there is a cat house where many cats live. in the cat house, there is a cat called alice and a cat called bob. in this house where all cats live, a cat can be concatenated into a string of cats. The cat called alice likes to purr and the cat called bob likes to drink milk.
The Task
Concatenate (with "_") all pairs of neighbouring words in the sentence and place each concatenation between the two words of its pair if that pair occurs more than once in the sentence. Note that overlapping occurrences count, so "blah blah" occurs twice in "blah blah blah".
For example, if "the cat" occurs more than once, add the concatenated words in between them like this: "the the_cat cat"
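The overlapping rule can be illustrated with a small Python sketch. It is not part of the challenge itself: the helper name `pair_counts` is made up for illustration, and it assumes that words are separated by whitespace.

```python
from collections import Counter

def pair_counts(sentence):
    # Hypothetical helper: count every pair of neighbouring words.
    # Assumption: words are whitespace-separated.
    words = sentence.split()
    # zip(words, words[1:]) visits each adjacent pair in turn,
    # so overlapping occurrences are counted separately.
    return Counter(zip(words, words[1:]))

print(pair_counts("blah blah blah")[("blah", "blah")])  # 2
```

Each word is paired with its right-hand neighbour, so the middle "blah" takes part in two pairs, which gives the count of two.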
Example Output
there there_is is is_a a a_cat cat cat_house house where many cats live. in the the_cat cat cat_house house,
there there_is is is_a a a_cat cat cat_called called called_alice alice and a a_cat cat cat_called called called_bob bob.
the the_cat cat cat_called called called_alice alice likes likes_to to purr and the the_cat cat cat_called called called_bob bob likes likes_to to drink milk.
Some more examples:
milk milk milk -> milk milk_milk milk milk_milk milk
a bun and a bunny -> a bun and a bunny
milk milk milk. -> milk milk milk.
bun bun bun bun. -> bun bun_bun bun bun_bun bun bun.
Notes
- All UTF-8 characters are allowed, meaning that punctuation is part of the input.
- Punctuation becomes part of the word (e.g., in "in the house, is a cat" the word "house," includes the comma).
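Read together with the task description, these notes suggest the following ungolfed reference sketch in Python. It is one possible reading rather than an official solution: it assumes that a word is any run of non-whitespace characters and that output words are joined by single spaces, and the function name `join_repeated_pairs` is hypothetical.

```python
from collections import Counter

def join_repeated_pairs(sentence):
    # Assumption: a word is a run of non-whitespace characters,
    # so "house," keeps its comma and differs from "house".
    words = sentence.split()
    if not words:
        return ""
    # Count neighbouring pairs; overlapping occurrences count separately.
    counts = Counter(zip(words, words[1:]))
    out = [words[0]]
    for left, right in zip(words, words[1:]):
        # Insert the joined pair only when the pair occurs more than once.
        if counts[(left, right)] > 1:
            out.append(left + "_" + right)
        out.append(right)
    return " ".join(out)

print(join_repeated_pairs("bun bun bun bun."))
```

Because the counts come from the original word list and not from the output being built, an input such as "this should produce this_should" is left unchanged.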
Why not "house house_where where", since "house where" appears twice? (Actually, I think the example output does not match the example input.) – Arnauld – 2019-09-01T19:38:59.000
well spotted, not on purpose – Bob van Luijt – 2019-09-01T20:04:11.397
What types of characters can the input have? Do we need to worry about punctuation for what counts as a word? – xnor – 2019-09-01T20:10:07.233
Good question @xnor, updated the question above – Bob van Luijt – 2019-09-01T20:15:37.637
@BobvanLuijt So that clarifies what characters are allowed, but I'm still not clear how punctuation or other characters affect what's considered a separate word. – xnor – 2019-09-01T20:18:56.343
So basically a word is a sequence of non-whitespace characters? – Arnauld – 2019-09-01T20:28:20.450
Is the count overlapping or not? i.e. does "milk milk" occur twice or once in "milk milk milk"? (I'd guess twice so "yes" but I don't know) – Jonathan Allan – 2019-09-01T20:36:18.833
Yes, correct. Suggestions on how to add this to the game rules are welcome :) – Bob van Luijt – 2019-09-01T20:38:44.953
RE: xnor & Arnauld's comments: does "milk milk" still occur twice in "milk milk milk." or not? (Note the trailing period). – Jonathan Allan – 2019-09-01T20:44:57.733
@JonathanAllan so should the output of "milk milk milk" be "milk milk_milk milk milk_milk milk"? – Nick Kennedy – 2019-09-01T20:53:19.897
Perfect; thanks – Bob van Luijt – 2019-09-01T20:59:57.797
@JonathanAllan I think the second note means that "milk milk milk." would be left alone, and "milk milk milk milk." would become "milk milk_milk milk milk_milk milk milk." – Nick Kennedy – 2019-09-01T21:06:02.467
@Nick agreed. I've added "milk milk milk." and "bun bun bun bun." as test cases and nominated for re-opening. – Jonathan Allan – 2019-09-02T11:13:38.193
Thanks @JonathanAllan. Did you remove your answer btw? – Bob van Luijt – 2019-09-02T11:14:17.233
An example like "this should produce this_should" -> "this should produce this_should" is probably worth adding (since a filtering approach might yield "this this_should should produce this_should" by mistake). – Jonathan Allan – 2019-09-02T12:41:53.603
Is "cat cat_house house," a mistake, since the second word includes the comma? – Neil – 2019-09-02T23:46:09.730