Check if a string is a shuffle of twins

10

Explanation

Two strings can be shuffled by interspersing their letters to form a new string, much like two piles of cards can be shuffled to form a single pile.

For example, the strings HELLO and WORLD can be shuffled to form HWEOLRLLOD, or HEWORLLLDO, or perhaps simply HELLOWORLD.

It is not a shuffle if the original order of letters is not preserved. For example, the D in WORLD cannot ever appear before the R after being shuffled. This means that EHLLOWRDLO, for instance, is not a shuffle of HELLO and WORLD, even though it contains all the original letters.

A string is a shuffle of twins if it can be formed by shuffling two identical strings. For example, ABACBDECDE is a shuffle of twins because it can be formed by shuffling ABCDE and ABCDE. DBEACBCADE is not a shuffle of twins because it cannot be formed by shuffling two identical strings.

Program Details

Given an input string, output 0 if it is not a shuffle of twins, and output one of the twin strings if it is a shuffle of twins.

You may assume that the input string has a length inclusively between four and twenty characters and is composed entirely of uppercase alphabetic characters. It should be able to run in a reasonable amount of time, say, under 10 minutes.

This is code golf, so the shortest solution wins.

Example I/O

> ABACBDECDE
ABCDE

> DBEACBCADE
0

> FFFFFF
FFF

> FFGGG
0

> ABBA
0

> AABB
AB

> AABAAB
AAB

I have an example (non-golfed) implementation.

Peter Olson

Posted 2011-12-01T19:07:11.700

Reputation: 7 412

Example string FGG violates assertion that the input string has a length inclusively between four and twenty characters, and don't tell me "never trust user input!", "never trust the specs!" – user unknown – 2011-12-02T15:59:49.827

@userunknown That is so embarrassing! I edited it to FFGGG to make it consistent. – Peter Olson – 2011-12-02T16:07:36.850

1Just out of curiosity, can anyone come up with a solution with sub-exponential worst case time complexity, or prove that there isn't any? – Ilmari Karonen – 2011-12-02T17:54:19.277

Answers

4

Haskell, 114

main=getLine>>=putStrLn.f.p;f x=head$[a|(a,b)<-x,a==b]++["0"]
p[]=[([],[])];p(x:y)=do(a,b)<-p y;[(x:a,b),(a,x:b)]

Ungolfed:

main :: IO ()
main = getLine >>= putStrLn . findMatch . partitions

-- | Find the first partition where the two subsequences are
-- equal. If none are, return "0".
findMatch :: [(String, String)] -> String
findMatch ps = head $ [a | (a,b) <- ps, a == b] ++ ["0"]

-- | Return all possible partitions of the input into two
-- subsequences. Preserves the order of each subsequence.
--
-- Example:
-- partitions "AB" == [("AB",""),("B","A"),("A","B"),("","AB")]
partitions :: [a] -> [([a], [a])]
partitions []     = [([], [])]
partitions (x:xs) = do (a, b) <- partitions xs
                       [(x:a, b), (a, x:b)]

Explanation:

Most of the work is being done in the partitions function. It works by recursively generating all partitions (a, b) of the tail of the list, and then using the list monad to prepend the initial element x to each of them and gather all the results.

findMatch works by filtering this list so that only partitions where the subsequences are equal remain. It then returns the first subsequence in the first partition. If none remain, the list is empty, so the "0" appended at the end gets returned instead.

main just reads the input, feeds it through these two functions and prints it.

hammar

Posted 2011-12-01T19:07:11.700

Reputation: 4 011

For those of us who can't read Haskell, will you give an explanation? – Mr.Wizard – 2011-12-01T22:55:14.377

1@Mr.Wizard: See edit. – hammar – 2011-12-01T23:15:01.553

I think I came up with something fairly similar, though not as short, but I threw in a foolish sort that made it a complete failure. Do you mind if I implement this algorithm in Mathematica? – Mr.Wizard – 2011-12-01T23:20:35.193

4

R, 113 chars

l=length(x<-charToRaw(scan(,'')));max(apply(combn(l,l/2),2,function(i)if(all(x[i]==x[-i]))rawToChar(x[i])else 0))

Ungolfed (and instead a function that takes the string):

untwin <- function(x) {
  x <- charToRaw(x)
  indMtx <- combn(length(x),length(x)/2)
  res <- apply(indMtx, 2, function(i) {
    if (all(x[i]==x[-i]))
      rawToChar(x[i])
    else
      0
  })
  max(res)
}

untwin("ABACBDECDE") # "ABCDE"
untwin("DBEACBCADE") # 0

The solution relies on the combn function that generates all combinations of indices as columns in a matrix. apply then applies a function to each column (dimension 2) in the matrix and returns a vector of strings or zeroes. max then find the biggest string (which trumps 0).

A cool feature in R is the ability to select a subset of a vector given a vector of indices, and to then select the complement of that subset by negating the indices: x[i] == x[-i]

Tommy

Posted 2011-12-01T19:07:11.700

Reputation: 821

Did some incremental improvements and reduced char count. – Tommy – 2011-12-02T16:40:54.670

3

Mathematica, 87

This is directly based on hammar's post, but hopefully it is distinct enough to merit posting.

<<Combinatorica`

f=Catch[Cases[Characters@#~KSetPartitions~2,{x_,x_}:>Throw[""<>x]];0]&

Test:

f /@ {"ABACBDECDE", "DBEACBCADE", "FFFFFF", "FGG", "ABBA", "AABB", "AABAAB"}
{"ABCDE", 0, "FFF", 0, 0, "AB", "AAB"}

Mr.Wizard

Posted 2011-12-01T19:07:11.700

Reputation: 2 481

1

D

string c(string in,string a=[],string b=[]){
    if(in.length==0)return a==b?a;"0";
    auto r=c(in[1..$],a~in[0],b);
    return r=="0"?c(in[1..$],a,b~in[0]):r;
}
void main(){writeln(c(readline));}

using depth first recursive search

I can make it faster with a int i = min(a.length,b.length);if(a[0..i]!=b[0..i])return "0"; guard clause

ratchet freak

Posted 2011-12-01T19:07:11.700

Reputation: 1 334

On IDEONE, I failed trying to start the program with void main(){writeln(c("ABCADABCAD"));} - just a different version of D, my fault, something else? What's about "ABCABCA"? – user unknown – 2011-12-02T16:54:17.867

you'll need to import std.stdio; for the IO – ratchet freak – 2011-12-02T17:54:24.507

1

Ruby, 89 characters

s=->a,x,y{a=="\n"?x==y ?x:?0:[s[b=a[1..-1],x+c=a[0],y],s[b,x,y+c]].max}
$><<s[gets,'','']

This code implements a plain recursive search algorithm. The input must be given on STDIN.

Howard

Posted 2011-12-01T19:07:11.700

Reputation: 23 109

1

Perl, 68 chars

/^((.+)(?{local($x,$,)=($,,$x.$^N)}))+$(?(?{$o=$,eq$x&&$,})|x)/?$o:0

The input string assumed to be in the $_ variable, the output is the value of the expression. Trailing newlines in input are ignored. You can run this from the command line like this:

perl -lne 'print /^((.+)(?{local($x,$,)=($,,$x.$^N)}))+$(?(?{$o=$,eq$x&&$,})|x)/?$o:0'

This code makes use of Perl's regexp engine (and specifically its embedded code execution feature) to do the backtracking. Basically, it matches the input string against the regexp ^((.+))+$, keeping track of the odd- and even-numbered submatches in $x and $,, and rejecting the match at the end if the two aren't equal.

Ilmari Karonen

Posted 2011-12-01T19:07:11.700

Reputation: 19 513

Does this have the correct result for AABAAB? – Peter Olson – 2011-12-02T16:12:19.383

Yes it does. (In fact, AABAAB is an easy case for this solution, since the outer group only needs to match twice. It took me much longer to get it to handle AABB right.) – Ilmari Karonen – 2011-12-02T16:24:38.163

1

Python, 168 characters

def f(s):
 c=s[0]
 d=i=r=0
 if s==c+c:r=c
 while i+1 and i<=len(s)/2 and not r:
  if i:d=f(s[1:i]+s[i+1:])
  if d:r=c+d
  i=s.find(c,i+1)
 return r
print f(raw_input())

Steven Rumbalski

Posted 2011-12-01T19:07:11.700

Reputation: 1 353