"Cut and Paste Sorting"

6

1

The Challenge

Given a string containing a random sequence of unique characters A to Z (all upper case, no duplicates), determine the cut-and-paste" sort and output the sort sequence to a specific format (detailed below).

Definition of Cut-And-Paste Sort, by Example

Example string: AKJCFEGHDBI

This string contains 11 characters, of which A, C, E, G, H, and I form the longest "lengthy sequence" from left to right that is in alphabetic order. So we base the cut-and-paste around these key letters.

If there is more than one "lengthy sequence" of equal length, the code will always favor the lengthy sequence with the biggest "stretch" across the entire string.

In the case of a slightly different string than our original example, AKGHJCFDEBI, there are three lengthy sequences of ACDE, AGHJ, and AGHI, but the code will favor the sequence with the greatest stretch, which in this case is AGHI because the distance between A and I is greatest amongst the three as opposed to the A to E in the first sequence and the A to J in the second sequence. If there happens to be more than one lengthy sequence with the same stretch length, the first lengthy sequence in alphabetic order will be chosen, just to make it easy.

The cut-and-paste sort is simply a case of plucking the appropriate letters not within the key letters and placing them back in their respective positions in alphabetic order.

Then the sort from our original example would occur as follows:

Step 1: AKJCFEGHDBI
Step 2: ABKJCFEGHDI (B gets cut and takes 8 steps to the left and gets pasted after A)
Step 3: ABKJCDFEGHI (D gets cut and takes 4 steps to the left and gets pasted after C)
Step 4: ABKJCDEFGHI (F gets cut and takes 1 step to the right and gets pasted after E)
Step 5: ABKCDEFGHIJ (J gets cut and takes 7 steps to the right and gets pasted after I)
Step 6: ABCDEFGHIJK (K gets cut and takes 8 steps to the right and gets pasted after J)

To make it easier for the purpose of this challenge we can rewrite this sort as:

B<<<<<<<<D<<<<F>J>>>>>>>K>>>>>>>>

where each letter represents the letter being cut, followed by a series of left or right arrows, represented by the < and > characters. When the arrows end, it is implied that the letter is pasted.

Bear in mind that the number of letters being cut at any one time can be more than one, so long as the letters are sequential. So if there is a string that can have two or more letters cut at any one time, the code should favor using the larger rather than perform two or more individual cut-and-pastes as this will provide a more efficient cut-paste sort.

For example:

AKGHJCFDEBI
Lengthy sequences: ACDE AGHJ AGHI; AGHI is chosen (longest stretch)
Step 1: AKGHJCFDEBI
Step 2: ABKGHJCFDEI (B<<<<<<<<)
Step 3: ABCKGHJFDEI (C<<<<)
Step 4: ABCDEKGHJFI (DE<<<<<) 
        Note two or more letters can be moved as long as they are side-by-side
Step 5: ABCDEFKGHJI (F<<<<)
Step 6: ABCDEFKGHIJ (J>)
Step 7: ABCDEFGHIJK (K>>>>)

End result: B<<<<<<<<C<<<<DE<<<<<F<<<<J>K>>>>

Code Behavior

The code must take an alphabetic upper case string with unique letters (Please assume that this string will always be upper case and with all unique letters, i.e. no dupliates), and perform a cut-paste sort according to the above rules and provide a formatted string simliar to B<<<<<<<<C<<<<DE<<<<<F<<<<J>K>>>> providing the method of cutting and pasting respective letters to achieve a string in alphabetic order.

Test Cases

AKJCFEGHDBI Result: B<<<<<<<<D<<<<F>J>>>>>>>K>>>>>>>>

AKGHJCFDEBI Result: B<<<<<<<<C<<<<DE<<<<<F<<<<J>K>>>>

BDEFHGI Result: H>

ABC Result: (already sorted)

CBA Result: B>C>>

Rules

  1. Code Golf: Shortest code wins...
  2. Code must be in the form of a piece of STDIN/STDOUT code, or a function taking the input string as a parameter and returning the output string
  3. Code must be suported by a working version of the code or verified execution of test cases.
  4. In the event that there are two or more with the same character count, the winner will be chosen by the most upvotes.
  5. No silly loopholes

WallyWest

Posted 2016-09-07T02:01:11.830

Reputation: 6 949

It's usually best to have programs assume input is always valid – Destructible Lemon – 2016-09-07T03:56:07.620

7This is codegolf, not "code place for useful real world code", we don't care about users, we care about the golf, and most times it is not much of a relevant addition to the challenge to have invalid input handled. – Destructible Lemon – 2016-09-07T04:03:23.633

@DestructibleWatermelon Noted, and updated accordingly... (+1 for a valid point) – WallyWest – 2016-09-07T04:10:29.480

1@WallyWest Hmm, you've updated the text, but you forgot to remove/change the last two test cases for invalid input. – Kevin Cruijssen – 2016-09-07T07:44:11.687

@KevinCruijssen Text updated, thank you for checking – WallyWest – 2016-09-07T07:47:14.920

May we also take input from command arguments? – Titus – 2016-09-08T12:54:59.120

As in capsort "ACFDEIHGJKB"? I see no issue with that! Love to see what you come up with, @Titus. – WallyWest – 2016-09-08T13:00:05.470

$argv[1] is 4 bytes shorter than fgets(STDIN). Not that it matters much, now that I see the result. can definitely be golfed. – Titus – 2016-09-08T23:31:53.940

In your "slightly different string" AKGHJCFDEBI, isn't ACDEI the longest ascending sequence? – Jordan – 2016-09-09T04:00:02.317

Also, why is the result for the BDEFHGI test case H> and not G<? If our program returns the latter, is it incorrect? – Jordan – 2016-09-09T05:26:53.387

@Jordan: G is in the key letters, which means that it is not cuttable. – Titus – 2016-09-09T06:28:00.290

Answers

1

PHP, 859 bytes

not the shortest code (yet?), but for shorter output.
Added line breaks for a little reading convenience.

function s($u,$a=''){$r=[];for($i=$k=0;$i<strlen($u);$i++)if($u[$i]>substr($a,-1))$r=array_merge($r,s(substr($u,$i+1),$a.$u[$i]));return$r?:[chr(59-strlen($a)).$a];}$w=s($a=$argv[1]);sort($w);$r=preg_split('#_+#',preg_replace("#[$w[0]]#",'_$0_',$a),-1,1);$v=substr;$x=strlen;
for($i=count($r);$i--;)if(!strpos($w[0],$q=&$r[$i]))for($o=0;++$o<$x($q);)if($q[$o-1]>$c=$q[$p=$o]){for(;++$p<$x($q) & $q[$p]>$c & $q[$p]<$q[$o-1];)$c.=$q[$p];echo$c;for(;$q[$o-1]>$v($c,-1);echo'<')$q[--$p]=$q[--$o];$q=$v($q,0,$o).$c.$v($q,$p);$o-=2;}
for(;$n<count($r);$n=count($r),$r=str_split(join($r),1))for($i=-1;++$i<count($r);)if(!strpos($w[0],$r[$i])){$s=array_splice($r,$i,1);for($e=0,$k=$i;$v($s[0],-1)<$r[--$k][0];)$e-=$x($r[$k]);for(;++$k<count($r) & $s[0][0]>$v($r[$k],-1);)$e+=$x($r[$k]);if($e)echo$s[0].str_repeat('><'[$e<0],abs($e));array_splice($r,$k,0,$s);$i-=$k>$i;}

examples

INPUT           KEY     OUTPUT                      RESULT
AKJCFEGHDBI     ACEGHI  B<J<JK>>>>>>>>F>BD<<<<B<    ABCDEFGHIJK
AKGHJCFDEBI     ACDEI   GHJ<GHJK>>>>>F>>>B<<<K>     BCDEFGHJIK
BDEFHGI         BDEFGI  H>                          BDEFGHI
ABC             ABC     _empty_                     ABC
CBA             A       B<BC>                       ABC
PHCODER         CDER    H<HP>O>>P>>>                CHDEOPR
XTREMGOD        EGO     T<R<<RTX>>>>>M>D<<<<        DEGMORTX        (I love it)

Yup. It fails for PHCODER. It´s buggy, but it´s nice.
btw: The shortest possible output for that is O<P>>HOP>>> (for example).

Titus

Posted 2016-09-07T02:01:11.830

Reputation: 13 814