The Fatal Error Challenge

20

Objective

Write a routine that accepts a string of printable ASCII characters, s, and returns a string containing the same characters as s, reordered so that no two-character substring appears more than once. The program must process all benchmark strings (see below) in under one minute each on a modern computer. I will also award a special bonus of 50 rep to the lowest scoring answer that processes any valid 30-character string in under one minute.

For example, given the input Mississippi, a valid output would be issiMspiips (no two-character substrings appear twice), while an invalid output would be ipMsispiiss (since the substring is appears twice).

The routine may take the form of:

  • a complete program reading from stdin (or equivalent) or the command line, and outputting to stdout (or equivalent)
  • a function accepting a single string argument and returning a string

You may assume that the input string always admits at least one valid output.

The Challenge

Your routine must comprise 5 or more lines of code separated by newlines. Empty lines (which includes lines containing only whitespace) are ignored in all contexts and do not count towards total line count.

Swapping any two lines in your source code must produce a fatal error. By "fatal error", we refer to any of the following conditions:

  • the source code fails to compile, with the compiler/interpreter declaring a fatal error
  • the routine aborts with a runtime fatal error or an unhandled runtime exception
  • the routine is forced into an abrupt, abnormal program termination that produces no output of any kind except for a possible error message and/or stack dump

Alternatively, contiguous blocks of code containing no newline characters may be used in place of lines. These blocks should each be displayed on their own line in the source file, with the understanding that newlines are stripped out before the source code is compiled/interpreted.

For example, the code

aaaa
bbbb
cccc

would condense to

aaaabbbbcccc

before being evaluated.

In this mode, the fatal error condition applies to swapping any two code blocks (and thus to swapping lines in the source code before newlines are stripped out). Hence in the above example the routines aaaaccccbbbb, bbbbaaaacccc, and ccccbbbbaaaa must all produce fatal errors, either at compiletime or runtime.

Submissions using this alternative mode should declare its use.

Scoring

Let n be the number of non-empty textlines in your source file, with n ≥ 5. Let c be the number of bytes comprised by the longest textline (by byte length) in your source file, not counting any trailing newline.

A submission's score is given by c(n + 10).

The submission with the lowest score is the winner.

Best of luck. ;)

Benchmark Strings

Abracadabra Alacazam
Is Miss. Mississauga Missing?
Ask Alaska's Alaskans
GGGGAAAATTTTCCCCgggaaatttccc
A Man A Plan A Canal Panama

COTO

Posted 2014-11-09T15:22:52.803

Reputation: 3 701

Are capital letters different from lowercase letters? i.e. Input is CooliO, output oOoCli? – FryAmTheEggman – 2014-11-09T16:49:17.560

@FryAmTheEggman: Yes. Capital letters differ from lowercase letters. In general, consider only the ASCII code values of the characters. – COTO – 2014-11-09T17:01:22.973

Does repetitions restrict to letter pairs appearing in the input? E.g. Is Mspiisiipss valid since the only repetition is ii which does not occur in Mississippi? – TwiNight – 2014-11-10T04:23:33.083

@TwiNight: Even repeated substrings that do not appear in the original string are not allowed. – COTO – 2014-11-10T04:48:56.070

May I assume anything about input length? (background: got a brilliant idea for a BF solution) – PurkkaKoodari – 2014-11-10T11:33:50.473

@Pietu1998: You can assume the input string is nonempty and comprises 30 characters or less (in the spirit of "any valid 30-character string in under one minute"). – COTO – 2014-11-10T12:46:34.883

Answers

6

PHP, score = 289 (17×(7+10))

PHP's built-in functions make it quite easy to do this badly. The following code just shuffles the string until a valid result is obtained:

function f($s){
while(preg_match(
'/(..).*?\1/',$s)
+preg_match('/(.'
.')\1\1/',$s))$s=
str_shuffle(
$s);return $s;}

Benchmarks

Average and maximum execution times calculated using the following code:

$s = 'Mississippi';
$t_total = 0;
$t_max = 0;
for ($i=0; $i<10; $i++) {
  $t0 = microtime(true);
  echo f($s);
  $t = microtime(true) - $t0;
  printf("  %10.7f\n",$t);
  $t_total += $t;
  if ($t > $t_max) $t_max = $t;
}
printf("Avg: %10.7f secs; Max: %10.7f secs\n",$t_total/$i, $t_max);

Results:

  • Mississippi: Avg: 0.0002460 secs; Max: 0.0005491 secs
  • Anticonstitutionnellement: Avg: 0.0001470 secs; Max: 0.0002971 secs
  • Pneumonoultramicroscopicsilicovolcanoconiosis: Avg: 0.0587177 secs; Max: 0.1668079 secs
  • Donaudampfschiffahrtselektrizitatenhauptbetriebswerkbauunterbeamtengesellschaft*: Avg: 9.5642390 secs; Max: 34.9904099 secs
  • baaacadaeafbbcbdbebfccdcecfdde: Avg: 5.0858626 secs; Max: 9.8927171 secs

* I changed ä to a to avoid multi-byte issues
† This was the hardest 30-character string I could come up with. It's actually the first 30 characters of the De Bruijn sequence for k='abcdef' and n=2, with the first 'b' moved to avoid an instant match.

r3mainer

Posted 2014-11-09T15:22:52.803

Reputation: 19 135

5This doesn't really satisfy > The program must process any valid 30-character string in under one minute on a modern computer., considering the potentially infinite runtime. – Bob – 2014-11-10T05:05:58.107

@Bob I've added some benchmarks to the answer. The code may be inefficient, but the likelihood of it taking more than one minute to process a 30-character string is, I think, very small indeed. – r3mainer – 2014-11-10T12:08:24.557

5

Dyalog APL (11(5+10)=165)

f←{a←{⍴⍵}
b←a⌸2,/⍵
c←b⊢⍵[?⍨a⍵]
1∨.≠b:∇c⋄⍵
}

Proof:

  • Lines 1 and 5 bound the function. Swapping any line for those would result in occurring outside of a function, which is a SYNTAX ERROR.
  • Line 2 defines b, so it cannot be swapped for lines 3 or 4, which depend on b. There would be a VALUE ERROR. (And it obviously can't be swapped with 1 or 5 either.)
  • Line 3 defines c, so it cannot be swapped for line 4, which depends on c. (And we've already proven no other line can be swapped with line 3.)
  • Line 4 depends on the variables from lines 2 and 3 and must therefore be last.

marinus

Posted 2014-11-09T15:22:52.803

Reputation: 30 224

3+1. But what does it all mean?? – r3mainer – 2014-11-10T14:03:16.363

4

APL(Dyalog), 6(5+10) = 90

{1∧.=
+/∘.≡⍨
2,/⍵:⍵
⋄∇⍵[
?⍨⍴⍵]}

I'm using the alternative, so:

{1∧.=+/∘.≡⍨2,/⍵:⍵⋄∇⍵[?⍨⍴⍵]}

This is the same old algorithm.


Explanation
2,/⍵ gives an array of character pairs in the input string
+/∘.≡⍨ generates a numerical array of how many pairs does each pair equal (including itself)
1∧.= checks if each element of that array equals 1, and logical AND the results together

:⍵ If that is true (1), return the input string

∇⍵[?⍨⍴⍵] else scramble the input string and do a recursive call


Swapping

If line 1 is swapped with line 2, then you end up with +/∘.≡⍨{...} which is just a mess of functions and operators that gives SYNTAX ERROR.
If line 1 is swapped with line 3 or 4, then you have outside of a function definition, and that's a SYNTAX ERROR.
If line 1 is swapped with line 5, unbalanced braces alone would cause SYNTAX ERROR, so don't worry about the other 4 syntax errors.

If line 5 is swapped with line 2/3/4, again you have outside of a function definition.(SYNTAX ERROR)

If line 2 is swapped with line 3, you end up with 1∧.=2,/⍵:⍵. This syntax is called a guard (think of it as a conditional). The guard condition must evaluate to 0 or 1 or a 1-element array of 0 or 1. Here, it evaluates to something more complex than that (a scalar containing a 2-element array). So this is a DOMAIN ERROR.
If line 2 is swapped with line 4, you get the statement 1∧.=, which attempts to apply the function ∧.= without the required left argument. (SYNTAX ERROR).

If line 3 is swapped with line 4, again you get a mess of functions and operators (1∧.=+/∘.≡⍨) so you get SYNTAX ERROR.


Benchmarks
(numbers in milliseconds)

Abracadabra Alacazam
11 1 3 5 2
Avg: 4.4

Is Miss. Mississauga Missing?
1260 2000 222 117 111
Avg: 742

Ask Alaska's Alaskans
7 2 3 3 4
Avg: 3.8

GGGGAAAATTTTCCCCgggaaatttccc
31 15 24 13 11
Avg: 18.8

A Man A Plan A Canal Panama
377 2562 23 301 49
Avg: 662.4

I am still thinking about different splittings. Also I have a deterministic, systematic way of doing the task. I just can't make it into an algorithm (take away the creative part of "making the numbers right") and can't make sure it works every time.

TwiNight

Posted 2014-11-09T15:22:52.803

Reputation: 4 187

0

Haskell, 129 = 3x(33 + 10)

this uses the alternative mode.

imp
ort
 Da
ta.
Lis
t;a
%[]
=[[
]];
a%s
=[x
:j|
x<-
s,n
ot$
isI
nfi
xOf
[la
st 
a,x
]a,
j<-
(a+
+[x
])%
(s\
\[x
])]
;g 
s=[
]%s
!!0

or in a readable form:

import Data.List
a%[]=[[]]
a%s=[x:j|x<-s,not$isInfixOf[last a,x]a,j<-(a++[x])%(s\\[x])]
g s=[]%s!!0

Haskell is a very strict language. for example, the import must come first; the definitions of s must come together; all the types must agree and there is no way to cast between them, and so forth. this leads to having a non-fatal error almost impossible. in fact, having a runtime fatal error is too almost near impossible.

note that if g is a valid function but has an the wrong type (any type different then [a]->[a] or String -> String and the like) than this is a fatal error because it's impossible to apply g to the inputs.

outputs:

Abracadabar Alaazcma
Is Miss. iMsiasusgsa sMniig?s
Ask Alasak's lAaankss
GGAGTGCAATACTTCCggagtaacttcc
A Man AP la nAC  aalnnaPama

proud haskeller

Posted 2014-11-09T15:22:52.803

Reputation: 5 866