Fastest Sort in BrainF***

After having implemented QuickSort in BrainF***, I realized it probably wasn't that quick. Operations that are O(1) in normal languages (like array indexing) are significantly longer in BF. Most of the rules for what makes an efficient sort can be thrown out the window when you are coding in a Turing tarpit.

So here is a challenge to implement the "Fastest BrainF*** Sort Routine Ever". I will time all entries using the interpreter below. The intepreter uses a 16K tape of unsigned characters. Both the tape and the cells wrap when advanced/incremented past the limits. Reading the EOF puts a 0 in the current cell. The measured time includes both the time to parse the source file and the time to process all input files. Fastest code wins.

The test vector will be a set of Ascii files designed to test sorting edge cases, including

An already sorted list: "ordered"

&#33;"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~

A reverse sorted list: "reverse"

~}|{zyxwvutsrqponmlkjihgfedcba`_^]\[ZYXWVUTSRQPONMLKJIHGFEDCBA@?>=<;:9876543210/.-,+*)('&%$#"!

A file consisting of many copies of a few unique values: "onlynine"

ibbkninbkrauickabcufrfckbfikfbbakninfaafafbikuccbariauaibiraacbfkfnbbibknkbfankbbunfruarrnrrrbrniaanfbruiicbuiniakuuiubbknanncbuanbcbcfifuiffbcbckikkfcufkkbbakankffikkkbnfnbncbacbfnaauurfrncuckkrfnufkribnfbcfbkbcrkriukncfrcnuirccbbcuaaifiannarcrnfrbarbiuk

A completely random ascii file: "random"

'fQ`0R0gssT)70O>tP[2{9' 0.HMyTjW7-!SyJQ3]gsccR'UDrnOEK~ca 'KnqrgA3i4dRR8g.'JbjR;D67sVOPllHe,&VG"HDY_'Wi"ra?n.5nWrQ6Mac;&}~T_AepeUk{:Fwl%0`FI8#h]J/Cty-;qluRwk|S U$^|mI|D0\^- csLp~`VM;cPgIT\m\(jOdRQu#a,aGI?TeyY^*"][E-/S"KdWEQ,P<)$:e[_.`V0:fpI zL"GMhao$C4?*x

A random file over the range 1..255: "wholerange"

öè—@œ™S±ü¼ÓuÇ¯ŠfÎ€n‚ZÊ,ˆÖÄCítÚDý^öhfF†¬I÷xxÖ÷GààuÈ©ÈÑdàu.y×€ôã…ìcÑ–:*‰˜IP¥©9Ä¢¬]Š\3*\®ªZP!YFõ®ÊÖžáîÓ¹PŸ—wNì/S=Ìœ'g°Ì²¬½ÕQ¹ÀpbWÓ³
»y  »ïløó„9k–ƒ~ÕfnšÂt|Srvì^%ÛÀâû¯WWDs‰sç2e£+PÆ@½ã”^$f˜¦Kí•òâ¨÷ žøÇÖ¼$NƒRMÉE‹G´QO¨©l¬k¦Ó

Each input file has at most 255 bytes.

Here is the interpreter. It's written for console-mode Windows, but should be easy to port: just replace read_time() and sysTime_to_ms() with platform-specific equivalents.
Usage: bftime program.bf infile1 [infile2 ...]

#include <windows.h>
#include <stdio.h>

#define MS_PER_SEC  1000.0f
#define MAXSIZE  (0x4000)
#define MAXMASK  (MAXSIZE-1)

typedef  __int64 sysTime_t;
typedef unsigned char Uint8;
typedef unsigned short Uint16;

typedef struct instruction_t {
   Uint8 inst;
   Uint16 pair;
} Instruction;

Instruction prog[MAXSIZE] = {0};
Uint8 data[MAXSIZE] = {0};
const Uint8 FEND = EOF;

sysTime_t read_time() {
    __int64 counts;
    QueryPerformanceCounter((LARGE_INTEGER*)&counts);
    return counts;
}

float sysTime_to_ms(sysTime_t timeIn) {
    __int64 countsPerSec;
    QueryPerformanceFrequency((LARGE_INTEGER*)&countsPerSec);
    return (float)timeIn * MS_PER_SEC / (float)countsPerSec;
}

int main(int argc, char* argv[])
{
   FILE* fp;
   Uint8 c;
   Uint16 i = 0;
   Uint16 stack = 0;
   sysTime_t start_time;
   sysTime_t elapsed=0,delta;

   if (argc<3) exit(printf("Error: Not Enough Arguments\n"));
   fp = fopen(argv[1],"r");
   if (!fp) exit(printf("Error: Can't Open program File %s\n",argv[1]));

   start_time=read_time();
   while (FEND != (c = fgetc(fp)) && i <MAXSIZE) {
      switch (c)  {
      case '+': case '-': case ',': case '.': case '>': case '<':
         prog[++i].inst = c;
         break;
      case '[': 
         prog[++i].inst = c;
         prog[i].pair=stack;
         stack = i;
         break;
      case ']': 
         if (!stack) exit(printf("Unbalanced ']' at %d\n",i));
         prog[++i].inst = c;
         prog[i].pair=stack;
         stack = prog[stack].pair;
         prog[prog[i].pair].pair=i;
         break;
      }
   }
   if (stack) exit(printf("Unbalanced '[' at %d\n",stack));
   elapsed = delta = read_time()-start_time;
   printf("Parse Time: %f ms\n", sysTime_to_ms(delta));

   for (stack=2;stack<argc;stack++) {
      Instruction *ip = prog;
      fp = fopen(argv[stack],"r");
      if (!fp) exit(printf("Can't Open input File %s\n",argv[stack]));
      printf("Processing %s:\n", argv[stack]);
      memset(data,i=0,sizeof(data));

      start_time=read_time();
      //Run the program
      while (delta) {
         switch ((++ip)->inst) {
         case '+': data[i]++; break;
         case '-': data[i]--; break;
         case ',': c=getc(fp);data[i]=(FEND==c)?0:c; break;
         case '.': putchar(data[i]);  break;
         case '>': i=(i+1)&MAXMASK;   break;
         case '<': i=(i-1)&MAXMASK;   break;
         case '[': if (!data[i]) ip = prog+ip->pair; break;
         case ']': if (data[i])  ip = prog+ip->pair;  break;
         case 0: delta=0; break;
         }
      }
      delta = read_time()-start_time;
      elapsed+=delta;
      printf("\nProcessing Time: %f ms\n", sysTime_to_ms(delta));
   }
   printf("\nTotal Time for %d files: %f ms\n", argc-2, sysTime_to_ms(elapsed));
}

Results So Far

Here's the average time of 5 runs of the complete set of vectors:

 Author    Program      Average Time    Best Set          Worst Set
 AShelly   Quicksort    3224.4 ms       reverse (158.6)   onlynine (1622.4) 
 K.Randall Counting     3162.9 ms       reverse (320.6)   onlynine  (920.1)
 AShelly   Coinsort      517.6 ms       reverse  (54.0)   onlynine  (178.5) 
 K.Randall CountingV2    267.8 ms       reverse  (41.6)   random     (70.5)
 AShelly   Strandsort    242.3 ms       reverse  (35.2)   random     (81.0)

AShelly

Posted 2012-01-25T02:28:26.590

Reputation: 4 281

What is the range of the input elements? – Keith Randall – 2012-01-25T03:17:30.140

It is the range of the cells, except 0: 1-255. – AShelly – 2012-01-25T03:35:31.650

you should retime mine, I made it quite a bit faster. – Keith Randall – 2012-01-26T16:44:29.710

It does appear to be more than 2x faster than my most recent - I'll do the official timing when I'm back on the machine I used for the others. – AShelly – 2012-01-26T18:59:11.930

Answers

Here's a sort that is at least 6x faster than my quicksort. It's an algorithm that would make little sense in a traditional language, since it's O(N*m) where m is the maximum input value. After collecting the input, it passes through the array, counting cells > 0 and then decrementing each one. It then adds 1 to the first count cells in the result vector. It repeats the passes until the count is 0.
BF:

Get Input
>,[>>+>,]   
Count values GT 0 and decrement each
<[<[<<<+>>>-]<[-<<+>>>]>[<]<<]
While count: add 1 to results
<[[[<<+>>-]<+<-]
Seek back to end of input
>[>>]>>>[>>>]
Repeat counting step
<<<[<[<<<+>>>-]<[-<<+>>>]>[<]<<]<]
Seek to far end of results and print in reverse order 
<[<<]>>[.>>]

C equivalent algorithm:

 uchar A[MAX]={0}; uchar R[MAX]={0}; int count,i,n=0;
 while (A[n++]=getchar()) ;
 do { 
   count = 0;
   for (i=0; i<n; i++) count += (A[i]) ? (A[i]-->0) : 0;
   for (i=0; i<count; i++) R[i]++; 
 } while (count>0);
 for (i=0; R[i]; i++) ;
 for (i--; i>=0; i--) putchar(R[i]);

Here's one that's 2x as fast. It's based loosely on "spaghetti sort": it lays down a string of 1s as long as each input. The value in each cell represents the number of strands at least that long. (So [3,2,1,2] becomes |4|0|3|0|1|0|0|). Then it starts 'measuring' the strands and prints out the length every time it finds the end of one.

>,[ [-[>>+<<-]>+>] <[<<]>,]   build strand of 1s for each input
+>[>+<-]>[                    while there are strands
  >[>+<<->-]                  do any strands end here?
  <[<<.>>-]                   print length of all that do  
  <<[>>+<<-]>>+>>]            shift right 1; inc length

Raw:

>,[[-[>>+<<-]>+>]<[<<]>,]+>[>+<-]>[>[>+<<->-]<[<<.>>-]<<[>>+<<-]>>+>>]

AShelly

Posted 2012-01-25T02:28:26.590

Reputation: 4 281

Don't knock counting sort! It's my favorite sort, due to a massive win I got from it one time: if m is known to be small, you can get huge speedups over otherwise "fast" algorithms. Similarly, bubble sort beats quicksort on mostly-sorted data. No one ___ algorithm is the best for every context. – boothby – 2012-01-26T06:13:34.720

I don't think this is exactly a counting sort. Your comment forced me to do some more research. I think this is more like a bead sort. But I'm not even sure that's right.

– AShelly – 2012-01-26T06:57:34.927

No, you're right. This is a strange sort. Could be useful for some application involving lists of linked lists... but I'm dubious of even that. – boothby – 2012-01-26T07:41:57.930

4The physical analogy is that you have N stacks of coins of different sizes. Set aside space for another N stacks. You take one coin off the top of each stack that has coins, and then add 1 to each stack in the new set from right to left until your hand is empty. Repeat until all original stacks are empty. Now the new set is sorted ascending left to right. – AShelly – 2012-01-26T14:40:55.717

>>+>,[->+>,]<[<[<<]<[.<[<<]<]>>[+>->]<<]

I don't remember whose idea this algorithm was. Maybe Bertram Felgenhauer's? It came from discussions around Brainfuck Golf contest #2, over a decade ago.

This is the fastest one yet on the sample inputs.

It's also not limited to inputs of length <256, but can handle arbitrarily long inputs.

Both these things were also true of Albert's answers, below. The nice thing about this one is that the running time is O(N) in the input length. Yes, this thing actually runs in linear time. It already ate a constant factor of 255 as a snack.

Daniel Cristofani

Posted 2012-01-25T02:28:26.590

Reputation: 947

A simple counting sort implementation. Each bucket is 3 cells wide, containing the current input, a marker, and the count of the number of times the counter appears in the input.

process input
,[

while input is not zero
[

decrement input
-

copy input over to next bucket
[->>>+<<<]

mark next bucket as not the first
>>>>+<

repeat until input is zero
]

increment count for this bucket
>>+

rewind using markers
<[-<<<]<

process next input
,]

generate output
>+[>[<-.+>-]<[->>>+<<<]>>>+]

without comments:

,[[-[->>>+<<<]>>>>+<]>>+<[-<<<]<,]>+[>[<-.+>-]<[->>>+<<<]>>>+]

Keith Randall

Posted 2012-01-25T02:28:26.590

Reputation: 19 865

>,[>+>,]+[<[<-<]>>[<[>>]>[<->[>>]<.<[<<]]>>]<<<<<+]

Albert

Posted 2012-01-25T02:28:26.590

Reputation: 81

>>+>,[>+>,]<[[<-<]>+>+[>]>[[-<<[[>>+<<-]<]>>]>-.+[>]>]<<]

Albert

Posted 2012-01-25T02:28:26.590

Reputation: 81