I just want to do some benchmarking of different programming languages. So here is a tip: You can use as many characters as you want. In the end there is only the time your program needed.

So, here is the Task:

Task

After searching for a good task that would require a lot of computation, I came up with the problem Is a word in a given Type-1 grammar?

For people not knowing what this is: Formal Grammar Chomsky-Hierarchy

Maybe there is a better algorithm but I came only up with brute forcing.

So, write a code that brute forces the Grammars as fast as possible.

Rules

Parallel execution is allowed
Delete your cache after every tested word
Add your complete compiler command for optimized result
You can use any algorithm and language you want, as long as the job gets done.
Add to your answer what files are read/how to set the files that should be read.

I Guarantee

Input is always well formatted
Variables are always capital letters, Terminals always small ones
Words only consist of Terminals
No empty Word

Input

For equal footing we need to define what the input looks like. The input is a file. I write multiple grammars and words in one file. The structure can be described with following Pseudocode:

foreach(Grammar G) do
    write('{');
    foreach (Variable V in G) do
        write(T + ',');
    od
    deleteLast();
    write('}');
    linebreak();
    write('{');
    foreach (Terminal T in G) do
        write(V + ',');
    od
    deleteLast();
    write('}');
    linebreak();
    write('{');
    foreach(Production P in G) do
        write(P + ',');
    od
    write('}');
    linebreak;
    write(StartSymbol(G));
    linebreak;
    foreach(Word W for G) do
        write(W + ',');
    od
    deleteLast();
    linebreak;
od
deleteLastLinebreak();

An Example

Output

The output has to be put in a file as well. The formatting of the output does not matter. It should be recognizable which result belongs to which word.

An Example

Example

Here is an example written in C# by myself.

Main Class

Data Classes

Test System

The system the code will run on is following:

Arch Linux
RunTime you need (if available; Please add in answer)
Intel i7-4700MQ 3.4GHz
16GB RAM

How do I test

After the code is compiled I'll run a bash script that runs your program a certain amount of times and calculates the average and median run time.

Good Luck and may the best code win.

~~My Benchmark: Not completed after 8-9 hours of runtime.~~

My Benchmark (after changing some words):

mean: 39.5157
median: 39.448687194

PS: Here is the originally planned input, which took ages with my program. Grammars are slightly changed in the new input and words are way shorter.

fastest-code

Armin

Posted 2014-07-27T01:02:51.417

Reputation: 181

1Will you test submissions on fixed or random input? In the former case, minimum runtime is usually the better/fairer metric than average. – Martin Ender – 2014-07-27T09:55:33.097

@MartinBüttner fixed, but I want to see what the average runtime is, because there is also other stuff running (OS for example). A minimum runtime test would be good if I'd have a dedicated machine for number crunching. Since I don't have one, the programs need to share the processor with other stuff, that varies in its needs... – Armin – 2014-07-27T10:00:44.303

2That's exactly why I'd prefer minimum runtime. Any deviation from the minimum runtime is due to other stuff running. There is no variation in the amount of processor instructions that are carried out by each program, so any fluctuations in runtimes is due to other programs or the OS doing stuff in between. Hence, the closest you get to the true performance of the code is the minimum runtime. I don't think averaging is any fairer, because it only penalises programs where you happen to get an unrelated CPU spike during one of the runs. – Martin Ender – 2014-07-27T10:11:19.167

@MartinBüttner following problem: my C# code (sequential) is currently running. I've started 8 hours ago. You can't have a run without spikes from other programs. And there is no way for controlling how often they happen. So, if I take the minimum runtime I could have run a program which was only interrupted 10-200 times per run and another was 50-100 times interrupted. The average might balance this deviation. – Armin – 2014-07-27T10:17:17.627

Another option would be the median to mitigate the effects of outliers. – Martin Ender – 2014-07-27T10:48:10.553

@MartinBüttner I'll look at both... – Armin – 2014-07-27T11:02:38.417

But how do you decide a winner from both numbers? – Martin Ender – 2014-07-27T11:06:32.837

@MartinBüttner one for each.... – Armin – 2014-07-27T11:07:00.803

'Number' Crunching Benchmark