17
6
Leaderboard - JIT Compiled (Lower is better)
- es1024 - 81.2 points (including a working compiler!)
- Kieth Randall - 116 points
- Ell - 121 points
Leaderboard - Interpreted (Lower is better)
- Martin Büttner - 706654 points (somewhere around 2 hours).
- criptych - 30379 points (97 seconds)
Your mission, should you choose to accept it, is to write the smallest possible bytecode interpreter/VM. The VM/interpreter uses a small CISC architecture (operations can vary in size), with the language specified below. Upon completion, you must print the value of the 3 CPU registers to prove that the correct output is printed (3,126,900,366).
Compiler
If you'd like to make your own tests, a compiler is posted below. Feel free to post your tests with your answer.
window.compile=function(){var e=$("#text").val().toLowerCase().match(/[^\r\n]+/g);var t=[];for(var n=0;n<e.length;n++)compileLine(e[n],t);var r="";for(var n=0;n<t.length;n++)if(typeof t[n]=="string")r+="\n";else r+="0x"+t[n].toString(16)+" ";$("#compiledArray").val(r)};window.compileLine=function(e,t){var n=e.split(" ");if(n[0]=="load"){t.push(0);t.push(getInt(n[1]));t.pushArr(getInt(n[2]))}if(n[0]=="rload"){t.push(1);t.push(getInt(n[1]));t.push(getInt(n[1]))}if(n[0]=="push"){t.push(2);t.push(getInt(n[1]))}if(n[0]=="pop"){t.push(3);t.push(getInt(n[1]))}if(n[0]=="add"){t.push(4);t.push(getInt(n[1]));t.push(getInt(n[2]))}if(n[0]=="sub"){t.push(5);t.push(getInt(n[1]));t.push(getInt(n[2]))}if(n[0]=="mul"){t.push(6);t.push(getInt(n[1]));t.push(getInt(n[2]))}if(n[0]=="div"){t.push(7);t.push(getInt(n[1]));t.push(getInt(n[2]))}if(n[0]=="jmp"){t.push(8);t.pushArr(getInt(n[1]))}if(n[0]=="cmp"){t.push(9);t.push(getInt(n[1]));t.push(getInt(n[2]))}if(n[0]=="branchlt"){t.push(10);t.pushArr(getInt(n[1]))}if(n[0]=="brancheq"){t.push(11);t.pushArr(getInt(n[1]))}if(n[0]=="branchgt"){t.push(12);t.pushArr(getInt(n[1]))}if(n[0]=="branchne"){t.push(13);t.pushArr(getInt(n[1]))}t.push("NEW LINE")};window.getInt=function(e){if(e.trim().startsWith("<--"))return"COMMENT";if(e=="r0")return 0;if(e=="r1")return 1;if(e=="r2")return 2;if(e.startsWith("0x"))return parseInt(e,16);if(isNaN(parseInt(e)))alert(e);return getIntBytes(parseInt(e))};if(typeof String.prototype.startsWith!="function"){String.prototype.startsWith=function(e){return this.slice(0,e.length)==e}}Array.prototype.pushArr=function(e){this.push.apply(this,e)};window.getIntBytes=function(e){var t=[];var n=4;do{t[--n]=e&255;e=e>>8}while(n);return t}
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.0/jquery.min.js"></script>
<textarea id="text" cols="40" rows="10"></textarea>
<br/>
<button onclick="compile()">Compile</button>
<br/>
<textarea id="compiledArray" cols="40" rows="10" readonly></textarea>
"VM" Specifications
The VM has 3 32 bit unsigned integral registers: R0, R1, R2. They are represented in hex as 0x00, 0x01, and 0x02.
The following operations must be supported:
The format is [name] [...operands...], [hexadecimal op-code] [...operands repeated...]
- LOAD [register] [4 byte value], 0x00 [register] [4 byte value]
- PUSH [register], 0x02 [register]
- POP [register], 0x03 [register]
- ADD [register, 1 byte] [register, 1 byte], 0x04 [register] [register]
- SUB [register, 1 byte] [register, 1 byte], 0x05 [register] [register]
- MUL [register, 1 byte] [register, 1 byte], 0x06 [register] [register]
- DIV [register, 1 byte] [register, 1 byte], 0x07 [register] [register]
- JMP [code line, 4 bytes], 0x08 [4 byte code line number]
- CMP [register, 1 byte] [register, 1 byte], 0x09 [register] [register]
- BRANCHLT [code line, 4 bytes], 0x0a [4 byte code line number]
Some notes:
- The above math operations add the values of 2 registers together, placing the output in the first register.
- CMP, the comparison operator, should compare the values of 2 registers and store the output in some internal flag (this can be implementation specific) for future use on branch instructions.
- If BRANCH is called before CMP, unless BRANCHEQ is called, the "VM" should not branch.
- PUSH/POP unsurprisingly push or pop numbers from the stack.
- Jump and Branch operators jump to a specific operation (line of code), not a binary address.
- Branch operations do not do the comparison. Rather, they take the output from the last comparison to execute.
- Branch and Jump operators use a zero based line number indexing system. (E.g. JMP 0 jumps to the first line)
- All operations are to be performed on unsigned numbers which overflow to zero and do not throw an exception on an integer overflow.
- Division by zero is not allowed and as such, the behavior of the program is not defined. You can (for example)...
- Crash the program.
- End execution of the VM and return it's current state.
- Show a "ERR: Division by 0" message.
- Termination of the program is defined as when the instruction pointer reaches the end of the program (a non empty program can be assumed).
Output The output must be exactly this (newlines included)
R0 3126900366
R1 0
R2 10000
Points
Points are calculated based on the following formula: Number Of Characters * (Seconds Needed To Run / 2)
To avoid hardware differences causing different times, each test will be run on my computer (i5-4210u, 8GB ram) in either ubuntu server or Windows 8, so try not to use some insane-exotic-runtime that only compiles on a Dual G5 Mac Pro with exactly 762.66 mb of free RAM.
If you are using a specialized runtime/language, please post a link to it.
- For interested parties, I have posted the testing code (written in C#) here: http://pastebin.com/WYCG5Uqu
Test Program
The idea came from here, so we'll use a somewhat modified version of their program.
The correct output for the program is: 3,126,900,366
In C:
int s, i, j;
for (s = 0, i = 0; i < 10000; i++) {
for (j = 0; j < 10000; j++)
s += (i * j) / 3;
}
In code: [R0 is representative of s, R1 of j, R2 of i]
LOAD R0 0
LOAD R2 0 <--outer loop value
LOAD R1 0 <--inner loop value
--Begin inner loop--
PUSH R1 <--push inner loop value to the stack
MUL R1 R2 <--(i*j)
PUSH R2
LOAD R2 3
DIV R1 R2 <-- / 3
POP R2
ADD R0 R1 <-- s+=
POP R1
PUSH R2
LOAD R2 1
ADD R1 R2 <--j++
POP R2
PUSH R2
LOAD R2 10000
CMP R1 R2 <-- j < 10000
POP R2
BRANCHLT 3 <--Go back to beginning inner loop
--Drop To outer loop--
LOAD R1 1
ADD R2 R1 <--i++
LOAD R1 10000
CMP R2 R1 <-- i < 10000
LOAD R1 0 <--Reset inner loop
BRANCHLT 2
In binary/hex:
0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x02 0x00 0x00 0x00 0x00
0x00 0x01 0x00 0x00 0x00 0x00
0x02 0x01
0x06 0x01 0x02
0x02 0x02
0x00 0x02 0x00 0x00 0x00 0x03
0x07 0x01 0x02
0x03 0x02
0x04 0x00 0x01
0x03 0x01
0x02 0x02
0x00 0x02 0x00 0x00 0x00 0x01
0x04 0x01 0x02
0x03 0x02
0x02 0x02
0x00 0x02 0x00 0x00 0x27 0x10
0x09 0x01 0x02
0x03 0x02
0x0a 0x00 0x00 0x00 0x03
0x00 0x01 0x00 0x00 0x00 0x01
0x04 0x02 0x01
0x00 0x01 0x00 0x00 0x27 0x10
0x09 0x02 0x01
0x00 0x01 0x00 0x00 0x00 0x00
0x0a 0x00 0x00 0x00 0x02
Bonus Points (Effects are applied multiplicatively) E.g. if you qualify for all three, it would be ((characters * 0.50) * 0.75) * 0.90
- 50% decrease if the interpreter is actually a JIT compiler
- 25% decrease if it applies any sort of loop unrolling/meaningful optimization.
- 10% decrease if you extend the VM with
- BRANCHEQ [code line, 4 bytes] (Branch if equal - opcode 0x0b)
- BRANCHGT [code line, 4 bytes] (Branch if greater than - opcode 0x0c)
- BRANCHNE [code line, 4 bytes] (Branch if not equal - opcode 0x0d)
- RLOAD [register 1] [register 2] (move the value of register 2 to register 1 - opcode 0x01).
Disallowed
- Precompiling the test case into the program is prohibited. You must either accept the bytecode from STDIN or from a file (Doesn't matter which).
- Returning the output without running the program.
- Any other way you can think of to cheat the VM requirement.
Why not include a few more test programs to discourage the things you said are disallowed? If it's a VM, it should be able to run anything written in the bytecode specification, right? – Kasran – 2014-11-23T04:23:15.807
I will try to do that tonight. I am writing the compiler right now. – Colorfully Monochrome – 2014-11-23T04:31:14.523
The compiler is here http://jsfiddle.net/aL9y19bd/23/
– Colorfully Monochrome – 2014-11-23T05:35:46.9031Does
CMP
check for less than or equality? And what happens to its result? – es1024 – 2014-11-23T05:38:13.3971
MUL
andDIV
are also underspecified. Should they be signed or unsigned? What happens on multiplication overflow? – feersum – 2014-11-23T05:44:45.070As for the
BRANCH**
opcodes, what is compared (value from a register, value from stack, something else)? – es1024 – 2014-11-23T05:54:32.063Actually that's called multiplicative stacking for the bonuses, not linear (or additive) – Claudiu – 2014-11-23T06:18:25.130
@Claudiu There - fixed. Thanks for pointing that out; I thought linear wasn't the right word. – Colorfully Monochrome – 2014-11-23T06:20:25.610
@es1024 CMP compares the value of 2 registers and stores the result internally so it can do BRANCH** operations. BRANCH** opcodes do not actually do a comparison. They use the result of the previous
CMP
operation to decide whether to jump or not. - I somewhat modeled this after x86 assembly. – Colorfully Monochrome – 2014-11-23T06:20:49.030@feersum Math is unsigned and on an integer overflow, overflow to zero rather than crashing. – Colorfully Monochrome – 2014-11-23T06:21:12.547
A few notes: (a) The second and third lines in the binary test program are swapped. (b) The final value of
R0
is3126900366
(which is-1168066930
as a signed value.) – Ell – 2014-11-24T00:03:10.160@Ell I just spent an hour or two on finding (a). Now my short but slow solution will look really bad next to yours. :D There's also 3 redundant operations in the bytecode. There's
POP R2 PUSH R2
and theLOAD R1 0
at the end. – Martin Ender – 2014-11-24T00:04:54.773What would be considered as a meaningful optimization? – es1024 – 2014-11-24T02:25:30.493
@es1024 Sorry. That was also a bit vague. A "meaningful optimization" is defined as any change from the code provided to the code executed that measurably improves performance. For example, detecting that 2 of the operations, in a row, are "POP R2" and "PUSH R2" and removing them because they do not change the behavior of the program would be an optimization. So, optimization could be dead code detection, loop unrolling, etc. – Colorfully Monochrome – 2014-11-24T03:01:04.417
@Ell - Thanks, I hadn't realized. I have updated the post to reflect this. – Colorfully Monochrome – 2014-11-24T03:01:27.197
@MartinBüttner That was on purpose. Have to provide some room for optimization...and your answer is still awesome. I don't think I could write a hello world program in less than 250 characters, much less an entire VM. – Colorfully Monochrome – 2014-11-24T03:02:37.207