31
10
Background
I like my old 8-bit 6502 chip. It's even fun to solve some of the challenges here on PPCG in 6502 machine code. But some things that should be simple (like, read in data or output to stdout) are unnecessarily cumbersome to do in machine code. So there's a rough idea in my mind: Invent my own 8-bit virtual machine that's inspired by the 6502, but with the design modified to be more usable for challenges. Starting to implement something, I realized this might be a nice challenge itself if the design of the VM is reduced to a bare minimum :)
Task
Implement an 8-bit virtual machine conforming to the following specification. This is code-golf, so the implementation with the fewest bytes wins.
Input
Your implementation should take the following inputs:
A single unsigned byte
pc
, this is the initial program counter (the address in memory where the VM starts execution,0
-based)A list of bytes with a maximum length of
256
entries, this is the RAM for the virtual machine (with its initial contents)
You may take this input in any sensible format.
Output
A list of bytes which are the final contents of the RAM after the VM terminates (see below). You can assume you only get input that leads to terminating eventually. Any sensible format is allowed.
Virtual CPU
The virtual CPU has
- an 8-bit program counter,
- an 8-bit accumulator register called
A
and - an 8-bit index register called
X
.
There are three status flags:
Z
- the zero flag is set after some operation results in0
N
- the negative flag is set after some operation results in a negative number (iow bit 7 of the result is set)C
- the carry flag is set by additions and shifts for the "missing" bit of the result
Upon start, the flags are all cleared, the program counter is set to a given value and the contents of A
and X
are indeterminate.
The 8-bit values represent either
- an unsigned integer in the range
[0..255]
- a signed integer, 2's complement, in the range
[-128..127]
depending on context. If an operation over- or underflows, the value wraps around (and in case of an addition, the carry flag is affected).
Termination
The virtual machine terminates when
- A
HLT
instruction is reached - A non-existing memory address is accessed
- The program counter runs outside the memory (note it doesn't wrap around even if the VM is given the full 256 bytes of memory)
Adressing modes
- implicit -- the instruction has no argument, the operand is implied
- immediate -- the operand is the byte directly after the instruction
- relative -- (for branching only) the byte after the instruction is signed (2's complement) and determines the offset to add to the program counter if the branch is taken.
0
is the location of the following instruction - absolute -- the byte after the instruction is the address of the operand
- indexed -- the byte after the instruction plus
X
(the register) is the address of the operand
Instructions
Each instruction consists of an opcode (one byte) and, in the addressing modes immediate, relative, absolute and indexed a second argument byte. When the virtual CPU executes an instruction, it increments the program counter accordingly (by 1
or 2
).
All opcodes shown here are in hex.
LDA
-- load operand intoA
- Opcodes: immediate:
00
, absolute:02
, indexed:04
- Flags:
Z
,N
- Opcodes: immediate:
STA
-- storeA
into operand- Opcodes: immediate:
08
, absolute:0a
, indexed:0c
- Opcodes: immediate:
LDX
-- load operand intoX
- Opcodes: immediate:
10
, absolute:12
, indexed:14
- Flags:
Z
,N
- Opcodes: immediate:
STX
-- storeX
into operand- Opcodes: immediate:
18
, absolute:1a
, indexed:1c
- Opcodes: immediate:
AND
-- bitwise and ofA
and operand intoA
- Opcodes: immediate:
30
, absolute:32
, indexed:34
- Flags:
Z
,N
- Opcodes: immediate:
ORA
-- bitwise or ofA
and operand intoA
- Opcodes: immediate:
38
, absolute:3a
, indexed:3c
- Flags:
Z
,N
- Opcodes: immediate:
EOR
-- bitwise xor (exclusive or) ofA
and operand intoA
- Opcodes: immediate:
40
, absolute:42
, indexed:44
- Flags:
Z
,N
- Opcodes: immediate:
LSR
-- logical shift right, shift all bits of operand one place to the right, bit 0 goes to carry- Opcodes: immediate:
48
, absolute:4a
, indexed:4c
- Flags:
Z
,N
,C
- Opcodes: immediate:
ASL
-- arithmetic shift left, shift all bits of the operand one place to the left, bit 7 goes to carry- Opcodes: immediate:
50
, absolute:52
, indexed:54
- Flags:
Z
,N
,C
- Opcodes: immediate:
ROR
-- rotate right, shift all bits of operand one place to the right, carry goes to bit 7, bit 0 goes to carry- Opcodes: immediate:
58
, absolute:5a
, indexed:5c
- Flags:
Z
,N
,C
- Opcodes: immediate:
ROL
-- rotate left, shift all bits of the operand one place to the left, carry goes to bit 0, bit 7 goes to carry- Opcodes: immediate:
60
, absolute:62
, indexed:64
- Flags:
Z
,N
,C
- Opcodes: immediate:
ADC
-- add with carry, operand plus carry is added toA
, carry is set on overflow- Opcodes: immediate:
68
, absolute:6a
, indexed:6c
- Flags:
Z
,N
,C
- Opcodes: immediate:
INC
-- increment operand by one- Opcodes: immediate:
78
, absolute:7a
, indexed:7c
- Flags:
Z
,N
- Opcodes: immediate:
DEC
-- decrement operand by one- Opcodes: immediate:
80
, absolute:82
, indexed:84
- Flags:
Z
,N
- Opcodes: immediate:
CMP
-- compareA
with operand by subtracting operand fromA
, forget result. Carry is cleared on underflow, set otherwise- Opcodes: immediate:
88
, absolute:8a
, indexed:8c
- Flags:
Z
,N
,C
- Opcodes: immediate:
CPX
-- compareX
-- same asCMP
forX
- Opcodes: immediate:
90
, absolute:92
, indexed:94
- Flags:
Z
,N
,C
- Opcodes: immediate:
HLT
-- terminate- Opcodes: implicit:
c0
- Opcodes: implicit:
INX
-- incrementX
by one- Opcodes: implicit:
c8
- Flags:
Z
,N
- Opcodes: implicit:
DEX
-- decrementX
by one- Opcodes: implicit:
c9
- Flags:
Z
,N
- Opcodes: implicit:
SEC
-- set carry flag- Opcodes: implicit:
d0
- Flags:
C
- Opcodes: implicit:
CLC
-- clear carry flag- Opcodes: implicit:
d1
- Flags:
C
- Opcodes: implicit:
BRA
-- branch always- Opcodes: relative:
f2
- Opcodes: relative:
BNE
-- branch ifZ
flag cleared- Opcodes: relative:
f4
- Opcodes: relative:
BEQ
-- branch ifZ
flag set- Opcodes: relative:
f6
- Opcodes: relative:
BPL
-- branch ifN
flag cleared- Opcodes: relative:
f8
- Opcodes: relative:
BMI
-- branch ifN
flag set- Opcodes: relative:
fa
- Opcodes: relative:
BCC
-- branch ifC
flag cleared- Opcodes: relative:
fc
- Opcodes: relative:
BCS
-- branch ifC
flag set- Opcodes: relative:
fe
- Opcodes: relative:
Opcodes
The behavior of the VM is undefined if any opcode is found that doesn't map to a valid instruction from the above list.
As per Jonathan Allan's request, you may choose your own set of opcodes instead of the opcodes shown in the Instructions section. If you do so, you must add a full mapping to the opcodes used above in your answer.
The mapping should be a hex file with pairs <official opcode> <your opcode>
, e.g. if you replaced two opcodes:
f4 f5
10 11
Newlines don't matter here.
Test cases (official opcodes)
// some increments and decrements
pc: 0
ram: 10 10 7a 01 c9 f4 fb
output: 10 20 7a 01 c9 f4 fb
// a 16bit addition
pc: 4
ram: e0 08 2a 02 02 00 6a 02 0a 00 02 01 6a 03 0a 01
output: 0a 0b 2a 02 02 00 6a 02 0a 00 02 01 6a 03 0a 01
// a 16bit multiplication
pc: 4
ram: 5e 01 28 00 10 10 4a 01 5a 00 fc 0d 02 02 d1 6a 21 0a 21 02 03 6a 22 0a 22 52
02 62 03 c9 f8 e6 c0 00 00
output: 00 00 00 00 10 10 4a 01 5a 00 fc 0d 02 02 d1 6a 21 0a 21 02 03 6a 22 0a 22 52
02 62 03 c9 f8 e6 c0 b0 36
I might add more testcases later.
Reference and testing
To help with own experiments, here's some (totally not golfed) reference implementation -- it can output tracing information (including disassembled instructions) to stderr
and convert opcodes while running.
Recommended way to get the source:
git clone https://github.com/zirias/gvm --branch challenge --single-branch --recurse-submodules
Or checkout branch challenge
and do a git submodule update --init --recursive
after cloning, to get my custom build system.
Build the tool with GNU make (just type make
, or gmake
if on your system, the default make isn't GNU make).
Usage: gvm [-s startpc] [-h] [-t] [-c convfile] [-d] [-x] <initial_ram
-s startpc
-- the initial program counter, defaults to0
-h
-- input is in hex (otherwise binary)-t
-- trace execution tostderr
-c convfile
-- convert opcodes according to a mapping given inconvfile
-d
-- dump resulting memory as binary data-x
-- dump resulting memory as hexinitial_ram
-- the initial RAM contents, either hex or binary
Note the conversion feature will fail on programs that modify opcodes while running.
Disclaimer: The rules and specs above are authorative for the challenge, not this tool. This especially applies to the opcode conversion feature. If you think the tool presented here has a bug wrt the specs, please report in a comment :)
1I'd imagine that there would probably be numerous golfing opportunities to be had by choosing different opcodes for the instructions but it seems the opcodes have been fixed (even though the instruction set is what defines the machine). Maybe it's worth considering allowing implementations to have their own code-page? – Jonathan Allan – 2018-08-23T13:42:35.167
1@JonathanAllan thought twice about it, I'm allowing it now and might add a "conversion" tool to make solutions using other sets of opcodes easily testable. – Felix Palmen – 2018-08-23T14:33:28.613
Related. – Peter Taylor – 2018-08-23T15:37:33.023
@PeterTaylor only very loosely -- I've seen this one ;) my instruction set is inspired by the 6502, but the architecture here is greatly simplified, so golfed solutions are (hopefully) realistic ;) – Felix Palmen – 2018-08-23T15:42:55.287
Adding the link in comments causes both questions to show up in the other's "Linked" sidebar, so in particular people who find the other question by some means will be slightly more likely to then find this one and answer it. – Peter Taylor – 2018-08-23T15:50:37.877
@PeterTaylor I didn't oppose mentioning it ;) just wanted to emphasize the relation is "weak" ... it's there because the instruction set here is inspired by the famous 6502 :) – Felix Palmen – 2018-08-23T19:24:52.353
What's the expected behavior of
STA
,STX
,DEC
, etc. with immediate addressing? Self-modifying code? – Arnauld – 2018-08-24T08:21:55.280@Arnauld yes, straight forward -- they modify their operand, so if the operand is immediate, that byte is modified :) (this would allow for very strange code, but with the liberty to choose your own opcodes, I can't use it in testcases ....) – Felix Palmen – 2018-08-24T08:25:36.257
1@Arnauld btw my reasoning for allowing this was to reduce the amount of special cases, so it should be better "golfable" -- each opcode is either implicit, a relative branch or allows all three other addressing modes :) – Felix Palmen – 2018-08-24T08:41:45.070
1if
BRA
("branch always") doesn't introduce a branch in the control flow, shouldn't it be calledJMP
? – ngn – 2018-08-25T10:57:56.5431@ngn well,
BRA
exists in later chip designs (the 6502 doesn't have such an instruction) like the 65C02 and the MC 68000.JMP
exists as well. The difference is thatBRA
uses relative addressing andJMP
uses absolute addressing. So, I just followed these designs -- indeed, it doesn't sound all that logical ;) – Felix Palmen – 2018-08-25T11:00:33.153Since
A
andX
are "indeterminate" at the start, does this mean we can assume the PC will not reach a command which reads either of them before it reaches one that has set them properly? – Erik the Outgolfer – 2018-08-27T09:44:17.403Also, does the length of the inputted RAM represent the size of the whole RAM, or should we somehow pad it to 256 bytes? – Erik the Outgolfer – 2018-08-27T09:45:59.893
@EriktheOutgolfer a) not for a program with a defined result (iow not for any testcase I might add), yes ;) b) the input is the RAM, so yes, with an input of size 64, 64 would be a non-existent address. – Felix Palmen – 2018-08-27T09:47:21.030
@FelixPalmen Alright, so I won't have to handle such edge-cases or pad the RAM, just making sure I've understood the challenge correctly, thanks. – Erik the Outgolfer – 2018-08-27T09:49:13.680
a) How can ADC, INC, INX affect the N flag? Is that for simplicity of the VM? b) Do the increments and decrements wrap or what else will happen on overflow/underflow? c) Why is there no SBC? – Titus – 2018-09-05T01:15:28.440
1@Titus a) what's special about those? as described,
N
is set whenever a result is "negative" (has bit 7 set). b) yes they do, there's not much else that would make sense, but maybe should be made explicit in the specs c) simplicity for the golfing challenge. subtraction can be done byeor #$ff, adc
anyways (with inverted meaning of the carry flag). – Felix Palmen – 2018-09-07T06:14:08.0171@FelixPalmen a) I think I´ve been working with 32 and 64 bit integers for too long. Memories of the 6510 and 8bit values are a bit blurred. c) how about a IVC (invert carry) instruction then? (j/k) – Titus – 2018-09-07T10:07:49.147
@Titus the funny thing is
sbc
on the 6502 indeed works with an inverted carry flag. I guess this is exactly because with 2's complement, it allows to reuse theadc
logic by inverting the operand :) – Felix Palmen – 2018-09-15T11:46:42.597Also, can the
C
flag be set when the operand over/underflows after anINC
,INX
,DEC
orDEX
? – None – 2018-09-21T12:04:56.033@Rogem no, these don't affect
C
. – Felix Palmen – 2018-09-22T12:29:05.383What does LSR IMM 8 mean? – l4m2 – 2018-10-20T08:49:11.877
@l4m2: If I understand you correctly, see Arnauld's comment and my following comments
– Felix Palmen – 2018-10-22T08:15:05.120