Self-Mutilating Program

16

2

Simply put, your goal is to create a complete program that modifies its own source code until every character of the source is different than what it started as.

Please include the beginning source as well as the ending source in your post, as well as a description. E.g. Describe what (else) your program does, the language you used, your strategy, etc.

Rules

  • Your program must halt sometime after the modification is complete.
  • It must actually modify its own, currently running source code (not necessarily the file you passed to the interpreter, it modifies its instructions), not print a new program or write a new file.
  • Standard loopholes are disallowed.
  • Shortest program wins.

  • If your language can modify its own file and execute a new compiler process, but cannot modify its own (currently running) source code, you may write such a program instead at a +20% bytes penalty, rounded up. Real self-modifying languages should have an advantage.

Edit: If your program halts with errors, please specify it as such (and maybe say what the errors are.)

mbomb007

Posted 2015-10-07T18:32:54.120

Reputation: 21 944

7Do I understand correctly that the program should modify its own source while it is running, in a way that potentially affects its behavior? This would rule out most non-esoteric languages. Or is it allowed to modify the source and launch a new interpreter/compiler process on it? – Zgarb – 2015-10-07T18:38:01.397

@Zgarb It must actually modify its own, currently running source code. Yes, that rules out most languages. – mbomb007 – 2015-10-07T18:39:04.677

8@mbomb007 That's bad. – mınxomaτ – 2015-10-07T18:41:04.900

1@mbomb007 It says nowhere in your challenge that it has to run the modified source code. – mınxomaτ – 2015-10-07T18:42:39.273

1Also, no, it doesn't make this challenge trivial; it'd still be well scoped. You ruled out too many languages with this. – mınxomaτ – 2015-10-07T18:44:17.283

Let us continue this discussion in chat.

– mbomb007 – 2015-10-07T18:45:05.653

I take it there is no requirement for the final source code to be the same length as the initial source code? – trichoplax – 2015-10-09T11:37:47.737

@trichoplax Correct. Only the area of the source code needs to be different. Any memory locations that are outside the initial area do not matter. – mbomb007 – 2015-10-09T18:05:46.810

@mbomb007 I was thinking of answers that reduce the source code rather than expanding it. If the source code is reduced to zero does that still count since the memory no longer occupied by source code will then be occupied by something different? – trichoplax – 2015-10-09T18:08:12.567

@trichoplax Ah, yes. I think that's reasonable. – mbomb007 – 2015-10-09T18:18:17.657

Must the end result have the same amount of characters as the original source? – ericw31415 – 2016-04-27T09:43:20.770

No. My program in Self-modifying BF is an example of one that doesn't.

– mbomb007 – 2016-04-27T14:08:43.860

Answers

19

///, 1 byte

/

The program finds a / (the start of a pattern-replacement group), and removes it in preparation to make the replacement. Then it reaches EOF, so it gives up and halts.

lirtosiast

Posted 2015-10-07T18:32:54.120

Reputation: 20 331

This is the earliest answer with 1 byte, so it's the winner. – mbomb007 – 2015-10-15T15:40:44.393

22

Labyrinth, 2 bytes

>@

The > rotates the source so that it becomes

@>

The instruction pointer is now in a dead end and turns around to hit the @ which terminates the program.

Of course, <@ would also work.

Martin Ender

Posted 2015-10-07T18:32:54.120

Reputation: 184 808

12

Python 2, 225 bytes

import sys
from ctypes import*
c=sys._getframe().f_code.co_code
i=c_int
p=POINTER
class S(Structure):_fields_=zip("r"*9+"v",(i,c_void_p,i,c_char_p,i,p(i),i,c_long,i,c_char*len(c)))
cast(id(c),p(S)).contents.v=`len([])`*len(c)

The ending source code is a string of "0"s whose length is equal to the number of bytes in the original compiled code object.

The code finds the running code object, sys._getframe().f_code.co_code, and creates a structure which represents Python string objects. It then gets the memory that the code actually takes and replaces it with "0"*len(c).

When run, the program exits with the following traceback:

XXX lineno: 7, opcode: 49
Traceback (most recent call last):
  File "main.py", line 7, in <module>
    cast(id(c),p(S)).contents.v=`1+1`*len(c)
SystemError: unknown opcode

This shows that the overwrite was successful and the program dies because 0 isn't a valid opcode.

I'm surprised that this is even possible in Python: frame objects are read-only, and I can't create new ones. The only complicated thing this does is change an immutable object (a string).

Blue

Posted 2015-10-07T18:32:54.120

Reputation: 26 661

Not sure if this quite meets the requirements that EVERY character must be different. The "1" in the original source code would still be a "1" in the mangled code... – Darrel Hoffman – 2015-10-09T13:05:00.610

Well actually, the "1" string in the code isn't actually part of the 'code', it's just a constant that's referred to in the bytecode. What I'm actually changing is the compiled Python virtual machine opcodes, not the constants or variables. So what I'm changing isn't the source code per se, just the compiled code. I could change the source code as stored but that wouldn't actually affect the code at runtime because it would have already been compiled. If you wanted, I could post this in a 'compiled Python 2.7 opcodes with constants', but that would be silly IMO. – Blue – 2015-10-09T16:31:01.430

And also, I can't look at the compiled code because by changing it to see inside, I'm actually changing the code, meaning I don't actually see the code. So really, I have no idea if the code really replaces every character, just that it changes most(?) of them – Blue – 2015-10-09T16:32:48.463

To get around the issue of the 1 not being changed in the compiled code, you could change the "1" to <backtick>1+1<backtick> for only 2 more bytes – Mego – 2016-04-28T01:59:28.390

Not that I see (compiled with 2.7.10). Unfortunately, the 1+1 from my suggestion gets turned into a 2 in the compiled version... The compiler is too smart for its own good! – Mego – 2016-04-28T05:53:52.820

I always forget you can compile using a library at runtime – Blue – 2016-04-28T06:00:01.417

Unfortunately, the 1+1 trick doesn't work, because the compiled code contains a 2, thanks to optimizing the addition. Another way I've found is using len([]), which will replace everything with a 0 (also an invalid opcode), with no 0s to be found in the compiled code.

– Mego – 2016-04-28T06:06:57.900
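
The constant-folding point in the comments above can be checked from inside the interpreter. This is a minimal sketch, assuming a current CPython 3 rather than the Python 2.7 used in the answer; `opnames` is a hypothetical helper, and exact opcode names vary between versions, so it only looks at name prefixes.

```python
import dis

def opnames(expr):
    """Return the opcode names CPython emits for a single expression."""
    code = compile(expr, "<sketch>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]

# 1+1 is folded at compile time, so no BINARY_* instruction survives.
folded = opnames("1+1")
# len([]) cannot be folded, so a CALL* instruction remains in the bytecode.
called = opnames("len([])")

print(folded)
print(called)
```

The folded expression leaves only a constant load, which is why the suggested `1+1` ended up as a literal 2 in the compiled code, while `len([])` is still evaluated at run time.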

You should post this to https://codegolf.stackexchange.com/questions/119346/a-program-that-forgets-itself (if it doesn't get closed as duplicate)

– pppery – 2017-08-05T21:50:49.953

11

evil, 1 byte

q

evil has several memory stores - one is the source code itself and one is the wheel which is a circular queue which is initialised to a single zero. q swaps the source code and the wheel, so it replaces the source with a null-byte. However, only lower-case letters are actual operators in evil, so that character is simply a no-op and the program terminates.

Martin Ender

Posted 2015-10-07T18:32:54.120

Reputation: 184 808

6

MSM, 8 bytes

'.qp.;.;

Transforms the source code to pqpqpqpq

MSM operates on a list of strings. Commands are taken from the left and they treat the right side as a stack. MSM always works on its own source.

Execution trace:

'.qp.;.;                       upon start the source is implicitly split into a
                               list of single char strings

' . q p . ; . ;                ' takes the next element and pushes it on the stack
    q p . ; . ; .              q is not a command so it's pushed
      p . ; . ; . q            same for p
        . ; . ; . q p          . concats the top and next to top element
          ; . ; . pq           ; duplicates the top element
            . ; . pq pq        concat
              ; . pqpq         dup
                . pqpq pqpq    concat
                  pqpqpqpq     MSM stops when there's only one element left      
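
The trace above can be replayed with a small simulator. This is a minimal Python sketch, assuming only the three commands used in this answer (`'`, `.` and `;`) and treating every other string as data; `run_msm` is a hypothetical helper, not part of any MSM implementation.

```python
from collections import deque

def run_msm(source):
    """Replay an MSM program that uses only the ', . and ; commands."""
    # The source is split into single-character strings; commands are taken
    # from the left end and the right end acts as the stack.
    cells = deque(source)
    while len(cells) > 1:
        cmd = cells.popleft()
        if cmd == "'":
            # Quote: move the next element to the stack without executing it.
            cells.append(cells.popleft())
        elif cmd == ".":
            # Concatenate the top element with the one below it.
            top = cells.pop()
            below = cells.pop()
            cells.append(top + below)
        elif cmd == ";":
            # Duplicate the top element.
            cells.append(cells[-1])
        else:
            # Anything else is not a command here, so it is pushed as data.
            cells.append(cmd)
    return "".join(cells)

print(run_msm("'.qp.;.;"))
```

The deque doubles as both the program and the stack, which mirrors how MSM works on its own source: by the time one element is left, none of the original characters is still in place.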

nimi

Posted 2015-10-07T18:32:54.120

Reputation: 34 639

6

Malbolge, 1 or 2 bytes.

D

The Malbolge language "encrypts" each instruction after executing it, so this letter (Malbolge NOP) will become an ! (which is also a nop), and then terminate. For some reason, the Malbolge interpreter I use requires two bytes to run, giving DC (both of which are nops) becoming !U (both of which are also nops)

Edit: The initial state of Malbolge memory depends on the last two characters in the code, so it is not well defined for one character programs. (Though this code doesn't care about the initial state of memory)

pppery

Posted 2015-10-07T18:32:54.120

Reputation: 3 987

5

SMBF, 92 bytes

Can be golfed, and I'll probably work more on it later.

>>+>>+>>+>>+>>+>>+[<-[>+<---]>+++++<<]>>>>>--[>-<++++++]>--->>++>+++++[>------<-]>->>++[<<]<

Explanation

The program generates the following commands at the end of its tape to erase itself, so it has to generate the following values on the tape:

[[-]<]          ASCII: 91 91 45 93 60 93

Make a bunch of 91s, with nulls (shown as _) between to use for temp values.

>>+>>+>>+>>+>>+>>+[<-[>+<---]>+++++<<]

code__91_91_91_91_91_91_
   ^

Adjust the values by the differences

>>>>>--[>-<++++++]>---  Sub 46
>>++                    Add 2
>+++++[>------<-]>-     Sub 31
>>++                    Add 2
[<<]<                   Shift left to the code
code__[_[_-_]_<_]_      Zero out the code
   ^

The tape following execution will be all zeros, with the exception of the generated code [_[_-_]_<_].
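
The arithmetic behind the generated values can be checked outside SMBF. This is a rough Python sketch, assuming 8-bit wrapping cells; `build_91` and `sub_46` are hypothetical helpers that mimic two of the loops above, and the list of deltas is copied from the explanation.

```python
def build_91():
    """Mimic `<-[>+<---]>+++++` on a cell that already holds 1."""
    counter = (0 - 1) % 256          # `<-` wraps the empty cell to 255
    cell = 1                         # left behind by the earlier `+`
    while counter:                   # `[>+<---]` runs 255 / 3 = 85 times
        cell += 1
        counter = (counter - 3) % 256
    return cell + 5                  # `>+++++`

def sub_46():
    """Mimic `--[>-<++++++]>---` and return how much it subtracts."""
    counter = (0 - 2) % 256          # `--` wraps the empty cell to 254
    steps = 0
    while counter:                   # `[>-<++++++]` runs until the cell wraps to 0
        steps += 1
        counter = (counter + 6) % 256
    return steps + 3                 # `>---`

base = build_91()
deltas = [0, 0, -sub_46(), 2, -31, 2]
generated = "".join(chr(base + d) for d in deltas)
print(base, generated)
```

Each delta is applied to a cell holding 91, so the six cells end up spelling the eraser loop that the explanation lists.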

Note:

This program made me realize that my Python interpreter for SMBF had a bug or two. It's fixed now.

mbomb007

Posted 2015-10-07T18:32:54.120

Reputation: 21 944

5

x86 asm - 6 bytes

Not sure if "until every character of the source is different than what it started as" refers to each byte, each mnemonic, or general modification. If this is invalid, I can change the xor to a rep xor so each bit changes value, but I was hoping to avoid the 6 extra bytes and stay at least a little bit comparable to these specialty golf languages.

All this does is change a c2 into a c3 (retn) by getting the live address of eip and XORing the byte 5 bytes ahead of it.

58          | pop eax                        ; store addr of eip in eax
83 70 05 01 | xor dword ptr ds:[eax + 5], 1  ; c2 ^ 1 = c3 = RETN
c2          | retn                           ; leave
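
The effect of the xor on the six listed bytes can be modelled directly. This is a minimal Python sketch, assuming eax points at the first byte of the listing; it edits a copy of the bytes rather than live machine code.

```python
# The six bytes from the listing above.
code = bytearray.fromhex("58 83 70 05 01 c2")

# `xor dword ptr [eax + 5], 1` flips bit 0 of the little-endian dword at
# offset 5; only its lowest byte, the one at offset 5, is affected.
code[5] ^= 0x01

print(code.hex(" "))
```

Only the final byte differs afterwards, which is the single-byte change the answer describes.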

Pulga

Posted 2015-10-07T18:32:54.120

Reputation: 175

4

Redcode, 7 bytes, 1 instruction (Just an example. Not competing)

This is a trivial example.

Moves the next memory location onto itself, then halts (because the entire memory is initialized to DAT 0 0, which halts the program when executed.)

MOV 1 0
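
The behaviour can be sketched with a toy core. This is a minimal Python model, assuming an 8-cell core, relative direct addressing for both operands, and only the MOV and DAT instructions; `run` is a hypothetical helper, not a real MARS implementation.

```python
DAT = ("DAT", 0, 0)

def run(core, max_cycles=100):
    """Run a toy core that understands only MOV and DAT; return the final core."""
    core = list(core)
    pc = 0
    for _ in range(max_cycles):
        op, a, b = core[pc]
        if op == "DAT":
            # Executing DAT kills the process, which halts the program.
            return core
        # MOV a b: copy the instruction at pc+a over the one at pc+b.
        core[(pc + b) % len(core)] = core[(pc + a) % len(core)]
        pc = (pc + 1) % len(core)
    raise RuntimeError("program did not halt")

final = run([("MOV", 1, 0)] + [DAT] * 7)
print(final[0])
```

After one cycle the MOV has been replaced by a copy of the DAT that follows it, and the next cycle executes a DAT and stops.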

mbomb007

Posted 2015-10-07T18:32:54.120

Reputation: 21 944

2Why are you counting this as instructions instead of bytes? – Martin Ender – 2015-10-07T19:29:42.730

Because I don't know how many bytes it is. I think that's dependent on memory size, or implementation?... – mbomb007 – 2015-10-07T19:31:56.073

4I'd count by ASCII characters if you don't know how it's implemented. – lirtosiast – 2015-10-07T19:38:09.160

1From the Wikipedia page: Each Redcode instruction occupies exactly one memory slot and takes exactly one cycle to execute. ... The memory is addressed in units of one instruction. – mbomb007 – 2015-10-07T19:47:48.880

3All [tag:code-golf] posts are scored in bytes. Since there is no Redcode machine code, we must use the characters in the "assembly source", not what it assembles to. – lirtosiast – 2015-10-07T20:31:25.983

In ICWS-86 standard Redcode, each instruction is 4 bytes at coresize 8192, so 4 bytes. – CalculatorFeline – 2016-04-28T02:33:53.463

Don't forget to indicate ICWS-86 Redcode in the title. – CalculatorFeline – 2016-04-28T19:28:49.693

@CatsAreFluffy Do you have a citation? I don't see how that fits in 4 bytes. 8192 = 2^13, so A and B each need 2 bytes, right? The instruction and addressing modes probably also each need a byte, or possibly they could both fit into the same byte. That's 5 at minimum. Or are A and B stuck into the same byte? – mbomb007 – 2016-04-28T19:44:03.437

No, can't find it. – CalculatorFeline – 2016-04-28T21:24:17.987

To keep the language simple and abstract no numerical equivalents have been defined for the OpCodes, so using mathematical operations on them wouldn't make any sense at all. - The beginners' guide to Redcode – mbomb007 – 2016-04-28T21:35:45.217

4

Emacs Lisp 22 bytes

(defun a()(defun a()))

Run from REPL:

ELISP> (defun a()(defun a()))
a
ELISP> (symbol-function 'a)
(lambda nil
  (defun a nil))

ELISP> (a)
a
ELISP> (symbol-function 'a)
(lambda nil nil)

Function now evaluates to nil.

Alternately (unbind itself) 30 bytes

(defun a()(fmakunbound 'a)(a))

Evaluating it errors with void-function. The function existed prior to being run.

Jonathan Leech-Pepin

Posted 2015-10-07T18:32:54.120

Reputation: 273

3

Powershell 65 bytes

function a{si -pat:function:a -va:([scriptblock]::create($null))}

Define a function that rewrites itself to null.

Evaluate it once and it eliminates itself.

Alternately (deletes itself from memory) 36 bytes

function a{remove-item function:a;a}

Calling it first removes it, then attempts to evaluate it recursively, erroring out as an unknown command.

Jonathan Leech-Pepin

Posted 2015-10-07T18:32:54.120

Reputation: 273

3

MIXAL, 6 bytes (counting 2 tabs)

    STZ    0

The program starts at memory location 0 and then writes 0 to memory location 0, thus erasing itself. The machine halts automatically.

This is the assembly language for Donald Knuth's hypothetical MIX computer, which may be assembled and run using GNU MIX development kit (https://www.gnu.org/software/mdk/).

musarithmia

Posted 2015-10-07T18:32:54.120

Reputation: 531

3

><>, 40 34 30 bytes

0&00a6*0&1+:&060"c"l=?!.~~r >p

Try it here!

Explanation:

0&          Adds 0 to the registry
00a6*       Adds "0,0,<" to the stack; coords followed by a character
------------loop start (for clarity)
0           0 added to stack
&1+:&       Registry retrieved, increased by 1, duplicated, one put back in registry
0           ASCII character 0 added to stack (just a 0 but will be converted to that character when inserted in the code)
60          6 and 0 added to stack
"c"         The number 99 added to stack (length of code + 1 * 3)
l=?         Stack length added to stack, checks if it is equal to 99

!.          If false, pointer jumps back to character (6,0) (loop start position)
~~r >p      If true, removes the 0 and 6, reverses stack, then loops the p command setting
            all the characters to a "<" character and the 2nd character to a " "

Basically this puts a bunch of 3 character blocks in the stack like so: (ypos, xpos, ASCII character) which gets reversed at the end so the final 'p' command reads (character, xpos, ypos) and sets that position in the code to that character. The first character is manually set as '<', so that the code ends up being '>p<' at the end to loop the command. Then every other character is overwritten as a ' ' including the p character. The ' ' is actually "ASCII CHAR 0" which is NOT a NOP and will give an error when read.

Also there have to be an odd(?) number of characters before the 'p' command or else it won't be looped back into a last time and overwritten.

torcado

Posted 2015-10-07T18:32:54.120

Reputation: 550

2

Batch, 11 bytes

@echo>%0&&*

Modifies source code to ECHO is on.

@           - don't echo the command.
 echo       - print ECHO is on.
     >%0    - write to own file.
        &&* - execute * command after the above is done, * doesn't exist so program halts.

The @ is there so the command isn't echoed, but mostly so the two echos don't line up.

ericw31415

Posted 2015-10-07T18:32:54.120

Reputation: 2 229

the @ can be removed, because ECHO (uppercase) != echo (lowercase) – pppery – 2016-06-13T19:21:33.610

@ppperry The two echos can't line up. – ericw31415 – 2016-06-16T00:18:47.587

But they're different cases. – pppery – 2016-06-16T00:21:03.687

2

Jolf, 4 bytes, noncompeting

₯S₯C

This ₯Sets the ₯Code element's value to the input, undefined as none is given. Try it here!

Conor O'Brien

Posted 2015-10-07T18:32:54.120

Reputation: 36 228

0

(Filesystem) Befunge 98, 46 bytes

ff*:1100'aof0'ai
               21f0'ai@

Note that this program creates and manipulates a file named a. How it works:

  1. The code creates a file named a containing the entire code (out to 256 characters in each dimension) shifted one space upward and two to the left.
  2. This program then reads the file named a as one line, replacing the entire first line with the contents of the a file.
  3. The second line, which has been copied in front of the IP, is executed
  4. Which reads the a file into the second line shifted two places to the right.

As a side effect, the ending source code isn't even valid Befunge! (because it contains newlines as data in a single line)

pppery

Posted 2015-10-07T18:32:54.120

Reputation: 3 987

0

Python 2, 238 bytes + 20% = 285.6

# coding: utf-8
import codecs
with codecs.open(__file__,'r') as f:
    t = f.read()
n="utf-8" if t.startswith("# coding: ascii") else "ascii"
with codecs.open(__file__,'w', encoding=n) as f:
    f.write(t[0]+" coding: "+n+t[t.find("\n"):])

Basically, this toggles the current file encoding of the Python source between ascii and utf-8, thus essentially changing every character of the source!
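
The header rewrite can be tried on a string without touching any file. This is a minimal Python 3 sketch, assuming the text starts with a coding line and contains at least one newline; `toggle_coding` is a hypothetical helper that mirrors the expression passed to f.write above.

```python
def toggle_coding(text):
    """Swap the declared encoding on the first line between ascii and utf-8."""
    new = "utf-8" if text.startswith("# coding: ascii") else "ascii"
    # Keep the leading '#', rebuild the declaration, and keep everything
    # from the first newline onward unchanged.
    return text[0] + " coding: " + new + text[text.find("\n"):]

original = "# coding: utf-8\nimport codecs\n"
once = toggle_coding(original)
twice = toggle_coding(once)
print(once)
```

In this model only the encoding name on the first line changes, and a second run restores the original header.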

Prahlad Yeri

Posted 2015-10-07T18:32:54.120

Reputation: 101

There are some extra spaces that can be removed. ) as -> )as, ) else -> )else, "utf-8"if, 'w',encoding. – mbomb007 – 2016-06-13T18:43:37.213