Tips for golfing in brainfuck

23

2

What general tips do you have for golfing in brainfuck? I'm looking for ideas that can be applied to code golf problems in general that are at least somewhat specific to brainfuck (e.g. "remove comments" is not an answer). Please post one tip per answer.

RAM

Posted 2013-10-23T05:55:01.537

Reputation: 797

Answers

26

Putting one tip per answer would be way too many answers.

  • Learn to think in Brainfuck. It's very different than anything else. Read and write, and rewrite, and rewrite lots of brainfuck programs. The language doesn't give you much to work with, so it's important to use what it does give you flexibly and efficiently. Don't let any abstractions get between you and the language--get in there and grapple with it.

  • Get very comfortable with nondestructive flow control. To get out of a decision loop, rather than zeroing the starting cell by copying it elsewhere and then copying it back after leaving the loop, it's often better to move the pointer to a pre-existing zero nearby. Yes, this means the pointer will be in different places depending on whether you went through the loop, but that also means those places probably have different arrangements of nearby zeros and nonzeros, which you can use to resynch the pointer location using another loop. This technique is fundamental to good Brainfuck programming, and various forms of it will constantly prove useful.

  • That and the fact that every > or < costs mean that the details of memory layout are important. Try out as many variations of your layout as you have patience for. And remember, your memory layout does not have to be a rigid mapping of data to locations. It can morph over the course of execution.

  • On a larger scale, consider and even try implementing a variety of different algorithms. Initially it will not be obvious exactly what algorithm will be best; it may not even be obvious what basic approach will be best, and it will probably be something different than what would be best in a normal language.

  • If you're dealing with large or variable-sized data, see if there's any way you can possibly deal with it locally, without having to keep track of how big it is or your numerical location within it.

  • The same data can be two different things. (Most often, a number or character and also a nonzero positional marker. But see random.b, where a bit counter doubles as the value of one cell of a cellular automaton.)

  • The same code can do two different things, and it's a lot easier to make it do so in a language where code is as generic as <+<. Be alert to such possibilities. In fact, you may occasionally notice, even in what seems to be a well-written program, that there are small portions that could be deleted entirely, nothing added, and the thing would, by happenstance, still run flawlessly.

  • In most languages, you use a compiler or interpreter frequently to check your program's behavior. The Brainfuck language demands greater conceptual control; if you need a compiler to tell you what your program does, you don't have a firm enough grasp of your program, and you probably need to stare at it some more--at least if you want to have a clear enough image of the conceptual halo of similar programs to be good at golf. With practice, you'll be producing a dozen versions of your program before you try running one, and by that point you'll be 95% sure that your shortest one will run correctly.

  • Good luck! Very few people bother to write Brainfuck concisely, but I think that's the only way the language can possibly justify continuing attention--as a stunningly obscure art form.

Daniel Cristofani

Posted 2013-10-23T05:55:01.537

Reputation: 947

Holy shit, thought I recognised your name. Big fan of your brainfuck programs, especially your super-short self-interpreter! – Jo King – 2018-02-20T11:22:25.893

"you may occasionally notice ... that there are small portions that could be deleted entirely, nothing added, and the thing would, by happenstance, still run flawlessly." This is so true, especially for a beginner. I've only ever written one answer in BF as far as I can remember, yet its revision history contains at least 3 instances where I was playing with the program and randomly went, "hey, the program still works if I remove this bit!"

– ETHproductions – 2018-02-21T04:05:14.510

"Try out as many variations of your layout as you have patience for," or can brute-force :P

– Esolanging Fruit – 2018-02-21T04:37:02.217

Really nice answer – RGS – 2020-02-06T00:12:32.440

3Reading this makes me want to try programming in Brainfuck now.. – Claudiu – 2014-09-26T17:44:28.927

9

A few tips here:

Constants:

The Esolangs' constants page has an extremely useful list of the shortest ways to create specific values. I find myself consulting this page at least twice per program.

The start of everything:

+++[[<+>>++<-]>]

This sets up the tape in the format 3*n^2, which looks like

3 6 12 24 48 96 192 128 0 0'

Why is this so important?

Let's go down the list:

  • 3 and 6 are boring
  • 12: Close to 10 (newline) or 13 (carriage return). Can also be used for the counter for 0-9
  • 24: Close to 26, the number of letters in the alphabet
  • 48: ASCII for 0
  • 96: Close to 97, ASCII for a
  • 196 and 128: 196-128=64, close to 65, the ASCII for A.

From this one algorithm, we're at the start of practically every sequence in the ASCII range, along with a counter for each and a newline in easy reach.

A practical example:

Printing all uppercase and lowercase letters and digits.

With algorithm:

+++[[<+>>++<-]>]<<[-<->]<<<<++[->>+.>+.<<<]<--[>>.+<<-]

Without:

+++++++++++++[->+++++++>++>+++++>++++>+<<<<<]>+++++>[-<+.>>.+<]>>---->---[-<.+>]

We spend most of the bytes just initialising the tape in the second example. Some of this it offset by the extra movements in the first example, but this method clearly has the advantage.

A couple of other interesting algorithms in the same vein:

3*2^n+1:

+++[[<+>>++<-]+>]
Tape: 4 7 13 25 49 65 197 129 1 0'

This offsets the values by 1, which accomplishes a few things. It makes the 12 a carriage return, the 64 the actual start of the uppercase alphabet, and the 24 one closer to 26.

2^n:

+[[<+>>++<-]>]
Tape: 1 2 4 8 16 32 64 128

Because 64 is good for uppercase letters, 32 is the ASCII for space, and 128 can be used as the counter for 26 (130/5 = 26). This can save bytes in certain situations where digits and lowercase letters aren't needed.

Choose the implementation that suits the question:

  • Negative cells are almost always useful, and there's no reason to avoid them (unless it doesn't change your bytecount)
  • Almost the exact same thing with wrapping cells, even more so because many constants use wrapping.
  • Arbitrary cell sizes are useful for infinite math sequences, such as calculating the Fibonacci sequence infinitely (+[[-<+>>+>+<<]>]), or processing larger/negative numbers. The downside is that some common methods, such as [-] and [->+<] can't be relied upon, just in case the number is negative.
  • EOF as 0, -1 or no change. 0 is usually preferable, as you can loop over an entire input without extra checks. -1 is useful when looping over array structures. I haven't yet found a use for no change :(.

Keep track of what the frick is going on:

At all times you should have comments on where the pointer should be in relation to the data around it, and make sure that you know the range of possible values of each cell. This is especially important when you've split the pointer before a loop, because you'll want to join the two possibilities back together afterwards.

At any point, my code is littered with comments on every other line that look like this:

*0 *dat a_1 ?  0' !0 0*
 or
*0 *dat 0' ap1 0  !0 0*

Some extra advice is to assign symbols special meanings. In the above example, ' is where the pointer is, * means repetition in that direction, ? means a cell with unknown value, !0 means a non-zero cell, _ is a substitute for - and p is a substitute for +. or implies that the tape could look like either of representations, and needs to be handled as such.

Your symbol scheme doesn't necessarily have to be the same as mine (which has a few flaws), it just has to be consistent. This is also extremely useful when debugging, because you can run it up to that point and compare the actual tape to what you should have, which can point out potential flaws in your code.

Jo King

Posted 2013-10-23T05:55:01.537

Reputation: 38 234

5

My principal piece of advice would be don't.

OK, fine, you want something more useful than that. BF is already a very terse language, but what really kills you is arithmetic, which effectively needs to be done in unary. It's worth reading over the constants page in Esolang to pick out exactly how to write large numbers efficiently, and to exploit wrapping wherever possible.

Memory access is also very expensive. Since you're reading from a tape, you need to keep in mind where your head is moving at any given time. Unlike other languages where you can just write a, b, c, in bf you have to explicitly move the head some number of bytes left or right, so you must be mindful of where you store what. I'm pretty sure that organising your memory in the optimal way is NP-hard, so good luck with that one.

ymbirtt

Posted 2013-10-23T05:55:01.537

Reputation: 1 792

5

In this answer I'm going to refer to a specific cell on the tape many times. It doesn't matter which cell it is, but it's the same cell throughout the entire answer. For the purposes of this post, I'll call that cell "Todd".

When trying to set a cell to a constant value, it sometimes pays to not finish it immediately. For example, say you wanted Todd to contain 30. Later on in your code (which may modify Todd's value but never reads it) you come back to Todd. If Todd's value is 0, the program exits. Otherwise, Todd's value is printed forever.

According to the esolangs.org page of brainfuck constants (which could probably be the subject of a tip on its own!) the shortest way to get 30 is >+[--[<]>>+<-]>+. That leading > is only there to ensure that nothing to the left of the pointer is modified, but in this case we'll assume we don't care about that and drop it. Using that code, your code would look something like this:

+[--[<]>>+<-]>+(MISC. CODE)(GO TO TODD)[.]

You can think of the first chunk of code like this:

(SET TODD TO 30)(MISC. CODE)(GO TO TODD)[.]

But remember the last two characters in that chunk: >+. It's just as valid to think of it this way:

(SET TODD TO 29)(GO TO TODD)(ADD 1 TO TODD)(MISC. CODE)(GO TO TODD)[.]

Notice that you (GO TO TODD) twice! You could instead write your code this way:

(SET TODD TO 29)(MISC. CODE)(GO TO TODD)(ADD 1 TO TODD)[.]
+[--[<]>>+<-](MISC. CODE)(GO TO TODD)+[.]

Assuming that the number of bytes it takes to (GO TO TODD) is the same before, one less move == one less byte! Sometimes the fact that your starting position has changed does take that benefit away, but not always.

undergroundmonorail

Posted 2013-10-23T05:55:01.537

Reputation: 5 897

0

A tiny little tip for challenges without input. You can use , instead of [-], if you need to clear the cell quickly, as most of the interpreters (including the TIO.run one) will set the cell contents to EOF representation being zero. This makes programs a tiny bit unportable, but who cares about it in code golf anyway?

Krzysztof Szewczyk

Posted 2013-10-23T05:55:01.537

Reputation: 3 819