113
77
This is Fortnightly Challenge #3. Theme: Genetic Algorithms
This challenge is a bit of an experiment. We wanted to see what we could do, challenge-wise, with genetic algorithms. Not everything may be optimal, but we tried our best to make it accessible. If this works out, who knows what we might see in the future. Maybe a genetic King of the Hill?
The spec is quite long! We've tried to separate the spec into The Basics - the bare minimum you need to know to start playing with the framework and submit an answer - and The Gory Details - the full spec, with all details about the controller, based on which you could write your own.
If you have any questions whatsoever, feel free to join us in chat!
You're a researcher in behavioural psychology. It's Friday evening and you and your colleagues decide to have some fun and use your lab rats for a little rat race. In fact, before we get too emotionally attached to them, let's call them specimens.
You have set up a little race track for the specimens, and to make it more interesting, you've put a few walls and traps and teleporters across the track. Now, your specimens are still rats... they have no idea what a trap or a teleporter is. All they see is some things in different colours. They also don't have any sort of memory - all they can do is make decisions based on their current surroundings. I guess natural selection will pick out the specimens that know how to avoid a trap from those that don't (this race is going to take a while...). Let the games begin!†
† 84,465 specimens were harmed in the making of this challenge.
The Basics
This is a single-player game (you and your colleagues didn't want to mix up the populations so each one built their own race track). The race track is a rectangular grid, 15 cells tall and 50 cells wide. You start with 15 specimens on random (not necessarily distinct) cells on the left edge (where x = 0). Your specimens should try to reach the goal which is any cell at x ≥ 49 and 0 ≤ y ≤ 14 (the specimens may overshoot the track to the right). Each time this happens, you get a point. You also start the game with 1 point. You should try to maximise your points after 10,000 turns.
Multiple specimens may occupy the same cell and will not interact.
At each turn, each specimen sees a 5x5 grid of their surroundings (with itself in the centre). Each cell of that grid will contain a colour -1
to 15
. -1
represents cells that are out of bounds. Your specimen dies if it moves out of bounds. As for the other colours, they represent empty cells, traps, walls and teleporters. But your specimen doesn't know which colour represents what and neither do you. There are some constraints though:
- 8 colours will represent empty cells.
- 4 colours will represent a teleporter. A teleporter will send the specimen to a certain cell within its 9x9 neighbourhood. This offset will be the same for all teleporters of the same colour.
- 2 colours will represent walls. Moving into a wall is the same as standing still.
- 2 colours will represent a trap. A trap indicates that one of the 9 cells in its immediate neighbourhood is lethal (not necessarily the trap cell itself). This offset will be the same for all traps of the same colour.
Now, about that natural selection... each specimen has a genome, which is a number with 100 bits. New specimens will be created by cross-breeding two existing specimens, and then mutating the genome slightly. The more successful a specimen, the bigger its chance of reproducing.
So here is your task: You will write a single function, which receives as input the 5x5 grid of colours a specimen sees, as well as its genome. Your function will return a move (Δx, Δy) for the specimen, where Δx and Δy will each be one of {-1, 0, 1}
. You must not persist any data between function calls. This includes using your own random number generators. Your function will be provided with a seeded RNG which you are free to use as you want.
Your submission's score will be the geometric mean of the number of points across 50 random tracks. We have found that this score is subject to a fair bit of variance. Therefore, these scores will be preliminary. Once this challenge dies down, a deadline will be announced. At the end of the deadline, 100 boards will be chosen at random, and all submissions will be rescored on these 100 boards. Feel free to put an estimated score in your answer, but we will score every submission ourselves to ensure no one cheats.
We have provided controller programs in a handful of languages. Currently, you can write your submission in Python (2 or 3), Ruby, C++, C# or Java. The controller generates the boards, runs the game and provides a framework for the genetic algorithm. All you have to do is provide the moving function.
Wait, so what exactly do I do with the genome?
The challenge is to figure that out!
Since the specimens have no memory, all you've got in a given turn is a 5x5 grid of colours that don't mean anything to you. So you'll have to use the genome to reach the goal. The general idea is that you use parts of the genome to store information about the colours or the grid layout, and your bot bases its decisions on the additional information stored in the genome.
Now, of course you can't actually store anything there manually. So the actual information stored there will initially be completely random. But the genetic algorithm will soon select those specimens whose genome contains the right information while killing off those which have the wrong information. Your goal is to find a mapping from the genome bits and your field of view to a move, which allows you to find a path to the goal quickly and which consistently evolves to a winning strategy.
This should be enough information to get you started. If you want, you can skip the next section, and select your controller of choice from the list of controllers at the bottom (which also contains information about how to use that particular controller).
Read on if you want all...
The Gory Details
This specification is complete. All controllers have to implement these rules.
All randomness uses a uniform distribution, unless stated otherwise.
Track generation:
- The track is a rectangular grid, X = 53 cells wide and Y = 15 cells tall. Cells with x ≥ 49 are goal cells (where x is zero-based).
- Each cell has a single colour and may or may not be lethal - cells are not lethal unless specified by one of the cell types below.
- There are 16 different cell colours, labelled from
0
to15
, whose meaning will change from game to game. In addition,-1
represents cells that are out of bounds - these are lethal. - Choose 8 random colours. These will be empty cells (which have no effect).
- Choose 4 more random colours. These are teleporters. For two of these colours, choose a non-zero offset in the 9x9 neighbourhood (from (-4,-4) to (4,4) except (0,0)). For the other two colours, invert those offsets. If a specimen steps on a teleporter it is immediately moved by that offset.
- Choose 2 more random colours. These are traps. For each of these colours, choose an offset in the 3x3 neighbourhood (from (-1,-1) to (1,1)). A trap indicates that the cell at that offset is lethal. Note: The trap cell itself is not necessarily lethal.
- The 2 remaining colours are walls, which impede movement. Attempting to move onto a wall cell will turn the move into staying still. Wall cells themselves are lethal.
- For each non-goal cell of the grid, choose a random colour. For each goal cell choose a random empty colour.
- For each cell at the left edge of the track, determine whether the goal can be reached within 100 turns (according to the turn order rules below). If so, this cell is an admissible starting cell. If there are less than 10 starting cells, discard the track and generate a new one.
- Create 15 specimens, each with a random genome and age 0. Place each specimen on a random starting cell.
Turn order:
- The following steps will be performed, in order, for each specimen. Specimens do not interact or see each other, and may occupy the same cell.
- If the specimen's age is 100, it dies. Otherwise, increment its age by 1.
- The specimen is given its field of view - a 5x5 grid of colours, centred on the specimen - and returns a move in its 3x3 neighbourhood. Moves outside this range will cause the controller to terminate.
- If the target cell is a wall, then the move is changed to (0,0).
- If the target cell is a teleporter, the specimen is moved by the teleporter's offset. Note: This step is performed once, not iteratively.
- If the cell currently occupied by the specimen (potentially after using one teleporter) is lethal the specimen dies. This is the only time specimens die (apart from step 1.1. above). In particular, a new specimen which spawns on a lethal cell will not die immediately, but has a chance to move off the dangerous cell first.
- If the specimen occupies a goal cell, score a point, move the specimen to a random starting cell and reset its age to 0.
- If there are less than two specimens left on the board, the game ends.
- Create 10 new specimens with age 0. Each genome is determined (individually) by the breeding rules below. Place each specimen on a random starting cell.
Breeding:
When a new specimen is created choose two distinct parents at random, with a bias towards specimens which have progressed further to the right. The probability of a specimen to be chosen is proportional to its current fitness score. A specimen’s fitness score is
1 + x + 50 * number of times it reached the goal
where x is the 0-based horizontal index. Specimens created in the same turn cannot be chosen as parents.
Of the two parents, choose a random one to take the first genome bit from.
- Now as you walk along the genome, switch parents with a probability of 0.05, and keep taking bits from the resulting parent.
- Mutate the fully assembled genome: for each bit, flip it with probability 0.01.
Scoring:
- One game lasts 10,000 turns.
- Players start the game with 1 point (to allow use of the geometric mean).
- Every time a specimen reaches the goal, the player scores a point.
- For now, each player's submission will be run for 50 games, each with a different random track.
- The above approach results in more variance than is desirable. Once this challenge dies down, a deadline will be announced. At the end of the deadline, 100 boards will be chosen at random, and all submissions will be rescored on these 100 boards.
- A player's overall score is the geometric mean of the scores of these individual games.
The Controllers
You may choose any of the following controllers (as they are functionally equivalent). We have tested all of them, but if you spot a bug, want improve the code or performance, or add a feature like graphical output, please do send raise an issue or send a pull request on GitHub! You're also welcome to add a new controller in another language!
Click the language name for each controller to get to the right directory on GitHub, which contains a README.md
with exact usage instructions.
If you're not familiar with git and/or GitHub, you can download the entire repository as a ZIP from the front page (see the button in the sidebar).
Python
- Most thoroughly tested. This is our reference implementation.
- Works with both Python 2.6+ and Python 3.2+!
- It's very slow. We recommend running it with PyPy for a substantial speedup.
- Supports graphical output using either
pygame
ortkinter
.
Ruby
- Tested with Ruby 2.0.0. Should work with newer versions.
- It's also fairly slow, but Ruby may be convenient for prototyping an idea for a submission.
C++
- Requires C++11.
- Optionally supports multithreading.
- By far the fastest controller in the bunch.
C#
- Uses LINQ, so it requires .NET 3.5.
- Rather slow.
Java
- Not particularly slow. Not particularly fast.
Preliminary Leaderboard
All scores are preliminary. Nevertheless, if something is plain wrong or out of date, please let me know. Our example submission is listed for comparison, but not in contention.
Score | # Games | User | Language | Bot
===================================================================================
2914.13 | 2000 | kuroi neko | C++ | Hard Believers
1817.05097| 1000 | TheBestOne | Java | Running Star
1009.72 | 2000 | kuroi neko | C++ | Blind faith
782.18 | 2000 | MT0 | C++ | Cautious Specimens
428.38 | | user2487951 | Python | NeighborsOfNeighbors
145.35 | 2000 | Wouter ibens | C++ | Triple Score
133.2 | | Anton | C++ | StarPlayer
122.92 | | Dominik Müller | Python | SkyWalker
89.90 | | aschmack | C++ | LookAheadPlayer
74.7 | | bitpwner | C++ | ColorFarSeeker
70.98 | 2000 | Ceribia | C++ | WallGuesser
50.35 | | feersum | C++ | Run-Bonus Player
35.85 | | Zgarb | C++ | Pathfinder
(34.45) | 5000 | Martin Büttner | <all> | ColorScorePlayer
9.77 | | DenDenDo | C++ | SlowAndSteady
3.7 | | flawr | Java | IAmARobotPlayer
1.9 | | trichoplax | Python | Bishop
1.04 | 2000 | fluffy | C++ | Gray-Color Lookahead
Credits
This challenge was a huge collaborative effort:
- Nathan Merril: Wrote Python and Java controllers. Turned the challenge concept from a King-of-the-Hill into a Rat Race.
- trichoplax: Playtesting. Worked on Python controller.
- feersum: Wrote C++ controller.
- VisualMelon: Wrote C# controller.
- Martin Büttner: Concept. Wrote Ruby controller. Playtesting. Worked on Python controller.
- T Abraham: Playtesting. Tested Python and reviewed C# and C++ controller.
All of the above users (and probably a couple more I forgot) have contributed to the overall design of the challenge.
C++ controller update
If you are using the C++ with Visual Studio and multithreading, you should get the latest update because of a bug with their random number generator seeding which allows duplicate boards to be produced.
@kuroineko I may be missing something but it seems to me the rng instance is already created per thread. – Anton – 2015-01-20T10:22:14.013
@kuroineko Anton is correct; each thread has its own RNG (one is created per game). If you see any more controller problems, please report them in chat.
– feersum – 2015-01-20T10:25:11.283Is there a way of downloading all those files (at once) just from the website? I do not have any experience with git. Alternatively, do you know any good (not too long please...) tutorials on how to use git/github? – flawr – 2015-01-20T19:22:35.093
Is there any reason a rat can't know its own age? It seems that the rat doesn't know how far it has gotten, even though that's part of its fitness calculation. – None – 2015-01-21T21:41:43.107
3Couldn't someone just make a genetic genetic algorithm to find the optimal genetic algorithm for this problem? – mbomb007 – 2015-01-21T22:10:09.440
@mbomb007 I'm looking forward to your submission! ;) – Martin Ender – 2015-01-21T22:13:04.760
1@anon3202 Well, that would of course give you more information about the track layout since you could gauge where about you are. Essentially, we wanted to keep the interface for bots simple, and make it a purely local problem, where you will need the genome to learn which local solution is most beneficial for your global progress. – Martin Ender – 2015-01-21T22:15:50.380
The Java controller lacks a readme. Also, your (@Martin) solution isn't implemented in all available languages, because it isn't implemented in Java. – None – 2015-01-22T11:48:20.897
@Martin Büttner Is it normal that the first generation consists of 15 specimens but 10 specimens afterwards ? And if I read the spec correctly a perfectly geneticly fit specimen can start on a trap and die at age 0 ? – matovitch – 2015-01-22T14:20:40.023
@Calle thanks for pointing those out. Both a README and an implementation of ColorScorePlayer are required for Java - we'll get those done ASAP. – trichoplax – 2015-01-22T15:01:37.953
@matovitch 15 is the size of the starting population. 10 is the reproduction rate per turn. Each turn 10 new specimens are born, increasing the population. For each new board the starting population is always 15. – trichoplax – 2015-01-22T15:13:31.217
1@matovitch See section 5 of the *Turn order* section of the Gory Details (full spec):
'In particular, a new specimen which spawns on a lethal cell will not die immediately, but has a chance to move off the dangerous cell first.'
– trichoplax – 2015-01-22T15:16:36.510The C++ controller by default appears to do 50 games of 10000 turns, but at least for my code the geometric mean still has quite a variance. Subsequent runs on my code produce means of 7.10, 5.83, 7.63, 9.63, 6.40, 5.32. I'm no statistician, but with deviance of almost 24-38% from the average, I think your official scores will require more than 50 runs. For instance, you might keep track of a rolling margin-of-error-from-true-average, do 50 runs, and then keep doing runs until the margin-of-error is <5% or something. With my rat the 50 runs only takes 29.4 seconds. Maybe related. – Mooing Duck – 2015-01-22T22:45:15.757
@MooingDuck Yes, we've been noticing similar things. The final scores will be determined on 100 boards, and I think we might actually perform multiple runs per board, depending on how feasible this is with the slower controllers. – Martin Ender – 2015-01-22T22:46:58.997
@TheBestOne "You must not persist any data between function calls. This includes using your own random number generators. Your function will be provided with a seeded RNG which you are free to use as you want." – Martin Ender – 2015-01-23T08:08:11.277
I had a bot that was getting low twenties, and then I fixed a bug, and now it's getting like 1.4 :/ – Mooing Duck – 2015-01-24T03:47:50.543
1I tweaked the C++ code to show sample mean, stddev, stderr, and 99%conf-interval, (before your "geometric" log/exp), and made a startling discovery. The "Blind Faith" answer had "Sample Mean of 116529 +- 2.78337e+010 (99%) stddev=7.77951e+010" after 50 runs." Dropping the confidence interval to 50% doesn't notably improve matters at all. Geometric mean was stabler though: "Mean of 159.458 +- 117262 (99%) stddev=32.6237" (Before his 800-score update) – Mooing Duck – 2015-01-24T16:48:38.080
You mentioned that a Random would be passed to our player. That doesn't seem to be the case with Java controller. The provided "RandomPlayer" creates its own Random. – James Montagne – 2015-01-24T21:07:43.047
@JamesMontagne You can use this controller if you want to.
– TheNumberOne – 2015-01-24T21:23:27.213Interesting challenge (Save for later). – Display_name – 2015-01-26T00:58:29.537
1I did some experiment with mutation rate, and I think the challenge would be more interesting (and the controllers would run a lot quicker) if the probability was raised from .01 to .0227, which gives a DNA only 10% chances to go through mutation unchanged instead of 37% with the current value. This avoids ridiculous population explosions (which in turn saves a lot of computation time) and prevents a lot of failures due to insufficient diversity. Individual scores are lower, but since more runs produce winners, the global average tends to rise. – None – 2015-01-26T05:07:51.837
1Am I the only one bothered by how the C++ controller uses an
#include
of a .cpp file to function correctly? I know this is codegolf.se and all but it feels like it should be better-factored than that... – fluffy – 2015-01-26T06:18:14.760The C++ controller does the job pretty well, and is completely opaque to the player code, so the extension of the file it hides in is of no importance whatsoever, as far as I'm concerned. This is no C++ beauty contest (assuming C++ code can be anything but ugly, which requires a considerable suspension of disbelief to begin with), but if you add half a dozen templates and implement the rule of five on each object the controller uses, I'm sure a lot of contestants will get a rich, warm feeling inside. – None – 2015-01-26T06:49:28.570
1@kuroineko It doesn't take a huge amount of templating or the like - just putting the types in a .h would be sufficient. And personally I think C++ is a quite beautiful language, but to each their own. – fluffy – 2015-01-26T09:13:36.963
@fluffy All it takes is renaming gamelogic.cpp into gamelogic.h, IMHO. What good would come from putting the types in yet another file, when the whole point is to have a dedicated execution environment where the player's code cannot - by design - access anything but what the controller allows it to? – None – 2015-01-26T09:16:52.153
1Regarding the challenge itself, it seems like given the genome size it would be interesting to start with a much larger initial population, and also a higher crossover frequency (maybe even as high as 0.5). Obviously the selected parameters are the rules of the challenge but it might be interesting to play with anyway. – fluffy – 2015-01-26T09:17:28.037
1That's what magicnum.h is here for. As I said in my previous comment, I think 0.0227 might be a more meaningful value for the 1 bit mutation probability (1 offspring out of 10 would have a non-mutated mix of two parents in average), but you are free to change this parameter as you please, for instance to study the noise-resilience of your genome. About crossover probability, a higher value would probably turn the offspring DNA into garbage. With the current value, a 13 bits sequence has only 50% chances to be copied whole. Increasing it to 0.5 would reduce the average length to nothing. – None – 2015-01-26T09:26:26.417
@fluffy You can try to use my C++ controller if you want. It has an ASCII output of the map with the rats, I plan to add multithreading and statistics (average/max fitness, living mouses...) and it is pretty fast : https://github.com/matovitch/Ratlab-.
– matovitch – 2015-01-27T13:48:45.680You can considerably increase execution by factor of how many cores your cpu has (in my case 4) or more if you reference the tpl library and change the GameLogic.move method content into https://ideone.com/WnnFjF this.
– Stephan Schinkel – 2015-02-03T12:55:51.033I consider this task invalid, because it was posted "Jan 19 '15 at 22:55", which is Monday, not Friday, as it's written in spec. – metalim – 2018-02-14T09:09:55.230