Shortest universal maze exit string

49

18

A maze on an N by N grid of square cells is defined by specifying whether each edge is a wall or not a wall. All outer edges are walls. One cell is defined as the start, and one cell is defined as the exit, and the exit is reachable from the start. The start and exit are never the same cell.

Note that neither the start nor the exit need to be on the outer border of the maze, so this is a valid maze:

A 3 by 3 maze with the exit on the central cell

A string of 'N', 'E', 'S' and 'W' indicates attempting to move North, East, South and West respectively. A move that is blocked by a wall is skipped without movement. A string exits a maze if applying that string from the start results in the exit being reached (regardless of whether the string continues after reaching the exit).

Inspired by this puzzling.SE question for which xnor provided a provable method of solving with a very long string, write code that can find a single string that exits any 3 by 3 maze.

Excluding invalid mazes (start and exit on the same cell, or exit not reachable from start) there are 138,172 valid mazes and the string must exit each of them.

Validity

The string must satisfy the following:

  • It is composed of only the characters 'N', 'E', 'S' and 'W'.
  • It exits any maze it is applied to, if started at the start.

Since the set of all possible mazes includes each possible maze with each possible valid starting point, this automatically means that the string will exit any maze from any valid starting point. That is, from any starting point from which the exit is reachable.

Winning

The winner is the answer that provides the shortest valid string and includes the code used to produce it. If more than one of the answers provide a string of this shortest length, the first to post that string length wins.

Example

Here is an example string 500 characters long, to give you something to beat:

SEENSSNESSWNNSNNNNWWNWENENNWEENSESSNENSESWENWWWWWENWNWWSESNSWENNWNWENWSSSNNNNNNESWNEWWWWWNNNSWESSEEWNENWENEENNEEESEENSSEENNWWWNWSWNSSENNNWESSESNWESWEENNWSNWWEEWWESNWEEEWWSSSESEEWWNSSEEEEESSENWWNNSWNENSESSNEESENEWSSNWNSEWEEEWEESWSNNNEWNNWNWSSWEESSSSNESESNENNWEESNWEWSWNSNWNNWENSNSWEWSWWNNWNSENESSNENEWNSSWNNEWSESWENEEENSWWSNNNNSSNENEWSNEEWNWENEEWEESEWEEWSSESSSWNWNNSWNWENWNENWNSWESNWSNSSENENNNWSSENSSSWWNENWWWEWSEWSNSSWNNSEWEWENSWENWSENEENSWEWSEWWSESSWWWNWSSEWSNWSNNWESNSNENNSNEWSNNESNNENWNWNNNEWWEWEE

Thanks to orlp for donating this.


Leaderboard

Leaderboard

Equal scores are listed in order of posting of that score. This is not necessarily the order that the answers were posted since the score for a given answer may be updated over time.


Judge

Here is a Python 3 validator that takes a string of NESW as a command line argument or via STDIN.

For an invalid string, this will give you a visual example of a maze it fails for.

trichoplax

Posted 2015-07-17T16:09:56.477

Reputation: 10 499

3This is a really neat question. Is there one shortest string (or a number of strings and a proof that there can't be any shorter answers)? And if so, do you know it? – Alex Van Liew – 2015-07-31T18:43:52.473

@AlexVanLiew Thanks :) For a 2 by 2 maze the shortest possible string is 11 characters. I chose 3 by 3 for the question to make it unlikely that an optimal solution will be found. Judging by the results for the 2 by 2 case (which can be brute forced), there will be a shortest possible length, with a number of different valid strings of that length. – trichoplax – 2015-07-31T19:33:41.710

And no, I don't have any idea how long the shortest possible string will be for 3 by 3 mazes. More than 11, and less than 86... You could increase that lower bound by brute force, but my suspicion is that the shortest string will be too long to be found that way. – trichoplax – 2015-07-31T19:38:24.747

Huh. If I was better at this sort of thing I would give it a shot, but I'm not good enough with optimization and low-level speed and the like to beat orlp's submission. I hope you get somebody else to post something interesting! – Alex Van Liew – 2015-07-31T21:12:14.887

@AlexVanLiew optimisation is one way to approach it. I'm also interested to see if anyone finds some kind of pattern that allows for reducing the length without having to process millions of candidates... – trichoplax – 2015-07-31T21:27:42.123

Can the exit be the center cell? – Alex Reinking – 2015-08-03T19:34:01.157

1@AlexReinking yes the start can be any of the 9 cells and the exit can be any of the 9 cells, as long as they are not the same cell, and the exit is reachable from the start. – trichoplax – 2015-08-03T23:30:32.273

1

Slightly similar to this stackoverflow question: http://stackoverflow.com/questions/26910401/solve-all-4x4-mazes-simultaneously-with-least-moves - but start and end cell are top left and bottom right in that one, which reduces the possible maze count to 2423.

– schnaader – 2015-08-04T12:40:55.507

@schnaader that's an interesting question you link to! In addition to start and end being fixed, the walls are filled cells in that question, rather than having walls along cell edges. Some of the techniques discussed there might still be applicable here though... – trichoplax – 2015-08-04T18:53:13.473

why is this specific to 3? in this case my code could just be "print NENENNNENENE" or something. why wouldn't the question be to all numbers, and be scored based on the string of length 3? – proud haskeller – 2015-08-04T19:39:43.653

1@proudhaskeller either way would be a valid question. The general case, scored for n=3, would require more generalised code. This specific case allows for optimisations that do not apply to general n, and that's the way I chose to ask it. – trichoplax – 2015-08-04T21:24:33.920

@proudhaskeller sorry I missed the point of your comment initially - you mean someone could hardcode a string rather than submitting code that finds one. If someone did, I would downvote and raise it on meta, as I am certain that they would need to have code elsewhere that they omitted from the answer in order to find a competitive string. A valid answer requires a string plus the code that produced it. – trichoplax – 2015-08-05T10:35:44.503

2Has anyone considered approaching this problem as finding the shortest accepted string to a regular expression? It would require a LOT of reductions in the number of problems before converting to regexes, but could theoretically find a verifiably optimal solution. – Kyle McCormick – 2015-08-09T03:47:38.123

The link to the python 3 validator is dead. – NonlinearFruit – 2017-05-19T23:04:02.810

@NonlinearFruit thanks for letting me know - I've fixed the link now – trichoplax – 2017-05-19T23:16:47.863

@AlexVanLiew, if NP-complete problems can't be effectively solved, almost surely optimal solution won't be found. – rus9384 – 2017-09-13T10:40:43.167

Im excited about this puzzle. it's probably going to be one of the first things we can solve via quantum computing. you only need 2*n qbits where n is the number of instructions that you want to check. – tuskiomi – 2018-10-15T14:56:08.177

@tuskiomi so as quantum computers progress we'll get increasing lower bounds on the optimal solution until eventually we find it... – trichoplax – 2018-10-15T19:31:35.723

@trichoplax yep, and then (given an average computational time), we would only need to perform 2 checks per Qbit. it's very computationally lax for a quantum computer, but for a classical machine, it's very hard to do, even with current technology. – tuskiomi – 2018-10-15T19:36:19.757

Can we call out to the validator you've written, in our code? – Stackstuck – 2019-03-25T16:42:02.553

@Stackstuck this is not code-golf, so you can include my validator directly in your code - there's no need to make it short. However, the validator is intended for checking a single solution with human readable output - it isn't fast enough to check large numbers of candidates in a practical time. Go ahead and use and modify it however you wish, but the existing answers managed to find ways of optimising this check dramatically, so looking at them will probably be more useful than looking at mine... :) – trichoplax – 2019-03-25T21:36:49.393

Yeah, I have absolutely no idea how they're doing it. I was just planning on exhaustively searching. – Stackstuck – 2019-03-26T02:10:38.590

Answers

37

C++, 97 95 93 91 86 83 82 81 79 characters

NNWSWNNSENESESWSSWNSEENWWNWSSEWWNENWEENWSWNWSSENENWNWNESENESESWNWSESEWWNENWNEES

My strategy is fairly simple - an evolution algorithm that can grow, shrink, swap elements of and mutate valid sequences. My evolution logic is now nearly the same as @Sp3000's, as his was an improvement over mine.

However, my implementation of the maze logic is rather nifty. This allows me to check if strings are valid at blistering speed. Try to figure it out by looking at the comment, do_move and the Maze constructor.

#include <algorithm>
#include <bitset>
#include <cstdint>
#include <iostream>
#include <random>
#include <set>
#include <vector>

/*
    Positions:

        8, 10, 12
        16, 18, 20
        24, 26, 28

    By defining as enum respectively N, W, E, S as 0, 1, 2, 3 we get:

        N: -8, E: 2, S: 8, W: -2
        0: -8, 1: -2, 2: 2, 3: 8

    To get the indices for the walls, average the numbers of the positions it
    would be blocking. This gives the following indices:

        9, 11, 12, 14, 16, 17, 19, 20, 22, 24, 25, 27

    We'll construct a wall mask with a 1 bit for every position that does not
    have a wall. Then if a 1 shifted by the average of the positions AND'd with
    the wall mask is zero, we have hit a wall.
*/

enum { N = -8, W = -2, E = 2, S = 8 };
static const int encoded_pos[] = {8, 10, 12, 16, 18, 20, 24, 26, 28};
static const int wall_idx[] = {9, 11, 12, 14, 16, 17, 19, 20, 22, 24, 25, 27};
static const int move_offsets[] = { N, W, E, S };

int do_move(uint32_t walls, int pos, int move) {
    int idx = pos + move / 2;
    return walls & (1ull << idx) ? pos + move : pos;
}

struct Maze {
    uint32_t walls;
    int start, end;

    Maze(uint32_t maze_id, int start, int end) {
        walls = 0;
        for (int i = 0; i < 12; ++i) {
            if (maze_id & (1 << i)) walls |= 1 << wall_idx[i];
        }
        this->start = encoded_pos[start];
        this->end = encoded_pos[end];
    }

    uint32_t reachable() {
        if (start == end) return false;

        uint32_t reached = 0;
        std::vector<int> fill; fill.reserve(8); fill.push_back(start);
        while (fill.size()) {
            int pos = fill.back(); fill.pop_back();
            if (reached & (1 << pos)) continue;
            reached |= 1 << pos;
            for (int m : move_offsets) fill.push_back(do_move(walls, pos, m));
        }

        return reached;
    }

    bool interesting() {
        uint32_t reached = reachable();
        if (!(reached & (1 << end))) return false;
        if (std::bitset<32>(reached).count() <= 4) return false;

        int max_deg = 0;
        uint32_t ends = 0;
        for (int p = 0; p < 9; ++p) {
            int pos = encoded_pos[p];
            if (reached & (1 << pos)) {
                int deg = 0;
                for (int m : move_offsets) {
                    if (pos != do_move(walls, pos, m)) ++deg;
                }
                if (deg == 1) ends |= 1 << pos;
                max_deg = std::max(deg, max_deg);
            }
        }

        if (max_deg <= 2 && ends != ((1u << start) | (1u << end))) return false;

        return true;
    }
};

std::vector<Maze> gen_valid_mazes() {
    std::vector<Maze> mazes;
    for (int maze_id = 0; maze_id < (1 << 12); maze_id++) {
        for (int points = 0; points < 9*9; ++points) {
            Maze maze(maze_id, points % 9, points / 9);
            if (!maze.interesting()) continue;
            mazes.push_back(maze);
        }
    }

    return mazes;
}

bool is_solution(const std::vector<int>& moves, Maze maze) {
    int pos = maze.start;
    for (auto move : moves) {
        pos = do_move(maze.walls, pos, move);
        if (pos == maze.end) return true;
    }

    return false;
}

std::vector<int> str_to_moves(std::string str) {
    std::vector<int> moves;
    for (auto c : str) {
        switch (c) {
        case 'N': moves.push_back(N); break;
        case 'E': moves.push_back(E); break;
        case 'S': moves.push_back(S); break;
        case 'W': moves.push_back(W); break;
        }
    }

    return moves;
}

std::string moves_to_str(const std::vector<int>& moves) {
    std::string result;
    for (auto move : moves) {
             if (move == N) result += "N";
        else if (move == E) result += "E";
        else if (move == S) result += "S";
        else if (move == W) result += "W";
    }
    return result;
}

bool solves_all(const std::vector<int>& moves, std::vector<Maze>& mazes) {
    for (size_t i = 0; i < mazes.size(); ++i) {
        if (!is_solution(moves, mazes[i])) {
            // Bring failing maze closer to begin.
            std::swap(mazes[i], mazes[i / 2]);
            return false;
        }
    }
    return true;
}

template<class Gen>
int randint(int lo, int hi, Gen& gen) {
    return std::uniform_int_distribution<int>(lo, hi)(gen);
}

template<class Gen>
int randmove(Gen& gen) { return move_offsets[randint(0, 3, gen)]; }

constexpr double mutation_p = 0.35; // Chance to mutate.
constexpr double grow_p = 0.1; // Chance to grow.
constexpr double swap_p = 0.2; // Chance to swap.

int main(int argc, char** argv) {
    std::random_device rnd;
    std::mt19937 rng(rnd());
    std::uniform_real_distribution<double> real;
    std::exponential_distribution<double> exp_big(0.5);
    std::exponential_distribution<double> exp_small(2);

    std::vector<Maze> mazes = gen_valid_mazes();

    std::vector<int> moves;
    while (!solves_all(moves, mazes)) {
        moves.clear();
        for (int m = 0; m < 500; m++) moves.push_back(randmove(rng));
    }

    size_t best_seen = moves.size();
    std::set<std::vector<int>> printed;
    while (true) {
        std::vector<int> new_moves(moves);
        double p = real(rng);

        if (p < grow_p && moves.size() < best_seen + 10) {
            int idx = randint(0, new_moves.size() - 1, rng);
            new_moves.insert(new_moves.begin() + idx, randmove(rng));
        } else if (p < swap_p) {
            int num_swap = std::min<int>(1 + exp_big(rng), new_moves.size()/2);
            for (int i = 0; i < num_swap; ++i) {
                int a = randint(0, new_moves.size() - 1, rng);
                int b = randint(0, new_moves.size() - 1, rng);
                std::swap(new_moves[a], new_moves[b]);
            }
        } else if (p < mutation_p) {
            int num_mut = std::min<int>(1 + exp_big(rng), new_moves.size());
            for (int i = 0; i < num_mut; ++i) {
                int idx = randint(0, new_moves.size() - 1, rng);
                new_moves[idx] = randmove(rng);
            }
        } else {
            int num_shrink = std::min<int>(1 + exp_small(rng), new_moves.size());
            for (int i = 0; i < num_shrink; ++i) {
                int idx = randint(0, new_moves.size() - 1, rng);
                new_moves.erase(new_moves.begin() + idx);
            }
        }

        if (solves_all(new_moves, mazes)) {
            moves = new_moves;

            if (moves.size() <= best_seen && !printed.count(moves)) {
                std::cout << moves.size() << " " << moves_to_str(moves) << "\n";
                if (moves.size() < best_seen) {
                    printed.clear(); best_seen = moves.size();
                }
                printed.insert(moves);
            }
        }
    }

    return 0;
}

orlp

Posted 2015-07-17T16:09:56.477

Reputation: 37 067

5Confirmed valid. I'm impressed - I didn't expect to see strings this short. – trichoplax – 2015-07-17T20:45:03.417

2I finally got round to installing gcc and running this for myself. It is hypnotic watching the strings mutating and slowly shrinking... – trichoplax – 2015-07-29T15:37:00.063

1@trichoplax I told you it was fun :) – orlp – 2015-07-29T15:39:06.707

You could make this even faster by just preprocessing a table of walls, positions, and directions. It'd be less than a megabyte in size, and do_move would just become table[walls][pos][move] – Alex Reinking – 2015-08-04T03:27:01.010

Actually, if you created that table within each Maze struct, you would only use two memory loads to compute a move, rather than one memory load and a bunch of arithmetic (and a branch!). – Alex Reinking – 2015-08-04T03:29:24.643

@AlexReinking That would be significantly slower. Since the table is around ~1 megabyte you'll expect to hit L3 cache. L3 cache is ~40 cycles per access. The best way to speed this up is to precompute the masks into the walls variable, and to not do memory lookups at all. – orlp – 2015-08-04T03:32:08.850

Aren't you already hitting L3 cache when you iterate over the vector of Mazes? No memory lookups would be best, yeah. – Alex Reinking – 2015-08-04T03:36:33.163

@AlexReinking No, iterating over a contiguous chunk of memory in a linear fashion is perfect for the prefetcher and cache. Either way, that iteration can not be avoided. – orlp – 2015-08-04T03:38:22.080

@orlp I had an idea for moving that doesn't require branching and uses a small lookup table (less than the size of a cache line). How fast do you think it is? – Alex Reinking – 2015-08-04T04:06:54.470

@AlexReinking Not needed, you can change the position numbers such that they'll never go negative, and then encode the wall such that the ith bit of it contains either a 0 or a 1 for a wall at that position, and every other bit is a 0. Then you can just mask directly without the need of any lookup table or branch. – orlp – 2015-08-04T04:09:34.700

Would love to see what that looks like. – Alex Reinking – 2015-08-04T04:17:23.080

2

@AlexReinking I updated my answer with said implementation. If you look at the disassembly you'll see it's only a dozen of instructions without any branch or load: http://coliru.stacked-crooked.com/a/3b09d36db85ce793 .

– orlp – 2015-08-04T04:38:00.763

could you save a little bit by defining the enum constants to be the offsets you compute and then make the rarely-called string translation a bit more complicated? – Alex Reinking – 2015-08-04T05:09:14.507

@AlexReinking Initially I thought of using 2-bit integers to represent the strings, but then I found out memory usage is no problem, so the enum still stems from that version. In my next version I can do that, yes. – orlp – 2015-08-04T05:18:33.173

2@AlexReinking Done. do_move is now insanely fast. – orlp – 2015-08-04T09:24:01.867

I got a little bit of a speed up by sorting the vector of mazes - each time a string fails for a maze, that maze is moved one step closer to the start of the vector, so that after a while invalid strings are failing earlier on average. Not sure if that will conflict with some of the other improvements you've made since then though... – trichoplax – 2015-08-04T18:59:04.830

@trichoplax That's a cool idea! – orlp – 2015-08-04T19:30:25.443

@trichoplax And from my measurements, effective too! All I do is if I detect that the maze at index i caused the failure, I swap it with the maze at position i / 2. This (really) often reduces 10000+ mazes being checked to <50. – orlp – 2015-08-04T19:38:44.757

@trichoplax From my measurements it seems to be 2x faster! – orlp – 2015-08-04T19:51:42.853

@orlp Great! I was only swapping with i - 1 and saw a marginal speed up - I didn't think to use i / 2. – trichoplax – 2015-08-04T21:26:59.037

@orlp did the improvements lead to shorter strings? 24 hours left to get lower than 82 for the bounty... – trichoplax – 2015-08-07T11:23:03.987

@trichoplax Well, I woke up today to check the terminal, but it seems that my computer BSOD'd overnight. So, the only answer is 'perhaps'. – orlp – 2015-08-07T12:37:38.920

Ah... Well I'll be waiting until the end of the 24 hour grace period before awarding the bounty, so there are over 25 hours still available. Good luck :) – trichoplax – 2015-08-07T12:48:23.130

@trichoplax 81 :) – orlp – 2015-08-07T14:14:47.497

That was quick... – trichoplax – 2015-08-07T18:16:33.213

I hope you don't mind my edit - I wanted to include the most up to date score found using your code ready for posting an open ended bounty for an improvement.

– trichoplax – 2016-10-23T23:44:16.567

Has anyone thought about porting the do_Move function to OpenCL or CUDA? – tuskiomi – 2017-08-02T20:12:31.367

1

This code can be further sped up by halving move values and not doing the division by 2. Though I haven't measured precisly, the speedup was noticable at least on my pc. https://godbolt.org/z/iLml-3 The version with bt was slower, at least on my nehalem `enum { N = -4, W = -1, E = 1, S = 4 };

int do_move(uint32_t walls, int pos, int halfmove) { int idx = pos + halfmove; return walls & (1ull << idx) ? pos + halfmove + halfmove : pos; }Also it's trivial to parallelize. https://pastebin.com/tv1vnPcA There'snum_threads` variable.

– Sopel – 2019-11-05T20:38:15.480

16

Python 3 + PyPy, 82 80 characters

SWWNNSENESESWSSWSEENWNWSWSEWNWNENENWWSESSEWSWNWSENWEENWWNNESENESSWNWSESESWWNNESE

I've been hesitant to post this answer because I've basically taken orlp's approach and put my own spin on it. This string was found by starting with a pseudorandom length 500 solution - quite a number of seeds were tried before I could break the current record.

The only new major optimisation is that I only look at one third of the mazes. Two categories of mazes are excluded from the search:

  • Mazes where <= 7 squares are reachable
  • Mazes where all reachable squares are on a single path, and the start/finish are not at both ends

The idea is that any string which solves the rest of the mazes should also solve the above automatically. I'm convinced this is true for the second type, but it is definitely not true for the first, so the output will contain some false positives that need to be checked separately. These false positive usually only miss about 20 mazes though, so I thought it'd be a good tradeoff between speed and accuracy, and it would also give the strings a little more breathing space to mutate.

Initially I went through a long list of search heuristics, but horrifically none of them came up with anything better than 140 or so.

import random

N, M = 3, 3

W = 2*N-1
H = 2*M-1

random.seed(142857)


def move(c, cell, walls):
    global W, H

    if c == "N":
        if cell > W and not (1<<(cell-W)//2 & walls):
            cell = cell - W*2

    elif c == "S":
        if cell < W*(H-1) and not (1<<(cell+W)//2 & walls):
            cell = cell + W*2

    elif c == "E":
        if cell % W < W-1 and not (1<<(cell+1)//2 & walls):
            cell = cell + 2

    elif c == "W":
        if cell % W > 0 and not (1<<(cell-1)//2 & walls):
            cell = cell - 2

    return cell


def valid_maze(start, finish, walls):
    global adjacent

    if start == finish:
        return False

    visited = set()
    cells = [start]

    while cells:
        curr_cell = cells.pop()

        if curr_cell == finish:
            return True

        if curr_cell in visited:
            continue

        visited.add(curr_cell)

        for c in "NSEW":
            cells.append(move(c, curr_cell, walls))

    return False


def print_maze(maze):
    start, finish, walls = maze
    print_str = "".join(" #"[walls & (1 << i//2) != 0] if i%2 == 1
                        else " SF"[2*(i==finish) + (i==start)]
                        for i in range(W*H))

    print("#"*(H+2))

    for i in range(H):
        print("#" + print_str[i*W:(i+1)*W] + "#")

    print("#"*(H+2), end="\n\n")

all_cells = [W*y+x for y in range(0, H, 2) for x in range(0, W, 2)]
mazes = []

for start in all_cells:
    for finish in all_cells:
        for walls in range(1<<(N*(M-1) + M*(N-1))):
            if valid_maze(start, finish, walls):
                mazes.append((start, finish, walls))

num_mazes = len(mazes)
print(num_mazes, "mazes generated")

to_remove = set()

for i, maze in enumerate(mazes):
    start, finish, walls = maze

    reachable = set()
    cells = [start]

    while cells:
        cell = cells.pop()

        if cell in reachable:
            continue

        reachable.add(cell)

        if cell == finish:
            continue

        for c in "NSEW":
            new_cell = move(c, cell, walls)
            cells.append(new_cell)

    max_deg = 0
    sf = set()

    for cell in reachable:
        deg = 0

        for c in "NSEW":
            if move(c, cell, walls) != cell:
                deg += 1

        max_deg = max(deg, max_deg)

        if deg == 1:
            sf.add(cell)

    if max_deg <= 2 and len(sf) == 2 and sf != {start, finish}:
        # Single path subset
        to_remove.add(i)

    elif len(reachable) <= (N*M*4)//5:
        # Low reachability maze, above ratio is adjustable
        to_remove.add(i)

mazes = [maze for i,maze in enumerate(mazes) if i not in to_remove]
print(num_mazes - len(mazes), "mazes removed,", len(mazes), "remaining")
num_mazes = len(mazes)


def check(string, cache = set()):
    global mazes

    if string in cache:
        return True

    for i, maze in enumerate(mazes):
        start, finish, walls = maze
        cell = start

        for c in string:
            cell = move(c, cell, walls)

            if cell == finish:
                break

        else:
            # Swap maze to front
            mazes[i//2], mazes[i] = mazes[i], mazes[i//2]
            return False

    cache.add(string)
    return True


while True:
    string = "".join(random.choice("NSEW") for _ in range(500))

    if check(string):
        break

# string = "NWWSSESNESESNNWNNSWNWSSENESWSWNENENWNWESESENNESWSESWNWSWNNEWSESWSEEWNENWWSSNNEESS"

best = len(string)
seen = set()

while True:
    action = random.random()

    if action < 0.1:
        # Grow
        num_grow = int(random.expovariate(lambd=3)) + 1
        new_string = string

        for _ in range(num_grow):
            i = random.randrange(len(new_string))
            new_string = new_string[:i] + random.choice("NSEW") + new_string[i:]

    elif action < 0.2:
        # Swap
        num_swap = int(random.expovariate(lambd=1)) + 1
        new_string = string

        for _ in range(num_swap):
            i,j = sorted(random.sample(range(len(new_string)), 2))
            new_string = new_string[:i] + new_string[j] + new_string[i+1:j] + new_string[i] + new_string[j+1:]

    elif action < 0.35:
        # Mutate
        num_mutate = int(random.expovariate(lambd=1)) + 1
        new_string = string

        for _ in range(num_mutate):
            i = random.randrange(len(new_string))
            new_string = new_string[:i] + random.choice("NSEW") + new_string[i+1:]

    else:
        # Shrink
        num_shrink = int(random.expovariate(lambd=3)) + 1
        new_string = string

        for _ in range(num_shrink):
            i = random.randrange(len(new_string))
            new_string = new_string[:i] + new_string[i+1:]


    if check(new_string):
        string = new_string

    if len(string) <= best and string not in seen:
        while True:
            if len(string) < best:
                seen = set()

            seen.add(string)
            best = len(string)
            print(string, len(string))

            # Force removals on new record strings
            for i in range(len(string)):
                new_string = string[:i] + string[i+1:]

                if check(new_string):
                    string = new_string
                    break

            else:
                break

Sp3000

Posted 2015-07-17T16:09:56.477

Reputation: 58 729

Confirmed valid. Nice improvements :) – trichoplax – 2015-08-04T12:51:14.840

I like your idea of realising some mazes do not need to be checked. Could you somehow automate the process of determining which mazes are redundant checks? I'm curious to know if that would show up more mazes than the ones that can be deduced intuitively... – trichoplax – 2015-08-04T19:04:58.397

What's your reasoning for not needing to check path graphs where the start is not at one end? The case where the finish is not at one end is easy to justify, and can be strengthened to not needing to check cases where the finish is a cut vertex, but I can't see how to justify eliminating start vertices. – Peter Taylor – 2015-08-09T07:53:06.053

@PeterTaylor After some more thinking, theoretically you are right, there are some mazes that you can't eliminate like that. However, it seems that on 3x3 it doesn't matter for strings this long. – orlp – 2015-08-09T11:20:32.820

2@orlp, Sp3000 sketched a proof in chat. Path graphs are a special case. Renumber the cells 0 to n along the path and suppose that string S gets you from 0 to n. Then S also gets you from any intermediate cell c to n. Suppose otherwise. Let a(i) be the position after i steps starting at 0 and b(i) starting at c. Then a(0) = 0 < b(0), each step changes a and b by at most 1, and a(|S|) = n > b(|S|). Take the smallest t such that a(t) >= b(t). Clearly a(t) != b(t) or they would be in sync, so they must swap places at step t by moving in the same direction. – Peter Taylor – 2015-08-09T13:16:20.610

Hi, can I modify this code to try to find a shorter string? – Oliver Ni – 2016-11-10T05:18:20.687

I just need the first part the checks if it is a valid string. – Oliver Ni – 2016-11-10T05:21:47.330

@Oliver I... guess, if it's just the validation. Would be interested to see whether alternative search methods have more success. – Sp3000 – 2016-11-10T19:32:59.943

I'm not getting your point. If a string solves all reachable mazes, why doesn't it solve others? In fact string must visit every cell in every reachable maze, which also solves unreachable cases. – rus9384 – 2017-09-16T03:07:38.530

@rus9384 "reachable" is referring to a cell (a square) in a maze. There are unreachable cells, not unreachable mazes. For example, if a square has 4 walls around it then it cannot be reached from any other cell. – trichoplax – 2017-09-20T08:31:25.887

@trichoplax Reachable maze is a maze where all cells are reachable from any other. – rus9384 – 2017-09-20T09:08:07.783

@rus9384 That's not how the word "reachable" is being used in either the answer or the comments. – trichoplax – 2017-09-20T12:14:48.060

@trichoplax, in fact it's enough that string passes through all cells from any starting cell in 189 different mazes (assuming maze count as the same wherever start and finish are). That's what I mean. – rus9384 – 2017-09-20T12:20:30.317

@rus9384 I don't yet follow how this works, but if you've found a set of 189 mazes that means you don't have to check any others, that would give a huge improvement in the time taken to check each string. The comments are getting full here but I'd be interested to hear how this works in [chat]. – trichoplax – 2017-09-20T17:18:56.407

5

C++ and the library from lingeling

Summary: A new approach, no new solutions, a nice program to play with, and some interesting results of local non-improvability of the known solutions. Oh, and some generally useful observations.

Using a SAT based approach, I could completely solve the similar problem for 4x4 mazes with blocked cells instead of thin walls and fixed start and exit positions at opposite corners. So I hoped to be able to use the same ideas for this problem. However, even though for the other problem I only used 2423 mazes (in the meantime it has been observed that 2083 are enough) and it has a solution of length 29, the SAT encoding used millions of variables and solving it took days.

So I decided to change the approach in two important ways:

  • Don't insist on searching a solution from scratch, but allow to fix a part of the solution string. (That's easy to do anyway by adding unit clauses, but my program makes it comfortable to do.)
  • Don't use all mazes from the beginning. Instead, incrementally add one unsolved maze at a time. Some mazes may be solved by chance, or they are always solved when the ones already considered are solved. In the latter case, it will never be added, without us needing to know the implication.

I also did some optimizations to use less variables and unit clauses.

The program is based on @orlp's. An important change was the selection of mazes:

  • First of all, mazes are given by their wall structure and the start position only. (They also store the reachable positions.) The function is_solution checks if all reachable positions are reached.
  • (Unchanged: still not using mazes with only 4 or less reachable positions. But most of them would be thrown away anyway by the following observations.)
  • If a maze does not use any of the three top cells, it is equivalent to a maze that is shifted up. So we can drop it. Likewise for a maze that does not use any of the three left cells.
  • It doesn't matter if unreachable parts are connected, so we insist that each unreachable cell is completely surrounded by walls.
  • A single path maze that is a submaze of a bigger single path maze is always solved when the bigger one is solved, so we don't need it. Each single path maze of size at most 7 is part of a bigger one (still fitting in 3x3), but there are size 8 single path mazes that aren't. For simpliciy, let's just drop single path mazes of size less than 8. (And I'm still using that only the extreme points need to be considered as start positions. All positions are used as exit positions, which only matters for the SAT part of the program.)

In this way, I get a total of 10772 mazes with start positions.

Here is the program:

#include <algorithm>
#include <array>
#include <bitset>
#include <cstring>
#include <iostream>
#include <set>
#include <vector>
#include <limits>
#include <cassert>

extern "C"{
#include "lglib.h"
}

// reusing a lot of @orlp's ideas and code

enum { N = -8, W = -2, E = 2, S = 8 };
static const int encoded_pos[] = {8, 10, 12, 16, 18, 20, 24, 26, 28};
static const int wall_idx[] = {9, 11, 12, 14, 16, 17, 19, 20, 22, 24, 25, 27};
static const int move_offsets[] = { N, E, S, W };
static const uint32_t toppos = 1ull << 8 | 1ull << 10 | 1ull << 12;
static const uint32_t leftpos = 1ull << 8 | 1ull << 16 | 1ull << 24;
static const int unencoded_pos[] = {0,0,0,0,0,0,0,0,0,0,1,0,2,0,0,0,3,
                                    0,4,0,5,0,0,0,6,0,7,0,8};

int do_move(uint32_t walls, int pos, int move) {
  int idx = pos + move / 2;
  return walls & (1ull << idx) ? pos + move : pos;
}

struct Maze {
  uint32_t walls, reach;
  int start;

  Maze(uint32_t walls=0, uint32_t reach=0, int start=0):
    walls(walls),reach(reach),start(start) {}

  bool is_dummy() const {
    return (walls==0);
  }

  std::size_t size() const{
    return std::bitset<32>(reach).count();
  }

  std::size_t simplicity() const{  // how many potential walls aren't there?
    return std::bitset<32>(walls).count();
  }

};

bool cmp(const Maze& a, const Maze& b){
  auto asz = a.size();
  auto bsz = b.size();
  if (asz>bsz) return true;
  if (asz<bsz) return false;
  return a.simplicity()<b.simplicity();
}

uint32_t reachable(uint32_t walls) {
  static int fill[9];
  uint32_t reached = 0;
  uint32_t reached_relevant = 0;
  for (int start : encoded_pos){
    if ((1ull << start) & reached) continue;
    uint32_t reached_component = (1ull << start);
    fill[0]=start;
    int count=1;
    for(int i=0; i<count; ++i)
      for(int m : move_offsets) {
        int newpos = do_move(walls, fill[i], m);
        if (reached_component & (1ull << newpos)) continue;
        reached_component |= 1ull << newpos;
        fill[count++] = newpos;
      }
    if (count>1){
      if (reached_relevant)
        return 0;  // more than one nonsingular component
      if (!(reached_component & toppos) || !(reached_component & leftpos))
        return 0;  // equivalent to shifted version
      if (std::bitset<32>(reached_component).count() <= 4)
        return 0;  
      reached_relevant = reached_component;
    }
    reached |= reached_component;
  }
  return reached_relevant;
}

void enterMazes(uint32_t walls, uint32_t reached, std::vector<Maze>& mazes){
  int max_deg = 0;
  uint32_t ends = 0;
  for (int pos : encoded_pos)
    if (reached & (1ull << pos)) {
      int deg = 0;
      for (int m : move_offsets) {
        if (pos != do_move(walls, pos, m))
          ++deg;
      }
      if (deg == 1)
        ends |= 1ull << pos;
      max_deg = std::max(deg, max_deg);
    }
  uint32_t starts = reached;
  if (max_deg == 2){
    if (std::bitset<32>(reached).count() <= 7)
      return; // small paths are redundant
    starts = ends; // need only start at extremal points
  }
  for (int pos : encoded_pos)
    if ( starts & (1ull << pos))
      mazes.emplace_back(walls, reached, pos);
}

std::vector<Maze> gen_valid_mazes() {
  std::vector<Maze> mazes;
  for (int maze_id = 0; maze_id < (1 << 12); maze_id++) {
    uint32_t walls = 0;
    for (int i = 0; i < 12; ++i) 
      if (maze_id & (1 << i))
    walls |= 1ull << wall_idx[i];
    uint32_t reached=reachable(walls);
    if (!reached) continue;
    enterMazes(walls, reached, mazes);
  }
  std::sort(mazes.begin(),mazes.end(),cmp);
  return mazes;
};

bool is_solution(const std::vector<int>& moves, Maze& maze) {
  int pos = maze.start;
  uint32_t reached = 1ull << pos;
  for (auto move : moves) {
    pos = do_move(maze.walls, pos, move);
    reached |= 1ull << pos;
    if (reached == maze.reach) return true;
  }
  return false;
}

std::vector<int> str_to_moves(std::string str) {
  std::vector<int> moves;
  for (auto c : str) {
    switch (c) {
    case 'N': moves.push_back(N); break;
    case 'E': moves.push_back(E); break;
    case 'S': moves.push_back(S); break;
    case 'W': moves.push_back(W); break;
    }
  }
  return moves;
}

Maze unsolved(const std::vector<int>& moves, std::vector<Maze>& mazes) {
  int unsolved_count = 0;
  Maze problem{};
  for (Maze m : mazes)
    if (!is_solution(moves, m))
      if(!(unsolved_count++))
    problem=m;
  if (unsolved_count)
    std::cout << "unsolved: " << unsolved_count << "\n";
  return problem;
}

LGL * lgl;

constexpr int TRUELIT = std::numeric_limits<int>::max();
constexpr int FALSELIT = -TRUELIT;

int new_var(){
  static int next_var = 1;
  assert(next_var<TRUELIT);
  return next_var++;
}

bool lit_is_true(int lit){
  int abslit = lit>0 ? lit : -lit;
  bool res = (abslit==TRUELIT) || (lglderef(lgl,abslit)>0);
  return lit>0 ? res : !res;
}

void unsat(){
  std::cout << "Unsatisfiable!\n";
  std::exit(1);
}

void clause(const std::set<int>& lits){
  if (lits.find(TRUELIT) != lits.end())
    return;
  for (int lit : lits)
    if (lits.find(-lit) != lits.end())
      return;
  int found=0;
  for (int lit : lits)
    if (lit != FALSELIT){
      lgladd(lgl, lit);
      found=1;
    }
  lgladd(lgl, 0);
  if (!found)
    unsat();
}

void at_most_one(const std::set<int>& lits){
  if (lits.size()<2)
    return;
  for(auto it1=lits.cbegin(); it1!=lits.cend(); ++it1){
    auto it2=it1;
    ++it2;
    for(  ; it2!=lits.cend(); ++it2)
      clause( {- *it1, - *it2} );
  }
}

/* Usually, lit_op(lits,sgn) creates a new variable which it returns,
   and adds clauses that ensure that the variable is equivalent to the
   disjunction (if sgn==1) or the conjunction (if sgn==-1) of the literals
   in lits. However, if this disjunction or conjunction is constant True
   or False or simplifies to a single literal, that is returned without
   creating a new variable and without adding clauses.                    */ 

int lit_op(std::set<int> lits, int sgn){
  if (lits.find(sgn*TRUELIT) != lits.end())
    return sgn*TRUELIT;
  lits.erase(sgn*FALSELIT);
  if (!lits.size())
    return sgn*FALSELIT;
  if (lits.size()==1)
    return *lits.begin();
  int res=new_var();
  for(int lit : lits)
    clause({sgn*res,-sgn*lit});
  for(int lit : lits)
    lgladd(lgl,sgn*lit);
  lgladd(lgl,-sgn*res);
  lgladd(lgl,0);
  return res;
}

int lit_or(std::set<int> lits){
  return lit_op(lits,1);
}

int lit_and(std::set<int> lits){
  return lit_op(lits,-1);
}

using A4 = std::array<int,4>;

void add_maze_conditions(Maze m, std::vector<A4> dirs, int len){
  int mp[9][2];
  int rp[9];
  for(int p=0; p<9; ++p)
    if((1ull << encoded_pos[p]) & m.reach)
      rp[p] = mp[p][0] = encoded_pos[p]==m.start ? TRUELIT : FALSELIT;
  int t=0;
  for(int i=0; i<len; ++i){
    std::set<int> posn {};
    for(int p=0; p<9; ++p){
      int ep = encoded_pos[p];
      if((1ull << ep) & m.reach){
        std::set<int> reach_pos {};
        for(int d=0; d<4; ++d){
          int np = do_move(m.walls, ep, move_offsets[d]);
          reach_pos.insert( lit_and({mp[unencoded_pos[np]][t],
                                  dirs[i][d ^ ((np==ep)?0:2)]    }));
        }
        int pl = lit_or(reach_pos);
        mp[p][!t] = pl;
        rp[p] = lit_or({rp[p], pl});
        posn.insert(pl);
      }
    }
    at_most_one(posn);
    t=!t;
  }
  for(int p=0; p<9; ++p)
    if((1ull << encoded_pos[p]) & m.reach)
      clause({rp[p]});
}

void usage(char* argv0){
  std::cout << "usage: " << argv0 <<
    " <string>\n   where <string> consists of 'N', 'E', 'S', 'W' and '*'.\n" ;
  std::exit(2);
}

const std::string nesw{"NESW"};

int main(int argc, char** argv) {
  if (argc!=2)
    usage(argv[0]);
  std::vector<Maze> mazes = gen_valid_mazes();
  std::cout << "Mazes with start positions: " << mazes.size() << "\n" ;
  lgl = lglinit();
  int len = std::strlen(argv[1]);
  std::cout << argv[1] << "\n   with length " << len << "\n";

  std::vector<A4> dirs;
  for(int i=0; i<len; ++i){
    switch(argv[1][i]){
    case 'N':
      dirs.emplace_back(A4{TRUELIT,FALSELIT,FALSELIT,FALSELIT});
      break;
    case 'E':
      dirs.emplace_back(A4{FALSELIT,TRUELIT,FALSELIT,FALSELIT});
      break;
    case 'S':
      dirs.emplace_back(A4{FALSELIT,FALSELIT,TRUELIT,FALSELIT});
      break;
    case 'W':
      dirs.emplace_back(A4{FALSELIT,FALSELIT,FALSELIT,TRUELIT});
      break;
    case '*': {
      dirs.emplace_back();
      std::generate_n(dirs[i].begin(),4,new_var);
      std::set<int> dirs_here { dirs[i].begin(), dirs[i].end() };
      at_most_one(dirs_here);
      clause(dirs_here);
      for(int l : dirs_here)
        lglfreeze(lgl,l);
      break;
      }
    default:
      usage(argv[0]);
    }
  }

  int maze_nr=0;
  for(;;) {
    std::cout << "Solving...\n";
    int res=lglsat(lgl);
    if(res==LGL_UNSATISFIABLE)
      unsat();
    assert(res==LGL_SATISFIABLE);
    std::string sol(len,' ');
    for(int i=0; i<len; ++i)
      for(int d=0; d<4; ++d)
        if (lit_is_true(dirs[i][d])){
          sol[i]=nesw[d];
          break;
    }
    std::cout << sol << "\n";

    Maze m=unsolved(str_to_moves(sol),mazes);
    if (m.is_dummy()){
      std::cout << "That solves all!\n";
      return 0;
    }
    std::cout << "Adding maze " << ++maze_nr << ": " << 
      m.walls << "/" << m.start <<
      " (" << m.size() << "/" << 12-m.simplicity() << ")\n";
    add_maze_conditions(m,dirs,len);
  }
}  

First configure.sh and make the lingeling solver, then compile the program with something like g++ -std=c++11 -O3 -I ... -o m3sat m3sat.cc -L ... -llgl, where ... is the path where lglib.h resp. liblgl.a are, so both could for example be ../lingeling-<version>. Or just put them in the same directory and do without the -I and -L options.

The program takes one mandatory command line argument, a string consisting of N, E, S, W (for fixed directions) or *. So you could search for a general solution of size 78 by giving a string of 78 *s (in quotes), or search for a solution starting with NEWS by using NEWS followed by as many *s as you want for additional steps. As a first test, take your favourite solution and replace some of the letters with *. This finds a solution fast for a surprisingly high value of "some".

The program will tell which maze it adds, described by wall structure and start position, and also give the number of reachable positions and walls. The mazes are sorted by these criteria, and the first unsolved one is added. Therefore most added mazes have (9/4), but sometimes others appear as well.

I took the known solution of length 79, and for each group of adjacent 26 letters, tried to replace them with any 25 letters. I also tried to remove 13 letters from the beginning and from the end, and replace them by any 13 at the beginning and any 12 at the end, and vice versa. Unfortunately, it all came out unsatisfiable. So, can we take this as indicator that length 79 is optimal? No, I similarly tried to improve the length 80 solution to length 79, and that was also not successful.

Finally, I tried combining the beginning of one solution with the end of the other, and also with one solution transformed by one of the symmetries. Now I'm running out of interesting ideas, so I decided to show you what I have, even though it didn't lead to new solutions.

Christian Sievers

Posted 2015-07-17T16:09:56.477

Reputation: 6 366

That was a really interesting read. Both the new approach and the different ways of reducing the number of mazes to check. To be a valid answer this will need to include a valid string. It doesn't need to be a new shortest string, just any valid string of any length to give a current score for this approach. I mention this because without a score, the answer will be at risk of deletion, and I'd really like to see it stay. – trichoplax – 2019-05-06T13:03:53.617

Also nice work on finding the optimal length solution for the older related challenge!

– trichoplax – 2019-05-06T13:20:32.487