Find the longest path, avoiding obstacles in a 2D plane

6

3

The Scenario

You are given a matrix of size m x n (width x height) with m*n spots where there are a few obstacles. Spots with obstacles are marked as 1, and those without are marked as 0. You can move vertically or horizontally, but can visit each spot only once. The goal is to cover as much area/spots as possible, avoiding obstacles.

Input

A 2D Matrix consisting of 0s and 1s and the coordinates of starting point.

Example Matrix1:

0(0,0) 1(0,1) 0(0,2)
0(1,0) 0(1,1) 0(1,2)
1(2,0) 0(2,1) 0(2,2)

Example Matrix2:

0(0,0) 0(0,1) 1(0,2) 0(0,3) 0(0,4)
0(1,0) 1(1,1) 0(1,2) 0(1,3) 0(1,4)
0(2,0) 0(2,1) 0(2,2) 1(2,3) 0(2,4)
0(3,0) 1(3,1) 0(3,2) 0(3,3) 0(3,4)
0(4,0) 0(4,1) 1(4,2) 0(4,3) 0(4,4)

The coordinates shown above, next to matrix elements is for explanation only. They are nothing but array indices. Actual input would just be the matrix. for example1 the input is just,

0 1 0
0 0 0
1 0 0

Input for example2,

0 0 1 0 0
0 1 0 0 0
0 0 0 1 0
0 1 0 0 0
0 0 1 0 0

Starting Point (0,0).

Output

Output should be a line containing comma separated moves. A move is a space separated direction (n,s,e,w) and number of units of movement in the direction (range 1-1000).

Example output for the matrix in example1 would be,

s 1,e 1,s 1,e 1,n 2

(0,0) --> (1,0) --> (1,1) --> (2,1) --> (2,2) --> (1,2) --> (0,2)

and for the matrix in example2 would be,

s 2,e 2,n 1,e 1,n 1,e 1,s 4,w 1,n 1,w 1 

(0,0) --> (1,0) --> (2,0) --> (2,1) --> (2,2) --> (1,2) --> (1,3) --> (0,3) --> (0,4) --> (1,4) --> (2,4) --> (3,4) --> (4,4) --> (4,3) --> (3,3) --> (3,2)

which is the longest path, visiting each coordinate only once and avoiding those coordinates with obstacles.

Objective Winning Criteria

Should compute the distance efficiently(efficient: performance wise, fast program basically) for huge matrices (Of order say, 500x500). One which computes the fastest wins (All submissions run on my machine, for fairness). No restrictions on number of bytes!

Here is a 20x20 matrix which can be used to get some idea of the performace, for tuning purposes.

0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 1 0 0 0
0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 1 0 0 0
0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 1 0 0 0
0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 1 0 0 0
0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 1 0 0 0
0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 1 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 1 0 0 0
0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 1 0 0 0
0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 1 0 0 0
0 1 0 0 1 1 1 0 0 0 0 0 0 0 1 0 1 0 0 0
0 1 0 0 1 1 1 0 0 0 0 0 0 0 1 0 1 0 0 0
0 1 0 0 1 1 1 0 0 0 0 0 0 0 1 0 1 0 0 0
0 1 0 0 1 1 1 0 0 0 0 0 0 0 1 0 1 0 0 0
0 1 0 0 1 1 1 0 0 0 0 0 0 0 1 0 1 0 0 0
0 1 0 0 1 1 1 0 0 0 0 0 0 0 1 0 1 0 0 0
0 1 0 0 1 1 1 0 0 0 0 0 0 0 1 0 1 0 0 0
0 1 0 0 1 1 1 0 0 0 0 0 0 0 1 0 0 0 0 0
0 1 0 0 1 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0
0 1 0 0 1 1 1 0 0 0 0 0 0 0 1 0 1 0 0 0
0 1 0 0 1 1 1 0 0 0 0 0 0 0 1 0 1 0 0 0

GeT_RiGhT

Posted 2016-03-12T14:07:00.113

Reputation: 63

2

Welcome to Programming Puzzles & Code Golf. Every challenge here needs an objective winning criterion, so its possible to determine a winner. In your case you should go with code-golf, so the solution with the fewest bytes of code wins. For future posts please check out the sandbox to get feedback first, before you post your challenge.

– Denker – 2016-03-12T14:26:41.850

Edited. Thanks. Sorry! This was my first time here. – GeT_RiGhT – 2016-03-12T14:46:51.067

3If you want this to be reopened, you need to improve the spec a bit as well. Firstly you should add more testcases, one is not enough. Also you shold format them in a way that makes it easy to copy-paste them (the matrix could just be a 2D array. Furthermore you need to specify the expected output more. I guess `s,e,w,n' are used to indicate north, west, ... but you need to make it clear. Also it is generally discouraged to require a specific I/O format. Just let everyone choose the most convenient input format that fits the used language best. – Denker – 2016-03-12T14:56:29.723

1@xnor Really? I think limiting it to only orthogonal moves (and not allowing moves from any point to any other) should make this one a bit different? – Martin Ender – 2016-03-12T17:01:07.187

@MartinBüttner Oh, I totally misunderstood that challenge, my bad. – xnor – 2016-03-12T23:49:55.280

1I think this would be more interesting if output format was more flexible, for example sesenn. As it stands, a significant part of the efffort is formatting the output – Luis Mendo – 2016-03-13T16:14:47.903

@DonMuesli Yes, you are right. Output can be as you said. It can even just be coordinates/array indices of points to be visited. – GeT_RiGhT – 2016-03-13T16:59:33.530

1If the winning criterion is running speed, you should remove the "code golf" tag. And you should define precisely how speed will be measured. For example, some answer may be the fastest for small matrices, and some other for large matrices. Which one wins then? – Luis Mendo – 2016-03-13T17:47:57.777

1Also posted on CS.SE. Please do not post the same question on multiple sites. Each community should have an honest shot at answering without anybody's time being wasted. Also, where did you encounter this problem? Please make sure to attribute your sources. Thank you! – D.W. – 2016-03-13T21:25:26.523

@DonMuesli I have already mentioned this in the winning criteria, but going to reiterate the one which computes the path for large matrices (largest being of the order 1000x1000) wins! – GeT_RiGhT – 2016-03-14T05:13:29.267

@D.W. My intention of posting it here was just to share this problem since I felt it is interesting and fun to solve! Wanted this problem to reach out to as many people as possible. The intention of posting it on CS.SE was purely for learning purposes. I wanted some advice on best algorithms to use for solving this. The problem might be the same but the intent with which it was posted was entirely different. Thats why different SE sites exist? Each with a different purpose? – GeT_RiGhT – 2016-03-14T05:20:04.783

1@xor I don't see how fastest for large matrix is fair - whoever has the biggest mainframe wins... what if I can't afford a mainframe? Shouldn't it be fastest on some specific computer (maybe yours)? That way the fastest algorithm wins instead of the most expensive computer. – Jerry Jeremiah – 2016-03-14T11:48:19.147

Are the outputs for the given test cases canonical? i.e. where (in case 2) there is an equivalent option, if the algorithm happens to choose a different but equally-covering route (that differs from the output in the question) is this valid? Or must the algorithm always be bias a certain way under equivalency choices? (and obviously, if so, how does this bias work/which way/etc?) – Tersosauros – 2016-03-14T18:53:18.590

Also, just out of interest... Is: s 2,e 1,w 2,s 3,n 2,e 1,s 2,s 1,s 3,s 1,s 3,n 2,n 3,s 2,n 7,e 1,n 5,e 1,s 4,s 3,n 1,e 1,s 3,s 1,e 1,w 2,s 1,s 1,s 2,e 3,w 2,e 3,n 1,s 2,s 1,w 1,e 2,e 2,w 1,n 2,n 1,n 2,w 1,w 3,e 2,w 1,s 1,n 1,n 1,n 1,n 1,n 1,n 1,n 3,w 1,n 5,e 1,e 2,e 5,w 3,w 2,s 2,s 1,s 1,e 1,n 2,s 1,s 1,s 1,s 1,e 1,e 2,e 1,w 1,e 2,s 1,w 1,e 4,s 1,e 1,n 2,e 1,n 5,e 2,w 2,n 3,e 2,w 2,e 2,e 1,w 1,s 1,s 3,w 1,w 1,e 1,w 2,s 2,s 2,n 1,s 1,s 3,w 4,n 1 the correct path for the 20x20 matrix? ;P – Tersosauros – 2016-03-14T18:55:29.570

@Tersosauros. Please, correct me if I'm wrong. I've summed up those values and it gave me 193. I did also a very simple algorithm (without any strategy for selecting directions) and the highest value was 219 check here. I think the output for the 20x20 matriz in the example should be some path with steps around 300... :/

– removed – 2016-03-15T01:10:13.907

@Washington Guedes Your 219 move solution skips 22 accessible cells/nodes - see this diagram of your solution's path. "some path with steps around 300..." It is true that there are 301 *unblocked* nodes. However, 43 of these are inaccessible (i.e. any [near-]optimum solution cannot traverse them). The largest path I've been able to find in the 20x20 matrix is 258. Obviously, the 193 node path I suggest above is not optimum, as you have pointed out.

– Tersosauros – 2016-03-25T00:26:14.603

Answers

8

Python 3, 2619 bytes (90 LOC), ~6 seconds (20x20)

 

#!/usr/bin/env python

import networkx as nx

class input(object):
    def __init__(self,world_string):
        rows=world_string.split("\n")
        column_count=len(rows[0])
        blocked_cells=[]
        x=0
        y=-1
        for row in rows:
            if len(row)>column_count:
                column_count=len(row)
            y+=1
            for column in row:
                if column=='1':
                    blocked_cells.append((y,x))
                x+=1
            x=0
        self.matrix=world(len(rows),column_count)
        self.block_graph(blocked_cells)
        self.matrix.add_sink()

    def block_graph(self,block_list):
        for cell in block_list:
            self.matrix.block_cell(cell)


class world(object):
    SINK="END"

    def __init__(self, rows=3, columns=3):
        self.graph=nx.grid_2d_graph(rows, columns)
        for e in self.graph.edges():
            self.graph[e[0]][e[1]]["weight"]=-1.0

    def block_cell(self,node):
        neighbours=list(self.graph[node].keys())
        for linked_node in neighbours:
            self.graph.remove_edge(node,linked_node)

    def add_sink(self):
        matrix=self.graph.nodes()[:]
        self.graph.add_node(world.SINK)
        for source_cell in matrix:
            self.graph.add_edge(source_cell, world.SINK, weight=1.0)


class algorithm(object):
    SOURCE=(0,0)
    def find_path(graph):
        def _heuristic(current, target):
            if current==target:return 0.0
            cost=nx.astar_path(graph, algorithm.SOURCE,current,lambda x,y:1)
            return -1 * len(cost)
        path=nx.astar_path(graph, algorithm.SOURCE, world.SINK, _heuristic)
        return path[:-1]

class output(object):
    directions={
        ( 1,  0)  : "n",
        (-1,  0)  : "s",
        ( 0,  1)  : "w",
        ( 0, -1)  : "e"
    }

    def edge_to_direction(source_node, dest_node):
        row_delta=source_node[0]-dest_node[0]
        column_delta=source_node[1]-dest_node[1]
        return output.directions[(row_delta, column_delta)]

    def path_to_directions(path):
        result=""
        for step in range(len(path)-1):
            result+=output.edge_to_direction(path[step],path[step+1])
        return result

    def simplify_directions(directions):
        directions+="\0"
        count=0
        result=""
        last=""
        for d in directions:
            if d!=last or d is "\0":
                result+="{} {},".format(last, count)
                count=0
                last=""
            if count==0:
                last=d
                count=1
                continue
            if d==last:
                count+=1
        return result[3:-1]



if __name__=='__main__':
    contents="010\n000\n100"    #read from file?

    data=input(contents).matrix.graph
    result_path=algorithm.find_path(data)
    directions=output.path_to_directions(result_path)
    directions=output.simplify_directions(directions)
    print(directions+"\n")
    print(str(result_path)[1:-1].replace('), (',') --> (').replace(', ',','))
 

 

Performance

I ran this solution (on my old Core i3 laptop) on the 20x20 'performance' example input and it ran in around ~6 seconds (of course, YMMV). I am SURE there are ways to do this faster, probably by orders of magnitude. However, I didn't feel like re-writing my own A* algorithm in assembly and renting some time slices on a super computer!

About this Code

The Idea

The idea for this approach comes from my answer over here in Code Review. You can also read a little bit more about it and the end of my answer here.

Dependencies

As you may have notice by the (rather prominent) line import networkx as nx - this code relies upon the Python NetworkX module to function. So do >pip install networkx if you get import errors when running.

Breakdown of Solution Elements by Lines of Code

I also thought it would be interesting (and it was), after Don Muesli's comment (particularly the part "a significant part of the efffort[sic] is formatting the output"), to examine how much of the code is for various things.

      Breakdown:

  • 39 lines (or 43.33%) of the lines of code in the program are concerned with parsing the input and constructing/weighting the graph data structure.

  • 37 lines (approx. 41%) directly relate solely to displaying and formatting the desired output!

  • 5 lines (5.5%) are 'other' stuff, which relates to running the program (shebang line, imports, the if __name__=="__main__"..., etc).

  • And 9 (yes.. nine) lines of code (or just 10%) actually relates to the algorithm usage.

      Obligatory Pie Chart:

                  Pie Chart illustration of Solution Elements by Lines of Code


About the Approach

As I mentioned above, most of the idea for this approach comes from my answer over on Code Review regarding this problem.

The process itself is relatively straightforward:

  1. Transform the given matrix into an undirected graph data structure
  2. Annotate (i.e. add weights) the edges of that graph
  3. Add a "sink" node - to be used as a search target
  4. Add (weighted) edges between all nodes and the sink node
  5. Use a Graph Search Algorithm (in this case A*)

 

So, taking the 3x3 matrix from the question for example. The transformation into a graph would yield something like the following:

Graph of example 3x3 Matrix

(Note: the 'blocked' nodes do not have any edges)

Step 2; imagine each green line in the above picture has a weight of -1.

Next, for step 3 and 4 together (I didn't want to have to make another picture), take a look at the following graph:

Matrix Graph with Sink Node

And again imagine the edges are weighted, only red lines are weighted +1.

We can then apply the A-star algorithm from the NetworkX library to this graph, using the custom heuristic function. This makes the search algorithm score paths in an inverse manner (longest path rather than shortest). And the longest path will be the one that visits the most cells of the problem matrix!

Tersosauros

Posted 2016-03-12T14:07:00.113

Reputation: 226

1Welcome to Programming Puzzles & Code Golf! Amazing first answer! – Denker – 2016-03-14T21:00:57.147

This fails to find the longest path even on the smallest 3×3 example case: it finds a path through 6 nodes instead of 7. I don’t particularly blame you for that, because even the question author doesn’t seem to realize that this problem is NP-complete, but you could have at least mentioned it. There is no way A-star can produce optimal solutions, no matter what heuristic you give it, because it will never revisit a vertex that it previously found on a different path.

– Anders Kaseorg – 2018-07-23T02:35:46.030

0

Clingo, 202 seconds for 20×20

This problem is NP-complete, even if you were to guarantee that all 0s are used, so your dreams of a program that runs efficiently for 500×500 may be a little bit unrealistic.

(The other answer using A* search doesn’t find the optimal solution, even on the 3×3 example case, because A* never revisits a node that was previously found on a different path.)

#script (python)

import clingo
import itertools
import sys

def main(prg):
    matrix = [list(line.split()) for line in sys.stdin]

    def data():
        for i, row in enumerate(matrix):
            for j, cell in enumerate(row):
                if cell == '0':
                    yield f'cell(({i}, {j})).\n'

    def on_model(model):
        step = {}
        for atom in prg.symbolic_atoms:
            if atom.symbol.name == 'step' and model.contains(atom.symbol):
                a, d, b = atom.symbol.arguments
                step[a] = d, b
        moves = []
        a = clingo.Tuple((0, 0))
        while step[a][1] != clingo.Tuple((0, 0)):
            d, a = step[a]
            moves.append(d)
        print(len(moves) + 1)
        print(','.join(f'{d} {len(list(g))}' for d, g in itertools.groupby(moves)))

    prg.add('data', [], ''.join(data()))
    prg.ground([('base', []), ('data', [])])
    prg.solve(on_model=on_model)

#end.

#program base.

dir(s, (1, 0); e, (0, 1); n, (-1, 0); w, (0, -1)).
adj((X, Y), D, (X+DX, Y+DY)) :- cell((X, Y)), dir(D, (DX, DY)), cell((X+DX, Y+DY)).
adj(A, end, (0, 0)) :- cell(A).
{step(A, D, B) : adj(A, D, B)} = 1 :- visit(A).
:- visit(B), {step(A, D, B) : adj(A, D, B)} != 1.
visit((0, 0)).
visit(B) :- step(A, D, B).
:~ visit(A). [-1, A]

#show step/3.

Demo

$ clingo -q longest.lp < 20x20.in
clingo version 5.2.2
Reading from longest.lp
Solving...
1

2
s 1
3
s 2
4
s 3
5
e 3,s 1
6
e 3,s 2
7
e 3,s 3
9
e 3,s 5
…
264
s 4,e 1,n 4,e 2,s 1,w 1,s 1,e 1,s 1,w 1,s 1,e 1,s 1,w 1,s 1,w 1,n 1,w 1,s 3,e 1,n 1,e 1,s 12,e 1,n 13,e 2,s 1,e 2,n 7,e 1,s 2,e 2,n 1,w 1,n 1,e 4,s 5,w 1,s 1,e 1,s 1,w 2,s 1,e 2,s 1,w 1,s 1,e 1,s 1,w 2,s 1,w 1,s 2,w 1,n 3,e 1,n 1,e 1,n 1,w 1,n 3,e 1,n 1,w 1,n 1,e 2,n 3,w 1,s 2,w 3,s 1,e 1,s 1,w 1,s 1,e 1,s 4,w 1,n 2,w 1,s 3,e 1,s 1,w 1,s 1,e 1,s 1,w 1,s 1,e 3,s 1,e 1,n 3,e 1,n 1,e 1,s 2,w 1,s 1,e 1,s 1,w 1,s 2,w 1,n 1,w 2,n 1,w 2,s 1,e 1,s 1,w 1,s 1,e 2,n 1,e 1,s 1,e 3,n 2,e 2,n 1,e 2,s 3,e 1,n 1,e 1,n 6,w 1,n 1,e 1,n 9,w 1,n 1,e 1,n 1,w 2,s 3,e 1,s 1,w 1,s 1,e 1,s 1,w 1,s 1,e 1,s 1,w 1,s 1,e 1,s 1,w 1,s 3,e 1,s 1,w 1,s 1,e 1,s 2
265
s 4,e 1,n 4,e 2,s 1,w 1,s 1,e 1,s 1,w 1,s 1,e 1,s 1,w 1,s 1,w 1,n 1,w 1,s 3,e 1,n 1,e 1,s 12,e 1,n 13,e 2,s 1,e 2,n 7,e 1,s 2,e 2,n 1,w 1,n 1,e 4,s 5,w 1,s 1,e 1,s 1,w 2,s 1,e 2,s 1,w 1,s 1,e 1,s 1,w 2,s 1,w 1,s 2,w 1,n 3,e 1,n 1,e 1,n 1,w 1,n 3,e 1,n 1,w 1,n 1,e 2,n 3,w 1,s 2,w 3,s 1,e 1,s 1,w 1,s 1,e 1,s 4,w 1,n 2,w 1,s 3,e 1,s 1,w 1,s 1,e 1,s 1,w 1,s 1,e 3,s 1,e 1,n 3,e 1,n 1,e 1,s 2,w 1,s 1,e 1,s 1,w 1,s 2,w 1,n 1,w 2,n 1,w 2,s 1,e 1,s 1,w 1,s 1,e 2,n 1,e 1,s 1,e 3,n 2,e 2,n 1,e 2,s 1,e 1,s 1,w 1,s 1,e 2,n 3,w 1,n 1,w 1,n 1,e 1,n 1,w 1,n 1,e 1,n 1,w 1,n 1,e 1,n 1,w 1,n 1,e 1,n 1,w 1,n 1,e 1,n 1,w 1,n 1,e 1,n 1,w 1,n 1,e 1,n 1,w 1,n 1,e 2,s 15
OPTIMUM FOUND

Models       : 99
  Optimum    : yes
Optimization : -265
Calls        : 1
Time         : 201.664s (Solving: 201.56s 1st Model: 0.00s Unsat: 0.00s)
CPU Time     : 201.602s

The final output line (above OPTIMUM FOUND) corresponds to the following path through 265 nodes:

• ┌───┐ 1 0 1 ┌─┐ ┌───────┐ 1 0 1 ┌───┐
│ │ ┌─┘ 1 0 1 │ │ └─┐ ┌─┐ │ 1 0 1 └─┐ │
│ │ └─┐ 1 0 1 │ └───┘ │ │ │ 1 0 1 ┌─┘ │
│ │ ┌─┘ 1 0 1 │ ┌─────┘ │ │ 1 0 1 └─┐ │
└─┘ └─┐ 1 0 1 │ └─┐ ┌───┘ │ 1 0 1 ┌─┘ │
┌─┐ ┌─┘ 1 0 1 │ ┌─┘ └─┐ ┌─┘ 1 0 1 └─┐ │
│ └─┘ ┌───┐ 1 │ └─┐ ┌─┘ └─┐ 1 0 1 ┌─┘ │
│ ┌─┐ │ 1 └───┘ 1 │ │ ┌───┘ 1 0 1 └─┐ │
└─┘ │ │ 1 0 1 ┌─┐ │ │ └───┐ 1 0 1 ┌─┘ │
0 1 │ │ 1 1 1 │ │ │ └─┐ ┌─┘ 1 0 1 └─┐ │
0 1 │ │ 1 1 1 │ └─┘ ┌─┘ └─┐ 1 0 1 ┌─┘ │
0 1 │ │ 1 1 1 └─┐ ┌─┘ ┌───┘ 1 0 1 └─┐ │
0 1 │ │ 1 1 1 ┌─┘ │ ┌─┘ ┌─┐ 1 0 1 ┌─┘ │
0 1 │ │ 1 1 1 └─┐ │ │ ┌─┘ │ 1 0 1 └─┐ │
0 1 │ │ 1 1 1 ┌─┘ └─┘ │ ┌─┘ 1 0 1 ┌─┘ │
0 1 │ │ 1 1 1 └─────┐ │ └─┐ 1 0 1 └─┐ •
0 1 │ │ 1 1 1 ┌───┐ └─┘ ┌─┘ 1 ┌───┐ └─┐
0 1 │ │ 1 1 1 └─┐ └───┐ │ ┌───┘ 1 └─┐ │
0 1 │ │ 1 1 1 ┌─┘ ┌─┐ └─┘ │ 1 0 1 ┌─┘ │
0 1 └─┘ 1 1 1 └───┘ └─────┘ 1 0 1 └───┘

Anders Kaseorg

Posted 2016-03-12T14:07:00.113

Reputation: 29 242