# Finding the longest path, avoiding obstacles in a 2D plane

Posted on

Problem

## The Scenario

You are given a matrix of size m x n (width x height) with m*n spots where there are a few obstacles. Spots with obstacles are marked as 1, and those without are marked as 0. You can move vertically or horizontally, but can visit each spot only once. The goal is to cover as much area/spots as possible, avoiding obstacles.

## Input

A 2D Matrix consisting of 0s and 1s and the coordinates of starting point.

Example Matrix:

``````0 1 0
0 0 0
1 0 0
``````

Starting Point (0,0).
The coordinates are nothing but array indices.

## Output

Output should be a line containing comma separated moves. A move is a space separated direction (n,s,e,w) and number of units of movement in the direction (range 1-1000).

Example output for the above matrix would be,

``````s 1,e 1,s 1,e 1,n 2

(0,0) --> (1,0) --> (1,1) --> (2,1) --> (2,2) --> (1,2) --> (0,2)
``````

which is the longest path, visiting each coordinate only once and avoiding those coordinates with obstacles.

## My algorithm

for every vertex/coordinate, recursively choose the neighbor which
yields the longest path

## Code

``````def getDirections(p1, p2):
return {(1,0):"n", (-1,0):"s", (0,1):"w", (0,-1):"e"}[(p1-p2, p1-p2)]

def findPath(start, mat):
m, paths = getMax(mat, getNeighbours(mat), start, list())
print(paths)
if len(paths) == 1:
return ""
directions = ""
prev = getDirections(paths, paths)
current = ""
ctr = 1
for i in range(1, len(paths)-1):
current = getDirections(paths[i], paths[i+1])
if current == prev:
ctr += 1
else:
directions, prev, ctr = directions + prev + " " + str(ctr) + ",", current, 1
return directions + current+ " " +str(ctr)

#Neighbor is (-1,-1) if indices are out of range. For top and bottom rows for example.
def getNeighbours(mat):
neighbours = dict()
for i in range(len(mat)):
for j in range(len(mat[i])):
n = (i-1, j) if i-1 >= 0 else (-1, -1)
s = (i+1, j) if i+1 < len(mat) else (-1, -1)
w = (i, j-1) if j-1 >= 0 else (-1, -1)
e = (i, j+1) if j+1 < len(mat[i]) else (-1, -1)
neighbours[(i,j)] = {'n':n, 's':s, 'w':w, 'e':e}
return neighbours

def getMax(mat, neighbours, coordinates, visited):
if coordinates == (-1,-1):
return -1, []
elif mat[coordinates][coordinates] == 1:
return -1, []
elif coordinates in visited:
return -1, []
else:
visited.append(coordinates)
n, nlist = getMax(mat, neighbours, neighbours[coordinates]['n'], visited[:])
s, slist = getMax(mat, neighbours, neighbours[coordinates]['s'], visited[:])
w, wlist = getMax(mat, neighbours, neighbours[coordinates]['w'], visited[:])
e, elist = getMax(mat, neighbours, neighbours[coordinates]['e'], visited[:])
if max(n,s,w,e) == -1:
return 0, visited
if max(n,s,w,e) == n:
visited = nlist
elif max(n,s,w,e) == s:
visited = slist
elif max(n,s,w,e) == w:
visited = wlist
elif max(n,s,w,e) == e:
visited = elist
return 1+max(n,s,w,e), visited
``````

Solution

Well, first of all (as I said in my earlier comment); “Extended algorithmic discussion may be better on Programmers.SE, however performance improvement is definitely on-topic here!“. So, let me start with (1) a quick review of some more basic things, then maybe give you (2) some advice about your approach and (3) the performance of your code.

## 1.1) Avoid Magical Complicated One-Liners

The very first thing I see in your code (really, before anything else), is the second line (the implementation of `getDirections`):

``````return {(1,0):"n", (-1,0):"s", (0,1):"w", (0,-1):"e"}[(p1-p2, p1-p2)]
``````

While Python is a particularly powerful language when it comes to doing a LOT of things in one line… This usually (almost always) is not the best idea. Both for readability, as well as for maintainability. After all, this isn’t Code Golf.

Additionally, you are also constructing the dictionary in each call – this is certainly not best practise performance wise (although in this case Python has probably optimized this for you).
But in general, if you have a lookup data structure that will not change throughout your program, you would make it a global variable, or class variable if you were using OOP.

## 1.2) Function Naming, and other Coding Standards

Taken from PEP8 section on Function Names:

Function names should be lowercase, with words separated by underscores as necessary to improve readability.
mixedCase is allowed only in contexts where that’s already the prevailing style (e.g. threading.py), to retain backwards compatibility

Your function names resemble Java methods, rather than Python functions. This strikes me as particularly unusual since you seem to be coding in a procedural style. (Often Java programmers will name things in a Java way, but they also use OO).

Rather than:

• `getDirections`
• `findPath`
• `getNeighbours`
• `getMax`

You should use a more Pythonic naming scheme, like this:

• `path_direction`
• `find_path`
• `cell_neighbours`
• `max_path_length`

## 1.3) Argument and Variable Names – Meaning is Important

In a lot (almost all) of your functions, I see some argument names like `p1`, or `mat`. As well as variable names like `n`, `s`, `i`, `j`, and `wlist`, etc.

Certainly, at a minimum typing `matrix` instead of `mat` is MUCH better. `p1` and `p2` can be figured to mean `path1` and `path2`, but I feel the latter is much more expressive and clear.
When it comes to iteration variables like `i` and `j`, you are okay using things like this. But it is better to restructure you loops such that you can do things like `for row in matrix:`, rather than `for i in range(len(mat)):`.

## 2.1) The Idea

As I eluded to in my earlier comment; “have you looked into any of the Graph Search Algorithms, such as Dijkstra and A-star (A)?*” – I (personally) think they key to this answer lays in one or another graph search algorithm.

## 2.2) Transformation

Let’s take the example 3×3 matrix from your question:

``````0 1 0
0 0 0
1 0 0
``````

Now we want to represent each of these 9 matrix cells as a node in a graph. Given that `0` represents an empty node, and `1` is a blocked node, such a representation may look like the following: Note that as the cell at row `0`, column `0` is always the start node, we can label it as such. This is important as most graph search algorithms will require a starting node to work from. My apologies for the lack of planarity, however a matrix does not lend itself easily to having no cross-overs for 15 vertices!

Note, giving vertices from the blocked nodes back to the end node is (probably, depending on which exact algorithm you use) optional. I’ve included it here for completeness, but you may very well decide for performance reasons to just ignore blocked nodes all together.

## 2.3) Weighting the Graph

The key to this approach, is the different colours links (vertices) in the graph above.
Imagine if each green vertex has a zero (or negative) weight, and each red vertex had some positive weight (say `1`).
then an algorithm seeking to find the “best” (*i.e. optimum, in terms of cost) path would naturally visit the most nodes (via green links).

Taking the path produced in your question as an example:

``````s 1,e 1,s 1,e 1,n 2
``````

This would represent a path with a cost of `-1 + -1 + -1 + -1 + -2 + 1= -5`.
Which is cheaper (in terms of aggregated weight cost), than the sub-optimal path:

`````` s 1,e 2,n 1
``````

Which would have the cost: `-1 + -2 + -1 + 1 = -3`!

## 2.4) Finding the Algorithm

Obviously, there are many choices of Graph Search algorithms out there. And going through and examining each one is certainly not within the scope of an answer on Rode Review. But there is plenty of information on Google/Wikipedia regarding these.

Essentially, you’ll want an algorithm that can avoid cycles (i.e. only visits nodes once), find the “best” (or shortest, or lowest score, etc) path – and (most importantly) can do this in a reasonable time.
Algorithms such as DFS come to mind, although you’ll no doubt need to alter most implementations for your purposes.

## 3.1) Data Structures

The main reason for your codes slow performance is that you appear to not be using the correct data structures for your problem space.

Instead of using lists and tuples for everything, I suggest you take a look at something like Graph-tool and use graph based data structures.

## 3.2) Use Existing Algorithms

As I mentioned in §2 (A Graphical Approach) above, your best bet is to find an existing algorithm to accomplish this task (i.e. change the data structures you’re working with to fit an algorithm rather than the other way around as you’re currently doing).