Problem

Part of my program is a variable-sized set of Star Systems randomly linked by Warp Points. I have an A* algorithm working rather well in the grid, but the random warp point links mean that even though the systems have X,Y coordinates for where they’re located on a galactic map, a system at 2,3 isn’t always linked directly to a system at 2,4 and so the shortest path may actually lead away from the target before it heads back towards it. I think this limitation eliminates A* since there’s almost no way to get a good heuristic figured out.

What I’ve done instead is a recursive node search (I believe this specific pattern is a Depth-First Search), and while it gets the job done, it also evaluates every possible path in the entire network of systems and warp points, so I’m worried it will run very slowly on larger sets of systems. My test data is 11 systems with 1-4 warp points each, and it averages over 700 node recursions for any non-adjacent path.

My knowledge of search/pathfinding algorithms is limited, but surely there’s a way to not search *every single node* without needing to calculate a heuristic, or at least is there a heuristic here I’m not seeing?

Here’s my code so far:

```
private int getNextSystem(StarSystem currentSystem, StarSystem targetSystem,
List<StarSystem> pathVisited)
{
// If we're in the target system, stop recursion and
// start counting backwards for comparison to other paths
if (currentSystem == targetSystem)
return 0;
// Arbitrary number higher than maximum count of StarSystems
int countOfJumps = 99;
StarSystem bestSystem = currentSystem;
foreach (StarSystem system in currentSystem.GetConnectedStarSystems()
.Where(f=>!pathVisited.Contains(f)))
{
// I re-create the path list for each node-path so
// that it doesn't modify the source list by reference
// and mess up other node-paths
List<StarSystem> newPath = new List<StarSystem>();
foreach (StarSystem s in pathVisited)
newPath.Add(s);
newPath.Add(system);
// recursive call until current == target
int jumps = getNextSystem(system, targetSystem, newPath);
// changes only if this is better than previously found
if (jumps < countOfJumps)
{
countOfJumps = jumps;
bestSystem = system;
}
}
// returns 100 if current path is a dead-end
return countOfJumps + 1;
}
```

Solution

## Complete and Incomplete Algorithms

Search algorithms can be classed into two categories: *complete* and *incomplete*.

A *complete* algorithm will always succeed in finding what your searching for. And not surprisingly an *incomplete* algorithm may not always find your target node.

For arbitrary connected graphs, without any a priori knowledge of the graph topology a *complete* algorithm may be forced to visit all nodes. But **may** find your sought for node before visiting all nodes. A* is a complete best-first method with a heuristic to try to avoid searching unlikely parts of the graph unless absolutely necessary.

So unfortunately you can not guarantee that you will never visit all nodes whatever algorithm you choose. But you can reduce the likelihood of that happening.

## Without pre-processing

If you cannot consider pre-processing your graph then you’re stuck with a myriad of on-line algorithms such as depth-first, breadth-first, A* and greedy-best-first. Out of the bunch I’d bet on A* in most cases if the heuristic is even half good and the graphs are non-trivial.

If you expect all routes to be short, a breadth-first algorithm with cycle detection and duplicate node removal may outperform A* with a poor heuristic. I wouldn’t bet on it though, you need to evaluate.

## With pre-processing

In your case I’d see if I could pre-process the graph, even if you need to repeatedly re-do the pre-processing when the graph changes as long as you do sufficiently many searches between pre-processing it would be worth it.

You should look up Floyd-Warshall (or some derivative) and calculate the pairwise cost/distance/jumps between all nodes and use this table as a heuristic for your A*. This heuristic will not only be admissible, it will be exact and your A* search will complete in no-time.

Unless you of course modify the algorithm to store all pairwise routes as they are detected in a matrix of vectors, then you have O(1) run time at the cost of memory.

After typing up my problem, I realized I hadn’t tried implementing a Breadth-First Search (BFS) algorithm for the same problem as my Depth-First Search (DFS). I did some additional research, and I think BFS code will result in quicker results on average^{1}, since it terminates and returns as soon as it has an answer instead of continuing to evaluate all further possible solutions.

^{1} – I’d love to know if there’s mathematical logic to BFS vs DFS efficiency. I think it’s related to the number of connections per node versus the number of nodes, but I’m not sure. If there’s a logical way to determine when to switch between the two implementations, I’d love to hear about it.

Here’s the BFS code I came up with:

```
public StarSystem GetNextSystem(StarSystem startingSystem, StarSystem targetSystem)
{
// Players might call for path to current system
if (startingSystem == targetSystem)
return startingSystem;
// Queue needs to store path-thus-far and current system
Queue<Tuple<List<StarSystem>, StarSystem>> StarSystemQueue =
new Queue<Tuple<List<StarSystem>, StarSystem>>();
// Need to track visited systems to prevent infinite loops
List<StarSystem> visitedSystems = new List<StarSystem>();
// Starting system is already visited
visitedSystems.Add(startingSystem);
// For connected systems that we haven't already visited
foreach (StarSystem system in startingSystem.GetConnectedStarSystems()
.Where(f => !visitedSystems.Contains(f)))
{
List<StarSystem> pathList = new List<StarSystem>();
pathList.Add(system);
// Add to visited systems so it's not evaluated in the loop
visitedSystems.Add(system);
// Enqueue the path & system
StarSystemQueue.Enqueue(
new Tuple<List<StarSystem>, StarSystem>(pathList, system));
}
// Loop til there's an answer or all paths are exausted
while(StarSystemQueue.Count>0)
{
// Grab current from the queue
Tuple<List<StarSystem>,StarSystem> currentSystem = StarSystemQueue.Dequeue();
// If current is the target, return the first system from the path
if (currentSystem.Item2 == targetSystem)
return currentSystem.Item1.First();
// For connected systems that we haven't already visited
foreach (StarSystem system in currentSystem.Item2.GetConnectedStarSystems()
.Where(f => !visitedSystems.Contains(f)))
{
// rebuild path list to prevent changing other paths by reference
List<StarSystem> pathList = new List<StarSystem>();
foreach (var previous in currentSystem.Item1)
pathList.Add(previous);
pathList.Add(system); // add new system to path
visitedSystems.Add(system); // add new system to visited
// Enqueue the path & system
StarSystemQueue.Enqueue(
new Tuple<List<StarSystem>, StarSystem>(pathList, system));
}
}
// No valid answer at this point, return starting system and handle in outer code
return startingSystem;
}
```

I’m not entirely sure about the efficiency of this, but there is a mathematical way of looking at this.

Create a n x n matrix of your systems, for example

```
a b c d e
-----------
a| 0 1 1 0 0
b| 1 0 0 1 0
c| 1 0 0 0 0
d| 0 1 0 0 1
e| 0 0 0 1 0
```

We will call this matrix `Paths`

.

This matrix is for a graph with the **nodes** a–e (your star systems). And the **edges** (paths) as follows: a->b, a->c, b->a, b->d, c->a, d->b, d->e, e->d.

Now, let’s say that we want to find the shortest path from a to e. We check **row** a and **column** e and we find that no, there is no direct path.

To find the paths of length two, you can multiply the Matrix with itself. So, Paths * Paths results in:

```
2 0 0 1 0
0 2 1 0 1
0 1 1 0 0
1 0 0 2 0
0 1 0 0 1
```

Let’s call this Matrix `PathsSize2`

This gives us all paths of length two. We see that the paths **from** node a is `2 0 0 1 0`

. So that’s two paths to itself and one way to `d`

. But still no paths to b, c, or e. So we’ll multiply `PathsSize2`

with the previous matrix, `Paths`

:

```
0 3 2 0 1
3 0 0 3 0
2 0 0 1 0
0 3 1 0 2
1 0 0 2 0
```

And so we see that we have a path from `a`

to `e`

! You should be able to find out *which* path that is from the matrices that you have calculated above. (I cannot remember myself exactly at the moment). I believe this method which I have partially described here is the Floyd-Warshall algorithm that Emily L. has also mentioned in her answer.

Instead of returning `countOfJumps + 1`

for a dead end, you should return something like a `-1`

, this way it will never change no matter how many `systems`

you add to the code system, in other words it would be more maintainable.

maybe that isn’t the best way to do it.

I am thinking that you should have a logical flow for if the first jump leads to a dead end or if any number of jumps leads to a dead end.

you should probably wrap your for each inside of an if statement as well. something like:

```
if (currentSystem.GetConnectedStarSystems().Where(f=>!pathVisited.Contains(f))) {
Return countOfJumps;
} else if {
foreach (StarSystem system in currentSystem.GetConnectedStarSystems()) {
.....
}
}
```

I think that fits with what you have going on there, doing this might actually reduce redundant code, unfortunately I am not entirely sure what all is going on here, I don’t have much experience with `A*`

algorithms.

I know that separating these two pieces of logic is logical, what you have looks like you tried to smoosh an if and a foreach together. I don’t want to even setup a foreach if this system is the target system.

you might also want to comment that in the code as well.

It looks like you might be able to change something here:

```
foreach (StarSystem s in pathVisited)
newPath.Add(s);
newPath.Add(system);
```

The efficiency of heuristic pathfinding algorithms comes from the fact that you are able to discard earlier (or that you work more directed).

Discarding any heuristic means the best you can get is a lucky hit (which is rather unlikely) or the Dijkstra algorithm, which is not that efficient (compared to some decent heuristic).

So I think the way to go is a heuristic. As you mentioned the pure euclideon distance is not a good measure as it steers away from the warp holes.

My idea would be to build a coarser representation of the map (maybe nodes = clusters of systems …) and for this new map you store the shortest distances between the clusters. On the finer level you store the nearest point in the cluster to each other cluster (lets call them cluster transfer points or CTP) and your heuristic uses the minimal cluster distance plus the distance to the CTP in the current cluster + the distance from the target clusters CTP to the target system. This hierarchical structure can be build up for many levels to handle big systems.

The two important parts are:

– Complex calculation stays local in the current detail level while the higher levels are used to give you a heuristic

– don’t forget to ensure that the heuristic is admissible (that could get a bit harder)

I cannot evaluate your code very much as I am no C# expert but it makes sense to me as far as I can read it.