Assume directed graph with each edge being weighted by some positive float
.
Problem: find all simple paths (in other words,
[0, 1, 2, 0]
is simple while[0, 1, 0, 1, 2, 0]
is not due to internal[0, 1, 0]
cycle) with length<= cutoff
and "push" them usingpusher
callback.
P.S. There exists some realtime engine (abstracted out by pusher
) that does such traversal frequently and algorithm should be optimized to be as fast as possible.
from collections import deque
def algorithm(G, source_node, target_node, pusher, cutoff=5):
queue = deque([[source_node]] if source_node in G else deque())
while len(queue):
path = queue.popleft()
if len(path) == cutoff: # no sense to keep this path, it not gonna fit the limit
continue
adjacency_nodes = G[path[-1]] # all the neighbours from the latest node in this path
if not len(adjacency_nodes): # deadend reached: you can't get any further from this node and it doesn't fit the requirement, so just forgetting it
continue
for adjacency_node in adjacency_nodes:
extended_path = path.copy() + [adjacency_node] # might some smart linked list do any better here?
if adjacency_node == target_node:
pusher(extended_path) # wow! found one! does not make sense to keep it, since any other path from here will introduce an internal cycle, right?
else:
queue.append(extended_path) # stay calm and keep trying
-
1\$\begingroup\$ Why did you write this code? What is it being used for? What would you like out of a review? \$\endgroup\$Reinderien– Reinderien2019年09月15日 13:35:04 +00:00Commented Sep 15, 2019 at 13:35
-
\$\begingroup\$ @Reinderien, I'd like community to review it in terms of "Python ideomatic code" (which I am not good with) and, more importantly, algorithm preformance. For example, knowledgeble people might suggest a better datasctructure or whatever. \$\endgroup\$Zazaeil– Zazaeil2019年09月15日 13:36:56 +00:00Commented Sep 15, 2019 at 13:36
1 Answer 1
An empty container like a list or deque is False, so it isn't necessary to use len()
on them. This is more "pythonic"
while queue:
...
if not adjacency_nodes:
...
It seems wasteful to add a path to the queue only to discard it because it is too long. It would be better to check the length before making a copy of the path and adding it to the queue. Moreover, if len(path) == is cutoff - 1
the only possible solution is if target_node
is in adjacency_nodes
. So, something like this might be faster:
from collections import deque
def algorithm(G, source_node, target_node, pusher, cutoff=5):
if source_node not in G:
return
queue = deque([[source_node]])
while queue:
path = queue.popleft()
adjacency_nodes = G[path[-1]]
if not adjacency_nodes:
continue
if len(path) == cutoff - 1:
if target_node in adjacency_nodes:
pusher(path[:] + [target_node])
elif len(path) < cutoff - 1:
queue.extend(path[:] + [node] for node in adjacency_nodes)
If you are trying to optimize for speed, use the profiler in the standard library to see where the algorithm is spending time.
The algorithm is basically a breadth first search of the graph. Because there is a depth cutoff, a depth first search might be faster because there isn't the overhead of copying the path and keeping the queue of paths to search. But the only way to know is to implement it and see.
Lastly, the networkx library provides all_simple_paths(G, source, target, cutoff=None)
which is a generator of all simple paths from source
to target
with a maximum length cuttoff
. FYI, networkx uses a DFS search (source code).