Any pointers, or any advice on how to clean up my code/make more pythonic would be appreciated.
graph = {'A': ['B','Y','D', 'E',],
'B': ['F','F'],
'C': ['F'],
'D': [],
'E': ['F','G'],
'F': ['F','A'],
'G': ['E','K'],
'K': ['M','L']
}
def bfs(graph,start,end):
vertex = [start]
history = []
# create new node to be iterated through
# update history
# update vertex
while vertex: # while vertex is not empty. len(li) == 0:
node = vertex[0] # get the 0th element of the vertex
history.append(node)
print(node)
vertex.pop(0) # pop the 0th element of the vertex
if node == end:
return end
# iterate through the graph. gather all of the values per node
for i in graph.get(node, '/'): # if key does not exist, print '/'
if i: #if i is not empty
vertex.append(i)
# append the dict values into vertex list
print('vertex',vertex)
if i in history:
vertex.pop(-1)
2 Answers 2
1. Bug
A correct implementation of breadth-first search visits each node at most once, but the code in the post visits some nodes multiple times. In this example, \$C\$ gets visited twice:
>>> graph = {'A': 'BC', 'B': 'C', 'C': 'A'}
>>> bfs(graph, 'A', 'Z')
A
vertex ['B']
vertex ['B', 'C']
B
vertex ['C', 'C']
C
vertex ['C', 'A']
C
vertex ['A']
The problem is that a node is only added to the history
list after you pop it from the vertex
list. But that means that if there are multiple paths to a node, then it can be added to the vertex
list multiple times (as in the example above, where node \$C\$ gets added twice).
Instead, a node should be added to the history
list at the same time as you add it to the vertex
list.
2. Review
The name
queue
would be better thanvertex
(it's a queue of nodes that have been visited in the search but whose neighbours have not yet been considered for visiting).The queue is implemented using a list. But this is inefficient because popping the leftmost element of a list takes time proportional to the length of the list (because the other items in the list get moved down in the list to fill the space). It is better to use a
collections.deque
, which has an efficientpopleft
method.The name
visited
would be better thanhistory
(it's a collection of nodes that have been visited in the search).The collection of visited nodes is implemented using a list. But this is inefficient because finding an element in a list takes time proportional to the length of the list (because the element gets compared against each member of the list in turn). It is better to use a
set
, which has efficient membership determination.The name
neighbour
would be better thani
(it's a neighbour ofnode
in the graph).It's pointless to add a node to the queue and then immediately remove it again:
if i: #if i is not empty vertex.append(i) if i in history: vertex.pop(-1)
Instead of testing for
node == end
after popping the node from the queue, it would be better to test before adding it to the queue. That way you wouldn't have to wait for the end node to move all the way up the queue.
3. Revised code
from collections import deque
def bfs(graph, start, end):
"""Return True if there is a path from start to end in graph, False if
not. graph must be a dictionary mapping a node to a collection of its
neighbours.
"""
if start == end:
return True
queue = deque([start])
visited = set(queue)
while queue:
node = queue.popleft()
for neighbour in graph.get(node, ()):
if neighbour not in visited:
if neighbour == end:
return True
queue.append(neighbour)
visited.add(neighbour)
return False
First off, I do not have an interpreter to test the validity of code, these are only "pythonic" comments: Instead of:
node = vertex[0] # get the 0th element of the vertex
history.append(node)
print(node)
vertex.pop(0) # pop the 0th element of the vertex
Do this:
node = vertex.pop(0) #.pop() returns the value popped.
history.append(node)
print(node)
I don't think you need to check if i exists, get() (I think) the way you wrote it will either return the value you want or "/". Therefore, checking if it exists is redundant, so you can probably get rid of that. Appending i to vertex and then maybe getting rid of it, seems unnecessary. Unless you really want to print the array like an array, then I would put the check before appending:
if not i in history:
vertex.append(i)
print('vertex: ", vertex ", ", i)
Those are my only comments about the pythonic-ness.