The algorithm works in steps:
Produce the graph in the form of an adjacency list
e.g. this graph
0 - - 2 \ 1 -- 3
produces this adjacency list
{0: [], 1: [0], 2: [0], 3: [1, 2]}
0 depends on nothing, 1 depends on 0 etc..
Iterate through the graph and find nodes that does not have any dependencies
Questions:
What should the correct running time be for a well implemented topological sort. I am seeing different opinions:
My implementation is running at \$O(V*E)\$. Because at worst, I will need to loop through the graph V times and each time I will need to check E items. How do I make my implementation into linear time.
def produce_graph(prerequisites):
adj = {}
for course in prerequisites:
if course[0] in adj:
# append prequisites
adj[course[0]].append(course[1])
else:
adj[course[0]] = [course[1]]
# ensure that prerequisites are also in the graph
if course[1] not in adj:
adj[course[1]] = []
return adj
def toposort(graph):
sorted_courses = []
while graph:
# we mark this as False
# In acyclic graph, we should be able to resolve at least
# one node in each cycle
acyclic = False
for node, predecessors in graph.items():
# here, we check whether this node has predecessors
# if a node has no predecessors, it is already resolved,
# we can jump straight to adding the node into sorted
# else, mark resolved as False
resolved = len(predecessors) == 0
for predecessor in predecessors:
# this node has predecessor that is not yet resolved
if predecessor in graph:
resolved = False
break
else:
# this particular predecessor is resolved
resolved = True
# all the predecessor of this node has been resolved
# therefore this node is also resolved
if resolved:
# since we are able to resolve this node
# We mark this to be acyclic
acyclic = True
del graph[node]
sorted_courses.append(node)
# if we go through the graph, and found that we could not resolve
# any node. Then that means this graph is cyclic
if not acyclic:
# if not acyclic then there is no order
# return empty list
return []
return sorted_courses
graph = produce_graph([[1,0],[2,0],[3,1],[3,2]])
print toposort(graph)
1 Answer 1
Documentation
Please add documentation to tell what the function is supposed to do and what input it is expecting.
Avoid repetition
In produce_graph
, you repeat course[0]
many times which smells like something wrong.
It can be rewritten :
def produce_graph(prerequisites):
adj = {}
for course in prerequisites:
first, second = course[0], course[1]
if first in adj:
# append prequisites
adj[first].append(second)
else:
adj[first] = [second]
# ensure that prerequisites are also in the graph
if second not in adj:
adj[second] = []
return adj
Using the right tool
What you are doing in produce_graph
can be easily achieved using the set_default
function :
adj = {}
for course in prerequisites:
first, second = course[0], course[1]
adj.setdefault(first, []).append(second)
(One might also suggest using defaultdict
, I'll let you pick your favorite).
Using the right tool is also choosing the right data structure. It would probably make sense for produce_graph
to take a list of tuples. Also, if you know that each elements contains 2 elements, you can use tuple unpacking like this :
for course in prerequisites:
first, second = course
or
for first, second in prerequisites:
adj.setdefault(first, []).append(second)
Avoid premature optimisation/keep it simple
Regarding :
resolved = len(predecessors) == 0
for predecessor in predecessors:
# this node has predecessor that is not yet resolved
if predecessor in graph:
resolved = False
break
else:
# this particular predecessor is resolved
resolved = True
At the beginning, you iterate resolved to True if predecessors
is True, False otherwise. I think this does the same as :
resolved = True
for predecessor in predecessors:
# this node has predecessor that is not yet resolved
if predecessor in graph:
resolved = False
break
As resolved will be False if and only if an element verifies the property in graph
.
Also, this can be rewritten using builtin all/any :
if all(p not in graph for p in predecessors):
produce_graph
? I think the secondif
test should be in thefor
loop. \$\endgroup\$