7
\$\begingroup\$

The algorithm works in steps:

  1. Produce the graph in the form of an adjacency list

    e.g. this graph

    0 - - 2
     \
     1 -- 3
    

    produces this adjacency list

    {0: [], 1: [0], 2: [0], 3: [1, 2]}
    

    0 depends on nothing, 1 depends on 0 etc..

  2. Iterate through the graph and find nodes that does not have any dependencies

Questions:

  1. What should the correct running time be for a well implemented topological sort. I am seeing different opinions:

    Wikipedia says: \$O(log^2(n)\$)

    Geeksforgeeks says: \$O(V+E)\$

  2. My implementation is running at \$O(V*E)\$. Because at worst, I will need to loop through the graph V times and each time I will need to check E items. How do I make my implementation into linear time.

def produce_graph(prerequisites):
 adj = {}
 for course in prerequisites:
 if course[0] in adj:
 # append prequisites
 adj[course[0]].append(course[1])
 else:
 adj[course[0]] = [course[1]]
 # ensure that prerequisites are also in the graph
 if course[1] not in adj:
 adj[course[1]] = []
 return adj
def toposort(graph):
 sorted_courses = []
 while graph:
 # we mark this as False
 # In acyclic graph, we should be able to resolve at least
 # one node in each cycle
 acyclic = False
 for node, predecessors in graph.items():
 # here, we check whether this node has predecessors
 # if a node has no predecessors, it is already resolved,
 # we can jump straight to adding the node into sorted
 # else, mark resolved as False
 resolved = len(predecessors) == 0
 for predecessor in predecessors:
 # this node has predecessor that is not yet resolved
 if predecessor in graph:
 resolved = False
 break
 else:
 # this particular predecessor is resolved
 resolved = True
 # all the predecessor of this node has been resolved
 # therefore this node is also resolved
 if resolved:
 # since we are able to resolve this node
 # We mark this to be acyclic
 acyclic = True
 del graph[node]
 sorted_courses.append(node)
 # if we go through the graph, and found that we could not resolve
 # any node. Then that means this graph is cyclic
 if not acyclic:
 # if not acyclic then there is no order
 # return empty list
 return []
 return sorted_courses
graph = produce_graph([[1,0],[2,0],[3,1],[3,2]])
print toposort(graph)
asked Jun 23, 2015 at 21:19
\$\endgroup\$
3
  • \$\begingroup\$ Is your indentation messed up in produce_graph? I think the second if test should be in the for loop. \$\endgroup\$ Commented Jun 24, 2015 at 6:52
  • \$\begingroup\$ Please read the whole sentence from Wikipedia: "... it can be computed in O(log^2 n) time on a parallel computer using a polynomial number O(n^k) of processors, for some constant k." \$\endgroup\$ Commented Jun 24, 2015 at 10:53
  • \$\begingroup\$ Please don't cross-post - it doesn't seem like you actually want a review in terms of the scope of this site. \$\endgroup\$ Commented Jun 24, 2015 at 11:01

1 Answer 1

2
\$\begingroup\$

Documentation

Please add documentation to tell what the function is supposed to do and what input it is expecting.

Avoid repetition

In produce_graph, you repeat course[0] many times which smells like something wrong.

It can be rewritten :

def produce_graph(prerequisites):
 adj = {}
 for course in prerequisites:
 first, second = course[0], course[1]
 if first in adj:
 # append prequisites
 adj[first].append(second)
 else:
 adj[first] = [second]
 # ensure that prerequisites are also in the graph
 if second not in adj:
 adj[second] = []
 return adj

Using the right tool

What you are doing in produce_graph can be easily achieved using the set_default function :

adj = {}
for course in prerequisites:
 first, second = course[0], course[1]
 adj.setdefault(first, []).append(second)

(One might also suggest using defaultdict, I'll let you pick your favorite).

Using the right tool is also choosing the right data structure. It would probably make sense for produce_graph to take a list of tuples. Also, if you know that each elements contains 2 elements, you can use tuple unpacking like this :

for course in prerequisites:
 first, second = course

or

for first, second in prerequisites:
 adj.setdefault(first, []).append(second)

Avoid premature optimisation/keep it simple

Regarding :

 resolved = len(predecessors) == 0
 for predecessor in predecessors:
 # this node has predecessor that is not yet resolved
 if predecessor in graph:
 resolved = False
 break
 else:
 # this particular predecessor is resolved
 resolved = True

At the beginning, you iterate resolved to True if predecessors is True, False otherwise. I think this does the same as :

 resolved = True
 for predecessor in predecessors:
 # this node has predecessor that is not yet resolved
 if predecessor in graph:
 resolved = False
 break

As resolved will be False if and only if an element verifies the property in graph.

Also, this can be rewritten using builtin all/any :

 if all(p not in graph for p in predecessors):
answered Jun 25, 2015 at 11:58
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.