def dfs(graph, node):
"""Run DFS through the graph from the starting
node and return the nodes in order of finishing time.
"""
stack = [[node, True]]
while True in [x[1] for x in stack]:
i = 0
for x in xrange(len(stack)):
if stack[x][1] == True:
i = x
break
stack[i][1] = False
stack = [[x, True] for x in graph[stack[i][0]] \
if (([x, True] not in stack) and ([x, False] not in stack))] + stack
return [x[0] for x in stack]
Input: Graph as adjacency list Eg: {1:[2, 3], 2:[1], 3:[]}
Node is an integer value from where depth first search begins Eg: 1
Output: The list of nodes in order of their finishing times.
Eg: [3, 2, 1]
For large graphs, the above code is taking too long to execute. Is there a better way to do this?
Edit: I could do better time by implementing the following code. Is it possible to do even better?
def dfs(graph, node):
"""Run DFS through the graph from the starting
node and return the nodes in order of finishing time.
"""
seen = defaultdict(bool)
stack = [node]
while True:
flag = True
for x in stack:
if not seen[x]:
stack = [a for a in graph[x] if not seen[a]] + stack
seen[x] = True
flag = False
break
if flag:
break
return stack
Edit: I've changed seen and stack variables to sets. But I'm not getting the required performance boost. What am I doing wrong?
def dfs(graph, node):
"""Run DFS through the graph from the starting
node and return the nodes in order of finishing time.
"""
seen = set()
stack = {node}
while True:
flag = True
for x in stack:
if x not in seen:
stack = stack.union(set(graph[x]))
seen = seen.union({x})
flag = False
break
if flag:
break
return list(stack)
-
\$\begingroup\$ This does not do depth-first search, it does breadth-first search. Which do you want? I'll write an answer later today when I have time. Btw: The third function doesn't work because sets aren't ordered (unless all you want to do is to find all nodes that are reachable from the starting node). \$\endgroup\$flornquake– flornquake2013年08月06日 12:17:59 +00:00Commented Aug 6, 2013 at 12:17
-
\$\begingroup\$ Winston has already answered it better than I could, so nevermind. \$\endgroup\$flornquake– flornquake2013年08月07日 06:18:07 +00:00Commented Aug 7, 2013 at 6:18
2 Answers 2
Some hints.
Set type can be used for seen variable.
That stack = [...] + stack is a monster, both + operation and list comprehension.
Is it possible to use set there (what type graph[x] is?). Maybe, its more efficient to have stack in a list of lists form?
UPDATE: loop over the whole stack again and again can be another point of optimization.
UPDATE2: in case you have not solved the problem, one possible way is to eliminate the inefficient filtering of the stack against seen values. This can be achieved by having a set of not yet visited nodes and using set operations to get the next level from the current level. This is why I've advised to keep levels in the stack (list of sets - as particular order inside a level is not important, is it?). Operating on relatively small number of nodes in the latest level is more efficient than looping over the whole stack. Actually, already seen nodes can also be subtracted from "next level", so if the speed is ok, set of unvisited nodes can be redundant.
UPDATE3:
I think, the following produces result, which is stated in the docstring. Finishing time is a time when the node has no more children to process and is popped from the stack. Reverse order can also be easily achieved by changing .append(x) to insert(0, x).
def dfs(graph, node):
"""Run DFS through the graph from the starting
node and return the nodes in order of finishing time.
"""
seen = {node}
stack = [node]
result = []
while stack:
x = stack[-1]
nxt = set(graph.get(x, [])) - seen
if nxt:
stack.extend(list(nxt))
seen |= nxt
else:
result.append(x)
stack.pop()
return result
Remark: the algorithm does not preserve the order of children. Minor modification may be needed to preserve the order of graph.get(x, []), in if nxt:
stack.extend(nx for nx in graph[x] if nx in nxt)
-
\$\begingroup\$ I tried using sets, I've edited the question description to include the edited code. The running time is not very different from my second attempt. Can you tell me where I'm going wrong? \$\endgroup\$vaishaks– vaishaks2013年08月06日 10:52:30 +00:00Commented Aug 6, 2013 at 10:52
-
\$\begingroup\$ @Roman The type of
graph[x]
is a list: (quoting vaishaks) "Input: Graph as adjacency list Eg:{1:[2, 3], 2:[1], 3:[]}
" \$\endgroup\$flornquake– flornquake2013年08月06日 12:20:14 +00:00Commented Aug 6, 2013 at 12:20 -
\$\begingroup\$ Those can be easily converted to sets, of course. (if the order of traversal is not important) \$\endgroup\$Roman Susi– Roman Susi2013年08月06日 15:19:43 +00:00Commented Aug 6, 2013 at 15:19
-
\$\begingroup\$ @Roman Update3 is achieving exactly what I needed and it's running in good time. Thanks a lot Roman! \$\endgroup\$vaishaks– vaishaks2013年08月09日 07:23:11 +00:00Commented Aug 9, 2013 at 7:23
def dfs(graph, node):
This isn't a depth first search
"""Run DFS through the graph from the starting
node and return the nodes in order of finishing time.
"""
seen = set()
stack = {node}
You aren't using this as a stack, don't call it that
while True:
flag = True
avoid flag variables, they confound code. Also that's a very undescriptive name.
for x in stack:
if x not in seen:
Seen these are sets, you should be probably use: for x in stack - seen:
which should be more efficient
stack = stack.union(set(graph[x]))
Using this approach will make things rather slow, you just keep making stack bigger and bigger.
seen = seen.union({x})
Use seen.add to add things to the set. Using union like this will be much slower
flag = False
break
if flag:
break
return list(stack)
The set gives you no guarantees as to the order of the elements. I'm guessing that's now what you wanted. As it stands,this will end up giving you all the elements in your graph in a random order.
My approach
def breadth_first_search(graph, start):
seen = set()
current = {start}
history = []
while current:
history.extend(current) # record when we found all these elements
seen.update(current) # add them to the list of elements we've already seen
neighbours = set() # generate a set of all elements that we can reach from these ones
for node in current:
neighbours.update(graph[node])
current = neighbours - seen # the next round of elements is this one, except those we've already seen
return history
-
\$\begingroup\$ That is a brilliant answer Winston. I learnt a lot from that. But my application requires me to return the nodes in order of finishing times the way dfs would finish. I agree my iterative code looks a lot like bfs. And your code works perfectly well if I needed the nodes to be returned in order of finishing times the way bfs would. But the code I wrote gives me the right answer for small graphs. It's not at all random(because of using sets) as one would expect it to be. I find that very weird now that I know sets are un-ordered. Can you please tell me how you would do it for dfs? \$\endgroup\$vaishaks– vaishaks2013年08月07日 14:09:06 +00:00Commented Aug 7, 2013 at 14:09
-
\$\begingroup\$ @vaishaks, your code working for small graphs is greatly mysterious. Any chance you can get me the data you ran it against? \$\endgroup\$Winston Ewert– Winston Ewert2013年08月07日 17:13:39 +00:00Commented Aug 7, 2013 at 17:13
-
\$\begingroup\$ Sorry the output of the set code makes no sense actually. My bad. I've now reverted to function 2. The one with the defaultdict. Graph: {1: [3], 2: [1], 3: [2], 4: [3, 5, 6], 6: [8], 7: [6], 8: [7]} Start: 8, Your approach Output: [8, 7, 6] Required output: [6, 7, 8]. Start: 4, Your approach Output: [4, 3, 5, 6, 8, 2, 1, 7], Required output: [7, 8, 1, 2, 3, 5, 6, 4] \$\endgroup\$vaishaks– vaishaks2013年08月07日 18:20:11 +00:00Commented Aug 7, 2013 at 18:20
-
\$\begingroup\$ @vaishaks, ok. why do you need this in dfs order? \$\endgroup\$Winston Ewert– Winston Ewert2013年08月07日 20:36:29 +00:00Commented Aug 7, 2013 at 20:36
-
\$\begingroup\$ @vaishaks: You're right that it's not bfs (I was mistaken earlier), but it's not dfs either. (This may or may not be what you want.) E.g., in your last example, dfs would start at 4, go to one of the successors of 4, (let's say 3). Then dfs would take one of the children of 3, and so on. The other successor of 4 would only be traversed later. Also, is it important that the output be in "reversed" order (starting point at the end)? \$\endgroup\$flornquake– flornquake2013年08月08日 07:18:24 +00:00Commented Aug 8, 2013 at 7:18
Explore related questions
See similar questions with these tags.