1
\$\begingroup\$

As I was coding this algorithm I found that some variables that are meant to be global behave as they are intended to behave, but others do not. Specifically, my boolean arrays, e.g. processed and discovered, can be modified within any method without the need to declare them.

On the other hand, I had trouble with the variable time. Python complained about incrementing a variable without it being declared. The solution to this was to specifically declare time as a Global, and the complaints went away.

Can someone, in a clear way that makes sense, explain the why time needs to be declared Global and the other global variables do not? What is the difference between time and the other variables?

One other thing. Can someone check the correctness of the way I'm keeping track of time of entry and time of exit? According to The Algorithm Design Manual by Skiena, half the difference between time of entry and time of exit for a vertex equals the number of descendants. But I don't think that's what I'm seeing.

# Global variables
MAXV = 7
processed = [None]*MAXV # boolean list of which vertices have been processed
discovered = [None]*MAXV # boolean list of which vertices have been found
parent = [None]*MAXV # integer list of discovery relation
finished = False
entry_time = [None]*MAXV
exit_time = [None]*MAXV
time = 0
class Edgenode:
 def __init__(self, y = None, weight = None, next = None):
 """
 @param y: adjacency info
 @param weight: edge weight, if any
 @param next: next edge in list
 """
 self.y = y
 self.weight = weight
 self.next = next
class Graph:
 def __init__(self, edges = [], degree = [], nvertices = None, \
 nedges = None, directed = None):
 """
 @param edges: adjency info
 @param degree: outdegree of each vertex
 @param nvertices: number of vertices in graph
 @param nedges: number of edges in graph
 @param directed: is the graph directed? (boolean)
 """
 self.edges = edges # this might be a dictionary, with key being vertex and values being edges
 self.degree = degree
 self.nvertices = nvertices
 self.nedges = nedges
 self.directed = directed
def initialize_graph(graph, directed):
 """
 @param graph: Graph object
 @param directed: boolean
 """
 graph.nvertices = 0
 graph.nedges = 0
 graph.directed = directed
 for i in xrange(MAXV):
 graph.degree.append(0)
 graph.edges.append(None)
def read_graph(graph, directed):
 """
 @param graph: Graph object
 @param directed: boolean
 """
 initialize_graph(graph, directed)
 data = graph_data()
 graph.nvertices = data[0][0] # number of vertices
 m = data[0][1] # number of edges
 for i in xrange(1,m+1):
 x = data[i][0] # vertex x
 y = data[i][1] # vertex y
 insert_edge(graph,x,y,directed)
def insert_edge(graph, x, y, directed):
 """
 @param graph: Graph object
 @param x, y: vertices in edge (x,y)
 @param directed: boolean
 """
 p = Edgenode()
 p.y = y
 p.next = graph.edges[x] #p.next point to whatever is in edges[x] 
 graph.edges[x] = p #edges[x], gets replaced by the new p, which points to whatever was in edges[x] before
 graph.degree[x] += 1
 if (directed == False):
 insert_edge(graph, y, x, True)
 else:
 graph.nedges += 1
def graph_data():
 data = [
 (6,7),
 (1,2),
 (1,6),
 (1,5),
 (2,3),
 (2,5),
 (5,4),
 (3,4)]
 return data
def initialize_search(graph):
 finished = False
 for i in xrange(1,graph.nvertices+1):
 processed[i] = discovered[i] = False
 parent[i] = -1 # the parent of vertex i is parent[i]
 entry_time[i] = exit_time[i] = None
def dfs(graph,v):
 global time
 if finished: #allow for search termination
 return
 discovered[v] = True
 time = time + 1
 entry_time[v] = time
 process_vertex_early(v)
 p = graph.edges[v]
 while p != None:
 y = p.y
 if discovered[y] == False:
 parent[y] = v
 process_edge(v,y)
 dfs(graph,y)
 elif not processed[y] or graph.directed:
 process_edge(v,y)
 if finished:
 return
 p = p.next
 process_vertex_late(v)
 time = time + 1
 exit_time[v] = time
 processed[v] = True
def process_vertex_early(v):
 print "Discovered vertex %d\n"%v
def process_edge(x,y):
 print "Processed edge (%d,%d)\n"%(x,y)
def process_vertex_late(v):
 print "Processed vertex %d\n"%v
def main():
 graph = Graph()
 read_graph(graph, False)
 start = 1
 initialize_search(graph)
 dfs(graph,start)
 for i in xrange(1,graph.nvertices+1):
 print "Entry time for vertix %d: %d"%(i,entry_time[i])
 print "Exit time for vertix %d: %d\n"%(i,exit_time[i])
if __name__ == "__main__":
 main()
asked Mar 10, 2014 at 4:56
\$\endgroup\$
3
  • 1
    \$\begingroup\$ Some minor points to consider time is a stdlib module - that makes it a bad candidate for a global variable name! But as the code is written each new call to dfs() needs it. Why don't you pass the time down to each call and pass it back up as each recursion finishes? \$\endgroup\$ Commented Mar 10, 2014 at 6:07
  • \$\begingroup\$ I implemented it that way too, and it works, but I liked keeping time as global since it's consistent with the style of the algorithm. Plus, the Java implementation that I translated to Python treats time as global variable. I figure it's for a good reason. \$\endgroup\$ Commented Mar 10, 2014 at 6:13
  • \$\begingroup\$ Note that initialize_search assigns a local variable finished while dfs accesses a global variable of the same name. \$\endgroup\$ Commented Mar 10, 2014 at 17:53

1 Answer 1

3
\$\begingroup\$

Can someone, in a clear way that makes sense, explain the why time needs to be declared Global and the other global variables do not? What is the difference between time and the other variables?

The time variable is an integer and the boolean arrays are lists. In python lists can be modified in place but integers cannot; you need to assign a new value to time variable to modify it. The global declaration is only needed for the assignment to the global variable to work.

You would also need to declare the boolean arrays as global in any function or method that would need to assign a new value to them; modifying the existing lists in place as you are now doing does not require the global declaration.

Now, on a bit deeper level, this is about Python using a thing called references when declaring variables. Each variable is a reference to the actual "value". Changing the reference is done with the assignment operator (=). You can freely assign a new reference to any variable that you have declared locally (i.e. inside a function). But when a variable is declared in the global scope, changing the global reference inside another scope requires the global declaration in that scope. The reason for this is to prevent accidental modification of a global variable when the intention was to declare a new local variable with the same name; you have to be explicit when you want to assign to global variables.


As a general advice, you should avoid using global data because of the problems it tends to create when the code grows and needs to be maintained. If you intend to expand this code further, you should look into refactoring it to either a more object-oriented or more functional design.

An object-oriented solution would encapsulate the global variables and their management inside classes. One obvious starting point would be moving dfs() inside the Graph class with all the necessary data as its members.

A functional design would pass all the variables as parameters to the functions. A simple change to your code would be to declare the global variables inside the main() function and pass them to dfs() and other functions as needed.

answered Mar 10, 2014 at 9:27
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.