I am implementing a flood fill algorithm using Python and NumPy. I have written the following fill function which works fine:
def fill(self, data, xsize, ysize, x_start, y_start):
stack = [(x_start, y_start)]
while stack:
x, y, stack = stack[0][0], stack[0][1], stack[1:]
if data[x, y] == 1:
data[x, y] = 2
if x > 0:
stack.append((x - 1, y))
if x < (xsize - 1):
stack.append((x + 1, y))
if y > 0:
stack.append((x, y - 1))
if y < (ysize - 1):
stack.append((x, y + 1))
I wanted to implement a way that would eliminate the need to re-check a point if it had already been checked. By creating a new list of checked points and then comparing to see if the current point is already in that list, but this ended up making it slower and I wanted to understand if there is a more efficient way to improve upon my existing implementation.
1 Answer 1
Instead of using a list
to store your points (which has an \$O(n)\$ look-up time), you can use a set
object which has an \$O(1)\$ look-up time. For more information regarding Pythonic time complexity, you can look here.
I have a couple comments on your style:
- Instead of
x
andy
, I would usecolumn/col
androw
. Even thoughx
andy
are mathematical coordinate names, I still prefer to be more explicit. Group declarations are OK in the right circumstances, like unpacking an iterator or defining groups of like variables. The way you do this doesn't quite fit into the 'right circumstances' category: you are assigning and then slicing.
To get a better implementation, there are a couple of options:
- Split the statement into its components (assigning
x
,y
and slicing the list) Calling
list.pop()
to do both at the same timeThese two options, in terms of efficiency, are probably identical. So, its up to you to choose which one you want to use. Here is their syntax:
# 2 statements. col, row = stack[0] stack = stack[1:] # Using `pop`. col, row = stack.pop(0)
In Python 3, use the
*
operator to unpack the listThis option is the least applicable (due to your comments below). However, for reference here is that syntax:
(col, row), *stack = stack
This assigns the first element of the stack to
(col, row)
and the rest to itself, effectively popping off the first element.
- Split the statement into its components (assigning
P.S. Congratulations. This is probably my shortest review on this site.
-
\$\begingroup\$ Thanks for your response. I am probably missing something fundamental here (new to python and the set data type) my data is of type numpy.ndarray and my stack is declared as a list. When I try to implement your change for the group declaration in place of the assigning and slicing I have currently, I get an invalid syntax error indicating the * character. Would you be able to provide any idea as to how I have implemented it incorrectly (I have just tried inserting the sample line you gave directly in place of what I had before). \$\endgroup\$Aesir– Aesir2014年06月17日 22:03:11 +00:00Commented Jun 17, 2014 at 22:03
-
2\$\begingroup\$ @Aesir
(col, row), *stack = stack
is a syntax error on Python 2 but works on Python 3. \$\endgroup\$Janne Karila– Janne Karila2014年06月18日 06:47:05 +00:00Commented Jun 18, 2014 at 6:47 -
1\$\begingroup\$ @JanneKarila Good catch. \$\endgroup\$BeetDemGuise– BeetDemGuise2014年06月18日 11:48:21 +00:00Commented Jun 18, 2014 at 11:48