I tried solving Leetcode 01 Matrix problem. It is running too slow when solved using DFS approach.
Given a matrix consists of 0 and 1, find the distance of the nearest 0 for each cell.
The distance between two adjacent cells is 1.
Example 1
Input:
[[0,0,0],
[0,1,0],
[0,0,0]]
Output:
[[0,0,0],
[0,1,0],
[0,0,0]]
Note:
- The number of elements of the given matrix will not exceed 10,000.
- There are at least one 0 in the given matrix.
- The cells are adjacent in only four directions: up, down, left and right.
class Solution(object):
def updateMatrix(self, matrix):
if not matrix or not matrix[0]:
return []
m, n = len(matrix), len(matrix[0])
op = [[-1 for _ in range(n)] for _ in range(m)]
directions = [(1,0), (-1,0), (0, 1), (0, -1)]
def dfs(i,j):
if matrix[i][j] == 0:
return 0
if op[i][j] != -1:
return op[i][j]
matrix[i][j] = -1
closest_zero = float('inf')
for direction in directions:
x,y = direction[0] + i , direction[1] + j
if 0 <= x < m and 0 <= y < n and matrix[x][y] != -1:
closest_zero = min(dfs(x,y), closest_zero)
closest_zero += 1
matrix[i][j] = 1
return closest_zero
for i in range(m):
for j in range(n):
if matrix[i][j] == 1 and op[i][j] == -1:
op[i][j] = dfs(i,j)
elif matrix[i][j] == 0:
op[i][j] = 0
return op
It is running too slowly and I don't understand what is the reason for that. Most optimised solution have solved this using BFS.
-
2\$\begingroup\$ Fundamentally your problem is that you recompute things too many times. Consider what your program does on the 1000 x 1000 matrix which is all 1s except the bottom right which is a 0. If you could remember results you have computed, your approach could be performant. The bfs approach isn't just bfs, it's a bfs starting from the 0s, where you update the ops matrix as you go with candidate distances, in this way you avoid recomputations. \$\endgroup\$Countingstuff– Countingstuff2020年10月11日 21:08:35 +00:00Commented Oct 11, 2020 at 21:08
-
\$\begingroup\$ There is no reason to do a DFS. Find all the zeros, and mark them with a distance of 0. The cells up/down/left/right from them are a distance of 1, unless it already has a lower distance. The cells adjacent to the "1"s are a distance of 2, and so on until all cells are marked. Taking a look at flood fill algorithms may be helpful. \$\endgroup\$RootTwo– RootTwo2020年10月13日 20:45:22 +00:00Commented Oct 13, 2020 at 20:45
1 Answer 1
The algorithm is slow since it creates paths in all 4 directions at
each step. The algorithm is also using recursion, which is also slower
than a simple for
loop.
Consider a 5x5 matrix A
:
[[1 1 1 1 0]
[1 1 1 1 1]
[1 1 1 1 1]
[1 0 1 1 1]
[1 1 1 1 1]]
to find the distance of the top-left cell, the algorithm first moves down, then up, then right, and then left. It marks the cells it has already visited by a -1 to avoid infinite loops. So the first five steps will move down:
[[-1 1 1 1 0]
[-1 1 1 1 1]
[-1 1 1 1 1]
[-1 0 1 1 1]
[-1 1 1 1 1]]
now the algorithm cannot move further down since it has reached the maximum row number, and it tries to move in the next direction which is upwards. Here, it encounters a -1 and gives up on that direction since a -1 indicates that it has already visited that cell. Now it tries to move to the right instead:
[[-1 1 1 1 0]
[-1 1 1 1 1]
[-1 1 1 1 1]
[-1 0 1 1 1]
[-1 -1 1 1 1]]
In cell A(4,1)
(i.e. bottom row, second column) it does the same
checks and finds that it cannot move down, then it tries to move
upwards and encounters a 0 in cell A(3,1)
. At this point,
we are 6 levels deep into the recursion and the distance from
A(0,0)
to A(3,1)
is hence found to be 6 for now. So ideally the algorithm should now reject any further paths that exceeds 6 levels of
recursion. Unfortunately this is not the case; first the algorithm steps
back to recursion level 5 in cell A(4,1)
and continues with cell
A(4,2)
:
[[-1 1 1 1 0]
[-1 1 1 1 1]
[-1 1 1 1 1]
[-1 0 1 1 1]
[-1 -1 -1 1 1]]
from this cell it moves upwards all the way up to cell A(0,2)
:
[[-1 1 -1 1 0]
[-1 1 -1 1 1]
[-1 1 -1 1 1]
[-1 0 -1 1 1]
[-1 -1 -1 1 1]]
reaches a recursion level of 11. Here it can move either to the right or
to the left. Since the algorithm always tries right before left, it
moves to cell A(0,3)
and then continues downward to cell A(4,3)
:
[[-1 1 -1 -1 0]
[-1 1 -1 -1 1]
[-1 1 -1 -1 1]
[-1 0 -1 -1 1]
[-1 -1 -1 -1 1]]
The recursion level is now 16. Then it moves to right to cell A(4,4)
and
then upwards to cell A(0,4)
.
[[-1 1 -1 -1 0]
[-1 1 -1 -1 -1]
[-1 1 -1 -1 -1]
[-1 0 -1 -1 -1]
[-1 -1 -1 -1 -1]]
The recursion level is now 21. A zero is
finally found in cell A(0,4)
indicating a distance of 21 from cell
A(0,0)
. Still, the
algorithm continues to investigate useless paths (that is: paths
with recursion level more that 6 (remember that we have already found
a zero at distance 6) and moves back to cell A(1,4)
at recursion
level 20. Here it tries the remaining directions (left and right) but
none of those works, so level 20 is done. Then it enters back into
level 19, 18, 17, 16, 15:
[[-1 1 -1 -1 0]
[-1 1 -1 -1 1]
[-1 1 -1 -1 1]
[-1 0 -1 -1 1]
[-1 -1 -1 1 1]]
notice that it replaces the -1 with 1 as it completes a level. So now
A(1,4)
, A(2,4)
, A(3,4)
, A(4,4)
, and A(4,3)
are all reset to value 1. At
level 15, i.e. cell A(3,3)
, it has already tried to move down, so now
it tries to move up, but that does not work since cell A(3,2)
has a
-1. Then it tries to move to the right, to cell A(3,4)
, which works
since A(3,4)
is now 1 (and not -1). From cell A(3,4)
it first tries to
move down and reaches cell A(4,4)
. From that cell the only alternative
is to move left and at recursion level 17 it reaches cell A(4,3)
:
[[-1 1 -1 -1 0]
[-1 1 -1 -1 1]
[-1 1 -1 -1 1]
[-1 0 -1 -1 -1]
[-1 -1 -1 1 -1]]
In this cell it cannot get further, there is a -1 in all directions, and it gives up on level 17, (and moves back to level ...).
The procedure should be clear by now. I will not continue further with this example, the point was just to illustrate why the algorithm is so slow.
In fact, in order to find the distance for A(0,0)
in this 5x5
matrix example it executes a whopping 22254 (!) calls to the recursive dfs()
method.
This simply to determine that the distance is 4 (which btw is found easily
by moving horizontally to the zero in cell A(0,4)
).
I think it is a fair guess that the algorithm has an exponential complexity. And it should take forever to run cases with more than say 100 cells (i.e. a 10x10 matrix).
Finally, here is an example of a much faster algorithm that should be able to find a solution for a 100x100 matrix in a fraction of a second:
import numpy as np
class Solution:
""" Solution to leetCode problem 542. 01 Matrix
Given a matrix consisting of 0 and 1, find the distance of the
nearest 0 for each cell. The distance between two adjacent cells is 1.
"""
def __init__(self, A):
self.A = A
def get_dist(self):
""" Get the distance matrix for self.A as defined in the
problem statement for problem 542. 01.
"""
A = self.A
(N, M) = A.shape
B = np.zeros(A.shape, dtype=int)
for i in range(N):
for j in range(M):
if A[i,j] == 1: # if A[i,j] == 0, B[i,j] is already set to 0
dist = 1
found = False
while not found:
for (x,y) in self.points(i, j, dist):
if A[x,y] == 0:
B[i,j] = dist
found = True
break
if not found:
dist = dist + 1
if dist > M+N:
raise Exception('Unexpected')
return B
def points(self, i, j, dist):
""" Generate all valid points a distance 'dist' away from (i,j)
The valid points will lie on the edge of a diamond centered on
(i,j). Use a generator to avoid computing unecessary points.
"""
(N, M) = self.A.shape
for k in range(dist):
if (i+k < N) and (j-dist+k >= 0):
yield (i+k, j-dist+k)
if (i+dist-k < N) and (j+k < M):
yield (i+dist-k, j+k)
if (i-k >= 0) and (j+dist-k < M):
yield (i-k, j+dist-k)
if (i-dist+k >= 0) and (j-k >= 0):
yield (i-dist+k, j-k)
Explore related questions
See similar questions with these tags.