01 Matrix is too slow when solved using DFS

Question 1

I tried solving Leetcode 01 Matrix problem. It is running too slow when solved using DFS approach.

Given a matrix consists of 0 and 1, find the distance of the nearest 0 for each cell.

The distance between two adjacent cells is 1.

Example 1

Input:
[[0,0,0],
 [0,1,0],
 [0,0,0]]
Output:
[[0,0,0],
 [0,1,0],
 [0,0,0]]

Note:

The number of elements of the given matrix will not exceed 10,000.
There are at least one 0 in the given matrix.
The cells are adjacent in only four directions: up, down, left and right.

class Solution(object):
 def updateMatrix(self, matrix):
 if not matrix or not matrix[0]:
 return []
 m, n = len(matrix), len(matrix[0])
 op = [[-1 for _ in range(n)] for _ in range(m)]
 directions = [(1,0), (-1,0), (0, 1), (0, -1)]
 def dfs(i,j):
 if matrix[i][j] == 0:
 return 0
 if op[i][j] != -1:
 return op[i][j]
 matrix[i][j] = -1
 closest_zero = float('inf')
 for direction in directions:
 x,y = direction[0] + i , direction[1] + j
 if 0 <= x < m and 0 <= y < n and matrix[x][y] != -1:
 closest_zero = min(dfs(x,y), closest_zero)
 closest_zero += 1
 matrix[i][j] = 1
 return closest_zero
 for i in range(m):
 for j in range(n):
 if matrix[i][j] == 1 and op[i][j] == -1:
 op[i][j] = dfs(i,j)
 elif matrix[i][j] == 0:
 op[i][j] = 0
 return op

It is running too slowly and I don't understand what is the reason for that. Most optimised solution have solved this using BFS.

Question 2

Fundamentally your problem is that you recompute things too many times. Consider what your program does on the 1000 x 1000 matrix which is all 1s except the bottom right which is a 0. If you could remember results you have computed, your approach could be performant. The bfs approach isn't just bfs, it's a bfs starting from the 0s, where you update the ops matrix as you go with candidate distances, in this way you avoid recomputations.

Question 3

There is no reason to do a DFS. Find all the zeros, and mark them with a distance of 0. The cells up/down/left/right from them are a distance of 1, unless it already has a lower distance. The cells adjacent to the "1"s are a distance of 2, and so on until all cells are marked. Taking a look at flood fill algorithms may be helpful.

Question 4

The algorithm is slow since it creates paths in all 4 directions at each step. The algorithm is also using recursion, which is also slower than a simple for loop.

Consider a 5x5 matrix A:

[[1 1 1 1 0]
 [1 1 1 1 1]
 [1 1 1 1 1]
 [1 0 1 1 1]
 [1 1 1 1 1]]

to find the distance of the top-left cell, the algorithm first moves down, then up, then right, and then left. It marks the cells it has already visited by a -1 to avoid infinite loops. So the first five steps will move down:

[[-1 1 1 1 0]
 [-1 1 1 1 1]
 [-1 1 1 1 1]
 [-1 0 1 1 1]
 [-1 1 1 1 1]]

now the algorithm cannot move further down since it has reached the maximum row number, and it tries to move in the next direction which is upwards. Here, it encounters a -1 and gives up on that direction since a -1 indicates that it has already visited that cell. Now it tries to move to the right instead:

[[-1 1 1 1 0]
 [-1 1 1 1 1]
 [-1 1 1 1 1]
 [-1 0 1 1 1]
 [-1 -1 1 1 1]]

In cell A(4,1) (i.e. bottom row, second column) it does the same checks and finds that it cannot move down, then it tries to move upwards and encounters a 0 in cell A(3,1). At this point, we are 6 levels deep into the recursion and the distance from A(0,0) to A(3,1) is hence found to be 6 for now. So ideally the algorithm should now reject any further paths that exceeds 6 levels of recursion. Unfortunately this is not the case; first the algorithm steps back to recursion level 5 in cell A(4,1) and continues with cell A(4,2):

[[-1 1 1 1 0]
 [-1 1 1 1 1]
 [-1 1 1 1 1]
 [-1 0 1 1 1]
 [-1 -1 -1 1 1]]

from this cell it moves upwards all the way up to cell A(0,2):

[[-1 1 -1 1 0]
 [-1 1 -1 1 1]
 [-1 1 -1 1 1]
 [-1 0 -1 1 1]
 [-1 -1 -1 1 1]]

reaches a recursion level of 11. Here it can move either to the right or to the left. Since the algorithm always tries right before left, it moves to cell A(0,3) and then continues downward to cell A(4,3) :

[[-1 1 -1 -1 0]
 [-1 1 -1 -1 1]
 [-1 1 -1 -1 1]
 [-1 0 -1 -1 1]
 [-1 -1 -1 -1 1]]

The recursion level is now 16. Then it moves to right to cell A(4,4) and then upwards to cell A(0,4).

[[-1 1 -1 -1 0]
 [-1 1 -1 -1 -1]
 [-1 1 -1 -1 -1]
 [-1 0 -1 -1 -1]
 [-1 -1 -1 -1 -1]]

The recursion level is now 21. A zero is finally found in cell A(0,4) indicating a distance of 21 from cell A(0,0). Still, the algorithm continues to investigate useless paths (that is: paths with recursion level more that 6 (remember that we have already found a zero at distance 6) and moves back to cell A(1,4) at recursion level 20. Here it tries the remaining directions (left and right) but none of those works, so level 20 is done. Then it enters back into level 19, 18, 17, 16, 15:

[[-1 1 -1 -1 0]
 [-1 1 -1 -1 1]
 [-1 1 -1 -1 1]
 [-1 0 -1 -1 1]
 [-1 -1 -1 1 1]]

notice that it replaces the -1 with 1 as it completes a level. So now A(1,4), A(2,4), A(3,4), A(4,4), and A(4,3) are all reset to value 1. At level 15, i.e. cell A(3,3), it has already tried to move down, so now it tries to move up, but that does not work since cell A(3,2) has a -1. Then it tries to move to the right, to cell A(3,4), which works since A(3,4) is now 1 (and not -1). From cell A(3,4) it first tries to move down and reaches cell A(4,4). From that cell the only alternative is to move left and at recursion level 17 it reaches cell A(4,3):

[[-1 1 -1 -1 0]
 [-1 1 -1 -1 1]
 [-1 1 -1 -1 1]
 [-1 0 -1 -1 -1]
 [-1 -1 -1 1 -1]]

In this cell it cannot get further, there is a -1 in all directions, and it gives up on level 17, (and moves back to level ...).

The procedure should be clear by now. I will not continue further with this example, the point was just to illustrate why the algorithm is so slow.

In fact, in order to find the distance for A(0,0) in this 5x5 matrix example it executes a whopping 22254 (!) calls to the recursive dfs() method. This simply to determine that the distance is 4 (which btw is found easily by moving horizontally to the zero in cell A(0,4)).

I think it is a fair guess that the algorithm has an exponential complexity. And it should take forever to run cases with more than say 100 cells (i.e. a 10x10 matrix).

Finally, here is an example of a much faster algorithm that should be able to find a solution for a 100x100 matrix in a fraction of a second:

import numpy as np
class Solution:
 """ Solution to leetCode problem 542. 01 Matrix
 Given a matrix consisting of 0 and 1, find the distance of the
 nearest 0 for each cell. The distance between two adjacent cells is 1.
 """
 def __init__(self, A):
 self.A = A
 def get_dist(self):
 """ Get the distance matrix for self.A as defined in the
 problem statement for problem 542. 01.
 """
 A = self.A
 (N, M) = A.shape
 B = np.zeros(A.shape, dtype=int)
 for i in range(N):
 for j in range(M):
 if A[i,j] == 1: # if A[i,j] == 0, B[i,j] is already set to 0
 dist = 1
 found = False
 while not found:
 for (x,y) in self.points(i, j, dist):
 if A[x,y] == 0:
 B[i,j] = dist
 found = True
 break
 if not found:
 dist = dist + 1
 if dist > M+N:
 raise Exception('Unexpected')
 return B
 def points(self, i, j, dist):
 """ Generate all valid points a distance 'dist' away from (i,j)
 The valid points will lie on the edge of a diamond centered on
 (i,j). Use a generator to avoid computing unecessary points.
 """
 (N, M) = self.A.shape
 for k in range(dist):
 if (i+k < N) and (j-dist+k >= 0):
 yield (i+k, j-dist+k)
 if (i+dist-k < N) and (j+k < M):
 yield (i+dist-k, j+k)
 if (i-k >= 0) and (j+dist-k < M):
 yield (i-k, j+dist-k)
 if (i-dist+k >= 0) and (j-k >= 0):
 yield (i-dist+k, j-k)

Håkon Hægland Håkon Hægland 9716 silver badges14 bronze badges · Answer 1 · 2020-10-14 21:50:01Z

The algorithm is slow since it creates paths in all 4 directions at each step. The algorithm is also using recursion, which is also slower than a simple for loop.

Consider a 5x5 matrix A:

[[1 1 1 1 0]
 [1 1 1 1 1]
 [1 1 1 1 1]
 [1 0 1 1 1]
 [1 1 1 1 1]]

to find the distance of the top-left cell, the algorithm first moves down, then up, then right, and then left. It marks the cells it has already visited by a -1 to avoid infinite loops. So the first five steps will move down:

[[-1 1 1 1 0]
 [-1 1 1 1 1]
 [-1 1 1 1 1]
 [-1 0 1 1 1]
 [-1 1 1 1 1]]

now the algorithm cannot move further down since it has reached the maximum row number, and it tries to move in the next direction which is upwards. Here, it encounters a -1 and gives up on that direction since a -1 indicates that it has already visited that cell. Now it tries to move to the right instead:

[[-1 1 1 1 0]
 [-1 1 1 1 1]
 [-1 1 1 1 1]
 [-1 0 1 1 1]
 [-1 -1 1 1 1]]

In cell A(4,1) (i.e. bottom row, second column) it does the same checks and finds that it cannot move down, then it tries to move upwards and encounters a 0 in cell A(3,1). At this point, we are 6 levels deep into the recursion and the distance from A(0,0) to A(3,1) is hence found to be 6 for now. So ideally the algorithm should now reject any further paths that exceeds 6 levels of recursion. Unfortunately this is not the case; first the algorithm steps back to recursion level 5 in cell A(4,1) and continues with cell A(4,2):

[[-1 1 1 1 0]
 [-1 1 1 1 1]
 [-1 1 1 1 1]
 [-1 0 1 1 1]
 [-1 -1 -1 1 1]]

from this cell it moves upwards all the way up to cell A(0,2):

[[-1 1 -1 1 0]
 [-1 1 -1 1 1]
 [-1 1 -1 1 1]
 [-1 0 -1 1 1]
 [-1 -1 -1 1 1]]

reaches a recursion level of 11. Here it can move either to the right or to the left. Since the algorithm always tries right before left, it moves to cell A(0,3) and then continues downward to cell A(4,3) :

[[-1 1 -1 -1 0]
 [-1 1 -1 -1 1]
 [-1 1 -1 -1 1]
 [-1 0 -1 -1 1]
 [-1 -1 -1 -1 1]]

The recursion level is now 16. Then it moves to right to cell A(4,4) and then upwards to cell A(0,4).

[[-1 1 -1 -1 0]
 [-1 1 -1 -1 -1]
 [-1 1 -1 -1 -1]
 [-1 0 -1 -1 -1]
 [-1 -1 -1 -1 -1]]

The recursion level is now 21. A zero is finally found in cell A(0,4) indicating a distance of 21 from cell A(0,0). Still, the algorithm continues to investigate useless paths (that is: paths with recursion level more that 6 (remember that we have already found a zero at distance 6) and moves back to cell A(1,4) at recursion level 20. Here it tries the remaining directions (left and right) but none of those works, so level 20 is done. Then it enters back into level 19, 18, 17, 16, 15:

[[-1 1 -1 -1 0]
 [-1 1 -1 -1 1]
 [-1 1 -1 -1 1]
 [-1 0 -1 -1 1]
 [-1 -1 -1 1 1]]

notice that it replaces the -1 with 1 as it completes a level. So now A(1,4), A(2,4), A(3,4), A(4,4), and A(4,3) are all reset to value 1. At level 15, i.e. cell A(3,3), it has already tried to move down, so now it tries to move up, but that does not work since cell A(3,2) has a -1. Then it tries to move to the right, to cell A(3,4), which works since A(3,4) is now 1 (and not -1). From cell A(3,4) it first tries to move down and reaches cell A(4,4). From that cell the only alternative is to move left and at recursion level 17 it reaches cell A(4,3):

[[-1 1 -1 -1 0]
 [-1 1 -1 -1 1]
 [-1 1 -1 -1 1]
 [-1 0 -1 -1 -1]
 [-1 -1 -1 1 -1]]

In this cell it cannot get further, there is a -1 in all directions, and it gives up on level 17, (and moves back to level ...).

The procedure should be clear by now. I will not continue further with this example, the point was just to illustrate why the algorithm is so slow.

In fact, in order to find the distance for A(0,0) in this 5x5 matrix example it executes a whopping 22254 (!) calls to the recursive dfs() method. This simply to determine that the distance is 4 (which btw is found easily by moving horizontally to the zero in cell A(0,4)).

I think it is a fair guess that the algorithm has an exponential complexity. And it should take forever to run cases with more than say 100 cells (i.e. a 10x10 matrix).

Finally, here is an example of a much faster algorithm that should be able to find a solution for a 100x100 matrix in a fraction of a second:

import numpy as np
class Solution:
 """ Solution to leetCode problem 542. 01 Matrix
 Given a matrix consisting of 0 and 1, find the distance of the
 nearest 0 for each cell. The distance between two adjacent cells is 1.
 """
 def __init__(self, A):
 self.A = A
 def get_dist(self):
 """ Get the distance matrix for self.A as defined in the
 problem statement for problem 542. 01.
 """
 A = self.A
 (N, M) = A.shape
 B = np.zeros(A.shape, dtype=int)
 for i in range(N):
 for j in range(M):
 if A[i,j] == 1: # if A[i,j] == 0, B[i,j] is already set to 0
 dist = 1
 found = False
 while not found:
 for (x,y) in self.points(i, j, dist):
 if A[x,y] == 0:
 B[i,j] = dist
 found = True
 break
 if not found:
 dist = dist + 1
 if dist > M+N:
 raise Exception('Unexpected')
 return B
 def points(self, i, j, dist):
 """ Generate all valid points a distance 'dist' away from (i,j)
 The valid points will lie on the edge of a diamond centered on
 (i,j). Use a generator to avoid computing unecessary points.
 """
 (N, M) = self.A.shape
 for k in range(dist):
 if (i+k < N) and (j-dist+k >= 0):
 yield (i+k, j-dist+k)
 if (i+dist-k < N) and (j+k < M):
 yield (i+dist-k, j+k)
 if (i-k >= 0) and (j+dist-k < M):
 yield (i-k, j+dist-k)
 if (i-dist+k >= 0) and (j-k >= 0):
 yield (i-dist+k, j-k)

Stack Exchange Network

01 Matrix is too slow when solved using DFS

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

01 Matrix is too slow when solved using DFS

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions