5
\$\begingroup\$

I finished a program to do connected component analysis using union - find algorithm. Is it true that the complexity of the code is \$O(N logN)\,ドル where \$N\$ is the total number of pixels (512x512 by example, not 512).

Is there any way to improve the performance?

import cv2
import numpy as np
import random
class QuickUnionUF:
 def __init__(self, N):
 self.id = range(N)
 self.sz = [0] * N
 @classmethod
 def fromimage(self, im):
 self.id = im
 self.sz = [0] * len(im)
 def root(self, i):
 while (i != self.id[i]):
 i = self.id[i]
 return i
 def getresult(self):
 result = [self.root(i) for i in self.id]
 return result
 def connected(self, p, q):
 return self.root(p) == self.root(q)
 def union(self, p, q):
 i = self.root(p)
 j = self.root(q)
 if (i == j):
 return
 if (self.sz[i] < self.sz[j]):
 self.id[i] = j
 self.sz[j] += self.sz[i]
 else:
 self.id[j] = i
 self.sz[j] += self.sz[i] 
def bwlabel(im):
 M, N = im.shape[:2]
 qf = QuickUnionUF(M * N)
 for i in range(M - 1):
 for j in range(N - 1):
 if (im[i][j] == im[i][j+1]):
 qf.union(i * N + j, i * N + j + 1)
 if (im[i + 1][j] == im[i][j]):
 qf.union(i * N + j, (i + 1) * N + j)
 mask = np.reshape(np.array(qf.getresult()), (M, N))
 values = np.unique(mask).tolist()
 random.seed()
 colors = [(random.randint(0,255), random.randint(0,255), random.randint(0,255)) for k in range(len(values))]
 out = np.zeros((M, N, 3))
 for i in range(M):
 for j in range(N):
 label = values.index(mask[i][j])
 out[i,j] = colors[label]
 return out
im = cv2.imread("bw.jpg",cv2.IMREAD_GRAYSCALE)
out = bwlabel(im > 100)
cv2.imwrite("result.png", out)
asked Jun 1, 2014 at 19:34
\$\endgroup\$

1 Answer 1

3
\$\begingroup\$

Is it true that the complexity of the code is O(NlogN), where N is the total number of pixels (512x512 by example, not 512).

You main loop in bwlabelis:

for i in range(M - 1):
 for j in range(N - 1):
 if (im[i][j] == im[i][j+1]):
 qf.union(i * N + j, i * N + j + 1)
 if (im[i + 1][j] == im[i][j]):
 qf.union(i * N + j, (i + 1) * N + j)

So you are calling qf.union \$O(N*M) = O(n)\$ times. There you call self.root twice:

def root(self, i):
 while (i != self.id[i]):
 i = self.id[i]
 return i

What's the worst case running time of this? At most \$O(n)\,ドル since self.id has \$n\$ values and there are no cycles. However, since you are comparing ranks in union:

 if (self.sz[i] < self.sz[j]):
 self.id[i] = j
 self.sz[j] += self.sz[i]
 else:
 self.id[j] = i
 self.sz[j] += self.sz[i] 

You will have \$O(log\ n)\$ paths to root, so the total running time is \$O(n\ log\ n)\$.

Is there any way to improve the performance?

Yes, path compression. You descended down two paths from p and q to i and j and it took \$O(log\ n)\$. At each step you can point a node to its root, instead of just the next element. You can do this recursively like so:

def root(self, i):
 if i == self.id[i]:
 return i
 self.id[i] = self.root(self.id[i])
 return self.id[i]

Recursion adds its own cost, though, so whether it's faster in practice would require testing. An alternative would be to add a second iterative compression function:

def compress(self, i, r):
 while i != r:
 i, self.id[i] = self.id[i], r

Now you are walking the path twice, which again may or may not be faster than recursion. You should only need to call this for one of the paths, though.

answered Jun 6, 2014 at 8:34
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.