The following is my solution for an inverse mapping with bilinear interpolation on an image. The original image is img
and newmatrix
is the transformed image. invRot
is the inverse transformation matrix.
How can I optimize the nested for loops (or remove them altogether) to give me a better time complexity for large values of row and col?
newmatrix = np.zeros((row, col, 3), np.uint8)
for r in range(row):
for c in range(col):
if offset > 0:
offset = -1 * offset
pt = np.array([r+offset,c,1]) #Adjust the offset.
newpt = np.matmul(invRot, pt) #Reverse map by reverse rotation and pick up color.
#Check the bounds of the inverse pts we got and if they lie in the original image,
#then copy the color from that original pt to the new matrix/image.
if (newpt[0] >= 0 and newpt[0] < (yLen - 1) and newpt[1] >= 0 and newpt[1] < (xLen - 1)):
x = np.asarray(newpt[1])
y = np.asarray(newpt[0])
x0 = np.floor(x).astype(int)
x1 = x0 + 1
y0 = np.floor(y).astype(int)
y1 = y0 + 1
Ia = img[y0, x0]
Ib = img[y1, x0]
Ic = img[y0, x1]
Id = img[y1, x1]
color1 = (x1-x) * (y1-y) * Ia
color2 = (x1-x) * (y-y0) * Ib
color3 = (x-x0) * (y1-y) * Ic
color4 = (x-x0) * (y-y0) * Id
weightedAvgColor = color1 + color2 + color3 + color4
newmatrix[r][c] = weightedAvgColor
1 Answer 1
I'm sure there are ways to vectorize this, but I've learned from past examples that it's often easier to just use numba.jit
in these cases (where "these cases" means entirely numeric operations with simple loops and no complex python object interactions beyond numpy). You technically ask for a better "time complexity", which would require a different algorithm/approach to this problem. That's not so easy to provide, since your current approach fundamentally requires individually processing how each source pixel is represented in each target pixel; that's O(N)
or O(H*W)
depending on how you want to frame the number of pixels, and I don't see any way around that. This process could be parallelized using a GPU or vectorized code; however, the easiest thing to try first is to make your current code more efficient. If that provides the speedup you require, then you can stop there.
Just naively putting your code through numba.jit
, however, doesn't provide any speedup. To figure out why, I used numba.jit(nopython=True)
, which errors out when doing anything that numba can't convert into efficient C code. This showed a few minor things to change, such as converting np.floor(x).astype(int)
to int(np.floor(x))
(which is equivalent, since these are single integers and not arrays). Also, your modification of offset
seems like it would only run once and only on the first iteration, if at all, so I moved it outside the loop. Your bounds checking condition can be simplified with a little Python-Fu. And finally, I've modified your variable names to conform to PEP8 style.
The below code produces the same results are your original code, but is able to be efficiently compiled by numba.jit
, providing ~20x speedup in my tests.
import numpy as np
from numba import jit
@jit(nopython=True)
def numba_func(img, inv_rot, offset, row, col):
y_len, x_len, _ = img.shape
new_matrix = np.zeros((row, col, 3), np.uint8)
if offset > 0:
offset *= -1
for r in range(row):
for c in range(col):
pt = np.array([r + offset, c, 1])
y, x, _ = inv_rot @ pt #Reverse map by reverse rotation and pick up color.
#Check the bounds of the inverse pts we got and if they lie in the original image,
#then copy the color from that original pt to the new matrix/image.
if 0 <= y < (y_len - 1) and 0 <= x < (x_len - 1):
x0 = int(np.floor(x))
x1 = x0 + 1
y0 = int(np.floor(y))
y1 = y0 + 1
Ia = img[y0, x0]
Ib = img[y1, x0]
Ic = img[y0, x1]
Id = img[y1, x1]
color1 = (x1-x) * (y1-y) * Ia
color2 = (x1-x) * (y-y0) * Ib
color3 = (x-x0) * (y1-y) * Ic
color4 = (x-x0) * (y-y0) * Id
weighted_avg_color = color1 + color2 + color3 + color4
new_matrix[r, c] = weighted_avg_color
return new_matrix
If that's not fast(er) enough, there are other options, but certainly they'll require a more significant re-work of the code. Again, though, due to the nature of the problem, I don't think you'll be able to achieve better "time complexity", just faster code; this doesn't seem like the kind of problem with a better algorithmic approach, just better implementations.
-
\$\begingroup\$ I'll note, I also explored using numba's parallelization feature, and got either no speedup or some slowdown. \$\endgroup\$scnerd– scnerd2019年02月16日 09:27:43 +00:00Commented Feb 16, 2019 at 9:27
Explore related questions
See similar questions with these tags.
numba
if at all possible. \$\endgroup\$yLen
andxLen
? Androw
andcol
? Are they just the shape of the input image? \$\endgroup\$