Input:
- data - a matrix of size n*m*3*3 (complex values)
- indices - a list of coordinates (x,y), where x < n and y < m
- fp - a feature parameter which is a tuple of ((fp11, fp12), (fp21, fp22)), id)
- reference - a list of 3*3 matrices
- swrd - a function which computes a similarity value between two complex valued 3*3 matrices
Output:
- feature_values - a list of features - one feature for each index in (indices)
Functionality:
Given an image (data) were each pixel is a 3*3 matrix. And there is a list of target pixels (indices). For each target pixel, I want to extract features of the patch surrounding it.
A patch feature is either: a) the swrd of a pixel in the patch with a reference matrix or b) the swrd of two pixels in the patch
Thus a feature can be described by the relative coordinates fp11, fp12 (x and y offset of pixel of interest 1) and fp21, fp22 (x and y offset of pixel of interest 2). If fp11 == fp21 and fp12== fp22, then i want to compute a), else i want to compute b). The reference matrix of interest is defined by the feature parameter called id.
Note that the indices of interest are already filtered so that the sum x+fp__ < n and y+fp__< m for all possible fp__.
Code
Computing the symetric revised wishart distance with regularization in case a matrix A or B is not invertible
def srwd(A, B):
"""This function computes the symetric revised wishart distance as from the paper
SPECTRAL CLUSTERING OF POLARIMETRIC SAR DATA WITH WISHART-DERIVED DISTANCE MEASURES"""
try:
dist = 0.5 * np.trace(np.dot(A, inv(B)) + np.dot(B, inv(A))) - len(A)
except:
A, B = A.reshape(3, 3) + np.eye(3) * 0.000001, B.reshape(3, 3) + np.eye(3) * 0.000001
dist = 0.5 * np.trace(np.dot(A, inv(B)) + np.dot(B, inv(A))) - len(A)
return abs(dist)
Getting the features with the input as given above:
def feature(data, indices, fp, reference):
# fp is a tuple of 2 coordinates in a patch ((x1,x2),(y1,y2),ref),
# where ref is an index of a random reference matrix in reference only relevant in case x1=y1 and x2=y2
res = []
if fp[0] != fp[1]:
for i in indices:
x, y = i
res.append(srwd(data[x + fp[0][0]][y + fp[0][1]], data[x + fp[1][0]][y + fp[1][1]]))
else:
for i in indices:
x, y = i
res.append(srwd(data[x + fp[0][0]][y + fp[0][1]], reference[fp[2]]))
return res
Finally there is another loop such as:
for fp in feature_params:
feature_values = feature(data, indices, fp, reference)
#here work on feature_values
The current implementation is rather inefficent and a bottleneck of the whole process. How could I improve it?
Is there a chance to efficiently compute a feature matrix efficiently and operate on it afterwards?
An executable toy example including the whole code is given here (allowing copy-paste)
import numpy as np
from numpy.linalg import inv
#toy example
data = np.random.rand(1000, 1000, 3, 3) #an image of 1000*1000 pixels, each pixel a 3*3 matrix
indices = np.random.randint(3,96, size = (10000,2)) # a list of 10000 target pixels (lets assume they are unique)
reference = [np.random.rand(3,3)] # a single reference matrix in a list (in actual application there are multiple reference matrices)
feature_params = [((0,0),(-1,-1), 0), ((0,0), (0,0), 0), ((0,1), (0,0), 0), ((1,0), (0,0), 0), ((1,1), (0,0), 0)]
def srwd(A, B):
"""This function computes the symetric revised wishart distance as from the paper
SPECTRAL CLUSTERING OF POLARIMETRIC SAR DATA WITH WISHART-DERIVED DISTANCE MEASURES"""
try:
dist = 0.5 * np.trace(np.dot(A, inv(B)) + np.dot(B, inv(A))) - len(A)
except:
A, B = A.reshape(3, 3) + np.eye(3) * 0.000001, B.reshape(3, 3) + np.eye(3) * 0.000001
dist = 0.5 * np.trace(np.dot(A, inv(B)) + np.dot(B, inv(A))) - len(A)
return abs(dist)
def feature(data, indices, fp, reference):
# fp is a tuple of 2 coordinates in a patch ((x1,x2),(y1,y2),ref),
# where ref is an index of a random reference matrix in reference only relevant in case x1=y1 and x2=y2
res = []
if fp[0] != fp[1]:
for i in indices:
x, y = i
res.append(srwd(data[x + fp[0][0]][y + fp[0][1]], data[x + fp[1][0]][y + fp[1][1]]))
else:
for i in indices:
x, y = i
res.append(srwd(data[x + fp[0][0]][y + fp[0][1]], reference[fp[2]]))
return res
for fp in feature_params:
feature_values = feature(data, indices, fp, reference)
#here work on feature_values
A final note about the dimensions of the actual problem:
- Image of size 6000*1700,
- around 500 features in feature_params
- indices is list of around 8.000.000 target indices
-
\$\begingroup\$ If there is any information that could improve the chance of a good answer, please make a request and i will further explain. \$\endgroup\$Nikolas Rieble– Nikolas Rieble2017年02月13日 14:00:22 +00:00Commented Feb 13, 2017 at 14:00
-
\$\begingroup\$ What exception is caught on srwd try-catch block/what causes an exception there? It might not be pythonic, but checking where to go before calculating might save some time over fail&catch \$\endgroup\$SchreiberLex– SchreiberLex2017年02月16日 14:27:10 +00:00Commented Feb 16, 2017 at 14:27
-
\$\begingroup\$ In case one of the two matrics (A or B) is not invertible, the inv() function returns an error. Regularization is therefore only used if exception is triggered. \$\endgroup\$Nikolas Rieble– Nikolas Rieble2017年02月16日 15:38:21 +00:00Commented Feb 16, 2017 at 15:38
1 Answer 1
I don't know enough matrix math to be able to help with those bits, but if indices
is large, you can get at least a small speedup by deconstructing fp
into individual local variables.
def feature(data, indices, fp, reference):
# fp is a tuple of 2 coordinates in a patch ((x1,x2),(y1,y2),ref),
# where ref is an index of a random reference matrix in reference only relevant in case x1=y1 and x2=y2
((x1, x2), (y1, y2), ref) = fp
res = []
if fp[0] != fp[1]:
for i in indices:
x, y = i
res.append(srwd(data[x + x1][y + x2], data[x + y1][y + y2]))
else:
for i in indices:
x, y = i
res.append(srwd(data[x + x1][y + x2], reference[ref]))
return res
It reads a bit cleaner, but it also runs faster because indexing into lists is (a hair) slower than accessing a local variable. But 'a hair' times 8x10^9 is likely measurable.
-
\$\begingroup\$ Although a tiny improvement regarding performance, it is an improvement and increases readability. Therefore I consider this a valid answer and reward the bounty. \$\endgroup\$Nikolas Rieble– Nikolas Rieble2017年02月20日 08:19:28 +00:00Commented Feb 20, 2017 at 8:19
Explore related questions
See similar questions with these tags.