Optimizing nested loop for calculating geodesic distance

Question 1

I segmented an image into N superpixels and constructed a graph based on that in which each superpixel is considered as a node. The information about neighbor superpixels is stored in glcms array. The weight between each pair of neighboring superpixels is stored in matrix W.

Finally, I want to compute geodesic distance between non-adjacent superpixels using graphshortestpath function. The following code does the mentioned process, however it is very time-consuming. Specifically, the last section of the code that computes geodesic distance takes more time than expected (more than 15 sec).

 Img=imread('input.jpg');
 [rows, columns, numberOfColorChannels] = size(Img);
 [L,N] = superpixels(Img,250);
 %Identifying neighborhood relationships
 glcms = graycomatrix(L,'NumLevels',N,'GrayLimits',[1,N],'Offset', [0,1;1,0]); %Create gray-level co-occurrence matrix from image
 glcms = sum(glcms,3); % add together the two matrices
 glcms = glcms + glcms.'; % add upper and lower triangles together, make it symmetric
 glcms(1:N+1:end) = 0; % set the diagonal to zero, we don't want to see "1 is neighbor of 1"
 data = zeros(N,3);
 for labelVal = 1:N
 redIdx = idx{labelVal};
 greenIdx = idx{labelVal}+numRows*numCols;
 blueIdx = idx{labelVal}+2*numRows*numCols;
 data(labelVal,1) = mean(Img(redIdx));
 data(labelVal,2) = mean(Img(greenIdx));
 data(labelVal,3) = mean(Img(blueIdx));
 end 
 Euc=zeros(N);
% Euclidean Distance
for i=1:N
 for j=1:N
 if glcms(i,j)~=0
 Euc(i,j)=sqrt(((data(i,1)-data(j,1))^2)+((data(i,2)-data(j,2))^2)+((data(i,3)-data(j,3))^2));
 end
 end
end
W=zeros(N);
W_num=zeros(N);
W_den=zeros(N);
OMG1=0.1;
for i=1:N
 for j=1:N
 if(Euc(i,j)~=0)
 W_num(i,j)=exp(-OMG1*(Euc(i,j)));
 W_den(i,i)=W_num(i,j)+W_den(i,i);
 end
 end
end
for i=1:N
 for j=1:N
 if(Euc(i,j)~=0)
 W(i,j)=(W_num(i,j))/(W_den(i,i)); % Connectivity Matrix W
 end
 end
end
s_star_temp=zeros(N); %temporary variable for geodesic distance measurement
W_sparse=zeros(N);
W_sparse=sparse(W);
for i=1:N
 for j=1:N
 if W(i,j)==0 & i~=j;
 s_star_temp(i,j)=graphshortestpath(W_sparse,i,j); % Geodesic Distance
 end
 end
end

The question is how to optimize the code to be more efficient , i.e. less time-consuming.

Question 2

Please format the code to make its structure clear to your potential reviewers. Having well-formatted and consistently indented code will help the reviewers concentrate on the more interesting topics.

Question 3

I'll first go through the different section and outline possible improvements, then give some general comments at the end. I am not checking the algorithm for correctness, as I don't know it.

Img=imread('input.jpg');
[rows, columns, numberOfColorChannels] = size(Img);
[L,N] = superpixels(Img,250);
%Identifying neighborhood relationships
glcms = graycomatrix(L,'NumLevels',N,'GrayLimits',[1,N],'Offset', [0,1;1,0]); %Create gray-level co-occurrence matrix from image
glcms = sum(glcms,3); % add together the two matrices
glcms = glcms + glcms.'; % add upper and lower triangles together, make it symmetric
glcms(1:N+1:end) = 0; % set the diagonal to zero, we don't want to see "1 is neighbor of 1"

You need to have defined idx by now, but you haven't, so the code doesn't run. Based on the superpixels documentation, you need to add idx = label2idx(L);

data = zeros(N,3);
for labelVal = 1:N
 redIdx = idx{labelVal};
 greenIdx = idx{labelVal}+numRows*numCols;
 blueIdx = idx{labelVal}+2*numRows*numCols;
 data(labelVal,1) = mean(Img(redIdx));
 data(labelVal,2) = mean(Img(greenIdx));
 data(labelVal,3) = mean(Img(blueIdx));
end

This is fine, it probably is fine to just write data(labelVal,1) = mean(Img(idx{1:N})), etc. but the performance should be similar.

Euc = zeros(N);
% Euclidean Distance
for i=1:N
 for j=1:N
 if glcms(i,j)~=0
 Euc(i,j)=sqrt(((data(i,1)-data(j,1))^2)+((data(i,2)-data(j,2))^2)+((data(i,3)-data(j,3))^2));
 end
 end
end

This can be replaced by Euc = pdist2(data,data).*(glcms~=0);

W=zeros(N);
W_num=zeros(N);
W_den=zeros(N);
OMG1=0.1;
for i=1:N
 for j=1:N
 if(Euc(i,j)~=0)
 W_num(i,j)=exp(-OMG1*(Euc(i,j)));
 W_den(i,i)=W_num(i,j)+W_den(i,i);
 end
 end
end

This can be replaced by W_num = exp(-OMG1*Euc).*(Euc~=0); and W_den = sum(W_num); (here W_den is a vector, not a matrix, but that's what you want because you were only using diagonal elements of your W_den matrix).

for i=1:N
 for j=1:N
 if(Euc(i,j)~=0)
 W(i,j)=(W_num(i,j))/(W_den(i,i)); % Connectivity Matrix W
 end
 end
end

This can be replace by W = W_num./W_den.'; and W(isnan(W)) = 0;.

s_star_temp=zeros(N); %temporary variable for geodesic distance measurement
W_sparse=zeros(N);
W_sparse=sparse(W);
for i=1:N
 for j=1:N
 if W(i,j)==0 & i~=j;
 s_star_temp(i,j)=graphshortestpath(W_sparse,i,j); % Geodesic Distance
 end
 end
end

Here you define W_sparse twice, you don't need to initialise it with zeros(N). This part is finding all shortest paths, and there is a builtin for that (which is probably a lot more efficient that generating the paths independently). Replace this whole section with s_star_temp = graphallshortestpaths(W).*(W==0); This seems to speed it up from ~15 seconds to 0.005 seconds.

So this is how I would write your algorithm

Img = imread('input.jpg');
[rows, columns, numberOfColorChannels] = size(Img);
[L,N] = superpixels(Img,250);
idx = label2idx(L);
%Identifying neighborhood relationships
glcms = graycomatrix(L,'NumLevels',N,'GrayLimits',[1,N],'Offset', [0,1;1,0]); %Create gray-level co-occurrence matrix from image
glcms = sum(glcms,3); % add together the two matrices
glcms = glcms + glcms.'; % add upper and lower triangles together, make it symmetric
glcms(1:N+1:end) = 0; % set the diagonal to zero, we don't want to see "1 is neighbor of 1"
data = zeros(N,3);
for labelVal = 1:N
 data(labelVal,1) = mean(Img(idx{labelVal}));
 data(labelVal,2) = mean(Img(idx{labelVal}+rows*columns));
 data(labelVal,3) = mean(Img(idx{labelVal}+2*rows*columns));
end
Euc = sparse(pdist2(data,data).*(glcms~=0));
OMG1 = 0.1;
W_num = exp(-OMG1*Euc).*(Euc~=0);
W_den = sum(W_num);
W = W_num./W_den.';
W(isnan(W)) = 0;
s_star_temp = graphallshortestpaths(W).*(W==0);

Notice that I made Euc a sparse matrix. This makes the resulting matrices (W_num, W_den, W) also sparse. Your original code left them as full matrices, which was unnecessary. This code runs in about 1.3 seconds on my machine, compared to about 16 for your original code (with the picture I used).

Minor general comments on your code:

Use spaces around your = signs, it makes things easier to read
Try to avoid very long lines, for example the graycomatrix line with its comment will go over the end of the window and be lost
Write more comments
Matlab best practice is to avoid using i and j for loop variables, prefer ii and jj. It's not very important, but i and j can also mean the imaginary unit.

Question 4

@CrisLuengo Euc is symmetric so it doesn't matter, but it does let you do W = W_num./W_den; on the next line.

Question 5

You're absolutely right, didn't notice that.

Question 6

Just a few tips and tricks to add to David's excellent answer.

data = zeros(N,3);
for labelVal = 1:N
 redIdx = idx{labelVal};
 greenIdx = idx{labelVal}+numRows*numCols;
 blueIdx = idx{labelVal}+2*numRows*numCols;
 data(labelVal,1) = mean(Img(redIdx));
 data(labelVal,2) = mean(Img(greenIdx));
 data(labelVal,3) = mean(Img(blueIdx));
end

Here, idx is linear indices into the first channel of Img. Img is a 3D array, with channels along the 3rd dimension. The indexing Img(redIdx), Img(greenIdx) and Img(blueIdx) simply index into idx, idx+number_of_pixels and idx+2*number_of_pixels. Instead, we can transform Img into a 2D array, where all pixels are along one dimension and channels along the other. Now idx indexes the first dimension, and Img(idx,:) are the pixels given by idx. The code above simplifies to:

data = zeros(N,3);
Img = reshape(Img,[],3); % note that this is essentially free, no data is copied.
for labelVal = 1:N
 data(labelVal,:) = mean(Img(idx{labelVal},:),1); % compute mean along 1st dimension
end

(Don't overwrite Img if you need it later, you can use a temporary variable instead, in either case there won't be any data copied.)

W_den=zeros(N);
for i=1:N
 for j=1:N
 if(Euc(i,j)~=0)
 %...
 W_den(i,i)=W_num(i,j)+W_den(i,i);
 end
 end
end

Here you only use the diagonal elements of W_den, the other N*(N-1) elements are never used. Instead, define W_den as a vector:

W_den = zeros(N,1);
for i=1:N
 for j=1:N
 if(Euc(i,j)~=0)
 %...
 W_den(i) = W_num(i,j) + W_den(i);
 end
 end
end

This saves a lot of memory, and also speeds up computation because of the reduced cache pressure.

Written this way, it is trival to see that we can rewrite:

W_den = zeros(N,1);
for i=1:N
 W_den(i) = sum(W_num(i,Euc(i,:)~=0));
end

But David already showed how to simplify this bit of code even further by using the knowledge that W_num is zero where Euc is zero. In this case, W_den = sum(W_num,2).

Euc=zeros(N);
% Euclidean Distance
for i=1:N
 for j=1:N
 if glcms(i,j)~=0
 Euc(i,j)=sqrt(((data(i,1)-data(j,1))^2)+((data(i,2)-data(j,2))^2)+((data(i,3)-data(j,3))^2));
 end
 end
end

David suggested using pdist, but we can also do this manually, which I show here to illustrate how to compute with matrices in MATLAB avoiding loops. I'll simplify the code step by step. First the expression within the inner loop can be simplified using vector operations:

Euc=zeros(N);
% Euclidean Distance
for i=1:N
 for j=1:N
 if glcms(i,j)~=0
 Euc(i,j)=sqrt(sum((data(i,:)-data(j,:)).^2));
 end
 end
end

Next we remove the if statement:

Euc=zeros(N);
% Euclidean Distance
for i=1:N
 for j=1:N
 Euc(i,j) = (glcms(i,j)~=0) * sqrt(sum((data(i,:)-data(j,:)).^2));
 end
end

Multiplying by the logical value glcms(i,j)~=0 will set any elements where glcms is zero to 0. Next we remove the inner loop:

Euc=zeros(N);
% Euclidean Distance
for i=1:N
 Euc(i,:) = (glcms(i,:)~=0) .* sqrt(sum((data(i,:)-data).^2,2));
end

Here we explicitly sum over the 2nd dimension, by default MATLAB sums over the first non-singleton dimension (dimension with more than one element). It is good practice to always explicitly state which dimension to sum over.

Finally, removing the outer loop is more complicated because it requires reshaping arrays:

% Euclidean Distance
Euc = (glcms~=0) .* sqrt(sum((reshape(data,[],1,3)-reshape(data,1,[],3)).^2,3));

We've reshaped data in two different ways, to a Nx1x3 array and to a 1xNx3 array. Subtracting these two arrays leads to a NxNx3 array. We take the square of the array, then sum along the new 3rd dimension (which corresponds to the original 2nd dimension).

David David 2411 silver badge6 bronze badges · Answer 1 · 2020-02-16 23:33:17Z

I'll first go through the different section and outline possible improvements, then give some general comments at the end. I am not checking the algorithm for correctness, as I don't know it.

Img=imread('input.jpg');
[rows, columns, numberOfColorChannels] = size(Img);
[L,N] = superpixels(Img,250);
%Identifying neighborhood relationships
glcms = graycomatrix(L,'NumLevels',N,'GrayLimits',[1,N],'Offset', [0,1;1,0]); %Create gray-level co-occurrence matrix from image
glcms = sum(glcms,3); % add together the two matrices
glcms = glcms + glcms.'; % add upper and lower triangles together, make it symmetric
glcms(1:N+1:end) = 0; % set the diagonal to zero, we don't want to see "1 is neighbor of 1"

You need to have defined idx by now, but you haven't, so the code doesn't run. Based on the superpixels documentation, you need to add idx = label2idx(L);

data = zeros(N,3);
for labelVal = 1:N
 redIdx = idx{labelVal};
 greenIdx = idx{labelVal}+numRows*numCols;
 blueIdx = idx{labelVal}+2*numRows*numCols;
 data(labelVal,1) = mean(Img(redIdx));
 data(labelVal,2) = mean(Img(greenIdx));
 data(labelVal,3) = mean(Img(blueIdx));
end

This is fine, it probably is fine to just write data(labelVal,1) = mean(Img(idx{1:N})), etc. but the performance should be similar.

Euc = zeros(N);
% Euclidean Distance
for i=1:N
 for j=1:N
 if glcms(i,j)~=0
 Euc(i,j)=sqrt(((data(i,1)-data(j,1))^2)+((data(i,2)-data(j,2))^2)+((data(i,3)-data(j,3))^2));
 end
 end
end

This can be replaced by Euc = pdist2(data,data).*(glcms~=0);

W=zeros(N);
W_num=zeros(N);
W_den=zeros(N);
OMG1=0.1;
for i=1:N
 for j=1:N
 if(Euc(i,j)~=0)
 W_num(i,j)=exp(-OMG1*(Euc(i,j)));
 W_den(i,i)=W_num(i,j)+W_den(i,i);
 end
 end
end

This can be replaced by W_num = exp(-OMG1*Euc).*(Euc~=0); and W_den = sum(W_num); (here W_den is a vector, not a matrix, but that's what you want because you were only using diagonal elements of your W_den matrix).

for i=1:N
 for j=1:N
 if(Euc(i,j)~=0)
 W(i,j)=(W_num(i,j))/(W_den(i,i)); % Connectivity Matrix W
 end
 end
end

This can be replace by W = W_num./W_den.'; and W(isnan(W)) = 0;.

s_star_temp=zeros(N); %temporary variable for geodesic distance measurement
W_sparse=zeros(N);
W_sparse=sparse(W);
for i=1:N
 for j=1:N
 if W(i,j)==0 & i~=j;
 s_star_temp(i,j)=graphshortestpath(W_sparse,i,j); % Geodesic Distance
 end
 end
end

Here you define W_sparse twice, you don't need to initialise it with zeros(N). This part is finding all shortest paths, and there is a builtin for that (which is probably a lot more efficient that generating the paths independently). Replace this whole section with s_star_temp = graphallshortestpaths(W).*(W==0); This seems to speed it up from ~15 seconds to 0.005 seconds.

So this is how I would write your algorithm

Img = imread('input.jpg');
[rows, columns, numberOfColorChannels] = size(Img);
[L,N] = superpixels(Img,250);
idx = label2idx(L);
%Identifying neighborhood relationships
glcms = graycomatrix(L,'NumLevels',N,'GrayLimits',[1,N],'Offset', [0,1;1,0]); %Create gray-level co-occurrence matrix from image
glcms = sum(glcms,3); % add together the two matrices
glcms = glcms + glcms.'; % add upper and lower triangles together, make it symmetric
glcms(1:N+1:end) = 0; % set the diagonal to zero, we don't want to see "1 is neighbor of 1"
data = zeros(N,3);
for labelVal = 1:N
 data(labelVal,1) = mean(Img(idx{labelVal}));
 data(labelVal,2) = mean(Img(idx{labelVal}+rows*columns));
 data(labelVal,3) = mean(Img(idx{labelVal}+2*rows*columns));
end
Euc = sparse(pdist2(data,data).*(glcms~=0));
OMG1 = 0.1;
W_num = exp(-OMG1*Euc).*(Euc~=0);
W_den = sum(W_num);
W = W_num./W_den.';
W(isnan(W)) = 0;
s_star_temp = graphallshortestpaths(W).*(W==0);

Notice that I made Euc a sparse matrix. This makes the resulting matrices (W_num, W_den, W) also sparse. Your original code left them as full matrices, which was unnecessary. This code runs in about 1.3 seconds on my machine, compared to about 16 for your original code (with the picture I used).

Minor general comments on your code:

Use spaces around your = signs, it makes things easier to read
Try to avoid very long lines, for example the graycomatrix line with its comment will go over the end of the window and be lost
Write more comments
Matlab best practice is to avoid using i and j for loop variables, prefer ii and jj. It's not very important, but i and j can also mean the imaginary unit.

@CrisLuengo Euc is symmetric so it doesn't matter, but it does let you do W = W_num./W_den; on the next line.

Cris Luengo Cris Luengo 7,0011 gold badge14 silver badges37 bronze badges · Answer 2 · 2020-02-18 17:59:29Z

Just a few tips and tricks to add to David's excellent answer.

data = zeros(N,3);
for labelVal = 1:N
 redIdx = idx{labelVal};
 greenIdx = idx{labelVal}+numRows*numCols;
 blueIdx = idx{labelVal}+2*numRows*numCols;
 data(labelVal,1) = mean(Img(redIdx));
 data(labelVal,2) = mean(Img(greenIdx));
 data(labelVal,3) = mean(Img(blueIdx));
end

Here, idx is linear indices into the first channel of Img. Img is a 3D array, with channels along the 3rd dimension. The indexing Img(redIdx), Img(greenIdx) and Img(blueIdx) simply index into idx, idx+number_of_pixels and idx+2*number_of_pixels. Instead, we can transform Img into a 2D array, where all pixels are along one dimension and channels along the other. Now idx indexes the first dimension, and Img(idx,:) are the pixels given by idx. The code above simplifies to:

data = zeros(N,3);
Img = reshape(Img,[],3); % note that this is essentially free, no data is copied.
for labelVal = 1:N
 data(labelVal,:) = mean(Img(idx{labelVal},:),1); % compute mean along 1st dimension
end

(Don't overwrite Img if you need it later, you can use a temporary variable instead, in either case there won't be any data copied.)

W_den=zeros(N);
for i=1:N
 for j=1:N
 if(Euc(i,j)~=0)
 %...
 W_den(i,i)=W_num(i,j)+W_den(i,i);
 end
 end
end

Here you only use the diagonal elements of W_den, the other N*(N-1) elements are never used. Instead, define W_den as a vector:

W_den = zeros(N,1);
for i=1:N
 for j=1:N
 if(Euc(i,j)~=0)
 %...
 W_den(i) = W_num(i,j) + W_den(i);
 end
 end
end

This saves a lot of memory, and also speeds up computation because of the reduced cache pressure.

Written this way, it is trival to see that we can rewrite:

W_den = zeros(N,1);
for i=1:N
 W_den(i) = sum(W_num(i,Euc(i,:)~=0));
end

But David already showed how to simplify this bit of code even further by using the knowledge that W_num is zero where Euc is zero. In this case, W_den = sum(W_num,2).

Euc=zeros(N);
% Euclidean Distance
for i=1:N
 for j=1:N
 if glcms(i,j)~=0
 Euc(i,j)=sqrt(((data(i,1)-data(j,1))^2)+((data(i,2)-data(j,2))^2)+((data(i,3)-data(j,3))^2));
 end
 end
end

David suggested using pdist, but we can also do this manually, which I show here to illustrate how to compute with matrices in MATLAB avoiding loops. I'll simplify the code step by step. First the expression within the inner loop can be simplified using vector operations:

Euc=zeros(N);
% Euclidean Distance
for i=1:N
 for j=1:N
 if glcms(i,j)~=0
 Euc(i,j)=sqrt(sum((data(i,:)-data(j,:)).^2));
 end
 end
end

Next we remove the if statement:

Euc=zeros(N);
% Euclidean Distance
for i=1:N
 for j=1:N
 Euc(i,j) = (glcms(i,j)~=0) * sqrt(sum((data(i,:)-data(j,:)).^2));
 end
end

Multiplying by the logical value glcms(i,j)~=0 will set any elements where glcms is zero to 0. Next we remove the inner loop:

Euc=zeros(N);
% Euclidean Distance
for i=1:N
 Euc(i,:) = (glcms(i,:)~=0) .* sqrt(sum((data(i,:)-data).^2,2));
end

Here we explicitly sum over the 2nd dimension, by default MATLAB sums over the first non-singleton dimension (dimension with more than one element). It is good practice to always explicitly state which dimension to sum over.

Finally, removing the outer loop is more complicated because it requires reshaping arrays:

% Euclidean Distance
Euc = (glcms~=0) .* sqrt(sum((reshape(data,[],1,3)-reshape(data,1,[],3)).^2,3));

We've reshaped data in two different ways, to a Nx1x3 array and to a 1xNx3 array. Subtracting these two arrays leads to a NxNx3 array. We take the square of the array, then sum along the new 3rd dimension (which corresponds to the original 2nd dimension).

Stack Exchange Network

Optimizing nested loop for calculating geodesic distance

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Optimizing nested loop for calculating geodesic distance

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions