This is a follow-up question to "Calculate distances between two multi-dimensional arrays in Matlab". Given a set $S_X$
$$ S_{X} = \{ { X_{1}, X_{2}, X_{3}, \dots, X_{n} } \} $$
where $n$ is the number of elements in $S_X$ (its cardinality), there are $\binom{n}{2}$ pairwise distances between the elements of $S_X$.
The mean and variance of these distances can be calculated with the AverageIntraEuclideanDistancesPar and VarIntraEuclideanDistancesPar functions.
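For reference, the two quantities can be written out as follows. This is only a sketch of the intended definitions: the symbols $d$, $\mu$ and $\sigma^2$ are notation introduced here, and the sums are assumed to run over distinct pairs $i < j$, as the $\binom{n}{2}$ normalization suggests.

$$ \mu = \frac{1}{\binom{n}{2}} \sum_{1 \le i < j \le n} d(X_i, X_j), \qquad \sigma^2 = \frac{1}{\binom{n}{2}} \sum_{1 \le i < j \le n} \left( d(X_i, X_j) - \mu \right)^2 $$

Here $d(X_i, X_j)$ stands for the Euclidean distance between two elements, i.e. what EuclideanDistance returns.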
Example
%% Preparing data
DataCount = 10;
sizex = 8;
sizey = 8;
sizez = 8;
Collection = ones(sizex, sizey, sizez, DataCount);
for i = 1:DataCount
Collection(:, :, :, i) = ones(sizex, sizey, sizez) .* i;
end
%% Function testing
AIED = AverageIntraEuclideanDistancesPar(Collection)
VIED = VarIntraEuclideanDistancesPar(Collection)
The output of the example above:
AIED =
82.9672
VIED =
4.0328e+03
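As a sanity check on the mean, the value can be reproduced by hand. This assumes EuclideanDistance is the usual L2 norm of the difference (SquaredEuclideanDistance is not shown in this post). Element $i$ is an $8 \times 8 \times 8$ array filled with the value $i$, so it has 512 entries:

$$ d(X_i, X_j) = \sqrt{512\,(i - j)^2} = \sqrt{512}\,\lvert i - j \rvert, \qquad \sum_{1 \le i < j \le 10} \lvert i - j \rvert = \sum_{k=1}^{9} k\,(10 - k) = 165 $$

$$ \mu = \frac{\sqrt{512} \cdot 165}{\binom{10}{2}} = \frac{22.6274 \cdot 165}{45} \approx 82.9672 $$

This agrees with the AIED value printed above.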
The experimental implementation
AverageIntraEuclideanDistancesPar function:
function [output] = AverageIntraEuclideanDistancesPar(X1)
    D = 0;
    for i = 1:size(X1, 4)
        element1 = X1(:, :, :, i);
        parfor j = i:size(X1, 4)
            element2 = X1(:, :, :, j);
            D = D + EuclideanDistance(element1, element2);
        end
    end
    k = 2;
    NormalizationFactor = (1 / nchoosek(size(X1, 4), k));
    output = NormalizationFactor * D;
end
VarIntraEuclideanDistancesPar function:
function [output] = VarIntraEuclideanDistancesPar(X1)
    Avg = AverageIntraEuclideanDistancesPar(X1);
    D = 0;
    for i = 1:size(X1, 4)
        element1 = X1(:, :, :, i);
        for j = i:size(X1, 4)
            element2 = X1(:, :, :, j);
            D = D + (EuclideanDistance(element1, element2) - Avg)^2;
        end
    end
    k = 2;
    NormalizationFactor = (1 / nchoosek(size(X1, 4), k));
    output = NormalizationFactor * D;
end
EuclideanDistance function:
function [output] = EuclideanDistance(X1, X2)
%EUCLIDEANDISTANCE Calculate Euclidean distance between two inputs
    if ~isequal(size(X1), size(X2))
        error('Sizes of inputs are not equal!')
    end
    output = sqrt(SquaredEuclideanDistance(X1, X2));
end
All suggestions are welcome.
The summary information:
Which question is it a follow-up to?
Calculate distances between two multi-dimensional arrays in Matlab
What changes have been made in the code since the last question?
I am trying to implement the AverageIntraEuclideanDistancesPar and VarIntraEuclideanDistancesPar functions in this post.
Why is a new review being asked for?
If there is any possible improvement, please let me know.
1 Answer
My only comment here is that
AIED = AverageIntraEuclideanDistancesPar(Collection)
VIED = VarIntraEuclideanDistancesPar(Collection)
computes the average twice, since VarIntraEuclideanDistancesPar also calls AverageIntraEuclideanDistancesPar. I would suggest writing a function that returns both the average and the variance.
That said, your way of computing the variance is precise but expensive, because it computes every distance twice: first to determine the mean, and then again to determine the variance. I suggest you read the Wikipedia article "Algorithms for calculating variance". Depending on the properties of the input, either the naive algorithm or Welford’s algorithm could be used.
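A minimal sketch of what such a combined function might look like, using Welford’s single-pass update so each distance is computed only once. The name IntraEuclideanDistanceStats is hypothetical, and norm on the flattened difference stands in for the post's EuclideanDistance so that the block is self-contained. The loop here visits only distinct pairs (j starts at i + 1), which is an assumption about the intended definition: the posted loops start at j = i, and each of those n zero-distance self-pairs adds Avg^2 to the variance sum.

```matlab
function [avg, variance] = IntraEuclideanDistanceStats(X1)
% Hypothetical helper (not from the post): mean and population variance
% of all pairwise Euclidean distances along the 4th dimension of X1,
% computed in one pass with Welford's update.
    n = size(X1, 4);
    count = 0;   % number of distances processed so far
    avg = 0;     % running mean of the distances
    M2 = 0;      % running sum of squared deviations from the mean
    for i = 1:(n - 1)
        element1 = X1(:, :, :, i);
        for j = (i + 1):n
            element2 = X1(:, :, :, j);
            % Assumed equivalent to EuclideanDistance(element1, element2)
            d = norm(element1(:) - element2(:));
            count = count + 1;
            delta = d - avg;
            avg = avg + delta / count;
            M2 = M2 + delta * (d - avg);
        end
    end
    % Divide by the number of pairs, i.e. nchoosek(n, 2); NaN if n < 2
    variance = M2 / count;
end
```

Called as [AIED, VIED] = IntraEuclideanDistanceStats(Collection) on the example data, this should give the same mean (82.9672) but a variance of about 2.5031e+03 rather than 4.0328e+03, the difference being the self-pair terms. Note that the running update is sequential, so this form trades the parfor for a single pass over the distances.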
Three more details:
- VarIntraEuclideanDistancesPar has "par" in the name, but doesn’t actually do its computation in parallel.
- parfor should ideally be the outer loop, not the inner one. Starting up a parallel computation has overhead, and you want to limit this overhead as much as possible.
- Reshaping the arrays to be 2D would make your indexing operations simpler and easier to read. Reshaping is an essentially free operation.
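The last two points could be combined roughly as follows. This is a sketch, not a drop-in replacement: the function name AverageIntraEuclideanDistancesReshaped is made up for illustration, the distance is assumed to be the plain L2 norm, only distinct pairs are visited, and the column-wise subtraction relies on implicit expansion (R2016b or later; bsxfun would be needed on older releases).

```matlab
function [output] = AverageIntraEuclideanDistancesReshaped(X1)
% Hypothetical variant (not from the post): mean pairwise Euclidean
% distance, with the data reshaped to 2D and parfor as the outer loop.
    n = size(X1, 4);
    X2 = reshape(X1, [], n);   % each column is one flattened element
    D = 0;                     % reduction variable accumulated by parfor
    parfor i = 1:(n - 1)
        % Differences between element i and every later element
        diffs = X2(:, (i + 1):n) - X2(:, i);
        % Column-wise L2 norms, summed into the running total
        D = D + sum(sqrt(sum(diffs .^ 2, 1)));
    end
    output = D / nchoosek(n, 2);
end
```

With the reshape, each element is a single column, so the 4D indexing disappears and the inner loop becomes one vectorized expression. Without the Parallel Computing Toolbox the parfor simply runs as an ordinary loop. If the Statistics and Machine Learning Toolbox is available, mean(pdist(X2.')) and var(pdist(X2.'), 1) might be an even shorter route, since pdist treats rows as observations and returns all pairwise distances.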