Given a dictionary containing multiple X-Y pairs, where X and Y are both three-dimensional arrays, the dictionaryBasedNonlocalMean function returns an output which is a weighted sum of the Y entries in the dictionary. The weight of each Y is based on the Manhattan distance between the input of dictionaryBasedNonlocalMean and the corresponding X in the dictionary. In mathematical form, the output is calculated as follows.
$$output = K\sum_{{i = 1}}^{{N}_{D}} \left[ G_{\mu = 0, \sigma} (\left\|input - X(i)\right\|_{1}) \cdot Y(i) \right] $$
where $N_D$ is the number of X-Y pairs (the cardinality of set X and set Y) in the dictionary, and the Gaussian function is
$$G_{\mu = 0, \sigma} (\left\|input - X(i)\right\|_{1}) = \left.e^{\frac{-(\left\|input - X(i)\right\|_{1} - \mu)^2}{2 {\sigma}^{2}}}\right|_{\mu=0} $$
Moreover, K is a normalization factor
$$K = \frac{1}{\sum_{{i = 1}}^{{N}_{D}} G_{\mu = 0, \sigma} (\left\|input - X(i)\right\|_{1})} $$
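Written out in MATLAB (a minimal sketch; `d` and `sigma` here are placeholder names for the distance and Gaussian width in the formulas above), the weight of a single pair is:

```matlab
% Weight of pair i, with mu = 0:
% d = ||input - X(i)||_1 (Manhattan distance), sigma = Gaussian width
w_i = exp(-d^2 / (2 * sigma^2));
% output = (sum over i of w_i * Y(i)) / (sum over i of w_i)
```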
The experimental implementation

ManhattanDistance:

    function output = ManhattanDistance(X1, X2)
        output = sum(abs(X1 - X2), 'all');
    end
dictionaryBasedNonlocalMean:

    function [output] = dictionaryBasedNonlocalMean(Dictionary, input)
        gaussian_sigma = 0.1;
        gaussian_mean = 0;
        if size(Dictionary.X) ~= size(Dictionary.Y)
            disp("Size of data in dictionary incorrect.");
            output = [];
            return
        end
        [X, Y, Z, DataCount] = size(Dictionary.X);
        weightOfEach = zeros(1, DataCount);
        for i = 1:DataCount
            % Gaussian of distance between X and input
            weightOfEach(i) = gaussmf(ManhattanDistance(input, Dictionary.X(:, :, :, i)), [gaussian_sigma gaussian_mean]);
        end
        sumOfDist = sum(weightOfEach, 'all');
        output = zeros(X, Y, Z);
        %%% if sumOfDist too small
        if (sumOfDist < 1e-160)
            fprintf("sumOfDist = %d\n", sumOfDist);
            return;
        end
        for i = 1:DataCount
            output = output + Dictionary.Y(:, :, :, i) .* weightOfEach(i);
        end
        output = output ./ sumOfDist;
    end
Test case
    ND = 10;
    xsize = 8;
    ysize = 8;
    zsize = 1;
    Dictionary = CreateDictionary(ND, xsize, ysize, zsize);
    output = dictionaryBasedNonlocalMean(Dictionary, ones(xsize, ysize, zsize) .* 0.66)

    function Dictionary = CreateDictionary(ND, xsize, ysize, zsize)
        Dictionary.X = zeros(xsize, ysize, zsize, ND);
        Dictionary.Y = zeros(xsize, ysize, zsize, ND);
        for i = 1:ND
            Dictionary.X(:, :, :, i) = ones(xsize, ysize, zsize) .* (i / ND);
            Dictionary.Y(:, :, :, i) = ones(xsize, ysize, zsize) .* (1 + i / ND);
        end
    end
The output of the test code above:

    output =

        1.6622    1.6622    1.6622    1.6622    1.6622    1.6622    1.6622    1.6622
        1.6622    1.6622    1.6622    1.6622    1.6622    1.6622    1.6622    1.6622
        1.6622    1.6622    1.6622    1.6622    1.6622    1.6622    1.6622    1.6622
        1.6622    1.6622    1.6622    1.6622    1.6622    1.6622    1.6622    1.6622
        1.6622    1.6622    1.6622    1.6622    1.6622    1.6622    1.6622    1.6622
        1.6622    1.6622    1.6622    1.6622    1.6622    1.6622    1.6622    1.6622
        1.6622    1.6622    1.6622    1.6622    1.6622    1.6622    1.6622    1.6622
        1.6622    1.6622    1.6622    1.6622    1.6622    1.6622    1.6622    1.6622
All suggestions are welcome.
- Can you explain how this differs from the standard implementation? Usually you’d search the local neighborhood of a point for similar points. Why do you call this a dictionary, if you’re not indexing with a key? – Cris Luengo, Apr 6 at 16:55
- @CrisLuengo As far as I know, the common non-local mean algorithm does not separate X-Y pairs. The formulation is like $$output = K\sum_{i = 1}^{N_D} \left[ G_{\mu = 0, \sigma} (\left\|input - X(i)\right\|_{1}) \cdot X(i) \right] $$ – JimmyHu, Apr 7 at 7:06
1 Answer
This function, dictionaryBasedNonlocalMean(), takes a "dictionary" of patches and an input patch, and returns the weighted mean of the dictionary patches. This dictionary contains two sets of patches: the ones that are compared to the input to determine the weight, and the ones averaged over. I do not know what the purpose is of making these different. Normally, this function would be called for each input pixel in an image, where the input patch would be the local neighborhood of that pixel.
That the set of sample patches is called a "dictionary" here is awkward, since we expect a dictionary to be a data structure where you index with a key rather than an integer. In more recent versions of MATLAB we now even have a dictionary type (introduced after you wrote this code), which makes the naming even more confusing. But I assume that the paper you got this idea from uses this name.
The comparison size(Dictionary.X) ~= size(Dictionary.Y) will produce an error if these two elements have different dimensionality (the size arrays have different lengths). Instead, use isequal(size(Dictionary.X), size(Dictionary.Y)), which will be true only if the two inputs have the same size and the same values.
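To see the difference, compare the two checks on arrays of different dimensionality (hypothetical arrays A and B):

```matlab
A = zeros(2, 3);
B = zeros(2, 3, 4);
isequal(size(A), size(B))   % false, no error: compares [2 3] with [2 3 4]
% size(A) ~= size(B)        % errors: the two size vectors have different lengths
```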
Note that the input patch should have the same size as the dictionary entries, except for the fourth (data count) dimension. You could do

    dataCount = size(Dictionary.X, 4);
    patchSize = size(input, 1:3);
    if ~isequal(size(Dictionary.X), [patchSize, dataCount]) || ...
            ~isequal(size(Dictionary.Y), [patchSize, dataCount]) || ...
            ~isequal(size(input), patchSize)
        ...
    end
The last test there is to ensure that input doesn't have more than 3 dimensions. Note that these tests will also work for 2D patches.
Note that I'm using patchSize rather than [X, Y, Z] here. It's easier with a single variable, but mostly I don't like using X and Y with two different meanings (you use the same names inside the dictionary). Also, the first dimension is vertical in MATLAB, so it should be called Y. Note that when the MATLAB documentation uses x and y, they refer to the horizontal and vertical axes respectively; they use i and j to refer to the first and second array dimensions.
Other than that, this is perfectly fine MATLAB code. You can think about vectorizing things, which might make the code a bit simpler, but could slow things down if the dictionary is large (because we'd be using large intermediate arrays, and memory access is expensive) -- you only know what is fastest if you try both ways.
To vectorize, first have ManhattanDistance() sum over the first three dimensions instead of all dimensions. You can now compute

    weightOfEach = gaussmf(ManhattanDistance(input, Dictionary.X), ...);

without a loop.
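The modified distance function might look like this (a sketch; it relies on implicit expansion, available since R2016b, and the vector-of-dimensions form of sum, available since R2018b):

```matlab
% Variant of ManhattanDistance that sums over dimensions 1:3 only,
% so an [x,y,z] input broadcast against [x,y,z,N] data yields a 1x1x1xN result.
function output = ManhattanDistance(X1, X2)
    output = sum(abs(X1 - X2), 1:3);
end
```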
Likewise, the output can then be computed as

    output = sum(Dictionary.Y .* shiftdim(weightOfEach, -3), 4);

The shiftdim here might need to be adjusted depending on the shape of weightOfEach, but the idea is to turn it into a 1x1x1xdataCount array.
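Putting both changes together, the whole vectorized body could be sketched as follows (hypothetical name dictionaryBasedNonlocalMeanVec; it computes the Gaussian directly with exp instead of gaussmf to avoid the Fuzzy Logic Toolbox dependency, and assumes the distance comes out as 1x1x1xN so no shiftdim is needed):

```matlab
function output = dictionaryBasedNonlocalMeanVec(Dictionary, input)
    gaussian_sigma = 0.1;
    % L1 distances to all N dictionary entries at once: 1x1x1xN
    dist = sum(abs(input - Dictionary.X), 1:3);
    % Gaussian weights with mu = 0, computed directly from the formula
    weightOfEach = exp(-dist.^2 / (2 * gaussian_sigma^2));
    sumOfDist = sum(weightOfEach, 'all');
    % weightOfEach is already 1x1x1xN, so it broadcasts against Dictionary.Y
    output = sum(Dictionary.Y .* weightOfEach, 4) ./ sumOfDist;
end
```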