I have a very big Matlab simulation project in my hands, which I wanted to optimize, since I'm running it many times to tune parameters and the like.
Using Matlab's profile
I identified one function that is eating up most of my time, specifically the output(i,1)
assignment line.
This function is called a LOT, where input
is a 10x1
double passed as an argument, and output
is also a 10x1
vector.
function output = my_function(input)
a = size(input,1);
output = input*0;
dens = density(input);
% for each i, output(i) is the maximum between output(i+1) and mean(output(i+1:end))
for i = 1:a-1
output(i,1)= max(mean(dens(i+1:a,1)),dens(i+1,1));
end
output(a,1) = dens(a,1);
end
density()
function:
function dens = density(input)
dens = 1000*(1 - (input+288.9414)./(508929.2.*(input+68.12963)).*(input-3.9863).^2);
end
My ideas:
- I think vectorization would maybe help to get rid of the loop, but I'm not familiar at all with the technique.
- Is there a faster/alternative way to calculate the
mean
(maybe without Matlab's built-in function call?)
3 Answers 3
Vectorization
If you look at all the calls to mean
, you see that what you are calculating in each loop is a "Cumulative moving average", but in the oposite direction of what one usually would do.
A cumulative moving average can be calculated quite simply using the cumulative sum, cumsum
, and dividing by the number of elements. Since you're starting with all elements and up with only one, we need to flip it.
The running mean, (or cumulative moving average) can be calculated like this:
- First flip
dens
upside down - Take the cumulative sum of it
- Divide each element by the number of elements counted
- Flip it back up
Thus:
running_mean = flipud(cumsum(flipud(dens))./(1:numel(dens)).');
Now, we need to find which element is larger, the cumulative sum, or the original density variable. You are not including the first element of the cumulative sum, and you are appending dens(end)
at the end of the vector.
The maximum value of each row of a matrix can be found using max(A,[],2)
, where the number two represent the second dimension (columns). So, the values we want to use are:
max([running_mean(2:end), dens(2:end)],[],2)
In addition, we need to append the last element, so:
output = [max([running_mean(2:end), dens(2:end)],[],2); dens(end)];
To summarize, the entire operation requires no loops, one call to cumsum
, and one call to max
. In addition we need to flip it twice, but that takes virtually no time.
In total, your function can be written like this:
function output = my_function(vector)
n = numel(vector); % numel, not length
output = zeros(n,1) % prior to R2015b, use: output(n,1) = 0
% Create an anonymous function called density:
density = @(vector) 1000*(1 - (vector+288.9414)./(508929.2.*(vector+68.12963)).*(vector-3.9863).^2);
% Calculate the density:
dens = density(vector);
% Calculate the running mean, and create the final output vector:
running_mean = flipud(cumsum(flipud(dens))./(1:numel(dens)).');
output = [max([running_mean(2:end), dens(2:end)],[],2); dens(end)];
end
Time in Octave shows the improvement, as a function of vector size. For vectors with only ten elements, the execution time is 25% of the original code.
Review of your code:
Don't use
input
as a variable name!input
is used to ask users for a command line input. Using it as a variable name makes the function useless, but will also potentially create strange errors or warnings. (I called itvector
in this review. You should pick a more descriptive name)The same goes with
max
,mean
,length
etc. Don't use function names as variable names.Your input is a vector, so no need to specify that you always want the first column. Use linear indexing instead of subscripts. It's much faster, and you won't mess up the dimensions (oops, it was a column vector, not a row vector).
Don't use
size(vector,1)
, and don't uselength(vector)
, usenumel(vector)
. It's faster and more robust.You are pre-allocating
output
. Very good! That's a common bottleneck.output = vector*0;
is clever. Prior to Matlab R2015bzeros(n,1)
was actually far from the fastest way to create a vector of zeros! In fact,output(n,1) = 0
was up to 500 times faster! This is becausezeros
creates a matrix, then fills the elements with zeros, where as the latter just creates a matrix, where the values are zero by default.In your loop, don't use
i
as the iterator, it is used for the imaginary unit (sqrt(-1)
). It's common to use for instanceii
,jj
etc.If your
density
function is only what you have posted there, then you can simply create an anonymous function inside the main function:density = @(vector) 1000*(1 - (vector+288.9414)./(508929.2.*(vector+68.12963)).*(vector-3.9863).^2);
Call it like:
dens = density(vector);
Review of reviews:
There are a few things I don't like about the other reviews that I want to point out:
"You don't need to specify that
1
when indexing a vector, MATLAB knows". Many Matlab users usemax(A)
, when they want to find the maximum value of all columns in a matrix. It's much simpler thanmax(A, [], 1)
. You might however come into trouble ifA
has varying number of rows (and matrices often do). Because the moment that matrix has only one row,max(A)
will give the maximum value of the entire vector, instead of each (one row) column. This can give you a major headache!"Why not use
length
?" Well, in this case,length
is much better thansize
, butnumel
should still be the preferred choice.length
is better thansize
for two reasons. Usingsize
limits the function to only work on column vectors only, and will fail on row vectors. Second,numel
is much faster thanlength
for large vectors, and both are much faster thansize
for all vectors."Don't use the profiler, use
tic/toc
". True, the profiler includes the time it takes to run itself, but it's still the easiest way to identify bottlenecks. For exact timing however, don't usetic/toc
.timeit
was introduced in Matlab R2013b, and is much better thantic/toc
. If you are using an older version, the function is available at the File Exchange.tic/toc
is also dangerous, because it will reset the timer iftic
is called inside another function. Have a look at this question on SO from three years ago.
-
1\$\begingroup\$ That's a major improvement, nice answer! \$\endgroup\$2016年07月12日 14:06:05 +00:00Commented Jul 12, 2016 at 14:06
-
1\$\begingroup\$ Although it has been a long time since I touched this Matlab code, it is a very instructional and detailed answer. Very much appreciated! \$\endgroup\$titus.andronicus– titus.andronicus2016年07月12日 14:08:49 +00:00Commented Jul 12, 2016 at 14:08
A few notes:
I'm confused about the design of your function. If you are only processing column vectors, why not just process a row vector instead? That's what I would expect when calling this function.
my_function([1:2:20]') % how I have to call your function; convoluted and unexpected my_function(1:2:20) % how I expect to call your function; cleaner
Vectorization is definitely the way to go for efficiency in MatLab. One way to vectorize code is to use the
:
operator, which it seems you know how to use.Unfortunately, we can not nest colon operators in Matlab, so I do not know of a way to vectorize your code. Break apart how the colon operator works and how your code works to see why
% (i + 1):a where i is 1:(a - 1) [a is the length, 10 in this case] 2 3 4 5 6 7 8 9 10 % 2:10 3 4 5 6 7 8 9 10 % 3:10 4 5 6 7 8 9 10 % 4:10 5 6 7 8 9 10 % 5:10 % and so on % (2:10):10 2 3 4 5 6 7 8 9 10 % doesn't quite work... have to use a loop
There may be some other way to accomplish what you are trying to do, but I can't think of a different and more efficient way to approach it
You should be putting brackets around the return value(s) in your function declaration. They can be omitted, but it's better practice to keep them there.
Your current way of getting the size of your vector isn't how I would go about it.
a = size(input,1);
You are always going to have one of the dimensions be of size
1
, correct? So why not use thelength()
function instead, which always returns the greatest of the number of dimensions passed to it.a = length(input);
You have a clever way of getting a zeroed out vector of the same length.
output = input*0;
I'm not sure how the performance would compare, but using the function
zeros()
may be faster.You don't need to specify that
1
when indexing a vector, MatLab knows.a = size(input,1); a = size(input); % almost the same, is the same for your purposes
Don't use MatLab's profiler to measure the performance of your code. Why? Because in measuring the code, it also happens to measure itself running the profiling tests and compiling the results, which can take up quite some time on some machines (including mine).
The more accepted way to measure how long code takes to run is to use
tic
andtoc
, as such:>> tic; my_function([1:2:20]'); toc; Elapsed time is 0.000976 seconds.
Make sure you put them on the same line though like I did above, or it will also count the time it takes you to type out the function and execute it, as well as typing out
toc
and executing that.>> tic; >> my_function([1:2:20]'); >> toc; Elapsed time is 13.184153 seconds.
-
\$\begingroup\$ I've edited the question with a solution proposal. Not sure why the residual happens... \$\endgroup\$titus.andronicus– titus.andronicus2014年12月08日 11:48:18 +00:00Commented Dec 8, 2014 at 11:48
Warning: I don't have Matlab installed and haven't done much with it, so my syntax may be off.
% for each i, output(i) is the maximum between output(i+1) and mean(output(i+1:end))
for i = 1:a-1
output(i,1)= max(mean(dens(i+1:a,1)),dens(i+1,1));
end
output(a,1) = dens(a,1);
If I understand that correctly, the %
line is a comment. Then you have a for
loop from 1
to a-1
where you assign all but the last entry of output
. After the loop finishes, you assign the last entry of output.
The downside of this approach is that you have to sum up and divide the array for every entry. I think it would be better to do the end first and keep a running sum. I'm not sure of syntax, but I would expect that to look something like
count = 1;
sum = dens(a,1);
output(a,1) = dens(a,1);
for i = a-1:-1:2
output(i,1) = max((sum / count), dens(i+1,1));
sum = sum + dens(i,1);
count = count + 1;
end
output(1,1) = max(sum / count), dens(2,1));
Again, apologies if my syntax doesn't work. Hopefully this illustrates what I'm trying to do.
I stuck with the syntax that you were using in case I didn't understand syb0rg's comments properly.
Note: if you do this, don't forget to re-profile. It may be that there is an optimization on calculating the mean that outstrips the theoretical advantage of this approach.
-
\$\begingroup\$ The main point that bugged me with the question was that the mean function is called all the time, and it could probably be computed beforehand, to make the main loop faster. Your approach is similar to what I had in mind. \$\endgroup\$Gürkan Çetin– Gürkan Çetin2015年07月15日 20:34:50 +00:00Commented Jul 15, 2015 at 20:34
density
function defined? It isn't defined in the standard MatLab functions \$\endgroup\$density
is an external function I declared. It is not relevant for my question, asdens
is also a10x1
double vector \$\endgroup\$output(i,1)= max(mean(dens(i+1:a,1)),dens(i+1,1));
, thus the vectorization I mentioned \$\endgroup\$density
does matters because you might be able to calculateoutput(i,1)
from values inoutput
rather than taking a mean of values indens
. My offhand thought is that you are iterating in the wrong direction. Given what you are doing, you should iterate down not up and calculate the mean as you go. Someone like syb0rg with more knowledge of Matlab could probably give you better help if you offered a working example with all code. \$\endgroup\$density
function in the last edit \$\endgroup\$