There is a function which is getting maximum values of each period
-length interval in array.
I am trying to optimize this function by performance.
void maximumsT(const std::vector<double> &src, size_t period, std::vector<double> &dst)
{
dst.resize(src.size());
for (size_t i = 1; i <= period; ++i) {
dst[i - 1] = *std::max_element(src.begin(), src.begin() + i);
}
for (size_t i = period; i <= src.size(); ++i) {
dst[i - 1] = *std::max_element(src.begin() + i - period, src.begin() + i);
}
}
I asked this SO question
and got this solution.
Then I tried to implement the solution.
#include <vector>
#include <deque>
#include <algorithm>
void maximumsT(const std::vector<double> &src, size_t period, std::vector<double> &dst)
{
dst.resize(src.size());
for (size_t i = 1; i <= period; ++i) {
dst[i - 1] = *std::max_element(src.begin(), src.begin() + i);
}
for (size_t i = period; i <= src.size(); ++i) {
dst[i - 1] = *std::max_element(src.begin() + i - period, src.begin() + i);
}
}
void maximums(const std::vector<double> &src, size_t period, std::vector<double> &dst)
{
dst.resize(src.size());
std::deque<std::pair<size_t, double>> deq;
for (size_t i = 0; i < src.size(); ++i) {
if (!deq.empty() && i >= period && deq.front().first <= i - period) {
deq.pop_front();
}
while (!deq.empty() && deq.back().second < src[i]) {
deq.pop_back();
}
deq.push_back(std::make_pair(i, src[i]));
dst[i] = deq.front().second;
}
}
bool test()
{
std::vector<double> v(1 + rand() % 1000);
for (double &e : v) {
e = (double)rand() / RAND_MAX;
}
size_t period = 1 + (rand() % v.size());
std::vector<double> vv, vvv;
maximumsT(v, period, vv);
maximums (v, period, vvv);
return vv == vvv;
}
bool test(size_t iterations)
{
for (size_t i = 0; i < iterations; ++i) {
if (!test()) {
return false;
}
}
return true;
}
int main(int argc, char *argv[])
{
return test(10000);
}
How can I improve and optimize this code?
2 Answers 2
The code in maximums
is doing what it is supposed to do, based on that SO question. I have several remarks about the other parts though.
Do not use single letter variables. Use names that say something.
So no v, vv, vvv
.
Another naming issue, your two maximums functions do the same thing but with a different approach, you can make this clear in the name. They both use T=period, so this is not distinguishing. Maybe use the fact that a deque is used or that one is the simple variant.
You create double vectors in the test and then pass them to maximums
for example. There is absolutely no reason for test
to create them. Let maximums
create the vector, since it needs it and then return it. Some for maximumsT
.
Your maximumsT
is not intuitive. There is no need to do a double loop.
Why did you change your implementation as in your question on SO? It was better there.
Namely
for (size_t i = period; i <= v.size(); ++i) {
vv[i - 1] = *std::max_element(v.begin() + i - period, v.begin() + i);
}
The code will not be faster or save space, but its shorter and clearer.
Initializing random vectors to test against, could be done in a dedicated function. I would make the test also parameterized on the source vector.
My suggestion:
#include <vector>
#include <deque>
#include <algorithm>
std::vector<double> maximumsT(const std::vector<double> &src, size_t period)
{
std::vector<double> dst(src.size());
for (size_t i = period; i <= v.size(); ++i) {
vv[i - 1] = *std::max_element(v.begin() + i - period, v.begin() + i);
}
return dst;
}
std::vector<double> maximums(const std::vector<double> &src, size_t period)
{
std::vector<double> dst(src.size());
std::deque<std::pair<size_t, double>> deq;
for (size_t i = 0; i < src.size(); ++i) {
if (!deq.empty() && i >= period && deq.front().first <= i - period) {
deq.pop_front();
}
while (!deq.empty() && deq.back().second < src[i]) {
deq.pop_back();
}
deq.push_back(std::make_pair(i, src[i]));
dst[i] = deq.front().second;
}
return dst;
}
std::vector<double> randomVect()
{
std::vector<double> v(1 + rand() % 1000);
for (double &e : v) {
e = (double)rand() / RAND_MAX;
}
return v
}
bool compareMaximums(std::vector<double> v, size_t period)
{
return maximumsT(v, period) == maximums(v, period);
}
bool test(size_t iterations)
{
for (size_t i = 0; i < iterations; ++i) {
std::vector<double> srcVec = randomVect();
size_t period = 1 + (rand() % srcVec.size());
if (!compareMaximums(srcVec, period)) {
return false;
}
}
return true;
}
int main(int argc, char *argv[])
{
return test(10000);
}
There is a sub-field in image processing called Mathematical Morphology. The operation you are implementing is a core concept in this field, called dilation. Obviously, this operation has been studied extensively and we know how to implement it very efficiently.
The most efficient algorithm for this problem was proposed in 1992 and 1993, independently by van Herk, and Gil and Werman. This algorithm needs exactly 3 comparisons per sample, independently of the size of period
.
Some years later, Gil and Kimmel further refined the algorithm to need only 2.5 comparisons per sample. Though the increased complexity of the method might offset the fewer comparisons (I find that more complex code runs more slowly). I have never implemented this variant.
The HGW algorithm, as it's called, needs two intermediate buffers of the same size as the input. For ridiculously large inputs (billions of samples), you could split up the data into chunks and process it chunk-wise.
In sort, you walk through the data forward, computing the cumulative max over chunks of size period
. You do the same walking backward. Each of these require one comparison per sample. Finally, the result is the maximum over one value in each of these two temporary arrays. For data locality, you can do the two passes over the input at the same time.
I guess you could even do a running version, where the temporary arrays are of length 2*period
, but that would be more complex to implement.
van Herk, "A fast algorithm for local minimum and maximum filters on rectangular and octagonal kernels", Pattern Recognition Letters 13(7):517-521, 1992 (doi)
Gil, Werman, "Computing 2-D min, median, and max filters", IEEE Transactions on Pattern Analysis and Machine Intelligence 15(5):504-507 , 1993 (doi)
Gil, Kimmel, "Efficient dilation, erosion, opening, and closing algorithms", IEEE Transactions on Pattern Analysis and Machine Intelligence 24(12):1606-1617, 2002 (doi)
very much
is not a quantifiable amount. \$\endgroup\$