Speeding up recursive iteration over array in Java

Question 1

I have a function that needs to compute a system response y in response to an input time series defined by x. In practice, x will be determined by an external process. y is then exposed to external processes as class member timeArray. Computing y entails two levels of recursion.

The function I need to optimize in the class below is update() which has potential to become a process bottleneck at current runtime: ~72 ms on first cycle, ~21ms after a couple dozen iterations. An additional consideration is that in production, this function will be called as a small step in a much larger application, so I'm guessing the cache hits that benefit subsequent runs may have less success when competing with other threads and processes in the JVM.

Pulling the step response calculation requiring a call to exp() out of the inner loop provided the greatest time savings so far, but I'm running out of ideas to further tune. Are there any optimizations I can make to improve execution speed of update()?

import static java.lang.Math.exp;
import static java.lang.Math.sin;
public class Looper {
 private static final int MAX_PROPAGATION_TIME = 7200;
 private static final int SIM_TIME = 100;
 public double[] timeArray = new double[MAX_PROPAGATION_TIME];
 public static void main(String []args) {
 Looper go = new Looper();
 go.run();
 }
 public void run() {
 double x[] = new double[MAX_PROPAGATION_TIME];
 double tau = 3;
 double scale = 1;
 double start = 0;
 double end = 0;
 for (int i = 0; i < SIM_TIME - 1; i++){
 for (int j = 0; j < MAX_PROPAGATION_TIME - 1; j++){
 x[j] = sin(i) * (1 - exp(-j / 10));
 }
 start = System.currentTimeMillis();
 // update() is the primary function of interest
 update(x,tau,scale);
 end = System.currentTimeMillis();
 System.out.println(String.format("Run %3d: %.0f ms",i,end - start));
 //System.out.println(String.format("timeArray[i + 10]: %.2f",this.timeArray[i + 10]));
 }
 }
 public void update(
 double[] x,
 double tau,
 double scale) {
 if(tau < 1e-12 / 3) tau = 1e-12 / 3;
 double dx;
 double y[] = new double[MAX_PROPAGATION_TIME];
 // Pre-compute a unit step response for efficiency
 double stepResponse[] = new double[MAX_PROPAGATION_TIME];
 for (int i = 0; i < MAX_PROPAGATION_TIME - 1; i++){
 stepResponse[i] = scale * (1 - exp(-i / tau));
 }
 // Loop over the full time array and antipated change in x at each time
 for (int i = 0; i < MAX_PROPAGATION_TIME - 1; i++){;
 dx = x[i+1] - x[i];
 // For each element subsequent to time i, evaluate change in y as an
 // exponential decay type response and accumulate incremental changes over time
 for (int j = i; j < MAX_PROPAGATION_TIME - 1; j++){
 y[j] += dx * stepResponse[j - i + 1];
 }
 }
 this.timeArray = y;
 }
}

Current Run Time Performance

In response to @harold's question: step response of y in response to a a single unit step in x at t = 0, with tau = 10 would look as follows. Over the full simulation time, each change in x at a given time can be treated as an additional step change to be accumulated over time.

Step Response.

Update:

As suspected, I had been paying too much attention in all the wrong places. Eliminating the inner loop to turn this back to an O(n) problem saves 1-2 orders of magnitude on computation time, thanks Harold!

enter image description here

Question 2

The exponential decay here, if I read it correctly, starts near zero and then decays up to 1. Is that how it's supposed to be? It's fine if it is, I'm just a bit confused about what's going on here (and it ruined my plans for a linear time algorithm, but maybe I can still do something about it)

Question 3

You should also be able to calculate the stepResponse outside of the update function. it only depends on scale and tau which are constant across the SIM_TIME loop

Question 4

I see instances of iteration: What's recursive about it?

Question 5

With SIM_TIME = 100, what is the last run you want completed?

Question 6

Speed

dx * scale * (1 - exp(-i / tau)) can be split up into two parts:

A part that is accumulated without decay, dx * scale
And an exponentially decreasing part, dx * scale * exp(-i / tau), which is subtracted from that.

Both of these can be written in an incremental form, keeping track of a couple of extra values across the loop to influence the elements at higher indices that way, instead of using an extra loop to "spread out" the influence explicitly.

double decayFactor = exp(-1.0 / tau);
double decaying = 0.0;
double accumulate = 0.0;
for (int i = 0; i < MAX_PROPAGATION_TIME - 1; i++){;
 dx = x[i+1] - x[i];
 accumulate += dx;
 decaying = (decaying + dx) * decayFactor;
 y[i] = (accumulate - decaying) * scale;
}

As far as I've tested it, the results are the same (close enough, apart from the usual floating point stuff), and it is very fast.

Miscellaneous

It is not necessary to "pre-declare" dx, you can put double dx = x[i+1] - x[i] right there in the loop.

Thanks to the way the faster implementation works, it also allows the computation to be done in-place (without allocating a new array y), of course you may have other reasons not to do that.

Question 7

Boom! Updated response with my new timings above, thanks for a fresh perspective!

score 6 · Accepted Answer · 2022-11-18 19:50:15Z

Speed

dx * scale * (1 - exp(-i / tau)) can be split up into two parts:

A part that is accumulated without decay, dx * scale
And an exponentially decreasing part, dx * scale * exp(-i / tau), which is subtracted from that.

Both of these can be written in an incremental form, keeping track of a couple of extra values across the loop to influence the elements at higher indices that way, instead of using an extra loop to "spread out" the influence explicitly.

double decayFactor = exp(-1.0 / tau);
double decaying = 0.0;
double accumulate = 0.0;
for (int i = 0; i < MAX_PROPAGATION_TIME - 1; i++){;
 dx = x[i+1] - x[i];
 accumulate += dx;
 decaying = (decaying + dx) * decayFactor;
 y[i] = (accumulate - decaying) * scale;
}

As far as I've tested it, the results are the same (close enough, apart from the usual floating point stuff), and it is very fast.

Miscellaneous

It is not necessary to "pre-declare" dx, you can put double dx = x[i+1] - x[i] right there in the loop.

Thanks to the way the faster implementation works, it also allows the computation to be done in-place (without allocating a new array y), of course you may have other reasons not to do that.

Boom! Updated response with my new timings above, thanks for a fresh perspective!

Stack Exchange Network

Speeding up recursive iteration over array in Java

Update:

1 Answer 1

Speed

Miscellaneous

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Speeding up recursive iteration over array in Java

Update:

1 Answer 1

Speed

Miscellaneous

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions