Can this numpy code be vectorized?

Question 1

I've written the following function to produce n realizations of a CIR process for a given set of parameters:

def cir_simulations(alpha, mu, sigma, delta_t, n, num_sims):
 x = np.reshape(np.array([mu] * num_sims), (-1, 1))
 for i in range(0, n):
 x = np.concatenate((x, np.reshape(x[:, -1], (-1, 1)) + alpha * (
 mu - np.reshape(x[:, -1], (-1, 1))) * delta_t + sigma * np.sqrt(
 np.reshape(x[:, -1], (-1, 1))) * np.sqrt(delta_t) * np.random.normal(0, 1, size=(num_sims, 1))), axis=1)
 return x

This code works, but I am now wondering whether it would be possible to remove the loop and fully vectorize it. I've struggled to find a way of doing this, as the operation being performed in the loop is recursive (the values in the next column of the matrix are non-linearly dependent on the previous columns values).

Additionally, could this code be simplified? In particular, I feel like there may be needless complexity in the way that I am accessing the last column of the array using

np.reshape(x[:, -1], (-1, 1))

Question 2

Since the output of each step depends on the previous step, it is impossible to entirely eliminate the loop unless the formula can be simplified to allow a multi-step computation. However, the code can definitely still be improved.

Calling np.concatenate is generally inefficient since it requires memory reallocation and data copying. As long as possible, one should preallocate all the needed memory and update it iteratively.
Some computation can be moved out of the loop to improve efficiency. E.g., the np.random.normal call can be done only once. Same with alpha * delta_t.
More meaningful variable names could be chosen.
It is always a good practice to have docstrings and type hints in your code.

Here is one version of improved code:

def cir_simulations(alpha: float, mu: float, sigma: float, delta_t: float, sim_steps: int, num_sims: int):
 """
 Simulate the CIR process.
 Parameters:
 Input
 ...
 Output
 ...
 """
 output_shape = (num_sims, sim_steps + 1)
 sim_results = np.empty(output_shape)
 sigma_dW = np.random.normal(0, sigma * np.sqrt(delta_t), size=output_shape) 
 alpha_dt = alpha * delta_t
 sim_results[0, :] = r_prev = mu
 for r_t, sigma_dWt in zip(sim_results[1:], sigma_dW):
 r_t[:] = r_prev = (mu - r_prev) * alpha_dt + np.sqrt(r_prev) * sigma_dWt
 return sim_results

I'm not very familiar with the meaning of all the parameters in the formula. There are two things that I have doubts in your computation:

The rate \$r\$ in the formula is initialized with the same parameter mu used in the iterative computation. Is it expected?
The increment \$dW_t\$ used a fixed variance of 1. Is this supposed to be the case or should it be another function parameter?

Question 3

Cheers for the answer. The CIR model should produce stochastic processes which are mean-reverting, and so I chose the initial value to be mu as this is the mean of the process. My next step will be to plug this into the Heston model (which is used to simulate stochastic volatility) so I'll think harder about whether the initial value of the processes should be mu when I've understood the context of the CIR model as used in the Heston model.

Question 4

With regards to the $dW_t$'s having fixed variance 1, the actual variance should be $dt$. I have accounted for this by multiplying the simulated standard normal values by $\sqrt{dt}$.

Question 5

OK. I just corrected some mistakes in the code.

GZ0 GZ0 2,3318 silver badges19 bronze badges · Accepted Answer · 2020-10-24 14:24:47Z

Since the output of each step depends on the previous step, it is impossible to entirely eliminate the loop unless the formula can be simplified to allow a multi-step computation. However, the code can definitely still be improved.

Calling np.concatenate is generally inefficient since it requires memory reallocation and data copying. As long as possible, one should preallocate all the needed memory and update it iteratively.
Some computation can be moved out of the loop to improve efficiency. E.g., the np.random.normal call can be done only once. Same with alpha * delta_t.
More meaningful variable names could be chosen.
It is always a good practice to have docstrings and type hints in your code.

Here is one version of improved code:

def cir_simulations(alpha: float, mu: float, sigma: float, delta_t: float, sim_steps: int, num_sims: int):
 """
 Simulate the CIR process.
 Parameters:
 Input
 ...
 Output
 ...
 """
 output_shape = (num_sims, sim_steps + 1)
 sim_results = np.empty(output_shape)
 sigma_dW = np.random.normal(0, sigma * np.sqrt(delta_t), size=output_shape) 
 alpha_dt = alpha * delta_t
 sim_results[0, :] = r_prev = mu
 for r_t, sigma_dWt in zip(sim_results[1:], sigma_dW):
 r_t[:] = r_prev = (mu - r_prev) * alpha_dt + np.sqrt(r_prev) * sigma_dWt
 return sim_results

I'm not very familiar with the meaning of all the parameters in the formula. There are two things that I have doubts in your computation:

The rate \$r\$ in the formula is initialized with the same parameter mu used in the iterative computation. Is it expected?
The increment \$dW_t\$ used a fixed variance of 1. Is this supposed to be the case or should it be another function parameter?

Cheers for the answer. The CIR model should produce stochastic processes which are mean-reverting, and so I chose the initial value to be mu as this is the mean of the process. My next step will be to plug this into the Heston model (which is used to simulate stochastic volatility) so I'll think harder about whether the initial value of the processes should be mu when I've understood the context of the CIR model as used in the Heston model.
With regards to the $dW_t$'s having fixed variance 1, the actual variance should be $dt$. I have accounted for this by multiplying the simulated standard normal values by $\sqrt{dt}$.

Stack Exchange Network

Can this numpy code be vectorized?

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Can this numpy code be vectorized?

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions