I've implemented the Simulated Bifurcation (SB) algorithm in C++, based on the method described in [2] (link to Science Advances). The goal of this algorithm is to find the ground state of an Ising model, introduced in my paper [1] (link to arXiv) .
Here's the code:
#include <iostream>
#include <Eigen/Dense>
#include <vector>
#include <random>
#include <cmath>
#include <ctime>
// [1]: https://arxiv.org/abs/2411.19604
// [2]: https://www.science.org/doi/full/10.1126/sciadv.aav2372
std::pair<double, Eigen::VectorXd> calculate_min_J_o(const Eigen::MatrixXd& J) {
/*
This function calculates the minimum energy and configuration of the Hamiltonian of the Ising model in [1].
This assumes the two cluster pattern holds (which has been proven in the paper).
*/
int N = J.rows();
std::vector<double> H_l;
std::vector<Eigen::VectorXd> s_l;
for (int M = 1; M < N; ++M) {
Eigen::VectorXd s(N);
s.head(M).setOnes();
s.tail(N - M).setConstant(-1.0);
double H = s.transpose() * J * s;
H_l.push_back(H);
s_l.push_back(s);
}
auto min_H_iter = std::min_element(H_l.begin(), H_l.end());
int min_H_index = std::distance(H_l.begin(), min_H_iter);
return { *min_H_iter, s_l[min_H_index] };
}
Eigen::MatrixXd J_o(int N, double d = 1.0) {
/*
Ordered J matrix [1]
*/
Eigen::MatrixXd matrix(N, N);
for (int i = 0; i < N; ++i) {
for (int j = 0; j < N; ++j) {
matrix(i, j) = std::pow((i + 1.0) / N, d) + std::pow((j + 1.0) / N, d);
}
}
matrix /= std::pow(N, 2);
return matrix;
}
// Function g(x)
Eigen::MatrixXd g(const Eigen::MatrixXd& x) {
return x.unaryExpr([](double xi) {
if (std::abs(xi) <= 1) return xi;
return xi > 1 ? 1.0 : -1.0;
});
}
// Function h(x, y)
Eigen::MatrixXd h(const Eigen::MatrixXd& x, const Eigen::MatrixXd& y) {
Eigen::MatrixXd result(y.rows(), y.cols());
for (int i = 0; i < x.rows(); ++i) {
for (int j = 0; j < x.cols(); ++j) {
result(i, j) = std::abs(x(i, j)) <= 1 ? y(i, j) : 0;
}
}
return result;
}
// Simulated Bifurcation Algorithm with K parallel batches [2]
Eigen::MatrixXd SimulatedBifurcation(const Eigen::MatrixXd& J, int K = 3, int steps = 10000,
double c_0 = -1, double a_0 = 1, double delta_t = 0.1, double perturbation = 0.005) {
int N = J.rows();
if (c_0 == -1) {
double expectation_J_squared = std::sqrt(J.cwiseAbs2().sum() / (N * (N - 1)));
c_0 = 0.5 / (expectation_J_squared * std::sqrt(N));
}
std::mt19937 gen(std::random_device{}());
std::uniform_real_distribution<double> dist(-perturbation, perturbation);
Eigen::MatrixXd x = Eigen::MatrixXd::NullaryExpr(K, N, [&]() { return dist(gen); });
Eigen::MatrixXd y = Eigen::MatrixXd::NullaryExpr(K, N, [&]() { return dist(gen); });
double a = 0;
for (int step = 0; step < steps; ++step) {
Eigen::MatrixXd f = x * J; // Batch-wise matrix multiplication
y += ((-(a_0 - a) * x + c_0 * f) * delta_t);
x += (a_0 * y * delta_t);
a += 1.0 / steps;
x = g(x);
y = h(x, y);
}
return x.unaryExpr([](double xi) { return std::copysign(1.0, xi); });
}
int main() {
// compile with: g++ -O3 -march=native -flto SB.cpp -o SB
int N = 1000;
int K = 10;
int steps = 1000;
Eigen::MatrixXd J = -J_o(N);
clock_t start_time = std::clock();
Eigen::MatrixXd min_x = SimulatedBifurcation(J, K, steps, -1, 1, 0.2, 0.005);
clock_t end_time = std::clock();
std::cout << "Time taken: " << (double)(end_time - start_time) / CLOCKS_PER_SEC << " seconds.\n";
auto [min_H, true_gs] = calculate_min_J_o(-J);
Eigen::VectorXd true_gs_sign = true_gs.array().sign();
Eigen::MatrixXd min_x_sign = min_x.array().sign();
// std::cout << "min_x_sign:" << std::endl << min_x_sign << std::endl;
// std::cout << "true_gs_sign:" << std::endl << true_gs_sign.transpose() << std::endl;
bool ground_state_matches = false;
for (int i = 0; i < min_x_sign.rows(); ++i) {
bool match = (true_gs_sign.array() == min_x_sign.row(i).transpose().array()).all() ||
(true_gs_sign.array() == (-min_x_sign.row(i).transpose().array())).all();
if (match) {
ground_state_matches = true;
break;
}
}
std::cout << "Number of Batches (K) = " << K << std::endl;
std::cout << "N = " << N << std::endl;
std::cout << "steps = " << steps << std::endl;
std::cout << "Reached the true ground state in any batch: " << (ground_state_matches ? "Yes" : "No") << std::endl;
return 0;
}
I'm specifically interested in any tips for optimizing the SimulatedBifurcation
function, as this is the core of the algorithm and where most of the computation happens. Given the heavy use of matrix operations, I'm particularly curious about efficient use of the Eigen library.
My questions are:
Are there any obvious bottlenecks or areas within the
SimulatedBifurcation
function where I could improve performance, particularly regarding the Eigen matrix operations? Any suggestions for more efficient ways to express these calculations using Eigen would be greatly appreciated.What's the best way to migrate this code base from CPU to GPU based calculation?
-
1\$\begingroup\$ In section IV, when you spoke of "neglecting the Kronecker delta in Equation (5)", I think you meant equation 2. \$\endgroup\$J_H– J_H2024年12月27日 23:03:17 +00:00Commented Dec 27, 2024 at 23:03
-
\$\begingroup\$ @J_H Thank you for your feedback. We fixed this issue in the version 2, which is available on arxiv. \$\endgroup\$Amirhossein Rezaei– Amirhossein Rezaei2025年03月20日 20:08:16 +00:00Commented Mar 20 at 20:08
1 Answer 1
There are several little things that could be improved in the code:
- Unnecessary use of parentheses and some operations. Consider the calculation of
match
inmain()
, which could be written much simpler as:bool match = true_gs_sign == min_x_sign.row(i).transpose() || true_gs_sign == -min_x_sign.row(i).transpose();
- Don't use
std::endl
, prefer to use'\n'
; the former is equivalent to the latter, but also forces the output to be flushed, which isn't necessary. - Use
std::chrono
to calculate elapsed time instead of C'sclock()
function. - Find ways to simplify expressions and move things out of loops. The compiler isn't always able to do it for you. For example, in
J_o()
:
And inmatrix(i, j) = std::pow(i + 1.0, d) + std::pow(j + 1.0, d) ... matrix /= std::pow(N, 2 + d);
g()
:return x.unaryExpr([](double xi) { return std::clamp(xi, -1.0, 1.0); });
- Remove unused values. For example,
min_H
is never used, which in turn means thatcalculate_min_J_o()
's return value can be simplified. - Avoid unnecessary (re)calculations. This is especially important when matrices get large. Consider the repeated calculation of \$\mathbf s^\mathrm{T} J \mathbf s\$: each iteration you are only flipping the sign of one element of \$\mathbf s\$. This means calculating the next value of
H
could be done with much less effort. - Naming things: if you use short variable names that come from formulas from the papers you referenced, add a comment explaining the function of that variable, the paper it's from and the number of the formula where it was first mentioned. For example:
However, I don't see \$g(\mathbf x)\$ or \$h(\mathbf x, \mathbf y)\$ mentioned in the papers, so I don't know what// J is the interaction matrix (see arXiv:2411.19604v1 eq. 2)
g()
andh()
are supposed to do. Give them more descriptive names, or refer to a paper they are coming from (possibly your own).
I don't know how to best convert this to something that can run on a GPU, hopefully someone else will add an answer about that.