Computing the RPY tensor with C++/Eigen (linear algebra)

Question 1

I'm mostly used to writing scientific code in MATLAB / Python / Julia and my experience with C++ is very limited. I've implemented a fairly simple computation of the Rotne-Prager-Yamakawa tensor in dimensionless form using the Eigen library in C++11 (currently using the Visual Studio toolchain).

#include <iostream>
#include <random>
#include <chrono>
#include <Eigen/Dense>
using namespace Eigen;
using namespace std;
using namespace std::chrono;
MatrixXd rotne_prager(Vector3d coords[], double r, const int p)
{
 MatrixXd rpy = MatrixXd::Zero(3 * (p + 1), 3 * (p + 1));
 Vector3d rvec;
 double Rij;
 double distance_ratio;
 for (int i = 0; i < p + 1; i++)
 {
 for (int j = 0; j < p + 1; j++)
 {
 rvec(0) = coords[j](0) - coords[i](0);
 rvec(1) = coords[j](1) - coords[i](1);
 rvec(2) = coords[j](2) - coords[i](2);
 Rij = sqrt(rvec(0)*rvec(0) + rvec(1)*rvec(1) + rvec(2)*rvec(2));
 distance_ratio = r / Rij;
 if (Rij > 2 * r)
 { 
 rpy.block<3, 3>(3 * (i + 1) - 3, 3 * (j + 1) - 3) = 0.75 * distance_ratio * (1.0 - 2.0*distance_ratio * distance_ratio) * rvec * rvec.transpose();
 rpy.block<3, 3>(3 * (i + 1) - 3, 3 * (j + 1) - 3).diagonal().array() += 0.75 * distance_ratio * (1.0 + 2.0 / 3.0 * distance_ratio * distance_ratio); 
 }
 else
 {
 rpy.block<3, 3>(3 * (i + 1) - 3, 3 * (j + 1) - 3) = 3.0 / 32.0 / distance_ratio * rvec*rvec.transpose();
 rpy.block<3, 3>(3 * (i + 1) - 3, 3 * (j + 1) - 3).diagonal().array() += 1.0 - 9.0 / 32.0 / distance_ratio;
 }
 }
 }
 return rpy;
}
int main()
{
 random_device rd;
 mt19937 gen(rd());
 uniform_real_distribution<> dis(0, 1);
 Eigen::setNbThreads(8);
 auto start = high_resolution_clock::now();
 const int p = 4000;
 Vector3d coords[p + 1];
 for (int i = 0; i < p + 1; i++)
 {
 coords[i](0) = dis(gen); coords[i](1) = dis(gen); coords[i](2) = dis(gen);
 }
 MatrixXd rpy = rotne_prager(coords, 0.01, p);
 auto end = high_resolution_clock::now();
 auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
 cout << "p = " << p << ", time elapsed: " << duration << " ms" << endl;
}

The code produces correct results, but I'm sure it could have been written much better. I'd appreciate any performance optimization tips as well as any suggestions about writing more idiomatic (insofar as that term even applies in the C++ world) and concise code.

Question 2

Vector3d supports arithmetic operators. Use them:
```
rvec = coords[j] - coords[i];
```
is much cleaner than a member-wise initialization.
Vector3d supports norm(). Use it:
```
Rjj = rvec.norm();
```
distance_ratio = r / Rij; invokes division by 0 when i equals to j (that is, over the diagonal). It is definitely a red flag (if the algorithm does expect INF in the subsequent computations, the comment would be nice).

Question 3

Thank you. I've adopted your first two suggestions, it seems that norm() is actually faster than the manual computation, another positive change from the languages I normally use. As for the division by zero, the algorithm isn't supposed to handle overlapping spheres. I know I should normally add a check nonetheless to prevent potentially unpleasant crashes, but such additional logic in a tight inner leap would negatively affect the performance of the code. For now, I think I'll stick to prominent warnings in the documentation.

Question 4

Using p + 1 instead of just p

You declare

const int p = 4000;

but use p+1 at all the places that matter. I think you should just use

const int p = 4001;

and replace all usages of p + 1 by just p.

Simplify a few expressions

You have repeated usage of 3 * ( i + 1) - 3 and 3 * ( j + 1) - 3. They can simply be 3*i and 3*j.

Give some computations a name

You have the following expressions:

0.75 * distance_ratio * (1.0 - 2.0*distance_ratio * distance_ratio) * rvec * rvec.transpose()
0.75 * distance_ratio * (1.0 + 2.0 / 3.0 * distance_ratio * distance_ratio)
3.0 / 32.0 / distance_ratio * rvec*rvec.transpose()
1.0 - 9.0 / 32.0 / distance_ratio

I don't know that they compute but they compute four different things. It will be easier to understand your code if you can give those computations a name and put them in a function.

double firstTerm(double distance_ratio,
 Vector3d const& rvec)
{
 return (0.75 * distance_ratio * (1.0 - 2.0*distance_ratio * distance_ratio) * rvec * rvec.transpose());
}

and use it

 rpy.block<3, 3>(3*i, 3*j) = firstTerm(distanc_ratio, recv);

PS I am not suggesting firstTerm as a function name. It just illustrates the concept.

Avoid multiple statements in one line

The line

 coords[i](0) = dis(gen); coords[i](1) = dis(gen); coords[i](2) = dis(gen);

is less readable than:

 coords[i](0) = dis(gen);
 coords[i](1) = dis(gen);
 coords[i](2) = dis(gen);

You could also put them in a for loop to avoid code repetition.

 for ( int j = 0; i < 3; ++j )
 coords[i](j) = dis(gen);

Using const int as an argument type`

I am not a fan of using const int as an argument type. Just int works without any problem. YMMV.

The sqrt line

 Rij = sqrt(rvec(0)*rvec(0) + rvec(1)*rvec(1) + rvec(2)*rvec(2));

can be modified to

double square(double in)
{
 return in*in;
}
Rij = sqrt(square(rvec(0)) + square(rvec(1)) + square(rvec(2));

I think this not only avoids code repetition but also uses the right abstraction in the computation.

Question 5

Thank you for the suggestions. Some of them I'm quite happy to adopt immediately. However, the one idea that gives me slight pause is the idea of separating the index logic in a separate function. The problem that I see with it is that the particular kind of index logic is likely to only be useful in the rotne_prager function itself, so I'm slightly hesitant about abstracting it out of the function altogether. Since C++ does not allow nested functions, would lambdas provide a reasonable alternative? Would those have a potential performance penalty? Am I overthinking the problem?

Question 6

I just realized that you need just 3 * (i + 1) - 3 == 3*i. You can just use 3*i and 3*j.

Question 7

Ah, you're right, of course. The wonders that come from directly porting code from languages that use one-based indexing... Thanks!

vnp vnp 58.5k4 gold badges55 silver badges144 bronze badges · Answer 1 · 2014-12-25 01:57:19Z

1

\$\begingroup\$

Vector3d supports arithmetic operators. Use them:
```
rvec = coords[j] - coords[i];
```
is much cleaner than a member-wise initialization.
Vector3d supports norm(). Use it:
```
Rjj = rvec.norm();
```
distance_ratio = r / Rij; invokes division by 0 when i equals to j (that is, over the diagonal). It is definitely a red flag (if the algorithm does expect INF in the subsequent computations, the comment would be nice).

Share

answered Dec 25, 2014 at 1:57

vnp's user avatar

vnp vnp

58.5k4 gold badges55 silver badges144 bronze badges

\$\endgroup\$

1

\$\begingroup\$ Thank you. I've adopted your first two suggestions, it seems that norm() is actually faster than the manual computation, another positive change from the languages I normally use. As for the division by zero, the algorithm isn't supposed to handle overlapping spheres. I know I should normally add a check nonetheless to prevent potentially unpleasant crashes, but such additional logic in a tight inner leap would negatively affect the performance of the code. For now, I think I'll stick to prominent warnings in the documentation. \$\endgroup\$

user13492
– user13492

2014年12月25日 12:23:56 +00:00
Commented Dec 25, 2014 at 12:23

Add a comment |

R Sahu R Sahu 3,56213 silver badges20 bronze badges · Answer 2 · 2014-12-24 21:53:21Z

Using p + 1 instead of just p

You declare

const int p = 4000;

but use p+1 at all the places that matter. I think you should just use

const int p = 4001;

and replace all usages of p + 1 by just p.

Simplify a few expressions

You have repeated usage of 3 * ( i + 1) - 3 and 3 * ( j + 1) - 3. They can simply be 3*i and 3*j.

Give some computations a name

You have the following expressions:

0.75 * distance_ratio * (1.0 - 2.0*distance_ratio * distance_ratio) * rvec * rvec.transpose()
0.75 * distance_ratio * (1.0 + 2.0 / 3.0 * distance_ratio * distance_ratio)
3.0 / 32.0 / distance_ratio * rvec*rvec.transpose()
1.0 - 9.0 / 32.0 / distance_ratio

I don't know that they compute but they compute four different things. It will be easier to understand your code if you can give those computations a name and put them in a function.

double firstTerm(double distance_ratio,
 Vector3d const& rvec)
{
 return (0.75 * distance_ratio * (1.0 - 2.0*distance_ratio * distance_ratio) * rvec * rvec.transpose());
}

and use it

 rpy.block<3, 3>(3*i, 3*j) = firstTerm(distanc_ratio, recv);

PS I am not suggesting firstTerm as a function name. It just illustrates the concept.

Avoid multiple statements in one line

The line

 coords[i](0) = dis(gen); coords[i](1) = dis(gen); coords[i](2) = dis(gen);

is less readable than:

 coords[i](0) = dis(gen);
 coords[i](1) = dis(gen);
 coords[i](2) = dis(gen);

You could also put them in a for loop to avoid code repetition.

 for ( int j = 0; i < 3; ++j )
 coords[i](j) = dis(gen);

Using const int as an argument type`

I am not a fan of using const int as an argument type. Just int works without any problem. YMMV.

The sqrt line

 Rij = sqrt(rvec(0)*rvec(0) + rvec(1)*rvec(1) + rvec(2)*rvec(2));

can be modified to

double square(double in)
{
 return in*in;
}
Rij = sqrt(square(rvec(0)) + square(rvec(1)) + square(rvec(2));

I think this not only avoids code repetition but also uses the right abstraction in the computation.

Thank you for the suggestions. Some of them I'm quite happy to adopt immediately. However, the one idea that gives me slight pause is the idea of separating the index logic in a separate function. The problem that I see with it is that the particular kind of index logic is likely to only be useful in the rotne_prager function itself, so I'm slightly hesitant about abstracting it out of the function altogether. Since C++ does not allow nested functions, would lambdas provide a reasonable alternative? Would those have a potential performance penalty? Am I overthinking the problem?
I just realized that you need just 3 * (i + 1) - 3 == 3*i. You can just use 3*i and 3*j.
Ah, you're right, of course. The wonders that come from directly porting code from languages that use one-based indexing... Thanks!

Stack Exchange Network

Computing the RPY tensor with C++/Eigen (linear algebra)

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Computing the RPY tensor with C++/Eigen (linear algebra)

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions