3D Proper Orthogonal Decomposition: C++ implementation

Question 1

I'm very new to C++ and made this code using the Eigen library to port an old and slow python code which computes the three-dimensional proper orthogonal decomposition (POD) from a set of temporal point cloud data.

In this case the data is a obtained from a CFD (computational fluid dynamics) simulation as a cloud of 106x60x60 points and for each point three velocity components are written. An example can be seen here. It is stored in the input/OFout folder. The physical time associated with each cloud is retrieved from the times.txt file.

This data is concatenated in a big matrix which is used for several mathematical operations like calculating a projection matrix, obtaining the eigenvalues and eigenvectors of it and calculating the POD modes. Finally, these modes are written out to different files in the output/mode folder.

My aim is to improve both the C++ aspect (syntax, making it easier to read, more idiomatic, object oriented...) as well as the performance of the code. Tips/pointers regarding how to parallelise it would also be nice.

The code I wrote is the following:

#include <iostream>
#include <string>
#include <fstream>
#include <iomanip>
#include <Eigen/Dense>
#define MSIZE 381600 // Rows of matrix (number of points)
#define TSIZE 1804 // Size of time data (number of snapshots)
#define VSIZE 3 // Size of the variable (1 if scalar, 3 if vector)
#define NSIZE 20 // Size of output POD modes (number of modes to write)
using namespace Eigen;
int main()
{
 MatrixXd m = MatrixXd::Zero(MSIZE * VSIZE, TSIZE);
 MatrixXd pm = MatrixXd::Zero(TSIZE, TSIZE);
 // READING INPUT FILES
 std::string t;
 std::string timedir = "../input/times.txt";
 std::ifstream timefile(timedir);
 std::cout << t << std::endl;
 for (size_t k = 0; k < TSIZE; k++)
 {
 if (timefile.is_open())
 {
 timefile >> t;
 while (t.back() == '0') // Remove trailing zeros
 {
 t.pop_back();
 }
 }
 std::string dir = "../input/OFout/" + t + "/pointCloud_U.xy";
 std::ifstream file(dir);
 if (file.is_open())
 {
 for (size_t i = 0; i < MSIZE; i++)
 {
 for (size_t j = 0; j < VSIZE; j++)
 {
 file >> m(i + MSIZE * j, k);
 }
 }
 std::cout << "Finished reading file " + std::to_string(k + 1) + " of " + std::to_string(TSIZE) << std::endl;
 file.close();
 }
 else
 {
 std::cout << "Unable to open file" << std::endl;
 }
 }
 // COMPUTING NORMALISED PROJECTION MATRIX
 std::cout << "Computing projection matrix..." << std::flush;
 for (size_t i = 0; i < TSIZE; i++)
 {
 for (size_t j = 0; j < TSIZE; j++)
 {
 pm(i, j) = (1.0 / TSIZE) * (m.col(i).dot(m.col(j).transpose()));
 }
 }
 std::cout << " Done" << std::endl;
 // COMPUTING SORTED EIGENVALUES AND EIGENVECTORS
 std::cout << "Computing eigenvalues..." << std::flush;
 SelfAdjointEigenSolver<MatrixXd>
 eigensolver(pm);
 if (eigensolver.info() != Success)
 abort();
 VectorXd eigval = eigensolver.eigenvalues().reverse();
 MatrixXd eigvec = eigensolver.eigenvectors().rowwise().reverse();
 std::cout << " Done" << std::endl;
 // COMPUTING POD MODES
 std::cout << "Computing POD modes..." << std::flush;
 MatrixXd podx = MatrixXd::Zero(MSIZE, TSIZE);
 MatrixXd pody = MatrixXd::Zero(MSIZE, TSIZE);
 MatrixXd podz = MatrixXd::Zero(MSIZE, TSIZE);
 for (size_t i = 0; i < TSIZE; i++)
 {
 for (size_t j = 0; j < TSIZE; j++)
 {
 podx.col(i) = podx.col(i) + (1.0 / (eigval(i) * TSIZE)) * sqrt(eigval(i) * TSIZE) * eigvec(j, i) * m.block<MSIZE, 1>(0 * MSIZE, j);
 pody.col(i) = pody.col(i) + (1.0 / (eigval(i) * TSIZE)) * sqrt(eigval(i) * TSIZE) * eigvec(j, i) * m.block<MSIZE, 1>(1 * MSIZE, j);
 podz.col(i) = podz.col(i) + (1.0 / (eigval(i) * TSIZE)) * sqrt(eigval(i) * TSIZE) * eigvec(j, i) * m.block<MSIZE, 1>(2 * MSIZE, j);
 }
 }
 std::cout << " Done" << std::endl;
 // WRITING SORTED EIGENVALUES
 std::cout << "Writing eigenvalues..." << std::flush;
 std::ofstream writeEigval("../output/chronos/A.txt");
 if (writeEigval.is_open())
 {
 writeEigval << std::scientific << std::setprecision(10) << eigval;
 writeEigval.close();
 }
 std::cout << " Done" << std::endl;
 // WRITING CHRONOS
 std::cout << "Writing chronos..." << std::flush;
 for (size_t i = 0; i < NSIZE; i++)
 {
 std::string chronos = "../output/chronos/chronos_" + std::to_string(i) + ".dat";
 std::ofstream writeChronos(chronos);
 for (size_t j = 0; j < TSIZE; j++)
 {
 writeChronos << std::scientific << std::setprecision(10) << sqrt(eigval(i) * TSIZE) * eigvec(j, i) << '\n';
 }
 writeChronos.close();
 }
 std::cout << " Done" << std::endl;
 // WRITING POD MODES
 std::cout << "Writing POD modes..." << std::flush;
 std::string xyz = "xyz_";
 std::string modeHead = "../output/mode/mode_U";
 for (size_t j = 0; j < VSIZE; j++)
 {
 for (size_t i = 0; i < NSIZE; i++)
 {
 std::string modeTail = std::to_string(i) + ".dat";
 std::ofstream writeMode(modeHead + xyz.at(j) + xyz.at(3) + modeTail);
 if (writeMode.is_open())
 {
 if (j == 0)
 {
 writeMode << std::scientific << std::setprecision(6) << podx.col(i);
 }
 else if (j == 1)
 {
 writeMode << std::scientific << std::setprecision(6) << pody.col(i);
 }
 else if (j == 2)
 {
 writeMode << std::scientific << std::setprecision(6) << podz.col(i);
 }
 writeMode.close();
 }
 }
 }
 std::cout << " Done" << std::endl;
}

Thanks for your insights!

Question 2

Instead of #define use enum or constexpr for the constant numbers.
I don't think that the number of time data or number of points should be absolute constants - rather they should be determined during runtime. Else you have to recompile your code each time you receive new data.
Better declare some sort of output path instead of relying on ../, also you need to make sure that the directories exist, else it will fail to save.
You don't want to store small pieces of data over huge amount of files and directories. Better make a serialization standard for yourselves and store a sizable amount of data inside each file. Also, consider binary save/load for better speed, accuracy, and consistency, though, it will be less readable for humans and portability might suffer a bit - some odd platforms use less common endianess.
You can speed up the loop
```
for (size_t i = 0; i < TSIZE; i++)
{
 for (size_t j = 0; j <= i; j++)
 {
 pm(j, i) = pm(i, j) = (1.0 / TSIZE) * (m.col(i).dot(m.col(j).transpose()));
 }
}
```
Note: by dot function I expect to see scalar pruduct, so it is odd for me to see that you transpose the vector inside it. Use operator * for matrix multiplication.

I am not too familiar with these Matrices classes, but most likely, it will be better if you replace

podx.col(i) = podx.col(i) + (1.0 / (eigval(i) * TSIZE)) * sqrt(eigval(i) * TSIZE) * eigvec(j, i) * m.block<MSIZE, 1>(0 * MSIZE, j);

with

podx.col(i) += (eigvec(j, i) / sqrt(eigval(i) * TSIZE)) * m.block<MSIZE, 1>(0 * MSIZE, j);

also, what to do if eigval(i) == 0. or too close to it?

If you plan to grow your project you'd better eventually replace std::cout with a usage of a logger class as std::cout is not thread friendly despite being technically thread safe. Also, in this, case move the code inside a class and each relevant section move inside a function with a sensible name.

Question 3

"Instead of #define use enum for the constant numbers." No, use constexpr.

Question 4

@L.F. I see constexpr being better when you require computation or non trivial classes. But for simple integer constants?

Question 5

Still better. For both semantics and consistency.

Question 6

@L.F. edited and made recommendation for both.

ALX23z ALX23z 2,5396 silver badges15 bronze badges · Answer 1 · 2019-10-12 07:19:51Z

Instead of #define use enum or constexpr for the constant numbers.
I don't think that the number of time data or number of points should be absolute constants - rather they should be determined during runtime. Else you have to recompile your code each time you receive new data.
Better declare some sort of output path instead of relying on ../, also you need to make sure that the directories exist, else it will fail to save.
You don't want to store small pieces of data over huge amount of files and directories. Better make a serialization standard for yourselves and store a sizable amount of data inside each file. Also, consider binary save/load for better speed, accuracy, and consistency, though, it will be less readable for humans and portability might suffer a bit - some odd platforms use less common endianess.
You can speed up the loop
```
for (size_t i = 0; i < TSIZE; i++)
{
 for (size_t j = 0; j <= i; j++)
 {
 pm(j, i) = pm(i, j) = (1.0 / TSIZE) * (m.col(i).dot(m.col(j).transpose()));
 }
}
```
Note: by dot function I expect to see scalar pruduct, so it is odd for me to see that you transpose the vector inside it. Use operator * for matrix multiplication.

I am not too familiar with these Matrices classes, but most likely, it will be better if you replace

podx.col(i) = podx.col(i) + (1.0 / (eigval(i) * TSIZE)) * sqrt(eigval(i) * TSIZE) * eigvec(j, i) * m.block<MSIZE, 1>(0 * MSIZE, j);

with

podx.col(i) += (eigvec(j, i) / sqrt(eigval(i) * TSIZE)) * m.block<MSIZE, 1>(0 * MSIZE, j);

also, what to do if eigval(i) == 0. or too close to it?

If you plan to grow your project you'd better eventually replace std::cout with a usage of a logger class as std::cout is not thread friendly despite being technically thread safe. Also, in this, case move the code inside a class and each relevant section move inside a function with a sensible name.

"Instead of #define use enum for the constant numbers." No, use constexpr.
@L.F. I see constexpr being better when you require computation or non trivial classes. But for simple integer constants?

Stack Exchange Network

3D Proper Orthogonal Decomposition: C++ implementation

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

3D Proper Orthogonal Decomposition: C++ implementation

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions