I'm very new to C++ and made this code using the Eigen library to port an old and slow python code which computes the three-dimensional proper orthogonal decomposition (POD) from a set of temporal point cloud data.
In this case the data is a obtained from a CFD (computational fluid dynamics) simulation as a cloud of 106x60x60 points and for each point three velocity components are written. An example can be seen here. It is stored in the input/OFout folder. The physical time associated with each cloud is retrieved from the times.txt file.
This data is concatenated in a big matrix which is used for several mathematical operations like calculating a projection matrix, obtaining the eigenvalues and eigenvectors of it and calculating the POD modes. Finally, these modes are written out to different files in the output/mode folder.
My aim is to improve both the C++ aspect (syntax, making it easier to read, more idiomatic, object oriented...) as well as the performance of the code. Tips/pointers regarding how to parallelise it would also be nice.
The code I wrote is the following:
#include <iostream>
#include <string>
#include <fstream>
#include <iomanip>
#include <Eigen/Dense>
#define MSIZE 381600 // Rows of matrix (number of points)
#define TSIZE 1804 // Size of time data (number of snapshots)
#define VSIZE 3 // Size of the variable (1 if scalar, 3 if vector)
#define NSIZE 20 // Size of output POD modes (number of modes to write)
using namespace Eigen;
int main()
{
MatrixXd m = MatrixXd::Zero(MSIZE * VSIZE, TSIZE);
MatrixXd pm = MatrixXd::Zero(TSIZE, TSIZE);
// READING INPUT FILES
std::string t;
std::string timedir = "../input/times.txt";
std::ifstream timefile(timedir);
std::cout << t << std::endl;
for (size_t k = 0; k < TSIZE; k++)
{
if (timefile.is_open())
{
timefile >> t;
while (t.back() == '0') // Remove trailing zeros
{
t.pop_back();
}
}
std::string dir = "../input/OFout/" + t + "/pointCloud_U.xy";
std::ifstream file(dir);
if (file.is_open())
{
for (size_t i = 0; i < MSIZE; i++)
{
for (size_t j = 0; j < VSIZE; j++)
{
file >> m(i + MSIZE * j, k);
}
}
std::cout << "Finished reading file " + std::to_string(k + 1) + " of " + std::to_string(TSIZE) << std::endl;
file.close();
}
else
{
std::cout << "Unable to open file" << std::endl;
}
}
// COMPUTING NORMALISED PROJECTION MATRIX
std::cout << "Computing projection matrix..." << std::flush;
for (size_t i = 0; i < TSIZE; i++)
{
for (size_t j = 0; j < TSIZE; j++)
{
pm(i, j) = (1.0 / TSIZE) * (m.col(i).dot(m.col(j).transpose()));
}
}
std::cout << " Done" << std::endl;
// COMPUTING SORTED EIGENVALUES AND EIGENVECTORS
std::cout << "Computing eigenvalues..." << std::flush;
SelfAdjointEigenSolver<MatrixXd>
eigensolver(pm);
if (eigensolver.info() != Success)
abort();
VectorXd eigval = eigensolver.eigenvalues().reverse();
MatrixXd eigvec = eigensolver.eigenvectors().rowwise().reverse();
std::cout << " Done" << std::endl;
// COMPUTING POD MODES
std::cout << "Computing POD modes..." << std::flush;
MatrixXd podx = MatrixXd::Zero(MSIZE, TSIZE);
MatrixXd pody = MatrixXd::Zero(MSIZE, TSIZE);
MatrixXd podz = MatrixXd::Zero(MSIZE, TSIZE);
for (size_t i = 0; i < TSIZE; i++)
{
for (size_t j = 0; j < TSIZE; j++)
{
podx.col(i) = podx.col(i) + (1.0 / (eigval(i) * TSIZE)) * sqrt(eigval(i) * TSIZE) * eigvec(j, i) * m.block<MSIZE, 1>(0 * MSIZE, j);
pody.col(i) = pody.col(i) + (1.0 / (eigval(i) * TSIZE)) * sqrt(eigval(i) * TSIZE) * eigvec(j, i) * m.block<MSIZE, 1>(1 * MSIZE, j);
podz.col(i) = podz.col(i) + (1.0 / (eigval(i) * TSIZE)) * sqrt(eigval(i) * TSIZE) * eigvec(j, i) * m.block<MSIZE, 1>(2 * MSIZE, j);
}
}
std::cout << " Done" << std::endl;
// WRITING SORTED EIGENVALUES
std::cout << "Writing eigenvalues..." << std::flush;
std::ofstream writeEigval("../output/chronos/A.txt");
if (writeEigval.is_open())
{
writeEigval << std::scientific << std::setprecision(10) << eigval;
writeEigval.close();
}
std::cout << " Done" << std::endl;
// WRITING CHRONOS
std::cout << "Writing chronos..." << std::flush;
for (size_t i = 0; i < NSIZE; i++)
{
std::string chronos = "../output/chronos/chronos_" + std::to_string(i) + ".dat";
std::ofstream writeChronos(chronos);
for (size_t j = 0; j < TSIZE; j++)
{
writeChronos << std::scientific << std::setprecision(10) << sqrt(eigval(i) * TSIZE) * eigvec(j, i) << '\n';
}
writeChronos.close();
}
std::cout << " Done" << std::endl;
// WRITING POD MODES
std::cout << "Writing POD modes..." << std::flush;
std::string xyz = "xyz_";
std::string modeHead = "../output/mode/mode_U";
for (size_t j = 0; j < VSIZE; j++)
{
for (size_t i = 0; i < NSIZE; i++)
{
std::string modeTail = std::to_string(i) + ".dat";
std::ofstream writeMode(modeHead + xyz.at(j) + xyz.at(3) + modeTail);
if (writeMode.is_open())
{
if (j == 0)
{
writeMode << std::scientific << std::setprecision(6) << podx.col(i);
}
else if (j == 1)
{
writeMode << std::scientific << std::setprecision(6) << pody.col(i);
}
else if (j == 2)
{
writeMode << std::scientific << std::setprecision(6) << podz.col(i);
}
writeMode.close();
}
}
}
std::cout << " Done" << std::endl;
}
Thanks for your insights!
1 Answer 1
- Instead of
#define
useenum
orconstexpr
for the constant numbers. - I don't think that the number of time data or number of points should be absolute constants - rather they should be determined during runtime. Else you have to recompile your code each time you receive new data.
- Better declare some sort of output path instead of relying on
../
, also you need to make sure that the directories exist, else it will fail to save. - You don't want to store small pieces of data over huge amount of files and directories. Better make a serialization standard for yourselves and store a sizable amount of data inside each file. Also, consider binary save/load for better speed, accuracy, and consistency, though, it will be less readable for humans and portability might suffer a bit - some odd platforms use less common endianess.
You can speed up the loop
for (size_t i = 0; i < TSIZE; i++) { for (size_t j = 0; j <= i; j++) { pm(j, i) = pm(i, j) = (1.0 / TSIZE) * (m.col(i).dot(m.col(j).transpose())); } }
Note: by
dot
function I expect to see scalar pruduct, so it is odd for me to see that you transpose the vector inside it. Use operator * for matrix multiplication.I am not too familiar with these Matrices classes, but most likely, it will be better if you replace
podx.col(i) = podx.col(i) + (1.0 / (eigval(i) * TSIZE)) * sqrt(eigval(i) * TSIZE) * eigvec(j, i) * m.block<MSIZE, 1>(0 * MSIZE, j);
with
podx.col(i) += (eigvec(j, i) / sqrt(eigval(i) * TSIZE)) * m.block<MSIZE, 1>(0 * MSIZE, j);
also, what to do if
eigval(i) == 0.
or too close to it?- If you plan to grow your project you'd better eventually replace
std::cout
with a usage of a logger class asstd::cout
is not thread friendly despite being technically thread safe. Also, in this, case move the code inside a class and each relevant section move inside a function with a sensible name.
-
\$\begingroup\$ "Instead of
#define
useenum
for the constant numbers." No, useconstexpr
. \$\endgroup\$L. F.– L. F.2019年10月12日 10:47:49 +00:00Commented Oct 12, 2019 at 10:47 -
\$\begingroup\$ @L.F. I see constexpr being better when you require computation or non trivial classes. But for simple integer constants? \$\endgroup\$ALX23z– ALX23z2019年10月12日 14:22:37 +00:00Commented Oct 12, 2019 at 14:22
-
\$\begingroup\$ Still better. For both semantics and consistency. \$\endgroup\$L. F.– L. F.2019年10月12日 14:31:03 +00:00Commented Oct 12, 2019 at 14:31
-
\$\begingroup\$ @L.F. edited and made recommendation for both. \$\endgroup\$ALX23z– ALX23z2019年10月12日 14:35:53 +00:00Commented Oct 12, 2019 at 14:35