The original version takes 16.48 seconds and the version with the modifications mentioned above takes 2.82 seconds on my machine, so over 5 times faster. I'm compiling and running on a 64-bit Linux machine. Here's the Makefile I created. The work directory has the original.cpp and schroedinger.cpp (the modified version). I then created directories named test and golden. I type make golden to create a reference version of all outputs in the golden directory and then make to test the new version and compare it to the original. Also, I am using my usual environment flags, so on my machine CXXFLAGS="-Wall -Wextra -pedantic -std=c++14". The profile.txt target uses gprof to generate a profile of the code after it has run.
Makefile

.PHONY: all
all: result.diff

profile.txt: result.diff
	mv test/gmon.out .
	gprof schroedinger >$@

schroedinger: schroedinger.cpp
	g++ -g -pg -O2 $(CXXFLAGS) $^ -o $@

original: original.cpp
	g++ -g -pg -O2 $(CXXFLAGS) $^ -o $@

.PHONY: golden
golden: original
	cd golden && time ../original >/dev/null

result.diff: schroedinger
	cd test && time ../schroedinger >/dev/null
	diff -Naur --exclude '*.out' golden test >$@
I see a number of things that may help you improve your program.
Prefer constexpr to old-style #define
Rather than using a #define for all of the constants, the code could use a constexpr:
constexpr auto K2{1.0E6};
constexpr auto omega1{std::sqrt(K1/M)};
constexpr compx I{0.0, 1.0};
constexpr compx one{1.0,0.0};
constexpr compx two{2.0,0.0};
Generally the advantage is that the value has an associated type. Depending on compiler optimizations, it may also confer a speed advantage because you have explicitly told the compiler that the value is a constant known at compile time.
Fix your constructor
It appears that you want to have two kinds of constructors: one simply creates the data structures and the other does a more thorough initialization. Instead of providing two different constructors, I'd suggest that you create a single constructor that has a default value for the constant.
Eliminate unused variables
The variable check in your code is defined but never used. Similarly, the parameter New in alpha_beta_solver() is never used. Since unused variables are a sign of poor code quality, you should seek to eliminate them. Your compiler is probably smart enough to warn you about such things if you know how to ask it to do so.
Ensure every control path returns a proper value
The sec_space_deriv() function returns one calculated value if the first parameter was 'x', another if it was 'y', but then doesn't return anything at all otherwise. This is an error. The code could instead be written like this:
compx wave_function::sec_space_deriv(bool want_x1, int l, int m) const {
    if (want_x1) {
        if (l == 0)
            return value[l + 1][m] - two * value[l][m] + value[grid_point - 1][m];
        else if (l == grid_point - 1)
            return value[0][m] - two * value[l][m] + value[l - 1][m];
        else
            return value[l + 1][m] - two * value[l][m] + value[l - 1][m];
    }
    if (m == 0)
        return value[l][m + 1] - two * value[l][m] + value[l][grid_point - 1];
    else if (m == grid_point - 1)
        return value[l][0] - two * value[l][m] + value[l][m - 1];
    return value[l][m + 1] - two * value[l][m] + value[l][m - 1];
}
Note that I've added the keyword const to that function, which brings me to the next suggestion.
Use const where practical
For a number of the operations, the wave_function data are unaltered. That's good design, but you should make it explicit by marking those member functions const. In particular, potential(), real_space() and sec_space_deriv() are all const functions. Similarly, the first and third arguments to alpha_beta_solver should be const references.
Prefer std::array to raw arrays
At the moment, the code for rho() begins like this:
for (int k = 0; k < grid_point; k++) {
    density_x1[k] = 0.;
    density_x2[k] = 0.;
}
If instead, you were to define density_x1 and density_x2 like this:
std::array<double, grid_point> density_x1;
std::array<double, grid_point> density_x2;
That part of rho() would look like this:
density_x1.fill(0);
density_x2.fill(0);
Use <cmath> instead of <math.h>
The difference between the two forms is that the former defines things within the std:: namespace rather than in the global namespace. Language lawyers have lots of fun with this, but for daily use I'd recommend using <cmath>. See this SO question for details. There is never any need to include both.
Don't use std::endl if you don't really need it
The difference between std::endl and '\n' is that '\n' just emits a newline character, while std::endl actually flushes the stream. This can be time-consuming in a program with a lot of I/O and is rarely actually needed. It's best to use std::endl only when you have some good reason to flush the stream, and that's not very often needed for simple programs such as this one. Avoiding the habit of using std::endl when '\n' will do will pay dividends in the future as you write more complex programs with more I/O and where performance needs to be maximized.
Profile code performance
I profiled your code using gprof and noticed that most of the time (98.84%) was being spent in alpha_beta_solver(). Looking at the code, that function does two things: it initializes a number of local vectors and then applies them to the value stored in the wave_function class. However, all of the setup is actually constant, so we can split the function into two pieces. The first creates x_N, x_1 and x_2, and the second simply uses those to update value. Here's how I rewrote them:
compx wave_function::alpha_beta_solver(const compx &b4_mid, const std::array<compx, grid_point> &mid,
                                       const compx &post_mid, const std::array<compx, grid_point> &R,
                                       std::vector<compx> &x_1, std::vector<compx> &x_2) {
    compx R_N;  // new_mid[grid_point], new_R[grid_point];
    std::vector<compx> alpha(grid_point - 2);
    std::vector<compx> beta1(grid_point - 1);  // for x_1
    std::vector<compx> beta2(grid_point - 1);  // for x_2
    alpha[0] = post_mid / mid[0];
    beta1[0] = R[0] / mid[0];
    beta2[0] = -b4_mid / mid[0];
    // Forward run
    for (int l = 1; l < grid_point - 2; l++) {
        alpha[l] = post_mid / (mid[l] - b4_mid*alpha[l - 1]);
        beta1[l] = (R[l] - b4_mid*beta1[l - 1]) / (mid[l] - b4_mid*alpha[l - 1]);
        beta2[l] = (-b4_mid*beta2[l - 1]) / (mid[l] - b4_mid*alpha[l - 1]);
    }
    beta1[grid_point - 2] = (R[grid_point - 2] - b4_mid*beta1[grid_point - 3]) / (mid[grid_point - 2] - b4_mid*alpha[grid_point - 3]);
    beta2[grid_point - 2] = (-post_mid - b4_mid*beta2[grid_point - 3]) / (mid[grid_point - 2] - b4_mid*alpha[grid_point - 3]);
    // Backward run
    x_1[grid_point - 2] = beta1[grid_point - 2];
    x_2[grid_point - 2] = beta2[grid_point - 2];
    for (int l = grid_point - 3; l >= 0; l--) {
        x_1[l] = beta1[l] - alpha[l] * x_1[l + 1];
        x_2[l] = beta2[l] - alpha[l] * x_2[l + 1];
    }
    return (R_N - post_mid*x_1[0] - b4_mid*x_1[grid_point - 2])
           / (mid[grid_point - 2] + post_mid*x_2[0] + b4_mid*x_2[grid_point - 2]);
}
void wave_function::update_value(int coor, bool want_x, const compx &x_N,
                                 const std::vector<compx> &x_1, const std::vector<compx> &x_2) {
    if (want_x) {
        value[grid_point - 1][coor] = x_N;
        for (int l = 0; l < grid_point - 1; l++) {
            value[l][coor] = x_1[l] + x_2[l] * x_N;
        }
    }
    else {
        value[coor][grid_point - 1] = x_N;
        for (int l = 0; l < grid_point - 1; l++) {
            value[coor][l] = x_1[l] + x_2[l] * x_N;
        }
    }
}
And here is the calling function, reworked to use std::array and to take advantage of these split functions:
void wave_function::solve_triag() {
    std::array<compx, grid_point> mid1, mid2, R1, R2;
    std::vector<compx> x_1(grid_point), x_2(grid_point);
    mid1.fill(one - two*A);
    for (int x2 = 0; x2 < grid_point; x2++) {
        for (int i = 0; i < grid_point; i++) {
            R1[i] = (one - C*potential(i, x2))*value[i][x2] - B*sec_space_deriv(false, i, x2);
        }
        compx x_N = alpha_beta_solver(A, mid1, A, R1, x_1, x_2);
        for (int x1 = 0; x1 < grid_point; x1++) {
            update_value(x2, true, x_N, x_1, x_2);
        }
    }
    wave_function tmp;
    for (int x1 = 0; x1 < grid_point; x1++) {
        for (int i = 0; i < grid_point; i++) {
            mid2[i] = one - two*B + C*potential(x1, i);
            R2[i] = tmp.value[x1][i] - A*tmp.sec_space_deriv(true, x1, i);
        }
        compx x_N = alpha_beta_solver(B, mid2, B, R2, x_1, x_2);
        for (int x2 = 0; x2 < grid_point; x2++) {
            update_value(x1, false, x_N, x_1, x_2);
        }
    }
    normalize();
}
There are other optimizations possible here, but I wanted to illustrate the process rather than doing it all for you.
Results
With the changes above, the run time on my machine drops from 16.48 seconds for the original version to 2.82 seconds for the modified version, so it's over 5 times faster.