I have written a C program to compute values of the Hardy Z function, \$Z(t)\$. The Hardy Z function is a continuous, real-valued function that takes a positive real number \$t\$ as its input and outputs a real number that can be either positive or negative. Since \$Z(t)\$ is continuous, if it is positive (negative) at the beginning of a real interval and negative (positive) at the end of the interval, then it must be zero at some point inside that interval. It turns out that \$Z(t)\$ is zero at a given \$t\$ if and only if the complex-valued Riemann zeta function, \$\zeta(s)\$, is zero at the complex number \$s = 1/2 + it\$.
I will briefly describe the mathematical algorithm for \$Z(t)\$ before getting to my programming questions. \$Z(t) = M + E\$, where \$M =\$ the main term (easy formula) and \$E =\$ the error term (complicated formula). So, \$E\$ computes the error introduced if you only use the main term. There is a very good estimate of the \$E\$ term; call it \$R\$ (easy formula). The estimate \$Z(t) \approx M + R\$ is accurate to within \$\pm 10^{-6}\$ for \$t > 200\$. Here, I am focused only on the \$M\$ term. Let
\begin{align*} N=\left[\left(\frac{t}{2\pi}\right)^{1/2}\right]\quad\text{and}\quad \theta(t) = \frac{t}{2}\log\left(\frac{t}{2\pi}\right) - \frac{\pi}{8} - \frac{t}{2} + \frac{1}{48 t} + \frac{7}{5760 t^3}, \end{align*}
where \$[x]\$ means the integral part of \$x\$. Then
\begin{align*} M(t) = 2 \sum_{n=1}^{N} n^{-1/2}\cos[\theta(t) - t \log n]. \end{align*}
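For concreteness, the definition of \$N\$ translates directly into C; the sketch below is illustrative only (it assumes a long double \$t\$ and is not my program's exact code):

#include <math.h>

// N = [sqrt(t / (2*pi))], where [x] is the integral part of x.
static unsigned ComputeN(long double t)
{
    const long double pi = acosl(-1.0L);
    return (unsigned) floorl(sqrtl(t / (2.0L * pi)));
}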
Using \$Z(t) = M + R\$, my program calculates \$Z(t)\$ to an accuracy of at least 6 digits to the right of the decimal for \$t\$ values having up to 9 total digits (including digits to the left and right of the decimal). Then, accuracy starts to fall off. Looking at the above formula for \$M(t)\$, I can see two problems:
For large \$t\$, \$N\$ gets fairly large. (For \$t = 10,000,000,000\$, \$N = 39,894\$). A very small error introduced by floating point calculations grows to a "real error" when multiplied 39,894 times.
For large \$t\$, both \$\theta(t)\$ and \$t \log n\$ become significant multiples of \$t\$. Thus, at each iteration of the sum, floating point calculations are being made on numbers much larger than \$t\$.
To address the second problem, my code avoids any numbers larger than \$t\$ in the calculations, using the fact that the cosine function is \2ドル\pi\$ periodic. To explain my code, assume we want the \2ドル\pi\$ remainder of \$A\cdot B\$, but direct multiplication yields a number that is too large. We can ensure that all intermediate values are less than \$\max(A,B)\$ as follows.
Let \$i,j,k,m\$ be positive integers and let \$w,x,y,z\$ be positive real numbers. Choose any \$K\$ that is a positive even integer. Now let
\begin{align*} A = K\pi i + w,\kern6pt B = K\pi j + x,\kern6pt (ix + jw) = k+y,\kern6pt(K\pi - [K\pi])ij = m + z, \end{align*}
where again \$[x]\$ means the integral part of \$x\$. We then have
\begin{align*} A \cdot B &= (K\pi i + w) \cdot (K\pi j + x) = K\pi\{ (K\pi - [K\pi] + [K\pi])(ij)\} + K\pi\ (ix + jw) + wx \\ &= K\pi \cdot [K\pi]ij + K\pi\{ (K\pi - [K\pi] )ij\} + K\pi(ix + jw) + wx \\ &= K\pi \cdot [K\pi]ij + K\pi (m+z) + K\pi (k+y) + wx \\ &= K\pi \cdot [K\pi]ij + K\pi (k+m) + K\pi (y+z) + wx. \end{align*}
We can now drop the terms evenly divisible by \2ドル\pi\$, leaving
\begin{align*} K\pi(y+z) + wx. \end{align*}
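As a standalone illustration, the identity above transcribes into C as follows. This is only a sketch of the reduction with \$K = 2\$ (so \$K\pi = 2\pi\$ and \$[K\pi] = 6\$); my actual routine below inlines the same steps, using \$K = 2\$ for \$\theta(t)\$ and \$K = 4\$ for \$t \log n\$:

#include <math.h>

// Returns a value congruent to A*B modulo 2*pi without ever forming A*B directly.
static long double prod_mod_2pi(long double A, long double B)
{
    const long double Kpi    = 2.0L * acosl(-1.0L);      // K*pi with K = 2 (even)
    const long double KpiInt = 6.0L;                      // [K*pi], its integral part
    long double i, j, k;                                  // integer parts, stored by modfl

    long double w = Kpi * modfl(A / Kpi, &i);             // A = K*pi*i + w
    long double x = Kpi * modfl(B / Kpi, &j);             // B = K*pi*j + x
    long double y = modfl(i * x + j * w, &k);             // i*x + j*w = k + y
    long double z = fmodl((Kpi - KpiInt) * i * j, 1.0L);  // (K*pi - [K*pi])*i*j = m + z

    return Kpi * (y + z) + w * x;                         // K*pi*(y+z) + w*x, congruent to A*B (mod 2*pi)
}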
Now for my actual C code, and then my questions.
typedef long double xdouble;
// -------------------------------------------------------------------
// We compute the main term of the Riemann-Siegel formula. We will
// need the uiN variable from ComputeNandP, along with the
// current dt variable. The first step is to compute theta (one time).
// Using that theta, we make uiN passes through a calculating loop.
// -------------------------------------------------------------------
xdouble ComputeMain(xdouble dt, unsigned int uiN)
{
    // -------------------------------------------------------------------
    // We compute theta. The formula from our book is:
    //
    //   xdouble dTheta = ((dt/2) * (log(dt / (2 * M_PI)))) - M_PI/8 - dt/2
    //                    + 1/(48 * dt) + 7/(5760 * powl(dt, 3));
    //
    // The problem with that formula is dTheta can be multiples of the
    // value of dt. As dt approaches our floating point limits, this
    // will cause problems. To resolve this issue, note that dTheta
    // is only used in the 2 pi periodic cosine function. Thus, all we
    // need is the remainder of dTheta / (2 * pi). We implement here
    // the algorithm discussed in our paper.
    // ------------------------------------------------------------------
    #define PI_TIMESP    (2 * M_PI)   // it may be optimal to increase this
    #define PI_TIMESPINT 6.0          // depends on PI_TIMESP
    #define PI_TIMESK    (4 * M_PI)   // it may be optimal to increase this
    #define PI_TIMESKINT 12.0         // depends on PI_TIMESK

    xdouble dMain = 0;
    xdouble i, j, k;
    xdouble w, x, y, z;

    w = PI_TIMESP * modfl(dt/(2 * PI_TIMESP), &i);
    x = PI_TIMESP * modfl((log(dt / (2 * M_PI))-1) / PI_TIMESP, &j);
    y = modfl(((i * x) + (j * w)), &k);
    z = fmodl(((PI_TIMESP)-PI_TIMESPINT) * i * j, 1);

    xdouble dTheta = ((y + z) * PI_TIMESP) + (w * x) - M_PI/8
                     + 1/(48 * dt) + 7/(5760 * powl(dt, 3));

    // -------------------------------------------------------------------
    // Like dTheta, computing the term (dt * log((xdouble)n)) inside
    // the cosine function can evaluate to multiples of dt, with the same
    // floating point issues. We again would like to use the algorithm
    // discussed in our paper to reduce the size of both (dt * log(n))
    // and its intermediate variables. We can compute the algorithmic
    // dt term outside the for loop. The other parts of the algorithm
    // depend on n and must be done at each iteration of the for loop.
    // ------------------------------------------------------------------
    w = PI_TIMESK * modfl(dt/PI_TIMESK, &i);

    xdouble dtLogN;
    for (unsigned int n = 1; n <= uiN; ++n) {
        x = PI_TIMESK * modfl(log((xdouble)n)/ PI_TIMESK, &j);
        y = modfl(((i * x) + (j * w)), &k);
        z = fmodl(((PI_TIMESK)-PI_TIMESKINT) * i * j, 1);
        dtLogN = ((y + z) * PI_TIMESK) + (w * x);
        dMain += ((1 / sqrtl((xdouble)n)) * cosl(dTheta - dtLogN));
    }
    return(2 * dMain);
}
My primary question is whether my "avoid variables larger than t" algorithm is useful, neutral or harmful. The useful part is avoiding numbers larger than \$t\$. The harmful part is including even more floating point calculations. I am hoping someone with floating point knowledge can give me their perspective.
Regarding the first problem, I’m not sure much can be done other than improve the accuracy of each iteration of the sum. When I switched from a compiler with a 64-bit long double to a compiler (gcc/mingw-64) with an 80-bit long double, that helped somewhat, but accuracy again falls off at ten total digits (at least if the left-most digit is above 5). Other than moving to a 128-bit long double, any other thoughts?
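For reference, a quick way to confirm what a given compiler's long double actually provides (not part of my program, just a check):

#include <float.h>
#include <stdio.h>

int main(void)
{
    // 80-bit long double: 64-bit significand, ~18 decimal digits, epsilon ~1e-19.
    // 64-bit long double: 53-bit significand, ~15 decimal digits, epsilon ~2e-16.
    printf("LDBL_MANT_DIG = %d, LDBL_DIG = %d, LDBL_EPSILON = %Le\n",
           LDBL_MANT_DIG, LDBL_DIG, LDBL_EPSILON);
    return 0;
}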
1 Answer
Use long double constants
M_PI, although not part of the standard C library, is often coded as a double constant like

#define M_PI 3.1415926535897932384626433832795
Yet OP is using long double and mod math. I recommend using a long double constant:

//                                                   v Append 'L'
#define M_PI_L 3.141592653589793238462643383279502884L
//             1 23456789 123456789 123456789 1234567

Consider at least 3+ decimal digits more than binary128 needs, or calculate it at runtime:

const long double pi = acosl(-1.0L);
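To see the gap, a small illustration (not code from the post) that prints the difference between the usual double M_PI and a long double π; when long double is wider than double, the difference is on the order of 1e-16:

#include <math.h>
#include <stdio.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846   /* typical double-precision definition */
#endif

int main(void)
{
    const long double pi = acosl(-1.0L);
    printf("pi - (long double)M_PI = %.3Le\n", pi - (long double)M_PI);
    return 0;
}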
Enable all compiler warnings. Use long double functions

Unclear why OP uses cosl(), yet not logl(), for long double math.

// log((xdouble)n)/ PI_TIMESK
logl((xdouble)n)/ PI_TIMESK

This hints that all warnings are not enabled.
Missing #include <math.h>
Avoid obfuscation: typedef long double xdouble;

Perhaps OP wants to also try typedef double xdouble;, etc. Yet the non-uniform use of long double functions and constants leads to unneeded complications. For now, choose one type and stick to it, using its common name long double.

For some generic math type matching, research <tgmath.h>. If used, be cautious.
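A minimal illustration of what <tgmath.h> dispatch does (illustration only, not suggested code for this program):

#include <stdio.h>
#include <tgmath.h>

int main(void)
{
    long double x = 1.0e10L;
    long double c = cos(x);   // type-generic: resolves to cosl() for a long double argument
    printf("%.18Le\n", c);
    return 0;
}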
Use the same object names as reference equations

The undocumented casual name changes used by OP might make sense for OP, yet they add more cognitive overhead than simply being consistent.

// xdouble ComputeMain(xdouble dt, unsigned int uiN)
long double ComputeMain(long double t, unsigned N)
M_PI is not π

M_PI is a rational number (some double) that is near π (an irrational). OP's attempt at doing mod (M_PI*someInt) math makes things worse than the mod-π reduction that a good cosl() already does, likely using an extended-precision π constant internally.

That OP is not using a long double version of log, nor a long double M_PI, underscores this.
If you really want to do mod π math, see
K.C. Ng's "ARGUMENT REDUCTION FOR HUGE ARGUMENTS: Good to the Last Bit"
Code directly the equation

First code a fully long double calculation without your mod improvements and use it as a reference.

Did this really perform unacceptably for OP?

long double theta = ((t/2) * (logl(t / (2 * M_PI_L)))) //
                    - M_PI_L/8 - t/2 + 1/(48 * t) + 7/(5760 * powl(t, 3));
...
long double sum = 0.0;
for (unsigned n = 1; n <= N; n++) {
    sum += cosl(theta - t*logl(n))/sqrtl(n);
}
long double M_t = sum*2.0L;
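For comparison against OP's ComputeMain(), the fragment can be packaged as a self-contained reference function; the wrapping and the name ComputeMainDirect are mine:

#include <math.h>

#define M_PI_L 3.141592653589793238462643383279502884L

// Direct long double evaluation of the main term, intended only as a
// reference to measure the OP's reduction scheme against.
long double ComputeMainDirect(long double t, unsigned N)
{
    long double theta = (t / 2) * logl(t / (2 * M_PI_L))
                        - M_PI_L / 8 - t / 2 + 1 / (48 * t) + 7 / (5760 * powl(t, 3));
    long double sum = 0.0L;
    for (unsigned n = 1; n <= N; n++) {
        sum += cosl(theta - t * logl(n)) / sqrtl(n);
    }
    return sum * 2.0L;
}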
Now consider mod

Maybe some user-code mod math is good; I just do not see it yet. I do see casual type changes between double and long double that should be rectified first.

It is possible cospi() (slated for C2X) may help in OP's mod code:

    The cospi functions compute the cosine of π × x, thus regarding x as a measurement in half revolutions.
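A hedged sketch of the idea, mine and not working cospi() code (cospil() is not yet widely available): carry the phase in units of π, so the periodic reduction becomes an exact fmodl(u, 2.0L) and no rounded 2π constant enters the reduction.

#include <math.h>

// u is a phase expressed in units of pi (half revolutions).
// fmodl(u, 2.0L) is exact because the period, 2, is exactly representable.
static long double cos_pi_times(long double u)
{
    u = fmodl(u, 2.0L);
    return cosl(acosl(-1.0L) * u);   // with C23: return cospil(u);
}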
Note: % and fmod(), fmodl() are better thought of as remainder functions than as Euclidean mod.
Maybe: Summation order

A general guide for summation order: if the terms are all the same sign, sum the smallest magnitude parts first (from 'N' to 1). If the signs alternate, going from largest magnitude to least, use the forward order (from '1' to N). This may only improve things by a few bits.

In this case, it's more complicated and deserves a study.
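One thing such a study might try is compensated (Kahan) summation of the main loop; a sketch, reusing the names from the direct version above:

long double sum = 0.0L, comp = 0.0L;   // comp carries the low-order bits lost by each add
for (unsigned n = 1; n <= N; n++) {
    long double term = cosl(theta - t * logl(n)) / sqrtl(n);
    long double y = term - comp;
    long double s = sum + y;
    comp = (s - sum) - y;              // what the addition just dropped
    sum = s;
}
long double M_t = sum * 2.0L;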
Perhaps more later. A very intriguing and advanced math post.
- All great comments. (I'll wait just a bit longer before up-voting). I am new to this site. I could update the post with the full program code, but it is about 400 lines. Is that too much for this site? If that is OK, I should also explain HOW I verify the accuracy of the output. – TMurphy, Oct 14, 2024 at 14:10
- @TMurphy, take whatever time you want. If you want to update, consider another post. Review the help center for guidelines. Since you ask about accuracy, it makes sense to see your assessment criteria. – chux, Oct 14, 2024 at 14:47
- @TMurphy Using the simplified ComputeMain() to compute zeros, I found "Zero: HardyZ( 0 ,0) --> 0 Zero: HardyZ( 0.783403098773051473899861 ,1) --> -5.42101e-20 Zero: HardyZ( 0.627526795358594805462286 ,2) --> 5.69206e-19 Zero: HardyZ( 0.55437338863809121607823 ,3) --> -2.71051e-20 Zero: HardyZ( 0.5098041507833285227347 ,4) --> 4.87891e-19 Zero: HardyZ( 0.478990168127584384422709 ,5) --> -5.42101e-20 Zero: HardyZ( 0.411232898663775866637772 ,9) --> 0". How does this compare with expectations? – chux, Oct 15, 2024 at 14:10
- @chux - Reinstate Monika Not sure about your ComputeMain() values. Almost all values will be in the range -20 to 20. If you want to look at the source code, you can access it via link, along with a method to verify computed values. – TMurphy, Oct 16, 2024 at 20:36
- xdouble dt in your code represents t in your math and unsigned int uiN in your code represents N in your math – is this the case?