Calculate confidence intervals for correlation coefficients
jfenger/correlation
correlation

Calculate confidence intervals for correlation coefficients, including Pearson's r, Kendall's tau, Spearman's rho, and customized correlation measures.

Methodology

Two approaches are offered for calculating the confidence intervals: a parametric approach based on a normal approximation, and a nonparametric approach based on bootstrapping.

Parametric Approach

Let r_hat be the correlation we obtained. Under the (Fisher) transformation

z(r) = ln((1+r)/(1-r))/2,

z approximately follows a normal distribution,
with mean equal to z(r_hat)
and variance sigma^2 equal to 1/(n-3), 0.437/(n-4), and (1+r_hat^2/2)/(n-3) for Pearson's r, Kendall's tau, and Spearman's rho, respectively (see Refs. [1, 2] for details). Here n is the array length.

The (1-alpha) CI for r would be

(T(z_lower), T(z_upper))

where T is the inverse of the transformation above,

T(x) = (exp(2x) - 1) / (exp(2x) + 1),
z_lower = z(r_hat) - z_(1-alpha/2) * sigma,
z_upper = z(r_hat) + z_(1-alpha/2) * sigma,

and z_(1-alpha/2) is the (1-alpha/2) quantile of the standard normal distribution.

This normal approximation works when the absolute values of Pearson's r, Kendall's tau, and Spearman's rho are less than 1, 0.8, and 0.95, respectively.
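
The parametric interval described above can be sketched in a few lines of standard-library Python. This is an illustration of the formulas only, not the package's own code: the function names `z_variance` and `parametric_ci` and the method labels "pearson", "kendall", and "spearman" are hypothetical and chosen for this example.

```python
import math
from statistics import NormalDist


def z_variance(r_hat, n, method):
    """Variance sigma^2 of the transformed correlation (hypothetical helper)."""
    if method == "pearson":
        return 1.0 / (n - 3)
    if method == "kendall":
        return 0.437 / (n - 4)
    if method == "spearman":
        return (1.0 + r_hat ** 2 / 2.0) / (n - 3)
    raise ValueError(f"unknown method: {method!r}")


def parametric_ci(r_hat, n, method="pearson", alpha=0.05):
    """Return the (1 - alpha) CI for a correlation via the normal approximation."""
    # atanh(r) is the same as ln((1 + r) / (1 - r)) / 2; it requires |r_hat| < 1.
    z = math.atanh(r_hat)
    sigma = math.sqrt(z_variance(r_hat, n, method))
    # (1 - alpha/2) quantile of the standard normal distribution.
    q = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # tanh(x) is the same as (exp(2x) - 1) / (exp(2x) + 1), the inverse transform T.
    return math.tanh(z - q * sigma), math.tanh(z + q * sigma)


if __name__ == "__main__":
    lower, upper = parametric_ci(0.3, n=100, method="kendall")
    print(lower, upper)
```

Because the interval is built on the z scale and mapped back through T, it is generally not symmetric around r_hat, and its endpoints always stay inside (-1, 1).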

Nonparametric Approach

For the nonparametric approach, we simply adopt a naive bootstrap method.

  • Sample pairs (x_i, y_i) with replacement from the original (paired) samples until the new sample has size n, and calculate a correlation coefficient from the new sample.
  • Repeat this process a large number of times (5000 by default).
  • Obtain the (1-alpha) CI for r by taking the alpha/2 and (1-alpha/2) quantiles of the resulting correlation coefficients.
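
The bootstrap steps above can be sketched as follows, using only the standard library. This is a minimal illustration under stated assumptions, not the package's implementation: `pearson_r` and `bootstrap_ci` are hypothetical names, Pearson's r stands in for whichever correlation measure is of interest, and the quantiles are taken with a simple index rule on the sorted estimates.

```python
import math
import random


def pearson_r(x, y):
    """Plain Pearson correlation (stand-in for any correlation measure)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    # Assumes the sample is not constant; a degenerate resample would divide by zero.
    return sxy / math.sqrt(sxx * syy)


def bootstrap_ci(x, y, stat=pearson_r, alpha=0.05, n_boot=5000, seed=None):
    """Return a percentile-bootstrap (1 - alpha) CI for stat(x, y)."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        # Resample index pairs with replacement so each (x_i, y_i) stays paired.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(stat([x[i] for i in idx], [y[i] for i in idx]))
    estimates.sort()
    # Simple empirical quantiles at alpha/2 and 1 - alpha/2.
    lower = estimates[int(math.floor((alpha / 2.0) * (n_boot - 1)))]
    upper = estimates[int(math.ceil((1.0 - alpha / 2.0) * (n_boot - 1)))]
    return lower, upper


if __name__ == "__main__":
    xs = list(range(50))
    ys = [2 * v + (v * 7) % 5 for v in xs]
    print(bootstrap_ci(xs, ys, n_boot=1000, seed=0))
```

Passing a different function as `stat` is how a customized correlation measure would plug into this sketch; fixing `seed` makes the interval reproducible between runs.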

References

[1] Bonett, Douglas G., and Thomas A. Wright. "Sample size requirements for estimating Pearson, Kendall and Spearman correlations." Psychometrika 65, no. 1 (2000): 23-28.
[2] Bishara, Anthony J., and James B. Hittner. "Confidence intervals for correlations when data are not normal." Behavior research methods 49, no. 1 (2017): 294-309.

Installation:

pip install correlation

or

conda install -c wangxiangwen correlation

Example Usage:

>>> import correlation
>>> a, b = list(range(2000)), list(range(200, 0, -1)) * 10
>>> correlation.corr(a, b, method='spearman_rho')
(-0.0999987624920335, # correlation coefficient
 -0.14330929583811683, # lower endpoint of CI
 -0.056305939127336606, # upper endpoint of CI
 7.446171861744971e-06) # p-value
