Correlation Test for Two Continuous Variables
This function is a wrapper for the stats::cor.test function, except when
method = "spearman" is selected and there are ties in at least one
variable, in which case it is a wrapper for coin::spearman_test
using the approximate method.
Usage
cor_test(
x,
y,
method = c("pearson", "kendall", "spearman"),
seed = 68954857,
nresample = 10000,
exact = TRUE,
verbose = FALSE,
...
)
Arguments
- x
numeric vector (can include NA values).
- y
numeric vector (can include NA values).
- method
a character string indicating which correlation coefficient is to be used for the test. One of "pearson", "kendall", or "spearman", can be abbreviated to "p", "k", or "s".
- seed
seed (only used if method = "spearman").
- nresample
a positive integer, the number of Monte Carlo replicates used for the computation of the approximative reference distribution. Defaults to 10000. (only used if method = "spearman").
- exact
should exact method be used. Ignored if method = "pearson" or if method = "spearman" and there are ties in x or y.
- verbose
a logical variable indicating if warnings and messages should be displayed.
- ...
parameters passed to stats::cor.test or coin::spearman_test
Details
The three methods each estimate the association between paired samples and compute a test of the value being zero. They use different measures of association, all in the range [-1, 1] with 0 indicating no association. These are sometimes referred to as tests of no correlation, but that term is often confined to the default method.
If method is "pearson", the test statistic is based on Pearson's product moment correlation coefficient cor(x, y) and follows a t distribution with length(x)-2 degrees of freedom if the samples follow independent normal distributions. If there are at least 4 complete pairs of observations, an asymptotic confidence interval is given based on Fisher's Z transform.
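The Pearson statistic and the Fisher Z interval can be reproduced by hand. The following is a minimal sketch using only base R; the simulated data, the seed, and the variable names are illustrative assumptions and do not come from the package.

```r
# Reproduce the Pearson t statistic and Fisher Z interval by hand
# (illustrative data; the seed is arbitrary).
set.seed(1)
x <- rnorm(20)
y <- x + rnorm(20)

n <- length(x)
r <- cor(x, y)

# t statistic with n - 2 degrees of freedom and its two-sided p-value
t_stat <- r * sqrt((n - 2) / (1 - r^2))
p_val  <- 2 * pt(-abs(t_stat), df = n - 2)

# 95% interval: transform with atanh, build a normal interval with
# standard error 1 / sqrt(n - 3), then back-transform with tanh
z  <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

# Compare against stats::cor.test
ref <- stats::cor.test(x, y, method = "pearson")
all.equal(unname(ref$statistic), t_stat)
all.equal(ref$p.value, p_val)
all.equal(as.numeric(ref$conf.int), ci)
```

The interval needs n - 3 > 0, which is why at least 4 complete pairs are required.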
If method is "kendall" or "spearman", Kendall's tau or Spearman's rho statistic is used to estimate a rank-based measure of association. These tests may be used if the data do not necessarily come from a bivariate normal distribution.
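To illustrate what "rank-based" means here, the sketch below (base R only, with made-up data) checks that Spearman's rho equals the Pearson correlation of the ranks, and that Kendall's tau is unchanged by a strictly increasing transformation of one variable.

```r
# Illustrative data; the seed is arbitrary.
set.seed(2)
x <- rnorm(15)
y <- x + rnorm(15)

# Spearman's rho is the Pearson correlation computed on the ranks
rho_direct <- cor(x, y, method = "spearman")
rho_ranks  <- cor(rank(x), rank(y))
all.equal(rho_direct, rho_ranks)

# exp() is strictly increasing, so the ordering of x (and hence tau) is preserved
tau_raw <- cor(x, y, method = "kendall")
tau_exp <- cor(exp(x), y, method = "kendall")
all.equal(tau_raw, tau_exp)
```

Because only the ordering of the values matters, these measures do not depend on the data following a bivariate normal distribution.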
The preferred method for a Spearman test is the exact method, unless
computation time is too high. This
preferred method is obtained through stats::cor.test with exact = TRUE.
When there are ties in either variable, no exact method is possible.
Unfortunately, if there are any ties, the stats::cor.test function switches
to the asymptotic method, which is especially troubling with small sample
sizes. If there are ties, cor_test instead switches to the approximate
method available in coin::spearman_test.
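The switching logic described above can be sketched roughly as follows. This is not the package's actual implementation: spearman_p_sketch is a hypothetical helper, and the sketch assumes a version of coin in which coin::approximate accepts an nresample argument (older releases may name it differently).

```r
# Hypothetical helper illustrating the dispatch; NOT the cor_test source code.
spearman_p_sketch <- function(x, y, seed = 1, nresample = 10000) {
  ok <- stats::complete.cases(x, y)   # keep complete pairs only
  x <- x[ok]
  y <- y[ok]
  has_ties <- anyDuplicated(x) > 0 || anyDuplicated(y) > 0

  if (!has_ties) {
    # No ties: the exact test from stats::cor.test is available
    return(stats::cor.test(x, y, method = "spearman", exact = TRUE)$p.value)
  }

  if (!requireNamespace("coin", quietly = TRUE)) {
    stop("ties present: the 'coin' package is needed for the approximate method")
  }
  set.seed(seed)                      # make the Monte Carlo p-value reproducible
  res <- coin::spearman_test(
    y ~ x,
    data = data.frame(x = x, y = y),
    # assumption: 'nresample' is the argument name in the installed coin version
    distribution = coin::approximate(nresample = nresample)
  )
  as.numeric(coin::pvalue(res))
}

set.seed(3)
a <- rnorm(10)
b <- a + rnorm(10)
spearman_p_sketch(a, b)               # no ties: exact p-value
```

With tied data, for example spearman_p_sketch(c(a, a), c(b, b)), the same helper would take the Monte Carlo branch, provided coin is installed.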
Examples
set.seed(5432322)
x <- rnorm(20,0,3)
y <- x + rnorm(20,0,5)
cor_test(x,y, method = 'pearson')
#> [1] 0.003244366
cor_test(x,y, method = 'kendall')
#> [1] 0.0237345
cor_test(x,y, method = 'spearman')
#> [1] 0.01473093
# Adding ties
cor_test(c(x,x), c(y,y), method = 'spearman',
         seed = 1, nresample = 10000, verbose = TRUE)
#> Either "x" or "y" has ties, so using approximate method.
#> [1] 7e-04