Title: | Robust Rank Correlation Coefficient and Test |
---|---|
Description: | Provides the robust gamma rank correlation coefficient as introduced by Bodenhofer, Krone, and Klawonn (2013) <DOI:10.1016/j.ins.2012.11.026> along with a permutation-based rank correlation test. The rank correlation coefficient and the test are explicitly designed for dealing with noisy numerical data. |
Authors: | Martin Krone [aut], Ulrich Bodenhofer [aut,cre] |
Maintainer: | Ulrich Bodenhofer |
License: | GPL (>=2) |
Version: | 1.1.9 |
Built: | 2024-11-22 04:06:11 UTC |
Source: | https://github.com/ubod/rococo |
Compute the Gaussian rank correlation estimate
    gauss.cor(x, y)
x | a numeric vector; compulsory argument |
y | a numeric vector; compulsory argument; |
gauss.cor computes the Gaussian rank correlation estimate for x and y.
Note that gauss.cor only works for x and y being numeric vectors, unlike the classical correlation measures implemented in cor, which can also be computed for matrices or data frames.
Upon successful completion, the function returns the Gaussian rank correlation estimate.
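As an illustration of what the estimate measures, the following base R sketch computes a Pearson correlation on normal scores, i.e. ranks rescaled to the open unit interval and mapped through the standard normal quantile function. The helper name gauss_cor_sketch and the rank/(n + 1) scaling are assumptions made for this example; this is not the package's own code.

```r
# Hypothetical helper (not part of rococo): Pearson correlation of normal scores.
gauss_cor_sketch <- function(x, y) {
  stopifnot(is.numeric(x), is.numeric(y), length(x) == length(y))
  n <- length(x)
  # Assumed scaling: ranks divided by (n + 1) so that qnorm() stays finite
  sx <- qnorm(rank(x) / (n + 1))
  sy <- qnorm(rank(y) / (n + 1))
  cor(sx, sy)
}

set.seed(1)
x <- rnorm(25)
y <- x + rnorm(25, sd = 0.1)
est <- gauss_cor_sketch(x, y)

# A strictly increasing transformation leaves the ranks, and hence the scores, unchanged
est_monotone <- gauss_cor_sketch(1:10, (1:10)^3)
```

Because only ranks enter the computation, the value is invariant under strictly increasing transformations of either input.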
Ulrich Bodenhofer
https://github.com/UBod/rococo
K. Boudt, J. Cornelissen, and C. Croux (2012). The Gaussian rank correlation estimator: robustness properties. Stat. Comput. 22(2):471-483. DOI: 10.1007/s11222-011-9237-0.
    ## create data
    f <- function(x) ifelse(x > 0.9, x - 0.9, ifelse(x < -0.9, x + 0.9, 0))
    x <- rnorm(25)
    y <- f(x) + rnorm(25, sd=0.1)
    ## compute correlation
    gauss.cor(x, y)
Methods performing a Gaussian rank correlation test
    ## S4 method for signature 'numeric,numeric'
    gauss.cor.test(x, y, ...)
    ## S4 method for signature 'formula,data.frame'
    gauss.cor.test(x, y, na.action, ...)
x | a numeric vector or a formula; compulsory argument |
y | compulsory argument; if |
na.action | a function which indicates what should happen when the data contain |
... | all parameters specified are forwarded internally to the method |
If called for numeric vectors, gauss.cor.test performs the Gaussian rank correlation test for x and y. This is done by simply performing a Pearson correlation test on the normal scores of the data.
If gauss.cor.test is called for a formula x and a data frame y, then the method checks whether the formula x correctly extracts two columns from y (see examples below). If so, the two columns are extracted and the Gaussian rank correlation test is applied to them according to the specified parameters.
Upon successful completion, the function returns a list of class htest containing the results (see cor.test).
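The description above (a Pearson correlation test applied to the normal scores) can be sketched in base R as follows. The helper name gauss_cor_test_sketch and the rank/(n + 1) scaling of the scores are assumptions made for illustration; the package method may differ in details.

```r
# Hypothetical helper (not part of rococo): Pearson test on normal scores.
gauss_cor_test_sketch <- function(x, y, ...) {
  stopifnot(is.numeric(x), is.numeric(y), length(x) == length(y))
  n <- length(x)
  sx <- qnorm(rank(x) / (n + 1))  # assumed normal-score scaling
  sy <- qnorm(rank(y) / (n + 1))
  # Further arguments such as 'alternative' are passed on to cor.test()
  cor.test(sx, sy, method = "pearson", ...)
}

set.seed(1)
x <- rnorm(25)
y <- x + rnorm(25, sd = 0.1)
res <- gauss_cor_test_sketch(x, y, alternative = "greater")
res$p.value
```

The result is an ordinary htest object, so components such as res$estimate and res$p.value are accessed as for cor.test.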
Ulrich Bodenhofer
https://github.com/UBod/rococo
K. Boudt, J. Cornelissen, and C. Croux (2012). The Gaussian rank correlation estimator: robustness properties. Stat. Comput. 22(2):471-483. DOI: 10.1007/s11222-011-9237-0.
    ## create data
    f <- function(x) ifelse(x > 0.9, x - 0.9, ifelse(x < -0.9, x + 0.9, 0))
    x <- rnorm(25)
    y <- f(x) + rnorm(25, sd=0.1)
    ## perform correlation tests
    gauss.cor.test(x, y, alternative="greater")
    ## the formula variant
    require(datasets)
    data(iris)
    gauss.cor.test(~ Petal.Width + Petal.Length, iris, alternative="two.sided")
Compute the robust gamma rank correlation coefficient
    rococo(x, y, similarity=c("linear", "exp", "gauss", "epstol", "classical"),
           tnorm="min", r=0, noVarReturnZero=TRUE)
x | a numeric vector; compulsory argument |
y | a numeric vector; compulsory argument; |
similarity | a character string or a character vector identifying which type of similarity measure to use; valid values are "linear", "exp", "gauss", "epstol", and "classical" |
tnorm | can be any of the following strings identifying a standard t-norm: "min", "prod", or "lukasiewicz" |
r | numeric vector defining the tolerances to be used; if a single value is supplied, the same value is used both for x and y |
noVarReturnZero | if |
rococo computes the robust gamma rank correlation coefficient of x and y according to the specified parameters (see literature for more details).
Note that rococo only works for x and y being numeric vectors, unlike the classical correlation measures implemented in cor, which can also be computed for matrices or data frames.
Upon successful completion, the function returns the robust gamma rank correlation coefficient.
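To give a rough idea of how tolerances and a t-norm enter such a coefficient, here is a simplified base R sketch following one reading of the fuzzy-ordering construction in the cited papers. It is not the package's implementation: the helper name robust_gamma_sketch, the explicit positive tolerances rx and ry, the restriction to a linear similarity with the minimum t-norm, and returning 0 when no pair is ordered are all assumptions made for this example.

```r
# Hypothetical helper (not part of rococo): simplified robust gamma with
# linear similarity and minimum t-norm. Tolerances rx, ry must be positive here.
robust_gamma_sketch <- function(x, y, rx, ry) {
  stopifnot(length(x) == length(y), rx > 0, ry > 0)
  # Degree to which a is "clearly smaller" than b: 0 if b <= a, rising
  # linearly to 1 once b - a reaches the tolerance r
  strict_order <- function(v, r) {
    d <- outer(v, v, function(a, b) b - a)   # d[i, j] = v[j] - v[i]
    pmin(pmax(d / r, 0), 1)
  }
  Rx <- strict_order(x, rx)
  Ry <- strict_order(y, ry)
  conc <- sum(pmin(Rx, Ry))      # pairs ordered the same way in x and y
  disc <- sum(pmin(Rx, t(Ry)))   # pairs ordered in opposite ways
  if (conc + disc == 0) return(0)  # assumed convention when nothing is ordered
  (conc - disc) / (conc + disc)
}

set.seed(1)
x <- rnorm(20)
y <- rnorm(20)
g_crisp <- robust_gamma_sketch(x, y, rx = 1e-12, ry = 1e-12)
g_monotone <- robust_gamma_sketch(1:10, (1:10)^2, rx = 2, ry = 5)
```

With vanishingly small tolerances every pair is either fully ordered or not ordered at all, so on tie-free data the sketch reduces to the classical gamma, which then coincides with Kendall's tau. Larger tolerances down-weight pairs whose values differ by less than the tolerance, which is what makes the measure less sensitive to noise.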
Martin Krone and Ulrich Bodenhofer
https://github.com/UBod/rococo
U. Bodenhofer and F. Klawonn (2008). Robust rank correlation coefficients on the basis of fuzzy orderings: initial steps. Mathware Soft Comput. 15(1):5-20.
U. Bodenhofer, M. Krone, and F. Klawonn (2013). Testing noisy numerical data for monotonic association. Inform. Sci. 245:21-37. DOI: 10.1016/j.ins.2012.11.026.
    ## create data
    f <- function(x) ifelse(x > 0.9, x - 0.9, ifelse(x < -0.9, x + 0.9, 0))
    x <- rnorm(25)
    y <- f(x) + rnorm(25, sd=0.1)
    ## compute correlation
    rococo(x, y, similarity="classical")
    rococo(x, y, similarity="linear")
    rococo(x, y, similarity=c("classical", "gauss"), r=c(0, 0.1))
Methods performing a robust gamma rank correlation test
    ## S4 method for signature 'numeric,numeric'
    rococo.test(x, y, similarity=c("linear", "exp", "gauss", "epstol", "classical"),
                tnorm="min", r=0, numtests=1000, storeValues=FALSE, exact=FALSE,
                alternative=c("two.sided", "less", "greater"), noVarReturnZero=TRUE)
    ## S4 method for signature 'formula,data.frame'
    rococo.test(x, y, na.action, ...)
x | a numeric vector or a formula; compulsory argument |
y | compulsory argument; if |
similarity | a character string or a character vector identifying which type of similarity measure to use; see rococo |
tnorm | t-norm used for aggregating results; see rococo |
r | numeric vector defining the tolerances to be used; see rococo |
numtests | number of random shuffles to perform; see details below. |
storeValues | logical indicating whether the vector of test statistics should be stored in the output object (in slot perm.gamma) |
exact | logical indicating whether exact p-value should be computed; see details below. |
alternative | indicates the alternative hypothesis and must be one of "two.sided", "less", or "greater" |
noVarReturnZero | if |
na.action | a function which indicates what should happen when the data contain |
... | all parameters specified are forwarded internally to the method |
If called for numeric vectors, rococo.test computes the robust gamma rank correlation coefficient of x and y according to the specified parameters (see rococo) and then performs a permutation test to compute a p-value.
If exact=TRUE, rococo.test attempts to compute an exact p-value and ignores the numtests argument. This is done by considering all possible permutations and computing the ratio of permutations for which the test statistic is at least as large/small as the test statistic for unshuffled data. This works only for 10 or fewer samples. Otherwise, exact=TRUE is ignored, a warning is issued, and random shuffles are used to estimate the p-value (as described next).
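The exact procedure can be sketched in base R for a very small sample. In this illustration Kendall's tau stands in for the robust gamma coefficient, only the "greater" alternative is shown, and the helper names all_perms and exact_perm_pvalue_sketch are hypothetical; none of this is the package's own code.

```r
# Hypothetical helper: enumerate all permutations of a vector (small inputs only).
all_perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in all_perms(v[-i])) {
      out[[length(out) + 1]] <- c(v[i], p)
    }
  }
  out
}

# Hypothetical helper: exact one-sided ("greater") permutation p-value.
# 'stat' defaults to Kendall's tau as a stand-in for the robust gamma coefficient.
exact_perm_pvalue_sketch <- function(x, y,
                                     stat = function(a, b) cor(a, b, method = "kendall")) {
  stopifnot(length(x) == length(y), length(x) <= 10)
  obs <- stat(x, y)
  perm_stats <- vapply(all_perms(seq_along(y)),
                       function(idx) stat(x, y[idx]), numeric(1))
  # Share of permutations whose statistic is at least as large as the observed one
  mean(perm_stats >= obs - 1e-12)
}

p_exact <- exact_perm_pvalue_sketch(1:5, c(2, 4, 5, 7, 11))
```

For five perfectly concordant, tie-free observations only the identity permutation reaches the observed statistic, so the p-value is 1/120. The number of permutations grows factorially with the sample size, which is consistent with the limit of 10 samples stated above.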
If exact=FALSE, numtests random shuffles of y are performed and the empirical standard deviation of the robust gamma correlation values for these shuffled data sets is computed. Under the assumption that these values are normally distributed around mean zero, the p-value is then computed from this distribution in the usual way.
Note that choosing too small a number of shuffles (parameter numtests) leads to unreliable p-values.
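The shuffle-based procedure can be sketched in base R as follows. Again Kendall's tau stands in for the robust gamma coefficient, only the "greater" alternative is shown, and the helper name perm_test_normal_sketch and its return structure are assumptions made for this example rather than the package's code.

```r
# Hypothetical helper: permutation test with a normal approximation of the
# null distribution (mean fixed at zero, standard deviation estimated from shuffles).
perm_test_normal_sketch <- function(x, y, numtests = 1000,
                                    stat = function(a, b) cor(a, b, method = "kendall")) {
  stopifnot(length(x) == length(y))
  obs <- stat(x, y)
  # Shuffling y breaks any association with x while keeping both marginals
  perm_stats <- replicate(numtests, stat(x, sample(y)))
  sd0 <- sd(perm_stats)
  list(statistic = obs,
       sd = sd0,
       p.value = pnorm(obs, mean = 0, sd = sd0, lower.tail = FALSE))
}

set.seed(42)
x <- rnorm(25)
y <- x + rnorm(25, sd = 0.1)
res <- perm_test_normal_sketch(x, y, numtests = 500)
res$p.value
```

Since the p-value depends on the estimated standard deviation, repeating the call with a very small numtests gives visibly varying results, which illustrates the note above.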
If rococo.test is called for a formula x and a data frame y, then the method checks whether the formula x correctly extracts two columns from y (see examples below). If so, the two columns are extracted and the robust gamma rank correlation test is applied to them according to the specified parameters.
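How a one-sided formula can select two columns from a data frame may be sketched with base R's model.frame. The helper name extract_pair_sketch, its checks, and the default na.action = na.omit are assumptions made for illustration; the package's own validation may differ.

```r
# Hypothetical helper: extract exactly two numeric columns via a formula.
extract_pair_sketch <- function(formula, data, na.action = na.omit) {
  mf <- model.frame(formula, data = data, na.action = na.action)
  if (ncol(mf) != 2L)
    stop("formula must select exactly two columns")
  if (!all(vapply(mf, is.numeric, logical(1))))
    stop("both selected columns must be numeric")
  list(x = mf[[1]], y = mf[[2]])
}

pair <- extract_pair_sketch(~ Petal.Width + Petal.Length, iris)
str(pair)

# A formula with three terms is rejected
bad <- try(extract_pair_sketch(~ Petal.Width + Petal.Length + Sepal.Width, iris),
           silent = TRUE)
```

The two extracted vectors can then be passed to the numeric-vector form of the test.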
Note that exact=TRUE may result in long computation times for user-defined t-norms.
Upon successful completion, the function returns an object of class RococoTestResults containing the results.
Martin Krone and Ulrich Bodenhofer
https://github.com/UBod/rococo
U. Bodenhofer, M. Krone, and F. Klawonn (2013). Testing noisy numerical data for monotonic association. Inform. Sci. 245:21-37. DOI: 10.1016/j.ins.2012.11.026.
U. Bodenhofer and F. Klawonn (2008). Robust rank correlation coefficients on the basis of fuzzy orderings: initial steps. Mathware Soft Comput. 15(1):5-20.
    ## create data
    f <- function(x) ifelse(x > 0.9, x - 0.9, ifelse(x < -0.9, x + 0.9, 0))
    x <- rnorm(25)
    y <- f(x) + rnorm(25, sd=0.1)
    ## perform correlation tests
    rococo.test(x, y, similarity="classical", alternative="greater")
    rococo.test(x, y, similarity="linear", alternative="greater")
    rococo.test(x, y, similarity=c("classical", "gauss"), r=c(0, 0.1),
                alternative="greater", numtests=10000)
    ## the formula variant
    require(datasets)
    data(iris)
    rococo.test(~ Petal.Width + Petal.Length, iris, similarity="linear",
                alternative="two.sided")
S4 class for storing results of the robust rank correlation test
Objects of this class can be created by calling rococo.test.
The following slots are defined for RococoTestResults objects:
count: number of times the test statistic for a random shuffle exceeded the test statistic of the true data; see rococo.test.
tnorm: list identifying t-norm to use or two-argument function; see rococo. If one of the standard choices "min", "prod", or "lukasiewicz" has been used, the list has one component, name, that contains the string identifying the t-norm. If a user-defined function has been used, the list has two components: name contains "user-defined t-norm" or the name attribute of the function object if available, and def contains the function object itself.
input: character string describing the input for which rococo.test has been called.
length: number of samples for which rococo.test has been called.
p.value: p-value of test.
p.value.approx: p-value as based on a normal approximation of the null distribution.
r.values: vector containing tolerance levels for the two inputs; see rococo.test or rococo.
numtests: number of (random) shuffles performed by rococo.test.
exact: logical indicating whether p-value has been computed exactly; see rococo.test.
similarity: character (vector) identifying the similarity measure(s) used by rococo.test.
sample.gamma: test statistic (robust gamma rank correlation coefficient) determined by rococo.test.
H0gamma.mu: empirical mean of test statistic for random shuffles.
H0gamma.sd: empirical standard deviation of test statistic for random shuffles.
perm.gamma: in case rococo.test was called with storeValues=TRUE, this slot contains the vector of test statistics for random shuffles.
alternative: alternative hypothesis used by rococo.test.
show, signature(object = "RococoTestResults"): displays the most important information stored in object
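Individual results can also be read directly from the slots listed above using the standard S4 accessor @. The snippet below is a small sketch that only runs the test if the rococo package is installed; the variable names p and g are chosen for this example.

```r
p <- NA_real_
g <- NA_real_
if (requireNamespace("rococo", quietly = TRUE)) {
  set.seed(1)
  x <- rnorm(25)
  y <- x + rnorm(25, sd = 0.1)
  ret <- rococo::rococo.test(x, y, similarity = "classical",
                             alternative = "greater")
  p <- ret@p.value        # slot 'p.value' as documented above
  g <- ret@sample.gamma   # slot 'sample.gamma' as documented above
}
```

Slots such as perm.gamma are only filled when the test was called with storeValues=TRUE.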
Martin Krone and Ulrich Bodenhofer
https://github.com/UBod/rococo
U. Bodenhofer, M. Krone, and F. Klawonn (2013). Testing noisy numerical data for monotonic association. Inform. Sci. 245:21-37. DOI: 10.1016/j.ins.2012.11.026.
U. Bodenhofer and F. Klawonn (2008). Robust rank correlation coefficients on the basis of fuzzy orderings: initial steps. Mathware Soft Comput. 15(1):5-20.
rococo.test, rococo, show-methods
    ## create data
    f <- function(x) ifelse(x > 0.9, x - 0.9, ifelse(x < -0.9, x + 0.9, 0))
    x <- rnorm(25)
    y <- f(x) + rnorm(25, sd=0.1)
    ## perform correlation tests
    ret <- rococo.test(x, y, similarity="classical", alternative="greater")
    show(ret)
    ret <- rococo.test(x, y, similarity="linear", alternative="greater")
    show(ret)
    ret <- rococo.test(x, y, similarity=c("classical", "gauss"), r=c(0, 0.1),
                       alternative="greater", numtests=10000)
    show(ret)