It estimates the Distance Correlation coefficient (dcorr) for a continuous predicted-observed dataset.

## Usage

dcorr(data = NULL, obs, pred, tidy = FALSE, na.rm = TRUE)

## Arguments

data

(Optional) argument to call an existing data frame containing the data.

obs

Vector with observed values (numeric).

pred

Vector with predicted values (numeric).

tidy

Logical (TRUE/FALSE) that selects the return type. TRUE returns a data.frame; FALSE (default) returns a list.

na.rm

Logical argument to remove rows with missing values (NA). Default is na.rm = TRUE.

## Value

An object of class numeric, returned within a list (if tidy = FALSE) or within a data frame (if tidy = TRUE).

## Details

The dcorr function is a wrapper for the dcor function from the energy-package. See Rizzo & Szekely (2022). The distance correlation (dcorr) coefficient is a novel measure of dependence between random vectors introduced by Szekely et al. (2007).

The dcorr is symmetric: swapping the predicted and observed values gives the same result, which is relevant for the predicted-observed case (PO).

For all distributions with finite first moments, distance correlation $$\mathcal R$$ generalizes the idea of correlation in two fundamental ways:

(1) $$\mathcal R(P,O)$$ is defined for $$P$$ and $$O$$ in arbitrary dimension.

(2) $$\mathcal R(P,O)=0$$ characterizes independence of $$P$$ and $$O$$.

Distance correlation satisfies $$0 \le \mathcal R \le 1$$, and $$\mathcal R = 0$$ only if $$P$$ and $$O$$ are independent. Distance covariance $$\mathcal V$$ provides a new approach to the problem of testing the joint independence of random vectors. The formal definitions of the population coefficients $$\mathcal V$$ and $$\mathcal R$$ are given in Szekely et al. (2007).
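As a rough illustration of property (2), the sketch below compares Pearson's correlation with the distance correlation on data that are fully dependent but not linearly related. The variable names and the simulated data are assumptions made for this illustration; the call to energy::dcor is guarded so the snippet still runs when the energy package is not installed.

```r
# Illustrative data (not from the package): O is a deterministic,
# non-linear function of P, so the two are clearly dependent.
P <- seq(-1, 1, length.out = 101)
O <- P^2

# Pearson's correlation is essentially zero because the relation is
# symmetric around P = 0 and has no linear component.
pearson <- cor(P, O)

# Distance correlation from the energy package, if available.
dist_corr <- NA_real_
if (requireNamespace("energy", quietly = TRUE)) {
  dist_corr <- energy::dcor(P, O)
}

print(c(pearson = pearson, dcor = dist_corr))
```

Under these assumptions, Pearson's coefficient is zero up to floating-point error, while the distance correlation is clearly positive, reflecting the dependence that the linear coefficient misses.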

The empirical distance correlation $$\mathcal{R}_n(\mathbf{P,O})$$ is the square root of $$\mathcal{R}^2_n(\mathbf{P,O})= \frac {\mathcal{V}^2_n(\mathbf{P,O})} {\sqrt{ \mathcal{V}^2_n (\mathbf{P}) \mathcal{V}^2_n(\mathbf{O})}}.$$
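The ratio above can be sketched in base R for two numeric vectors. This is a minimal illustration, not the implementation used by dcorr or by the energy package: dcor_sketch is a hypothetical helper, and it assumes the usual V-statistic form in which $$\mathcal{V}^2_n$$ is the mean of the element-wise product of double-centered pairwise distance matrices.

```r
# Hypothetical helper (not part of any package): empirical distance
# correlation for two numeric vectors of equal length, base R only.
dcor_sketch <- function(obs, pred) {
  stopifnot(is.numeric(obs), is.numeric(pred),
            length(obs) == length(pred), length(obs) > 1)

  # Double-center the pairwise distance matrix:
  # A[k, l] = a[k, l] - rowmean[k] - colmean[l] + grandmean
  double_center <- function(x) {
    d <- as.matrix(dist(x))
    d - outer(rowMeans(d), colMeans(d), "+") + mean(d)
  }

  A <- double_center(pred)
  B <- double_center(obs)

  # Squared sample distance covariance and distance variances
  v2_po <- max(mean(A * B), 0)  # guard against tiny negative rounding error
  v2_p  <- mean(A * A)
  v2_o  <- mean(B * B)

  denom <- sqrt(v2_p * v2_o)
  if (denom == 0) return(0)     # convention when a distance variance is zero
  sqrt(v2_po / denom)
}

# Illustrative values: an exact linear relation gives a coefficient of 1
# (up to floating-point error), and the result is symmetric in its arguments.
x <- c(1, 2, 4, 7, 11)
all.equal(dcor_sketch(x, 2 * x + 1), 1)
all.equal(dcor_sketch(x, x^2), dcor_sketch(x^2, x))
```

For real analyses the energy implementation is the safer choice; the sketch only makes the structure of the formula explicit.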

For the formula and more details, see the online documentation and the energy package.

## References

Szekely, G.J., Rizzo, M.L., and Bakirov, N.K. (2007). Measuring and testing dependence by correlation of distances. Annals of Statistics, Vol. 35(6): 2769-2794. doi:10.1214/009053607000000505.

Rizzo, M., and Szekely, G. (2022). energy: E-Statistics: Multivariate Inference via the Energy of Data. R package version 1.7-10. https://CRAN.R-project.org/package=energy.

## See also

eval_tidy, defusing-advanced, dcor, energy
# \donttest{