irdc {FORD} | R Documentation |
Estimate the Integrated R-squared Dependence Coefficient (irdc)
Description
The Integrated R-squared Dependence Coefficient (irdc) is a measure of dependence between
a random variable Y and a random vector X. This function estimates it from an i.i.d. sample of (Y, X).
The estimated coefficient is asymptotically guaranteed to lie between 0 and 1.
The measure is asymmetric; that is, in general irdc(X, Y) != irdc(Y, X).
The measure equals 0 if and only if X is independent of Y, and it equals 1 if and only if
Y is a measurable function of X.
This coefficient has several applications; for example, it can be used for variable selection, as demonstrated in the ford function.
Usage
irdc(Y, X, dist.type.X = "continuous", na.rm = TRUE)
Arguments
Y | A vector of length n. |
X | A vector of length n, or a matrix with n rows. |
dist.type.X | A string specifying the distribution type of X: either "continuous" or "discrete". Default is "continuous". |
na.rm | Logical; if TRUE, missing values (NAs) are removed. Default is TRUE. |
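As a rough illustration of what removing missing values can mean for paired data, the base R sketch below drops every observation in which Y or any column of X is NA, so that the remaining rows stay aligned. This is an assumption about how na.rm = TRUE behaves, not the package's actual code, and the helper name drop_incomplete is hypothetical.

```r
# Hypothetical helper (not part of FORD): keep only observations where
# Y and every column of X are observed, so that rows stay paired.
drop_incomplete <- function(Y, X) {
  X <- as.matrix(X)
  keep <- stats::complete.cases(Y, X)
  list(Y = Y[keep], X = X[keep, , drop = FALSE])
}

Y <- c(1.2, NA, 0.7, 3.1, 2.4)
X <- cbind(c(0.5, 0.1, NA, 0.9, 0.3),
           c(1.0, 2.0, 3.0, 4.0, 5.0))

cleaned <- drop_incomplete(Y, X)
length(cleaned$Y)  # 3 complete observations remain
nrow(cleaned$X)    # 3, so Y and X are still aligned
```

Filtering Y and X with the same logical index is what keeps each response paired with its own predictors; removing NAs from each object separately would break that pairing.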
Details
The value returned by 'irdc' can be positive or negative for finite samples,
but asymptotically, it is guaranteed to be between 0 and 1.
A small value indicates low dependence between Y and X, while a high value indicates strong dependence.
The 'irdc' function is used by the ford function for variable selection.
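To make the asymmetry and the finite-sample behaviour concrete, the sketch below simulates data in which y is a deterministic function of x, while x cannot be recovered from y because the sign is lost. Given the properties stated above, one would expect irdc(y, x) to be close to 1 and irdc(x, y) to be noticeably smaller, while the estimate for an independent pair should be near 0 and may come out slightly negative. The data-generating choices are illustrative assumptions, and the exact values depend on the sample.

```r
set.seed(1)                      # reproducible simulated data
n <- 1000
x <- runif(n, min = -1, max = 1)
y <- x^2                         # y is a function of x; the sign of x is lost
z <- runif(n)                    # generated independently of x and y

# Run the estimates only if the FORD package is installed
if (requireNamespace("FORD", quietly = TRUE)) {
  print(FORD::irdc(y, x))        # dependence of y on x
  print(FORD::irdc(x, y))        # dependence of x on y
  print(FORD::irdc(z, x))        # independent pair
}
```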
Value
The estimated Integrated R-squared Dependence Coefficient (irdc) between Y and X.
Author(s)
Mona Azadkia, Pouya Roudaki
References
Azadkia, M. and Roudaki, P. (2025). A New Measure of Dependence: Integrated R2. http://arxiv.org/abs/2505.18146
See Also
ford, irdc_simple, codec, xicor, KPCgraph, KPCRKHS
Examples
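library(FORD)
n <- 1000
x <- matrix(runif(n * 3), nrow = n)
y <- x[, 1] + x[, 2]  # y depends on the first two columns only
irdc(y, x[, 1])       # y depends on x[, 1]
irdc(y, x[, 2])       # y depends on x[, 2]
irdc(y, x[, 3])       # x[, 3] is independent of y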
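Since X may also be a matrix, a natural extension of the example above is to pass several columns at once. The sketch below reuses the same simulated setup; because y is a deterministic function of the first two columns jointly, the estimate for x[, 1:2] would be expected to be close to 1, though the exact value depends on the sample. The last call illustrates the dist.type.X argument on a hypothetical integer-valued predictor (x_disc and y_disc are made-up names for this illustration).

```r
set.seed(1)                          # reproducible simulated data
n <- 1000
x <- matrix(runif(n * 3), nrow = n)
y <- x[, 1] + x[, 2]

# Hypothetical discrete predictor with a noisy response
x_disc <- sample(1:5, n, replace = TRUE)
y_disc <- x_disc + rnorm(n)

# Run the estimates only if the FORD package is installed
if (requireNamespace("FORD", quietly = TRUE)) {
  print(FORD::irdc(y, x[, 1:2]))     # both relevant columns together
  print(FORD::irdc(y, x))            # all three columns
  print(FORD::irdc(y_disc, x_disc, dist.type.X = "discrete"))
}
```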