irdc {FORD}    R Documentation

Estimate the Integrated R-squared Dependence Coefficient (irdc)

Description

The Integrated R-squared Dependence Coefficient (irdc) measures the dependence of a random variable Y on a random vector X and is estimated from an i.i.d. sample of (Y, X). The measure equals 0 if and only if Y is independent of X, and it equals 1 if and only if Y is a measurable function of X. The estimated coefficient is guaranteed to lie between 0 and 1 only asymptotically. The measure is asymmetric; that is, in general irdc(Y, X) != irdc(X, Y). One application is variable selection, as implemented in the ford function.
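
The asymmetry can be illustrated with a small simulation. The sketch below is not taken from the package; it assumes FORD is installed and uses a hypothetical setup in which y is a function of x but x cannot be recovered from y because the sign is lost. If the estimator behaves as described above, the first estimate should be close to 1 for large n and the second noticeably smaller.

```r
set.seed(1)                       # reproducible simulation
n <- 1000
x <- runif(n, min = -1, max = 1)  # symmetric around 0
y <- x^2                          # y is a function of x; x is not a function of y

# Guarded so the snippet still runs when FORD is not installed
if (requireNamespace("FORD", quietly = TRUE)) {
  forward  <- FORD::irdc(y, x)    # dependence of y on x
  backward <- FORD::irdc(x, y)    # dependence of x on y
  print(c(forward = forward, backward = backward))
}
```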

Usage

irdc(Y, X, dist.type.X = "continuous", na.rm = TRUE)

Arguments

Y

A vector of length n.

X

A vector of length n, or a matrix with n rows.

dist.type.X

A string specifying the distribution type of X: either "continuous" or "discrete". Default is "continuous".

na.rm

Logical; if TRUE, missing values (NAs) will be removed. Default is TRUE.
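
As a rough illustration of the non-default arguments, the sketch below passes a discrete predictor and a response containing missing values. The data are invented for this example, and two points are assumptions rather than documented behavior: that an integer-valued vector is accepted when dist.type.X = "discrete", and that na.rm = TRUE drops every observation with a missing value in either Y or X.

```r
set.seed(2)                                  # reproducible simulation
n <- 500
x_disc <- sample(1:5, n, replace = TRUE)     # predictor taking five distinct values
y <- x_disc + rnorm(n, sd = 0.1)             # response driven mostly by x_disc
y[c(3, 10)] <- NA                            # inject two missing responses

# Guarded so the snippet still runs when FORD is not installed
if (requireNamespace("FORD", quietly = TRUE)) {
  # Assumption: incomplete observations are removed before estimation
  est <- FORD::irdc(y, x_disc, dist.type.X = "discrete", na.rm = TRUE)
  print(est)
}
```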

Details

In finite samples the value returned by irdc can be negative, but asymptotically it is guaranteed to lie between 0 and 1. Values near 0 indicate weak dependence of Y on X, while values near 1 indicate strong dependence. The ford function uses irdc for variable selection.
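
The finite-sample behavior can be explored by repeating the estimate on small independent samples. This is a sketch under stated assumptions: FORD is installed, irdc returns a single numeric value, and the sample size and number of repetitions (n and reps) are arbitrary choices made for illustration. With independent x and y the estimates would be expected to scatter around 0, so some may be negative.

```r
set.seed(3)                       # reproducible simulation
n <- 50                           # deliberately small sample size
reps <- 20                        # number of repeated samples

# Independent x and y in every repetition
samples <- replicate(reps, list(x = rnorm(n), y = rnorm(n)), simplify = FALSE)

# Guarded so the snippet still runs when FORD is not installed
if (requireNamespace("FORD", quietly = TRUE)) {
  est <- vapply(samples, function(s) FORD::irdc(s$y, s$x), numeric(1))
  print(summary(est))             # spread of the estimates across repetitions
  print(sum(est < 0))             # how many estimates are negative
}
```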

Value

The Integrated R-squared Dependence Coefficient (irdc) between Y and X.

Author(s)

Mona Azadkia, Pouya Roudaki

References

Azadkia, M. and Roudaki, P. (2025). A New Measure of Dependence: Integrated R^2. http://arxiv.org/abs/2505.18146

See Also

ford, irdc_simple, codec, xicor, KPCgraph, KPCRKHS

Examples

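library(FORD)

n <- 1000
x <- matrix(runif(n * 3), nrow = n)
# y depends on the first two columns of x only
y <- x[, 1] + x[, 2]
irdc(y, x[, 1])  # x1 partly determines y
irdc(y, x[, 2])  # x2 partly determines y
irdc(y, x[, 3])  # x3 is independent of y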

[Package FORD version 0.1.2 Index]