kNN.Mahalanobis {sicure} | R Documentation |
K Nearest Neighbors with Mahalanobis Distance
Description
This function computes the k nearest neighbors for a given set of data points, where each observation is a pair of the form (X, T), with X representing a covariate and T the observed time.
The distance between each pair of points is computed using the Mahalanobis distance:
d_M((X_i, T_i), (X_j, T_j)) = \sqrt{ \left( \begin{pmatrix} X_i \\ T_i \end{pmatrix} - \begin{pmatrix} X_j \\ T_j \end{pmatrix} \right)^t \Sigma^{-1} \left( \begin{pmatrix} X_i \\ T_i \end{pmatrix} - \begin{pmatrix} X_j \\ T_j \end{pmatrix} \right) },
where \Sigma is the variance-covariance matrix of the joint distribution of (X, T).
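As a rough illustration of the distance formula, the pairwise distances can be sketched in base R. The helper name pairwise_mahalanobis is hypothetical and not part of the package, and estimating \Sigma with the sample covariance from cov() is an assumption; the package may estimate it differently.

```r
# Hypothetical helper (not part of sicure): n x n matrix of Mahalanobis
# distances between the pairs (x[i], time[i]).
pairwise_mahalanobis <- function(x, time) {
  z <- cbind(x, time)              # one row per observation (X_i, T_i)
  sigma <- cov(z)                  # assumption: Sigma estimated by the sample covariance
  n <- nrow(z)
  d <- matrix(0, nrow = n, ncol = n)
  for (i in seq_len(n)) {
    # stats::mahalanobis() returns squared distances, hence the sqrt()
    d[i, ] <- sqrt(mahalanobis(z, center = z[i, ], cov = sigma))
  }
  d
}

# Small usage example on simulated data
set.seed(1)
x_demo <- rnorm(10)
time_demo <- rexp(10)
D <- pairwise_mahalanobis(x_demo, time_demo)
dim(D)
```

The result is symmetric with zeros on the diagonal, since each pair is at distance zero from itself.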
Usage
kNN.Mahalanobis(x, time, k)
Arguments
x | A numeric vector of length n giving the covariate values. |
time | A numeric vector giving the observed times. |
k | The number of nearest neighbors to search. |
Value
A matrix with n rows and k columns. Row i corresponds to the pair (X_i, T_i) and contains the indices of its k nearest neighbors under the Mahalanobis distance.
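To make the shape of the returned matrix concrete, here is a minimal sketch of how neighbor indices can be read off a distance matrix. The helper knn_from_dist is hypothetical, and excluding each point from its own neighbor list is an assumption; whether kNN.Mahalanobis does the same is not stated here. The toy example uses plain Euclidean distances on a line so the result is easy to verify by hand.

```r
# Hypothetical helper (not part of sicure): for each row of a distance
# matrix d, return the indices of the k smallest distances.
knn_from_dist <- function(d, k) {
  n <- nrow(d)
  idx <- matrix(NA_integer_, nrow = n, ncol = k)
  for (i in seq_len(n)) {
    d_i <- d[i, ]
    d_i[i] <- Inf                      # assumption: a point is not its own neighbor
    idx[i, ] <- order(d_i)[seq_len(k)] # indices sorted by increasing distance
  }
  idx
}

# Toy example: four points on a line at 0, 1, 3 and 7
d_toy <- as.matrix(dist(c(0, 1, 3, 7)))
nn_toy <- knn_from_dist(d_toy, k = 2)
nn_toy
# Row 1 is (2, 3): the point at 0 is closest to the points at 1 and 3.
```

Each row lists neighbors from nearest to farthest, so the first column holds the single nearest neighbor of every observation.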
References
Mahalanobis, P. C. (1936). On the generalised distance in statistics. Proceedings of the National Institute of Sciences of India, 2, 49-55.
Examples
# Some artificial data
set.seed(123)
n <- 50
x <- runif(n, -2, 2) # Covariate values
y <- rweibull(n, shape = 0.5 * (x + 4)) # True lifetimes
c <- rexp(n) # Censoring values
p <- exp(2*x)/(1 + exp(2*x)) # Probability of being susceptible
u <- runif(n)
t <- ifelse(u < p, pmin(y, c), c) # Observed times
d <- ifelse(u < p, ifelse(y < c, 1, 0), 0) # Uncensoring indicator
kNN.Mahalanobis(x=x, time=t, k=5)