predict.mvout {mvout}    R Documentation

Predict method for Robust Multivariate Outlier Detection

Description

predict method for class "mvout".

Usage

## S3 method for class 'mvout'
predict(object, 
        x, 
        type = c("distance", "outlier", "scores"), 
        thresh = 0.01,
        ...)

Arguments

object

Object of class "mvout"

x

Optional matrix of new data used for the predictions. If omitted, the original data are used if keepx = TRUE (an error is produced otherwise).

type

Type of prediction to return: "distance" returns the predicted Mahalanobis distance, "outlier" returns the predicted outlier status (TRUE/FALSE) using the specified thresh, and "scores" returns the predicted principal component or factor scores (if applicable).

thresh

Scalar specifying the threshold for flagging outliers (0 < thresh < 1). See mvout for details.

...

Additional arguments (ignored)

Details

Produces predictions from the input new data x using the robust parameter estimates (of location and scatter) from the input "mvout" object.
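
As a rough illustration of what a "distance" prediction involves, the sketch below scores new observations against a fixed location vector and scatter matrix using base R. It is not the package's internal code: the classical mean and covariance stand in for the robust estimates stored in an "mvout" object, and the names train, newx, center, and scatter are hypothetical. Note that stats::mahalanobis() returns squared distances; whether predict.mvout reports squared or unsquared values is not stated here.

```r
# Stand-in estimates: classical mean and covariance.
# ASSUMPTION: these replace the robust (MCD) location/scatter estimates
# that an "mvout" object would supply.
set.seed(0)
train <- matrix(rnorm(200 * 2), 200, 2)
center <- colMeans(train)
scatter <- cov(train)

# New observations to score (hypothetical values)
newx <- rbind(c(0, 0), c(4, -4))

# stats::mahalanobis() returns SQUARED Mahalanobis distances
d2 <- mahalanobis(newx, center = center, cov = scatter)

# Take the square root to get distances on the original scale
d <- sqrt(d2)
d
```

The point (4, -4) lies far from the bulk of the training data, so its distance is much larger than that of the point at the origin.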

Value

Returns a vector of numerics ("distance" or "scores") or logicals ("outlier").

Note

If you input the same x that was used to estimate the location and scatter parameters, you will obtain:

(1) the same "distance" and "scores" as output by the mvout function

(2) a potentially different "outlier" result than what is output by the mvout function

The discrepancy in (2) arises because, when the x argument is provided, all observations are treated as if they had been excluded from the location/scatter estimation. As a result, a different critical value is used for the observations that were included in the MCD estimate. For borderline cases, this slight change in the critical value could result in a change of outlier status.
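
The effect of a small shift in the critical value can be sketched with made-up numbers. In the code below, the squared distances in d2 and the second cutoff cut_b are hypothetical, and the chi-square quantile used for cut_a is only a common textbook choice; neither cutoff is claimed to be the one that mvout or predict.mvout actually uses.

```r
thresh <- 0.01
p <- 2

# Hypothetical squared distances; two of them sit near the cutoff
d2 <- c(3.1, 9.3, 9.4, 15.0)

# ASSUMPTION: a chi-square quantile as one possible critical value
cut_a <- qchisq(1 - thresh, df = p)

# ASSUMPTION: a slightly larger, purely illustrative critical value
cut_b <- 9.35

flag_a <- d2 > cut_a
flag_b <- d2 > cut_b

# Observations whose outlier status depends on which cutoff is used
which(flag_a != flag_b)
```

Only the observation whose squared distance falls between the two cutoffs changes status; clear inliers and clear outliers are unaffected.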

Author(s)

Jesus E. Delgado <delga220@umn.edu>, Nathaniel E. Helwig <helwig@umn.edu>

See Also

mvout for estimation of (robust) location/scatter.

Examples

# generate some data
n <- 200
p <- 2
set.seed(0)
x <- matrix(rnorm(n * p), n, p)

# thresh = 0.01
set.seed(1)    # for reproducible MCD estimate
out1 <- mvout(x)

# predicted distance (same as before)
fit1 <- predict(out1, x = x)
max(abs(fit1 - out1$distance))

# predicted outlier (may differ from before)
fit1 <- predict(out1, x = x, type = "outlier")
# proportion of observations whose outlier status is unchanged
mean(fit1 == out1$outlier)


[Package mvout version 1.2 Index]