confidence_ellipsoid {ConfidenceEllipse} | R Documentation |
Confidence Ellipsoid Coordinates
Description
Compute the coordinate points of confidence ellipsoids at a specified confidence level.
Usage
confidence_ellipsoid(
  .data,
  x,
  y,
  z,
  .group_by = NULL,
  conf_level = 0.95,
  robust = FALSE,
  distribution = "normal"
)
Arguments
.data |
data frame or tibble. |
x |
column name for the x-axis variable. |
y |
column name for the y-axis variable. |
z |
column name for the z-axis variable. |
.group_by |
column name for the grouping variable (NULL by default). |
conf_level |
confidence level for the ellipsoid (0.95 by default). |
robust |
optional (FALSE by default). Set to TRUE to use robust estimation. |
distribution |
optional ("normal" by default). Either "normal" or "hotelling". |
Details
The function computes the coordinates of a confidence ellipsoid from
the provided data at the specified confidence level. It supports both classical
and robust estimation methods, as well as grouping by a factor variable.
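To make the idea of "coordinates of a confidence ellipsoid" concrete, here is a minimal base R sketch. It is not the package's implementation: `ellipsoid_sketch()` is a hypothetical helper, the output column names `x`, `y`, `z` are assumptions, and the built-in `iris` data stands in for a real dataset. It uses the classical mean and covariance with a chi-square quantile, then maps points on a unit sphere onto the ellipsoid surface.

```r
# Hypothetical helper, NOT part of ConfidenceEllipse: a sketch of how
# surface points of a 3D confidence ellipsoid can be generated.
ellipsoid_sketch <- function(X, conf_level = 0.95, n_theta = 20, n_phi = 20) {
  X <- as.matrix(X)
  stopifnot(ncol(X) == 3)

  center <- colMeans(X)                       # classical location estimate
  S <- stats::cov(X)                          # classical covariance estimate
  q <- stats::qchisq(conf_level, df = 3)      # chi-square quantile (squared radius)
  eig <- eigen(S, symmetric = TRUE)           # principal axes of the ellipsoid

  # Points on the unit sphere in spherical coordinates
  grid <- expand.grid(
    theta = seq(0, 2 * pi, length.out = n_theta),
    phi = seq(0, pi, length.out = n_phi)
  )
  sphere <- cbind(
    sin(grid$phi) * cos(grid$theta),
    sin(grid$phi) * sin(grid$theta),
    cos(grid$phi)
  )

  # Stretch along the principal axes, rotate, then shift to the center
  pts <- sphere %*% diag(sqrt(q * eig$values)) %*% t(eig$vectors)
  pts <- sweep(pts, 2, center, "+")

  out <- as.data.frame(pts)
  names(out) <- c("x", "y", "z")              # assumed column names
  out
}

coords <- ellipsoid_sketch(iris[, 1:3])
head(coords)
```

Every generated point lies at the same squared Mahalanobis distance from the center, equal to the chosen quantile, which is what defines the ellipsoid surface.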
The distribution parameter controls the statistical approach used for the
ellipsoid calculation. The "normal" option uses the chi-square distribution
quantile, which is appropriate when working with very large samples.
The "hotelling" option instead uses the Hotelling's T² distribution quantile.
This approach accounts for the uncertainty in estimating both the mean and the
covariance from sample data, producing larger ellipsoids that better reflect
sampling uncertainty. It is statistically more rigorous for smaller sample
sizes, where parameter estimation uncertainty is higher.
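The effect of the two options can be illustrated by comparing the quantiles directly. The sketch below is an assumption about the scaling, not code taken from the package: `ellipsoid_quantile()` is a hypothetical helper, and the Hotelling's T² critical value is written in one common form, p(n - 1)/(n - p) times an F quantile with p and n - p degrees of freedom.

```r
# Hypothetical helper (not exported by ConfidenceEllipse): squared-radius
# quantile of a p-dimensional region under each distribution option.
ellipsoid_quantile <- function(n, p = 3, conf_level = 0.95,
                               distribution = c("normal", "hotelling")) {
  distribution <- match.arg(distribution)
  if (distribution == "normal") {
    # Chi-square quantile: does not depend on the sample size
    stats::qchisq(conf_level, df = p)
  } else {
    # Assumed scaling: T^2 critical value expressed through the F distribution
    stopifnot(n > p)
    (p * (n - 1) / (n - p)) * stats::qf(conf_level, df1 = p, df2 = n - p)
  }
}

sample_sizes <- c(10, 30, 100, 1000, 100000)
comparison <- data.frame(
  n = sample_sizes,
  normal = sapply(sample_sizes, ellipsoid_quantile, distribution = "normal"),
  hotelling = sapply(sample_sizes, ellipsoid_quantile, distribution = "hotelling")
)
# Ratio > 1 means the Hotelling-based ellipsoid is larger
comparison$ratio <- comparison$hotelling / comparison$normal
print(comparison)
```

Under this assumed form, the ratio is largest for small n and approaches 1 as n grows, which matches the description of larger ellipsoids for small samples.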
The combination of distribution = "hotelling" and robust = TRUE offers the
most conservative and statistically rigorous approach, particularly recommended
for exploratory data analysis and when dealing with datasets that may
not meet ideal statistical assumptions. For very large samples, the default
settings (distribution = "normal", robust = FALSE) may be sufficient, as
the differences between methods diminish with increasing sample size.
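To show why a robust estimate matters when data depart from ideal assumptions, the sketch below contrasts classical and robust estimates of location and covariance on contaminated data. This page does not state which robust estimator the function uses, so the sketch swaps in the Minimum Covariance Determinant estimator from `MASS::cov.rob()` purely as an illustration. It assumes the MASS package is installed; the `iris` data and the five added outliers are stand-ins.

```r
# Illustration only: MCD from MASS is an assumed stand-in, not necessarily
# the robust estimator used by confidence_ellipsoid().
set.seed(1)  # cov.rob() uses random subsampling

clean <- as.matrix(iris[, 1:3])

# Five artificial gross outliers far from the bulk of the data
outliers <- cbind(
  c(50, 55, 60, 65, 70),
  c(70, 60, 50, 65, 55),
  c(60, 70, 55, 50, 65)
)
contaminated <- rbind(clean, outliers)

# Classical estimates are pulled toward the outliers
classical_center <- colMeans(contaminated)
classical_cov <- stats::cov(contaminated)

# Robust estimates largely ignore the outliers
robust_fit <- MASS::cov.rob(contaminated, method = "mcd")

print(rbind(classical = classical_center, robust = robust_fit$center))
print(c(classical_det = det(classical_cov), robust_det = det(robust_fit$cov)))
```

A robust center and covariance like these would replace the classical ones when building the ellipsoid, so a handful of extreme observations does not inflate its size.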
Value
A data frame of coordinate points.
Author(s)
Christian L. Goueguel
References
Raymaekers, J., Rousseeuw, P. J. (2019). Fast robust correlation for high dimensional data. Technometrics, 63(2), 184–198.
Brereton, R. G. (2016). Hotelling’s T-squared distribution, its relationship to the F distribution and its use in multivariate space. Journal of Chemometrics, 30(1), 18–21.
Examples
# Data
data("glass", package = "ConfidenceEllipse")
# Confidence ellipsoid
ellipsoid <- confidence_ellipsoid(.data = glass, x = SiO2, y = Na2O, z = Fe2O3)
ellipsoid_grp <- confidence_ellipsoid(
  .data = glass,
  x = SiO2,
  y = Na2O,
  z = Fe2O3,
  .group_by = glassType
)