PPCR {OPCreg} | R Documentation
Perturbation-based Principal Component Regression
Description
This function performs Perturbation-based Principal Component Regression (PPCR) on the provided dataset. It combines principal component analysis (PCA) with linear regression and adds random perturbations to the principal components, which is intended to make the estimates more robust.
Usage
PPCR(data, eta = 0.0035, m = 3, alpha = 0.05, perturbation_factor = 0.1)
Arguments
data | A data frame containing the response variable and predictors.
eta | A proportion (between 0 and 1) determining the initial sample size for PCA.
m | The number of principal components to retain.
alpha | Significance level (currently not used in the function).
perturbation_factor | A factor controlling the magnitude of perturbation added to the principal components.
Details
The function first standardizes the predictors, then performs PCA on an initial subset of the data. It iteratively updates the principal components by incorporating new observations and adding random perturbations. Finally, it fits a linear regression model using the principal components as predictors and transforms the coefficients back to the original space.
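The steps above can be sketched in base R as follows. This is an illustration written for this page, not the OPCreg source. The function name ppcr_sketch is hypothetical, and several choices are assumptions: the response is taken from the first column, the initial sample size is round(eta * n), new observations are folded in with a running average, and a single Gaussian perturbation is applied after the last update (the package may perturb at every iteration).

```r
# Illustrative sketch only; ppcr_sketch is a hypothetical name, not part of OPCreg.
ppcr_sketch <- function(data, eta = 0.0035, m = 3, perturbation_factor = 0.1) {
  y <- data[[1]]                                # assumption: response in column 1
  X <- as.matrix(data[, -1, drop = FALSE])
  n <- nrow(X)
  p <- ncol(X)

  # Standardize the predictors; keep centers and scales for the back-transform.
  ctr <- colMeans(X)
  scl <- apply(X, 2, sd)
  Z <- sweep(sweep(X, 2, ctr, "-"), 2, scl, "/")

  # Second-moment matrix of an initial subset (size is an assumption).
  n0 <- min(n, max(m + 1, round(eta * n)))
  S <- crossprod(Z[seq_len(n0), , drop = FALSE]) / n0

  # Fold in the remaining observations one at a time (running average of z z').
  for (i in seq_len(n - n0) + n0) {
    z <- Z[i, ]
    S <- ((i - 1) * S + outer(z, z)) / i
  }

  # Leading m eigenpairs, then a random perturbation of the loadings
  # (assumption: Gaussian noise applied once, then re-orthonormalized via QR).
  eig <- eigen(S, symmetric = TRUE)
  V <- eig$vectors[, seq_len(m), drop = FALSE]
  V <- V + perturbation_factor * matrix(rnorm(p * m), p, m)
  V <- qr.Q(qr(V))

  # Regress the response on the component scores.
  scores <- Z %*% V
  fit <- lm(y ~ scores)
  g <- unname(coef(fit))

  # Map the coefficients back to the original predictor scale.
  beta <- drop(V %*% g[-1]) / scl
  intercept <- g[1] - sum(beta * ctr)
  yhat <- unname(fitted(fit))

  list(Bhat = c(intercept, beta),
       RMSE = sqrt(mean((y - yhat)^2)),
       summary = summary(fit),
       Vhat = V,
       lambdahat = eig$values[seq_len(m)],  # eigenvalues before perturbation
       yhat = yhat)
}

# Usage on simulated data
set.seed(1)
n <- 200
p <- 5
X <- matrix(rnorm(n * p), n, p)
sim <- data.frame(y = drop(1 + X %*% c(2, -1, 0.5, 0, 3) + rnorm(n)), X)
res <- ppcr_sketch(sim, eta = 0.05, m = 2)
res$Bhat
res$RMSE
```

Because the back-transformed coefficients include the intercept, cbind(1, X) %*% res$Bhat reproduces res$yhat in this sketch, which is a convenient check that the back-transformation is consistent with the fitted model.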
Value
A list containing the following components:
Bhat | Estimated regression coefficients in the original space.
RMSE | Root Mean Squared Error of the regression model.
summary | Summary of the linear regression model.
Vhat | Estimated principal components.
lambdahat | Estimated eigenvalues.
yhat | Predicted values from the regression model.
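As a small illustration of how the RMSE and yhat components relate, the root mean squared error can be recomputed from observed and predicted values. The vectors below are stand-ins made up for this example, and the n denominator is an assumption; the package may use a different denominator, such as the residual degrees of freedom.

```r
# Stand-in vectors; with a fitted model these would be the observed response
# and the yhat component of the returned list.
y    <- c(3.1, 4.8, 6.2, 7.9)
yhat <- c(3.0, 5.0, 6.0, 8.0)

# Root mean squared error with an n denominator (assumption).
rmse <- sqrt(mean((y - yhat)^2))
rmse
```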
See Also
lm: For linear regression models.
prcomp: For principal component analysis.
Examples
## Not run:
# Example data
set.seed(1234)
n <- 2000
p <- 10
mu0 <- as.matrix(runif(p, 0))
sigma0 <- as.matrix(runif(p, 0, 10))
ro <- as.matrix(c(runif(round(p / 2), -1, -0.8), runif(p - round(p / 2), 0.8, 1)))
R0 <- ro %*% t(ro)
diag(R0) <- 1
Sigma0 <- sigma0 %*% t(sigma0) * R0
x <- MASS::mvrnorm(n, mu0, Sigma0)  # mvrnorm() comes from the MASS package
colnames(x) <- paste0("x", 1:p)
e <- rnorm(n, 0, 1)
B <- sample(1:3, (p + 1), replace = TRUE)
en <- matrix(1, nrow = n, ncol = 1)  # intercept column
y <- cbind(en, x) %*% B + e
colnames(y) <- "y"
data <- data.frame(cbind(y, x))
# Call the PPCR function
result <- PPCR(data, eta = 0.0035, m = 3, alpha = 0.05, perturbation_factor = 0.1)
# Print results
print(result$Bhat) # Estimated regression coefficients
print(result$RMSE) # RMSE of the model
print(result$summary) # Summary of the regression model
## End(Not run)