IPCR {OPCreg}    R Documentation

Incremental Principal Component Regression for Online Datasets

Description

The IPCR function implements an incremental Principal Component Regression (PCR) method designed to handle online datasets. It updates the principal components recursively as new data arrives, making it suitable for real-time data processing.

Usage

IPCR(data, eta, m, alpha)

Arguments

data

A data frame where the first column is the response variable and the remaining columns are predictor variables.

eta

The proportion of the total sample size n used to initialize the principal components (0 < eta < 1); the first n0 = round(eta * n) observations form the initial block. Default is 0.0035.

m

The number of principal components to retain. Default is 3.

alpha

The significance level used for calculating critical values. Default is 0.05.
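To see how eta translates into an initial block size, the following sketch evaluates n0 = round(eta * n), the expression given in Details, for two settings. The helper name initial_block_size is hypothetical and not part of OPCreg.

```r
# Hypothetical helper (not part of OPCreg): initial block size implied by eta
initial_block_size <- function(n, eta = 0.0035) {
  stopifnot(eta > 0, eta < 1)
  round(eta * n)
}

initial_block_size(2000)        # 7 observations with the default eta
initial_block_size(2000, 0.05)  # 100 observations
```

With a small n the default eta leaves very few observations in the initial block (n = 200 gives round(0.7) = 1), so a larger eta may be preferable in that case.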

Details

The IPCR function performs the following steps:

1. Standardizes the predictor variables.
2. Initializes the principal components using the first n0 = round(eta * n) samples.
3. Recursively updates the principal components as each new sample arrives.
4. Fits a linear regression model using the principal component scores.
5. Back-transforms the regression coefficients to the original scale.

This method is particularly useful for datasets where new observations are continuously added, and the model needs to be updated incrementally.
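The five steps listed in Details can be sketched in a few lines of R. This is a simplified illustration, not the OPCreg source. It assumes the recursive step is a rank-one update of the running mean and covariance followed by a single eigendecomposition, and the actual update rule in IPCR may differ. The name ipcr_sketch, its default eta of 0.05, and the toy data are all hypothetical. The sketch does not use alpha.

```r
# Hypothetical sketch, not the OPCreg source: incremental PCR via a
# recursive mean/covariance update followed by an eigendecomposition.
ipcr_sketch <- function(data, eta = 0.05, m = 3) {
  y <- data[[1]]
  X <- scale(as.matrix(data[, -1, drop = FALSE]))   # step 1: standardize
  center <- attr(X, "scaled:center")
  sds <- attr(X, "scaled:scale")
  n <- nrow(X)
  n0 <- max(round(eta * n), 2)                      # step 2: initial block

  X0 <- X[seq_len(n0), , drop = FALSE]
  xbar <- colMeans(X0)
  S <- crossprod(sweep(X0, 2, xbar)) / n0           # divide-by-n covariance

  # step 3: rank-one update of mean and covariance for each new sample
  for (i in seq_len(n - n0) + n0) {
    d <- X[i, ] - xbar                              # deviation from old mean
    xbar <- xbar + d / i
    S <- ((i - 1) / i) * S + ((i - 1) / i^2) * tcrossprod(d)
  }

  V <- eigen(S, symmetric = TRUE)$vectors[, seq_len(m), drop = FALSE]
  scores <- X %*% V
  fit <- lm(y ~ scores)                             # step 4: regress on scores

  # step 5: map score coefficients back to the original predictor scale
  slopes <- as.vector(V %*% coef(fit)[-1]) / sds
  Bhat <- c(coef(fit)[[1]] - sum(center * slopes), slopes)
  names(Bhat) <- c("(Intercept)", colnames(X))
  yhat <- as.vector(fitted(fit))
  list(Bhat = Bhat, yhat = yhat, RMSE = sqrt(mean((y - yhat)^2)))
}

# Toy data (made up for illustration): first column is the response
set.seed(1)
x <- matrix(rnorm(500 * 4), ncol = 4, dimnames = list(NULL, paste0("x", 1:4)))
toy <- data.frame(y = as.vector(1 + x %*% c(2, -1, 0.5, 3) + rnorm(500)), x)
res <- ipcr_sketch(toy, eta = 0.05, m = 2)
res$Bhat
```

Because the sketch standardizes with the full-sample mean and standard deviation before streaming, it is only incremental in the covariance update. A fully online variant would also have to update the scaling as data arrive.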

Value

A list containing the following elements:

Bhat

The estimated regression coefficients, including the intercept.

RMSE

The Root Mean Square Error of the regression model.

summary

The summary of the linear regression model.

yhat

The predicted values of the response variable.
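The relationship between the returned elements can be checked directly. The sketch below assumes that yhat equals the design matrix (with a leading column of ones) times Bhat, and that RMSE is the square root of the mean squared residual. The package may use a different denominator, so treat this as an illustration only. An ordinary lm fit on the built-in mtcars data stands in for the IPCR result, and check_fit is a hypothetical helper.

```r
# Hypothetical helper (not part of OPCreg): recompute yhat and RMSE from Bhat
check_fit <- function(data, Bhat) {
  y <- data[[1]]
  X <- cbind(1, as.matrix(data[, -1, drop = FALSE]))  # intercept column first
  yhat <- as.vector(X %*% Bhat)
  list(yhat = yhat, RMSE = sqrt(mean((y - yhat)^2)))
}

# Stand-in coefficients from lm(); with OPCreg one would pass result$Bhat
fit <- lm(mpg ~ wt + hp, data = mtcars)
chk <- check_fit(mtcars[, c("mpg", "wt", "hp")], coef(fit))
all.equal(chk$yhat, as.vector(fitted(fit)))
chk$RMSE
```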

See Also

lm: For fitting linear models.

eigen: For computing eigenvalues and eigenvectors.

Examples

## Not run: 
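set.seed(1234)
library(MASS)  # provides mvrnorm()
n <- 2000  # number of observations
p <- 10    # number of predictors
# Mean vector and standard deviations of the predictors
mu0 <- as.matrix(runif(p, 0))
sigma0 <- as.matrix(runif(p, 0, 10))
# Correlation matrix R0 built from ro, giving strongly correlated predictors
ro <- as.matrix(c(runif(round(p / 2), -1, -0.8), runif(p - round(p / 2), 0.8, 1)))
R0 <- ro %*% t(ro)
diag(R0) <- 1
Sigma0 <- sigma0 %*% t(sigma0) * R0  # covariance matrix
x <- mvrnorm(n, mu0, Sigma0)
colnames(x) <- paste("x", 1:p, sep = "")
# Response: linear model with intercept, integer coefficients, and N(0, 1) noise
e <- rnorm(n, 0, 1)
B <- sample(1:3, (p + 1), replace = TRUE)
en <- matrix(rep(1, n * 1), ncol = 1)
y <- cbind(en, x) %*% B + e
colnames(y) <- paste("y")
# The response goes in the first column, as IPCR expects
data <- data.frame(cbind(y, x))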

result <- IPCR(data = data, m = 3, eta = 0.0035, alpha = 0.05)
print(result$Bhat)
print(result$yhat)
print(result$RMSE)
print(result$summary)

## End(Not run)


[Package OPCreg version 3.0.0 Index]