optim.pls.cv {fastPLS} | R Documentation |
Cross-Validation with PLS-DA.
Description
This function performs a k-fold cross-validation (10-fold by default, see kfold) on a given data set using a Partial Least Squares (PLS) model. With the default setting, the predictive ability of the model is assessed by splitting the data set into training and test sets with a 9:1 ratio: 10% of the samples are removed before any step of the statistical analysis, including scaling and the selection of the number of PLS components. The best number of components is then selected by a 10-fold cross-validation on the remaining 90% of the samples, choosing the value with the best Q2Y. Permutation testing is used to estimate the classification/regression performance of the predictors.
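The hold-out scheme described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: make_folds is a hypothetical helper, centering stands in for the scaling step, and the inner component selection is left as a comment.

```r
# Hypothetical helper (not part of fastPLS): randomly assign n samples to k folds
# of (nearly) equal size.
make_folds <- function(n, k = 10) {
  sample(rep(seq_len(k), length.out = n))
}

set.seed(1)
X <- as.matrix(iris[, -5])
folds <- make_folds(nrow(X), k = 10)

for (i in seq_len(10)) {
  test_idx  <- which(folds == i)   # about 10% of samples, set aside first
  train_idx <- which(folds != i)   # remaining 90%

  # Scaling parameters are estimated on the training samples only and
  # then applied unchanged to the held-out samples.
  centre  <- colMeans(X[train_idx, , drop = FALSE])
  X_train <- sweep(X[train_idx, , drop = FALSE], 2, centre)
  X_test  <- sweep(X[test_idx, , drop = FALSE], 2, centre)

  # An inner cross-validation on X_train would select the number of
  # PLS components here, before predicting X_test.
}

fold_sizes <- table(folds)  # number of samples per fold
```

Estimating the centering vector inside the loop, rather than once on the full data, is what keeps the held-out samples out of every preprocessing step.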
Usage
optim.pls.cv (Xdata,
Ydata,
ncomp,
constrain=NULL,
scaling = c("centering", "autoscaling","none"),
method = c("plssvd", "simpls"),
svd.method = c("irlba", "dc"),
kfold=10)
Arguments
Xdata |
a matrix of independent variables or predictors. |
Ydata |
the responses. If Ydata is a numeric vector, a regression analysis will be performed. If Ydata is a factor, a classification analysis will be performed. |
ncomp |
the number of latent components to be used. A vector of candidate values may be supplied (e.g. 2:4). |
constrain |
a vector of |
scaling |
the scaling method to be used. Choices are "centering", "autoscaling", or "none". |
method |
the algorithm to be used to perform the PLS. Choices are "plssvd" or "simpls". |
svd.method |
the SVD method to be used to perform the PLS. Choices are "irlba" or "dc". |
kfold |
the number of cross-validation folds. The default is 10. |
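As a hedged illustration of how these arguments fit together, the sketch below prepares the inputs in base R and passes the options by name, using only the argument names and choices listed in Usage. The variable names (X, y_class, y_reg, X_reg, fit) are made up for the example, and the call is guarded so that the snippet still runs when fastPLS is not installed.

```r
# Predictors as a numeric matrix, as described for Xdata.
X <- as.matrix(iris[, 1:4])

# A factor response requests a classification analysis.
y_class <- iris$Species

# A numeric response would request a regression analysis instead
# (here: predicting sepal length from the other three measurements).
y_reg <- iris$Sepal.Length
X_reg <- as.matrix(iris[, 2:4])

# Only attempt the call when the package is available.
if (requireNamespace("fastPLS", quietly = TRUE)) {
  fit <- fastPLS::optim.pls.cv(X, y_class,
                               ncomp   = 2:4,
                               scaling = "autoscaling",
                               method  = "plssvd",
                               kfold   = 10)
}
```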
Value
The output of the result is a list with the following components:
B |
the (p x m x length(ncomp)) array containing the regression coefficients. Each row corresponds to a predictor variable and each column to a response variable. The third dimension of the array B corresponds to the number of PLS components used to compute the regression coefficients. If ncomp has length 1, B is just a (p x m) matrix. |
Ypred |
the vector containing the predicted values of the response variables obtained by cross-validation. |
Yfit |
the vector containing the fitted values of the response variables. |
P |
the (p x max(ncomp)) matrix containing the X-loadings. |
Q |
the (m x max(ncomp)) matrix containing the Y-loadings. |
T |
the (ntrain x max(ncomp)) matrix containing the X-scores (latent components). |
R |
the (p x max(ncomp)) matrix containing the weights used to construct the latent components. |
Q2Y |
the predictive power of the model. |
R2Y |
the proportion of variance in Y explained by the model. |
R2X |
the vector containing the variance of X explained by each PLS component. |
txtQ2Y |
a summary of the Q2Y values. |
txtR2Y |
a summary of the R2Y values. |
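For orientation, Q2Y and R2Y are conventionally defined as one minus the ratio of a residual sum of squares to the total sum of squares of Y, using cross-validated predictions for Q2Y and fitted values for R2Y. The sketch below follows that convention; whether fastPLS computes its values in exactly this way is an assumption, r2_like is a hypothetical helper, and the numbers are toy values chosen only for illustration.

```r
# Hypothetical helper: 1 - (residual sum of squares / total sum of squares).
r2_like <- function(y, yhat) {
  1 - sum((y - yhat)^2) / sum((y - mean(y))^2)
}

y     <- c(1, 2, 3, 4, 5)
y_fit <- c(1.1, 1.9, 3.0, 4.2, 4.8)  # toy fitted values (cf. Yfit)
y_cv  <- c(1.5, 1.5, 3.5, 4.5, 4.0)  # toy cross-validated predictions (cf. Ypred)

r2y <- r2_like(y, y_fit)  # R2Y-style statistic from fitted values
q2y <- r2_like(y, y_cv)   # Q2Y-style statistic from cross-validated predictions
```

Because cross-validated predictions come from models that never saw the predicted sample, a Q2Y-style value is normally lower than the corresponding R2Y-style value.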
Author(s)
Dupe Ojo, Alessia Vignoli, Stefano Cacciatore, Leonardo Tenori
See Also
Examples
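library(fastPLS)
data(iris)
# Predictors as a numeric matrix; species labels as a factor (classification)
data <- as.matrix(iris[, -5])
labels <- iris[, 5]
# Cross-validate PLS-DA with 2, 3 and 4 candidate components
pp <- optim.pls.cv(data, labels, 2:4)
pp$optim_comp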