glintnet {adelie} | R Documentation |
Fit a GLM interaction model with group lasso or group elastic-net regularization
Description
This function is an implementation of the glinternet model of Lim and Hastie, for fitting interactions between pairs of variables in a model. The method creates interaction matrices and enforces hierarchy using the overlap group lasso. Once the augmented model matrix is set up, glintnet uses grpnet to fit the overlap group lasso path. It hence inherits all the capabilities of grpnet, and in particular can fit interaction models for all the GLM families.
Usage
glintnet(
X,
glm,
offsets = NULL,
intr_keys = NULL,
intr_values,
levels = NULL,
n_threads = 1,
save.X = FALSE,
...
)
Arguments
X |
A dense matrix, which can include factors with levels coded as non-negative integers starting at 0. |
glm |
GLM family/response object. This is an expression that
represents the family, the response and other arguments such as
weights, if present. The choices are |
offsets |
Offsets, default is |
intr_keys |
List of feature indices. This is a list of all features with which interactions can be
formed. Default is |
intr_values |
List of integer vectors of feature indices. For each of the |
levels |
Number of levels for each of the columns of |
n_threads |
Number of threads, default |
save.X |
Logical flag, default |
... |
Additional named arguments to |
Details
The input matrix can be composed of quantitative variables or columns representing factors.
The argument levels indicates which columns are quantitative and which are factors; in the Examples, quantitative columns are given a levels entry of 1.
The latter are represented by numbers starting at 0, up to one less than the number of levels (sorry!).
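As an illustration of this coding, the following base R sketch converts a data frame with factor columns into a numeric matrix with 0-based codes, along with a matching levels vector. The data frame `df` and the names `X` and `lev` are hypothetical, and the convention of using 1 for quantitative columns is an assumption taken from the Examples rather than a statement of the package's internals.

```r
# Hypothetical input: a data frame mixing numeric columns and factors.
df <- data.frame(
  age  = c(31, 45, 27, 52),
  dose = factor(c("low", "high", "mid", "low"), levels = c("low", "mid", "high")),
  sex  = factor(c("F", "M", "M", "F"))
)

is_fac <- vapply(df, is.factor, logical(1))

# Factors become codes 0, ..., nlevels - 1; numeric columns pass through.
X <- vapply(df, function(col) {
  if (is.factor(col)) as.numeric(col) - 1 else as.numeric(col)
}, numeric(nrow(df)))

# Assumed convention (as in the Examples): 1 for quantitative columns,
# the number of levels for factor columns.
lev <- ifelse(is_fac, vapply(df, nlevels, integer(1)), 1L)
```

The resulting `X` and `lev` have the shape expected for the X and levels arguments; whether further checks are needed before calling glintnet is not covered here.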
Each factor is converted to a "one-hot" matrix, so a group of columns is created for each factor.
This is done using the matrix utility function matrix.one_hot(). In addition, interaction matrices are created.
For each pair of variables for which an interaction is considered, a matrix is created consisting of the
cross-product of each of the constituent matrices, as described in the "glinternet" reference.
Once this much bigger matrix is established, the model is handed to grpnet to produce the fit.
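The construction described above can be sketched in base R. This is a simplified illustration, not the package's implementation: the helpers `one_hot()` and `pairwise_products()` are hypothetical, and the actual column layout of the interaction groups built by adelie may differ (for example, by also including main-effect columns in each interaction group).

```r
# One-hot encode a factor coded 0, 1, ..., n_levels - 1 (base R only).
one_hot <- function(z, n_levels) {
  out <- matrix(0, nrow = length(z), ncol = n_levels)
  out[cbind(seq_along(z), z + 1)] <- 1  # shift to R's 1-based indexing
  out
}

# All pairwise column products of two matrices with the same number of rows.
pairwise_products <- function(A, B) {
  do.call(cbind, lapply(seq_len(ncol(A)), function(j) A[, j] * B))
}

set.seed(1)
x <- rnorm(6)             # a quantitative variable
z <- c(0, 2, 1, 0, 1, 2)  # a 3-level factor coded from 0

Z_hot <- one_hot(z, 3)                       # 6 x 3 indicator matrix
XZ <- pairwise_products(cbind(x), Z_hot)     # 6 x 3 interaction columns
```

Each row of `XZ` holds the value of `x` in the column matching that row's factor level and zeros elsewhere, which is what lets the model fit a separate slope for each level.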
Value
A list of class "glintnet", which inherits from class "grpnet".
This has a few additional components such as pairs, groups and levels.
Users typically use methods like predict(), print(), plot() etc. to examine the object.
Author(s)
James Yang, Trevor Hastie, and Balasubramanian Narasimhan
Maintainer: Trevor Hastie hastie@stanford.edu
References
Lim, Michael and Hastie, Trevor (2015) Learning interactions via hierarchical group-lasso regularization, JCGS
doi:10.1080/10618600.2014.938812
Yang, James and Hastie, Trevor. (2024) A Fast and Scalable Pathwise-Solver for Group Lasso
and Elastic Net Penalized Regression via Block-Coordinate Descent. arXiv doi:10.48550/arXiv.2405.08631.
Friedman, J., Hastie, T. and Tibshirani, R. (2010)
Regularization Paths for Generalized Linear Models via Coordinate
Descent, Journal of Statistical Software, Vol. 33(1), 1-22,
doi:10.18637/jss.v033.i01.
Simon, N., Friedman, J., Hastie, T. and Tibshirani, R. (2011)
Regularization Paths for Cox's Proportional
Hazards Model via Coordinate Descent, Journal of Statistical Software, Vol.
39(5), 1-13,
doi:10.18637/jss.v039.i05.
Tibshirani, Robert, Bien, J., Friedman, J., Hastie, T., Simon, N., Taylor, J. and
Tibshirani, Ryan. (2012) Strong Rules for Discarding Predictors in
Lasso-type Problems, JRSSB, Vol. 74(2), 245-266,
https://arxiv.org/abs/1011.2234.
See Also
cv.glintnet, predict.glintnet, plot.glintnet, print.glintnet.
Examples
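set.seed(0)
n = 500
d_cont = 5 # number of continuous features
d_disc = 5 # number of categorical features
Z_cont = matrix(rnorm(n * d_cont), n, d_cont)
# Number of levels (between 2 and 5) for each categorical feature
levels = sample(2:5, d_disc, replace = TRUE)
# Categorical features are coded 0, 1, ..., levels[i] - 1
Z_disc = matrix(0, n, d_disc)
for (i in seq(d_disc)) Z_disc[, i] = sample(0:(levels[i] - 1), n, replace = TRUE)
Z = cbind(Z_cont, Z_disc)
# A levels entry of 1 marks a continuous feature
levels = c(rep(1, d_cont), levels)
# True model: main effects and interaction of continuous feature 1 and categorical feature 2
xmat = model.matrix(~ Z_cont[, 1] * factor(Z_disc[, 2]))
nc = ncol(xmat)
beta = rnorm(nc)
y = xmat %*% beta + rnorm(n) * 1.5
# Form interactions only with feature 1
fit <- glintnet(Z, glm.gaussian(y), levels = levels, intr_keys = 1)
print(fit)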