cv.glmmsel {glmmsel}    R Documentation

Cross-validated generalised linear mixed model selection

Description

Fits the regularisation path for a sparse generalised linear mixed model and then cross-validates this path.

Usage

cv.glmmsel(
  x,
  y,
  cluster,
  family = c("gaussian", "binomial"),
  lambda = NULL,
  nfold = 10,
  folds = NULL,
  cv.loss = NULL,
  interpolate = TRUE,
  ...
)

Arguments

x

a predictor matrix

y

a response vector

cluster

a vector of length nrow(x) with the jth element identifying the cluster that the jth observation belongs to

family

the likelihood family to use; 'gaussian' for a continuous response or 'binomial' for a binary response

lambda

the regularisation parameter for the overlapping penalty on the fixed and random slopes

nfold

the number of cross-validation folds

folds

an optional vector of length nrow(x) with the jth entry identifying the fold that the jth observation belongs to

cv.loss

an optional cross-validation loss function to use; should accept a vector of predicted values and a vector of actual values

interpolate

a logical indicating whether to interpolate the lambda sequence for the cross-validation fits

...

any other arguments for glmmsel()
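
The folds and cv.loss arguments can be prepared in base R before calling cv.glmmsel(). The following is a minimal sketch, not part of the package: the names folds and mae are illustrative, the balanced random assignment is only one possible way to build folds, and the argument order of the loss (predicted first, actual second) is assumed from the description of cv.loss above.

```r
# Illustrative setup (not from the package): 100 observations, 10 folds
set.seed(1)
n <- 100
nfold <- 10

# Balanced random fold labels: each of the 10 folds receives 10 observations
folds <- sample(rep_len(seq_len(nfold), n))

# A custom loss taking predicted values first and actual values second
# (order assumed from the cv.loss description); here, mean absolute error
mae <- function(predicted, actual) mean(abs(predicted - actual))

# Quick check of the loss on a small example
mae(c(1, 2, 3), c(1, 2, 5))
```

With x, y, and cluster as in the Examples section, these objects would then be supplied as cv.glmmsel(x, y, cluster, folds = folds, cv.loss = mae).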

Value

An object of class cv.glmmsel; a list with the following components:

cv.mean

a vector of cross-validation means

cv.sd

a vector of cross-validation standard errors

lambda

a vector of cross-validated regularisation parameters

lambda.min

the value of lambda minimising cv.mean

fit

the fit from running glmmsel() on the full data
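
To illustrate how these components relate, the sketch below uses a toy list with made-up values in place of a real cv.glmmsel object. It assumes that lambda.min is simply the entry of lambda at which cv.mean is smallest, as the description of lambda.min suggests; the package's internal computation may differ in detail.

```r
# Toy stand-in for the documented components (all values are made up)
cv <- list(
  lambda  = c(1.0, 0.5, 0.1, 0.05),
  cv.mean = c(2.4, 1.7, 1.2, 1.3),
  cv.sd   = c(0.30, 0.25, 0.20, 0.22)
)

# Position of the smallest cross-validation mean
best <- which.min(cv$cv.mean)

# The corresponding regularisation parameter
lambda_min <- cv$lambda[best]
lambda_min
```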

Author(s)

Ryan Thompson <ryan.thompson-1@uts.edu.au>

References

Thompson, R., Wand, M. P., and Wang, J. J. J. (2025). 'Scalable subset selection in linear mixed models'. arXiv: 2506.20425.

Examples

# Generate data
set.seed(1234)
n <- 100
m <- 4
p <- 10
s <- 5
x <- matrix(rnorm(n * p), n, p)
beta <- c(rep(1, s), rep(0, p - s))
u <- cbind(matrix(rnorm(m * s), m, s), matrix(0, m, p - s))
cluster <- sample(1:m, n, replace = TRUE)
xb <- rowSums(x * sweep(u, 2, beta, '+')[cluster, ])
y <- rnorm(n, xb)

# Fit and cross-validate a sparse linear mixed model
fit <- cv.glmmsel(x, y, cluster)
plot(fit)
fixef(fit)
ranef(fit)
coef(fit)
predict(fit, x[1:3, ], cluster[1:3])
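
The example above uses the default Gaussian family. A binary response can be handled with family = "binomial", as described under Arguments. The sketch below is an illustration rather than an official example: it regenerates the same design, draws the response through a logistic link (an assumption made here for data generation), and uses the hypothetical names y_bin and fit_bin. The fit runs only if glmmsel is installed.

```r
# Simulate clustered data with the same design as the Gaussian example
set.seed(1234)
n <- 100
m <- 4
p <- 10
s <- 5
x <- matrix(rnorm(n * p), n, p)
beta <- c(rep(1, s), rep(0, p - s))
u <- cbind(matrix(rnorm(m * s), m, s), matrix(0, m, p - s))
cluster <- sample(1:m, n, replace = TRUE)
xb <- rowSums(x * sweep(u, 2, beta, '+')[cluster, ])

# Binary response drawn via the logistic link (assumption of this sketch)
y_bin <- rbinom(n, 1, plogis(xb))

# Fit the sparse logistic mixed model only if the package is available
if (requireNamespace("glmmsel", quietly = TRUE)) {
  library(glmmsel)
  fit_bin <- cv.glmmsel(x, y_bin, cluster, family = "binomial")
  coef(fit_bin)
}
```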

[Package glmmsel version 1.0.3 Index]