glmmsel {glmmsel}    R Documentation

Generalised linear mixed model selection

Description

Fits the regularisation path for a sparse generalised linear mixed model (GLMM).

Usage

glmmsel(
  x,
  y,
  cluster,
  family = c("gaussian", "binomial"),
  local.search = FALSE,
  max.nnz = 100,
  nlambda = 100,
  lambda.step = 0.99,
  lambda = NULL,
  alpha = 0.8,
  intercept = TRUE,
  random.intercept = TRUE,
  standardise = TRUE,
  eps = 1e-04,
  max.cd.iter = 10000,
  max.ls.iter = 100,
  max.bls.iter = 30,
  t.init = 1,
  t.scale = 0.5,
  max.pql.iter = 100,
  active.set = TRUE,
  active.set.count = 3,
  sort = TRUE,
  screen = 100,
  warn = TRUE
)

Arguments

x

a predictor matrix

y

a response vector

cluster

a vector of length nrow(x) whose jth element identifies the cluster to which the jth observation belongs

family

the likelihood family to use; 'gaussian' for a continuous response or 'binomial' for a binary response

local.search

a logical indicating whether to perform local search after coordinate descent; typically leads to higher-quality solutions

max.nnz

the maximum number of predictors ever allowed to be active

nlambda

the number of regularisation parameters to evaluate when lambda is computed automatically

lambda.step

the step size taken when computing lambda from the data; should be a value strictly between 0 and 1; larger values typically lead to a finer grid of subset sizes

lambda

an optional vector of regularisation parameters

alpha

the hierarchical parameter

intercept

a logical indicating whether to include a fixed intercept

random.intercept

a logical indicating whether to include a random intercept; applies only when intercept = TRUE

standardise

a logical indicating whether to scale the data to have unit root mean square; all parameters are returned on the original scale of the data

eps

the convergence tolerance; convergence is declared when the relative maximum difference in consecutive parameter values is less than eps

max.cd.iter

the maximum number of coordinate descent iterations allowed

max.ls.iter

the maximum number of local search iterations allowed

max.bls.iter

the maximum number of backtracking line search iterations allowed

t.init

the initial value of the gradient step size during backtracking line search

t.scale

the scaling parameter of the gradient step size during backtracking line search

max.pql.iter

the maximum number of penalised quasi-likelihood iterations allowed

active.set

a logical indicating whether to use active set updates; typically lowers the run time

active.set.count

the number of consecutive coordinate descent iterations in which a subset should appear before running active set updates

sort

a logical indicating whether to sort the coordinates before running coordinate descent; typically leads to higher-quality solutions

screen

the number of predictors to keep after gradient screening; smaller values typically lower the run time

warn

a logical indicating whether to print a warning if the algorithms fail to converge
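
Two of the arguments above can be illustrated in base R without the package. The sketch below first scales each column of a matrix to unit root mean square, as described for standardise, and then runs a generic textbook backtracking line search to show how t.init, t.scale, and max.bls.iter interact. Both parts are illustrations under stated assumptions, not the package's internal code: whether glmmsel also centres the data is not stated here, and the sufficient decrease rule and the helper name backtrack are hypothetical choices made for this example.

```r
# Part 1: scale columns to unit root mean square (no centring is assumed)
set.seed(1)
x <- matrix(rnorm(20 * 3, sd = 5), 20, 3)
rms <- sqrt(colMeans(x^2))       # root mean square of each column
x.std <- sweep(x, 2, rms, "/")   # divide each column by its root mean square

# Part 2: generic backtracking line search (hypothetical helper, not package code)
# f is the objective, grad its gradient, b the current parameter vector
backtrack <- function(f, grad, b, t.init = 1, t.scale = 0.5, max.bls.iter = 30) {
  g <- grad(b)
  step.size <- t.init
  for (iter in seq_len(max.bls.iter)) {
    b.new <- b - step.size * g
    # Accept the step once it gives sufficient decrease (Armijo-type rule, assumed)
    if (f(b.new) <= f(b) - 0.5 * step.size * sum(g^2)) break
    # Otherwise shrink the step size by the factor t.scale and try again
    step.size <- step.size * t.scale
  }
  list(b = b.new, step.size = step.size, iter = iter)
}

f <- function(b) sum(b^2)
grad <- function(b) 2 * b
res <- backtrack(f, grad, c(1, -2))
```

In this sketch a smaller t.scale shrinks the step more aggressively on each rejected trial, while max.bls.iter caps the number of trials per line search.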

Value

An object of class glmmsel; a list with the following components:

beta0

a vector of fixed intercepts

gamma0

a vector of random intercept variances

beta

a matrix of fixed slopes

gamma

a matrix of random slope variances

u

an array of random coefficient predictions

sigma2

a vector of residual variances

loss

a vector of loss function values

cd.iter

a vector of the number of coordinate descent iterations taken to converge

ls.iter

a vector of the number of local search iterations taken to converge

pql.iter

a vector of the number of penalised quasi-likelihood iterations taken to converge

nnz

a vector of the number of nonzeros

lambda

a vector of regularisation parameters used for the fit

family

the likelihood family used

clusters

a vector of cluster identifiers

alpha

the value of the hierarchical parameter used for the fit

intercept

whether a fixed intercept is included in the model

random.intercept

whether a random intercept is included in the model
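
The vector components listed above can be combined to pick a point on the regularisation path. The sketch below uses a mock list in place of a real fit, with made-up values, and assumes that lambda, nnz, and loss each hold one entry per regularisation parameter in the same order. The name fit.mock is hypothetical.

```r
# Mock object standing in for a fitted model; all values are invented for illustration
fit.mock <- list(
  lambda = c(50, 20, 10, 5, 1),
  nnz = c(0, 1, 3, 3, 6),
  loss = c(9.1, 6.4, 3.2, 3.0, 2.7)
)

# Path indices with at most three nonzeros
ok <- which(fit.mock$nnz <= 3)

# Among those, the index with the smallest loss
best <- ok[which.min(fit.mock$loss[ok])]

# Regularisation parameter at the chosen point
lambda.best <- fit.mock$lambda[best]
```

The same indexing should carry over to a real fit if the assumption about matching vector lengths holds.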

Author(s)

Ryan Thompson <ryan.thompson-1@uts.edu.au>

References

Thompson, R., Wand, M. P., and Wang, J. J. J. (2025). 'Scalable subset selection in linear mixed models'. arXiv:2506.20425.

Examples

# Generate data
set.seed(1234)
n <- 100
m <- 4
p <- 10
s <- 5
x <- matrix(rnorm(n * p), n, p)
beta <- c(rep(1, s), rep(0, p - s))
u <- cbind(matrix(rnorm(m * s), m, s), matrix(0, m, p - s))
cluster <- sample(1:m, n, replace = TRUE)
xb <- rowSums(x * sweep(u, 2, beta, '+')[cluster, ])
y <- rnorm(n, xb)

# Fit sparse linear mixed model
fit <- glmmsel(x, y, cluster)
plot(fit)
fixef(fit, lambda = 10)
ranef(fit, lambda = 10)
coef(fit, lambda = 10)
predict(fit, x[1:3, ], cluster[1:3], lambda = 10)
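
The example above uses the default gaussian family. A variant for a binary response might look like the following sketch, which regenerates the same design and draws the response from a logistic model. It assumes that a 0/1 numeric coding of y is accepted when family = "binomial", which is not confirmed above, and the fit is only attempted if the package is installed.

```r
# Generate data with a binary response
set.seed(1234)
n <- 100
m <- 4
p <- 10
s <- 5
x <- matrix(rnorm(n * p), n, p)
beta <- c(rep(1, s), rep(0, p - s))
u <- cbind(matrix(rnorm(m * s), m, s), matrix(0, m, p - s))
cluster <- sample(1:m, n, replace = TRUE)
xb <- rowSums(x * sweep(u, 2, beta, '+')[cluster, ])
# 0/1 coding of the response is an assumption
y.bin <- rbinom(n, 1, plogis(xb))

# Fit sparse logistic mixed model if glmmsel is available
if (requireNamespace("glmmsel", quietly = TRUE)) {
  fit.bin <- glmmsel::glmmsel(x, y.bin, cluster, family = "binomial")
  # Number of nonzeros along the path (the nnz component of the returned list)
  print(fit.bin$nnz)
}
```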

[Package glmmsel version 1.0.3 Index]