balance {sgboost} | R Documentation |
Balances selection frequencies for unequal groups
Description
Returns optimal degrees of freedom for group boosting to achieve more balanced variable selection.
Groups should be defined through group_df. Each base_learner
Usage
balance(
df = NULL,
group_df = NULL,
blearner = "bols",
outcome_name = "y",
group_name = "group_name",
var_name = "var_name",
n_reps = 3000,
iterations = 15,
nu = 0.5,
red_fact = 0.9,
min_weights = 0.01,
max_weights = 0.99,
intercept = TRUE,
verbose = F
)
Arguments
df |
data.frame to be analyzed |
group_df |
input data.frame containing variable names with group structure.
All variables in |
blearner |
Type of base learner. Default is "bols". |
outcome_name |
String indicating the name of the dependent variable. Default is "y". |
group_name |
Name of column in group_df indicating the group structure of the variables.
Default is "group_name". |
var_name |
Name of column in group_df containing the variable names
to be used as predictors. Default is "var_name". |
n_reps |
Number of samples to be drawn in each iteration |
iterations |
Number of iterations performed in the algorithm. Default is 15. |
nu |
Learning rate as the step size to move away from the current estimate.
Default is 0.5. |
red_fact |
Factor by which the learning rate is reduced if the algorithm overshoots,
meaning the loss increases. Default is 0.9. |
min_weights |
The minimum weight size to be used. Default is 0.01. |
max_weights |
The maximum weight size to be used. Default is 0.99. |
intercept |
Logical, should intercept be used? |
verbose |
Logical, should iteration be printed? |
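The interplay of nu, red_fact, min_weights, max_weights and iterations described above can be sketched with a toy loop in base R. This is only an illustration of the general pattern (take a step, clamp it to the allowed range, shrink the learning rate when the loss goes up), not the code used inside balance(). The names target, loss, step_signal and n_reductions are hypothetical stand-ins introduced for this sketch.

```r
# Toy stand-ins (assumptions for illustration, not part of sgboost)
target <- 0.3                                # hypothetical "balanced" weight
loss <- function(w) (w - target)^2           # hypothetical loss
step_signal <- function(w) 5 * (w - target)  # hypothetical update direction

# Values taken from the defaults in the Usage section
nu <- 0.5
red_fact <- 0.9
min_weights <- 0.01
max_weights <- 0.99
iterations <- 15

w <- 0.5                 # arbitrary starting weight
initial_loss <- loss(w)
n_reductions <- 0

for (i in seq_len(iterations)) {
  # Move away from the current estimate with step size nu
  proposal <- w - nu * step_signal(w)
  # Keep the weight inside [min_weights, max_weights]
  proposal <- min(max(proposal, min_weights), max_weights)

  if (loss(proposal) > loss(w)) {
    # Overshoot: the loss increased, so reject the step and shrink nu
    nu <- nu * red_fact
    n_reductions <- n_reductions + 1
  } else {
    w <- proposal
  }
}

final_loss <- loss(w)
c(weight = w, nu = nu, reductions = n_reductions, loss = final_loss)
```

In this toy setup the first proposals overshoot, so nu is reduced a few times before steps start being accepted. A larger red_fact shrinks nu more slowly, which means more rejected steps when the initial nu is too large.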
Value
Character containing the formula to be passed to mboost::mboost(),
yielding sparse-group boosting for a given value of the mixing parameter alpha.
Examples
library(sgboost) # provides create_formula()
library(mboost)
library(dplyr)
set.seed(1)

# Five predictors, standardized to mean 0 and standard deviation 1
df <- data.frame(
  x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100),
  x4 = rnorm(100), x5 = runif(100)
)
df <- df %>%
  mutate_all(function(x) {
    as.numeric(scale(x))
  })
df$y <- df$x1 + df$x4 + df$x5

# Two groups of unequal size: x1-x3 in group 1, x4-x5 in group 2
group_df <- data.frame(
  group_name = c(1, 1, 1, 2, 2),
  var_name = c("x1", "x2", "x3", "x4", "x5")
)

sgb_formula <- create_formula(alpha = 0.3, group_df = group_df)
sgb_model <- mboost(formula = sgb_formula, data = df)
summary(sgb_model)
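The example uses groups of unequal size (three and two variables). As a rough illustration of why selection frequencies may need balancing in that situation, the base R sketch below simulates an outcome that is unrelated to every predictor and checks which group reduces the residual sum of squares more under an unpenalized least-squares fit. It does not call balance() or mboost; the helper rss_reduction and the simulation settings are assumptions made for this sketch.

```r
set.seed(1)
n <- 100      # observations per simulated data set
n_sim <- 500  # number of simulated data sets

# Reduction in residual sum of squares from an unpenalized
# least-squares fit of y on X (with intercept)
rss_reduction <- function(X, y) {
  fit <- lm.fit(cbind(1, X), y)
  sum((y - mean(y))^2) - sum(fit$residuals^2)
}

big_group_wins <- logical(n_sim)
for (i in seq_len(n_sim)) {
  X_big <- matrix(rnorm(n * 3), n, 3)    # group with three variables
  X_small <- matrix(rnorm(n * 2), n, 2)  # group with two variables
  y <- rnorm(n)                          # outcome unrelated to all predictors
  big_group_wins[i] <- rss_reduction(X_big, y) > rss_reduction(X_small, y)
}

# Share of simulations in which the larger group fits the noise better
selection_freq_big <- mean(big_group_wins)
selection_freq_big
```

With equal treatment of both groups one would hope for a share near 0.5. In this unpenalized setting the larger group wins clearly more often, which is the kind of imbalance that adjusting degrees of freedom is meant to counteract.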