decide_variable_type_univariate {SplitWise} | R Documentation
Decide Variable Type (Univariate)
Description
For each numeric predictor, this function fits a shallow (maxdepth = 2) rpart tree directly on Y ~ x and tests whether a dummy transformation improves model fit.
Usage
decide_variable_type_univariate(
X,
Y,
minsplit = 5,
criterion = c("AIC", "BIC"),
exclude_vars = NULL,
verbose = FALSE
)
Arguments
X |
A data frame of numeric predictors (no response). |
Y |
A numeric response vector. |
minsplit |
Minimum number of observations in a node to consider splitting. Default = 5. |
criterion |
A character string: either "AIC" or "BIC". |
exclude_vars |
A character vector of variable names to exclude from dummy transformations.
These variables will always be treated as linear. Default = NULL. |
verbose |
Logical; controls verbose output. Default = FALSE. |
Details
Dummy forms come from a shallow (maxdepth = 2) rpart tree fit to the data. We extract up to two splits:
- Single cutoff dummy (e.g., x >= c)
- Double cutoff dummy (e.g., c1 < x < c2)
The function then picks the form (linear, single-split dummy, or double-split dummy)
that yields the lowest AIC/BIC. If a variable is listed in exclude_vars, it will always be used
as a linear predictor (dummy transformation is never attempted).
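
The selection described above can be sketched for a single predictor as follows. This is a minimal illustration under stated assumptions, not the package's actual implementation: the helper name choose_form, the use of lm() for the candidate models, the choice of the first two distinct split points, and the simulated data are all hypothetical.

```r
library(rpart)

# Hypothetical helper (not part of SplitWise): score the linear form and the
# dummy forms suggested by a shallow tree, then keep the lowest criterion.
choose_form <- function(x, y, minsplit = 5, criterion = c("AIC", "BIC")) {
  criterion <- match.arg(criterion)
  ic <- if (criterion == "AIC") stats::AIC else stats::BIC

  # Shallow tree on y ~ x. Competitor and surrogate splits are switched off
  # so that fit$splits contains primary splits only.
  dat <- data.frame(x = x, y = y)
  fit <- rpart(y ~ x, data = dat, method = "anova",
               control = rpart.control(maxdepth = 2, minsplit = minsplit,
                                       maxcompete = 0, maxsurrogate = 0))

  # Keep at most two distinct split points (assumption: the first two found).
  cuts <- if (is.null(fit$splits)) {
    numeric(0)
  } else {
    head(unique(unname(fit$splits[, "index"])), 2)
  }

  # Score the linear form, then each dummy form the tree supports.
  scores <- c(linear = ic(lm(y ~ x, data = dat)))
  if (length(cuts) >= 1) {
    dat$d1 <- as.numeric(dat$x >= cuts[1])                 # x >= c
    scores["single"] <- ic(lm(y ~ d1, data = dat))
  }
  if (length(cuts) == 2) {
    rng <- sort(cuts)
    dat$d2 <- as.numeric(dat$x > rng[1] & dat$x < rng[2])  # c1 < x < c2
    scores["double"] <- ic(lm(y ~ d2, data = dat))
  }

  best <- names(scores)[which.min(scores)]
  cutoffs <- switch(best, linear = NULL, single = cuts[1], double = sort(cuts))
  list(form = best, cutoffs = cutoffs, scores = scores)
}

# Simulated step-shaped relationship (illustrative data only).
set.seed(1)
x <- runif(200)
y <- 3 * (x >= 0.5) + rnorm(200, sd = 0.3)
res <- choose_form(x, y)
res$form
res$scores
```

Because the simulated response jumps at a threshold, the single-cutoff dummy should score better than the linear form here; on real data the ranking depends on the shape of the relationship.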
Value
A named list of decisions, where each element is a list with:
- type: Either "dummy" or "linear".
- cutoffs: A numeric vector (length 1 or 2) if type = "dummy", or NULL if linear.
- tree_model: The fitted rpart model (for reference), or NULL if excluded.
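
As an illustration of working with a result of this shape, the sketch below builds a mock decisions list and turns each decision into a model-ready column. The variable names, cutoff values, and the helper apply_decision are hypothetical. The inequalities follow the examples in Details (x >= c and c1 < x < c2), and the two cutoffs are assumed to be stored in ascending order.

```r
# Mock result with the documented shape. Variable names and cutoffs are
# invented for illustration; tree_model is left as NULL to keep the mock small.
decisions <- list(
  age    = list(type = "dummy",  cutoffs = 40,        tree_model = NULL),
  dose   = list(type = "dummy",  cutoffs = c(10, 25), tree_model = NULL),
  weight = list(type = "linear", cutoffs = NULL,      tree_model = NULL)
)

# Hypothetical helper: convert one predictor according to its decision.
apply_decision <- function(x, d) {
  if (d$type == "linear") return(x)
  if (length(d$cutoffs) == 1) {
    as.numeric(x >= d$cutoffs)                       # single cutoff: x >= c
  } else {
    as.numeric(x > d$cutoffs[1] & x < d$cutoffs[2])  # double cutoff: c1 < x < c2
  }
}

# Which variables were assigned a dummy form?
types <- vapply(decisions, function(d) d$type, character(1))
dummy_vars <- names(types)[types == "dummy"]

age_dummy  <- apply_decision(c(30, 40, 55), decisions$age)   # 0 1 1
dose_dummy <- apply_decision(c(5, 12, 30), decisions$dose)   # 0 1 0
```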