b_inter {bases} | R Documentation |
N-way interaction basis
Description
Generates a design matrix that contains all possible interactions of the
input variables up to a specified maximum depth.
The default "symbox" standardization, which maps inputs to [-0.5, 0.5]^d,
is strongly recommended, as it means that the interaction terms will have
smaller variance and thus be penalized more by methods like the Lasso or
ridge regression (see Gelman et al., 2008).
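The variance argument can be checked with a small base R sketch. It assumes that "symbox" is plain min-max rescaling followed by a shift of -0.5; `symbox_sketch` is a hypothetical helper written for illustration, not a function from the package.

```r
# Hypothetical helper: rescale a numeric vector to [-0.5, 0.5].
# Assumption: "symbox" is min-max scaling followed by a shift of -0.5.
symbox_sketch <- function(x) {
  (x - min(x)) / (max(x) - min(x)) - 0.5
}

hp_std <- symbox_sketch(mtcars$hp)
wt_std <- symbox_sketch(mtcars$wt)

# Each rescaled input lies in [-0.5, 0.5], so a pairwise product
# is confined to [-0.25, 0.25].
range(hp_std)
range(hp_std * wt_std)

# Compare the spread of a main effect with that of the interaction term.
var(hp_std)
var(hp_std * wt_std)
```

Because every factor has magnitude at most 0.5, each additional factor in a product can only shrink it, so deeper interactions end up on a smaller scale and a fixed penalty acts on them more strongly.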
Usage
b_inter(
...,
depth = 2,
stdize = c("symbox", "box", "scale", "none"),
shift = NULL,
scale = NULL
)
Arguments
... |
The variable(s) to build features for. A single data frame or matrix may be provided as well. Missing values are not allowed. |
depth |
The maximum interaction depth. The default is 2, which means that all pairwise interactions are included. |
stdize |
How to standardize the predictors, if at all. The default, "symbox", maps inputs to [-0.5, 0.5]^d. |
shift |
Vector of shifts, or single shift value, to use. If provided, overrides those calculated according to stdize. |
scale |
Vector of scales, or single scale value, to use. If provided, overrides those calculated according to stdize. |
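To make the role of depth concrete, here is a minimal base R sketch of one way such a design matrix could be assembled. It assumes that a depth-k term is the product of k distinct input columns, and that the single-variable columns are kept as well; both points are assumptions for illustration. `inter_sketch` is a hypothetical helper and does not reproduce the internals of b_inter.

```r
# Hypothetical helper: products of every subset of 1..depth distinct columns.
# Assumption: the depth-1 (main effect) columns are included in the output.
inter_sketch <- function(X, depth = 2) {
  X <- as.matrix(X)
  p <- ncol(X)
  cols <- lapply(seq_len(min(depth, p)), function(k) {
    # Each column of `idx` is one size-k subset of the column indices.
    idx <- combn(p, k)
    # Multiply the selected columns together, row by row.
    apply(idx, 2, function(j) apply(X[, j, drop = FALSE], 1, prod))
  })
  do.call(cbind, cols)
}

X <- as.matrix(mtcars[, c("cyl", "hp", "wt")])
Z <- inter_sketch(X, depth = 2)
ncol(Z)  # 3 single columns + choose(3, 2) = 3 pairwise products
```

In this sketch the inputs are left unscaled; in practice the products would be formed after standardization, so that shift and scale act on the inputs before they are multiplied.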
Value
A matrix with the rescaled and interacted features.
References
Gelman, A., Jakulin, A., Pittau, M. G., & Su, Y. S. (2008). A weakly informative default prior distribution for logistic and other regression models. The Annals of Applied Statistics, 2(4), 1360-1383.
Examples
# default: all pairwise interactions
lm(mpg ~ b_inter(cyl, hp, wt), mtcars)

# how number of features depends on interaction depth
for (d in 2:6) {
    X <- with(mtcars, b_inter(cyl, disp, hp, drat, wt, depth = d))
    print(ncol(X))
}
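
The growth in the number of features can also be reasoned about without the package. The sketch below counts product terms under the assumption that each term multiplies distinct variables (no squared terms); whether the single-variable columns are counted is left as a switch, since the text above does not say. `count_terms` is a hypothetical helper, and its output is not claimed to match ncol() of the matrices above.

```r
# Hypothetical helper: number of products of 1..depth distinct variables
# chosen from p inputs. Set main = FALSE to leave out single-variable terms.
count_terms <- function(p, depth, main = TRUE) {
  k <- seq_len(depth)
  if (!main) k <- k[k >= 2]
  # choose(p, k) is 0 for k > p, so depths beyond p add nothing.
  sum(choose(p, k))
}

# Five inputs, as in the loop above.
for (d in 2:6) {
  cat("depth", d, ":", count_terms(5, d), "terms with main effects,",
      count_terms(5, d, main = FALSE), "without\n")
}
```

Under these assumptions the count stops growing once depth reaches the number of inputs, because no product of distinct variables can use more factors than there are variables.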