optimSplit_dichotom {Qindex} | R Documentation |
Optimal Dichotomizing Predictors via Repeated Sample Splits
Description
Identifies the optimal dichotomizing predictors using repeated sample splits.
Usage
optimSplit_dichotom(
formula,
data,
include = quote(p1 > 0.15 & p1 < 0.85),
top = 1L,
nsplit,
...
)
split_dichotom(y, x, id, ...)
splits_dichotom(y, x, ids = rSplit(y, ...), ...)
## S3 method for class 'splits_dichotom'
quantile(x, probs = 0.5, ...)
Arguments
formula , y , x |
formula, e.g., |
data |
|
include |
(optional) language, inclusion criteria. Default quote(p1 > 0.15 & p1 < 0.85) |
top |
positive integer scalar, number of optimal dichotomizing predictors, default 1L |
nsplit , ... |
additional parameters for function rSplit |
id |
logical vector for helper function split_dichotom, indices of training ( |
ids |
(optional) list of logical vectors for helper function splits_dichotom, multiple copies of indices of repeated training-test sample splits. |
probs |
double scalar for helper function quantile.splits_dichotom, see quantile |
Details
Function optimSplit_dichotom identifies the optimal dichotomizing predictors via repeated sample splits. Specifically,
1. Generate multiple, i.e., repeated, training-test sample splits (via rSplit).
2. For each candidate predictor x_i, find the median-split-dichotomized regression model based on the repeated sample splits; see section Details on Helper Functions.
3. Limit the selection of the candidate predictors x's to a user-desired range of p_1 of the split-dichotomized regression models; see the explanation of p_1 in section Returns of Helper Functions.
4. Rank the candidate predictors x's in decreasing order of the absolute value of the regression coefficient estimate of the median-split-dichotomized regression models. The predictors at the top of this ranking are the optimal dichotomizing predictors.
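The filtering and ranking at the end of this procedure can be illustrated with a minimal base-R sketch. The data frame cand and its values are made up for illustration, and evaluating include with eval() inside that data frame is an assumption about how a quoted criterion could be applied, not a description of the package internals.

```r
# Made-up summaries of the median split-dichotomized model of four predictors
cand <- data.frame(
  predictor = c('x1', 'x2', 'x3', 'x4'),
  p1   = c(0.50, 0.92, 0.30, 0.60),
  coef = c(0.40, 1.50, -0.90, 0.10),
  stringsAsFactors = FALSE
)
include <- quote(p1 > 0.15 & p1 < 0.85)  # same form as the default in Usage
top <- 2L

# Keep only candidates whose p1 satisfies the inclusion criteria
keep <- cand[eval(include, envir = cand), ]
# Rank the remaining candidates by decreasing absolute coefficient
ranked <- keep[order(abs(keep$coef), decreasing = TRUE), ]
# The first `top` rows are the selected predictors
best <- head(ranked$predictor, n = top)
best
```

In this toy input, x2 has the largest coefficient but is excluded because its p1 lies outside the requested range.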
Value
Function optimSplit_dichotom returns an object of class 'optimSplit_dichotom', which is a list of dichotomizing functions, with the input formula and data as additional attributes.
Details on Helper Functions
Split-Dichotomized Regression Model
Helper function split_dichotom fits a univariable regression model on the test set with a dichotomized predictor, using a dichotomizing rule determined by recursive partitioning of the training set. Specifically, given a training-test sample split,
1. find the dichotomizing rule \mathcal{D} of the predictor x_0 given the response y_0 in the training set (via rpartD);
2. fit a univariable regression model of the response y_1 with the dichotomized predictor \mathcal{D}(x_1) in the test set.
Currently supported are Cox proportional hazards (coxph) regression for a Surv response, logistic (glm) regression for a logical response, and linear (lm) regression for a Gaussian response.
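As a rough illustration of a single split-dichotomized linear model, the base-R sketch below learns a cut point on the training rows and fits lm on the test rows. The functions stump_rule and split_dichotom_sketch are hypothetical: stump_rule is a simple one-split search standing in for rpartD, and treating TRUE in id as a training row is an assumed convention.

```r
# Hypothetical stand-in for rpartD: choose the cut point on the training
# predictor that minimizes the within-group sum of squares of the response
stump_rule <- function(y0, x0) {
  ux <- sort(unique(x0))
  cuts <- (ux[-1L] + ux[-length(ux)]) / 2  # midpoints between distinct values
  sse <- vapply(cuts, function(cut) {
    hi <- x0 >= cut
    sum((y0[hi] - mean(y0[hi]))^2) + sum((y0[!hi] - mean(y0[!hi]))^2)
  }, FUN.VALUE = numeric(1L))
  best <- cuts[which.min(sse)]
  function(x) x >= best  # dichotomizing rule D
}

# Sketch of one split-dichotomized linear model
# (assumption: `id` is TRUE for training rows, FALSE for test rows)
split_dichotom_sketch <- function(y, x, id) {
  rule <- stump_rule(y[id], x[id])  # learn D on the training set
  d1 <- rule(x[!id])                # apply D to the test predictor
  fit <- lm(y1 ~ d1, data = data.frame(y1 = y[!id], d1 = d1))
  attr(fit, 'rule') <- rule
  attr(fit, 'p1') <- mean(d1)       # proportion of test rows with D(x_1) = 1
  attr(fit, 'coef') <- unname(coef(fit)[2L])
  fit
}

set.seed(1)
x <- rnorm(200)
y <- 2 * (x > 0.3) + rnorm(200, sd = 0.5)
id <- rep(c(TRUE, FALSE), length.out = 200)  # alternate training and test rows
m <- split_dichotom_sketch(y, x, id)
attr(m, 'p1')
attr(m, 'coef')
```

The rule is estimated and evaluated on disjoint rows, so the test-set coefficient is not inflated by the search for the cut point.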
Split-Dichotomized Regression Models based on Repeated Training-Test Sample Splits
Helper function splits_dichotom fits multiple split-dichotomized regression models (split_dichotom) on the response y and predictor x, one for each copy of the repeated training-test sample splits.
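The repeated-splits idea can be sketched in base R as follows. Both random_splits (a stand-in for rSplit, whose actual arguments are not shown here) and fit_one (a simplified per-split fit that cuts at the training median) are hypothetical helpers, and TRUE is assumed to mark training rows.

```r
# Hypothetical stand-in for rSplit: `nsplit` random logical vectors,
# TRUE marking training rows (an assumed convention)
random_splits <- function(n, nsplit, train_frac = 0.8) {
  replicate(nsplit, {
    id <- logical(n)
    id[sample.int(n, size = floor(n * train_frac))] <- TRUE
    id
  }, simplify = FALSE)
}

# Simplified per-split fit: cut at the training median (a stand-in rule),
# then regress the test response on the dichotomized test predictor
fit_one <- function(y, x, id) {
  cut <- median(x[id])
  lm(y1 ~ d1, data = data.frame(y1 = y[!id], d1 = x[!id] >= cut))
}

set.seed(2)
x <- rnorm(150)
y <- x + rnorm(150)
ids <- random_splits(length(y), nsplit = 20L)
fits <- lapply(ids, function(id) fit_one(y, x, id))  # one model per split
length(fits)
```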
Quantile of Split-Dichotomized Regression Models
Helper function quantile.splits_dichotom is the S3 method of the generic function quantile for splits_dichotom objects. Specifically,
1. collect the univariable regression coefficient estimate from each of the split-dichotomized regression models;
2. find the nearest-even (i.e., type = 3) quantile of the coefficients from Step 1. By default, we use the median (i.e., probs = .5);
3. return the split-dichotomized regression model corresponding to the coefficient quantile selected in Step 2.
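A short base-R sketch shows why a type = 3 quantile suits this selection: it always returns one of the observed values, so the chosen coefficient can be traced back to a specific model. The coefficient values below are made up, and using match() to locate the model is an illustrative choice rather than the package's documented implementation.

```r
# Made-up coefficient estimates, one per split-dichotomized model
coefs <- c(0.42, -0.10, 0.87, 0.35, 0.51, 0.29)

# type = 3 returns an order statistic, never an interpolated value
q <- quantile(coefs, probs = 0.5, type = 3)

# Index of the model whose coefficient equals the selected quantile
idx <- match(q, coefs)
idx
```

With an interpolating quantile type (such as R's default type = 7) and an even number of models, the median would generally fall between two coefficients and correspond to no fitted model.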
Returns of Helper Functions
Helper function split_dichotom returns a split-dichotomized regression model, which is either a Cox proportional hazards (coxph), a logistic (glm), or a linear (lm) regression model, with additional attributes
attr(,'rule')
function, dichotomizing rule \mathcal{D} based on the training set
attr(,'text')
character scalar, human-friendly description of \mathcal{D}
attr(,'p1')
double scalar, p_1 = \text{Pr}(\mathcal{D}(x_1)=1)
attr(,'coef')
double scalar, univariable regression coefficient estimate of y_1\sim\mathcal{D}(x_1)
Helper function splits_dichotom returns a list of split-dichotomized regression models (split_dichotom).
Helper function quantile.splits_dichotom returns a split-dichotomized regression model (split_dichotom).
Examples
# see ?`Qindex-package`