ice {gfoRmulaICE} | R Documentation |
Iterative Conditional Expectation Estimator
Description
This function implements iterative conditional expectation (ICE) estimators under user-defined treatment strategies. Available ICE estimators are classical and hazard-based pooling over treatment history ICE, classical and hazard-based stratifying on treatment history ICE, and doubly robust ICE estimators. See Wen et al. (2021) for more details regarding the parametric g-formula iterative conditional expectation estimator.
Usage
ice(
data,
time_points,
id,
time_name,
outcome_name,
censor_name = NULL,
compevent_name = NULL,
comp_effect = 0,
outcome_model,
censor_model = NULL,
competing_model = NULL,
hazard_model = NULL,
global_hazard = FALSE,
ref_idx = 0,
estimator,
int_descript,
ci_method = "percentile",
nsamples = 0,
seed = 1,
coverage = 95,
parallel = FALSE,
ncores = 2,
verbose = TRUE,
...
)
Arguments
data |
a data frame containing the observed data in long format. |
time_points |
a number indicating the total number of time points. |
id |
a string indicating the ID variable name in |
time_name |
a string specifying the time variable name in |
outcome_name |
a string specifying the outcome variable name in |
censor_name |
a string specifying the censor variable name in |
compevent_name |
a string specifying the competing variable name in |
comp_effect |
a number indicating how the competing event is handled for all the specified interventions. Default is 0. 0 for controlled direct effect. 1 for total effect. |
outcome_model |
a formula specifying the model statement for the outcome. |
censor_model |
a formula specifying the model statement for the censoring event. Default is |
competing_model |
a formula specifying the model statement for the competing event. Default is |
hazard_model |
a formula specifying the model statement for the hazard, if hazard-based estimator is used. Default is |
global_hazard |
a logical value indicating whether to use global pooled-over-time hazard model or time-specific hazard models, for hazard-based pooled ICE only.
If |
ref_idx |
a number indicating which intervention to be used as the reference to calculate the risk ratio and risk difference. Default is 0. 0 refers to the natural course as the reference intervention. Any other numbers refer to the corresponding intervention that users specify in the keyword arguments. |
estimator |
a function specifying which ICE estimator to use for the estimation. Please see Details for all available specifications. |
int_descript |
a vector of strings containing descriptions for each specified intervention. |
ci_method |
a string specifying the method for calculating the confidence interval, if |
nsamples |
a number larger than 0 indicating the number of bootstrap samples. Default is 0. |
seed |
a number indicating the starting seed for bootstrapping. Default is 1. |
coverage |
a number greater than 0 and less than 100 indicating the coverage of the confidence interval. Default is 95. |
parallel |
a logical value indicating whether to parallelize the bootstrap process. Default is |
ncores |
a number indicating the number of CPU cores to use in parallel computing. Default is 2. |
verbose |
a logical specifying whether progress of the algorithm is printed. Default is TRUE. |
... |
keyword arguments to specify intervention inputs. If stratified ICE is used, keyword arguments also allow intervention-specific outcome models and competing models.
Each input of intervention keyword arguments is a list consisting of a vector of intervened values and an optional vector of time points on which the intervention is applied.
If the intervention time points are not specified, the intervention is applied to all time points. For example,
an input considers a simultaneous intervention with always treat on A1 and never treat on A2 at all time points looks like: To specify different outcome model and/or competing model for different intervention, please follow the input convention below:
The input to each outcome or competing model keyword argument is a model statement formula.
If no outcome model and competing model keyword argument is specified, the models specified in |
Details
Users could specify which version ICE estimator to use through estimator
.
pool(hazard = FALSE)
specifies the classical pooling over treatment history ICE estimator.pool(hazard = TRUE)
specifies the hazard-based pooling over treatment history ICE estimator.strat(hazard = FALSE)
specifies the classical stratifying on treatment history ICE estimator.strat(hazard = TRUE)
specifies the hazard-based stratifying on treatment history ICE estimator.weight(treat_model)
specifies the doubly robust weighted ICE estimator wheretreat_model
specifies the treatment model.
To provide flexible choices on model inputs for stratified ICE and doubly robust ICE, we allow users to specify intervention-specific model statements through keyword arguments. In the case where intervention-specific model statements are specified, treatment variables that are not intervened under some strategies will be considered as a covariate and automatically added into the model specification at each time point. Please see more details on how to specify intervention-specific model specifications in the "Arguments" section.
For example, the following input specifies Y ~ L1
as outcome model for intervention 1,
D ~ L1
as competing model for intervention 2,
D ~ L1 + L2
as competing model for intervention 1,
Y ~ L1 + L2
as outcome model for intervention 2.
fit_hazard_strat <- ice(data = data, K = 4, id = "id", time_name = "t0",
outcome_name = "Y", censor_name = "C", competing_name = "D",
estimator = strat(hazard = TRUE), comp_effect = 1,
censor_model = C ~ L1 + L2, ref_idx = 0,
int_descript = c("Static Intervention", "Dynamic Intervention"),
outcome_model = Y ~ L1 + L2,
competing_model = D ~ L1 + L2,
intervention1.A1 = list(static(0)),
intervention1.A2 = list(static(1)),
intervention2.A1 = list(dynamic("L1 > 0", static(0), static(1), absorb = FALSE)),
intervention2.A2 = list(dynamic("L2 == 0", static(0), static(1), absorb = TRUE)),
outcomeModel.1 = Y ~ L1,
compModel.2 = D ~ L1
Because the keyword argument for outcome model is not specified for intervention 2, the outcome model for intervention 2 is
Y ~ L1 + L2
as specified in outcome_model
.
Similarly, because the keyword argument for competing model is not specified for intervention 1, the competing model for intervention 1 is
D ~ L1 + L2
as specified in competing_model
.
In the case of controlled direct effect, the keyword arguments for competing models are ignored. Please see more examples in the "Examples" section.
Both built-in interventions and user-defined interventions are available.
The following are the built-in intervention functions in the package:
Static Intervention:
static(value)
specifies a constant intervention withvalue
.Dynamic Intervention:
dynamic(condition, strategy_before, strategy_after, absorb)
specifies a dynamic intervention where the strategy instrategy_before
is followed untilcondition
is met. Uponcondition
is met, the strategy instrategy_after
is followed. If absorb isTRUE
, the intervention becomes absorbing (always treat with the specified strategy) oncecondition
is met.Threshold Intervention:
threshold(lower_bound, upper_bound)
specifies a threshold intervention. If the treatment value is betweenlower_bound
andupper_bound
inclusively, follow the natural value of the treatment. Otherwise, set tolower_bound
orupper_bound
, if the treatment value is belowlower_bound
or aboveupper_bound
, correspondingly.Grace Period:
grace_period(type, nperiod, condition)
specifies a dynamic intervention with grace period. Oncecondition
is met, the intervention is initiated withinnperiod
time units. During the grace period, the treatment variable follows its natural value or initiate intervention with a uniform distribution at each time point.
The following is the user-defined intervention:
User-defined Interventions: The output of the user-defined intervention should contain the intervened value for each individual at each time point, and should be of the same size as the number of rows in
data
.
Please see examples in the "Examples" section.
In order to obtain an inverse probability (IP) weighted natural course risk based on the observed data, users must specify
a censoring variable through censor_name
and a corresponding censoring model through censor_model
. Please see
Chiu et al. (2023) for more details regarding the IP weighted estimate of the natural course risk.
If competing event exists in the data, users need to specify the name of the competing variable through competing_name
and
the model specification through competing_model
for hazard-based ICE estimator. Users need to specify whether to treat
the competing event as censoring or total effect through total_effect
.
We provide flexible term options in model specification for the outcome, censoring, competing, and hazard model.
Users could specify polynomial terms using functions I
and poly
and spline terms using ns
from splines package
and rcspline.eval
from Hmisc package. In addition, users could specify lagged terms using the format lagn
_var
to indicate lagging the variable var with n periods.
If the lagged variable is a treatment variable, this variable is automatically intervened based on user-defined intervention.
The polynomial and spline terms could be used on lagged variables.
Value
A list containing the following components. Each component that contains the fitted models includes the model fits, the summary of the fitted model, standard errors of the coefficients, variance-covariance matrices of the parameters, and the root mean square error (RMSE) values.
estimator.type |
A string describing the type of the estimator. |
summary |
A summary table containing the estimated risk, risk ratio, and risk difference for user-defined interventions including estimated natural course risk and the observed risk.
If |
risk.over.time |
A data frame containing the estimated risk at each time point for each intervention. |
initial.outcome |
A list, where the name of each sublist corresponds to each specified intervention description, and each sublist contains the fitted models for the outcome model in the first step of algorithm. |
initial.comp |
A list, where the name of each sublist corresponds to each specified intervention description, and each sublist contains the fitted models for the competing model in the first step of algorithm (if applicable). |
np.risk.model |
A list containing the fitted models for the censoring and/or competing model in estimating observed risk (if applicable). |
outcome.models.by.step |
A list, where the name of each sublist corresponds to each specified intervention description, and each sublist contains the fitted models for the outcome model in each iteration of algorithm. |
comp.models.by.step |
A list, where the name of each sublist corresponds to each specified intervention description, and each sublist contains the fitted models for the competing model in each iteration of algorithm (if applicable). |
hazard.models.by.step |
A list, where the name of each sublist corresponds to each specified intervention description, and each sublist contains the fitted models for the hazard model (if applicable), either time-specific models at all time points or one pooled-over-time global model. |
boot.data |
A list of bootstrap samples. If |
boot.initial.outcome |
A list, where the name of each sublist corresponds to each specified intervention description, and each sublist contains the fitted models for the outcome model in the first step of algorithm on the bootstrap samples.
If |
boot.initial.comp |
A list, where the name of each sublist corresponds to each specified intervention description, and each sublist contains the fitted models for the competing model in the first step of algorithm on the bootstrap samples (if applicable).
If |
boot.np.risk.model |
A list containing the fitted models for the censoring and/or competing model in estimating observed risk on the bootstrap samples (if applicable).
If |
boot.outcome.models.by.step |
A list, where the name of each sublist corresponds to each specified intervention description, and each sublist contains the fitted models for the outcome model in each iteration of algorithm on the bootstrap samples (if applicable).
If |
boot.comp.models.by.step |
A list, where the name of each sublist corresponds to each specified intervention description, and each sublist contains the fitted models for the competing model in each iteration of algorithm on the bootstrap samples (if applicable).
If |
boot.hazard.models.by.step |
A list, where the name of each sublist corresponds to each specified intervention description, and each sublist contains the fitted models for the hazard model (if applicable), either time-specific models at all time points or one pooled-over-time global model, on the bootstrap samples.
If |
References
Wen L, Young JG, Robins JM, Hernán MA. Parametric g-formula implementations for causal survival analyses. Biometrics. 2021;77(2):740-753.
McGrath S, Lin V, Zhang Z, Petito LC, Logan RW, Hernán MA, and JG Young. gfoRmula: An R package for estimating the effects of sustained treatment strategies via the parametric g-formula. Patterns. 2020;1:100008.
Young JG, Herńan MA, Robins JM. Identification, estimation and approximation of risk under interventions that depend on the natural value of treatment using observational data. Epidemiologic Methods. 2014;3(1):1-19.
Young JG, Vatsa R, Murray EJ, Hernán MA. Interval-cohort designs and bias in the estimation of per-protocol effects: a simulation study. Trials. 2019;20(1):552.
Díaz, I, Williams, N, Hoffman, KL, & Schenck, EJ. Nonparametric causal effects based on longitudinal modified treatment policies. Journal of the American Statistical Association. 2021;118(542), 846–857.
Young JG, Stensrud MJ, Tchetgen Tchetgen EJ, Hernán MA. A causal framework for classical statistical estimands in failure-time settings with competing events. Statistics in medicine. 2020;39(8):1199-1236.
Wen L, Hernán MA, Robins JM. Multiply robust estimators of causal effects for survival outcomes. Scandinavian journal of statistics, theory and applications. 2022;49(3):1304-1328.
Haneuse S, Rotnitzky A. Estimation of the effect of interventions that modify the received treatment. Statistics in medicine. 2013;32(30):5260-5277.
McGrath S, Young JG, Hernán MA. Revisiting the g-null Paradox. Epidemiology. 2022;33(1):114-120.
Chiu YH, Wen L, McGrath S, Logan R, Dahabreh IJ, Hernán MA. Evaluating model specification when using the parametric g-formula in the presence of censoring. American Journal of Epidemiology. 2023;192:1887–1895.
Examples
data <- gfoRmulaICE::compData
# Example: Built-in and User-defined Interventions
# We consider the following interventions.
# We intervene at all time points.
# Intervention 1 on A1: always treat with value 3.
# Intervention 2 on L2: when the natural value of L2 at time t is lower than -3,
# set its value to -3. Otherwise, do not intervene.
# Intervention 3 on A2: dynamic intervention (treat when L1 = 0)
# with uniform grace period of 2 periods
# Intervention 4 on A2: at time t, if L1 = 0, then treat; otherwise, not treat.
# We use classical pooled ICE estimator,
# natural course as the reference intervention,
# and the following models:
# a. outcome model: Y ~ L1 + L2 + A1 + A2
# b. censor model: C ~ L1 + L2 + A1 + A2.
ice_fit <- ice(
data = data,
time_points = 4,
id = "id",
time_name = "t0",
censor_name = "C",
outcome_name = "Y",
compevent_name = "D",
comp_effect = 0,
outcome_model = Y ~ L1 + L2 + A1 + A2,
censor_model = C ~ L1 + L2 + A1 + A2,
ref_idx = 0,
estimator = pool(hazard = FALSE),
int_descript = c("Static Intervention", "Threshold Intervention",
"Dynamic Intervention with Grace Period", "Dynamic Intervention"),
intervention1.A1 = list(static(3)),
intervention2.L2 = list(threshold(-3, Inf)),
intervention3.A2 = list(grace_period("uniform", 2, "L1 == 0")),
intervention4.A2 = list(dynamic("L1 == 0", static(0), static(1)))
)
summary(ice_fit)
plot(ice_fit)