ice {gfoRmulaICE}R Documentation

Iterative Conditional Expectation Estimator

Description

This function implements iterative conditional expectation (ICE) estimators under user-defined treatment strategies. Available ICE estimators are classical and hazard-based pooling over treatment history ICE, classical and hazard-based stratifying on treatment history ICE, and doubly robust ICE estimators. See Wen et al. (2021) for more details regarding the parametric g-formula iterative conditional expectation estimator.

Usage

ice(
  data,
  time_points,
  id,
  time_name,
  outcome_name,
  censor_name = NULL,
  compevent_name = NULL,
  comp_effect = 0,
  outcome_model,
  censor_model = NULL,
  competing_model = NULL,
  hazard_model = NULL,
  global_hazard = FALSE,
  ref_idx = 0,
  estimator,
  int_descript,
  ci_method = "percentile",
  nsamples = 0,
  seed = 1,
  coverage = 95,
  parallel = FALSE,
  ncores = 2,
  verbose = TRUE,
  ...
)

Arguments

data

a data frame containing the observed data in long format.

time_points

a number indicating the total number of time points.

id

a string indicating the ID variable name in data.

time_name

a string specifying the time variable name in data.

outcome_name

a string specifying the outcome variable name in data.

censor_name

a string specifying the censor variable name in data. Default is NULL.

compevent_name

a string specifying the competing variable name in data. Default is NULL.

comp_effect

a number indicating how the competing event is handled for all the specified interventions. Default is 0. 0 for controlled direct effect. 1 for total effect.

outcome_model

a formula specifying the model statement for the outcome.

censor_model

a formula specifying the model statement for the censoring event. Default is NULL.

competing_model

a formula specifying the model statement for the competing event. Default is NULL.

hazard_model

a formula specifying the model statement for the hazard, if hazard-based estimator is used. Default is NULL. If specified, the model in hazard_model will be used. If NULL, the model in outcome_model will be used.

global_hazard

a logical value indicating whether to use global pooled-over-time hazard model or time-specific hazard models, for hazard-based pooled ICE only. If TRUE, use pooled-over-time hazard model. If FALSE, use time-specific hazard models. Default is FALSE.

ref_idx

a number indicating which intervention to be used as the reference to calculate the risk ratio and risk difference. Default is 0. 0 refers to the natural course as the reference intervention. Any other numbers refer to the corresponding intervention that users specify in the keyword arguments.

estimator

a function specifying which ICE estimator to use for the estimation. Please see Details for all available specifications.

int_descript

a vector of strings containing descriptions for each specified intervention.

ci_method

a string specifying the method for calculating the confidence interval, if nsamples is larger than 0. Possible values are "percentile" and "normal." Default is "percentile."

nsamples

a number larger than 0 indicating the number of bootstrap samples. Default is 0.

seed

a number indicating the starting seed for bootstrapping. Default is 1.

coverage

a number greater than 0 and less than 100 indicating the coverage of the confidence interval. Default is 95.

parallel

a logical value indicating whether to parallelize the bootstrap process. Default is FALSE.

ncores

a number indicating the number of CPU cores to use in parallel computing. Default is 2.

verbose

a logical specifying whether progress of the algorithm is printed. Default is TRUE.

...

keyword arguments to specify intervention inputs. If stratified ICE is used, keyword arguments also allow intervention-specific outcome models and competing models.
To specify interventions, please follow the input convention below:

  • Each intervention is specified using the keyword argument name with intervention prefix.

  • Use i after intervention prefix in keyword argument name to represent the ith strategy.

  • Use . followed with treatment variable name after interventioni in keyword argument name to represent the treatment name within the ith strategy.

Each input of intervention keyword arguments is a list consisting of a vector of intervened values and an optional vector of time points on which the intervention is applied. If the intervention time points are not specified, the intervention is applied to all time points. For example, an input considers a simultaneous intervention with always treat on A1 and never treat on A2 at all time points looks like:

intervention1.A1 = list(static(1))
intervention1.A2 = list(static(0))

The above intervention applies to all time points. The following is an example of custom intervention time points, with always treat on A1 at time point 1 and 2 and never treat on A2 at time point 3 to 5.

intervention1.A1 = list(static(1), 1:2)
intervention1.A2 = list(static(0), 3:5)

If there is no intervention keyword argument specified, the function returns the natural course risk only. Please see the "Examples" section for more examples.

To specify different outcome model and/or competing model for different intervention, please follow the input convention below:

  • Each outcome model is specified using keyword argument name starting with outcomeModel or compModel prefix for outcome model or competing model correspondingly.

  • Use .n after outcomeModel or compModel prefix in keyword argument name to specify which intervention being applied to, where n represents the nth intervention.

The input to each outcome or competing model keyword argument is a model statement formula. If no outcome model and competing model keyword argument is specified, the models specified in outcome_model and comp_model are used. Please refer to the "Examples" section for more examples.

Details

Users could specify which version ICE estimator to use through estimator.

To provide flexible choices on model inputs for stratified ICE and doubly robust ICE, we allow users to specify intervention-specific model statements through keyword arguments. In the case where intervention-specific model statements are specified, treatment variables that are not intervened under some strategies will be considered as a covariate and automatically added into the model specification at each time point. Please see more details on how to specify intervention-specific model specifications in the "Arguments" section.

For example, the following input specifies Y ~ L1 as outcome model for intervention 1, D ~ L1 as competing model for intervention 2, D ~ L1 + L2 as competing model for intervention 1, Y ~ L1 + L2 as outcome model for intervention 2.

fit_hazard_strat <- ice(data = data, K = 4, id = "id", time_name = "t0",
outcome_name = "Y", censor_name = "C", competing_name = "D",
estimator = strat(hazard = TRUE), comp_effect = 1,
censor_model = C ~ L1 + L2, ref_idx = 0,
int_descript = c("Static Intervention", "Dynamic Intervention"),
outcome_model = Y ~ L1 + L2,
competing_model = D ~ L1 + L2,
intervention1.A1 = list(static(0)),
intervention1.A2 = list(static(1)),
intervention2.A1 = list(dynamic("L1 > 0", static(0), static(1), absorb = FALSE)),
intervention2.A2 = list(dynamic("L2 == 0", static(0), static(1), absorb = TRUE)),
outcomeModel.1 = Y ~ L1,
compModel.2 = D ~ L1

Because the keyword argument for outcome model is not specified for intervention 2, the outcome model for intervention 2 is Y ~ L1 + L2 as specified in outcome_model. Similarly, because the keyword argument for competing model is not specified for intervention 1, the competing model for intervention 1 is D ~ L1 + L2 as specified in competing_model. In the case of controlled direct effect, the keyword arguments for competing models are ignored. Please see more examples in the "Examples" section.

Both built-in interventions and user-defined interventions are available.

The following are the built-in intervention functions in the package:

The following is the user-defined intervention:

Please see examples in the "Examples" section.

In order to obtain an inverse probability (IP) weighted natural course risk based on the observed data, users must specify a censoring variable through censor_name and a corresponding censoring model through censor_model. Please see Chiu et al. (2023) for more details regarding the IP weighted estimate of the natural course risk.

If competing event exists in the data, users need to specify the name of the competing variable through competing_name and the model specification through competing_model for hazard-based ICE estimator. Users need to specify whether to treat the competing event as censoring or total effect through total_effect.

We provide flexible term options in model specification for the outcome, censoring, competing, and hazard model. Users could specify polynomial terms using functions I and poly and spline terms using ns from splines package and rcspline.eval from Hmisc package. In addition, users could specify lagged terms using the format lagn_var to indicate lagging the variable var with n periods. If the lagged variable is a treatment variable, this variable is automatically intervened based on user-defined intervention. The polynomial and spline terms could be used on lagged variables.

Value

A list containing the following components. Each component that contains the fitted models includes the model fits, the summary of the fitted model, standard errors of the coefficients, variance-covariance matrices of the parameters, and the root mean square error (RMSE) values.

estimator.type

A string describing the type of the estimator.

summary

A summary table containing the estimated risk, risk ratio, and risk difference for user-defined interventions including estimated natural course risk and the observed risk. If nsamples is greater than 0, the summary table includes standard error and confidence interval for the point estimates.

risk.over.time

A data frame containing the estimated risk at each time point for each intervention.

initial.outcome

A list, where the name of each sublist corresponds to each specified intervention description, and each sublist contains the fitted models for the outcome model in the first step of algorithm.

initial.comp

A list, where the name of each sublist corresponds to each specified intervention description, and each sublist contains the fitted models for the competing model in the first step of algorithm (if applicable).

np.risk.model

A list containing the fitted models for the censoring and/or competing model in estimating observed risk (if applicable).

outcome.models.by.step

A list, where the name of each sublist corresponds to each specified intervention description, and each sublist contains the fitted models for the outcome model in each iteration of algorithm.

comp.models.by.step

A list, where the name of each sublist corresponds to each specified intervention description, and each sublist contains the fitted models for the competing model in each iteration of algorithm (if applicable).

hazard.models.by.step

A list, where the name of each sublist corresponds to each specified intervention description, and each sublist contains the fitted models for the hazard model (if applicable), either time-specific models at all time points or one pooled-over-time global model.

boot.data

A list of bootstrap samples. If nsamples is set to 0, a NULL value is returned.

boot.initial.outcome

A list, where the name of each sublist corresponds to each specified intervention description, and each sublist contains the fitted models for the outcome model in the first step of algorithm on the bootstrap samples. If nsamples is set to 0, a NULL value is returned.

boot.initial.comp

A list, where the name of each sublist corresponds to each specified intervention description, and each sublist contains the fitted models for the competing model in the first step of algorithm on the bootstrap samples (if applicable). If nsamples is set to 0, a NULL value is returned.

boot.np.risk.model

A list containing the fitted models for the censoring and/or competing model in estimating observed risk on the bootstrap samples (if applicable). If nsamples is set to 0, a NULL value is returned.

boot.outcome.models.by.step

A list, where the name of each sublist corresponds to each specified intervention description, and each sublist contains the fitted models for the outcome model in each iteration of algorithm on the bootstrap samples (if applicable). If nsamples is set to 0, a NULL value is returned.

boot.comp.models.by.step

A list, where the name of each sublist corresponds to each specified intervention description, and each sublist contains the fitted models for the competing model in each iteration of algorithm on the bootstrap samples (if applicable). If nsamples is set to 0, a NULL value is returned.

boot.hazard.models.by.step

A list, where the name of each sublist corresponds to each specified intervention description, and each sublist contains the fitted models for the hazard model (if applicable), either time-specific models at all time points or one pooled-over-time global model, on the bootstrap samples. If nsamples is set to 0, a NULL value is returned.

References

Wen L, Young JG, Robins JM, Hernán MA. Parametric g-formula implementations for causal survival analyses. Biometrics. 2021;77(2):740-753.

McGrath S, Lin V, Zhang Z, Petito LC, Logan RW, Hernán MA, and JG Young. gfoRmula: An R package for estimating the effects of sustained treatment strategies via the parametric g-formula. Patterns. 2020;1:100008.

Young JG, Herńan MA, Robins JM. Identification, estimation and approximation of risk under interventions that depend on the natural value of treatment using observational data. Epidemiologic Methods. 2014;3(1):1-19.

Young JG, Vatsa R, Murray EJ, Hernán MA. Interval-cohort designs and bias in the estimation of per-protocol effects: a simulation study. Trials. 2019;20(1):552.

Díaz, I, Williams, N, Hoffman, KL, & Schenck, EJ. Nonparametric causal effects based on longitudinal modified treatment policies. Journal of the American Statistical Association. 2021;118(542), 846–857.

Young JG, Stensrud MJ, Tchetgen Tchetgen EJ, Hernán MA. A causal framework for classical statistical estimands in failure-time settings with competing events. Statistics in medicine. 2020;39(8):1199-1236.

Wen L, Hernán MA, Robins JM. Multiply robust estimators of causal effects for survival outcomes. Scandinavian journal of statistics, theory and applications. 2022;49(3):1304-1328.

Haneuse S, Rotnitzky A. Estimation of the effect of interventions that modify the received treatment. Statistics in medicine. 2013;32(30):5260-5277.

McGrath S, Young JG, Hernán MA. Revisiting the g-null Paradox. Epidemiology. 2022;33(1):114-120.

Chiu YH, Wen L, McGrath S, Logan R, Dahabreh IJ, Hernán MA. Evaluating model specification when using the parametric g-formula in the presence of censoring. American Journal of Epidemiology. 2023;192:1887–1895.

Examples


data <- gfoRmulaICE::compData

# Example: Built-in and User-defined Interventions

# We consider the following interventions. 
# We intervene at all time points.
# Intervention 1 on A1: always treat with value 3.
# Intervention 2 on L2: when the natural value of L2 at time t is lower than -3, 
# set its value to -3. Otherwise, do not intervene.
# Intervention 3 on A2: dynamic intervention (treat when L1 = 0) 
# with uniform grace period of 2 periods
# Intervention 4 on A2: at time t, if L1 = 0, then treat; otherwise, not treat.


# We use classical pooled ICE estimator, 
# natural course as the reference intervention, 
# and the following models:
# a. outcome model: Y ~ L1 + L2 + A1 + A2
# b. censor model: C ~ L1 + L2 + A1 + A2.

ice_fit <- ice(
data = data, 
time_points = 4, 
id = "id", 
time_name = "t0",
censor_name = "C", 
outcome_name = "Y",
compevent_name = "D",
comp_effect = 0,
outcome_model = Y ~ L1 + L2 + A1 + A2, 
censor_model = C ~ L1 + L2 + A1 + A2,
ref_idx = 0,
estimator = pool(hazard = FALSE),
int_descript = c("Static Intervention", "Threshold Intervention", 
"Dynamic Intervention with Grace Period", "Dynamic Intervention"),
intervention1.A1 = list(static(3)),
intervention2.L2 = list(threshold(-3, Inf)),
intervention3.A2 = list(grace_period("uniform", 2, "L1 == 0")),
intervention4.A2 = list(dynamic("L1 == 0", static(0), static(1)))
)

summary(ice_fit)

plot(ice_fit)




[Package gfoRmulaICE version 0.1.0 Index]