atekCl {PND.heter.cluster}    R Documentation

Estimation of the cluster-specific treatment effects in the partially nested design.

Description

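Estimates the cluster-specific treatment effects in a partially nested design, in which individuals in the treatment arm are assigned to clusters and individuals in the control arm are not.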

Usage

atekCl(
  data_in,
  ttname,
  Kname,
  Yname,
  Xnames,
  Yfamily = "gaussian",
  learners_tt = c("SL.glm"),
  learners_k = c("SL.multinom"),
  learners_y = c("SL.glm"),
  sensitivity = NULL,
  cv_folds = 4L,
  seed = NULL
)

Arguments

data_in

A data.frame containing all necessary variables.

ttname

[character]
A character string of the column name of the treatment variable. The treatment variable should be dummy-coded, with 1 for the (clustered) treatment arm and 0 for the (non-clustered) control arm.

Kname

[character]
A character string of the column name of the cluster assignment variable. This variable should be coded as 0 for individuals in the control arm, the arm without the cluster assignment.

Yname

[character]
A character string of the column name of the outcome variable.

Xnames

[character]
A character vector of the column names of the baseline covariates.

Yfamily

[character(1)]
Variable type of the outcome: "gaussian" (default) for a continuous outcome, or "binomial" for a binary outcome.

learners_tt

[character]
A character vector of methods for estimating the treatment model, chosen from the SuperLearner R package. Default is "SL.glm", a generalized linear model for the binary treatment variable. Other available methods can be found using the R function SuperLearner::listWrappers().

learners_k

[character]
A character string naming the method for estimating the cluster assignment model, which is fitted to the treatment arm data. One of "SL.multinom" (default; multinomial regression, nnet::multinom), "SL.xgboost.modified" (gradient boosted model, xgboost::xgboost), "SL.ranger.modified" (random forest model, ranger::ranger), or "SL.nnet.modified" (neural network model, nnet::nnet). The ".modified" learners are adapted to fit a multinomial categorical response.

learners_y

[character]
A character vector of methods for estimating the outcome model, chosen from the SuperLearner R package. Default is "SL.glm", a generalized linear model for the outcome variable, with family specified by Yfamily. Other available methods can be found using the R function SuperLearner::listWrappers().

sensitivity

Specification of the sensitivity parameter values, on the standardized mean difference scale. Either NULL (default) or "small_to_medium". If NULL, no sensitivity analysis is run. If "small_to_medium", the function runs a sensitivity analysis for the cluster assignment ignorability assumption, with sensitivity parameter values representing deviations from this assumption of magnitude 0.1 and 0.3 standardized mean differences.

cv_folds

[numeric(1)]
The number of cross-fitting folds. Default is 4.

seed

An integer passed to set.seed() to seed the random number generator. Default is NULL, which leaves the random number generator alone.
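
The coding conventions for ttname and Kname can be checked before fitting. The following is a minimal sketch in base R and is not part of PND.heter.cluster: the helper check_pnd_coding() and the toy data frame are hypothetical, and the check that treatment-arm rows carry a nonzero cluster label is an assumption inferred from the description of Kname.

```r
# Toy data for illustration only; the column names are assumptions.
toy <- data.frame(
  tt = c(1, 1, 1, 0, 0, 0),               # 1 = clustered treatment arm
  K  = c(1, 2, 2, 0, 0, 0),               # 0 = control arm (no cluster)
  Y  = c(0.3, 1.2, -0.5, 0.8, 0.1, -1.0),
  X_dat.1 = c(0.5, -0.2, 1.1, 0.0, -0.7, 0.4)
)

# Hypothetical helper: returns one TRUE/FALSE per coding rule.
check_pnd_coding <- function(data, ttname, Kname) {
  tt <- data[[ttname]]
  K  <- data[[Kname]]
  c(
    tt_is_binary      = all(tt %in% c(0, 1)),   # dummy-coded treatment
    control_K_zero    = all(K[tt == 0] == 0),   # control arm coded as 0
    treated_K_nonzero = all(K[tt == 1] != 0)    # assumed, see lead-in
  )
}

check_pnd_coding(toy, ttname = "tt", Kname = "K")
```

If any element of the returned vector is FALSE, the corresponding column likely needs recoding before it is passed to atekCl().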

Value

A list containing the following components:

ate_K

A data.frame of the estimation results.

The columns "ate_k", "std_error", "CI_lower", and "CI_upper" contain the estimate, the standard error estimate, and the lower and upper bounds of the 95% confidence interval of the cluster-specific treatment effect for the cluster (indicated by column "cluster") in the same row.

cv_components

A data.frame of nuisance model estimates.

sens_results

NULL if the argument sensitivity = NULL.

If the argument sensitivity = "small_to_medium" is specified, sens_results is a list of four data frames, containing the estimation results with the sensitivity parameter value (standardized mean difference) set to 0.1, 0.3, -0.1, and -0.3.
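
To illustrate how the ate_K component might be used, the sketch below builds a mock data frame with the documented columns and flags the clusters whose confidence interval excludes zero. The object mock_ate_K and all of its numbers are invented for illustration; they are not output from atekCl().

```r
# Mock data frame mimicking the documented columns of ate_K.
# All values are made up for illustration.
mock_ate_K <- data.frame(
  cluster   = c(1, 2, 3),
  ate_k     = c(0.40, -0.05, 0.22),
  std_error = c(0.10, 0.12, 0.15),
  CI_lower  = c(0.204, -0.285, -0.074),
  CI_upper  = c(0.596, 0.185, 0.514)
)

# A cluster-specific effect is flagged when its interval lies
# entirely above or entirely below zero.
mock_ate_K$excludes_zero <-
  mock_ate_K$CI_lower > 0 | mock_ate_K$CI_upper < 0

# Clusters whose interval excludes zero
significant_clusters <- mock_ate_K$cluster[mock_ate_K$excludes_zero]
significant_clusters
```

The same pattern should apply to a real result by replacing mock_ate_K with the ate_K element of the returned list.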

Examples


library(tidyverse)
library(SuperLearner)
library(glue)
library(nnet)

# example data
data(data_in)

# baseline covariates: all columns whose names contain "X_dat"
Xnames <- grep("X_dat", colnames(data_in), value = TRUE)

estimates_ate_K <- PND.heter.cluster::atekCl(
  data_in = data_in,
  ttname = "tt",  # treatment variable
  Kname = "K",    # cluster assignment variable, coded as 0 for
                  # individuals in the (non-clustered) control arm
  Yname = "Y",    # outcome variable
  Xnames = Xnames,
  seed = 12345
)

# cluster-specific treatment effect estimates
estimates_ate_K$ate_K
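
The covariate selection step relies on grep() with value = TRUE, which returns the matching column names rather than their positions. A small sketch on a toy data frame follows; the column names X_dat.1 and X_dat.2 are assumed for illustration and may differ from those in data_in.

```r
# Toy data frame; column names are assumptions for illustration.
toy <- data.frame(
  tt = c(1, 0),
  K  = c(1, 0),
  Y  = c(0.2, -0.1),
  X_dat.1 = c(0.5, 1.5),
  X_dat.2 = c(-1, 2)
)

# value = TRUE returns the names of the matching columns,
# which is the form expected by the Xnames argument.
Xnames_toy <- grep("X_dat", colnames(toy), value = TRUE)
Xnames_toy
```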



[Package PND.heter.cluster version 0.1.0 Index]