generate_randomizations_exact {fastrerandomize}    R Documentation

Generate Complete Randomizations with Optional Balance Constraints

Description

Generates all possible treatment assignments for a completely randomized experiment, optionally filtering them based on covariate balance criteria. The function can generate either all possible randomizations or a subset that meets specified balance thresholds using Hotelling's T-squared statistic.

Usage

generate_randomizations_exact(
  n_units,
  n_treated,
  X = NULL,
  randomization_accept_prob = 1,
  approximate_inv = TRUE,
  threshold_func = NULL,
  verbose = TRUE,
  conda_env = "fastrerandomize",
  conda_env_required = TRUE
)

Arguments

n_units

An integer specifying the total number of experimental units.

n_treated

An integer specifying the number of units to be assigned to treatment.

X

A numeric matrix of covariates where rows represent units and columns represent different covariates. Default is NULL, in which case all possible randomizations are returned without balance filtering.

randomization_accept_prob

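A numeric value between 0 and 1 specifying the quantile of the balance statistic used as the acceptance threshold: only randomizations whose balance statistic falls below this quantile are kept. For example, 0.25 keeps roughly the 25% most balanced randomizations. Default is 1 (accept all randomizations).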

approximate_inv

A logical value indicating whether to use an approximate inverse (diagonal of the covariance matrix) instead of the full matrix inverse when computing balance metrics. This can speed up computations for high-dimensional covariates. Default is TRUE.

threshold_func

A function that calculates balance statistics for candidate randomizations. If NULL (the default), VectorizedFastHotel2T2 is used, which computes Hotelling's T-squared statistic.

verbose

A logical value indicating whether to print progress information. Default is TRUE.

conda_env

A character string specifying the name of the conda environment to use via reticulate. Default is "fastrerandomize".

conda_env_required

A logical indicating whether the specified conda environment must be strictly used. If TRUE, an error is thrown if the environment is not found. Default is TRUE.
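
To illustrate the approximate_inv argument described above, the following base R sketch contrasts a full inverse of the covariance matrix with a diagonal-only approximation inside a quadratic-form balance measure. It is not the package's internal code: the statistic shown omits scaling constants, the split into the first and last five units is arbitrary, and all variable names (S, full_inv, diag_inv, d) are hypothetical.

```r
set.seed(1)
X <- matrix(rnorm(60), nrow = 10)   # 10 units, 6 covariates

S <- cov(X)                         # sample covariance of the covariates
full_inv <- solve(S)                # full matrix inverse
diag_inv <- diag(1 / diag(S))       # approximation: invert only the diagonal (variances)

# Difference in covariate means for one arbitrary assignment
# (units 1-5 treated, units 6-10 control; assumed for illustration)
d <- colMeans(X[1:5, ]) - colMeans(X[6:10, ])

# Quadratic-form balance measure under each choice of inverse
stat_full <- drop(t(d) %*% full_inv %*% d)
stat_diag <- sum(d^2 / diag(S))     # same as t(d) %*% diag_inv %*% d, without a matrix product
```

The diagonal version ignores correlations between covariates, so the two measures generally differ, but it avoids a matrix inversion whose cost grows quickly with the number of covariates.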

Details

The function works in two main steps:

1. Generates all possible combinations of treatment assignments given n_units and n_treated.
2. If covariates (X) are provided, filters these combinations based on balance criteria using the specified threshold function.

The balance filtering process uses Hotelling's T-squared statistic by default to measure multivariate balance between treatment and control groups. Randomizations are accepted if their balance measure is below the specified quantile threshold.
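
The two steps can be sketched in base R, without JAX, for a small problem. This is a minimal illustration under stated assumptions rather than the package's implementation: the balance measure is a Hotelling-type statistic computed with the covariance of all units, acceptance uses a simple "at or below the quantile" rule, and the names assignments, balance, accept_prob, and accepted are hypothetical.

```r
set.seed(42)
n_units <- 6
n_treated <- 2
X <- matrix(rnorm(n_units * 2), nrow = n_units)   # 6 units, 2 covariates

# Step 1: enumerate every way to choose the treated units.
treated_idx <- combn(n_units, n_treated)          # one column per combination
assignments <- apply(treated_idx, 2, function(idx) {
  w <- integer(n_units)
  w[idx] <- 1L                                    # 1 = treated, 0 = control
  w
})                                                # n_units x choose(6, 2) matrix

# Step 2: compute a balance statistic for each candidate assignment.
# Assumed form: (n_t * n_c / n) * d' S^{-1} d, with S the covariance of X.
S_inv <- solve(cov(X))
n_control <- n_units - n_treated
balance <- apply(assignments, 2, function(w) {
  d <- colMeans(X[w == 1, , drop = FALSE]) - colMeans(X[w == 0, , drop = FALSE])
  (n_treated * n_control / n_units) * drop(t(d) %*% S_inv %*% d)
})

# Keep only assignments at or below the chosen quantile of the statistic.
accept_prob <- 0.25
threshold <- quantile(balance, probs = accept_prob)
accepted <- assignments[, balance <= threshold, drop = FALSE]
```

Because the number of candidate assignments is choose(n_units, n_treated), exhaustive enumeration of this kind is only practical for small experiments.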

Value

The function returns a list with two elements:

candidate_randomizations: an array of randomization vectors.
M_candidate_randomizations: an array of their balance measures.

Note

This function requires 'JAX' and 'NumPy' to be installed and accessible through the reticulate package.

References

Hotelling, H. (1931). The generalization of Student's ratio. The Annals of Mathematical Statistics, 2(3), 360-378.

See Also

generate_randomizations for the full randomization generation function; generate_randomizations_mc for the Monte Carlo version.

Examples


## Not run: 
# Generate synthetic data 
X <- matrix(rnorm(60), nrow = 10)  # 10 units, 6 covariates

# Generate balanced randomizations with covariates
BalancedRandomizations <- generate_randomizations_exact(
  n_units = 10,
  n_treated = 5,
  X = X,
  randomization_accept_prob = 0.25  # Keep top 25% most balanced
)

## End(Not run)


[Package fastrerandomize version 0.2 Index]