generate_randomizations {fastrerandomize}    R Documentation

Generate randomizations for a rerandomization-based experimental design

Description

This function generates randomizations for an experimental design using either exact enumeration or Monte Carlo sampling. It provides a unified interface to both approaches while handling memory and computational constraints.

Usage

generate_randomizations(
  n_units,
  n_treated,
  X = NULL,
  randomization_accept_prob,
  threshold_func = NULL,
  max_draws = 10^6,
  batch_size = 1000,
  randomization_type = "monte_carlo",
  approximate_inv = TRUE,
  file = NULL,
  return_type = "R",
  verbose = TRUE,
  conda_env = "fastrerandomize",
  conda_env_required = TRUE
)

Arguments

n_units

An integer specifying the total number of experimental units.

n_treated

An integer specifying the number of units to be assigned to treatment.

X

A numeric matrix of covariates used for balance checking, with one row per experimental unit. Although the default in the usage is NULL, a matrix must be supplied.

randomization_accept_prob

A numeric value between 0 and 1 specifying the probability threshold for accepting randomizations based on balance.

threshold_func

A 'JAX' function that computes a balance measure for each randomization. Only used for Monte Carlo sampling.

max_draws

An integer specifying the maximum number of randomizations to draw in Monte Carlo sampling.

batch_size

An integer specifying batch size for Monte Carlo processing.

randomization_type

A string specifying the type of randomization: either "exact" or "monte_carlo".

approximate_inv

A logical value indicating whether to use an approximate inverse (diagonal of the covariance matrix) instead of the full matrix inverse when computing balance metrics. This can speed up computations for high-dimensional covariates. Default is TRUE.

file

A string specifying the file path where candidate randomizations are saved, for use when they are written to disk instead of returned. Default is NULL.

return_type

A string specifying the format of the returned randomizations and balance measures. Allowed values are "R" for base R objects (e.g., matrix, numeric) or "jax" for 'JAX' arrays. Default is "R".

verbose

A logical value indicating whether to print progress information. Default is TRUE.

conda_env

A character string specifying the name of the conda environment to use via reticulate. Default is "fastrerandomize".

conda_env_required

A logical value indicating whether the specified conda environment is required. If TRUE, an error is thrown if the environment is not found. Default is TRUE.
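
As an illustration of the approximate_inv argument, the following base R sketch contrasts a balance measure built from the full inverse of the covariance matrix with one built from its diagonal only. It is an assumption-laden sketch, not the package's internal 'JAX' code: the assignment vector w, the statistic names balance_full and balance_diag, and the omission of any scaling constant are all hypothetical choices made for the example.

```r
# Base R sketch only; names and scaling are assumptions, not package internals.
set.seed(1)
n <- 20
X <- matrix(rnorm(n * 5), n, 5)

# One hypothetical assignment vector: 1 = treated, 0 = control.
w <- sample(rep(c(1, 0), each = n / 2))

# Difference in covariate means between treated and control units.
d <- colMeans(X[w == 1, , drop = FALSE]) - colMeans(X[w == 0, , drop = FALSE])

# Sample covariance matrix of the covariates.
S <- cov(X)

# Full inverse: the quadratic form d' S^{-1} d (Mahalanobis-type distance).
balance_full <- drop(t(d) %*% solve(S, d))

# Diagonal approximation: ignore covariances and divide by variances only.
balance_diag <- sum(d^2 / diag(S))

c(full = balance_full, diagonal = balance_diag)
```

The diagonal version avoids solving a linear system in the number of covariates, which is why it can be faster for high-dimensional X; the two measures coincide only when the covariates are uncorrelated in the sample.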

Details

The function supports two methods of generating randomizations:

  1. Exact enumeration: Generates all possible randomizations (memory intensive but exact).

  2. Monte Carlo sampling: Generates randomizations through sampling (more memory efficient).
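
The two methods can be sketched in base R for a small problem. This is a hypothetical illustration, not the package's implementation: the helper balance_stat, the rule of keeping assignments at or below the accept_prob quantile of the balance measure, and the batching loop are assumptions made for the example.

```r
# Base R sketch only; balance_stat and the acceptance rule are assumptions.
set.seed(42)
n_units <- 8
n_treated <- 4
accept_prob <- 0.1
X <- matrix(rnorm(n_units * 2), n_units, 2)

# Hypothetical balance measure: sum of squared mean differences,
# each scaled by the covariate's variance (smaller means better balance).
balance_stat <- function(w, X) {
  d <- colMeans(X[w == 1, , drop = FALSE]) - colMeans(X[w == 0, , drop = FALSE])
  sum(d^2 / apply(X, 2, var))
}

# 1. Exact enumeration: every way to choose n_treated of n_units.
treated_sets <- combn(n_units, n_treated)
W_exact <- t(apply(treated_sets, 2, function(idx) {
  as.integer(seq_len(n_units) %in% idx)
}))
b_exact <- apply(W_exact, 1, function(w) balance_stat(w, X))
keep_exact <- W_exact[b_exact <= quantile(b_exact, accept_prob), , drop = FALSE]

# 2. Monte Carlo sampling: draw random assignments in batches.
max_draws <- 200
batch_size <- 50
labels <- rep(c(1L, 0L), c(n_treated, n_units - n_treated))
W_mc <- NULL
for (b in seq_len(ceiling(max_draws / batch_size))) {
  batch <- t(replicate(batch_size, sample(labels)))
  W_mc <- rbind(W_mc, batch)
}
b_mc <- apply(W_mc, 1, function(w) balance_stat(w, X))
keep_mc <- W_mc[b_mc <= quantile(b_mc, accept_prob), , drop = FALSE]

c(exact_candidates = nrow(W_exact), exact_kept = nrow(keep_exact),
  mc_candidates = nrow(W_mc), mc_kept = nrow(keep_mc))
```

In this sketch the Monte Carlo draws may contain duplicate assignments, whereas the enumerated set contains each assignment exactly once.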

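For large problems (e.g., when X has more than 20 rows), Monte Carlo sampling is recommended, because the number of possible assignments grows combinatorially with the number of units.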
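
To give a sense of scale, the number of ways to assign n_treated of n_units to treatment is the binomial coefficient, which base R computes with choose(). The sketch below assumes half of the units are treated, as in the examples on this page; the variable names are chosen for illustration.

```r
# Number of possible assignments when half of the units are treated.
n <- c(10, 20, 30, 40)
n_assignments <- choose(n, n / 2)

data.frame(n_units = n, n_treated = n / 2, n_assignments = n_assignments)
```

With 20 units and 10 treated there are 184,756 possible assignments, and with 40 units and 20 treated there are more than 10^11, so holding every assignment in memory quickly becomes impractical.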

Value

Returns an S3 object containing the randomizations and their balance measures, in the format specified by return_type.

See Also

generate_randomizations_exact for the exact enumeration method; generate_randomizations_mc for the Monte Carlo sampling method.

Examples


## Not run: 
# Generate synthetic data
X <- matrix(rnorm(20 * 5), 20, 5)

# Generate randomizations using exact enumeration
RandomizationSet_Exact <- generate_randomizations(
  n_units = nrow(X),
  n_treated = round(nrow(X) / 2),
  X = X,
  randomization_accept_prob = 0.1,
  randomization_type = "exact"
)

# Generate randomizations using Monte Carlo sampling
RandomizationSet_MC <- generate_randomizations(
  n_units = nrow(X),
  n_treated = round(nrow(X) / 2),
  X = X,
  randomization_accept_prob = 0.1,
  randomization_type = "monte_carlo",
  max_draws = 100000,
  batch_size = 1000
)

## End(Not run)


[Package fastrerandomize version 0.2 Index]