generate_randomizations {fastrerandomize} | R Documentation |
Generate randomizations for a rerandomization-based experimental design
Description
This function generates randomizations for experimental design using either exact enumeration or Monte Carlo sampling methods. It provides a unified interface to both approaches while handling memory and computational constraints appropriately.
Usage
generate_randomizations(
  n_units,
  n_treated,
  X = NULL,
  randomization_accept_prob,
  threshold_func = NULL,
  max_draws = 10^6,
  batch_size = 1000,
  randomization_type = "monte_carlo",
  approximate_inv = TRUE,
  file = NULL,
  return_type = "R",
  verbose = TRUE,
  conda_env = "fastrerandomize",
  conda_env_required = TRUE
)
Arguments
n_units |
An integer specifying the total number of experimental units. |
n_treated |
An integer specifying the number of units to be assigned to treatment. |
X |
A numeric matrix of covariates used for balance checking. Cannot be |
randomization_accept_prob |
A numeric value between 0 and 1 specifying the probability threshold for accepting randomizations based on balance. |
threshold_func |
A 'JAX' function that computes a balance measure for each randomization. Only used for Monte Carlo sampling. |
max_draws |
An integer specifying the maximum number of randomizations to draw in Monte Carlo sampling. |
batch_size |
An integer specifying batch size for Monte Carlo processing. |
randomization_type |
A string specifying the type of randomization: either "exact" or "monte_carlo". |
approximate_inv |
A logical value indicating whether to use an approximate inverse
(diagonal of the covariance matrix) instead of the full matrix inverse when computing
balance metrics. This can speed up computations for high-dimensional covariates.
Default is TRUE. |
file |
A string specifying the file path where candidate randomizations are saved instead of being returned. |
return_type |
A string specifying the format of the returned randomizations and balance
measures. Allowed values are |
verbose |
A logical value indicating whether to print progress information. Default is TRUE. |
conda_env |
A character string specifying the name of the conda environment to use
via |
conda_env_required |
A logical indicating whether the specified conda environment
must be strictly used. If |
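The effect of approximate_inv can be illustrated in base R. This is a rough sketch, not the package's implementation: it assumes the balance metric is a Mahalanobis-type distance between treated and control covariate means, and the helper name balance_stat is hypothetical.

```r
# Hypothetical helper (not part of fastrerandomize): a Mahalanobis-type
# distance between treated and control covariate means.
balance_stat <- function(X, w, approximate_inv = TRUE) {
  diff <- colMeans(X[w == 1, , drop = FALSE]) -
    colMeans(X[w == 0, , drop = FALSE])
  S <- cov(X)
  if (approximate_inv) {
    # Diagonal approximation: ignore covariances and invert variances only.
    sum(diff^2 / diag(S))
  } else {
    # Full inverse of the covariance matrix.
    as.numeric(t(diff) %*% solve(S) %*% diff)
  }
}

set.seed(1)
X <- matrix(rnorm(20 * 5), 20, 5)
w <- sample(rep(c(0, 1), each = 10))  # 10 treated, 10 control

balance_stat(X, w, approximate_inv = TRUE)
balance_stat(X, w, approximate_inv = FALSE)
```

The diagonal version avoids a matrix inversion, which is where the speed-up for high-dimensional covariates would come from; the two values generally differ when the covariates are correlated.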
Details
The function supports two methods of generating randomizations:
Exact enumeration: Generates all possible randomizations (memory intensive but exact).
Monte Carlo sampling: Generates randomizations through sampling (more memory efficient).
For large problems (e.g., X with >20 rows), Monte Carlo sampling is recommended.
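To see why exact enumeration becomes impractical, note that the number of possible assignments is choose(n_units, n_treated); for 20 units with 10 treated this is already 184,756. The base R sketch below tabulates that growth and then mimics the Monte Carlo approach. It is an illustration of the general technique under stated assumptions, not the package's internals: the balance measure (sum of squared standardized mean differences) and the rule of keeping the best-balanced fraction accept_prob of the draws are assumptions, and all variable names are hypothetical.

```r
# Number of distinct assignments grows combinatorially with n_units.
n_units <- c(10, 20, 30, 40)
n_assignments <- choose(n_units, n_units / 2)
data.frame(n_units, n_assignments)

# Monte Carlo sketch: sample candidate assignments instead of enumerating.
set.seed(123)
X <- matrix(rnorm(20 * 5), 20, 5)
n_treated <- 10
n_draws <- 2000
accept_prob <- 0.1

# Each row is one candidate assignment vector (1 = treated, 0 = control).
template <- rep(c(1, 0), c(n_treated, nrow(X) - n_treated))
candidates <- t(replicate(n_draws, sample(template)))

# Assumed balance measure: sum of squared standardized mean differences.
x_var <- apply(X, 2, var)
balance <- apply(candidates, 1, function(w) {
  d <- colMeans(X[w == 1, , drop = FALSE]) -
    colMeans(X[w == 0, , drop = FALSE])
  sum(d^2 / x_var)
})

# Keep the best-balanced fraction of the draws.
cutoff <- quantile(balance, accept_prob)
accepted <- candidates[balance <= cutoff, , drop = FALSE]
nrow(accepted)
```

Sampling keeps memory use proportional to the number of draws rather than to the full assignment space, at the cost of only approximating the set of best-balanced assignments.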
Value
Returns an S3 object with slots:
- assignments: An array containing the accepted randomizations, where each row represents one treatment assignment vector.
- balance_measures: A numeric vector containing the balance measure for each corresponding randomization.
- fastrr_env: The fastrerandomize environment.
- file_output: If file is specified, results are saved to the given file path instead of being returned.
See Also
generate_randomizations_exact for the exact enumeration method.
generate_randomizations_mc for the Monte Carlo sampling method.
Examples
## Not run:
# Generate synthetic data
X <- matrix(rnorm(20 * 5), 20, 5)

# Generate randomizations using exact enumeration
RandomizationSet_Exact <- generate_randomizations(
  n_units = nrow(X),
  n_treated = round(nrow(X) / 2),
  X = X,
  randomization_accept_prob = 0.1,
  randomization_type = "exact")

# Generate randomizations using Monte Carlo sampling
RandomizationSet_MC <- generate_randomizations(
  n_units = nrow(X),
  n_treated = round(nrow(X) / 2),
  X = X,
  randomization_accept_prob = 0.1,
  randomization_type = "monte_carlo",
  max_draws = 100000,
  batch_size = 1000)
## End(Not run)