train {NeuralEstimators}    R Documentation
Train a neural estimator
Description
The function caters for different variants of "on-the-fly" simulation. Specifically, a sampler can be provided to continuously sample new parameter vectors from the prior, and a simulator can be provided to continuously simulate new data conditional on the parameters. If provided with specific sets of parameters (theta_train and theta_val) and/or data (Z_train and Z_val), they will be held fixed during training.

Note that using R functions to perform "on-the-fly" simulation requires the user to have installed the Julia package RCall.
Usage
train(
  estimator,
  sampler = NULL,
  simulator = NULL,
  theta_train = NULL,
  theta_val = NULL,
  Z_train = NULL,
  Z_val = NULL,
  m = NULL,
  M = NULL,
  K = 10000,
  xi = NULL,
  loss = "absolute-error",
  learning_rate = 1e-04,
  epochs = 100,
  batchsize = 32,
  savepath = NULL,
  stopping_epochs = 5,
  epochs_per_Z_refresh = 1,
  epochs_per_theta_refresh = 1,
  simulate_just_in_time = FALSE,
  use_gpu = TRUE,
  verbose = TRUE
)
Arguments
estimator: a neural estimator

sampler: a function that takes an integer K, samples K parameter vectors from the prior, and returns them as a p x K matrix, where p is the number of parameters in the statistical model

simulator: a function that takes a p x K matrix of parameters and an integer m, and returns K simulated data sets, each containing m independent replicates

theta_train: a set of parameters used for updating the estimator using stochastic gradient descent

theta_val: a set of parameters used for monitoring the performance of the estimator during training

Z_train: a simulated data set used for updating the estimator using stochastic gradient descent

Z_val: a simulated data set used for monitoring the performance of the estimator during training

m: vector of sample sizes. If NULL (default), a single neural estimator is trained, with the sample size inferred from Z_val. If m is a vector of integers, a sequence of neural estimators is constructed, one for each sample size

M: deprecated; use m

K: the number of parameter vectors sampled in the training set at each epoch; the size of the validation set is set to K/5

xi: a list of objects used for data simulation (e.g., distance matrices); if it is provided, the parameter sampler is called as sampler(K, xi)

loss: the loss function: a string ('absolute-error' for mean-absolute-error loss or 'squared-error' for mean-squared-error loss), or a string of Julia code defining the loss function (a sketch is given after this list). For some classes of estimators (e.g., QuantileEstimator and RatioEstimator), the loss function does not need to be specified

learning_rate: the learning rate for the optimiser ADAM (default 1e-4)

epochs: the number of epochs to train the neural network. An epoch is one complete pass through the entire training data set when doing stochastic gradient descent

batchsize: the batchsize to use when performing stochastic gradient descent, that is, the number of training samples processed between each update of the neural-network parameters

savepath: path to save the trained estimator and other information; if NULL (default), nothing is saved. Otherwise, the neural-network parameters (i.e., the weights and biases) will be saved during training as bson files, and the best parameters (as measured by validation risk) will be saved as best_network.bson

stopping_epochs: cease training if the risk doesn't improve in this number of epochs (default 5)

epochs_per_Z_refresh: integer indicating how often to refresh the training data

epochs_per_theta_refresh: integer indicating how often to refresh the training parameters; must be a multiple of epochs_per_Z_refresh

simulate_just_in_time: flag indicating whether we should simulate "just-in-time", in the sense that only a batchsize number of parameter vectors and corresponding data sets are in memory at a given time

use_gpu: a boolean indicating whether to use the GPU if one is available

verbose: a boolean indicating whether information, including empirical risk values and timings, should be printed to the console during training
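As referenced in the description of loss above, a custom loss can be supplied as a string of Julia code. A minimal, hypothetical sketch, assuming Flux has been loaded in the Julia session (so that Flux.huber_loss is in scope) and that sampler and simulator are defined as in the Examples below:

estimator <- train(estimator,
                   sampler = sampler, simulator = simulator, m = 30,
                   loss = "(thetahat, theta) -> Flux.huber_loss(thetahat, theta)")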
Value
a trained neural estimator or, if m is a vector, a list of trained neural estimators
See Also
assess() for assessing an estimator post training, and estimate()/sampleposterior() for making inference with observed data
Examples
## Not run:
# Construct a neural Bayes estimator for replicated univariate Gaussian
# data with unknown mean and standard deviation.
# Load R and Julia packages
library("NeuralEstimators")
library("JuliaConnectoR")
juliaEval("using NeuralEstimators, Flux")
# Define the neural-network architecture
estimator <- juliaEval('
  n = 1    # dimension of each replicate
  d = 2    # number of parameters in the model
  w = 32   # width of each hidden layer
  psi = Chain(Dense(n, w, relu), Dense(w, w, relu))
  phi = Chain(Dense(w, w, relu), Dense(w, d))
  deepset = DeepSet(psi, phi)
  estimator = PointEstimator(deepset)
')
# Sampler from the prior
sampler <- function(K) {
  mu    <- rnorm(K)      # Gaussian prior for the mean
  sigma <- rgamma(K, 1)  # Gamma prior for the standard deviation
  theta <- matrix(c(mu, sigma), byrow = TRUE, ncol = K)
  return(theta)
}
# Data simulator: returns a list of K data sets, each an n x m matrix
simulator <- function(theta_set, m) {
  apply(theta_set, 2, function(theta) {
    t(rnorm(m, theta[1], theta[2]))
  }, simplify = FALSE)
}
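# Optional sanity check (illustrative): simulator() should return a list of
# length K, with each element an n x m matrix (here, 1 x 3)
str(simulator(sampler(5), 3))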
# Train using fixed parameter and data sets
theta_train <- sampler(10000)
theta_val <- sampler(2000)
m <- 30 # number of iid replicates
Z_train <- simulator(theta_train, m)
Z_val <- simulator(theta_val, m)
estimator <- train(estimator,
                   theta_train = theta_train,
                   theta_val = theta_val,
                   Z_train = Z_train,
                   Z_val = Z_val)
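# As a quick sanity check, compute point estimates for the validation data
# (an illustrative use of estimate(); see its help page for details)
theta_hat <- estimate(estimator, Z_val)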
#### Simulation on-the-fly using R functions ####
juliaEval("using RCall") # requires the Julia package RCall
estimator <- train(estimator, sampler = sampler, simulator = simulator, m = m)
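# When m is a vector of sample sizes, train() returns a list of estimators,
# one per sample size (see Value); an illustrative sketch:
estimators <- train(estimator, sampler = sampler, simulator = simulator,
                    m = c(5, 10, 30))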
#### Simulation on-the-fly using Julia functions ####
# Defining the sampler and simulator in Julia can improve computational
# efficiency by avoiding the overhead of communicating between R and Julia.
juliaEval("using Distributions")
# Parameter sampler
sampler <- juliaEval("
  function sampler(K)
    mu = rand(Normal(0, 1), K)
    sigma = rand(Gamma(1), K)
    theta = hcat(mu, sigma)'
    return theta
  end")
# Data simulator
simulator <- juliaEval("
  function simulator(theta_matrix, m)
    Z = [rand(Normal(theta[1], theta[2]), 1, m) for theta in eachcol(theta_matrix)]
    return Z
  end")
# Train
estimator <- train(estimator, sampler = sampler, simulator = simulator, m = m)
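# Assess the estimator on simulated test data (an illustrative sketch; integer
# arguments are passed explicitly because sampler and simulator are now Julia
# functions called from R via JuliaConnectoR). See assess() for details.
theta_test <- sampler(1000L)
Z_test <- simulator(theta_test, 30L)
assessment <- assess(estimator, theta_test, Z_test)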
## End(Not run)