model |
Model object.
Specifies the model whose predictions we want to explain.
Run get_supported_models()
for a table of which models explain supports natively. Unsupported models
can still be explained by passing predict_model and (optionally) get_model_specs ,
see details for more information.
|
y |
Matrix, data.frame/data.table or a numeric vector.
Contains the endogenous variables used to estimate the (conditional) distributions
needed to properly estimate the conditional expectations in the Shapley formula
including the observations to be explained.
|
xreg |
Matrix, data.frame/data.table or a numeric vector.
Contains the exogenous variables used to estimate the (conditional) distributions
needed to properly estimate the conditional expectations in the Shapley formula
including the observations to be explained.
As exogenous variables are used contemporaneously when producing a forecast,
this item should contain nrow(y) + horizon rows.
|
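The row requirement for xreg can be checked before any explanation is attempted. The following is a minimal sketch on simulated data; the object names mirror the arguments above, while the data and the variable names "Y1" and "X1" are made up for illustration.

```r
# Simulated data, for illustration only
set.seed(1)
n_obs <- 100   # number of observed time points in y
horizon <- 3   # forecast horizon to explain

y <- matrix(rnorm(n_obs), ncol = 1, dimnames = list(NULL, "Y1"))

# xreg must also cover the forecast period, hence n_obs + horizon rows
xreg <- matrix(rnorm(n_obs + horizon), ncol = 1, dimnames = list(NULL, "X1"))

# Check the documented requirement: nrow(xreg) == nrow(y) + horizon
rows_ok <- nrow(xreg) == nrow(y) + horizon
stopifnot(rows_ok)
```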
train_idx |
Numeric vector.
The row indices in y and xreg denoting the points in time to use when estimating the conditional expectations in
the Shapley value formula.
If train_idx = NULL (default) all indices not selected to be explained will be used.
|
explain_idx |
Numeric vector.
The row indices in y and xreg denoting the points in time to explain.
|
explain_y_lags |
Numeric vector.
Denotes the number of lags that should be used for each variable in y when making a forecast.
|
explain_xreg_lags |
Numeric vector.
If xreg != NULL , denotes the number of lags that should be used for each variable in xreg when making a forecast.
|
horizon |
Numeric.
The forecast horizon to explain. Passed to the predict_model function.
|
approach |
Character vector of length 1 or one less than the number of features.
All elements should be either "gaussian", "copula", "empirical", "ctree", "vaeac",
"categorical", "timeseries", "independence", "regression_separate", or "regression_surrogate".
The two regression approaches cannot be combined with any other approach.
See details for more information.
|
phi0 |
Numeric.
The prediction value for unseen data, i.e. an estimate of the expected prediction without conditioning on any
features.
Typically we set this value equal to the mean of the response variable in our training data, but other choices
such as the mean of the predictions in the training data are also reasonable.
|
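The two choices for phi0 mentioned above can be computed as follows. This is a sketch on a simulated series with a simple linear model on one lag; the data, the model, and the 150-row training split are assumptions made for illustration.

```r
# Simulated series and a simple linear model on one lag, for illustration only
set.seed(1)
y <- as.numeric(arima.sim(list(ar = 0.5), n = 200))
df <- data.frame(y = y[-1], y_lag1 = y[-length(y)])
train <- df[1:150, ]
model <- lm(y ~ y_lag1, data = train)

# Choice 1: mean of the response variable in the training data
phi0_response <- mean(train$y)

# Choice 2: mean of the model's predictions on the training data
phi0_pred <- mean(predict(model, newdata = train))
```

For a least-squares model with an intercept the two values coincide, because the residuals sum to zero on the training data. For other model classes they generally differ.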
max_n_coalitions |
Integer.
The upper limit on the number of unique feature/group coalitions to use in the iterative procedure
(if iterative = TRUE ).
If iterative = FALSE it represents the number of feature/group coalitions to use directly.
The quantity refers to the number of unique feature coalitions if group = NULL ,
and group coalitions if group != NULL .
max_n_coalitions = NULL corresponds to max_n_coalitions=2^n_features .
|
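To see why an upper limit is useful, the sketch below tabulates how 2^n_features grows and applies a cap. Taking the smaller of the cap and the full count is an assumption about how the limit would behave, not a statement about the package internals.

```r
# Number of unique feature coalitions when all of them are used
n_features <- c(4, 8, 12, 16)
n_coalitions_full <- 2^n_features

# Hypothetical cap; assumed here to act as min(cap, full count)
max_n_coalitions <- 500
n_coalitions_used <- pmin(n_coalitions_full, max_n_coalitions)

data.frame(n_features, n_coalitions_full, n_coalitions_used)
```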
iterative |
Logical or NULL.
If NULL (default), the argument is set to TRUE if there are more than 5 features/groups, and FALSE otherwise.
If eventually TRUE, the Shapley values are estimated iteratively.
This provides sufficiently accurate Shapley value estimates faster.
First an initial number of coalitions is sampled, then bootstrapping is used to estimate the variance of the Shapley
values.
A convergence criterion is used to determine if the variances of the Shapley values are sufficiently small.
If the variances are too high, we estimate the number of required samples to reach convergence, and thereby add more
coalitions.
The process is repeated until the variances are below the threshold.
Specifics related to the iterative process and convergence criterion are set through iterative_args .
|
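The sample, bootstrap, check, and extend loop described above can be illustrated on a toy problem. The sketch below estimates a simple mean rather than Shapley values, so it is not the package's algorithm; draw(), boot_sd(), the threshold, and the 1/sqrt(n) rule for the required sample size are all assumptions chosen for illustration.

```r
set.seed(123)

# Hypothetical stand-in for "sample more coalitions and evaluate them"
draw <- function(n) rnorm(n, mean = 2, sd = 1)

# Bootstrap estimate of the standard deviation of the sample mean
boot_sd <- function(x, n_boot = 200) {
  sd(replicate(n_boot, mean(sample(x, replace = TRUE))))
}

threshold <- 0.05   # convergence threshold on the standard deviation
max_n <- 5000       # hard upper limit on the number of samples
samples <- draw(50) # initial batch

repeat {
  current_sd <- boot_sd(samples)
  if (current_sd < threshold || length(samples) >= max_n) break
  # The sd shrinks roughly as 1/sqrt(n), which gives an estimate of the n required
  n_required <- ceiling(length(samples) * (current_sd / threshold)^2)
  n_add <- min(max(n_required - length(samples), 10), max_n - length(samples))
  samples <- c(samples, draw(n_add))
}

estimate <- mean(samples)
```

The loop stops either when the bootstrapped standard deviation falls below the threshold or when the upper limit on samples is reached, which mirrors the role of max_n_coalitions.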
group_lags |
Logical.
If TRUE all lags of each variable are grouped together and explained as a group.
If FALSE all lags of each variable are explained individually.
|
group |
List.
If NULL regular feature wise Shapley values are computed.
If provided, group wise Shapley values are computed.
group then has length equal to the number of groups.
Each list element is a character vector with the features included in that group.
See
Jullum et al. (2021)
for more information on group wise Shapley values.
|
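A group list might be built as in the sketch below. The feature and group names are hypothetical, and the check that every feature belongs to exactly one group is a sanity check assumed to be sensible here rather than a rule stated above.

```r
# Hypothetical feature names
feature_names <- c("Temp.1", "Temp.2", "Wind.1", "Wind.2")

# One list element per group, each a character vector of feature names
group <- list(
  Temp = c("Temp.1", "Temp.2"),
  Wind = c("Wind.1", "Wind.2")
)

# Sanity check (assumed): every feature appears in exactly one group
all_grouped <- unlist(group, use.names = FALSE)
stopifnot(setequal(all_grouped, feature_names), !anyDuplicated(all_grouped))

n_groups <- length(group)
```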
n_MC_samples |
Positive integer.
For most approaches, it indicates the maximum number of samples to use in the Monte Carlo integration
of every conditional expectation.
For approach = "ctree", n_MC_samples corresponds to the number of samples
from the leaf node (see an exception related to the ctree.sample argument of setup_approach.ctree()).
For approach = "empirical", n_MC_samples is the K parameter in equations (14-15) of
Aas et al. (2021), i.e. the maximum number of observations (with largest weights) that is used, see also the
empirical.eta argument of setup_approach.empirical().
|
seed |
Positive integer.
Specifies the seed to set before any code involving randomness is run.
If NULL (default) no seed is set in the calling environment.
|
predict_model |
Function.
The prediction function used when model is not natively supported.
(Run get_supported_models() for a list of natively supported models.)
The function must have two arguments, model and newdata which specify, respectively, the model
and a data.frame/data.table to compute predictions for.
The function must give the prediction as a numeric vector.
NULL (the default) uses functions specified internally.
Can also be used to override the default function for natively supported model classes.
|
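A prediction function with the two-argument shape described above might look as follows. This is a sketch only: lm() stands in for a model class that is not natively supported, and the name predict_model_custom is hypothetical.

```r
# lm() is used purely as a stand-in model for illustration
model <- lm(mpg ~ wt + hp, data = mtcars)

# Takes the model and the new data, returns a plain numeric vector
predict_model_custom <- function(model, newdata) {
  as.numeric(predict(model, newdata = as.data.frame(newdata)))
}

preds <- predict_model_custom(model, mtcars[1:5, c("wt", "hp")])
```

Wrapping the result in as.numeric() drops names and other attributes, so the output is a bare numeric vector with one prediction per row of newdata.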
get_model_specs |
Function.
An optional function for checking model/data consistency when model is not natively supported.
(Run get_supported_models() for a list of natively supported models.)
The function takes model as argument and provides a list with 3 elements:
- labels
Character vector with the names of each feature.
- classes
Character vector with the class of each feature.
- factor_levels
Character vector with the levels for any categorical features.
If NULL (the default) internal functions are used for natively supported model classes, and the checking is
disabled for unsupported model classes.
Can also be used to override the default function for natively supported model classes.
|
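The three-element list can be assembled from information stored in a fitted model. The sketch below does this for an lm() object, again used only as a stand-in. The function name get_model_specs_custom is hypothetical, and returning factor_levels as a named list with one entry per feature (NULL for non-categorical features) is an assumption about a reasonable shape.

```r
# Small made-up data set with one numeric and one categorical feature
df <- data.frame(
  y  = c(1.2, 2.3, 2.9, 4.1, 5.2, 5.8),
  x1 = c(0.5, 1.1, 1.4, 2.2, 2.4, 3.1),
  x2 = factor(c("a", "b", "a", "b", "a", "b"))
)
model <- lm(y ~ x1 + x2, data = df)

get_model_specs_custom <- function(model) {
  # Feature names (equal to the term labels for a main-effects formula)
  labels <- attr(model$terms, "term.labels")
  # Class of each feature as recorded when the model was fitted
  classes <- attr(model$terms, "dataClasses")[labels]
  # Levels for categorical features; NULL for the others (assumed shape)
  factor_levels <- setNames(vector("list", length(labels)), labels)
  factor_levels[names(model$xlevels)] <- model$xlevels
  list(labels = labels, classes = classes, factor_levels = factor_levels)
}

specs <- get_model_specs_custom(model)
```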
verbose |
String vector or NULL.
Specifies the verbosity (printout detail level) through one or more of strings "basic" , "progress" ,
"convergence" , "shapley" and "vS_details" .
"basic" (default) displays basic information about the computation which is being performed,
in addition to some messages about parameters being set or checks being unavailable due to specific input.
"progress" displays information about where in the calculation process the function currently is.
"convergence" displays information on how close to convergence the Shapley value estimates are
(only when iterative = TRUE).
"shapley" displays intermediate Shapley value estimates and standard deviations (only when iterative = TRUE )
and the final estimates.
"vS_details" displays information about the v_S estimates.
This is most relevant for approach %in% c("regression_separate", "regression_surrogate", "vaeac").
NULL means no printout.
Note that any combination of these strings can be used.
E.g. verbose = c("basic", "vS_details") will display basic information + details about the v(S)-estimation process.
|
|
Named list.
Specifies extra arguments related to the computation of the Shapley values.
See get_extra_comp_args_default() for description of the arguments and their default values.
|
iterative_args |
Named list.
Specifies the arguments for the iterative procedure.
See get_iterative_args_default() for description of the arguments and their default values.
|
output_args |
Named list.
Specifies certain arguments related to the output of the function.
See get_output_args_default() for description of the arguments and their default values.
|
... |
Arguments passed on to setup_approach.categorical , setup_approach.copula , setup_approach.ctree , setup_approach.empirical , setup_approach.gaussian , setup_approach.independence , setup_approach.timeseries , setup_approach.vaeac
categorical.joint_prob_dt Data.table. (Optional)
Containing the joint probability distribution for each combination of feature
values.
NULL means it is estimated from the x_train and x_explain .
categorical.epsilon Numeric value. (Optional)
If categorical.joint_prob_dt is not supplied, probabilities/frequencies are
estimated using x_train. If certain observations occur in x_explain and NOT in x_train,
then epsilon is used as the proportion of times that these observations occur in the training data.
In theory, this proportion should be zero, but this causes an error later in the Shapley computation.
internal List.
Not used directly, but passed through from explain() .
ctree.mincriterion Numeric scalar or vector.
Either a scalar or vector of length equal to the number of features in the model.
The value is equal to 1 - \alpha where \alpha is the nominal level of the conditional independence tests.
If it is a vector, this indicates which value to use when conditioning on various numbers of features.
The default value is 0.95.
ctree.minsplit Numeric scalar.
Determines the minimum value that the sum of the left and right daughter nodes must reach for a split to be made.
The default value is 20.
ctree.minbucket Numeric scalar.
Determines the minimum sum of weights in a terminal node required for a split.
The default value is 7.
ctree.sample Boolean.
If TRUE (default), then the method always samples n_MC_samples observations from the leaf nodes
(with replacement).
If FALSE and the number of observations in the leaf node is less than n_MC_samples ,
the method will take all observations in the leaf.
If FALSE and the number of observations in the leaf node is more than n_MC_samples ,
the method will sample n_MC_samples observations (with replacement).
This means that there will always be sampling in the leaf unless
ctree.sample = FALSE and the number of observations in the node is less than n_MC_samples.
empirical.type Character. (default = "fixed_sigma" )
Should be equal to either "independence", "fixed_sigma", "AICc_each_k", or "AICc_full".
"independence" is deprecated. Use approach = "independence" instead.
"fixed_sigma" uses a fixed bandwidth (set through empirical.fixed_sigma ) in the kernel density estimation.
"AICc_each_k" and "AICc_full" optimize the bandwidth using the AICc criterion, with respectively
one bandwidth per coalition size and one bandwidth for all coalition sizes.
empirical.eta Numeric scalar.
Needs to be 0 < eta <= 1 .
The default value is 0.95.
Represents the minimum proportion of the total empirical weight that data samples should use.
If e.g. eta = .8 we will choose the K samples with the largest weight so that the sum of the weights
accounts for 80% of the total weight.
eta is the \eta parameter in equation (15) of
Aas et al. (2021).
empirical.fixed_sigma Positive numeric scalar.
The default value is 0.1.
Represents the kernel bandwidth in the distance computation used when conditioning on all different coalitions.
Only used when empirical.type = "fixed_sigma".
empirical.n_samples_aicc Positive integer.
Number of samples to consider in AICc optimization.
The default value is 1000.
Only used when empirical.type is either "AICc_each_k" or "AICc_full".
empirical.eval_max_aicc Positive integer.
Maximum number of iterations when optimizing the AICc.
The default value is 20.
Only used when empirical.type is either "AICc_each_k" or "AICc_full".
empirical.start_aicc Numeric.
Start value of the sigma parameter when optimizing the AICc.
The default value is 0.1.
Only used when empirical.type is either "AICc_each_k" or "AICc_full".
empirical.cov_mat Numeric matrix. (Optional)
The covariance matrix of the data generating distribution used to define the Mahalanobis distance.
NULL means it is estimated from x_train .
gaussian.mu Numeric vector. (Optional)
Containing the mean of the data generating distribution.
NULL means it is estimated from x_train.
gaussian.cov_mat Numeric matrix. (Optional)
Containing the covariance matrix of the data generating distribution.
NULL means it is estimated from x_train.
timeseries.fixed_sigma Positive numeric scalar.
Represents the kernel bandwidth in the distance computation.
The default value is 2.
timeseries.bounds Numeric vector of length two.
Specifies the lower and upper bounds of the timeseries.
The default is c(NULL, NULL) , i.e. no bounds.
If one or both of these bounds are not NULL , we restrict the sampled time series to be between these bounds.
This is useful if the underlying time series are scaled between 0 and 1, for example.
vaeac.depth Positive integer (default is 3 ). The number of hidden layers
in the neural networks of the masked encoder, full encoder, and decoder.
vaeac.width Positive integer (default is 32 ). The number of neurons in each
hidden layer in the neural networks of the masked encoder, full encoder, and decoder.
vaeac.latent_dim Positive integer (default is 8 ). The number of dimensions in the latent space.
vaeac.lr Positive numeric (default is 0.001 ). The learning rate used in the torch::optim_adam() optimizer.
vaeac.activation_function A torch::nn_module() representing an activation function such as, e.g.,
torch::nn_relu() (default), torch::nn_leaky_relu() , torch::nn_selu() , or torch::nn_sigmoid() .
vaeac.n_vaeacs_initialize Positive integer (default is 4 ). The number of different vaeac models to initialize
at the start. The best performing one after vaeac.extra_parameters$epochs_initiation_phase
epochs (default is 2 ) is picked and training continues with that one.
vaeac.epochs Positive integer (default is 100 ). The number of epochs to train the final vaeac model.
This includes vaeac.extra_parameters$epochs_initiation_phase , where the default is 2 .
vaeac.extra_parameters Named list with extra parameters to the vaeac approach. See
vaeac_get_extra_para_default() for description of possible additional parameters and their default values.
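The role of empirical.eta described in this argument list, choosing the K samples with the largest weights so that they cover a proportion eta of the total weight, can be sketched as follows. The weights are made up, and capping K at n_MC_samples follows the description of n_MC_samples for approach = "empirical"; the actual implementation may differ in its details.

```r
# Hypothetical (unnormalised) empirical weights for six training samples
weights <- c(35, 25, 15, 10, 9, 6)
eta <- 0.8
n_MC_samples <- 1000   # upper limit on K

# Sort by weight and find the smallest K whose cumulative share reaches eta
w_sorted <- sort(weights, decreasing = TRUE)
cum_share <- cumsum(w_sorted) / sum(w_sorted)
K <- min(which(cum_share >= eta)[1], n_MC_samples)
```

Here the three largest weights cover 75% of the total and the four largest cover 85%, so K is 4.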
|
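The leaf-sampling rule described for ctree.sample can be written as a small function. This is a sketch of the documented behaviour, not the package's code; sample_leaf and leaf_obs are hypothetical names, and the case where the leaf holds exactly n_MC_samples observations is not covered by the text, so sampling is assumed there.

```r
sample_leaf <- function(leaf_obs, n_MC_samples, ctree.sample = TRUE) {
  # Only case without sampling: ctree.sample = FALSE and a small leaf
  if (!ctree.sample && length(leaf_obs) < n_MC_samples) {
    return(leaf_obs)
  }
  # Otherwise draw n_MC_samples observations with replacement
  leaf_obs[sample.int(length(leaf_obs), n_MC_samples, replace = TRUE)]
}

set.seed(1)
small_leaf <- 1:5
large_leaf <- 1:50

res_small_sampled <- sample_leaf(small_leaf, 10, ctree.sample = TRUE)
res_small_all <- sample_leaf(small_leaf, 10, ctree.sample = FALSE)
res_large <- sample_leaf(large_leaf, 10, ctree.sample = FALSE)
```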
As any autoregressive forecast model will require a set of lags to make a forecast at an
arbitrary point in time, explain_y_lags and explain_xreg_lags define how many lags
are required to "refit" the model at any given time index. This allows the different
approaches to work in the same way they do for time-invariant models.
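What "a set of lags at an arbitrary point in time" means in practice can be shown with stats::embed(), which turns a series into one row of lagged values per time index. The series and the column names below are made up, and the layout used internally by the package may differ.

```r
# Short made-up series
y <- c(10, 11, 13, 12, 15, 16, 18)
explain_y_lags <- 2

# embed() with dimension lags + 1 returns rows of the form (y_t, y_{t-1}, y_{t-2})
lagged <- embed(y, explain_y_lags + 1)
colnames(lagged) <- c("y", paste0("y.", seq_len(explain_y_lags)))

lagged
```

Each row holds the value at one time index together with its two preceding values, which is the information an autoregressive model of order two needs to produce a forecast from that index. The first explain_y_lags time points are lost because their lags are not available.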