sa_diff {staccuracy} | R Documentation |
Statistical tests for the differences between standardized accuracies (staccuracies)
Description
Because the distribution of staccuracies is uncertain (and indeed, different staccuracies likely have different distributions), bootstrapping is used to empirically estimate the distributions and calculate the p-values. See the return value description for details on what the function provides.
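The bootstrapping idea described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: toy_sa is a hypothetical stand-in metric (it is not how staccuracy defines any of its measures), and boot_sa_diff is a hypothetical helper that resamples rows with replacement and recomputes the difference between two models on each resample.

```r
# Hypothetical stand-in metric (NOT the staccuracy package's definition):
# 1 minus the ratio of the model's mean absolute error to the mean absolute
# error of a constant prediction at mean(actual).
toy_sa <- function(actual, pred) {
  1 - mean(abs(actual - pred)) / mean(abs(actual - mean(actual)))
}

# Paired bootstrap: resample row indices with replacement, score both models
# on the same resample, and keep the difference between the two scores.
boot_sa_diff <- function(actual, pred1, pred2, boot_it = 1000, seed = 0) {
  set.seed(seed)
  n <- length(actual)
  vapply(seq_len(boot_it), function(i) {
    idx <- sample.int(n, n, replace = TRUE)
    toy_sa(actual[idx], pred1[idx]) - toy_sa(actual[idx], pred2[idx])
  }, numeric(1))
}

# Two linear models on the built-in attitude data
fit_all <- lm(rating ~ ., data = attitude)
fit_noc <- lm(rating ~ . - complaints, data = attitude)

# Empirical distribution of the difference in the stand-in metric
diffs <- boot_sa_diff(
  attitude$rating,
  predict(fit_all),
  predict(fit_noc),
  boot_it = 200
)
summary(diffs)
```

Resampling the same row indices for both models keeps the comparison paired, so each bootstrapped difference reflects the two models scored on identical data.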
Usage
sa_diff(
  actual,
  preds,
  ...,
  na.rm = FALSE,
  sa = NULL,
  pct = c(0.01, 0.02, 0.03, 0.04, 0.05),
  boot_alpha = 0.05,
  boot_it = 1000,
  seed = 0
)
Arguments
actual | numeric vector. The actual (true) labels. |
preds | named list of at least two numeric vectors. Each element is a vector of predictions with the same length as actual, aligned element by element with actual. The names of the list elements should be the names of the models that produced the respective predictions; these names are used to distinguish the results. |
... | not used. Forces explicit naming of subsequent arguments. |
na.rm | See documentation for |
sa | list of functions. Each element is the unquoted name of a valid staccuracy function (see |
pct | numeric vector with values in (0, 1). The percentage values on which the difference in staccuracies will be tested. |
boot_alpha | numeric(1) from 0 to 1. Alpha for percentile-based confidence interval range for the bootstrapped means; the bootstrap confidence intervals will be the lowest and highest |
boot_it | positive integer(1). The number of bootstrap iterations. |
seed | integer(1). Random seed for the bootstrap sampling. Supply the same value between runs to ensure identical results. |
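As a rough illustration of what a percentile-based bootstrap interval governed by boot_alpha might look like, the sketch below uses simulated values in place of real bootstrap output. It assumes that boot_alpha is split evenly between the two tails; the argument description here does not spell that out, so treat the split as an assumption rather than the package's documented behaviour.

```r
# Simulated stand-in for bootstrapped staccuracies (not real sa_diff output)
set.seed(0)
boot_vals <- rnorm(1000, mean = 0.75, sd = 0.02)

boot_alpha <- 0.05

# Assumption: boot_alpha is divided equally between the lower and upper tails
ci <- quantile(
  boot_vals,
  probs = c(boot_alpha / 2, 1 - boot_alpha / 2),
  names = FALSE
)

# Lower bound, bootstrap mean, and upper bound
result <- c(lo = ci[1], mean = mean(boot_vals), hi = ci[2])
result
```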
Value
tibble with staccuracy difference results:
- staccuracy: name of the staccuracy measure.
- pred: Each named element (model name) in the input preds. The row values give the staccuracy for that prediction. When pred is NA, the row represents the difference between prediction staccuracies (diff) instead of the staccuracies themselves.
- diff: When diff takes the form 'model1-model2', the row values give the difference in staccuracies between two named elements (model names) in the input preds. When diff is NA, the row instead represents the staccuracy of a specific model prediction (pred).
- lo, mean, hi: The lower bound, mean, and upper bound of the bootstrapped staccuracy. The lower and upper bounds are the limits of the confidence interval specified by the input boot_alpha.
- p__: p-values for the test that the difference in staccuracies is at least the specified percentage amount. E.g., for the default input pct = c(0.01, 0.02, 0.03, 0.04, 0.05), these columns would be p01, p02, p03, p04, and p05. As they apply only to differences between staccuracies, they are provided only for diff rows and are NA for pred rows. As an example of their meaning, if the mean difference for 'model1-model2' is 0.0832 with p01 of 0.012 and p02 of 0.035, then 1.2% of bootstrapped staccuracies had a model1 - model2 difference of less than 0.01 and 3.5% were less than 0.02. (That is, 98.8% of differences were greater than 0.01 and 96.5% were greater than 0.02.)
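Following the interpretation given for the p__ columns, each p-value can be read as the share of bootstrapped differences that fall below the corresponding pct threshold. The sketch below illustrates that reading with simulated differences; the values are made up for illustration, and the column-naming rule is inferred from the p01 to p05 example rather than taken from the package source.

```r
# Simulated bootstrapped 'model1-model2' differences (not real sa_diff output)
set.seed(42)
boot_diffs <- rnorm(1000, mean = 0.0832, sd = 0.03)

pct <- c(0.01, 0.02, 0.03, 0.04, 0.05)

# Share of bootstrapped differences that fall below each threshold
p_vals <- vapply(pct, function(p) mean(boot_diffs < p), numeric(1))

# Assumed naming rule: "p" followed by the two-digit percentage
names(p_vals) <- sprintf("p%02d", as.integer(round(pct * 100)))
p_vals
```

Because the thresholds increase, the resulting proportions can only stay the same or grow from p01 to p05.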
Examples
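# Fit a full model and two reduced models on the built-in attitude data
lm_attitude_all <- lm(rating ~ ., data = attitude)
lm_attitude__a <- lm(rating ~ . - advance, data = attitude)
lm_attitude__c <- lm(rating ~ . - complaints, data = attitude)

# Compare the staccuracies of the three sets of predictions
# (boot_it = 10 is far below the default of 1000)
sdf <- sa_diff(
  attitude$rating,
  list(
    all = predict(lm_attitude_all),
    madv = predict(lm_attitude__a),
    mcmp = predict(lm_attitude__c)
  ),
  boot_it = 10
)
sdf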