classif_model_compare {clinpubr}    R Documentation
Performance comparison of classification models
Description
Compare the performance of classification models using commonly used metrics, and generate standard plots: the receiver operating characteristic (ROC) curve, the decision curve analysis (DCA) plot, and the calibration plot.
Usage
classif_model_compare(
  data,
  target_var,
  model_names,
  colors = NULL,
  save_output = FALSE,
  figure_type = "png",
  output_prefix = "model_compare",
  as_probability = FALSE,
  auto_order = TRUE
)
Arguments
data
A data frame containing the target variable and the predicted values.
target_var
A string specifying the name of the target variable in the data frame.
model_names
A character vector specifying the names of the models to compare.
colors
A vector of colors to use for the plots. The last two colors are used for the "Treat all" and "Treat none" lines in the DCA plot.
save_output
A logical value indicating whether to write the results to files.
figure_type
A character string giving the figure file type used when saving plots; defaults to "png".
output_prefix
A string specifying the prefix for the output files.
as_probability
A logical value or a vector of variable names. The logical value indicates whether to rescale variables outside the range 0 to 1 into that range; a vector of variable names rescales only those variables (see the sketch following this list).
auto_order
A logical value indicating whether to automatically order the models by their AUCs.
If FALSE, the models keep the order given in model_names.
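The exact conversion performed by as_probability is internal to the package; the sketch below shows one plausible approach (simple min-max scaling, which is an assumption here) for mapping scores outside 0 to 1, such as linear predictors, into that range:

# Hypothetical min-max rescaling of a score vector into [0, 1];
# the conversion used internally by as_probability may differ.
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}
scores <- rnorm(50)        # e.g. linear-predictor scores outside [0, 1]
range(rescale01(scores))   # 0 1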
Value
A list of various results. If the output files are not in the desired format, these results can be modified for further use.
metric_table: A data frame containing the performance metrics for each model.
roc_plot: A ggplot object of the ROC curves.
dca_plot: A ggplot object of the decision curve analysis plots.
calibration_plot: A ggplot object of the calibration plots.
Metrics
AUC: Area Under the Receiver Operating Characteristic Curve
Accuracy: Overall accuracy
Sensitivity: True positive rate
Specificity: True negative rate
Pos Pred Value: Positive predictive value
Neg Pred Value: Negative predictive value
F1: F1 score
Kappa: Cohen's kappa
Brier: Brier score
cutoff: Optimal cutoff for classification; metrics that require a cutoff are based on this value.
Youden: Youden's J statistic
HosLem: Hosmer-Lemeshow test p-value
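For reference, a few of these metrics can be computed by hand; the sketch below uses the pROC package (an assumption for illustration; the internal computations of classif_model_compare may differ):

library(pROC)
set.seed(1)
y <- rbinom(200, 1, 0.4)                        # binary outcome
p <- plogis(qlogis(0.4) + 1.5 * y + rnorm(200)) # predicted probabilities
roc_obj <- roc(y, p, quiet = TRUE)
auc(roc_obj)                                    # AUC
coords(roc_obj, "best", best.method = "youden") # cutoff maximizing Youden's J
mean((p - y)^2)                                 # Brier score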
Examples
data(cancer, package = "survival")
df <- kidney
# Death within 100 days; NA (outcome unknown) if censored before day 100
df$dead <- ifelse(df$time <= 100 & df$status == 0, NA, df$time <= 100)
# Drop id, time, and status; remove rows with unknown outcome
df <- na.omit(df[, -c(1:3)])
model0 <- glm(dead ~ age + frail, family = binomial(), data = df) # baseline model
model <- glm(dead ~ ., family = binomial(), data = df)            # full model
df$base_pred <- predict(model0, type = "response")
df$full_pred <- predict(model, type = "response")
classif_model_compare(df, "dead", c("base_pred", "full_pred"), save_output = FALSE)
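As described under Value, the returned plots are ggplot objects and the metric table is a data frame, so the result can be captured and customized further; a sketch (the theme tweak is illustrative only):

res <- classif_model_compare(df, "dead", c("base_pred", "full_pred"),
  save_output = FALSE)
res$metric_table                    # performance metrics for each model
res$roc_plot + ggplot2::theme_bw()  # restyle the ROC curve plot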