pinterval_conformal {pintervals}    R Documentation

Conformal Prediction Intervals of Continuous Values

Description

This function calculates conformal prediction intervals with a confidence level of 1-alpha for a vector of (continuous) predicted values using inductive conformal prediction. The intervals are computed using either a calibration set with predicted and true values or a set of pre-computed non-conformity scores from the calibration set. The function returns a tibble containing the predicted values along with the lower and upper bounds of the prediction intervals.
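
As a rough illustration of the inductive (split) conformal idea described above, the following base R sketch builds symmetric intervals from absolute-error scores on a calibration set. The helper name split_conformal_sketch and the output column names are hypothetical. This is the textbook procedure under stated assumptions, not the package's internal implementation, which performs a grid search between the bounds.

```r
# Hypothetical helper, not part of pintervals: textbook split conformal
# intervals using the absolute error as the nonconformity score.
split_conformal_sketch <- function(pred, calib, calib_truth, alpha = 0.1) {
  scores <- abs(calib - calib_truth)   # nonconformity scores on calibration data
  n <- length(scores)
  # Finite-sample corrected rank of the score used as the interval half-width
  k <- ceiling((n + 1) * (1 - alpha))
  # With too few calibration points the interval is unbounded
  q <- if (k > n) Inf else sort(scores)[k]
  # Column names are illustrative assumptions
  data.frame(pred = pred, lower_bound = pred - q, upper_bound = pred + q)
}

# Simulated calibration data (made up for illustration)
set.seed(1)
calib_truth <- rnorm(200)
calib <- calib_truth + rnorm(200, sd = 0.5)
res <- split_conformal_sketch(c(0, 1), calib, calib_truth, alpha = 0.1)
res
```

With the absolute-error score every interval in this sketch has the same width; other scores can produce intervals whose width varies with the prediction.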

Usage

pinterval_conformal(
  pred,
  calib = NULL,
  calib_truth = NULL,
  alpha = 0.1,
  ncs_function = "absolute_error",
  weighted_cp = FALSE,
  ncs = NULL,
  lower_bound = NULL,
  upper_bound = NULL,
  min_step = 0.01,
  grid_size = NULL,
  return_min_q = FALSE
)

Arguments

pred

Vector of predicted values

calib

A numeric vector of predicted values in the calibration partition, or a two-column tibble or matrix with the predicted values in the first column and the true values in the second

calib_truth

A numeric vector of true values in the calibration partition. Only required if calib is a numeric vector

alpha

The miscoverage (significance) level for the prediction intervals, which are built at a confidence level of 1-alpha. Must be a single numeric value between 0 and 1. Default is 0.1

ncs_function

A function or a character string matching a function that takes two arguments, a vector of predicted values and a vector of true values, in that order. The function should return a numeric vector of nonconformity scores. Default is 'absolute_error' which returns the absolute difference between the predicted and true values.
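
A custom score function only needs to follow the signature described above. The sketch below is a minimal example; relative_error_ncs is a hypothetical name chosen for illustration and is not claimed to be one of the package's built-in scores.

```r
# Hypothetical custom score: absolute error scaled by the size of the prediction.
# Arguments follow the documented order: predicted values first, then true values.
relative_error_ncs <- function(pred, truth) {
  # pmax() guards against division by zero when a prediction is exactly 0
  abs(pred - truth) / pmax(abs(pred), .Machine$double.eps)
}

scores <- relative_error_ncs(c(2, 4), c(1, 5))
scores  # 0.50 0.25
```

A function of this shape could then be supplied as ncs_function = relative_error_ncs.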

weighted_cp

Logical. If TRUE, the function will use weighted conformal prediction. Default is FALSE. Experimental, use with caution.

ncs

A numeric vector of pre-computed nonconformity scores from a calibration partition. If provided, calib will be ignored
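
For illustration, pre-computed scores matching the default absolute-error score could be obtained as follows. The calibration values are made up for the example.

```r
# Toy calibration data (made up for illustration)
calib <- c(1.0, 2.5, 4.0)        # predicted values in the calibration partition
calib_truth <- c(1.5, 2.0, 4.0)  # corresponding true values

# Absolute-error nonconformity scores, as described for the default ncs_function
ncs <- abs(calib - calib_truth)
ncs  # 0.5 0.5 0.0
```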

lower_bound

Optional minimum value for the prediction intervals. If not provided, the minimum (true) value of the calibration partition will be used

upper_bound

Optional maximum value for the prediction intervals. If not provided, the maximum (true) value of the calibration partition will be used

min_step

The minimum step size for the grid search. Default is 0.01. Useful to change if predictions are made on a discrete grid or if the resolution of the interval is too coarse or too fine.

grid_size

Alternative to min_step, the number of points to use in the grid search between the lower and upper bound. If provided, min_step will be ignored.
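
The relationship between min_step and grid_size can be sketched in base R, assuming the candidate values form an evenly spaced grid between the lower and upper bound. That assumption and the variable names are illustrative; the package's internal grid construction may differ.

```r
# Illustrative bounds (assumed values)
lower_bound <- 0
upper_bound <- 10

# Grid defined by a fixed step size, analogous to min_step = 0.01
grid_by_step <- seq(lower_bound, upper_bound, by = 0.01)

# Grid defined by a fixed number of points, analogous to grid_size = 10000
grid_by_size <- seq(lower_bound, upper_bound, length.out = 10000)

length(grid_by_step)   # 1001 candidate values
diff(grid_by_size)[1]  # spacing implied by the number of points
```

In this sketch a fixed step keeps the resolution constant as the bounds widen, while a fixed number of points keeps the size of the search constant.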

return_min_q

Logical. If TRUE, the function will return the minimum quantile of the nonconformity scores for each predicted value. Default is FALSE. Primarily used for debugging purposes.

Value

A tibble with the predicted values and the lower and upper bounds of the prediction intervals.

Examples

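library(pintervals)
library(dplyr)
library(tibble)

# Simulate a log-normal outcome driven by two uniform predictors
set.seed(42)  # for reproducibility
x1 <- runif(1000)
x2 <- runif(1000)
y <- rlnorm(1000, meanlog = x1 + x2, sdlog = 0.5)
df <- tibble(x1, x2, y)

# Split into training, calibration, and test partitions
df_train <- df %>% slice(1:500)
df_cal <- df %>% slice(501:750)
df_test <- df %>% slice(751:1000)

# Fit on the log scale, then back-transform predictions to the original scale
mod <- lm(log(y) ~ x1 + x2, data = df_train)
calib <- exp(predict(mod, newdata = df_cal))
calib_truth <- df_cal$y
pred_test <- exp(predict(mod, newdata = df_test))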

pinterval_conformal(
  pred_test,
  calib = calib,
  calib_truth = calib_truth,
  alpha = 0.1,
  lower_bound = 0,
  grid_size = 10000
)


[Package pintervals version 0.7.7 Index]