wroc {svyROC}    R Documentation

Estimation of the ROC curve of logistic regression models with complex survey data

Description

Calculate the ROC curve of a logistic regression model fitted to complex survey data, taking the sampling weights into account.

Usage

wroc(
  response.var,
  phat.var,
  weights.var = NULL,
  tag.event = NULL,
  tag.nonevent = NULL,
  data = NULL,
  design = NULL,
  cutoff.method = NULL
)

Arguments

response.var

A character string with the name of the column indicating the response variable in the data set or a vector (either numeric or character string) with information of the response variable for all the units.

phat.var

A character string with the name of the column indicating the estimated probabilities in the data set or a numeric vector containing estimated probabilities for all the units.

weights.var

A character string indicating the name of the column with sampling weights or a numeric vector containing information of the sampling weights. It could be NULL if the sampling design is indicated in the design argument. For unweighted estimates, set all the sampling weight values to 1.

tag.event

A character string indicating the label used to indicate the event of interest in response.var. The default option is tag.event = NULL, which selects the class with the lowest number of units as event.

tag.nonevent

A character string indicating the label used for non-event in response.var. The default option is tag.nonevent = NULL, which selects the class with the greatest number of units as non-event.

data

A data frame which, at least, must incorporate information on the columns response.var, phat.var and weights.var. If data=NULL, then specific numerical vectors must be included in response.var, phat.var and weights.var, or the sampling design should be indicated in the argument design.

design

An object of class survey.design generated by survey::svydesign indicating the complex sampling design of the data. If design = NULL, information on the data set (argument data) and/or sampling weights (argument weights.var) must be included.

cutoff.method

A character string indicating the method to be used to select the optimal cut-off point. If cutoff.method = NULL, then no optimal cut-off point is calculated. If an optimal cut-off point is to be calculated, one of the following methods needs to be selected: Youden, MaxProdSpSe, ROC01, MaxEfficiency.
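
To make the idea of an optimal cut-off point concrete, here is a minimal base R sketch of one criterion. It assumes the usual definition of the Youden index, J(c) = Se(c) + Sp(c) - 1, evaluated with weighted sensitivity and specificity. The function name youden_cutoff and the toy data are hypothetical and not part of svyROC; this is an illustration under those assumptions, not the package's implementation.

```r
# Hypothetical sketch (not svyROC code): weighted Youden index
# J(k) = Se(k) + Sp(k) - 1, maximised over the observed probabilities.
youden_cutoff <- function(y, phat, w) {
  cuts <- sort(unique(phat))  # candidate cut-off points
  j <- sapply(cuts, function(k) {
    se <- sum(w[y == 1 & phat >= k]) / sum(w[y == 1])  # weighted sensitivity
    sp <- sum(w[y == 0 & phat <  k]) / sum(w[y == 0])  # weighted specificity
    se + sp - 1
  })
  best <- which.max(j)  # first maximum if there are ties
  list(cutoff = cuts[best], youden = j[best])
}

# Toy data, made up for illustration
y    <- c(1, 1, 0, 0, 0)
phat <- c(0.9, 0.4, 0.6, 0.3, 0.1)
w    <- c(10, 30, 20, 20, 20)

res <- youden_cutoff(y, phat, w)
res
```

Taking only the observed probabilities as candidates is a simplification chosen for this sketch; the documentation of wocp() describes how the package itself handles cut-off selection.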

Details

Let S indicate a sample of n observations of the vector of random variables (Y,\pmb X), and \forall i=1,\ldots,n, let y_i indicate the i^{th} observation of the response variable Y and \pmb x_i the observations of the vector of covariates \pmb X. Let w_i indicate the sampling weight corresponding to unit i and \hat p_i the estimated probability of the event. Let S_0 and S_1 be the subsamples of S formed by the units without the event of interest (y_i=0) and with the event of interest (y_i=1), respectively. Then, the ROC curve is estimated as follows:

\widehat{ROC}_w(\cdot)=\{(1-\widehat{Sp}_w(c),\widehat{Se}_w(c)),\:c\in (-\infty, \infty)\}

where the sensitivity and specificity for a given cut-off point c are estimated as follows:

\widehat{Se}_w(c)=\dfrac{\sum_{i\in S_1}w_i\cdot I (\hat p_i\geq c)}{\sum_{i\in S_1}w_i}\:;\:\widehat{Sp}_w(c)=\dfrac{\sum_{i\in S_0}w_i\cdot I (\hat p_i<c)}{\sum_{i\in S_0}w_i}.

See Iparragirre et al. (2023) for more information. Further details on the remaining elements are given in the documentation of the functions wauc() and wocp().
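
As an illustration of the estimators above, the following base R sketch computes the weighted sensitivity and specificity at a given cut-off point and traces the corresponding ROC points. The helper names w_se and w_sp and the toy data are hypothetical and not part of svyROC; the internals of wroc() may differ.

```r
# Hypothetical helpers illustrating the weighted estimators; not svyROC code.
w_se <- function(y, phat, w, cutoff) {
  # Weighted share of event units (y == 1) with phat >= cutoff
  sum(w[y == 1 & phat >= cutoff]) / sum(w[y == 1])
}
w_sp <- function(y, phat, w, cutoff) {
  # Weighted share of non-event units (y == 0) with phat < cutoff
  sum(w[y == 0 & phat < cutoff]) / sum(w[y == 0])
}

# Toy data, made up for illustration
y    <- c(1, 1, 0, 0, 0)
phat <- c(0.9, 0.4, 0.6, 0.3, 0.1)
w    <- c(10, 30, 20, 20, 20)

# Cut-offs: every observed probability, plus -Inf and Inf for the end points
cuts <- c(-Inf, sort(unique(phat)), Inf)
roc <- data.frame(
  cutoff = cuts,
  fpr    = sapply(cuts, function(k) 1 - w_sp(y, phat, w, k)),  # 1 - Sp_w
  tpr    = sapply(cuts, function(k) w_se(y, phat, w, k))       # Se_w
)
roc
```

Setting all entries of w to 1 reduces both helpers to the usual unweighted proportions, which matches the note on unweighted estimates in the description of weights.var.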

Value

The output object of this function is a list of class wroc, which contains information about the weighted ROC curve of a logistic regression model and some of its components. In particular, this list contains a total of 5 or 6 elements (depending on the selected arguments) with the following information:

References

Iparragirre, A., Barrio, I. and Arostegui, I. (2023). Estimation of the ROC curve and the area under it with complex survey data. Stat 12(1), e635. (https://doi.org/10.1002/sta4.635)

Examples

data(example_data_wroc)

mycurve <- wroc(response.var = "y", phat.var = "phat", weights.var = "weights",
                data = example_data_wroc,
                tag.event = 1, tag.nonevent = 0,
                cutoff.method = "Youden")

# Or equivalently

mycurve <- wroc(response.var = example_data_wroc$y,
                phat.var = example_data_wroc$phat,
                weights.var = example_data_wroc$weights,
                tag.event = 1, tag.nonevent = 0,
                cutoff.method = "Youden")


[Package svyROC version 1.0.0 Index]