The AUC estimates the area under the receiver operating characteristic (ROC) curve for a nominal/categorical predicted-observed dataset using the Mann-Whitney U-statistic.
Arguments
- data
(Optional) data frame containing the observed and predicted values.
- obs
Vector with observed values (character | factor).
- pred
Vector with predicted values (character | factor).
- tidy
Logical argument (TRUE/FALSE) to decide the type of return: TRUE returns a data.frame, FALSE returns a list (default).
- na.rm
Logical argument to remove rows with missing values (NA). Default is na.rm = TRUE.
Value
an object of class numeric within a list (if tidy = FALSE) or within a data frame (if tidy = TRUE).
Details
The AUC tests whether positives are ranked higher than negatives. It is simply the area under the ROC curve.
The AUC estimation of this function follows the procedure described by Hand & Till (2001). The AUC_roc estimated following the trapezoid approach is equivalent to the average of recall and specificity (Powers, 2011), which is in turn equivalent to the balanced accuracy (balacc):
\(AUC_{roc} = \frac{recall - FPR + 1}{2} = \frac{recall + specificity}{2} = 1 - \frac{FPR + FNR}{2}\)
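As a quick check of this identity, the following base-R sketch (with made-up labels, not part of the package) computes recall and specificity from a 2 x 2 confusion matrix and averages them:
set.seed(1)
obs  <- sample(c("pos", "neg"), 200, replace = TRUE)  # assumed observed labels
pred <- sample(c("pos", "neg"), 200, replace = TRUE)  # assumed predicted labels
cm <- table(obs, pred)                                # 2 x 2 confusion matrix
recall      <- cm["pos", "pos"] / sum(cm["pos", ])    # TP / (TP + FN)
specificity <- cm["neg", "neg"] / sum(cm["neg", ])    # TN / (TN + FP)
(recall + specificity) / 2                            # balanced accuracy = AUC_roc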
Interpretation: the AUC is equivalent to the probability that a randomly chosen case from a given class (positive, in the binary case) receives a higher estimated probability of belonging to that class than a randomly chosen case from the other class (negative, in the binary case).
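This is the Mann-Whitney view of the AUC. A minimal base-R sketch (the numeric scores here are an assumption for illustration; the package itself works from labels) shows the equivalence between the pairwise-ranking probability and the Mann-Whitney U-statistic:
set.seed(2)
pos <- rnorm(50, mean = 1)  # assumed scores of positive cases
neg <- rnorm(50)            # assumed scores of negative cases
# Proportion of positive-negative pairs ranked correctly
mean(outer(pos, neg, ">"))
# Same estimate from the Mann-Whitney U-statistic (wilcox.test's W)
unname(wilcox.test(pos, neg)$statistic) / (length(pos) * length(neg))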
Values: the AUC is bounded between 0 and 1; the closer to 1, the better. Values close to 0 indicate inaccurate predictions. An AUC = 0.5 suggests no discrimination ability between classes; 0.7 < AUC < 0.8 is considered acceptable, 0.8 < AUC < 0.9 excellent, and AUC > 0.9 outstanding (Mandrekar, 2010).
For multiclass cases, the AUC is equivalent to the average of the AUC estimated for each pair of classes (Hand & Till, 2001).
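A hedged sketch of this pairwise averaging, following Hand & Till's M measure (the class-membership probabilities below are invented for illustration; the package works from observed and predicted labels):
set.seed(3)
classes <- c("Red", "Blue", "Green")
y <- sample(classes, 300, replace = TRUE)       # assumed observed labels
prob <- matrix(runif(300 * 3), ncol = 3,        # assumed class probabilities
               dimnames = list(NULL, classes))
prob <- prob / rowSums(prob)
# A(i|j): probability that a random class-i case scores higher for
# class i than a random class-j case does
pair_auc <- function(s, i, j) mean(outer(s[i], s[j], ">"))
M <- combn(classes, 2, function(p) {
  i <- y == p[1]; j <- y == p[2]
  (pair_auc(prob[, p[1]], i, j) + pair_auc(prob[, p[2]], j, i)) / 2
})
mean(M)  # multiclass AUC: average over all class pairs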
Finally, the AUC is directly related to the Gini-index (a.k.a. G1), since Gini + 1 = 2*AUC (Hand & Till, 2001).
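Rearranging gives a one-line conversion (a trivial sketch; the value of auc is an assumed example):
auc <- 0.75   # assumed AUC estimate
2 * auc - 1   # Gini index, from Gini + 1 = 2 * AUC
#> [1] 0.5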
For the formula and more details, see the online-documentation.
References
Hanley, J.A., McNeil, B.J. (1982). The meaning and use of the area under a receiver operating characteristic (ROC) curve. _Radiology 143(1): 29-36._ doi:10.1148/radiology.143.1.7063747
Hand, D.J., Till, R.J. (2001). A simple generalisation of the area under the ROC curve for multiple class classification problems. _Machine Learning 45: 171-186._ doi:10.1023/A:1010920819831
Mandrekar, J.N. (2010). Receiver operating characteristic curve in diagnostic test assessment. _J. Thoracic Oncology 5(9): 1315-1316._ doi:10.1097/JTO.0b013e3181ec173d
Powers, D.M.W. (2011). Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness & Correlation. _Journal of Machine Learning Technologies 2(1): 37-63._ doi:10.48550/arXiv.2010.16061
Examples
# \donttest{
set.seed(123)
# Two-class
binomial_case <- data.frame(labels = sample(c("True", "False"), 100, replace = TRUE),
                            predictions = sample(c("True", "False"), 100, replace = TRUE))
# Multi-class
multinomial_case <- data.frame(labels = sample(c("Red", "Blue", "Green"), 100, replace = TRUE),
                               predictions = sample(c("Red", "Blue", "Green"), 100, replace = TRUE))
# Get AUC estimate for two-class case
AUC_roc(data = binomial_case, obs = labels, pred = predictions, tidy = TRUE)
#> AUC_roc
#> 1 0.5692277
# Get AUC estimate for multi-class case
AUC_roc(data = multinomial_case, obs = labels, pred = predictions, tidy = TRUE)
#> AUC_roc
#> 1 0.4582371
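# Get AUC estimate as a list (default, tidy = FALSE)
AUC_roc(data = binomial_case, obs = labels, pred = predictions, tidy = FALSE)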
# }