
AUC_roc estimates the area under the receiver operating characteristic (ROC) curve for a nominal/categorical predicted-observed dataset using the Mann-Whitney U-statistic.


AUC_roc(data = NULL, obs, pred, tidy = FALSE, na.rm = TRUE)



(Optional) argument to call an existing data frame containing the data.


Vector with observed values (character | factor).


Vector with predicted values (character | factor).


Logical operator (TRUE/FALSE) to decide the type of return. TRUE returns a data.frame, FALSE returns a list (default).


Logical argument to remove rows with missing values (NA). Default is na.rm = TRUE.


An object of class numeric within a list (if tidy = FALSE) or within a data frame (if tidy = TRUE).


The AUC tests whether positives are ranked higher than negatives. It is simply the area under the ROC curve.

The AUC estimation of this function follows the procedure described by Hand & Till (2001). The AUC_roc estimated following the trapezoid approach is equivalent to the average between recall and specificity (Powers, 2011), which is equivalent to the balanced accuracy (balacc):

\(AUC_{roc} = \frac{recall - FPR + 1}{2} = \frac{recall + specificity}{2} = 1 - \frac{FPR + FNR}{2}\)
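As a quick check of the identity above, the following base R sketch computes the three expressions from a small confusion count. The toy vectors and the choice of "True" as the positive class are assumptions made for illustration; this is not the function's internal code.

```r
# Hypothetical toy data: hard class labels (not probabilities)
obs  <- c("True", "True", "True", "True",
          "False", "False", "False", "False", "False", "False")
pred <- c("True", "True", "True", "False",
          "False", "False", "False", "False", "True", "True")

pos <- "True"  # assumed positive class

# Confusion-matrix cells
tp <- sum(obs == pos & pred == pos)
fn <- sum(obs == pos & pred != pos)
tn <- sum(obs != pos & pred != pos)
fp <- sum(obs != pos & pred == pos)

recall      <- tp / (tp + fn)
specificity <- tn / (tn + fp)
fpr <- 1 - specificity  # false positive rate
fnr <- 1 - recall       # false negative rate

# The three equivalent forms of the formula
auc_a <- (recall - fpr + 1) / 2
auc_b <- (recall + specificity) / 2
auc_c <- 1 - (fpr + fnr) / 2

all.equal(auc_a, auc_b)  # TRUE
all.equal(auc_b, auc_c)  # TRUE
```

With these counts (TP = 3, FN = 1, TN = 4, FP = 2) all three expressions evaluate to 17/24.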

Interpretation: the AUC is equivalent to the probability that a randomly chosen case from a given class (positive for binary) will have a smaller estimated probability of belonging to another class (negative for binary) than a randomly chosen member of that other class.
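The ranking interpretation can be illustrated with a minimal base R sketch. It assumes hard class predictions coded as 0/1 scores and counts tied pairs as one half, which is the usual Mann-Whitney convention; the toy vectors are hypothetical and the code is not taken from the package.

```r
# Hypothetical toy data: hard class labels (not probabilities)
obs  <- c("True", "True", "True", "True",
          "False", "False", "False", "False", "False", "False")
pred <- c("True", "True", "True", "False",
          "False", "False", "False", "False", "True", "True")

pos <- "True"                     # assumed positive class
score <- as.numeric(pred == pos)  # 1 if predicted positive, 0 otherwise

n_pos <- sum(obs == pos)
n_neg <- sum(obs != pos)

# Mann-Whitney U from midranks (rank() averages ties by default)
r <- rank(score)
u <- sum(r[obs == pos]) - n_pos * (n_pos + 1) / 2
auc_u <- u / (n_pos * n_neg)

# Same quantity by comparing every positive-negative pair directly
s_pos <- score[obs == pos]
s_neg <- score[obs != pos]
auc_pairs <- mean(outer(s_pos, s_neg, ">") + 0.5 * outer(s_pos, s_neg, "=="))

all.equal(auc_u, auc_pairs)  # TRUE
```

Both routes give 17/24 for this toy data, the same value as the average of recall (3/4) and specificity (4/6).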

Values: the AUC is bounded between 0 and 1. The closer to 1 the better. Values close to 0 indicate inaccurate predictions. An AUC = 0.5 suggests no discrimination ability between classes; 0.7 < AUC < 0.8 is considered acceptable, 0.8 < AUC < 0.9 is considered excellent, and AUC > 0.9 is outstanding (Mandrekar, 2010).

For the multiclass cases, the AUC is equivalent to the average of AUC of each class (Hand & Till, 2001).
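One way to read the multiclass statement is as a one-vs-rest average: compute a per-class AUC by treating each class in turn as the positive class, then take the mean. The sketch below follows that reading with hypothetical toy data and a hypothetical helper, `auc_one_vs_rest()`; it is an assumption about the averaging scheme, not a description of the package internals.

```r
# Hypothetical three-class toy data
obs  <- c("Red", "Red", "Red", "Blue", "Blue", "Blue", "Green", "Green", "Green")
pred <- c("Red", "Red", "Blue", "Blue", "Blue", "Green", "Green", "Green", "Red")

classes <- sort(unique(obs))

# Hypothetical helper: AUC for class k against all other classes,
# using the average of recall and specificity
auc_one_vs_rest <- function(k) {
  recall      <- mean(pred[obs == k] == k)  # share of class k predicted as k
  specificity <- mean(pred[obs != k] != k)  # share of non-k not predicted as k
  (recall + specificity) / 2
}

per_class <- vapply(classes, auc_one_vs_rest, numeric(1))
auc_multi <- mean(per_class)

per_class
auc_multi
```

In this toy data every class has recall 2/3 and specificity 5/6, so each per-class value and their mean equal 0.75.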

Finally, the AUC is directly related to the Gini index (a.k.a. G1), since Gini + 1 = 2*AUC (Hand & Till, 2001).

For the formula and more details, see online-documentation


Hanley, J.A., McNeil, B.J. (1982). The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143(1): 29-36. doi:10.1148/radiology.143.1.7063747

Hand, D.J., Till, R.J. (2001). A simple generalisation of the area under the ROC curve for multiple class classification problems. Machine Learning 45: 171-186. doi:10.1023/A:1010920819831

Mandrekar, J.N. (2010). Receiver operating characteristic curve in diagnostic test assessment. J. Thoracic Oncology 5(9): 1315-1316. doi:10.1097/JTO.0b013e3181ec173d

Powers, D.M.W. (2011). Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness & Correlation. Journal of Machine Learning Technologies 2(1): 37–63. doi:10.48550/arXiv.2010.16061


# \donttest{
# Two-class
binomial_case <- data.frame(
  labels = sample(c("True", "False"), 100, replace = TRUE),
  predictions = sample(c("True", "False"), 100, replace = TRUE)
)
# Multi-class
multinomial_case <- data.frame(
  labels = sample(c("Red", "Blue", "Green"), 100, replace = TRUE),
  predictions = sample(c("Red", "Blue", "Green"), 100, replace = TRUE)
)

# Get AUC estimate for two-class case
AUC_roc(data = binomial_case, obs = labels, pred = predictions, tidy = TRUE)
#>     AUC_roc
#> 1 0.5692277

# Get AUC estimate for multi-class case
AUC_roc(data = multinomial_case, obs = labels, pred = predictions, tidy = TRUE)
#>     AUC_roc
#> 1 0.4582371
# }