Likelihood Ratios (Classification) — likelihood

Estimates the positive likelihood ratio (posLr), the negative likelihood ratio (negLr), and the diagnostic odds ratio (dor) for a nominal/categorical predicted-observed dataset.

Usage

posLr(
  data = NULL,
  obs,
  pred,
  pos_level = 2,
  atom = FALSE,
  tidy = FALSE,
  na.rm = TRUE
)

negLr(
  data = NULL,
  obs,
  pred,
  pos_level = 2,
  atom = FALSE,
  tidy = FALSE,
  na.rm = TRUE
)

dor(
  data = NULL,
  obs,
  pred,
  pos_level = 2,
  atom = FALSE,
  tidy = FALSE,
  na.rm = TRUE
)

Arguments

data: (Optional) argument to call an existing data frame containing the data.
obs: Vector with observed values (character | factor).
pred: Vector with predicted values (character | factor).
pos_level: Integer, for binary cases, indicating the order (1|2) of the level corresponding to the positive class. Generally, the positive level is the second (2) because, in alphanumeric order, the most common pairs are (Negative | Positive), (0 | 1), (FALSE | TRUE). Default: 2.
atom: Logical value (TRUE/FALSE) to decide whether the estimate is made for each class (atom = TRUE) or at a global level (atom = FALSE). Default: FALSE. When the dataset is "binomial" (two classes), atom does not apply.
tidy: Logical value (TRUE/FALSE) to decide the type of return. TRUE returns a data.frame, FALSE returns a list. Default: FALSE.
na.rm: Logical value (TRUE/FALSE) to remove rows with missing values (NA). Default: TRUE.
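
As a brief illustration of why the positive level is usually the second one, the following base R sketch shows how labels sort alphanumerically when converted to a factor. The label values are made up, and using factor() to mimic the level ordering is an assumption for illustration, not a description of the package internals.

```r
# Illustrative observed labels for a two-class dataset (hypothetical values)
obs <- c("True", "False", "True", "False")

# factor() sorts its levels alphanumerically by default
lv <- levels(factor(obs))
lv
# "False" sorts before "True", so "True" is the second level

# With pos_level = 2, the positive class is the second level
pos_level <- 2
positive_class <- lv[pos_level]
positive_class
```

If the labels sorted the other way round (for example "Yes" as positive and "Zero" as negative), pos_level = 1 would be the appropriate choice.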

Value

An object of class numeric within a list (if tidy = FALSE) or within a data frame (if tidy = TRUE).

Details

Likelihood ratios describe how much a prediction changes the odds that a case belongs to a given class. They are derived from recall (sensitivity or true positive rate) and specificity (selectivity or true negative rate).

For a multiclass case, positive and negative results are class-wise. We can either hit the actual class, or the actual non-class. For example, a classification may have 3 potential results: cat, dog, or fish. Thus, when the actual value is dog, the only true positive is dog, but we can obtain a true negative either by classifying it as a cat or a fish. The posLr, negLr, and dor estimates are the mean across all classes.
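
The class-wise logic can be sketched in base R with a one-vs-rest count for each class. The data are hypothetical, and taking the unweighted mean of the class-wise ratios is an assumption based on the description above; the package may aggregate differently internally.

```r
# Hypothetical observed and predicted labels for a three-class problem
obs  <- c("cat", "cat", "cat", "dog", "dog", "dog",
          "fish", "fish", "fish", "fish")
pred <- c("cat", "cat", "dog", "dog", "dog", "fish",
          "fish", "fish", "cat", "fish")

classes <- sort(unique(obs))

# Treat each class in turn as "positive" and all other classes as "negative"
class_wise <- sapply(classes, function(k) {
  tp <- sum(obs == k & pred == k)   # actual k, predicted k
  fn <- sum(obs == k & pred != k)   # actual k, predicted something else
  fp <- sum(obs != k & pred == k)   # actual non-k, predicted k
  tn <- sum(obs != k & pred != k)   # actual non-k, predicted non-k
  recall      <- tp / (tp + fn)
  specificity <- tn / (tn + fp)
  c(posLr = recall / (1 - specificity),
    negLr = (1 - recall) / specificity)
})

class_wise            # one column per class
rowMeans(class_wise)  # assumed global estimate: unweighted mean across classes
```

Note that a class with no false positives gives a zero denominator for posLr, so real data may need a guard for that case.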

The positive likelihood ratio (posLr) is the probability of a positive prediction in actual positive cases (true positive rate) relative to the probability of a positive prediction in actual negative cases (false positive rate).

The negative likelihood ratio (negLr) is the probability of a negative prediction in actual positive cases (false negative rate) relative to the probability of a negative prediction in actual negative cases (true negative rate).

Finally, the diagnostic odds ratio (dor) measures the effectiveness of classification. It represents the odds of a positive prediction result in actual (observed) positive cases with respect to the odds of a positive prediction for the actual negative cases.

The ratios are defined as follows:

\(posLr = \frac{recall}{1-specificity} = \frac{TPR}{FPR}\)

\(negLr = \frac{1-recall}{specificity} = \frac{FNR}{TNR}\)

\(dor = \frac{posLr}{negLr}\)
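
As a worked example, the sketch below computes the three ratios by hand from hypothetical confusion-matrix counts, using FPR = 1 - specificity and FNR = 1 - recall. The counts and variable names are illustrative only and do not come from the package.

```r
# Hypothetical confusion-matrix counts for a binary classifier
tp <- 40; fn <- 10   # actual positives: 50
fp <- 5;  tn <- 45   # actual negatives: 50

recall      <- tp / (tp + fn)   # TPR = 0.8
specificity <- tn / (tn + fp)   # TNR = 0.9

pos_lr    <- recall / (1 - specificity)   # TPR / FPR, about 8
neg_lr    <- (1 - recall) / specificity   # FNR / TNR, about 0.222
dor_value <- pos_lr / neg_lr              # about 36

# The diagnostic odds ratio can also be written with the raw counts
dor_counts <- (tp * tn) / (fp * fn)

all.equal(dor_value, dor_counts)
```

A posLr above 1 and a negLr below 1 indicate that predictions carry information about the actual class; a dor of 1 corresponds to a classifier that does no better than chance.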

For more details, see the online documentation.

References

Glas, A.S., Lijmer, J.G., Prins, M.H., Bonsel, G.J., Bossuyt, P.M.M. (2003). The diagnostic odds ratio: a single indicator of test performance. Journal of Clinical Epidemiology 56(11): 1129-1135. doi:10.1016/S0895-4356(03)00177-X

Examples

set.seed(123)
# Two-class
binomial_case <- data.frame(labels = sample(c("True","False"), 100, replace = TRUE),
                            predictions = sample(c("True","False"), 100, replace = TRUE))
# Multi-class
multinomial_case <- data.frame(labels = sample(c("Red","Blue", "Green"), 100, replace = TRUE),
                               predictions = sample(c("Red","Blue", "Green"), 100, replace = TRUE))

# Get posLr, negLr, and dor for a two-class case
posLr(data = binomial_case, obs = labels, pred = predictions, tidy = TRUE)
#>       posLr
#> 1 0.9807018
negLr(data = binomial_case, obs = labels, pred = predictions, tidy = TRUE)
#>      negLr
#> 1 1.016781
dor(data = binomial_case, obs = labels, pred = predictions, tidy = TRUE)
#>         dor
#> 1 0.9645161