Estimates Cohen's Kappa coefficient for a nominal/categorical predicted-observed dataset.
Arguments
- data
(Optional) argument to call an existing data frame containing the data.
- obs
Vector with observed values (character | factor).
- pred
Vector with predicted values (character | factor).
- pos_level
Integer, for binary cases, indicating the order (1|2) of the level corresponding to the positive. Generally, the positive level is the second (2) since, following an alphanumeric order, the most common pairs are (Negative | Positive), (0 | 1), and (FALSE | TRUE). Default: 2.
- tidy
Logical argument (TRUE/FALSE) that sets the type of return. TRUE returns a data.frame, FALSE returns a list. Default: FALSE.
- na.rm
Logical argument to remove rows with missing values (NA). Default: TRUE.
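As a quick check on the pos_level default, the sketch below uses only base R to show how factor levels are ordered. The label vectors are made up for illustration; they are not taken from the package.

```r
# Made-up label vectors; factor() sorts levels alphanumerically by default,
# so the "positive" label of these common pairs ends up in position 2.
lv_text <- levels(factor(c("Positive", "Negative")))
lv_num  <- levels(factor(c(1, 0, 1)))
lv_lgl  <- levels(factor(c(TRUE, FALSE)))

lv_text
#> [1] "Negative" "Positive"
lv_num
#> [1] "0" "1"
lv_lgl
#> [1] "FALSE" "TRUE"
```

If the positive label happens to sort first (for example, a hypothetical pair such as "Diseased" and "Healthy"), pos_level = 1 would be the matching choice.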
Value
An object of class numeric within a list (if tidy = FALSE) or within a data frame (if tidy = TRUE).
Details
Cohen's Kappa coefficient is the accuracy normalized by the agreement expected by chance. It is therefore considered a more robust agreement measure than accuracy alone. The kappa coefficient was originally described for evaluating agreement of classification between different "raters" (inter-rater reliability).
Kappa has an upper bound of 1 (perfect agreement). Values near 0 indicate agreement no better than chance, and negative values, down to a theoretical minimum of -1, indicate agreement worse than chance. The maximum of 1 is attainable only when observed and predicted values have the same distribution across the classes (i.e. identical row and column sums). The closer kappa is to 1, the better the prediction quality.
For the formula and more details, see online-documentation
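To make the "accuracy normalized by chance agreement" idea concrete, here is a minimal base R sketch of the standard textbook formula, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected from the row and column totals. The helper name kappa_by_hand and the toy vectors are hypothetical; this is not the package's internal implementation.

```r
# Hypothetical helper: Cohen's kappa computed by hand from a confusion matrix.
kappa_by_hand <- function(obs, pred) {
  # Use a shared set of levels so the confusion matrix is square
  lev <- sort(union(as.character(obs), as.character(pred)))
  cm  <- table(factor(pred, levels = lev), factor(obs, levels = lev))
  n   <- sum(cm)
  p_o <- sum(diag(cm)) / n                     # observed agreement (accuracy)
  p_e <- sum(rowSums(cm) * colSums(cm)) / n^2  # agreement expected by chance
  (p_o - p_e) / (1 - p_e)
}

# Toy data: 4 of 6 labels agree, and both margins are split 3/3
obs  <- c("a", "a", "b", "b", "a", "b")
pred <- c("a", "b", "b", "b", "a", "a")

k_toy     <- kappa_by_hand(obs, pred)  # (4/6 - 1/2) / (1 - 1/2) = 1/3
k_perfect <- kappa_by_hand(obs, obs)   # perfect agreement gives 1
```

In the toy data the accuracy is 0.67, but half of that agreement is expected by chance alone, so kappa drops to 0.33. Note that this sketch does not guard against the degenerate case p_e = 1 (all observations and predictions in a single class), where the ratio is undefined.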
References
Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement 20 (1): 37–46. doi:10.1177/001316446002000104
Examples
# \donttest{
set.seed(123)
# Two-class
binomial_case <- data.frame(
  labels = sample(c("True", "False"), 100, replace = TRUE),
  predictions = sample(c("True", "False"), 100, replace = TRUE)
)
# Multi-class
multinomial_case <- data.frame(
  labels = sample(c("Red", "Blue", "Green"), 100, replace = TRUE),
  predictions = sample(c("Red", "Blue", "Green"), 100, replace = TRUE)
)
# Get Cohen's Kappa Coefficient estimate for two-class case
khat(data = binomial_case, obs = labels, pred = predictions, tidy = TRUE)
#>           khat
#> 1 -0.008702532
# Get Cohen's Kappa Coefficient estimate for the multi-class case (kappa is a single global value)
khat(data = multinomial_case, obs = labels, pred = predictions, tidy = TRUE)
#>          khat
#> 1 -0.08050525
# }