In machine learning, and specifically in statistical classification, a confusion matrix, also known as an error matrix,[11] is a table layout that allows visualization of the performance of an algorithm, typically a supervised learning one; in unsupervised learning it is usually called a matching matrix.
Terminology and derivations from a confusion matrix
- condition positive (P)
- the number of real positive cases in the data
- condition negative (N)
- the number of real negative cases in the data
- true positive (TP)
- A test result that correctly indicates the presence of a condition or characteristic
- true negative (TN)
- A test result that correctly indicates the absence of a condition or characteristic
- false positive (FP), Type I error
- A test result which wrongly indicates that a particular condition or attribute is present
- false negative (FN), Type II error
- A test result which wrongly indicates that a particular condition or attribute is absent
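- sensitivity, recall, hit rate, or true positive rate (TPR)
- $\mathrm{TPR} = \frac{\mathrm{TP}}{\mathrm{P}} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}} = 1 - \mathrm{FNR}$
- specificity, selectivity or true negative rate (TNR)
- $\mathrm{TNR} = \frac{\mathrm{TN}}{\mathrm{N}} = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FP}} = 1 - \mathrm{FPR}$
- precision or positive predictive value (PPV)
- $\mathrm{PPV} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}} = 1 - \mathrm{FDR}$
- negative predictive value (NPV)
- $\mathrm{NPV} = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FN}} = 1 - \mathrm{FOR}$
- miss rate or false negative rate (FNR)
- $\mathrm{FNR} = \frac{\mathrm{FN}}{\mathrm{P}} = \frac{\mathrm{FN}}{\mathrm{FN} + \mathrm{TP}} = 1 - \mathrm{TPR}$
- fall-out or false positive rate (FPR)
- $\mathrm{FPR} = \frac{\mathrm{FP}}{\mathrm{N}} = \frac{\mathrm{FP}}{\mathrm{FP} + \mathrm{TN}} = 1 - \mathrm{TNR}$
- false discovery rate (FDR)
- $\mathrm{FDR} = \frac{\mathrm{FP}}{\mathrm{FP} + \mathrm{TP}} = 1 - \mathrm{PPV}$
- false omission rate (FOR)
- $\mathrm{FOR} = \frac{\mathrm{FN}}{\mathrm{FN} + \mathrm{TN}} = 1 - \mathrm{NPV}$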
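- positive likelihood ratio (LR+)
- $\mathrm{LR{+}} = \frac{\mathrm{TPR}}{\mathrm{FPR}}$
- negative likelihood ratio (LR−)
- $\mathrm{LR{-}} = \frac{\mathrm{FNR}}{\mathrm{TNR}}$
- prevalence threshold (PT)
- $\mathrm{PT} = \frac{\sqrt{\mathrm{FPR}}}{\sqrt{\mathrm{TPR}} + \sqrt{\mathrm{FPR}}}$
- threat score (TS) or critical success index (CSI)
- $\mathrm{TS} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN} + \mathrm{FP}}$
- prevalence
- $\mathrm{Prevalence} = \frac{\mathrm{P}}{\mathrm{P} + \mathrm{N}}$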
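- accuracy (ACC)
- $\mathrm{ACC} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{P} + \mathrm{N}} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{TN} + \mathrm{FP} + \mathrm{FN}}$
- balanced accuracy (BA)
- $\mathrm{BA} = \frac{\mathrm{TPR} + \mathrm{TNR}}{2}$
- F1 score
- the harmonic mean of precision and sensitivity: $F_1 = 2 \times \frac{\mathrm{PPV} \times \mathrm{TPR}}{\mathrm{PPV} + \mathrm{TPR}} = \frac{2\,\mathrm{TP}}{2\,\mathrm{TP} + \mathrm{FP} + \mathrm{FN}}$
- phi coefficient (φ or rφ) or Matthews correlation coefficient (MCC)
- $\mathrm{MCC} = \frac{\mathrm{TP} \times \mathrm{TN} - \mathrm{FP} \times \mathrm{FN}}{\sqrt{(\mathrm{TP} + \mathrm{FP})(\mathrm{TP} + \mathrm{FN})(\mathrm{TN} + \mathrm{FP})(\mathrm{TN} + \mathrm{FN})}}$
- Fowlkes–Mallows index (FM)
- $\mathrm{FM} = \sqrt{\frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FP}} \times \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}} = \sqrt{\mathrm{PPV} \times \mathrm{TPR}}$
- informedness or bookmaker informedness (BM)
- $\mathrm{BM} = \mathrm{TPR} + \mathrm{TNR} - 1$
- markedness (MK) or deltaP (Δp)
- $\mathrm{MK} = \mathrm{PPV} + \mathrm{NPV} - 1$
- diagnostic odds ratio (DOR)
- $\mathrm{DOR} = \frac{\mathrm{LR{+}}}{\mathrm{LR{-}}}$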
Sources: Fawcett (2006),[1] Piryonesi and El-Diraby (2020),[2] Powers (2011),[3] Ting (2011),[4] CAWCR,[5] D. Chicco & G. Jurman (2020, 2021, 2023),[6][7][8] Tharwat (2018),[9] Balayla (2020).[10]
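As a minimal sketch of how the terms above fit together, the following Python snippet counts the four outcomes from paired label lists and derives a handful of the listed rates. The function names `confusion_counts` and `derived_metrics`, the `positive` parameter, and the toy labels are illustrative assumptions, not taken from the cited sources; the sketch also assumes that no denominator is zero.

```python
from math import sqrt


def confusion_counts(actual, predicted, positive=1):
    """Return (TP, TN, FP, FN) for a binary problem (hypothetical helper)."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative (Type I error)
        else:
            if a == positive:
                fn += 1  # predicted negative, actually positive (Type II error)
            else:
                tn += 1  # predicted negative, actually negative
    return tp, tn, fp, fn


def derived_metrics(tp, tn, fp, fn):
    """Compute a few of the listed metrics; assumes non-zero denominators."""
    tpr = tp / (tp + fn)  # sensitivity / recall
    tnr = tn / (tn + fp)  # specificity
    ppv = tp / (tp + fp)  # precision
    npv = tn / (tn + fn)
    return {
        "TPR": tpr,
        "TNR": tnr,
        "PPV": ppv,
        "NPV": npv,
        "ACC": (tp + tn) / (tp + tn + fp + fn),
        "BA": (tpr + tnr) / 2,
        "F1": 2 * tp / (2 * tp + fp + fn),
        "MCC": (tp * tn - fp * fn)
        / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        "BM": tpr + tnr - 1,
        "MK": ppv + npv - 1,
    }


# Toy data (made up for illustration): 4 real positives, 6 real negatives.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

counts = confusion_counts(actual, predicted)
metrics = derived_metrics(*counts)
print(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For this toy data the counts are TP = 3, TN = 5, FP = 1, FN = 1, so accuracy is 0.8 while precision, recall and F1 are all 0.75.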
Each row of the matrix represents the instances in an actual class while each column represents the instances in a predicted class, or vice versa – both variants are found in the literature.[12] The name stems from the fact that it makes it easy to see whether the system is confusing two classes (i.e. commonly mislabeling one as another).
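To illustrate the layout just described, here is a small Python sketch that builds a multi-class confusion matrix with rows for actual classes and columns for predicted classes (one of the two conventions mentioned), then looks for the most frequently confused pair. The helper names `build_confusion_matrix` and `most_confused_pair`, the class names, and the labels are all assumptions made for the example.

```python
def build_confusion_matrix(actual, predicted, classes):
    """Rows index the actual class, columns the predicted class."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def most_confused_pair(matrix, classes):
    """Return (actual, predicted, count) for the largest off-diagonal cell."""
    best = None
    for i, row in enumerate(matrix):
        for j, count in enumerate(row):
            if i != j and (best is None or count > best[2]):
                best = (classes[i], classes[j], count)
    return best


# Toy labels, made up for illustration.
classes = ["cat", "dog", "rabbit"]
actual = ["cat", "cat", "cat", "dog", "dog", "dog", "rabbit", "rabbit"]
predicted = ["cat", "dog", "dog", "dog", "dog", "cat", "rabbit", "rabbit"]

matrix = build_confusion_matrix(actual, predicted, classes)
for name, row in zip(classes, matrix):
    print(f"{name:>6}: {row}")
print(most_confused_pair(matrix, classes))
```

Correct predictions sit on the diagonal; the off-diagonal cells show which classes are mistaken for which. In this toy data, cats are labeled as dogs twice, which is the largest off-diagonal count.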
It is a special kind of contingency table, with two dimensions ("actual" and "predicted"), and identical sets of "classes" in both dimensions (each combination of dimension and class is a variable in the contingency table).
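Viewed as a contingency table, the matrix has marginal totals with a direct reading: row sums count the instances per actual class, column sums count the instances per predicted class, and both sum to the same grand total. The Python sketch below shows this on a made-up 3×3 table; the counts and class names are assumptions chosen only for illustration, and rows are taken to be the actual classes.

```python
# Rows: actual class; columns: predicted class (illustrative counts).
classes = ["cat", "dog", "rabbit"]
table = [
    [5, 2, 0],
    [3, 3, 1],
    [0, 1, 11],
]

row_totals = [sum(row) for row in table]        # instances per actual class
col_totals = [sum(col) for col in zip(*table)]  # instances per predicted class
grand_total = sum(row_totals)
correct = sum(table[i][i] for i in range(len(classes)))  # diagonal cells

print("actual totals:   ", dict(zip(classes, row_totals)))
print("predicted totals:", dict(zip(classes, col_totals)))
print("accuracy:", correct / grand_total)
```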