Confusion matrix (type I and type II errors)

Disclaimer: These terms apply beyond machine learning.

Jun 6, 2021

Let's talk about documents that we want to classify as relevant (marked as 1) or irrelevant (marked as 0). In this context the classification method does not need to be specified: it could be a complex machine learning algorithm or just a manually attached label. If we know the actual labels, we can compare our marks (the predicted labels) with them.

If we have multiple documents, then we can score (evaluate the quality of) our prediction mechanism with quality functions built on the four possible outcomes:

	Predicted positive	Predicted negative
Actual positive	True Positive (TP): hit	False Negative (FN): type II error, miss
Actual negative	False Positive (FP): type I error, false alarm	True Negative (TN): correct rejection
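
As a minimal sketch of how the four cells of the table can be counted, assuming labels are stored as plain lists of 0 and 1 (the function name `confusion_counts` and the sample labels are hypothetical, chosen only for illustration):

```python
def confusion_counts(actual, predicted):
    """Return (TP, FN, FP, TN) for binary labels: 1 = relevant, 0 = irrelevant."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1  # hit
        elif a == 1 and p == 0:
            fn += 1  # miss (type II error)
        elif a == 0 and p == 1:
            fp += 1  # false alarm (type I error)
        else:
            tn += 1  # correct rejection
    return tp, fn, fp, tn


# Hypothetical labels for eight documents.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

print(confusion_counts(actual, predicted))  # (3, 1, 1, 3)
```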

Based on TP, FN, FP, and TN we can evaluate accuracy, precision, recall, and the F-scores:

Accuracy = (TP + TN) / (TP + TN + FP + FN)
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1 = 2 * Precision * Recall / (Precision + Recall)
F_Beta = (1 + Beta^2) * Precision * Recall / (Beta^2 * Precision + Recall)
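
The formulas above can be sketched in code as follows. The function name `metrics` and the convention of returning 0.0 when a denominator is zero are assumptions made for this example, not part of the definitions. Note that F1 is the special case of F_Beta with Beta = 1.

```python
def metrics(tp, fn, fp, tn, beta=1.0):
    """Compute accuracy, precision, recall and F_beta from confusion-matrix counts.

    Assumption: a metric with a zero denominator is reported as 0.0.
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = beta ** 2 * precision + recall
    f_beta = (1 + beta ** 2) * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_beta": f_beta,
    }


# Hypothetical counts: 3 hits, 1 miss, 1 false alarm, 3 correct rejections.
print(metrics(tp=3, fn=1, fp=1, tn=3))
# Beta > 1 weights recall more heavily than precision.
print(metrics(tp=2, fn=0, fp=2, tn=0, beta=2.0)["f_beta"])
```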

Why is accuracy not all you need? If the classes are imbalanced, for example many positives and only a few negatives, we can achieve high accuracy just by predicting the positive class all the time.
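
A small illustration of this effect, using a made-up data set of 95 positives and 5 negatives and a "classifier" that always answers positive (all numbers here are hypothetical):

```python
# Hypothetical imbalanced data: 95 relevant documents, 5 irrelevant ones.
actual = [1] * 95 + [0] * 5
# Trivial predictor: always answer "positive".
predicted = [1] * len(actual)

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)

accuracy = (tp + tn) / len(actual)

print(accuracy)  # 0.95
print(fp, tn)    # 5 0
```

Accuracy looks good, yet every negative document is a false alarm and not a single one is correctly rejected.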

Reduce type I vs type II errors

We always have to trade off type I errors ("false alarms") against type II errors ("misses"): reducing one usually increases the other.

Reduce type I errors when we want to minimize false alarms.
Reduce type II errors when we want to minimize misses.
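
One common way this choice shows up in practice is a decision threshold. The sketch below assumes the prediction mechanism produces a numeric score per document and that a document is marked positive when its score reaches the threshold; the scores, labels, and the helper name `errors_at` are all hypothetical.

```python
# Hypothetical scores from some prediction mechanism, with the actual labels.
scores = [0.95, 0.80, 0.70, 0.55, 0.40, 0.30, 0.20, 0.10]
actual = [1, 1, 0, 1, 0, 1, 0, 0]


def errors_at(threshold):
    """Return (false alarms, misses) when score >= threshold means 'positive'."""
    fp = sum(1 for s, a in zip(scores, actual) if s >= threshold and a == 0)
    fn = sum(1 for s, a in zip(scores, actual) if s < threshold and a == 1)
    return fp, fn


for threshold in (0.75, 0.50, 0.25):
    fp, fn = errors_at(threshold)
    print(f"threshold={threshold:.2f} false_alarms={fp} misses={fn}")
# threshold=0.75 false_alarms=0 misses=2
# threshold=0.50 false_alarms=1 misses=1
# threshold=0.25 false_alarms=2 misses=0
```

A strict threshold removes false alarms at the cost of misses, and a lenient one does the opposite.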

Applications

Static code analysis. We want to detect bugs in code, so we can get a "false alarm" (the code is correct but the tool raises an alarm), a "hit" (the code has a bug and the tool finds it), or a "miss" (the code has a bug and the tool overlooks it).

Fraud detection. We want to detect fraudulent operations based on operation patterns.

Computer virus detection. We want to detect viruses based on program activity.

Text search. We want to detect relevant documents. But in this case we actually want to find the most relevant documents. That means we want to use ranking quality functions evaluated on the top K results (@K), such as precision@K and accuracy@K.
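
As a rough sketch of precision@K, assuming the search results are already sorted by rank and each one carries a 0/1 relevance label. The function name `precision_at_k` and the choice to divide by K even when fewer than K results are returned are assumptions of this example; other conventions exist.

```python
def precision_at_k(ranked_relevance, k):
    """Fraction of the top-k ranked results that are relevant (label 1).

    Assumption: always divide by k, so missing results count as irrelevant.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    return sum(ranked_relevance[:k]) / k


# Hypothetical relevance labels of six search results, best-ranked first.
ranked = [1, 0, 1, 1, 0, 0]

print(precision_at_k(ranked, 1))  # 1.0
print(precision_at_k(ranked, 4))  # 0.75
```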