In machine learning, evaluation metrics are used to assess a model's performance
and determine how well it generalizes to unseen data. Choosing the right
evaluation metric is crucial, because different metrics give different insights
depending on the problem you're trying to solve. In this section, we'll explore
the most common evaluation metrics for classification, regression, and
clustering problems.
Why Are Evaluation Metrics Important?
Evaluation metrics are essential because they quantify how good or bad a
model's predictions are. Without them, there would be no objective way to
compare models and select the best-performing one. Different metrics are suited
to different types of tasks, and using the wrong metric can give misleading
results. For example, accuracy might work well on balanced datasets but fail on
imbalanced ones.
Evaluation Metrics for Classification Problems
Classification problems involve predicting a categorical label, and the most
commonly used metrics for classification tasks include:
1. Accuracy
o Definition: Accuracy is the proportion of correctly classified instances
out of the total instances. It is the most straightforward metric and is
commonly used in binary or multiclass classification tasks.
o Formula:
Accuracy = \frac{TP + TN}{TP + TN + FP + FN}
Where:
TP = True Positives
TN = True Negatives
FP = False Positives
FN = False Negatives
o Limitations: Accuracy can be misleading when the dataset is
imbalanced (e.g., when one class is much more frequent than the
other).
o Fun Fact: In a dataset where 95% of the instances belong to one class, a
model that always predicts the majority class still achieves 95% accuracy. Yet
it is a poor model, because it never identifies the minority class (see the
sketch after this item).
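
To make this pitfall concrete, here is a minimal Python sketch (assuming
scikit-learn is installed; the labels are made up for illustration) of a
majority-class predictor on a 95/5 split:

    from sklearn.metrics import accuracy_score

    # Hypothetical imbalanced dataset: 95 samples of class 0, 5 of class 1.
    y_true = [0] * 95 + [1] * 5

    # A degenerate "model" that always predicts the majority class.
    y_pred = [0] * 100

    # TP = 0, TN = 95, FP = 0, FN = 5, so accuracy = 95 / 100.
    print(accuracy_score(y_true, y_pred))  # 0.95, yet class 1 is never found

Despite the high score, this model is useless for the minority class, which is
exactly the gap the metrics below are designed to expose.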
2. Precision
o Definition: Precision is the ratio of correctly predicted positive
observations to the total predicted positives. It is a useful metric
when the cost of false positives is high.
o Formula:
Precision = \frac{TP}{TP + FP}
o Use Case: Precision is particularly important when you want to minimize
false positives, such as in email spam detection, where misclassifying a
legitimate email as spam could be problematic (a sketch follows this item).
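
As a quick illustration, here is a small sketch (again using scikit-learn; the
eight labels are hypothetical) where 1 marks spam and 0 marks legitimate mail:

    from sklearn.metrics import precision_score

    # Hypothetical spam-filter run: 1 = spam, 0 = legitimate.
    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

    # TP = 3, FP = 1, so precision = 3 / (3 + 1) = 0.75.
    print(precision_score(y_true, y_pred))

The single false positive, a legitimate email flagged as spam, is what drags
precision below 1.0.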
3. Recall (Sensitivity or True Positive Rate)
o Definition: Recall is the ratio of correctly predicted positive
observations to the total actual positives. It answers the question,
"Out of all the actual positive cases, how many did the model
correctly identify?"
o Formula:
Recall = \frac{TP}{TP + FN}
o Use Case: Recall is crucial when the cost of false negatives is high, such
as in medical diagnosis, where failing to identify a disease could be
dangerous (see the sketch below).
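
Here is a matching sketch (scikit-learn's recall_score; the screening labels
are invented for illustration) where 1 marks a patient who actually has the
disease:

    from sklearn.metrics import recall_score

    # Hypothetical screening results: 1 = disease present, 0 = healthy.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

    # TP = 2, FN = 2, so recall = 2 / (2 + 2) = 0.5:
    # half of the actual disease cases were missed.
    print(recall_score(y_true, y_pred))

A recall of 0.5 means half the true cases slip through, which in a medical
setting would usually be unacceptable even if precision is perfect.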