What is the F1 Score?

The F1 score is the harmonic mean of precision and recall.

One of the difficulties in binary classification is that precision and recall generally cannot both be optimized at once: improving one tends to degrade the other. This is the familiar tradeoff between false positives and false negatives, or Type I and Type II errors. For example, maximizing recall requires minimizing the number of false negatives, but a decision threshold chosen to do so typically admits more false positives and therefore lowers precision, as the sketch below illustrates.
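
The following sketch is not from the original post; it assumes scikit-learn and a synthetic dataset, and simply sweeps the decision threshold of a logistic regression model to show precision and recall moving in opposite directions:

```python
# Illustrative only: synthetic data and a simple model, not the post's own example.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score

X, y = make_classification(n_samples=1000, weights=[0.7, 0.3], random_state=0)
probs = LogisticRegression(max_iter=1000).fit(X, y).predict_proba(X)[:, 1]

for threshold in (0.3, 0.5, 0.7):
    preds = (probs >= threshold).astype(int)
    print(f"threshold={threshold:.1f}  "
          f"precision={precision_score(y, preds):.2f}  "
          f"recall={recall_score(y, preds):.2f}")

# Lower thresholds catch more positives (higher recall, fewer false negatives)
# but admit more false positives (lower precision), and vice versa.
```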

The F1 score is a weighted harmonic mean of precision and recall; when each is given equal weight, the formula reduces to:

F1 = 2 * (precision * recall) / (precision + recall)

If either precision or recall is deemed more important for a specific classification task, the weights can be adjusted accordingly (the general form is known as the F-beta score), as sketched below.
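
As a rough sketch (this helper is illustrative and not part of the original post), the F-beta generalization weights recall beta times as much as precision and reduces to F1 at beta = 1:

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall.

    beta > 1 weights recall more heavily, beta < 1 weights precision more,
    and beta = 1 gives the standard F1 score.
    """
    if precision == 0 and recall == 0:
        return 0.0
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)

print(f_beta(0.8, 0.6))            # F1 ≈ 0.686
print(f_beta(0.8, 0.6, beta=2.0))  # F2 ≈ 0.632 (recall counts more)
```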

Using an example:

Suppose a classifier produces a confusion matrix with 150 true positives, 60 false positives, and 50 false negatives (true negatives are not needed for these metrics).

Precision = True Positives / (True Positives + False Positives)

Recall = True Positives / (True Positives + False Negatives)

F1 Score = 2 * (precision * recall) / (precision + recall)

Precision = 150 / (150 + 60) ≈ 0.714

Recall = 150 / (150 + 50) = 0.75

F1 Score = 2 * (0.714 * 0.75) / (0.714 + 0.75) ≈ 0.732
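
A few lines of Python (a hypothetical check, not part of the original post) reproduce this arithmetic directly from the counts above:

```python
tp, fp, fn = 150, 60, 50  # counts from the example confusion matrix

precision = tp / (tp + fp)   # 150 / 210 ≈ 0.714
recall = tp / (tp + fn)      # 150 / 200 = 0.75
f1 = 2 * precision * recall / (precision + recall)

print(f"precision={precision:.3f}  recall={recall:.3f}  f1={f1:.3f}")
# precision=0.714  recall=0.750  f1=0.732
```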
