Precision vs Recall. What Do They Actually Tell You?
Author(s): Nikita Andriievskyi
Understand the idea behind Precision and Recall
If you asked any data scientist or machine learning engineer about the easiest and, at the same time, most confusing topic they have learned, one of the first things to come to their mind would be Precision vs Recall.
On the one hand, this topic is indeed confusing; I myself spent a ton of time trying to understand the difference and, most importantly, what these two terms tell you.
On the other hand, the topic is very simple and requires no understanding of math, programming, or anything else complex. In this article, I will combine all the resources I've come across and do my best to explain the topic to you, so that you never have to worry about it again.
First of all, we have to understand that Precision and Recall are additional ways to evaluate your model. Often, a simple accuracy measurement is not going to suffice.
Why can't I just use accuracy?
Let's say you have a dataset of 1000 fruit images: 990 apples and 10 oranges. You trained a model to classify apples vs oranges on this dataset, and your model decided to call every image an apple. If you compute the accuracy (correct predictions / all predictions), 990/1000, you get an accuracy of 99%! Even though your model has amazing accuracy, it completely misclassifies all oranges.
But, if we also evaluated the model using Precision and Recall, we would get humbled immediately.
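Here is a minimal sketch of that humbling in Python (the labels are made up to match the example, and scikit-learn is assumed to be installed):

```python
# A minimal sketch of the apples-vs-oranges example, assuming scikit-learn
# is available. The "model" simply predicts "apple" for every image.
from sklearn.metrics import accuracy_score, recall_score

y_true = ["apple"] * 990 + ["orange"] * 10  # 990 apples, 10 oranges
y_pred = ["apple"] * 1000                   # the model calls everything an apple

print(accuracy_score(y_true, y_pred))                    # 0.99 -- looks impressive
print(recall_score(y_true, y_pred, pos_label="orange"))  # 0.0  -- every orange is missed
```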
Before going into the explanation of the two terms, we have to know what True Positive, True Negative, False Positive, and False Negative mean.
Note: If you already know the difference perfectly, you can skip this part.
Let's first look at the True "class".
True here means that the model predicted correctly.
- So, if we created a model to classify images of dogs vs not dogs, a True Positive prediction would be when the model classified an image as a dog, and it was indeed a dog.
- A True Negative prediction would be when the model classified an image as not a dog, and it was actually not a dog.
What about the False class?
As you might have guessed, False means that the model predicted incorrectly.
- False Positive: the model classified an image as a dog, but it was actually not a dog.
- False Negative: the model classified an image as not a dog, but it was actually a dog.
As you see, True and False tell you whether the model was right or wrong, whereas Positive and Negative tell you which class the model predicted. (In our example, the Positive class was dog images, and the Negative class was non-dog images.)
NOTE
The Positive class doesn't necessarily have to be images of dogs; you could just as well say that the Positive class is images of not dogs. But usually, data scientists use Positive for the class they target/focus on.
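Here is a minimal sketch of how these four counts come out of a handful of made-up labels (with "dog" as the Positive class):

```python
# Counting TP, TN, FP, FN by hand; the labels are hypothetical.
y_true = ["dog", "dog", "not dog", "dog", "not dog"]   # what the images actually are
y_pred = ["dog", "not dog", "dog", "dog", "not dog"]   # what the model predicted

pairs = list(zip(y_true, y_pred))
tp = sum(t == "dog" and p == "dog" for t, p in pairs)          # predicted dog, is a dog
tn = sum(t == "not dog" and p == "not dog" for t, p in pairs)  # predicted not dog, is not a dog
fp = sum(t == "not dog" and p == "dog" for t, p in pairs)      # predicted dog, is not a dog
fn = sum(t == "dog" and p == "not dog" for t, p in pairs)      # predicted not dog, is a dog

print(tp, tn, fp, fn)  # 2 1 1 1
```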
Hopefully, by now you understand the previous part well, because we are going to need it in understanding Precision vs Recall.
Precision
Let's say we've created a model to classify images of dogs vs not dogs, and we tested it on 10 images. Let's focus on the dog class:
Let's see… There are 6 images of dogs and 4 non-dog images. Our model classified 7 images as dogs; however, only 4 of those are correct.
The formula for Precision:
Precision = True Positives / (True Positives + False Positives)
Let's compute Precision for the dog class together: there are 4 True Positives (4 dog images that were classified as dogs). Now we divide that by 4 True Positives + 3 False Positives (3 images of not dogs that were classified as dogs).
4/7 ≈ 0.57
What does it tell us?
Well, basically, it tells us what percentage of the images classified as dogs were actually dogs.
When we calculate Precision, we focus on the predictions.
If we have 100% Precision, we can be 100% sure that if our model classifies an image as a dog, it's definitely correct. Even if it has not classified most dog images as dogs, those that are classified as dogs are 100% correct. With Precision, we only care about the predicted targeted class being correct.
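As a quick sanity check, here is the Precision computation from the example above in a couple of lines of Python:

```python
# Precision for the dog class in the 10-image example above.
tp = 4  # dog images correctly classified as dogs
fp = 3  # not-dog images incorrectly classified as dogs

precision = tp / (tp + fp)
print(round(precision, 2))  # 0.57
```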
Recall
Let's keep the same example: 6 dog images and 4 non-dog images, with 7 images classified as dogs, 4 of them correctly.
The formula for Recall:
Recall = True Positives / (True Positives + False Negatives)
Let's compute it together: 4 True Positives divided by 4 True Positives + 2 False Negatives (2 images of dogs that were classified as not dogs).
4/6 ≈ 0.67
What does it tell us?
So, if Precision showed us the percentage of correctly classified dog images among all images predicted to be dogs, Recall shows us the percentage of correctly classified dog images among all actual images of dogs.
When we calculate Recall, we focus on the actual data.
If we have 100% Recall, we can be 100% sure that, given a set of, let's say, 20 images, 8 of which are dog images, our model will classify all 8 dog images as dogs. However, it might also say that the other 12 images are dogs too; in Recall, we only care about the actual targeted images being classified correctly.
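To double-check both numbers, here is the 10-image example reconstructed with scikit-learn. The exact label lists below are a hypothetical reconstruction (we only know the counts, not which specific images were misclassified):

```python
# The 10-image dog example as label lists: 6 dogs and 4 not-dogs,
# with 4 real dogs and 3 not-dogs predicted as "dog".
from sklearn.metrics import precision_score, recall_score

y_true = ["dog"] * 6 + ["not dog"] * 4
y_pred = (["dog"] * 4 + ["not dog"] * 2      # 4 of the 6 real dogs found
          + ["dog"] * 3 + ["not dog"] * 1)   # 3 not-dogs wrongly called dogs

print(round(precision_score(y_true, y_pred, pos_label="dog"), 2))  # 0.57
print(round(recall_score(y_true, y_pred, pos_label="dog"), 2))     # 0.67
```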
Precision vs Recall Tradeoff
I won't focus on this topic a lot; if you want me to write a post about it, do let me know in the comments. But basically, when Precision goes up, Recall usually goes down, and vice versa.
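If you want to see the tradeoff for yourself, scikit-learn can sweep the decision threshold of a probabilistic classifier and report Precision and Recall at each point. This is just a sketch with made-up labels and scores:

```python
# A sketch of the tradeoff, assuming a classifier that outputs scores:
# raising the decision threshold tends to raise Precision and lower Recall.
from sklearn.metrics import precision_recall_curve

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]                        # made-up labels
y_score = [0.9, 0.8, 0.75, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1]  # made-up model scores

precision, recall, thresholds = precision_recall_curve(y_true, y_score)
for p, r, t in zip(precision, recall, thresholds):
    print(f"threshold={t:.2f}  precision={p:.2f}  recall={r:.2f}")
```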
How do I know what I need to focus on more? (Real-world examples)
You might have a question like: "Okay, I understand the difference, and that there is a tradeoff, but when do I need higher precision, and when do I need higher recall?"
When do we care about Precision more?
Let's imagine we are working for Google, specifically on a model that will classify emails as spam vs not spam, where all spam emails get hidden. In this case, we don't want to accidentally classify an important email as spam, since it would be hidden. Therefore, we want to be sure that if the model classifies an email as spam, it's correct. We don't really care if some spam emails don't get hidden; the user can hide them on their own. Take a moment to process this, and try to understand how Precision fits the example.
What about Recall? When do we focus on it more?
Let's imagine we are working at an airport and have built a model to flag dangerous people. If our model doesn't classify a dangerous person as dangerous, they might slip past security undetected and then cause a lot of damage. This is why we need to classify every dangerous person as dangerous.
On the other hand, if the model classifies a regular civilian as dangerous, we don't really care; they will get checked and then move on.
Exercises
- Take the example data with dogs vs not dogs, and try to calculate Precision and Recall for the not a dog class. (Think of the not a dog class as your Positive class.)
- Try to come up with your own definitions of Precision and Recall.
- Think of a project or even a real-world problem where Precision would be more important, and vice versa.
- Let me know your answers in the comments, and I will tell you if you are right or wrong.
I do hope this article helped you; I tried to make everything as clear as it can get. But if you still have some questions, do ask them in the comments, and I will try to help you.