
Precision vs Recall. What Do They Actually Tell You?

Last Updated on July 8, 2022 by Editorial Team

Author(s): Nikita Andriievskyi

Originally published on Towards AI.

Understand the idea behind Precision and Recall

Dart board with 3 darts hitting the center
Photo by Michiel on Pexels

If you asked any data scientist or machine learning engineer about the easiest and most confusing topic they learned, one of the first things that would come to mind is Precision vs Recall.

On the one hand, this topic is indeed confusing. I spent a ton of time trying to understand the difference and, most importantly, what these two terms tell you.

On the other hand, the topic is very simple and requires no understanding of math, programming, or anything else complex. In this article, I will combine all the resources that I’ve come across and do my best to explain the topic so that you never have to worry about it again.

First of all, we have to understand that Precision and Recall are additional ways to evaluate your model. Often, a simple accuracy measurement is not going to suffice.

Why can’t I just use accuracy?

Let’s say you have a dataset of 1000 fruit images: 990 apples and 10 oranges. You trained a model to classify apples vs oranges on this dataset, and your model decided to say that all the images are apples. If you compute the accuracy (correct predictions / all predictions), 990/1000, you get an accuracy of 99%! Even though your model has amazing accuracy, it misclassifies every single orange.
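To make the numbers concrete, here is a minimal Python sketch of that situation. The lists `actual` and `predicted` are just a hypothetical stand-in for a real dataset and a real model's output; only the counts (990 apples, 10 oranges, everything predicted as an apple) come from the example above.

```python
# 990 apples and 10 oranges, as in the example above.
actual = ["apple"] * 990 + ["orange"] * 10

# A "model" that simply calls every image an apple.
predicted = ["apple"] * len(actual)

# Accuracy = correct predictions / all predictions.
correct = sum(a == p for a, p in zip(actual, predicted))
accuracy = correct / len(actual)

# How many of the real oranges did the model actually find?
oranges_found = sum(
    a == "orange" and p == "orange" for a, p in zip(actual, predicted)
)

print(f"accuracy: {accuracy:.0%}")
print(f"oranges found: {oranges_found} of {actual.count('orange')}")
```

The accuracy looks great, yet not a single orange is found, which is exactly the kind of problem accuracy alone hides.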

But, if we also evaluated the model using Precision and Recall, we would get humbled immediately.

Before going into the explanation of the two terms, we do have to know what True Positive, True Negative, False Positive, and False Negative mean.

Note: If you already know the difference perfectly, you can skip this part.

Let’s first look at the True “class”.

True here means that the model predicted correctly.

  1. So, if we created a model to classify images of dogs vs not dogs, a True Positive prediction would be when the model classified an image as a dog, and it was indeed a dog.
  2. A True Negative prediction would be if the model classified an image as not a dog, and it was actually not a dog.

What about the False class?

As you might have guessed, False means that the model predicted incorrectly.

  1. False Positive: the model classified an image as a dog, but it was actually not a dog.
  2. False Negative: the model classified an image as not a dog, but it was actually a dog.

As you see, True and False tell you if the model was right or wrong, whereas Positive and Negative tell you what class the model predicted. (In our example, the Positive class was dog images, and the Negative class was non-dog images.)

NOTE

The Positive class doesn’t necessarily have to be images of dogs. You could also say that the Positive class is images of not dogs. But mostly, data scientists refer to Positive as the class that they target/focus on.
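The four outcomes above can be counted with a few lines of Python. This is only a sketch: the helper name `confusion_counts` and the five example labels are made up for illustration and are not from any library. The `positive` argument lets you choose which class you treat as Positive, as described in the note.

```python
def confusion_counts(actual, predicted, positive):
    """Count TP, TN, FP, FN, treating `positive` as the Positive class."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted Positive, and it really was Positive
        elif p != positive and a != positive:
            tn += 1  # predicted Negative, and it really was Negative
        elif p == positive and a != positive:
            fp += 1  # predicted Positive, but it was actually Negative
        else:
            fn += 1  # predicted Negative, but it was actually Positive
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


# Made-up labels for five images (hypothetical data).
actual = ["dog", "dog", "not dog", "not dog", "dog"]
predicted = ["dog", "not dog", "dog", "not dog", "dog"]

dog_counts = confusion_counts(actual, predicted, positive="dog")
not_dog_counts = confusion_counts(actual, predicted, positive="not dog")

print(dog_counts)
print(not_dog_counts)
```

Switching the Positive class swaps the roles of True Positives and True Negatives, while the same two mistakes are still counted, just under the opposite names.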

Hopefully, by now you understand the previous part well, because we are going to need it to understand Precision vs Recall.

Precision

Let’s say we’ve created a model to classify images of dogs vs not dogs, and we tested it on 10 images. Let’s focus on the dog class:

10 images: 6 of them are dog images, and the other 4 are not. The model classified 7 images as dogs, 4 of which are correct predictions.
Video by codebasics

Let’s see: there are 6 images of dogs and 4 non-dog images. Our model classified 7 images as dogs; however, only 4 of those predictions are correct.

The formula for Precision:

Precision = (# of True Positives) / (# of True Positives + # of False Positives)

Let’s compute Precision for the dog class together: there are 4 True Positive images (4 dog images that were classified as dogs). Now we divide that by 4 True Positives + 3 False Positives (3 images of not dogs that were classified as dogs).

4/7 ≈ 0.57.
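The same calculation as a short Python sketch. The two lists are one hypothetical arrangement of the 10 images that matches the counts in the example (6 dogs, 4 not dogs, 7 images classified as dogs, 4 of them correctly); the order of the images is an assumption and does not affect the result.

```python
# 6 real dogs followed by 4 images that are not dogs.
actual = ["dog"] * 6 + ["not dog"] * 4

# The model calls 7 images "dog": 4 real dogs and 3 images that are not dogs.
predicted = ["dog"] * 4 + ["not dog"] * 2 + ["dog"] * 3 + ["not dog"] * 1

pairs = list(zip(actual, predicted))
true_positives = sum(a == "dog" and p == "dog" for a, p in pairs)
false_positives = sum(a != "dog" and p == "dog" for a, p in pairs)

precision = true_positives / (true_positives + false_positives)

print(f"TP={true_positives}, FP={false_positives}")
print(f"precision = {precision:.2f}")
```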

What does it tell us?

Well, basically, it tells us what percentage of the images classified as dogs really are dog images.

When we calculate Precision, we focus on the predictions.

If we have 100% Precision, we can be 100% sure that if our model classifies an image as a dog, it’s definitely correct. Even if it has missed most dog images, those that it did classify as dogs are 100% correct. In Precision, we only care about whether the predictions for the targeted class are correct.
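As an illustration of that last point, here is a sketch of a very cautious, hypothetical model on the same 10 images. It labels only 2 images as dogs (this prediction pattern is made up for the example), and both are right, so Precision is 100% even though most dogs are missed.

```python
# Same 10 images as before: 6 dogs, 4 not dogs.
actual = ["dog"] * 6 + ["not dog"] * 4

# A cautious model: it only dares to call 2 images "dog", and both are dogs.
predicted = ["dog"] * 2 + ["not dog"] * 8

pairs = list(zip(actual, predicted))
true_positives = sum(a == "dog" and p == "dog" for a, p in pairs)
false_positives = sum(a != "dog" and p == "dog" for a, p in pairs)
dogs_missed = sum(a == "dog" and p != "dog" for a, p in pairs)

precision = true_positives / (true_positives + false_positives)

print(f"precision = {precision:.0%}")
print(f"dogs missed: {dogs_missed} of {actual.count('dog')}")
```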

Recall

Let’s keep the same example. I will attach the same photo so you don’t have to scroll up again:

Video by codebasics

The formula for Recall:

Recall = (# of True Positives) / (# of True Positives + # of False Negatives)

Let’s compute it together: 4 True Positives divided by 4 True Positives + 2 False Negatives (2 images of dogs that were classified as not dogs).

4/6 ≈ 0.67
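Here is the Recall calculation as a Python sketch, using one hypothetical arrangement of the 10 images that matches the counts in the example (the order of the images is an assumption and does not change the result).

```python
# 6 real dogs followed by 4 images that are not dogs.
actual = ["dog"] * 6 + ["not dog"] * 4

# The model finds 4 of the 6 dogs, misses 2, and wrongly flags 3 non-dogs.
predicted = ["dog"] * 4 + ["not dog"] * 2 + ["dog"] * 3 + ["not dog"] * 1

pairs = list(zip(actual, predicted))
true_positives = sum(a == "dog" and p == "dog" for a, p in pairs)
false_negatives = sum(a == "dog" and p != "dog" for a, p in pairs)

recall = true_positives / (true_positives + false_negatives)

print(f"TP={true_positives}, FN={false_negatives}")
print(f"recall = {recall:.2f}")
```

Note that the 3 non-dog images classified as dogs play no role here: Recall only looks at what happened to the actual dogs.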

What does it tell us?

So, if Precision showed us the percentage of dog images that were correctly classified among the images predicted to be dogs, Recall shows us the percentage of dog images correctly classified among the actual images of dogs.

When we calculate Recall, we focus on the actual data.

If we have 100% Recall, we can be 100% sure that if given a set of, let’s say, 20 images, 8 of which are dog images, our model will classify all 8 dog images as dogs. However, it might also say that the other 12 images are dogs too, but in Recall we only care about the actual targeted images being classified correctly.
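The extreme case of that 20-image scenario can be sketched in Python: a hypothetical model that calls every image a dog. It reaches 100% Recall, but its Precision shows how many of those "dog" predictions are wrong.

```python
# 20 images, 8 of which are dogs, as in the scenario above.
actual = ["dog"] * 8 + ["not dog"] * 12

# A model that says "dog" for every single image.
predicted = ["dog"] * len(actual)

pairs = list(zip(actual, predicted))
true_positives = sum(a == "dog" and p == "dog" for a, p in pairs)
false_positives = sum(a != "dog" and p == "dog" for a, p in pairs)
false_negatives = sum(a == "dog" and p != "dog" for a, p in pairs)

recall = true_positives / (true_positives + false_negatives)
precision = true_positives / (true_positives + false_positives)

print(f"recall = {recall:.0%}")
print(f"precision = {precision:.0%}")
```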

Precision vs Recall Tradeoff

I won’t focus on this topic a lot; if you want me to write a post about it, do let me know in the comments. But basically, when Precision goes up, Recall usually goes down, and vice versa.
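One common way this tradeoff shows up (an assumption added here, the article does not go into it) is when a model outputs a confidence score and you pick a threshold above which an image counts as a dog. The sketch below uses made-up scores for 10 images and a hypothetical helper, `precision_recall`, to show both metrics at three thresholds.

```python
def precision_recall(scores, is_dog, threshold):
    """Precision and Recall for the dog class at a given score threshold."""
    predicted_dog = [s >= threshold for s in scores]
    pairs = list(zip(predicted_dog, is_dog))
    tp = sum(p and a for p, a in pairs)
    fp = sum(p and not a for p, a in pairs)
    fn = sum(a and not p for p, a in pairs)
    # Convention for this sketch: report 0.0 when a ratio is undefined.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up model scores ("how sure am I this is a dog?") and true labels.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]
is_dog = [True, True, True, False, True, True, False, True, False, False]

for threshold in (0.25, 0.55, 0.85):
    precision, recall = precision_recall(scores, is_dog, threshold)
    print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")
```

On this toy data, raising the threshold makes the model pickier, so Precision rises while Recall falls. Real models will not always behave this cleanly, which is why the tradeoff is described as "usually".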

How do I know what I need to focus on more? (Real-world examples)

You might have a question like: “Okay, I understand the difference, and that there is a tradeoff, but when do I need a higher precision, and when do I need a higher recall?”

When do we care about Precision more?

Let’s imagine we are working for Google, specifically on a model that classifies emails as spam vs not spam, and all spam emails will be hidden. In this case, we don’t want to accidentally classify an important email as spam, since it will be hidden. Therefore, we want to be sure that if the model classifies an email as spam, it’s correct. We don’t really care if some spam emails don’t get hidden from the user; they will be able to hide those on their own. Take a while to process it, and try to understand how Precision fits the example.

What about Recall? When do we focus on it more?

Let’s imagine we are working at an airport, and we have built a model to classify dangerous people. If our model doesn’t classify a dangerous person as dangerous, they might go undetected by security too, and then cause a lot of damage. This is why we need to classify every dangerous person as a dangerous person.

On the other hand, if the model classifies a regular civilian as dangerous, we don’t really care: they will get checked, and then move on.

Exercises

  1. Take the example data with dogs vs not dogs, and try to calculate Precision and Recall for the not a dog class. (Think of the not a dog class as your Positive class.)
  2. Try to come up with your own definition for Precision and Recall.
  3. Think of a project or even a real-world problem where Precision would be more important, and vice versa.
  4. Let me know your answers in the comments, and I will tell you if you are right or wrong.

I do hope this article helped you. I tried to make everything as clear as it can get. But if you still have some questions, do ask them in the comments, and I will try to help you.


Precision vs Recall. What Do They Actually Tell You? was originally published in Towards AI on Medium.

Published via Towards AI
