Introduction to Confusion Matrix

Last Updated on January 7, 2023 by Editorial Team

Last Updated on September 23, 2022 by Editorial Team

Author(s): Saurabh Saxena

Originally published on Towards AI.

What is a Confusion Matrix, and how do you plot it in Python?


The Confusion Matrix is a visual representation of actual versus predicted values. Also known as the error matrix, it is a performance evaluation tool for classification algorithms.

It is a two-dimensional table that shows how many instances of each class were predicted correctly and how many were not, which makes it easy to visualize the performance of an algorithm, typically in supervised learning.

In predictive analytics, a Confusion Matrix for binary classification is a table with two rows and two columns that reports the number of true positives, false negatives, false positives, and true negatives. This allows for more detailed analysis than simply observing accuracy.


Why Confusion Matrix over Accuracy?

Accuracy can be misleading on an imbalanced dataset, where the number of observations per class varies greatly. The Confusion Matrix, by contrast, breaks the results down into correct and incorrect predictions for both the positive and the negative class.
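To make the point concrete, here is a minimal sketch on a made-up imbalanced dataset (95 healthy people labeled 0 and 5 sick people labeled 1, both assumptions for illustration). A hypothetical "model" that always predicts healthy scores high on accuracy while missing every sick person.

```python
# Hypothetical imbalanced labels: 95 healthy (0) and 5 sick (1).
y_true = [0] * 95 + [1] * 5

# A naive "model" that predicts healthy for everyone.
y_pred = [0] * 100

# Fraction of predictions that match the actual label.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Split the sick people into those the model found and those it missed.
sick_found = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
sick_missed = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

print(f"accuracy = {accuracy:.2f}")
print(f"sick found = {sick_found}, sick missed = {sick_missed}")
```

The accuracy alone looks strong, yet the per-class breakdown shows that no sick person was detected, which is the detail a Confusion Matrix exposes.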

The Confusion Matrix consists of four counts: True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN).

Let’s understand them with an analogy in which the algorithm has to categorize whether a person is Healthy or Sick, with Sick as the positive class.

Confusion Matrix for Binary Classification | Image by Author

(1) True Positive (TP)

The algorithm predicted “Person is Sick” for a person who is Sick. The algorithm has correctly classified the positive case. TP is the number of correct predictions when the actual class is positive.

(2) True Negative (TN)

The algorithm predicted “Person is Healthy” for a person who is Healthy. The algorithm has correctly classified the negative case. TN is the number of correct predictions when the actual class is negative.

(3) False Positive (FP)

The algorithm predicted “Person is Sick” for a person who is Healthy. Here the algorithm raised a false alarm by classifying the person as positive instead of negative. FP is the number of incorrect predictions when the actual class is negative, also referred to as a Type I Error.

(4) False Negative (FN)

The algorithm predicted “Person is Healthy” for a person who is Sick. Here the algorithm missed a sick person by categorizing them as healthy. FN is the number of incorrect predictions when the actual class is positive, also referred to as a Type II Error.
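The four counts can be tallied by hand, which may help in checking the definitions. The sketch below uses a hypothetical helper `count_outcomes` and made-up labels (1 = Sick, 0 = Healthy); neither comes from the original article.

```python
def count_outcomes(y_true, y_pred, positive=1):
    """Return (tp, tn, fp, fn) for binary labels.

    Hypothetical helper for illustration only.
    """
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted Sick, actually Sick
            else:
                fp += 1  # predicted Sick, actually Healthy (Type I Error)
        else:
            if actual == positive:
                fn += 1  # predicted Healthy, actually Sick (Type II Error)
            else:
                tn += 1  # predicted Healthy, actually Healthy
    return tp, tn, fp, fn


# Made-up labels: 1 = Sick (positive), 0 = Healthy (negative).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tp, tn, fp, fn = count_outcomes(y_true, y_pred)
print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}")
```

Note that the four counts always add up to the number of samples, since every prediction falls into exactly one cell.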

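from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

# Load the breast cancer dataset as a feature matrix X and label vector y
X, y = load_breast_cancer(return_X_y=True)

# Hold out 33% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=0.33,
                                                    random_state=42)

# Fit a logistic regression classifier and predict on the test set
lr = LogisticRegression()
lr.fit(X_train, y_train)
y_pred = lr.predict(X_test)

# Rows are actual classes, columns are predicted classes
confusion_matrix(y_test, y_pred)

Output:
array([[ 63,   4],
       [  3, 118]])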

For binary classification, the confusion_matrix API in sklearn returns a 2x2 array laid out as [[TN, FP], [FN, TP]], so reading it row by row gives TN, FP, FN, and TP, respectively. The same array can be plotted using the ConfusionMatrixDisplay API or the heatmap API of any visualization library.
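As a small illustration of that ordering, the sketch below flattens the array with `ravel()` and unpacks the four counts. The labels are made up for the example; only `confusion_matrix` and NumPy's `ravel` are assumed.

```python
from sklearn.metrics import confusion_matrix

# Made-up binary labels: 0 = negative, 1 = positive.
y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 1, 0, 1, 0, 1, 1, 0]

# Rows are actual classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)

# Flattening row by row yields TN, FP, FN, TP for the binary case.
tn, fp, fn, tp = cm.ravel()
print(f"TN={tn}, FP={fp}, FN={fn}, TP={tp}")
```

Unpacking this way only works for two classes; with more classes the flattened array has more than four entries.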

Below is Python code for evaluating and plotting the Confusion Matrix. It returns the array of TN, FP, FN, and TP and draws the matrix in the seaborn theme.

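import seaborn as sns
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=0.33,
                                                    random_state=42)
lr = LogisticRegression()
lr.fit(X_train, y_train)
y_pred = lr.predict(X_test)

# sklearn's confusion_matrix returns only the array, so it cannot be
# unpacked into (conf_mat, ax). The author's plotting helper is not shown;
# seaborn.heatmap is used here as an assumed stand-in.
sns.set_theme()
conf_mat = confusion_matrix(y_test, y_pred)
ax = sns.heatmap(conf_mat, annot=True, fmt="d", cmap="Blues")
ax.set(xlabel="Predicted", ylabel="Actual")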

Below is the output for the code.

Confusion Matrix | Image by Author

The goal is to make the TP and TN counts as high as possible while keeping FP and FN low.

In this blog, we covered what a Confusion Matrix is and how to plot it in Python. True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN) are the building blocks of the Confusion Matrix.

Multiple metrics can be derived from the Confusion Matrix, such as Accuracy, Precision, Recall, ROC, and many more. Please refer to Deep dive into Confusion Matrix for details.
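As a brief sketch of how such metrics follow from the four counts, the example below applies the standard formulas for accuracy, precision, and recall to the matrix reported earlier in this article (TN = 63, FP = 4, FN = 3, TP = 118). The variable names are illustrative.

```python
# Counts taken from the confusion matrix output shown in this article.
tn, fp, fn, tp = 63, 4, 3, 118

total = tn + fp + fn + tp

# Accuracy: share of all predictions that are correct.
accuracy = (tp + tn) / total

# Precision: share of predicted positives that are actually positive.
precision = tp / (tp + fp)

# Recall: share of actual positives that were predicted positive.
recall = tp / (tp + fn)

print(f"accuracy  = {accuracy:.4f}")
print(f"precision = {precision:.4f}")
print(f"recall    = {recall:.4f}")
```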

References:

[1] sklearn Confusion Matrix API. https://scikit-learn.org/stable/modules/generated/sklearn.metrics.confusion_matrix.html#sklearn.metrics.confusion_matrix

[2] sklearn ConfusionMatrixDisplay API. https://scikit-learn.org/stable/modules/generated/sklearn.metrics.ConfusionMatrixDisplay.html#sklearn.metrics.ConfusionMatrixDisplay

[3] seaborn Heatmap API. https://seaborn.pydata.org/generated/seaborn.heatmap.html


Introduction to Confusion Matrix was originally published in Towards AI on Medium.
