A Unique Way of Visualising Confusion Matrix — Sankey Chart

Last Updated on January 6, 2023 by Editorial Team

Author(s): Hrishikesh Patel

Originally published on Towards AI.

Go Sankey for Less Confusion!

A confusion matrix conveniently summarizes a machine learning model’s performance. However, non-technical stakeholders often find it unintuitive 🤔. One fix is to present it as a Sankey diagram.

Sankey diagram representing a binary confusion matrix (image by the author)

The image above shows the Sankey diagram for a typical binary confusion matrix. In the diagram:

The rectangular boxes on the left show True classes, whereas those on the right show Predicted classes.
Green highlights correct classifications, and red highlights misclassifications.

Story outline

What’s a Sankey diagram in a nutshell?
How to Create a Sankey diagram from a Confusion Matrix?
Bonus 🎁

What’s a Sankey diagram in a nutshell?

A Sankey diagram is used to visualize flow or connections from source to sink. Let’s understand its application with a simple example.

Suppose we have a dataset of enrolments👨‍🎓👩‍🎓 in data science and business analytics courses at three universities🏫. Here the universities are the sources and the courses are the sinks. The number of enrolments defines the strength of the connection from a source to a sink. Some connections are heavier than others; for example, the connection from University A to Data Science is heavier than its connection to Business Analytics.

Sankey diagram created from https://sankeymatic.com/build/
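
To make the source, sink, and flow idea concrete, here is a minimal sketch of the enrolment example as a table of links. The enrolment counts are made up for illustration (the original figures are not given in the text); the only property carried over from the story is that University A sends more students to Data Science than to Business Analytics.

```python
import pandas as pd

# Enrolment counts are hypothetical, made up for illustration only.
flows = pd.DataFrame(
    [
        ("University A", "Data Science", 120),
        ("University A", "Business Analytics", 40),
        ("University B", "Data Science", 60),
        ("University B", "Business Analytics", 80),
        ("University C", "Data Science", 50),
        ("University C", "Business Analytics", 50),
    ],
    columns=["source", "target", "value"],
)

# In a Sankey diagram, the width of each band is proportional to "value".
outflow = flows.groupby("source")["value"].sum()  # total leaving each university
inflow = flows.groupby("target")["value"].sum()   # total arriving at each course

print(flows)
print(outflow)
print(inflow)
```

Each row is one band in the diagram, and the same source/target/value shape is what we need for a confusion matrix.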

A Sankey diagram for a confusion matrix has the following components:

Source: True Classes
Target (Sink): Predicted Classes
Connection/flow: Number of instances

How to Create a Sankey diagram from a Confusion Matrix?

We’ll follow a three-step approach, illustrated in the image below, to create the Sankey diagram.

Three-step approach to plotting a Sankey diagram from a confusion matrix (image by the author)

Step-1: Get Confusion Matrix

In this step, we’ll generate a confusion matrix. This can be the output of the scikit-learn confusion_matrix function. For simplicity, we’ll use the following confusion matrix.
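
As a minimal sketch of this step, the snippet below calls scikit-learn’s confusion_matrix on two short label lists. The labels y_true and y_pred are hypothetical and made up for illustration; the resulting matrix is not the one from the original story.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels, made up for illustration only.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0]

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
print(cm)
```

Passing labels explicitly fixes the row and column order, which matters later when rows become sources and columns become targets.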

Step-2: Transform Confusion Matrix to DataFrame

We’ll divide this step into several small steps.

2.1 — Create a dataframe from the confusion matrix
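
One way this sub-step might look, assuming a small hypothetical 2x2 matrix and the "True Class N" / "Predicted Class N" naming used for the nodes later in the story:

```python
import numpy as np
import pandas as pd

# Hypothetical 2x2 confusion matrix (rows: true, columns: predicted).
cm = np.array([[3, 1], [2, 4]])
n_classes = cm.shape[0]

# Label rows with true classes and columns with predicted classes.
cm_df = pd.DataFrame(
    cm,
    index=[f"True Class {i + 1}" for i in range(n_classes)],
    columns=[f"Predicted Class {i + 1}" for i in range(n_classes)],
)
print(cm_df)
```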

2.2 — Restructure the dataframe
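
A Sankey chart needs one row per link, so the wide matrix has to become a long table. The sketch below does this with pandas melt; the column names source, target, and value are my choice for illustration, and the matrix values are hypothetical.

```python
import pandas as pd

# Hypothetical wide-format confusion matrix from the previous sub-step.
cm_df = pd.DataFrame(
    [[3, 1], [2, 4]],
    index=["True Class 1", "True Class 2"],
    columns=["Predicted Class 1", "Predicted Class 2"],
)

# Move the true class out of the index, then unpivot the predicted-class
# columns so that every matrix cell becomes one (source, target, value) row.
long_df = (
    cm_df.rename_axis("source")
    .reset_index()
    .melt(id_vars="source", var_name="target", value_name="value")
)
print(long_df)
```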

2.3 — Add a new column ‘color’

Now we’ll add a new column ‘color’ that marks whether each prediction is correct. Correct predictions are highlighted in green, rgba(211,255,216,0.6), and incorrect predictions in red, rgba(245,173,168,0.6).
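
A possible implementation, assuming the long table has source and target columns named like "True Class 1" and "Predicted Class 1" (a hypothetical layout): a link is correct when the class part of both labels matches.

```python
import numpy as np
import pandas as pd

GREEN = "rgba(211,255,216,0.6)"  # correct predictions
RED = "rgba(245,173,168,0.6)"    # misclassifications

# Hypothetical long-format table: one row per confusion-matrix cell.
long_df = pd.DataFrame(
    {
        "source": ["True Class 1", "True Class 2", "True Class 1", "True Class 2"],
        "target": ["Predicted Class 1", "Predicted Class 1",
                   "Predicted Class 2", "Predicted Class 2"],
        "value": [3, 2, 1, 4],
    }
)

# Strip the prefixes so that only the class name is compared.
true_cls = long_df["source"].str.replace("True ", "", regex=False)
pred_cls = long_df["target"].str.replace("Predicted ", "", regex=False)

long_df["color"] = np.where(true_cls == pred_cls, GREEN, RED)
print(long_df)
```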

2.4 — Map source and target columns to a numeric index
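
Plotly’s Sankey trace refers to nodes by their position in a list of labels, so the text labels need integer positions. Here is a sketch under the same hypothetical table layout; the source_id and target_id column names are assumptions for illustration.

```python
import pandas as pd

# Hypothetical long-format table: one row per confusion-matrix cell.
long_df = pd.DataFrame(
    {
        "source": ["True Class 1", "True Class 2", "True Class 1", "True Class 2"],
        "target": ["Predicted Class 1", "Predicted Class 1",
                   "Predicted Class 2", "Predicted Class 2"],
        "value": [3, 2, 1, 4],
    }
)

# Node list: all true classes first, then all predicted classes.
labels = long_df["source"].unique().tolist() + long_df["target"].unique().tolist()
node_index = {label: i for i, label in enumerate(labels)}

# Replace each label by its position in the node list.
long_df["source_id"] = long_df["source"].map(node_index)
long_df["target_id"] = long_df["target"].map(node_index)
print(labels)
print(long_df)
```

Keeping the labels list around is useful, since the same list is passed to the chart as the node labels.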

Let’s add a new column for the text to show when we hover over the chart.

2.5 — Add New Column “tooltip”
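
The exact tooltip wording from the original story is not available, so the text format below is an assumption; the sketch simply builds one sentence per link from the existing columns.

```python
import pandas as pd

# Hypothetical long-format table: one row per confusion-matrix cell.
long_df = pd.DataFrame(
    {
        "source": ["True Class 1", "True Class 2", "True Class 1", "True Class 2"],
        "target": ["Predicted Class 1", "Predicted Class 1",
                   "Predicted Class 2", "Predicted Class 2"],
        "value": [3, 2, 1, 4],
    }
)

# Build the hover text for each link (wording is illustrative only).
long_df["tooltip"] = (
    long_df["value"].astype(str)
    + " instances of "
    + long_df["source"]
    + " classified as "
    + long_df["target"]
)
print(long_df["tooltip"].tolist())
```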

Now we are ready to plot the chart.

Step-3: Create a Sankey Chart

The plotting function go.Sankey (from plotly.graph_objects) takes two main arguments: node and link. Nodes are the classes (True Class 1, Predicted Class 2, etc.), and links are the flows between True and Predicted classes.

Bonus

Thanks for finishing all the steps😀 Going through them by hand every time is tedious, so why not wrap them in a function? I have created a handy function that plots a Sankey chart for any confusion matrix (binary or multi-class).

Feel free to check out this notebook on GitHub to learn more about the function.

Before you go!

I hope you enjoyed the story and found it useful. Follow me on Medium if you’d like more stories like this, and subscribe to get my new stories delivered directly to your inbox.
