Decision Tree Splitting: Entropy vs. Misclassification Error
Last Updated on October 26, 2022 by Editorial Team
Author(s): Poojatambe
Why is entropy preferred over misclassification error to perform decision tree splitting?
A decision tree is built with a top-down, greedy search that recursively partitions the data: regions are split again and again until homogeneous groups are formed. To create these partitions, a series of questions about the attributes is asked at each step.
To split the tree at each step, we need to choose the attribute that maximizes the decrease in loss from the parent node to its child nodes. Defining a suitable loss function is therefore an important step.
Here, we will look at entropy and misclassification error, and answer why misclassification error is not used for splitting.
Entropy
Entropy is a measure from information theory used to quantify uncertainty or impurity in data. The ID3 algorithm uses entropy and information gain as its loss function to choose the splitting attribute at each step.
Consider a dataset with C classes. The entropy (cross-entropy loss) for region R is calculated as follows:
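Entropy(R) = -Σc Pc · log2(Pc), with the sum taken over the C classes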
where Pc is the proportion of examples in region R that belong to class c (equivalently, the probability that a randomly selected example from R is of class c).
For a two-class problem, the entropy ranges from 0 to 1 (more generally, from 0 to log2(C)). An entropy of zero indicates the region is pure, i.e., homogeneous.
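As a quick illustration, here is a minimal Python sketch (the entropy helper name is my own) that computes the entropy of a region from its class counts:

```python
import numpy as np

def entropy(class_counts):
    """Entropy of a region, given the number of samples in each class."""
    counts = np.asarray(class_counts, dtype=float)
    p = counts / counts.sum()   # class proportions Pc
    p = p[p > 0]                # 0 * log2(0) is treated as 0 (a pure node has entropy 0)
    return -np.sum(p * np.log2(p))

print(entropy([500, 500]))  # 1.0, maximum impurity for two classes
print(entropy([900, 100]))  # ~0.469, the parent region used in the example below
```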
Misclassification error
The misclassification loss computes the fraction of misclassified samples, so it depends only on the proportion of the majority class in region R. Again consider C target classes, and let Pc be the proportion of samples in R that belong to class c.
The misclassification loss is computed as follows:
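ME(R) = 1 - max_c Pc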
For a two-class problem, the misclassification error ranges from 0 to 0.5 (more generally, from 0 to 1 - 1/C).
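A matching Python sketch for the misclassification error, again with a helper name of my own choosing:

```python
import numpy as np

def misclassification_error(class_counts):
    """Fraction of samples in a region that do not belong to its majority class."""
    counts = np.asarray(class_counts, dtype=float)
    p = counts / counts.sum()   # class proportions Pc
    return 1.0 - p.max()        # 1 - max_c Pc

print(misclassification_error([500, 500]))  # 0.5, the maximum for two classes
print(misclassification_error([900, 100]))  # 0.1
print(misclassification_error([200, 0]))    # 0.0, a pure node
```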
Entropy vs Misclassification Error
The attribute chosen for splitting is the one that maximizes the decrease in loss from the parent region to the child nodes (equivalently, minimizes the weighted loss of the children). This decrease is called information gain and is given as follows:
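Gain = Loss(parent) - Σk (Nk / N) · Loss(child k), where N is the number of samples in the parent region and Nk is the number of samples in child k.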
To calculate loss, we need to define a suitable loss function. Let's compare entropy and misclassification loss with the help of an example.
Consider 900 "positive" samples and 100 "negative" samples, and assume the attribute X1 is used for splitting at the parent node. Consider a split that distributes the samples unequally across the two child nodes: one pure node containing 200 "positive" samples, and one impure node containing 700 "positive" and 100 "negative" samples.
With entropy as the loss function, the parent loss is 0.469 and the weighted children loss is 0.435. As one node is pure, its entropy is zero, while the impure node has an entropy of 0.544.
Using the information gain formula, the loss reduction from the parent to the child regions is calculated as:
Gain = Entropy(parent) - [Entropy(left child) × (No. of samples in left child / No. of samples in parent) + Entropy(right child) × (No. of samples in right child / No. of samples in parent)]
Gain = 0.469 - [0.544 × (800/1000) + 0 × (200/1000)]
Gain ≈ 0.034
With misclassification error as the loss function, the parent loss is 0.1 and the weighted children loss is also 0.1 (the impure node has an error of 0.125, the pure node an error of 0).
The information gain is calculated as,
Gain = ME(parent) - [ME(left child) × (No. of samples in left child / No. of samples in parent) + ME(right child) × (No. of samples in right child / No. of samples in parent)]
Gain = (100/1000) - [(100/800) × (800/1000) + 0 × (200/1000)]
Gain = 0
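To double-check the arithmetic, the sketch below reuses the entropy and misclassification_error helpers sketched earlier to compute both gains for this split:

```python
def information_gain(parent_counts, children_counts, impurity):
    """Loss(parent) minus the sample-weighted loss of the children."""
    n_parent = sum(parent_counts)
    weighted_children = sum(
        (sum(child) / n_parent) * impurity(child) for child in children_counts
    )
    return impurity(parent_counts) - weighted_children

parent = [900, 100]                 # 900 positive, 100 negative
children = [[700, 100], [200, 0]]   # the impure node and the pure node

print(information_gain(parent, children, entropy))                  # ≈ 0.034
print(information_gain(parent, children, misclassification_error))  # ≈ 0 (no gain)
```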
From the above gain values, we see that with misclassification error the split yields no information gain, so the tree stops growing at this point. With entropy, however, the split still reduces the loss, and the tree can keep partitioning until the leaf nodes are pure and their entropy becomes zero.
Let's see why this happens from a geometric perspective.
The graphs referred to here plot impurity as a function of the class proportion, assuming an even split of the data into two nodes (a sketch of these impurity curves is given below). Because the entropy function is strictly concave, the weighted loss of the children is guaranteed to be lower than the parent's loss whenever the split changes the class proportions. The misclassification error is piecewise linear, so no such guarantee holds: the children's weighted loss can equal the parent's loss, exactly as in the example above.
Therefore, unlike entropy, the misclassification loss is not always sensitive to changes in the class probabilities, which is why entropy is commonly used to build classification trees.
Gini impurity is also strictly concave and behaves much like entropy in this respect, so it too is preferred over misclassification loss for building decision trees.
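The original plots are not reproduced here, but a rough matplotlib sketch of the three impurity measures for a two-class problem, as a function of the positive-class proportion p, would look like this:

```python
import numpy as np
import matplotlib.pyplot as plt

p = np.linspace(0.001, 0.999, 500)   # proportion of the positive class

entropy_curve = -(p * np.log2(p) + (1 - p) * np.log2(1 - p))
gini_curve = 2 * p * (1 - p)                    # Gini impurity: 1 - p^2 - (1 - p)^2
misclassification_curve = np.minimum(p, 1 - p)  # 1 - max(p, 1 - p)

plt.plot(p, entropy_curve, label="Entropy")
plt.plot(p, gini_curve, label="Gini impurity")
plt.plot(p, misclassification_curve, label="Misclassification error")
plt.xlabel("Proportion of the positive class (p)")
plt.ylabel("Impurity")
plt.title("Concave entropy/Gini vs. piecewise-linear misclassification error")
plt.legend()
plt.show()
```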
References
- https://tushaargvs.github.io/assets/teaching/dt-notes-2020.pdf
- https://sebastianraschka.com/faq/docs/decisiontree-error-vs-entropy.html
Happy Learning!!