

Do Not Curse Your Machine Learning Models When They Are Not Performing Well in Real-time — Instead, Do This

Last Updated on July 17, 2023 by Editorial Team

Author(s): Suhas Maddali

Originally published on Towards AI.

A machine learning model can look extremely good on test data, yet overlooking the chance that it will perform worse on real-time data can cost the business heavily.

Photo by Jack Blueberry on Unsplash

Alright, you have done a lot of work gathering the relevant data and framing it as a machine learning problem. You then considered various approaches to data processing before passing the data to ML models. Next, you trained the models so that they could learn the relationship between the input and the output, and they picked up these intricate connections before making predictions on the test data. The result is a model that generalizes well not only on the training data (the initial data) but also on the test data (data it has not seen before). You did a great job. As the next step, you deploy the model in real-time, only to find that it does not perform as well as it did during training and testing.

In this article, we will look at ways to understand this phenomenon better and potentially deal with it. Even though a model does well on the test data, the same performance cannot always be guaranteed on real-time data. Understanding the factors that cause this gap is therefore the first step toward diagnosing the issue. Let us now go over the factors that can make models less useful on real-time data and the steps that can reduce these problems to a large extent.

There are many reasons why machine learning models can fail in real-time, especially when their performance on real-time data is not constantly monitored. Behind the scenes, we might have put sufficient work into training the models and ensuring that they perform well on the test data (unseen data). However, that performance is not guaranteed to carry over once the models reach production. We will now look at the ways in which machine learning models can fail after being put into production, along with the steps that can be taken to reduce these failures.

Ways in which ML models fail in production

Photo by Michael Dziedzic on Unsplash

Below are two common reasons why machine learning models fail once they are put into production.

Data Drift

There can often be situations where the distribution of the input features, or the relationship between those features, changes after a model has been trained. Machine learning models primarily work by learning the relationships between various features and how influential each one is in determining the output, so a model trained on the old data can be caught off guard. For example, the sentiment of users toward various products might change due to unforeseen situations.

A solid example is the change in people's behavior between pre-covid and post-covid. Pre-covid, many people were inclined to go out, shop, and buy more products instead of spending much time watching television. Post-covid, this has changed, and people spend a good amount of time watching television. This is data drift: the data used to train our models in the first place no longer represents current customers, so the models cannot accurately predict their behavior and the resulting outcomes.
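The shift in this example can be made measurable by comparing a feature's values at training time with its values in production. The following is a minimal sketch that uses only the Python standard library and a two-sample Kolmogorov-Smirnov statistic; the feature (weekly television hours), the sample sizes, and the simulated numbers are all assumptions made for illustration. In practice a library routine such as SciPy's `ks_2samp` would normally be used instead of a hand-written statistic.

```python
import bisect
import math
import random


def ks_statistic(reference, current):
    """Largest gap between the two empirical CDFs (two-sample KS statistic)."""
    ref_sorted = sorted(reference)
    cur_sorted = sorted(current)
    max_gap = 0.0
    # Both CDFs are step functions, so the largest gap occurs at a data point.
    for value in ref_sorted + cur_sorted:
        ref_cdf = bisect.bisect_right(ref_sorted, value) / len(ref_sorted)
        cur_cdf = bisect.bisect_right(cur_sorted, value) / len(cur_sorted)
        max_gap = max(max_gap, abs(ref_cdf - cur_cdf))
    return max_gap


def drift_detected(reference, current, alpha=0.05):
    """Flag drift when the KS statistic exceeds its large-sample critical value."""
    n, m = len(reference), len(current)
    critical = math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, current) > critical


rng = random.Random(0)
# Hypothetical feature: weekly hours of television watched per customer.
tv_hours_before = [rng.gauss(8, 2) for _ in range(500)]   # training period
tv_hours_after = [rng.gauss(14, 2) for _ in range(500)]   # production period

print("KS statistic:", round(ks_statistic(tv_hours_before, tv_hours_after), 3))
print("Drift detected:", drift_detected(tv_hours_before, tv_hours_after))
```

Running a check like this per feature on a schedule gives an early warning that the production data no longer looks like the training data, even before labels for the new data are available.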

We might reduce the impact of data drift by using statistical tests that measure how useful each feature is in determining the outcome or target. Feature selection techniques, such as forward feature selection and backward feature selection, could also mitigate the effects of data drift during both the testing phase and real-time deployment.
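As an illustration of the mechanics of forward feature selection, here is a minimal sketch in plain Python. The nearest-centroid classifier, the `min_gain` stopping rule, and the synthetic three-feature dataset are assumptions chosen to keep the example self-contained; they are not taken from the article. The sketch only shows how features are added greedily. Backward selection works the same way in reverse, starting from all features and removing the least useful one at each step. A library implementation such as scikit-learn's `SequentialFeatureSelector` covers both directions.

```python
import random


def centroid_accuracy(train, valid, features):
    """Validation accuracy of a nearest-centroid classifier on the chosen columns."""
    centroids = {}
    for label in (0, 1):
        rows = [x for x, y in train if y == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in features]

    def predict(x):
        def squared_distance(label):
            return sum((x[f] - c) ** 2 for f, c in zip(features, centroids[label]))
        return min((0, 1), key=squared_distance)

    return sum(predict(x) == y for x, y in valid) / len(valid)


def forward_selection(train, valid, n_features, min_gain=0.01):
    """Greedily add the feature that improves validation accuracy the most."""
    selected, best_score = [], 0.0
    remaining = list(range(n_features))
    while remaining:
        scored = [(centroid_accuracy(train, valid, selected + [f]), f) for f in remaining]
        score, feature = max(scored)
        if score - best_score < min_gain:
            break  # no remaining feature helps enough, so stop
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score


rng = random.Random(1)


def make_rows(size):
    """Synthetic data: feature 0 depends on the label, features 1 and 2 are noise."""
    rows = []
    for _ in range(size):
        label = rng.randint(0, 1)
        features = [rng.gauss(3.0 * label, 1.0), rng.gauss(0, 1), rng.gauss(0, 1)]
        rows.append((features, label))
    return rows


train, valid = make_rows(400), make_rows(200)
selected, score = forward_selection(train, valid, n_features=3)
print("Selected feature indices:", selected)
print("Validation accuracy:", round(score, 3))
```

Scoring each candidate on held-out data rather than on the training rows is a deliberate choice here: it favors features that generalize, which is the property that matters once the model faces real-time data.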

Concept Drift

This phenomenon is similar to data drift, except that it concerns the relationship between the features and the output. In other words, the relationship between the input features and the output has changed significantly from what it was when we first trained and tested our machine learning models. In such cases, the relationship between the input and the output variable in the real-time data differs from the one the model learned. As a result, the model can make errors and might not match the performance we observed in the testing phase.
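A small simulation can show what this looks like. In the sketch below, the inputs in production are drawn from exactly the same distribution as in training, but the rule that links the input to the output has changed. The single feature, the threshold "model", and the two cut-off values (7 and 3) are hypothetical and chosen only for illustration.

```python
import random


def accuracy(cut, xs, ys):
    """Share of examples where the rule 'predict True if x > cut' matches the label."""
    return sum((x > cut) == y for x, y in zip(xs, ys)) / len(xs)


def fit_threshold(xs, ys):
    """Pick the cut-off between 0 and 10 with the best training accuracy."""
    candidates = [i / 10 for i in range(101)]
    return max(candidates, key=lambda cut: accuracy(cut, xs, ys))


rng = random.Random(42)

# Training period: a customer buys when the (hypothetical) score exceeds 7.
xs_train = [rng.uniform(0, 10) for _ in range(1000)]
ys_train = [x > 7 for x in xs_train]
cut = fit_threshold(xs_train, ys_train)

# Production period: same input distribution, but customers now buy above 3.
xs_live = [rng.uniform(0, 10) for _ in range(1000)]
ys_old_rule = [x > 7 for x in xs_live]
ys_new_rule = [x > 3 for x in xs_live]

acc_old = accuracy(cut, xs_live, ys_old_rule)
acc_new = accuracy(cut, xs_live, ys_new_rule)
print("Learned cut-off:", cut)
print("Accuracy if the old relationship still held:", round(acc_old, 3))
print("Accuracy under the new relationship:", round(acc_new, 3))
```

Because the inputs themselves look unchanged, a check on the feature distributions alone would not catch this. The drop only becomes visible once predictions are compared with fresh labels.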

One interesting example is the spending behavior of customers in response to online advertisements. Pre-covid, people used to travel a lot and spend a good amount outside, without being restricted to online shopping. During covid, however, people moved mostly toward online shopping, which led to a significant change in their behavior. ML models that were trained on past data to predict whether a customer will purchase a product can no longer be relied upon, because the shift in customer behavior is not yet reflected in them. This is a scenario where there is concept drift.

One of the best ways to handle this situation is to regularly re-train and test the models with the latest data, which reflects current trends, rather than relying only on past data to make useful predictions. The models then learn new patterns from the data before making their predictions, so they continue to serve their purpose.
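One way to picture this re-training loop is a sliding window over the most recent labelled data: each new batch is first used to test the current model and is then added to the window on which the model is re-fitted. The sketch below assumes a hypothetical stream of batches whose underlying rule changes halfway through. The window size, batch size, and threshold "model" are illustrative choices, not a recommendation.

```python
import random
from collections import deque


def accuracy(cut, examples):
    """Share of (x, label) pairs where 'x > cut' matches the label."""
    return sum((x > cut) == y for x, y in examples) / len(examples)


def fit_threshold(examples):
    """Pick the cut-off between 0 and 10 with the best accuracy on the examples."""
    candidates = [i / 10 for i in range(101)]
    return max(candidates, key=lambda cut: accuracy(cut, examples))


rng = random.Random(7)


def labelled_batch(rule_cut, size=250):
    """Hypothetical batch of fresh (feature, label) pairs under the current rule."""
    return [(x, x > rule_cut) for x in (rng.uniform(0, 10) for _ in range(size))]


# Four batches under the old relationship, then four under the new one.
stream = [labelled_batch(7) for _ in range(4)] + [labelled_batch(3) for _ in range(4)]

recent = deque(maxlen=500)  # keep only the latest 500 labelled examples
model_cut = None
history = []

for batch in stream:
    if model_cut is not None:
        # Test on the newest data before the model has been trained on it.
        history.append(accuracy(model_cut, batch))
    recent.extend(batch)
    model_cut = fit_threshold(list(recent))  # re-train on the recent window

print([round(a, 3) for a in history])
```

In this setup the accuracy dips on the first batch after the change and recovers once the window contains only recent data. A shorter window adapts faster but leaves less data to train on, which is the main trade-off to tune.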


After going through this article, you should have a good idea of why machine learning models can fail in real-time even though they performed well on the training and test data. The fundamental issue could be either data drift or concept drift, both of which can strongly influence the performance of models on real-time data. Steps such as forward feature selection and backward feature selection can help reduce the effects of data drift. To handle concept drift, a good approach is to re-train the models and perform cross-validation with the most recent data so that they pick up trends or sudden changes in the relationship between the inputs and the output.


Below are the ways you can contact me or take a look at my work.

GitHub: suhasmaddali (Suhas Maddali)

LinkedIn: Suhas Maddali, Northeastern University, Data Science | LinkedIn

Medium: Suhas Maddali — Medium


