Biggest Problem With ML Systems Today
Last Updated on August 11, 2022 by Editorial Team
Author(s): Astha Puri
Originally published on Towards AI, the world’s leading AI and technology news and media company.
So you built a machine learning model and deployed it to production. Most models built today never make it that far, so in this scenario we are among the rare few whose ML system is actually out there in the world. Hurray!
Only to watch its performance worsen over time. Why?! Why is its performance degrading? We checked everything. The model performed great on every evaluation metric there is. And still it degrades. The boss is annoyed, customers are complaining, and the business is heading for trouble…
Welcome to the concept of drift.
Drift happens when the operational environment of your ML system changes. The model was trained on data from one environment; as that environment inherently shifts over time, the model no longer predicts correctly. Hence the progression toward poorer and poorer performance.
How many times have you heard this at your company — ‘yes, there was a data scientist before you who built this amazing model that everyone was sold on. Then they left the company for more $$$. Their model no longer works, and now everyone is skeptical about using data science/ML altogether’.
If the model never performed well, a gazillion things could be the cause. But if it performed well at some point and you notice degradation over time, drift is something you should look at seriously.
There are 2 types of drift that commonly occur:
- Concept drift
- Data drift (also called covariate shift or feature drift)
1. Concept Drift
This happens when the relationship between the model’s inputs and the target it predicts changes — the environment the model was built for is no longer the environment it operates in.
Consider digital and online fraud, for example. Fraudsters are constantly adapting to changing security and protection protocols, and the definition of what counts as spam has evolved over time. These are fundamental changes in the environment.
To fix a drift like this, we need to change the model design itself. We can:
- add new features
- change ML model
- or do both of the above
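To make the “add new features” remedy concrete, here is a minimal illustrative sketch (not from the article): the data, the “attempts per hour” feature, and the toy nearest-centroid model are all assumptions chosen for illustration. After concept drift, the old feature alone no longer separates the classes, so we add a new feature and refit.

```python
# Illustrative sketch: after concept drift, "amount" alone no longer
# separates fraud from legitimate transactions, so we add a hypothetical
# second feature and refit a simple model.

def centroid_classifier(X, y):
    """Fit a toy nearest-centroid classifier; returns a predict function."""
    def mean(rows):
        return [sum(col) / len(col) for col in zip(*rows)]

    c0 = mean([x for x, label in zip(X, y) if label == 0])  # legit centroid
    c1 = mean([x for x, label in zip(X, y) if label == 1])  # fraud centroid

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return lambda x: 1 if sq_dist(x, c1) < sq_dist(x, c0) else 0

# Fraudsters now mimic normal transaction amounts, so "amount" carries no
# signal. Adding a hypothetical "attempts per hour" feature (a signal the
# fraudsters have not adapted to yet) restores separation.
X = [[50, 1], [52, 1], [49, 20], [51, 20]]  # [amount, attempts/hour]
y = [0, 0, 1, 1]                            # 0 = legit, 1 = fraud

predict = centroid_classifier(X, y)
print(predict([50, 18]))  # high-velocity transaction -> 1 (fraud)
print(predict([51, 2]))   # normal velocity -> 0 (legit)
```

In a real system the new feature would come from domain knowledge about what the fraudsters have and have not adapted to; the point is only that the model design itself changes, not just its training data.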
2. Data Drift
This happens when the distribution of the input features changes, even though the underlying relationship between features and target stays the same. It can usually be fixed by retraining the model on new data.
Anything that degrades data quality can cause data drift — badly calibrated sensors, for example. Or an NLP model trained on millennial language that starts failing because, with time, Gen Z users have grown up and entered the platform, generating a large share of the text in their own lingo.
Drift is a very real-world problem, and it will happen. Unlike other ML problems such as overfitting or data leakage, there is nothing you can do to prevent it.
Good news — we can win the battle against drift
Just as bleeding is a very good sign that you’ve been physically hurt, your model performing badly and accuracy going down is a very good indicator that drift has happened in some form.
Is it a very good indicator? Yes.
Is it the best timing? No.
You want to identify drift before things blow up and customers are complaining or, worse, churning. An additional drawback of waiting until drift shows up in model accuracy is that, even after identifying it, you don’t know which type of drift it is. So the problem has a higher operational impact, and problem-solving takes longer.
A better approach is to ‘monitor’ the model and identify early warning signs by watching the intermediate stages of the model’s pipeline. This can be done at:
- Feature extraction stage — by comparing the baseline distribution of each feature to its current distribution. If the gap is large, there might be drift. This comparison can be done using statistical tests such as the Kolmogorov-Smirnov (K-S) test.
- Modeling stage — we can compare the activation patterns that appear in various layers of the model. An auxiliary model can be trained to distinguish these patterns under normal inputs vs. outliers from the training data.
- Checking confidence — keeping tabs on the model’s confidence in its predictions is another good indicator. If confidence starts to fall, it could be an early flag for drift.
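The feature-distribution check above can be sketched in a few lines. Here is a pure-Python implementation of the two-sample K-S statistic for illustration (in practice you would use `scipy.stats.ks_2samp`, which also returns a p-value); the sample data and any alerting threshold you pick are assumptions.

```python
import bisect

def ks_statistic(baseline, current):
    """Two-sample Kolmogorov-Smirnov statistic: the largest vertical gap
    between the two empirical CDFs (0 = identical, 1 = fully disjoint)."""
    b, c = sorted(baseline), sorted(current)
    d = 0.0
    for x in b + c:
        f_base = bisect.bisect_right(b, x) / len(b)
        f_curr = bisect.bisect_right(c, x) / len(c)
        d = max(d, abs(f_base - f_curr))
    return d

baseline = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # feature values at training time
shifted  = [x + 8 for x in baseline]        # same feature, now in production

print(ks_statistic(baseline, baseline))  # 0.0 -> no drift
print(ks_statistic(baseline, shifted))   # large gap -> flag potential drift
```

Running this per feature against a stored training-time baseline, on a schedule, gives exactly the kind of early signal the monitoring approach calls for.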
Drift is inevitable. Even though it is something we cannot prevent during our modeling, there are still ways to detect it and fix it. The key lies in early detection so that it can be solved before it affects business operations, customer experience, reviews, and revenue.
Published via Towards AI