Biggest Problem With ML Systems Today
Last Updated on August 11, 2022 by Editorial Team
Author(s): Astha Puri
Originally published on Towards AI, the world’s leading AI and technology news and media company.
So you built a machine learning model and deployed it to production. Most models built today never make it that far, so in this scenario we are among the rare few whose ML system is actually out there in the world. Hurray!
Only to watch its performance worsen over time. Why?! Why is its performance degrading? We checked everything. The model performed great on every evaluation metric there is. And still it degrades. The boss is annoyed, customers are complaining, and the business is heading for trouble…
Welcome to the concept of drift.
Drift happens when the operational environment of your ML system changes. The model was trained on data from one environment; as that environment inherently shifts over time, the model no longer predicts correctly. Hence the progression toward poorer and poorer performance.
How many times have you heard this at your company — ‘yes, there was a data scientist before you who built this amazing model that everyone was sold on. Then they left the company for more $$$. Their model no longer works, and now everyone is skeptical about using data science/ML altogether’.
If the model never performed well, a gazillion things could be the cause. But if it performed well at some point and you notice degradation over time, drift is something you should look at seriously.
There are 2 types of drift that commonly occur:
- Concept drift
- Data drift (also called covariate shift or feature drift)
1. Concept Drift
This happens when the relationship between the model’s inputs and the target it predicts changes — the environment the model was built for is no longer the environment it operates in.
Consider digital and online fraud, for example. Fraudsters are constantly adapting to changing security and protection protocols, and the definition of what counts as spam has evolved over time. These are fundamental changes in the environment.
To fix a drift like this, we need to change the model design itself. We can:
- add new features
- change ML model
- or do both of the above
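To make the “add new features” remedy concrete, here is a minimal illustrative sketch (not from the article): the data, the “attempts per hour” feature, and the toy nearest-centroid model are all assumptions chosen for illustration. After concept drift, the old feature alone no longer separates the classes, so we add a new feature and refit.

```python
# Illustrative sketch: after concept drift, "amount" alone no longer
# separates fraud from legitimate transactions, so we add a hypothetical
# second feature and refit a simple model.

def centroid_classifier(X, y):
    """Fit a toy nearest-centroid classifier; returns a predict function."""
    def mean(rows):
        return [sum(col) / len(col) for col in zip(*rows)]

    c0 = mean([x for x, label in zip(X, y) if label == 0])  # legit centroid
    c1 = mean([x for x, label in zip(X, y) if label == 1])  # fraud centroid

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return lambda x: 1 if sq_dist(x, c1) < sq_dist(x, c0) else 0

# Fraudsters now mimic normal transaction amounts, so "amount" carries no
# signal. Adding a hypothetical "attempts per hour" feature (a signal the
# fraudsters have not adapted to yet) restores separation.
X = [[50, 1], [52, 1], [49, 20], [51, 20]]  # [amount, attempts/hour]
y = [0, 0, 1, 1]                            # 0 = legit, 1 = fraud

predict = centroid_classifier(X, y)
print(predict([50, 18]))  # high-velocity transaction -> 1 (fraud)
print(predict([51, 2]))   # normal velocity -> 0 (legit)
```

In a real system the new feature would come from domain knowledge about what the fraudsters have and have not adapted to; the point is only that the model design itself changes, not just its training data.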
2. Data Drift
This happens when the distribution of the input features changes, even though the underlying relationship between features and target stays the same. It can usually be fixed by retraining the model on new data.
Anything that degrades data quality can cause data drift — badly calibrated sensors, for example. Or an NLP model trained on millennial language that starts failing because, with time, Gen Z users have grown up and entered the platform, generating a large share of the text in their own lingo.
Drift is a very real-world problem, and it will happen. Unlike other ML problems such as overfitting or data leakage, there is nothing you can do to prevent it.
Good news — we can win the battle against drift
Just as bleeding is a very good sign that you’ve been physically hurt, your model performing badly and accuracy going down is a very good indicator that drift has happened in some form.
Is it a very good indicator? Yes.
Is it the best timing? No.
You want to identify drift before things blow up and customers are complaining or, worse, churning. An additional drawback of waiting until drift shows up in model accuracy is that, even after identifying it, you don’t know which type of drift it is. So the problem has a higher operational impact, and problem-solving takes longer.
A better approach is to ‘monitor’ the model and identify early warning signs by watching the intermediate stages of the model’s pipeline. This can be done at:
- Feature extraction stage — by comparing the baseline distribution of each feature to its current distribution. If the gap is large, there might be drift. This comparison can be done using statistical tests such as the Kolmogorov-Smirnov (K-S) test.
- Modeling stage — we can compare the activation patterns that appear in various layers of the model. An auxiliary model can be trained to distinguish these patterns under normal inputs vs. outliers from the training data.
- Checking confidence — keeping tabs on the model’s confidence in its predictions is another good indicator. If confidence starts to fall, it could be an early flag for drift.
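The feature-distribution check above can be sketched in a few lines. Here is a pure-Python implementation of the two-sample K-S statistic for illustration (in practice you would use `scipy.stats.ks_2samp`, which also returns a p-value); the sample data and any alerting threshold you pick are assumptions.

```python
import bisect

def ks_statistic(baseline, current):
    """Two-sample Kolmogorov-Smirnov statistic: the largest vertical gap
    between the two empirical CDFs (0 = identical, 1 = fully disjoint)."""
    b, c = sorted(baseline), sorted(current)
    d = 0.0
    for x in b + c:
        f_base = bisect.bisect_right(b, x) / len(b)
        f_curr = bisect.bisect_right(c, x) / len(c)
        d = max(d, abs(f_base - f_curr))
    return d

baseline = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # feature values at training time
shifted  = [x + 8 for x in baseline]        # same feature, now in production

print(ks_statistic(baseline, baseline))  # 0.0 -> no drift
print(ks_statistic(baseline, shifted))   # large gap -> flag potential drift
```

Running this per feature against a stored training-time baseline, on a schedule, gives exactly the kind of early signal the monitoring approach calls for.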
Drift is inevitable. Even though it is something we cannot prevent during our modeling, there are still ways to detect it and fix it. The key lies in early detection so that it can be solved before it affects business operations, customer experience, reviews, and revenue.
Published via Towards AI