


ML-OPS Guide Series- 1

Author(s): Rashmi Margani


What it is, why it matters, and more…

Intro: Let’s get started.

Figure 1. Basic/fundamental elements of ML systems for ML-Ops. Adapted from Hidden Technical Debt in Machine Learning Systems.

What is MLOps?

To put it simply, MLOps (or ML Ops) is a set of practices that aims to deploy and maintain ML models in production reliably and efficiently. It spans the whole project, from defining the scope (the problem statement) to monitoring after deployment to make sure everything keeps working as expected.

The next question is how MLOps compares with DevOps.

Figure 2 — Google’s 5 dimensions (DevOps VS MLOps)

Why is it important, and how do we put ML-Ops into practice?

Importance of ML-Ops:

Over the last decade, we have witnessed the adoption of ML in everyday applications. ML is no longer limited to esoteric applications such as Dota or AlphaGo; it has also made its way into fairly standard applications such as machine translation, image processing, and voice recognition.

This adoption is powered by developments in infrastructure, especially in how much computation power can be put to use. This has unlocked the potential of deep learning and ML.

Figure 3: Sourced from OpenAI.

AI is rapidly expanding into new applications and industries, and research is making tremendous strides, yet building successful projects is still difficult. Models often fail to adapt to changes in the dynamics of the environment or to changes in the data that describes it. This increases the need to establish effective practices and processes around designing, building, and deploying models.

Hence MLOps plays a major role in monitoring and periodically checking the model’s dependencies, usage, and performance to ensure that it serves as expected. MLOps encourages recording the model’s desired behaviours in advance and using them as a benchmark, so that when the model underperforms or spikes irregularly, the necessary actions can be taken.
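
As a rough illustration of the benchmark idea, the sketch below records a model's expected behaviour as the mean and standard deviation of one metric, then flags live readings that fall outside a tolerance band. The names `Benchmark`, `record_benchmark`, and `check_reading`, the 3-sigma tolerance, and the accuracy numbers are all assumptions made for this example; they do not come from any particular MLOps tool.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class Benchmark:
    """Pre-recorded expected behaviour for one metric (hypothetical helper)."""
    expected: float
    spread: float


def record_benchmark(healthy_values):
    """Summarise metric values observed while the model was known to be healthy."""
    return Benchmark(expected=mean(healthy_values), spread=stdev(healthy_values))


def check_reading(benchmark, value, tolerance=3.0):
    """Classify a live metric reading as 'ok', 'underperforming', or 'spike'.

    The tolerance (in standard deviations) is an assumed default.
    """
    band = tolerance * benchmark.spread
    if value < benchmark.expected - band:
        return "underperforming"
    if value > benchmark.expected + band:
        return "spike"
    return "ok"


# Accuracy measured on validation batches before deployment (made-up numbers).
baseline = record_benchmark([0.91, 0.92, 0.90, 0.93, 0.91])

for live_accuracy in (0.92, 0.70, 0.999):
    print(live_accuracy, check_reading(baseline, live_accuracy))
```

In practice the "necessary action" attached to each status (an alert, a rollback, a retraining job) would depend on the system; the sketch only shows the comparison against the recorded benchmark.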

Feasible steps to put ML-Ops into practice:

> Keep monitoring the quality of your model in production so that you can detect performance degradation, which is in turn a cue to retrain the model on new data.

> Use the most recent data to capture evolving and emerging patterns.

> Try new implementations, such as feature engineering, model architecture changes, and hyperparameter tuning, to improve performance.
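
The first two steps above can be sketched as a small loop: track accuracy over a rolling window of labelled production predictions, and when it drops below a threshold, hand the most recent examples to a retraining routine. Everything here (`RollingMonitor`, `window`, `min_accuracy`, the stand-in `threshold_model`) is a hypothetical name chosen for illustration, and the drift scenario is invented.

```python
from collections import deque


class RollingMonitor:
    """Tracks accuracy over the most recent labelled predictions (illustrative)."""

    def __init__(self, window=100, min_accuracy=0.8):
        self.window = window
        self.min_accuracy = min_accuracy
        # Each entry is (features, true_label, was_correct); old entries fall off.
        self.recent = deque(maxlen=window)

    def observe(self, features, predicted, actual):
        self.recent.append((features, actual, predicted == actual))

    def accuracy(self):
        if not self.recent:
            return None
        return sum(correct for _, _, correct in self.recent) / len(self.recent)

    def needs_retraining(self):
        # Only judge once the window is full, to avoid reacting to a few samples.
        acc = self.accuracy()
        window_full = len(self.recent) == self.window
        return acc is not None and window_full and acc < self.min_accuracy

    def recent_training_data(self):
        # The freshest labelled examples, which reflect the current patterns.
        return [(x, y) for x, y, _ in self.recent]


def threshold_model(x):
    """Stand-in model: predicts 1 when the feature exceeds 0.5."""
    return int(x > 0.5)


monitor = RollingMonitor(window=4, min_accuracy=0.75)
# Simulated drift: the true label is now 1 only when x > 0.8.
for x in (0.9, 0.6, 0.7, 0.2):
    monitor.observe(x, threshold_model(x), int(x > 0.8))

if monitor.needs_retraining():
    data = monitor.recent_training_data()
    print(f"accuracy {monitor.accuracy():.2f}: retrain on {len(data)} recent rows")
```

Waiting for a full window before triggering is a design choice to avoid retraining on noise; a production setup would likely also account for delayed labels, which this sketch ignores.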

These three points involve a lot of manual work. To address this, ML-Ops helps automate them using CI/CD and CT.


CI refers to continuous integration: It is no longer only about testing and validating code and components, as in DevOps, but also about testing and validating data, data schemas, and models for ML systems.

CD refers to continuous delivery: It is no longer about a single software package or service, but about an ML training pipeline that should automatically deploy another service, such as a model prediction service, within the existing ML system when there is a new business need.

CT refers to continuous training: It is a new property, unique to ML systems, that is concerned with automatically retraining and serving models.
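
To make the CI and CT points a little more concrete, here is a plain-Python sketch of two checks a pipeline might run: a data schema validation that could sit alongside ordinary unit tests, and a simple gate that decides whether a retrained model should replace the one being served. The schema, the function names `validate_rows` and `should_promote`, and the scores are assumptions for illustration; a real pipeline would more likely rely on a dedicated data validation library.

```python
# Assumed example schema: column name -> expected Python type.
EXPECTED_SCHEMA = {"age": int, "income": float, "country": str}


def validate_rows(rows, schema=EXPECTED_SCHEMA):
    """Return a list of readable schema violations (empty list means valid)."""
    errors = []
    for i, row in enumerate(rows):
        missing = schema.keys() - row.keys()
        extra = row.keys() - schema.keys()
        if missing:
            errors.append(f"row {i}: missing columns {sorted(missing)}")
        if extra:
            errors.append(f"row {i}: unexpected columns {sorted(extra)}")
        for column, expected_type in schema.items():
            if column in row and not isinstance(row[column], expected_type):
                errors.append(
                    f"row {i}: {column!r} should be {expected_type.__name__}, "
                    f"got {type(row[column]).__name__}"
                )
    return errors


def should_promote(candidate_score, production_score, min_gain=0.0):
    """CT gate: serve the retrained model only if it scores at least as well."""
    return candidate_score >= production_score + min_gain


batch = [
    {"age": 34, "income": 52000.0, "country": "IN"},
    {"age": "unknown", "income": 41000.0},  # wrong type and a missing column
]
problems = validate_rows(batch)
for problem in problems:
    print(problem)

print("promote retrained model:", should_promote(0.91, 0.90))
```

A CI job could fail the build whenever `validate_rows` returns a non-empty list, in the same way it fails on a broken unit test.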


Figure 4:

Going forward, the ML-Ops series will discuss in detail the practicality of automating each data science step for ML systems, such as data extraction, data analysis, data preparation, model training, model evaluation, model validation, model serving, and model monitoring, using ML pipeline triggers. It will also cover techniques and use cases for the different phases of the ML lifecycle, and the applicability of ML-Ops, including a range of robust automation (CI/CD) techniques for maintaining ML systems.

Stay tuned…

ML-OPS Guide Series- 1 was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.

Published via Towards AI
