Are You Ready for MLOps?
Author(s): Yoshiyuki Igarashi
Originally published on Towards AI.
🌾 Moving Towards Scalable and Labor-Efficient ML Projects
The introduction of irrigation systems revolutionized the way farmers develop and maintain croplands. Before this innovation, farmers had to carry water by hand from nearby rivers or rely on flood irrigation in certain areas. As croplands expanded, landlords had to hire more laborers to haul water to the fields, making the traditional approach both time-consuming and labor-intensive. Irrigation automated the flow of water, freeing farmers from tedious chores and allowing them to focus on improving the quality of their crops. Landlords could watch their crops stretch toward the horizon and enjoy a more profitable harvest. With those benefits in mind, are you ready to adopt MLOps and bring a similar level of efficiency and success to your projects?
❓ Why Does MLOps Matter to You?
Just as irrigation systems revolutionized agriculture, adopting MLOps can bring significant benefits to your organization's ML projects. You're likely aware of how important streamlining operations is for increasing efficiency and productivity. Having worked as a data scientist, ML engineer, and consultant for various companies, I have noticed that many of them face similar challenges when deploying ML models to production. In my experience at DataRobot, an end-to-end AI platform, we were able to help customers reduce project duration to 3–5 days, covering everything from data preparation and exploratory data analysis to model development and deployment with monitoring and governance. Without proper ML infrastructure, many companies spend 1–2 months or more on these tasks. That's why I wrote this blog post specifically for data science managers struggling with overloaded backlogs and for data scientists frustrated with isolated environments and tedious tasks.
💡 What Are the Common Challenges?
The graph illustrates the popularity of the search term "MLOps" on the internet from January 2019 to February 1, 2022. As you can see, interest among your peers in adopting MLOps has risen noticeably since late 2020. Even if your organization has not yet adopted MLOps, please don't be discouraged: we can learn from the takeaways of those early adopters.
According to the Practitioners Guide to Machine Learning Operations (MLOps) by Google Cloud, here are insights into these challenges reported by several organizations:
- Google notes that many organizations lack reusable or reproducible components and that handoffs between data scientists and IT are difficult.
- Deloitte identified lack of talent and integration issues as factors that can stall or derail AI initiatives.
- Algorithmia's survey highlighted that challenges in deployment, scaling, and versioning still prevent teams from getting value from their investments in ML.
- Capgemini Research noted that the top three challenges faced by organizations in achieving deployments at scale are lack of mid- to senior-level talent, lack of change-management processes, and lack of strong governance models for achieving scale.
These findings give us valuable insight into the common challenges organizations face when starting with MLOps. In the next section, we will explore the key building blocks needed to actually implement MLOps, along with a list of platforms that may help you understand and implement each one.
🔎 Building Blocks of MLOps
While a comprehensive list of MLOps features may include many items, the following foundational building blocks of MLOps are essential to successfully operationalize machine learning within an organization.
- Model Deployment
Deploying a machine learning model involves integrating it into your companyβs operations to facilitate data-driven decision making. This can be a challenging process that requires collaboration between data scientists, IT teams, software developers, and business professionals to ensure the model is reliable and valuable. While ML models can provide useful insights in the research phase, their full potential is realized through deployment in production, which can generate revenue and decrease costs. Therefore, mature ML teams focus on deploying their models in the production environment to reap the benefits of their investment.
Platforms: AWS SageMaker, BentoML, MLflow
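Since MLflow appears in the platform list above, here is a minimal sketch of what deployment can look like with it. This is an illustration under assumptions, not a prescribed setup: the dataset, model, serving port, and request payload are all placeholders.

```python
# Minimal sketch: log a trained scikit-learn model with MLflow so it can be
# served as a REST endpoint. Data, names, and the port are illustrative.
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

with mlflow.start_run() as run:
    mlflow.sklearn.log_model(model, artifact_path="model")
    print("run id:", run.info.run_id)

# The logged model can then be served locally, for example:
#   mlflow models serve -m "runs:/<run_id>/model" --port 5001
# and queried over HTTP:
#   curl -X POST http://localhost:5001/invocations \
#        -H "Content-Type: application/json" \
#        -d '{"inputs": [[5.1, 3.5, 1.4, 0.2]]}'
```

Serving the logged model as an HTTP endpoint is what turns a research artifact into something other systems, and ultimately the business, can call.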
- Model Monitoring
It is important to monitor the performance of machine learning models in production to identify potential issues before they impact the business. Even after a model is deployed, its predictive performance can deteriorate, leading to inaccurate or harmful results. For example, real-time predictions on customer data can be influenced by changing customer behavior driven by natural disasters, corporate collapses, or market volatility. Models trained on outdated data may no longer be effective or relevant, highlighting the need for ongoing evaluation and updates. By closely monitoring model performance, production and AI teams can quickly identify potential issues and take action to maintain model accuracy and relevance. For key metrics and common types of drift, Made With ML is a great resource to read.
Platforms: AWS SageMaker, Azure ML, DataRobot MLOps
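To make the idea of monitoring concrete, below is a minimal sketch of one common drift check, the Population Stability Index (PSI), comparing a training-time feature distribution against a production sample. The synthetic data, bin count, and the 0.2 rule of thumb are illustrative assumptions; dedicated monitoring platforms compute similar statistics per feature and alert on them automatically.

```python
# Minimal sketch: detect feature drift with the Population Stability Index (PSI).
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """Compare two 1-D samples by binning on the expected (training) data."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    expected_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    actual_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Avoid division by zero / log(0) for empty bins.
    expected_pct = np.clip(expected_pct, 1e-6, None)
    actual_pct = np.clip(actual_pct, 1e-6, None)
    return float(np.sum((actual_pct - expected_pct) * np.log(actual_pct / expected_pct)))

# Illustrative synthetic data: what the model saw at training time vs. what
# arrives in production after customer behavior shifts.
training_income = np.random.normal(50_000, 10_000, size=5_000)
production_income = np.random.normal(55_000, 12_000, size=5_000)

psi = population_stability_index(training_income, production_income)
print(f"PSI = {psi:.3f}")  # rule of thumb: > 0.2 often signals significant drift
```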
- Model Lifecycle Management
To ensure the trustworthiness and scalability of machine learning models in production, it's crucial to establish a robust, repeatable process for model lifecycle management. Without such a process, inaccurate data, poor performance, or unexpected results can harm your business's reputation. Effective model lifecycle management enables an MLOps system that can support hundreds or thousands of models in production. It also allows IT operations to take charge of testing and rolling out model updates, freeing up data science resources to work on new projects.
Platforms: AWS SageMaker, Azure ML, MLflow, DVC, Seldon Core
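As one concrete way to make promotion and rollback repeatable, here is a hedged sketch using the MLflow Model Registry. It assumes a registry-backed tracking server is configured and that a registered model with the given name and version already exists; both are illustrative, and newer MLflow releases favor model version aliases over stages.

```python
# Minimal sketch: promote a registered model version through lifecycle stages
# with the MLflow Model Registry. Assumes a registry-backed tracking server
# and an existing registered model; the name and version are illustrative.
from mlflow.tracking import MlflowClient

client = MlflowClient()

# Promote version 2 to Production after it passes validation, and archive
# whatever version was serving before it.
client.transition_model_version_stage(
    name="iris-classifier",   # hypothetical registered model name
    version=2,
    stage="Production",
    archive_existing_versions=True,
)

# Downstream services can then load "the current Production model" by stage
# instead of hard-coding a version number:
#   mlflow.pyfunc.load_model("models:/iris-classifier/Production")
```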
- Model Experiment Tracking
Experiment tracking is a process that helps you keep a record of all the essential components of an experiment, such as the variables you change, the metrics you measure, and the models you create. By doing this, you can easily organize everything in one place and access it whenever you need it. You can also replicate experiments and verify if you get the same outcomes, which is crucial for maintaining governance standards. Additionally, tracking your progress over time enables you to identify what works and what doesnβt, making it easier to make informed decisions about your product or marketing strategies. It also ensures that valuable knowledge assets are retained within your team, even if data scientists move to other companies.
Platforms: Weights & Biases, MLflow, Comet ML
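Here is a minimal sketch of what experiment tracking looks like in practice, using MLflow's tracking API since it is listed above. The experiment name, parameters, and metric values are illustrative placeholders.

```python
# Minimal sketch: record the variables you change and the metrics you measure
# for one training run with MLflow tracking. All names and values are illustrative.
import mlflow

mlflow.set_experiment("churn-model")  # hypothetical experiment name

with mlflow.start_run(run_name="baseline-logreg"):
    # Variables you change (hyperparameters, data version, etc.)
    mlflow.log_params({"model": "logistic_regression", "C": 1.0, "data_version": "2023-07"})
    # Metrics you measure
    mlflow.log_metrics({"auc": 0.87, "accuracy": 0.81})
    # Artifacts such as plots or reports can be attached as well, e.g.:
    # mlflow.log_artifact("confusion_matrix.png")
```

Runs recorded this way can be compared side by side in the MLflow UI (`mlflow ui`), which is what keeps experiments reproducible and auditable even as team members change.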
💰 Buy an MLOps Platform vs. DIY?
There is an ongoing debate about whether to buy an MLOps platform or build one yourself. The advantage of buying a pre-built platform is that it does the job from day one. The downside is that pre-built platforms are not customized for your operation, and the product roadmap may not align with your business and technical requirements in the long term. To make the best decision, it is highly recommended to build a foundational platform that can accommodate a variety of MLOps products. If your team is new to MLOps projects, it is still wise to begin with an end-to-end platform to gain a comprehensive understanding of the requirements. However, relying solely on one MLOps product can limit your ability to explore better features offered by other products, especially in a rapidly evolving and still immature MLOps market. By remaining open to new options and building a foundation that can accommodate different products, you can ensure your business and technical requirements are met while staying up to date with the latest industry trends and innovations.
Flexibility is a key factor for MLOps to be widely accepted. MLOps should not be tied to a specific programming language or framework, and it should be easy to combine multiple frameworks when deploying ML models. It should also be adaptable to different platforms and infrastructure so that it works within an organization's existing environment without requiring significant changes. This "agnostic" approach to languages, frameworks, platforms, and infrastructure makes MLOps more accessible and more widely adopted, fitting smoothly into a company's current setup. Overall, the ability to remain flexible and adaptable is essential to integrating MLOps into an organization's workflows in a way that maximizes the value it provides.
Next Steps: Good MLOps Resources
This GitHub repo has one of the most comprehensive collections of MLOps resources.
You can find lectures on MLOps, with examples and code organized by topic.
If you are looking for an overview of MLOps from a reliable source, this whitepaper is a good next step.
Thanks for reading my post! I will publish more articles on Data Science and MLOps. Before you go:
- 👏 Clap for the story and follow the author 👉
- 🔔 Follow me: LinkedIn
Published via Towards AI