A Brief Implementation of MLflow!
Last Updated on April 4, 2024 by Editorial Team
Author(s): Harish Siva Subramanian
Originally published on Towards AI.
Have you ever run into a scenario where you experiment with multiple models and lose track of how each of them performed?
Are you someone who crams the accuracy and every method you tried into the model's file name just to keep track of everything?
Well, we have all been there and done that! That's how I used to track experiments in the early stages of my Data Science career.
To avoid all of this, we use MLflow. First, there is a concept called MLOps that most of us should be familiar with if we are building models.
MLOps, short for Machine Learning Operations, refers to the practices, processes, and technologies used to streamline and automate the deployment, monitoring, and management of machine learning models in production environments.
MLflow is a platform that facilitates managing the end-to-end machine learning lifecycle. It helps data scientists and engineers streamline the development, deployment, and monitoring of machine learning models. While MLflow itself is not a complete MLOps platform, it plays a significant role within the MLOps ecosystem.
Here's how MLflow contributes to MLOps:
- Experiment Tracking: MLflow allows users to track experiments, including parameters, metrics, and artifacts. This enables teams to compare different model iterations, reproduce results, and collaborate effectively. In an MLOps context, experiment tracking helps maintain a record of model development and performance across the entire lifecycle.
- Model Packaging and Deployment: MLflow provides tools for packaging and deploying models to various environments, including batch inference, real-time serving, and edge devices. By packaging models in a standardized format, teams can easily deploy and manage models in production, a crucial aspect of MLOps (a small serving sketch follows this list).
- Model Registry: MLflow's model registry allows teams to organize and manage models throughout their lifecycle. It provides versioning, permissions, and audit capabilities, ensuring that models are tracked, validated, and promoted in a controlled manner. This aligns with MLOps principles of governance and lifecycle management (a registration sketch follows this list).
- Model Monitoring: While MLflow itself does not offer extensive model monitoring capabilities, it integrates with external tools and platforms for monitoring model performance and drift. By logging metrics during training and inference, MLflow can feed data into monitoring systems for ongoing model evaluation and management.
- Collaboration and Reproducibility: MLflow promotes collaboration and reproducibility by capturing the code, data, and environment settings associated with each experiment. This allows teams to share, reproduce, and build upon each other's work, essential aspects of MLOps culture and practices.
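Before diving into tracking, here are quick sketches of the deployment and registry pieces mentioned above. Both assume a model was already logged under a run's "models" artifact path; the "<run_id>" and "MNIST-CNN" names are illustrative placeholders, and the registry additionally requires a database-backed tracking store (for example, SQLite) rather than the plain local file store.

# Serve a logged model as a local REST endpoint (shell command):
# mlflow models serve -m runs:/<run_id>/models -p 5001

import mlflow

# Register the logged model; every registration creates a new version
result = mlflow.register_model("runs:/<run_id>/models", "MNIST-CNN")
print(result.name, result.version)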
In this article, I will cover how to log metrics, parameters, models, and artifacts. Now let's write some code to log a few metrics and a model to MLflow.
First things first, we need to pip install mlflow.
pip install mlflow
Now let's build a sample image classification model using PyTorch and log the necessary things to MLflow.
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import mlflow
import mlflow.pytorch
import matplotlib.pyplot as plt

# Set random seed for reproducibility
torch.manual_seed(42)

# Check if CUDA is available and select the device accordingly.
# (The GPU name is logged inside the MLflow run below; calling
# mlflow.log_param() before start_run() would create a stray run.)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Define transforms
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

# Load MNIST dataset
train_dataset = torchvision.datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = torchvision.datasets.MNIST(root='./data', train=False, download=True, transform=transform)

# Define dataloaders
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=1000, shuffle=False)

# Define model architecture
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout2d(0.25)
        self.dropout2 = nn.Dropout(0.5)  # plain Dropout: the fc1 output is 2D, not a feature map
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = nn.ReLU()(x)
        x = self.conv2(x)
        x = nn.ReLU()(x)
        x = nn.MaxPool2d(2)(x)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = nn.ReLU()(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        output = nn.LogSoftmax(dim=1)(x)
        return output

# Initialize model, loss, and optimizer.
# NLLLoss pairs with the LogSoftmax output above; CrossEntropyLoss would
# apply log-softmax a second time, since it expects raw logits.
model = Net().to(device)
criterion = nn.NLLLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
This will download the MNIST data and define the model. Now let's start with the MLflow code,
# End any run that may still be active from a previous execution
mlflow.end_run()

# Create the experiment; create_experiment returns its experiment_id
experiment_id = mlflow.create_experiment("MNIST_Classification1")

with mlflow.start_run(run_name="3Epochs", experiment_id=experiment_id):
    # Log parameters
    mlflow.log_param("optimizer", "Adam")
    mlflow.log_param("learning_rate", 0.001)
    mlflow.log_param("batch_size", 64)

    epochs = 3
    mlflow.log_param("epochs", epochs)  # log the value actually used

    # Log the GPU name now that a run is active
    if torch.cuda.is_available():
        mlflow.log_param("cuda_device", torch.cuda.get_device_name())

    # Lists to store loss and accuracy for plotting
    train_losses = []
    train_accuracies = []
    test_losses = []
    test_accuracies = []

    # Training loop
    for epoch in range(epochs):
        model.train()
        running_loss = 0.0
        correct = 0
        total = 0
        for i, data in enumerate(train_loader, 0):
            inputs, labels = data[0].to(device), data[1].to(device)
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
        train_loss = running_loss / len(train_loader)
        train_accuracy = correct / total
        print(f'Epoch {epoch + 1}, Training Loss: {train_loss}, Training Accuracy: {train_accuracy}')
        train_losses.append(train_loss)
        train_accuracies.append(train_accuracy)

        # Log metrics for training
        mlflow.log_metric("training_loss", train_loss, step=epoch+1)
        mlflow.log_metric("training_accuracy", train_accuracy, step=epoch+1)

        # Evaluate the model on the test set
        model.eval()
        correct = 0
        total = 0
        with torch.no_grad():
            running_loss = 0.0
            for data in test_loader:
                images, labels = data[0].to(device), data[1].to(device)
                outputs = model(images)
                loss = criterion(outputs, labels)
                running_loss += loss.item()
                _, predicted = torch.max(outputs.data, 1)
                total += labels.size(0)
                correct += (predicted == labels).sum().item()
        test_loss = running_loss / len(test_loader)
        test_accuracy = correct / total
        print(f'Testing Loss: {test_loss}, Testing Accuracy: {test_accuracy}')
        test_losses.append(test_loss)
        test_accuracies.append(test_accuracy)

        # Log metrics for validation
        mlflow.log_metric("validation_loss", test_loss, step=epoch+1)
        mlflow.log_metric("validation_accuracy", test_accuracy, step=epoch+1)

    # Save the model as a run artifact under "models"
    mlflow.pytorch.log_model(model, "models")

    # Plot loss curves
    plt.figure(figsize=(10, 5))
    plt.plot(range(1, epochs+1), train_losses, label='Training Loss', marker='o')
    plt.plot(range(1, epochs+1), test_losses, label='Validation Loss', marker='o')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.title('Loss Curves')
    plt.legend()
    plt.grid(True)
    plt.savefig('loss_curve.png')
    mlflow.log_artifact('loss_curve.png')

    # Plot accuracy curves
    plt.figure(figsize=(10, 5))
    plt.plot(range(1, epochs+1), train_accuracies, label='Training Accuracy', marker='o')
    plt.plot(range(1, epochs+1), test_accuracies, label='Validation Accuracy', marker='o')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.title('Accuracy Curves')
    plt.legend()
    plt.grid(True)
    plt.savefig('accuracy_curve.png')
    mlflow.log_artifact('accuracy_curve.png')

# The with-block ends the run automatically, so no mlflow.end_run() is needed
print('Finished Training')
In the above case, we first created an experiment named "MNIST_Classification1". Under this experiment we will have multiple runs, modifying something with each run. In this article, for the sake of simplicity, I will only change the number of epochs between the two runs.
In the above code, we define a run named "3Epochs" and specify the experiment_id that we created a few lines earlier.
We log all the parameters using
mlflow.log_param(key,value)
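If you have several parameters, mlflow.log_params() batches them into a single call; a small sketch equivalent to the logging above:

mlflow.log_params({
    "optimizer": "Adam",
    "learning_rate": 0.001,
    "batch_size": 64,
})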
If you want to log a metric, you would use
mlflow.log_metric(key, value, step=None)
To log a PyTorch model,
mlflow.pytorch.log_model(model, "models")
Here, model is the trained model object and "models" is the artifact path it is saved under within the run.
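The logged model can later be loaded back for inference with mlflow.pytorch.load_model(); a brief sketch, where "<run_id>" is a placeholder for the run you logged to:

loaded_model = mlflow.pytorch.load_model("runs:/<run_id>/models")
loaded_model.eval()  # switch to inference mode before predicting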
If you want to log a figure or a plot,
plt.savefig('loss_curve.png')
mlflow.log_artifact('loss_curve.png')
Save the figure locally and then log it as an artifact, as shown above.
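Alternatively, MLflow can log a Matplotlib figure directly and skip the intermediate file; a small sketch using the current figure:

fig = plt.gcf()  # grab the current figure instead of writing it to disk first
mlflow.log_figure(fig, "loss_curve.png")  # stored under the run's artifacts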
You can view the logged plots and all the metrics in the MLflow UI. With the default setup, everything is written to a local mlruns directory, so you just need to launch the MLflow UI from the same folder to browse the logged runs and artifacts.
Now, as soon as you run the code, you will see a new directory named "mlruns" automatically created in the project folder.
Inside it, we see a folder named 0 (the default experiment) and a folder with a long number. That number is the experiment_id generated when we created the experiment; every new experiment gets its own folder.
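The local layout looks roughly like this (run id shortened to a placeholder; your experiment number will differ):

mlruns/
├── 0/                        # the default experiment
└── 240219543460999387/       # our MNIST_Classification1 experiment
    └── <run_id>/
        ├── artifacts/        # loss_curve.png, accuracy_curve.png, models/
        ├── metrics/          # one file per logged metric
        ├── params/           # one file per logged parameter
        └── meta.yaml         # run metadata (name, status, timestamps)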
Now go to the project folder, open Anaconda Prompt or a terminal, and type the following,
mlflow ui
As soon as you run it, a local server starts and its link (by default http://127.0.0.1:5000) is printed in the terminal. Open that link in your browser to see the MLflow UI.
You can see the run named "3Epochs". When you click it, you can see all the logged metrics, parameters, and artifacts.
Run 2
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import mlflow
import mlflow.pytorch
import matplotlib.pyplot as plt

# Set random seed for reproducibility
torch.manual_seed(42)

# Check if CUDA is available and select the device accordingly.
# (The GPU name is logged inside the MLflow run below; calling
# mlflow.log_param() before start_run() would create a stray run.)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Define transforms
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

# Load MNIST dataset
train_dataset = torchvision.datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = torchvision.datasets.MNIST(root='./data', train=False, download=True, transform=transform)

# Define dataloaders
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=1000, shuffle=False)

# Define model architecture
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout2d(0.25)
        self.dropout2 = nn.Dropout(0.5)  # plain Dropout: the fc1 output is 2D, not a feature map
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = nn.ReLU()(x)
        x = self.conv2(x)
        x = nn.ReLU()(x)
        x = nn.MaxPool2d(2)(x)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = nn.ReLU()(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        output = nn.LogSoftmax(dim=1)(x)
        return output

# Initialize model, loss, and optimizer.
# NLLLoss pairs with the LogSoftmax output above; CrossEntropyLoss would
# apply log-softmax a second time, since it expects raw logits.
model = Net().to(device)
criterion = nn.NLLLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
# End any run that may still be active from a previous execution
mlflow.end_run()

# The experiment already exists, so we do not create it again:
# experiment_id = mlflow.create_experiment("MNIST_Classification1")

# Start a second run under the same experiment, reusing the experiment_id
# generated on the first run (this value is from my machine; yours will differ)
with mlflow.start_run(run_name="5Epochs", experiment_id="240219543460999387"):
    # Log parameters
    mlflow.log_param("optimizer", "Adam")
    mlflow.log_param("learning_rate", 0.001)
    mlflow.log_param("batch_size", 64)

    epochs = 5
    mlflow.log_param("epochs", epochs)  # log the value actually used

    # Log the GPU name now that a run is active
    if torch.cuda.is_available():
        mlflow.log_param("cuda_device", torch.cuda.get_device_name())

    # Lists to store loss and accuracy for plotting
    train_losses = []
    train_accuracies = []
    test_losses = []
    test_accuracies = []

    # Training loop
    for epoch in range(epochs):
        model.train()
        running_loss = 0.0
        correct = 0
        total = 0
        for i, data in enumerate(train_loader, 0):
            inputs, labels = data[0].to(device), data[1].to(device)
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
            _, predicted = torch.max(outputs.data, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
        train_loss = running_loss / len(train_loader)
        train_accuracy = correct / total
        print(f'Epoch {epoch + 1}, Training Loss: {train_loss}, Training Accuracy: {train_accuracy}')
        train_losses.append(train_loss)
        train_accuracies.append(train_accuracy)

        # Log metrics for training
        mlflow.log_metric("training_loss", train_loss, step=epoch+1)
        mlflow.log_metric("training_accuracy", train_accuracy, step=epoch+1)

        # Evaluate the model on the test set
        model.eval()
        correct = 0
        total = 0
        with torch.no_grad():
            running_loss = 0.0
            for data in test_loader:
                images, labels = data[0].to(device), data[1].to(device)
                outputs = model(images)
                loss = criterion(outputs, labels)
                running_loss += loss.item()
                _, predicted = torch.max(outputs.data, 1)
                total += labels.size(0)
                correct += (predicted == labels).sum().item()
        test_loss = running_loss / len(test_loader)
        test_accuracy = correct / total
        print(f'Testing Loss: {test_loss}, Testing Accuracy: {test_accuracy}')
        test_losses.append(test_loss)
        test_accuracies.append(test_accuracy)

        # Log metrics for validation
        mlflow.log_metric("validation_loss", test_loss, step=epoch+1)
        mlflow.log_metric("validation_accuracy", test_accuracy, step=epoch+1)

    # Save the model as a run artifact under "models"
    mlflow.pytorch.log_model(model, "models")

    # Plot loss curves
    plt.figure(figsize=(10, 5))
    plt.plot(range(1, epochs+1), train_losses, label='Training Loss', marker='o')
    plt.plot(range(1, epochs+1), test_losses, label='Validation Loss', marker='o')
    plt.xlabel('Epoch')
    plt.ylabel('Loss')
    plt.title('Loss Curves')
    plt.legend()
    plt.grid(True)
    plt.savefig('loss_curve.png')
    mlflow.log_artifact('loss_curve.png')

    # Plot accuracy curves
    plt.figure(figsize=(10, 5))
    plt.plot(range(1, epochs+1), train_accuracies, label='Training Accuracy', marker='o')
    plt.plot(range(1, epochs+1), test_accuracies, label='Validation Accuracy', marker='o')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.title('Accuracy Curves')
    plt.legend()
    plt.grid(True)
    plt.savefig('accuracy_curve.png')
    mlflow.log_artifact('accuracy_curve.png')

# The with-block ends the run automatically, so no mlflow.end_run() is needed
print('Finished Training')
Now, in this case, I wanted to make this a second run under the same experiment. So I fed in the experiment_id from the folder name that was created for the first run, when we initialized the experiment. Also, this time we don't need to create the experiment, so that line is commented out!
We use that experiment_id to create a new run. This time I named it "5Epochs" and changed the number of epochs to 5 in the code.
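Hard-coding the id from the folder name works, but it is machine-specific. A more portable sketch looks the experiment up by name (assuming the experiment from Run 1 already exists):

experiment = mlflow.get_experiment_by_name("MNIST_Classification1")
with mlflow.start_run(run_name="5Epochs", experiment_id=experiment.experiment_id):
    ...  # training and logging code as before

mlflow.set_experiment("MNIST_Classification1") achieves the same thing by making it the active experiment, and it even creates the experiment if it does not exist yet.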
In the MLflow UI, you will now see a second run, "5Epochs", listed under the same experiment, with its own metrics, parameters, and artifacts.
That is it!! Thank you for reading this article!
If you like the article and would like to support me, make sure to:
- 👏 Clap for the story (50 claps) to help this article be featured
- Follow me on Medium
- 📰 View more content on my Medium profile
- 🔔 Follow Me: LinkedIn | GitHub
Published via Towards AI