
Deploying Custom Detectron2 Models with a REST API: A Step-by-Step Guide.

Author(s): Gennaro Daniele Acciaro

Originally published on Towards AI.

An image generated using Midjourney

In the life of a Machine Learning Engineer, training a model is only half the battle.

Indeed, even a neural network that accurately predicts all the test data remains useless unless it is made accessible to the world.

Model deployment is the process of making a model accessible and usable in production environments, where it can generate predictions and provide real-time insights to end users. It is an essential skill for every ML or AI engineer.

In this guide, we’ll walk through the process of deploying a custom model trained using the Detectron2 framework.

🤖 What is Detectron2?

Image taken from the official Colab for Detectron2 training.

Detectron2 is a powerful library for object detection and segmentation, built on PyTorch and developed by Meta. It provides an excellent framework for training and deploying your custom models. With Detectron2, you can easily build and fine-tune neural networks to accurately detect and segment objects in images and videos.

The library offers many pre-trained models and state-of-the-art algorithms, making it a popular choice among machine learning engineers and researchers. Whether you’re working on computer vision tasks or building applications that require object detection capabilities, Detectron2 provides the tools and flexibility you need to achieve accurate and efficient results.
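
As a quick illustration of how little code inference takes, here is a minimal sketch that runs a pre-trained Mask R-CNN from the Detectron2 model zoo (the model choice, score threshold, and file name are illustrative):

import cv2
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

# Load a pre-trained instance segmentation model from the model zoo
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # keep detections above 50% confidence

predictor = DefaultPredictor(cfg)
outputs = predictor(cv2.imread("input.jpg"))  # Detectron2 expects a BGR image
print(outputs["instances"].pred_classes)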

For more information, refer to the official GitHub repository: Detectron2 GitHub

Fine-tune a Detectron2 model

We’re going to assume you already have a fine-tuned model ready to be deployed. If you don’t have a model yet, don’t worry! The amazing Detectron2 team has provided an official Colab tutorial to help you out: Detectron2 Colab Tutorial

In this article, we will focus specifically on deploying the model trained in that Colab, which is fine-tuned to segment balloons 🎈.

⚠️ Important: we need to extract two files (class_names.txt and config.yaml) from the trained model. We will need these files shortly.

To do so, run this snippet after the training has finished:

import os
from detectron2.data import MetadataCatalog

# `cfg` is the config object used for training in the Colab
# Save the class names of the training dataset
class_names = MetadataCatalog.get(cfg.DATASETS.TRAIN[0]).thing_classes
class_names_path: str = os.path.join(cfg.OUTPUT_DIR, "class_names.txt")

with open(class_names_path, "w") as f:
    for item in class_names:
        f.write("%s\n" % item)

print(f"\033[92m 🎉 Class names file saved in: {class_names_path} 🎉 \033[00m")

# Save the config yaml
config_yaml_path: str = os.path.join(cfg.OUTPUT_DIR, "config.yaml")
with open(config_yaml_path, "w") as file:
    file.write(cfg.dump())

print(f"\033[92m 🎉 Config file saved in: {config_yaml_path} 🎉 \033[00m")

Now you can download the following files (located in the /output folder) from Colab:

  • class_names.txt
  • config.yaml
  • model_final.pth
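
If you are working in Colab, one quick way to grab them is the google.colab helper (a sketch, assuming the default ./output directory used by the tutorial):

from google.colab import files

# Download the three files from the Colab runtime to your machine
for name in ["class_names.txt", "config.yaml", "model_final.pth"]:
    files.download(f"output/{name}")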

Deploying your model with TorchServe

In this guide, we will use TorchServe to serve our model. By the end, you will have a fully functional REST API that delivers predictions from your custom Detectron2 model, ready to be integrated into your applications.

What is TorchServe?

TorchServe is an open-source tool designed to facilitate the efficient deployment of PyTorch models. It streamlines the process of exposing models via a REST API, manages models, handles inference requests, and collects logs and metrics. TorchServe supports simultaneous serving of multiple models, dynamic batching, and adapts to different deployment environments, making it an optimal choice for serving models at production scale.

More details here: TorchServe GitHub, TorchServe Website.

An overview of the deployment process

The deployment process involves the following steps:

  1. Installation of the required packages for TorchServe.
  2. Review of the handler: this component is responsible for processing inference requests through the API.
  3. Generation of the .mar package for the model.
  4. Deployment of the model: we use the TorchServe CLI to complete the deployment.

Step 0: Installing the requirements

Please note: In this guide we assume that you already have Detectron2 installed. If not, you can follow the official guide: https://detectron2.readthedocs.io/en/latest/tutorials/install.html

After installing Detectron2, we can install everything we need to serve the models with the following commands:

pip install torch-model-archiver 
pip install pyyaml torch torchvision captum nvgpu
pip install torchserve

Step 1: The handler

One of the key components of TorchServe is the handler.

This is a Python file responsible for loading the model into memory and managing the entire inference pipeline, including preprocessing, inference, and postprocessing.

For our tutorial, you can find the handler file at the following link: medium-repo/deploying_custom_detectron2_models_with_torchserve/model_handler.py at main · vargroup-datascience/medium-repo

If you’re deploying your own model, don’t forget to change the parameters 😊.

Copy and paste this handler into a new file named model_handler.py.
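
For reference, here is a minimal sketch of what such a handler can look like. The linked file is the source of truth; the names and response format below are illustrative, not the exact repository code.

import os
import numpy as np
import cv2
from ts.torch_handler.base_handler import BaseHandler
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

class ModelHandler(BaseHandler):
    def initialize(self, context):
        # model_dir is where TorchServe unpacks the contents of the .mar file
        model_dir = context.system_properties.get("model_dir")

        cfg = get_cfg()
        cfg.merge_from_file(os.path.join(model_dir, "config.yaml"))
        cfg.MODEL.WEIGHTS = os.path.join(model_dir, "model_final.pth")
        cfg.MODEL.DEVICE = "cpu"  # or "cuda" if a GPU is available

        with open(os.path.join(model_dir, "class_names.txt")) as f:
            self.class_names = [line.strip() for line in f]

        self.predictor = DefaultPredictor(cfg)
        self.initialized = True

    def handle(self, data, context):
        # The request body contains the raw image bytes sent by the client
        image_bytes = data[0].get("data") or data[0].get("body")
        image = cv2.imdecode(np.frombuffer(image_bytes, np.uint8), cv2.IMREAD_COLOR)

        outputs = self.predictor(image)
        instances = outputs["instances"].to("cpu")

        # Return one JSON-serializable object per request in the batch
        return [{
            "classes": [self.class_names[i] for i in instances.pred_classes.tolist()],
            "scores": instances.scores.tolist(),
            "boxes": instances.pred_boxes.tensor.tolist(),
        }]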

Step 2: The .mar file

TorchServe requires a .mar file to function, which contains the model weights, inference code, and necessary dependencies.

It is interesting to note that a .mar file can be uploaded to a server that is already running, without restarting it. This allows us to update our models in production without interrupting service, although that aspect will not be discussed in detail in this guide.
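
As a quick pointer, a new archive placed in the model store can be registered on a running server through TorchServe's management API, for example:

$ curl -X POST "http://localhost:8081/models?url=my_d2_model.mar"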

torch-model-archiver creates an archive file (.mar) containing a pre-trained PyTorch model, its configuration files and any dependencies.

At this point, we have our four files:

  • class_names.txt
  • config.yaml
  • model_final.pth
  • model_handler.py

Next, let’s create the archive that will contain all the files. This archive, named my_d2_model.mar, can be created using the following command:

torch-model-archiver --model-name my_d2_model --version 1.0 --handler model_handler.py --serialized-file model_final.pth --extra-files config.yaml,class_names.txt -f
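
Here, --model-name sets the archive's name, --version tags it, --serialized-file points to the trained weights, --handler to our custom handler, --extra-files bundles the additional files the handler needs at runtime, and -f overwrites any existing archive with the same name.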

This command creates a my_d2_model.mar file containing everything the model needs; if a model with the same name and version is already registered, you will need to increment the version.

Once executed, you should see a my_d2_model.mar file in the current folder.

Step 3: Deploy the model

To use the server, we need to create a model_store folder where the models will be stored.

mkdir model_store

Then we copy the file my_d2_model.mar into the folder model_store:

Linux/macOS:

cp my_d2_model.mar model_store

Windows:

copy my_d2_model.mar model_store

At this point, we can start the server with the following command:

torchserve --start --model-store model_store --models my_d2_model=my_d2_model.mar --disable-token-auth
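
By default, TorchServe exposes its inference API on port 8080 and its management API on port 8081. The --disable-token-auth flag turns off token authorization, so we can call the APIs without an auth token; this is convenient for local testing but not recommended in production.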

To test whether the server has started correctly, we ping the server with the command:

$ curl http://localhost:8080/ping

If all goes well, we will get this response:

{
  "status": "Healthy"
}

Furthermore, to test whether the deployment was successful, we run the following command, which returns the models currently active on the server:

$ curl http://localhost:8081/models 

The output that we expect is as follows:

{
  "models": [
    {
      "modelName": "my_d2_model",
      "modelUrl": "my_d2_model.mar"
    }
  ]
}

At this point we are ready to use our model.

Step 4: Access the API

Now that the server is running and the model is deployed, you can interact with the model using the provided API endpoint.

To get predictions from the deployed model, you can use the following curl command. This command sends an image file to the model, and the server returns the prediction results.

$ curl http://127.0.0.1:8080/predictions/my_d2_model -T img.jpg

Alternatively, you can use any HTTP client of your choice (such as Postman, Python’s requests library, or even a custom application) to interact with the API and send the image for predictions.
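
For example, here is a minimal sketch using Python's requests library (assuming the server from Step 3 is running locally):

import requests

# Send the raw image bytes to the predictions endpoint
with open("img.jpg", "rb") as f:
    response = requests.post(
        "http://127.0.0.1:8080/predictions/my_d2_model",
        data=f.read(),
    )

response.raise_for_status()
print(response.json())  # predicted classes, scores, boxes, etc.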

If the request is successful, the server processes the image using the deployed model and returns a JSON response with the prediction results.

Conclusions

In conclusion, this guide has illustrated the step-by-step process of deploying a custom Detectron2 model using a REST API via TorchServe.

We started by installing the necessary dependencies, then created a custom handler to manage the inference pipeline, and finally generated a .mar file containing the model and its dependencies.

Using TorchServe, we launched a server that exposes the model through an API, enabling real-time inference. This procedure not only lets you put machine learning models into production quickly, but also lets you update them without interrupting service.

All of this is useful for integrating machine learning models into your own applications, ensuring a robust and scalable deployment.


Published via Towards AI
