Easing up the process of Tensorflow 2.0 Object Detection API and TensorRT
Computer Vision


Last Updated on January 6, 2023 by Editorial Team

Author(s): Abhishek Annamraju


Detailed steps to train your own object detector with Monk's TF-Object-Detection-API, optimize it using TensorRT, and run inference on GPU systems

The entire code is available as a Jupyter notebook in the Monk Object Detection Library.

Every computer vision engineer eventually picks up an open-source library with the goal of using it on a custom dataset, and one of the most-used libraries for object detection is TensorFlow, thanks to its ever-expanding model zoo. TensorFlow recently added TF 2.0 support to the Object Detection API. However, using the TensorFlow Object Detection API for custom object detection, and then optimizing the model with TensorRT, is a lengthy, time-consuming process that is prone to errors.

To overcome the issues that are usually faced and to reduce the workload on the developer when:
* modifying tfrecord examples and arranging data in rigid formats to fit in custom data
* updating config files
* using the right files to train the engine
* converting trained checkpoints to other formats for inferencing
* searching for the right way to run inference
* optimizing the model using TensorRT, etc.,
we integrated the TF Object Detection API with a low-code Monk AI toolkit.

With it, developers can easily:
★ convert custom datasets to tf-records
★ update config files using pythonic syntax
★ train the engine and export to different inference formats
★ infer using either checkpoints or saved-model formats
★ optimize the model with TensorRT engines for faster inference

Issues faced while training a custom dataset using TF 2.0 Object Detection API

Along with the procedural steps, this section covers the issues a developer or researcher usually faces, and why we decided to ease up the process with a low-code open-source library.

Step 1: Installing prerequisites and compiling models.

* Training with the older versions 2.0.0 and 2.1.0 resulted in the error:

AttributeError: module 'tensorflow_core.keras.utils' has no attribute 'register_keras_serializable'

* Training with version 2.2.0 also resulted in the following error:

AttributeError: 'CollectiveAllReduceExtended' object has no attribute '_cfer_fn_cache'

* Version 2.3.0 runs without any errors and is compatible with Colab too.

Hence, v2.3.0 was selected; an upgrade to v2.4.0 will be done as soon as it is released, since TF-Lite conversion is error-prone in v2.3.0.

Step 2: Dataset settings

* The TF models repository's dataset_tools provides examples for public datasets such as COCO, VOC, OID, etc., but not all public datasets are annotated in these formats.

* Once annotated in one of these formats, the data needs to be arranged in a directory structure that can be fed into the scripts in object_detection/dataset_tools/; otherwise, those scripts have to be modified to ingest custom data into the pipeline.

To make this process easier, we added a simple parser that converts annotations to Pascal VOC format and then to tfrecords.

Step 3: Model and config file + train

* Weights and config files have to be downloaded from the model zoo, and the data structure has to be arranged in this format:

Data format. Credits

* After this, the config file elements have to be updated. More than 25 elements have to be changed, including dataset details, base feature-extractor details, checkpoint details, optimizer details, and tf-record details.

* Once the config details are updated, training can be done.

To avoid manual changes to config files and folder-structure formats, a pythonic API wrapper was created.

Step 4: Export model to saved-model format and inference

* Checkpoint files can then be converted to the "Saved Model" format.

* Common issues here include the conversion of SSD-FPN and ResNet formats.

* Running inference from checkpoint files requires the object detection model builders, whereas saved-model (".pb") formats can be loaded using TensorFlow's tf.saved_model.load function.

* Creating a graph function (tf.function) over the saved_model before running inference usually speeds up the process, as sketched below.
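As a rough sketch of that idea (this is not the Monk API itself; the "serving_default" signature name, the input_tensor key, and the paths are assumptions that depend on how the model was exported):

```python
import tensorflow as tf

# Load the exported saved_model directory (path is illustrative).
model = tf.saved_model.load("export_dir/saved_model")
infer = model.signatures["serving_default"]  # assumption: default serving signature

# Wrapping the call in tf.function traces it into a graph once,
# so subsequent calls avoid Python overhead.
@tf.function
def detect(image_batch):
    return infer(input_tensor=image_batch)

# TF2 OD API exports typically expect a uint8 batch of shape [1, H, W, 3].
dummy = tf.zeros([1, 640, 640, 3], dtype=tf.uint8)
detections = detect(dummy)  # first call traces; later calls run the cached graph
```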

Step 5: TensorRT model conversion and inference

* The saved model can then be optimized to run on different NVIDIA GPU machines.

* One issue with TensorRT is that the library version on the development system (the one where you train and convert the model) and on the deployment system (the one where you deploy your model) has to be the same.

* The next issue with TRT is that it optimizes the model for the CUDA compute capability of the GPU being used. In simple terms, a model optimized on Colab cannot run on a Jetson Nano board. Hence, a model needs to be converted first and then built at runtime on the deployment machine.

* TensorRT models can only be converted from saved_model (".pb") files, and the conversion generates an optimized model in the same format, which inference is then run on.

Let's get started with the process

Note: The entire code is available as a Jupyter notebook in the Monk Object Detection Library. Here, I have mentioned only the important steps involved in the entire work.

Step 1: Installation

For local or cloud-based systems, run the installation steps from the repository. This will install:
* prerequisite libraries such as numpy, scipy, opencv, pillow, lxml, etc.
* Tensorflow 2.3.0 and Tensorflow-models-python-2.2.0
* the Tensorflow Object Detection API

Similarly, it can also be installed on Colab.
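A minimal sketch of that installation from a notebook or script, assuming the Monk repository layout (the pipeline folder and requirements path below are assumptions; the linked notebook has the exact commands):

```python
import subprocess

# Clone the Monk Object Detection repository.
subprocess.run(
    ["git", "clone", "https://github.com/Tessellate-Imaging/Monk_Object_Detection.git"],
    check=True)

# Assumption: the TF 2.x object-detection pipeline keeps its dependencies
# in a per-pipeline requirements file; install them with pip.
subprocess.run(
    ["pip", "install", "-r",
     "Monk_Object_Detection/13_tf_obj_2/installation/requirements.txt"],
    check=True)
```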

Step 2: Data preparation

The dataset needs to be in the simple Pascal VOC format, as shown below.

Pascal VOC format
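For reference, each image gets one small XML annotation file in this format; a minimal, illustrative example (file name, class, and box values are made up) generated with Python's standard library:

```python
import xml.etree.ElementTree as ET

# Build a minimal Pascal VOC annotation for one image (values are illustrative).
ann = ET.Element("annotation")
ET.SubElement(ann, "filename").text = "frame_0001.jpg"
size = ET.SubElement(ann, "size")
ET.SubElement(size, "width").text = "1280"
ET.SubElement(size, "height").text = "720"
ET.SubElement(size, "depth").text = "3"

obj = ET.SubElement(ann, "object")
ET.SubElement(obj, "name").text = "car"  # class label
box = ET.SubElement(obj, "bndbox")
for tag, val in [("xmin", "100"), ("ymin", "200"), ("xmax", "300"), ("ymax", "400")]:
    ET.SubElement(box, tag).text = val

ET.ElementTree(ann).write("frame_0001.xml")
```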

To convert your dataset from COCO or any other format, check these detailed tutorials.

For this tutorial, we used the BDD100K on-road object detection dataset (Credits).

Complete steps to download the data and convert it to Pascal VOC format are given in this Jupyter notebook on GitHub.

Step 3: System parametric setup

★ Load the detector

★ List all the models. At present, 26 different models are supported.

★ Set training and validation data with params

* Set the batch size as per the GPU memory available. A size of 24 fits well on an AWS p3.2x instance with a V100 GPU (16 GB VRAM).

★ Create the TF records

* Batch size, number of classes, and tf-record details will all be saved automatically in the config file.

★ Select a model from the model list

* For this example, we selected SSD ResNet50 with a feature pyramid network; it takes an input image of shape 640x640x3 (RGB image).

★ Set all other hyperparameters

* Set the training steps; an ideal value for large datasets is around 100K steps.

* The initial learning rate for all models can be set around 0.01, whereas the ssd_mobilenet_v2 and faster_rcnn_inception models can take even higher learning rates.

★ Set the output inference graph path and TensorRT params

* TensorRT optimization is optional.

* TensorRT supports three types of optimization: FP32, FP16, and INT8.

* The floating-point quantizations are useful for boards like the Jetson Nano.

* INT8 optimization builds the engine at the time of conversion, while the other two build on the deployment machine. A build can be thought of as the actual optimization being applied to the model. As mentioned above, a TensorRT model should be built on the machine where you want to deploy it, or on a machine with the same TensorRT library and CUDA compute capability.

* Hence, it is advised to run INT8 optimization on the deployment machine itself.

* Since, for this example, we trained and ran inference on the same AWS P3.2x instance, we went with INT8 quantization. A consolidated sketch of this setup follows.
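Put together, the parametric setup follows roughly the shape below. The method names and arguments are assumptions modeled on Monk's pythonic wrapper as described above; consult the linked notebook for the exact API.

```python
import sys

# Assumption: the Monk TF 2.x pipeline exposes a Detector class here.
sys.path.append("Monk_Object_Detection/13_tf_obj_2/lib/")
from train_detector import Detector

gtf = Detector()
gtf.list_models()  # lists the 26 supported architectures

# Point the detector at Pascal VOC style data (paths are illustrative).
gtf.set_train_dataset("train/images", "train/annotations", "classes.txt",
                      batch_size=24)
gtf.create_tfrecord(data_output_dir="data_tfrecord")

# Model, hyperparameters, and export/TensorRT settings.
gtf.set_model_params(model_name="ssd_resnet50_v1_fpn_640x640_coco17_tpu-8")
gtf.set_hyper_params(num_train_steps=100000, lr=0.01)
gtf.export_params(output_directory="export_dir")
gtf.TensorRT_Optimization_Params(conversion_type="INT8", trt_dir="trt_dir")
```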

Step 4: Training and model export

★ To train, run the training script as described below.

* Since it runs a TF engine, running it as a wrapper inside a Jupyter notebook results in a system exit.

* For that reason, a script named train.py is provided.

* Once training is completed, checkpoint files will be saved in the output_directory that you mentioned in the hyperparameter setup command.

★ To export the trained model to the saved_model (".pb") format, run the export script.

* Since it runs a TF engine, running it as a wrapper inside a Jupyter notebook results in a system exit.

* For that reason, a script named export.py is provided.

* The output .pb file will be saved in the export_directory that you set in the export_params setup command. Invoking both scripts is sketched below.
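Because the engine must run outside the notebook kernel, the provided scripts are launched as separate processes; a minimal sketch, assuming the scripts sit in the working directory:

```python
import subprocess

# train.py drives the TF OD API training loop in its own process,
# avoiding the in-notebook system exit mentioned above.
subprocess.run(["python", "train.py"], check=True)

# export.py converts the trained checkpoints to the saved_model (".pb") format.
subprocess.run(["python", "export.py"], check=True)
```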

Step 5: Inference and speed benchmark before optimization

★ Load the detector

★ Load the trained model

* Load from the saved_model in the exported directory

★ Run inference on a single image

★ Run a benchmark speed analysis, as sketched below
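The numbers below come from a simple timing loop. A plain-Python sketch of such a benchmark (detect stands for any callable that runs one batch, e.g. the tf.function wrapper from earlier; 100 repetitions match the report):

```python
import time
import statistics

def benchmark(detect, image_batch, repetitions=100):
    # Warm-up call so graph tracing is not counted in the timings.
    detect(image_batch)

    latencies = []
    for _ in range(repetitions):
        start = time.perf_counter()
        detect(image_batch)
        latencies.append((time.perf_counter() - start) * 1000.0)  # ms

    total_s = sum(latencies) / 1000.0
    print(f"total_repetitions : {repetitions}")
    print(f"total_time : {total_s:.4f} sec")
    print(f"images_per_sec : {int(repetitions / total_s)}")
    print(f"latency_mean : {statistics.mean(latencies):.4f} ms")
    print(f"latency_median : {statistics.median(latencies):.4f} ms")
    print(f"latency_min : {min(latencies):.4f} ms")
```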

* Without optimization, these are the results on an AWS P3.2x instance:

Average Image loading time : 0.0121 sec
Average Inference time : 0.0347 sec
Result extraction time : 0.0848 sec
total_repetitions : 100
total_time : 3.4712 sec
images_per_sec : 28
latency_mean : 34.7123 ms
latency_median : 34.9255 ms
latency_min : 32.2594 ms

Step 6: Optimize using TensorRT 6

★ Install TensorRT 6

* Visit the NVIDIA TensorRT page to download TRT 6.

* Download the packages from the TensorRT website according to your OS and CUDA versions.

★ To optimize with TensorRT, run the optimization script.

* Since it runs a TF engine, running it as a wrapper inside a Jupyter notebook results in a system exit.

* For that reason, a script named optimize.py is provided.

* Run it over the exported saved_model. The underlying conversion is sketched below.
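Under the hood, TF-TRT conversion of a saved model follows the pattern below (TrtGraphConverterV2 is TensorFlow's TF-TRT entry point; the paths and the dummy calibration input are illustrative, and real INT8 calibration should use actual dataset images):

```python
import numpy as np
from tensorflow.python.compiler.tensorrt import trt_convert as trt

params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
    precision_mode=trt.TrtPrecisionMode.INT8)

converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="export_dir/saved_model",  # illustrative path
    conversion_params=params)

# INT8 needs representative inputs for calibration (dummy data here).
def calibration_input_fn():
    yield (np.zeros((1, 640, 640, 3), dtype=np.uint8),)

converter.convert(calibration_input_fn=calibration_input_fn)

# build() pre-builds the TRT engines; as noted above, run this on the
# deployment machine (or one with identical TRT/CUDA versions).
converter.build(input_fn=calibration_input_fn)
converter.save("trt_dir")  # optimized model, still in saved_model format
```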

Step 7: Inference and speed benchmark after optimization

★ Load the detector

★ Load the trained model

* Load from the optimized saved_model

★ Run inference on a single image

★ Run a benchmark speed analysis

* With INT8 optimization, these are the results on an AWS P3.2x instance:

Average Image loading time : 0.0117 sec
Average Inference time : 0.0169 sec
Result extraction time : 0.0822 sec
total_repetitions : 100
total_time : 1.6907 sec
images_per_sec : 59
latency_mean : 16.9070 ms
latency_median : 16.8167 ms
latency_min : 16.2708 ms

Conclusion

With the Monk Object Detection Library, one can easily:

★ convert custom datasets to tf-records
★ update config files using pythonic syntax
★ train the engine and export to different inference formats
★ infer using either checkpoints or saved-model formats
★ optimize the model with TensorRT engines for faster inference
★ After optimization, the number of images processed per second nearly doubled (28 to 59).

The entire code is available on GitHub in the Monk Object Detection Library.

Happy Coding!!

Appendix 1

More About Monk Object Detection Library


Easing up the process of Tensorflow 2.0 Object Detection API and TensorRT was originally published in Towards AI — Multidisciplinary Science Journal on Medium, where people are continuing the conversation by highlighting and responding to this story.

Published via Towards AI
