Detection of Soccer Players from Thermal Images using Monk AI
Last Updated on October 10, 2020 by Editorial Team
Author(s): Kushagra Awasthi
Computer Vision
Making computer vision easy with Monk, a low-code deep learning tool and a unified wrapper for computer vision.
Introduction
In this tutorial, we will build an object detection application using a thermal image dataset recorded on an indoor soccer field. With this application we can count the players present on the field at a given time, so it can serve as a building block for target tracking. This helps when tracking multiple people, especially in activities where people move quickly and erratically and wear similar uniforms. Monk's object detection toolkit lets us deploy our model using low-code syntax, and the one-line installation of different deep learning pipelines makes our work easier.
Create real-world Object Detection applications using Monk
About Dataset
The thermal soccer dataset is available on Kaggle. It was captured using thermal cameras, which gives better segmentation and protects the privacy of people in public facilities.
This dataset contains four 30-second video sequences of 8 people playing soccer in an indoor arena. The video is captured using thermal cameras of type AXIS Q1922 with a resolution of 640×480 pixels at 25 fps. Three such images are stitched into one image of 1920×480 pixels.
The videos are manually annotated for tracking.
Table of Contents
1. Installation Instructions
2. Use the trained model to detect soccer players
3. Training your own detector using MMDetection wrapper
   - VOC to MONK TYPE
   - MONK to COCO TYPE
   - Training
4. Inference
Installation
The first step is to set up the MONK AI toolkit and its dependencies on the platform we are working on. I am using Google Colab as my environment.
Use an already trained model for detection.
The MONK toolkit also allows us to use pre-trained models to demonstrate our applications. Here I use a model that I trained earlier to detect soccer players in thermal images.
Downloading the pre-trained model folder and using it to infer some test images.
Loading the model parameters from the pre-trained model folder.
Using the predict function, we will predict the bounding boxes of soccer players for some test images.
Training a Custom Detector
The first step in training a custom detector is to convert the dataset from VOC format to MONK TYPE format. Before that, we need to prepare a proper VOC-type dataset by following the steps given below:
- Download the dataset from Kaggle onto your local system.
- Move all the images from the different folders into a common folder.
- Select all images and rename the first image as "img", so that all images share the same base name with a running number.
- Upload this image folder to your drive and mount your drive in the notebook.
- The XML annotation files come with the Kaggle download. From them, we create a separate XML file for each image and save these files in a separate folder.
- The above steps are performed so that proper label matching can be achieved after the dataset is converted from VOC to MONK type.
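The folder-merging and renaming steps above can also be scripted instead of done by hand. The following is a minimal sketch, not the code used in the original notebook: the function name `collect_images`, the `img_0001.jpg` naming pattern, and the folder names are assumptions made for illustration.

```python
import shutil
from pathlib import Path


def collect_images(source_root, target_dir, prefix="img",
                   extensions=(".jpg", ".jpeg", ".png")):
    """Copy every image under source_root into target_dir with sequential names.

    Returns a dict mapping each original relative path to its new file name,
    which is useful later for matching annotations to the renamed images.
    """
    source_root = Path(source_root)
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    # Sort by path so the numbering is reproducible across runs.
    images = sorted(
        p for p in source_root.rglob("*")
        if p.is_file() and p.suffix.lower() in extensions
    )

    mapping = {}
    for index, src in enumerate(images, start=1):
        # Hypothetical naming scheme: img_0001.jpg, img_0002.jpg, ...
        new_name = f"{prefix}_{index:04d}{src.suffix.lower()}"
        shutil.copy2(src, target_dir / new_name)
        mapping[str(src.relative_to(source_root))] = new_name
    return mapping
```

A call such as `collect_images("thermal_soccer_raw", "Persons")` would fill the "Persons" folder. Keep the target folder outside the source tree, otherwise a second run would pick up the already copied files.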
Saving images in an image directory "Persons" in the root directory.
Creating separate XML files for each image and saving them in a separate directory "Person_bbox1" in the root directory.
Similarly, the XML files of images in the other three folders are saved in the annotation folder in the root directory.
Now that the VOC-type dataset is ready, we will convert it to MONK format.
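So, what is MONK format? It keeps all images in a single folder next to one CSV annotation file, where each row holds an image name and a label string listing x1 y1 x2 y2 label for every object in that image.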
VOC to MONK TYPE
To convert our data to the MONK format, we run the code snippet given below.
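A minimal sketch of such a conversion, using only the Python standard library, is shown here. It assumes one Pascal VOC XML file per image and a MONK-style CSV with the columns "ID" and "Label", where the label string repeats `xmin ymin xmax ymax class` for every object. The function name `voc_to_monk` and the file names are hypothetical, and the original notebook may do this differently.

```python
import csv
import xml.etree.ElementTree as ET
from pathlib import Path


def voc_to_monk(annotation_dir, output_csv):
    """Write one CSV row per VOC XML file: image name plus a label string."""
    rows = []
    for xml_path in sorted(Path(annotation_dir).glob("*.xml")):
        root = ET.parse(xml_path).getroot()
        # Fall back to the XML file stem if <filename> is missing (assumes .jpg).
        filename = root.findtext("filename") or xml_path.stem + ".jpg"

        parts = []
        for obj in root.iter("object"):
            box = obj.find("bndbox")
            if box is None:
                continue
            coords = [
                str(int(float(box.findtext(tag))))
                for tag in ("xmin", "ymin", "xmax", "ymax")
            ]
            # Class names must not contain spaces in this format.
            parts.extend(coords + [obj.findtext("name", default="person")])

        if parts:  # skip frames without any annotated object
            rows.append((filename, " ".join(parts)))

    with open(output_csv, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["ID", "Label"])  # assumed column names
        writer.writerows(rows)
    return rows
```

For example, `voc_to_monk("Person_bbox1", "train_labels.csv")` would produce one row per annotated frame, such as `img_0001.jpg,10 20 50 90 person 100 30 140 110 person`.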
The generated CSV file will look as shown below.
MONK to COCO TYPE
The MONK TYPE dataset now has to be converted to COCO TYPE, which will be used for object detection. In COCO format, the bounding-box annotations for each image are saved in a JSON file, and the classes.txt file contains all the possible classes of objects that can be present in an image.
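To make the target structure concrete, here is a hedged sketch of a MONK-to-COCO conversion. It assumes a CSV with "ID" and "Label" columns as input, a fixed image size of 1920×480 pixels taken from the dataset description, and category IDs that start at 0. The function name `monk_to_coco` and the output file names are illustrative and not part of the Monk library.

```python
import csv
import json
from pathlib import Path


def monk_to_coco(csv_path, output_json, classes_txt, image_size=(1920, 480)):
    """Convert a MONK-style CSV into a COCO-style JSON file plus classes.txt."""
    width, height = image_size
    images, annotations, class_names = [], [], []

    with open(csv_path, newline="") as f:
        for image_id, row in enumerate(csv.DictReader(f)):
            images.append({"id": image_id, "file_name": row["ID"],
                           "width": width, "height": height})
            tokens = row["Label"].split()
            # Every object occupies five tokens: x1 y1 x2 y2 label.
            for i in range(0, len(tokens) - 4, 5):
                x1, y1, x2, y2 = (float(t) for t in tokens[i:i + 4])
                label = tokens[i + 4]
                if label not in class_names:
                    class_names.append(label)
                annotations.append({
                    "id": len(annotations),
                    "image_id": image_id,
                    "category_id": class_names.index(label),
                    # COCO stores boxes as [x, y, width, height].
                    "bbox": [x1, y1, x2 - x1, y2 - y1],
                    "area": (x2 - x1) * (y2 - y1),
                    "iscrowd": 0,
                })

    coco = {
        "images": images,
        "annotations": annotations,
        "categories": [{"id": i, "name": n} for i, n in enumerate(class_names)],
    }
    Path(output_json).write_text(json.dumps(coco))
    Path(classes_txt).write_text("\n".join(class_names) + "\n")
    return coco
```

Note the change in box convention: the MONK label string stores two corners, while COCO stores the top-left corner plus width and height. Some tools expect category IDs to start at 1, so adjust that choice to whatever the training pipeline requires.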
Training
After converting the dataset to COCO format, we can proceed to the final step: training our detector using the MMDetection wrapper class.
Importing the Detector module.
Now, we will update the dataset parameters, model parameters, hyperparameters, and training parameters for our detector.
Now, we are all set to start training our model.
Inference
Once the training is complete, we can run inference on some images to check the accuracy and efficiency of our model.
Setting up the model parameters for inference, according to the latest epoch of the trained model.
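Picking the latest epoch by hand is easy to get wrong after a long run, so it can help to look it up programmatically. The sketch below assumes checkpoints are saved as epoch_N.pth in a single work directory, which is a common MMDetection naming convention. Whether the Monk wrapper keeps exactly this layout is an assumption, and the helper name `latest_checkpoint` is hypothetical.

```python
import re
from pathlib import Path


def latest_checkpoint(work_dir):
    """Return the epoch_<N>.pth file with the largest N, or None if none exist."""
    best_epoch, best_path = -1, None
    for path in Path(work_dir).glob("epoch_*.pth"):
        match = re.fullmatch(r"epoch_(\d+)\.pth", path.name)
        # Compare epochs numerically so that epoch_10 ranks above epoch_2.
        if match and int(match.group(1)) > best_epoch:
            best_epoch, best_path = int(match.group(1)), path
    return best_path
```

The returned path can then be passed to whatever model-loading call the inference step uses.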
Now, we will infer an image.
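Since the goal stated in the introduction is to know how many players are on the field at a given time, the detections for a frame can be reduced to a simple count. The sketch below assumes each detection is a dictionary with "label", "score", and "box" keys. That structure is hypothetical, so adapt it to whatever the predict function actually returns.

```python
def count_players(detections, score_threshold=0.5, label="person"):
    """Count detections of the given label whose confidence reaches the threshold."""
    return sum(
        1 for det in detections
        if det["label"] == label and det["score"] >= score_threshold
    )


# Made-up detections for a single frame, for illustration only.
frame_detections = [
    {"label": "person", "score": 0.93, "box": [120, 40, 180, 200]},
    {"label": "person", "score": 0.71, "box": [640, 55, 700, 210]},
    {"label": "person", "score": 0.22, "box": [900, 60, 940, 150]},
]
players_on_field = count_players(frame_detections, score_threshold=0.5)
```

Raising the threshold trades missed players for fewer false detections, so the value is worth tuning on a few annotated frames.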
Conclusion
So, we saw how MONK's low-code syntax made it easy to create a soccer player detection application using a thermal image dataset. This type of application helps in real-time tracking of people while preserving their privacy, since we are using thermal images. The ability of thermal cameras to see at night and even in severe weather conditions makes them very useful for target detection applications. This application can also be used by security forces for surveillance purposes. For more such applications, refer to the Application Model Zoo of the MONK object detection library.
Tutorial available on GitHub.
Detection of Soccer Players from Thermal Images using Monk AI was originally published in Towards AI - Multidisciplinary Science Journal on Medium, where people are continuing the conversation by highlighting and responding to this story.
Published via Towards AI