Car and Pool Detector Using Monk AI
Computer Vision

Last Updated on January 7, 2023 by Editorial Team

Last Updated on August 29, 2020 by Editorial Team

Author(s): Omkar Tupe

Photo by Kelly Lacy from Pexels

About the project

This project detects cars and swimming pools in satellite images using CornerNet [1]. Writing an object detector from scratch can be difficult and tedious for someone who is not well acquainted with the field. Monk AI [2] makes this easier: it lets you accomplish computer vision tasks such as object detection with very few lines of code, and this project is a good way to get to know the toolkit. In this post, I share some insights about Monk AI and show how it can simplify object detection and help build other computer vision applications.

Tutorial available on GitHub.

Car and pool detection (images are taken from the Kaggle dataset)

Features of Monk AI

  • A low code programming environment.
  • Using Monk AI, one can access PyTorch, MXNet, Keras, TensorFlow, etc. with a common syntax.
  • For competition and hackathon participants: the hassle-free setup makes prototyping faster and easier.

The Dataset

For this project, satellite images are used to train the model to detect cars and pools. Annotations are stored in VOC format. The dataset has 3748 training images and 2703 test images, and it is available on Kaggle [3].
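To make the VOC format concrete, here is a minimal sketch of reading one annotation file with the Python standard library. The sample XML (file name, image size, coordinates) is made up for illustration and is not taken from the dataset; `parse_voc` is a hypothetical helper, not part of Monk AI.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation for one satellite image; all values are illustrative.
SAMPLE_VOC = """<annotation>
  <filename>000000001.jpg</filename>
  <size><width>224</width><height>224</height><depth>3</depth></size>
  <object>
    <name>car</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>30</xmax><ymax>35</ymax></bndbox>
  </object>
  <object>
    <name>pool</name>
    <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>180</xmax><ymax>200</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc(xml_text):
    """Return (filename, [(label, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    boxes = []
    for obj in root.iter("object"):
        bb = obj.find("bndbox")
        # Coordinates are sometimes written as floats, so go through float first.
        coords = [int(float(bb.findtext(tag))) for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((obj.findtext("name"), *coords))
    return filename, boxes


filename, boxes = parse_voc(SAMPLE_VOC)
print(filename, boxes)
```

VOC stores each box as its two extreme corners (xmin, ymin, xmax, ymax), which matters later when the annotations are converted to COCO.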

CornerNet

CornerNet is an approach to object detection that detects an object's bounding box as a pair of keypoints, the top-left corner and the bottom-right corner, using a single convolutional neural network. Detecting objects as paired keypoints eliminates the need to design a set of anchor boxes, which were commonly used in earlier single-stage detectors.
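The geometric idea can be sketched in a few lines. This is only an illustration of how a corner pair defines a box; the real network predicts corner heatmaps and embeddings to decide which corners belong together, which is not shown here. `Corner` and `box_from_corners` are hypothetical names.

```python
from typing import NamedTuple


class Corner(NamedTuple):
    x: float
    y: float


def box_from_corners(top_left, bottom_right):
    """Turn a (top-left, bottom-right) keypoint pair into (x, y, width, height)."""
    width = bottom_right.x - top_left.x
    height = bottom_right.y - top_left.y
    if width <= 0 or height <= 0:
        # A bottom-right corner must lie below and to the right of its partner.
        raise ValueError("corners do not form a valid box")
    return (top_left.x, top_left.y, width, height)


box = box_from_corners(Corner(10, 20), Corner(30, 35))
print(box)
```

Because a box is fully determined by two points, no predefined anchor shapes are needed.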

Don't worry about the architecture; Monk will take care of it. (Image ref: https://arxiv.org/abs/1808.01244)

Table of contents

  1. Installation instructions
  2. Use an already trained model
  3. Train a custom detector
     • Annotation conversion (VOC to COCO via Monk)
     • Training
  4. Inference

1. Installation instructions

Here we use Google Colab for training because it provides a CUDA GPU, but you can also use a local machine or a Kaggle notebook. We will now set up the Monk AI toolkit and its dependencies on Colab.

2. Use an already trained model

Before training anything, we use a pre-trained model with Monk to demonstrate the application and get a feel for the detection results.

Downloading the pre-trained model.

Unzip the folder

The obj_satellite_car_pool_trained folder will have the pre-trained model file and some test images.
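As a rough sketch of the unzip step using only the standard library: the archive name `obj_satellite_car_pool_trained.zip` and the image name `test1.jpg` are assumptions for illustration. The demo builds a throwaway archive with empty files so the snippet runs anywhere; with the real download you would call `unzip` on the downloaded file instead.

```python
import tempfile
import zipfile
from pathlib import Path


def unzip(archive_path, dest_dir):
    """Extract a zip archive and return the sorted member paths it contained."""
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(dest_dir)
        return sorted(zf.namelist())


with tempfile.TemporaryDirectory() as tmp:
    # Assumed archive name; the members below are empty stand-ins.
    archive = Path(tmp) / "obj_satellite_car_pool_trained.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("obj_satellite_car_pool_trained/CornerNet_Saccade_final-1000.pkl", b"")
        zf.writestr("obj_satellite_car_pool_trained/test1.jpg", b"")  # hypothetical image name

    names = unzip(archive, Path(tmp) / "out")
    weights = Path(tmp) / "out" / names[0]
    weights_extracted = weights.exists()
    print(names)
```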

Setting up a detector

From the unzipped folder, we use the weight file obj_satellite_car_pool_trained/CornerNet_Saccade_final-1000.pkl.

From the unzipped folder, we are using some images for inference purposes.

Inference-1

Inference-2

3. Train a custom detector

We are using a dataset from Kaggle, so we have to install the Kaggle API on Colab.

Please follow the steps below to download and use Kaggle data within Google Colab [4]:

  1. Go to your account, scroll to the API section, and click Expire API Token to remove previous tokens.
  2. Click Create New API Token. It will download the kaggle.json file to your machine.
  3. Go to your Google Colab project file and run the following commands:
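A minimal Python sketch of what step 3 typically involves, assuming the Kaggle client is already installed (`pip install kaggle`) and that `kaggle.json` has been uploaded to the Colab working directory. The client reads its token from `~/.kaggle/kaggle.json` and warns if the file is readable by other users, so the helper copies the token there and restricts its permissions. `install_kaggle_token` is a hypothetical helper name.

```python
import os
import shutil
import stat
from pathlib import Path


def install_kaggle_token(token_file, config_dir=None):
    """Copy kaggle.json into the Kaggle config directory with owner-only access."""
    config_dir = Path(config_dir) if config_dir else Path.home() / ".kaggle"
    config_dir.mkdir(parents=True, exist_ok=True)
    target = config_dir / "kaggle.json"
    shutil.copyfile(token_file, target)
    os.chmod(target, stat.S_IRUSR | stat.S_IWUSR)  # equivalent to chmod 600
    return target
```

On Colab this would be called as `install_kaggle_token("kaggle.json")` after uploading the token file.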

Time to download your dataset

Go to the dataset you want to download on Kaggle and copy the API command which Kaggle provides. That should look like the following:
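For the dataset used here, the command can be derived from the dataset URL in the references. The sketch below builds the `kaggle datasets download -d <owner>/<dataset>` command from a dataset page URL; `kaggle_download_command` is a hypothetical helper, and the command list could be passed to `subprocess.run` once the token is in place.

```python
from urllib.parse import urlparse


def kaggle_download_command(dataset_url):
    """Build the `kaggle datasets download` command for a dataset page URL."""
    parts = [p for p in urlparse(dataset_url).path.split("/") if p]
    if parts and parts[0] == "datasets":  # newer URLs include a /datasets/ prefix
        parts = parts[1:]
    if len(parts) < 2:
        raise ValueError("expected a URL of the form .../<owner>/<dataset>")
    owner, name = parts[0], parts[1]
    return ["kaggle", "datasets", "download", "-d", f"{owner}/{name}"]


cmd = kaggle_download_command(
    "https://www.kaggle.com/kbhartiya83/swimming-pool-and-car-detection"
)
print(" ".join(cmd))
```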

To train a model with CornerNet-Saccade, the annotations must be in COCO format, but ours are in VOC format. Hence, we need to convert them from VOC to COCO via the Monk format. You can find detailed code about this on GitHub.

VOC format (dataset directory structure)

Monk format
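As an illustration of the intermediate step, the sketch below writes one CSV row per image. It assumes the Monk format is a CSV with an `ID` column (image file name) and a `Label` column holding space-separated `xmin ymin xmax ymax label` groups; that layout is my reading of the Monk examples, not something stated in this article, so check it against the linked code. The file name and boxes are made up, and `to_monk_row` is a hypothetical helper.

```python
import csv
import io


def to_monk_row(filename, boxes):
    """Build one (ID, Label) row; boxes are (label, xmin, ymin, xmax, ymax)."""
    parts = []
    for label, xmin, ymin, xmax, ymax in boxes:
        parts.extend([str(xmin), str(ymin), str(xmax), str(ymax), label])
    return filename, " ".join(parts)


rows = [to_monk_row("000000001.jpg", [("car", 10, 20, 30, 35), ("pool", 100, 120, 180, 200)])]

buf = io.StringIO()  # stands in for an open CSV file
writer = csv.writer(buf)
writer.writerow(["ID", "Label"])  # assumed column names
writer.writerows(rows)
print(buf.getvalue())
```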

COCO format (desired annotations)

Annotation Conversion

You can find detailed code about this on GitHub.
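To show what the conversion has to produce, here is a minimal sketch that builds a COCO-style dictionary directly from VOC-style boxes. It skips the Monk intermediate format used in the linked code and is not the author's implementation; the image record is made up and `to_coco` is a hypothetical helper. The key difference between the formats is that VOC stores two corners per box, while COCO stores `[x, y, width, height]`.

```python
import json


def to_coco(records, class_names):
    """records: [(filename, width, height, [(label, xmin, ymin, xmax, ymax), ...])]."""
    cat_ids = {name: i + 1 for i, name in enumerate(class_names)}
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": i, "name": n} for n, i in cat_ids.items()],
    }
    ann_id = 1
    for img_id, (filename, width, height, boxes) in enumerate(records, start=1):
        coco["images"].append(
            {"id": img_id, "file_name": filename, "width": width, "height": height}
        )
        for label, xmin, ymin, xmax, ymax in boxes:
            w, h = xmax - xmin, ymax - ymin
            coco["annotations"].append({
                "id": ann_id,
                "image_id": img_id,
                "category_id": cat_ids[label],
                "bbox": [xmin, ymin, w, h],  # COCO stores x, y, width, height
                "area": w * h,
                "iscrowd": 0,
            })
            ann_id += 1
    return coco


records = [("000000001.jpg", 224, 224, [("car", 10, 20, 30, 35), ("pool", 100, 120, 180, 200)])]
coco = to_coco(records, ["car", "pool"])
print(json.dumps(coco["annotations"][0]))
```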

Training

With Monk AI, building the training pipeline, selecting a model, and setting hyperparameters are all straightforward:

  1. Import the dependencies.
  2. Set the detector path.
  3. Set the dataset and annotation paths.
  4. Select the model (here we are using CornerNet_Saccade).
  5. Set the hyperparameters (learning rate 0.00025, total iterations 10000).
  6. Complete the setup for training. This loads the annotations into memory, creates the index, and loads the model.
  7. Start the training.
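To put the hyperparameters from the steps above in perspective, the sketch below collects them in a plain configuration object and estimates how many passes over the training set 10000 iterations amount to. It does not call Monk; `TrainConfig` is a hypothetical container, and the batch size of 4 is an assumption, since the article does not state it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    model_name: str = "CornerNet_Saccade"
    learning_rate: float = 0.00025
    total_iterations: int = 10000
    batch_size: int = 4  # assumption: not stated in the article
    num_train_images: int = 3748

    def approx_epochs(self):
        """Roughly how many passes over the training set the run makes."""
        return self.total_iterations * self.batch_size / self.num_train_images


cfg = TrainConfig()
print(f"{cfg.model_name}: lr={cfg.learning_rate}, about {cfg.approx_epochs():.1f} epochs")
```

Under that batch-size assumption the run covers the 3748 training images a little over ten times.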

The training log shows all hyperparameter settings. The model has 116849063 trainable parameters. It is good practice to shuffle the data during training, so the dataset's indices are shuffled.
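Index shuffling itself is simple to illustrate. The sketch below is a generic example, not Monk's internal code: it draws a fresh, reproducible permutation of the dataset indices for each pass, so every image is still visited exactly once per pass but in a different order. `shuffled_indices` and its seeding scheme are illustrative choices.

```python
import random


def shuffled_indices(num_samples, epoch, seed=0):
    """Return a fresh permutation of 0..num_samples-1 for the given epoch."""
    rng = random.Random(seed + epoch)  # a different but reproducible order per epoch
    indices = list(range(num_samples))
    rng.shuffle(indices)
    return indices


first = shuffled_indices(3748, epoch=0)
second = shuffled_indices(3748, epoch=1)
print(first[:5], second[:5])
```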

The weight files, both the intermediate checkpoints and the final file, are stored in 'cache/nnet/CornerNet_Saccade/'.

4. Inference

Inference works as it did with the pre-trained model, but now we use our own trained model, so the model path is different.

  1. Set the detector path.
  2. Define the classes.
  3. Set the trained model path.
  4. Provide some test images.

Test image-1

Test image-2

After some trial and error, the threshold is set to 0.3.

For threshold values below 0.3, multiple detection boxes are observed, and for threshold values above 0.3, the detector has difficulty detecting cars.
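The effect of the threshold can be sketched with made-up detections. The dictionaries below are a hypothetical output format with invented scores, not Monk's actual return value; they only mimic the behaviour described above, where low-scoring duplicate boxes appear at low thresholds and small objects such as cars drop out at high ones.

```python
def filter_detections(detections, threshold=0.3):
    """Keep detections whose confidence score is at least `threshold`."""
    return [d for d in detections if d["score"] >= threshold]


# Hypothetical detector output for one image
detections = [
    {"label": "pool", "score": 0.82, "box": [100, 120, 180, 200]},
    {"label": "car", "score": 0.34, "box": [10, 20, 30, 35]},
    {"label": "car", "score": 0.12, "box": [11, 21, 31, 36]},  # near-duplicate, low score
]

for t in (0.1, 0.3, 0.5):
    kept = filter_detections(detections, t)
    print(t, [d["label"] for d in kept])
```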

Because we used satellite images, a car covers far fewer pixels than a pool, so more features are available for pools.

Conclusion

To conclude, our task was done in very few lines of code. We have demonstrated only one pipeline in this article, but Monk AI has a total of seven such pipelines, from GluonCV to YOLOv3. All in all, Monk AI is a great library that makes working with such computer vision tasks pretty easy.

All the code shown in this article resides in this colab notebook.

You can find many more examples of detection and segmentation in the application model zoo.

Thanks for reading! I hope you find this article informative and useful. Do share your feedback in the comments section!

References

  1. CornerNet: https://arxiv.org/abs/1808.01244
  2. Monk AI: https://github.com/Tessellate-Imaging/Monk_Object_Detection
  3. Kaggle dataset: https://www.kaggle.com/kbhartiya83/swimming-pool-and-car-detection
  4. Downloading a Kaggle dataset on Google Colab: https://www.kaggle.com/general/74235


Car and Pool Detector Using Monk AI was originally published in Towards AI - Multidisciplinary Science Journal on Medium, where people are continuing the conversation by highlighting and responding to this story.

Published via Towards AI
