Tackle COVID detection with Lightning Flash and IceVision

Last Updated on January 7, 2023 by Editorial Team

Last Updated on October 9, 2021 by Editorial Team

Author(s): Jirka Borovec

Illustration photo by LinkedIn Sales Navigator from Pexels.

Deep Learning

Tackling the Kaggle COVID Detection Challenge with Lightning Flash and IceVision

This post walks through how we approached the Kaggle SIIM-FISABIO-RSNA COVID-19 Detection challenge using Lightning Flash and its new support for IceVision's rich collection of models and backbones.

Object Detection is a Computer Vision task that aims to detect and classify individual objects in a scene or image. There are various model architectures for Object Detection, but the two most common families are region-proposal detectors (e.g. Fast/Faster R-CNN) and single-shot detectors (e.g. SSD, YOLO).

The Lightning Flash team recently released an exciting new integration with the IceVision Object Detection library. It enables dozens of new state-of-the-art backbones that can be fine-tuned and used for inference in just a few lines of code.

Object Detection – Flash documentation

To showcase the new integration, we participated in the Kaggle COVID-19 detection/classification challenge, which presents a realistic and challenging dataset of CT scans from over six thousand patients.

Check out our Kaggle kernel: COVID detection with Lightning Flash ⚡️

All code snippets and visualizations are part of this sample repository, which can be installed with: pip install https://github.com/Borda/kaggle_COVID-detection/archive/refs/heads/main.zip

COVID Detection Challenge

The recent Kaggle COVID-19 detection/classification challenge aims to facilitate medical screening and eventually assist medical experts and doctors with making diagnoses. The challenge is a hybrid image classification/object detection task. First, you need to identify whether a given CT scan has one of four abnormalities, and then provide a collection of bounding boxes indicating where the abnormalities are present to justify the diagnosis.

While the task is presented as a hybrid Detection and Classification task, it can also be modelled as a traditional object detection problem where the image class is determined by the aggregation of the majority class of the detected abnormalities.
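The aggregation idea above can be sketched in a few lines. This is only an illustration, not the heuristic used in the competition: the function name and the label strings are placeholders, and an image without any detection is assumed to fall back to a negative class.

```python
from collections import Counter


def aggregate_image_class(detected_labels, negative_label="negative"):
    """Return the majority class among the detections of one image.

    `detected_labels` is a list with one class name per detected box.
    An image without detections is assumed to be negative (an assumption
    made for this sketch).
    """
    if not detected_labels:
        return negative_label
    # most_common(1) yields [(label, count)]; on a tie the label that was
    # encountered first wins.
    return Counter(detected_labels).most_common(1)[0][0]


# Placeholder labels, for illustration only.
image_class = aggregate_image_class(["typical", "atypical", "typical"])
```

A real submission would probably also weigh the detection confidences instead of counting every box equally.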

Label distribution over the training dataset.

A Study represents a unique anonymized patient with a single Computed Tomography (CT) scan. Each scan should be represented by a single DICOM image. It is important to note that some images were accidentally given incomplete annotations. These images and their annotations were later duplicated and fixed, as discussed in this thread. As part of the challenge, incomplete annotations and duplicate images need to be filtered out of the dataset before training.

The annotations are stored in two separate CSV tables. The first table contains a one-hot encoding representing the abnormality class of each image, and the second table contains a list of all the bounding boxes, if any, in the images.
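As a rough sketch of how two such tables can be combined, the example below joins them with pandas. The column names (study_id, negative, typical, x, y, width, height) and the toy rows are made up for illustration and do not reproduce the competition files; the box table is also assumed to hold one row per box.

```python
import pandas as pd


def merge_annotations(study_labels, image_boxes, key="study_id"):
    """Left-join the boxes onto the class labels.

    A left join keeps studies that have no bounding box at all; their box
    columns are filled with NaN.
    """
    return study_labels.merge(image_boxes, on=key, how="left")


# Toy tables with hypothetical column names.
study_labels = pd.DataFrame({
    "study_id": ["s1", "s2"],
    "negative": [1, 0],
    "typical": [0, 1],
})
image_boxes = pd.DataFrame({
    "study_id": ["s2", "s2"],
    "x": [10, 60],
    "y": [20, 40],
    "width": [30, 25],
    "height": [30, 50],
})

merged = merge_annotations(study_labels, image_boxes)
```

In the merged table the study without boxes appears once with empty box columns, while a study with several boxes is repeated once per box.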

Loading DICOM images with annotations

The CT scans are provided as DICOM files, each of which includes a header with some metadata and the compressed image bitmap.

To load the image data, we use the pydicom package.
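A minimal sketch of such a loader is shown here; it is not the code from the sample repository. It assumes a single-frame grayscale image, rescales the intensities to 8 bits, and inverts images stored as MONOCHROME1. The helper names are hypothetical.

```python
import numpy as np


def to_uint8(pixels, invert=False):
    """Rescale a 2-D pixel array to the 0..255 range as uint8."""
    pixels = pixels.astype(np.float32)  # astype returns a copy
    pixels -= pixels.min()
    peak = pixels.max()
    if peak > 0:
        pixels /= peak
    if invert:
        pixels = 1.0 - pixels
    return (pixels * 255).astype(np.uint8)


def load_dicom_image(path):
    """Read one DICOM file and return an 8-bit grayscale array."""
    # Imported here so that to_uint8 can be used without pydicom installed.
    import pydicom

    dataset = pydicom.dcmread(path)
    # MONOCHROME1 means that high stored values are displayed as dark.
    photometric = getattr(dataset, "PhotometricInterpretation", None)
    return to_uint8(dataset.pixel_array, invert=(photometric == "MONOCHROME1"))
```

Calling load_dicom_image on a competition file returns an array that can be passed directly to an image viewer or saved as PNG for training.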

After loading the images, we merged the two annotation tables by Study ID. We then displayed a few example scans from each class and drew the bounding boxes where they are available in the metadata:

Visualization of several samples per class (rows) with painted detection bounding boxes from annotations.

As expected, there are no bounding boxes in negative scans. More interestingly, most of the positive scans contain more than one detection.

Additional observations on the relation between detections and labels

Let's take a look at the dataset from the perspective of counting annotated bounding boxes. As you can see from the samples per class, there are some positive cases without any COVID-19 detections.

The figure below shows how many bounding boxes there are in each image for a given class. This observation can later be used in the final aggregation heuristic.

The pie charts below show the distribution of annotated abnormalities across the different studies. As expected, there are no negative studies containing bounding box annotations.

Label histogram split by whether an image has no detections or one or more detections.

However, 5% of the images containing abnormalities are missing bounding box annotations. This means that while we can estimate a given image class using object detection, these noisy images make it challenging to model the task perfectly with this approach, which explains why many participants used a hybrid image classification/object detection approach.
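A statistic like the one above can be computed with a simple count. The sketch below uses made-up records of (class label, number of boxes); the numbers are toy values and are not the competition statistics.

```python
from collections import defaultdict


def share_without_boxes(records):
    """Per class, return the fraction of images that have zero annotated boxes.

    `records` is an iterable of (class_label, box_count) pairs, one per image.
    """
    totals = defaultdict(int)
    empty = defaultdict(int)
    for label, box_count in records:
        totals[label] += 1
        if box_count == 0:
            empty[label] += 1
    return {label: empty[label] / totals[label] for label in totals}


# Toy records, for illustration only.
records = [
    ("negative", 0), ("negative", 0),
    ("typical", 2), ("typical", 0), ("typical", 1), ("typical", 3),
]
shares = share_without_boxes(records)
```

Classes whose share is well above zero are the ones for which a pure detection-based class estimate will be unreliable.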

Flash Baseline with EfficientDet

Flash is an AI Factory for fast prototyping, baselining, fine-tuning, and solving business and scientific problems with deep learning. It is built on top of the powerful PyTorch Lightning library to facilitate training at scale.

In Flash Object Detection, models are initialized with two key arguments, backbone and head, which come from the model architecture/composition:

  • Backbone: pre-trained classification networks such as ResNet50 or EfficientNet, used for feature extraction in object detection models.
  • Head: defines the architecture of the object detector, such as Faster R-CNN, RetinaNet, etc.

Let's look at the EfficientDet head's schema to understand these arguments better, using an EfficientNet backbone.

EfficientDet architecture. EfficientDet uses EfficientNet; source: EfficientDet: Towards Scalable and Efficient Object Detection.

With Flash, we only need a few lines of code to fine-tune state-of-the-art methods like EfficientDet on our competition dataset. All you need to do is convert the Kaggle object detection labels to the COCO format, set the model and training parameters, and start training.

1. Convert Dataset to COCO format

Working with perfect annotations is quite rare in practice and mostly occurs in standard benchmarking datasets. The IceVision and Flash integration currently uses the COCO dataset format. The COCO format consists of a folder with raw images and a JSON annotation file, which contains the following metadata:

  1. Relative image paths and sizes for each image in the dataset
  2. A list of bounding boxes and their class indexes for each image
  3. A mapping between bounding box class indexes and labels

In COCO, a bounding box comprises the coordinates of its top left corner together with its width and height within the image. We need to write a custom script that converts the competition annotations to the COCO format. These steps are described in the provided notebook, and the code can be found in the provided repository.
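The structure described above can be illustrated with a small, self-contained sketch. It is not the conversion script from the repository: the input record layout, the file name, and the class name "opacity" are assumptions made for this example, and category ids are assumed to start at 1.

```python
import json


def build_coco(images, boxes, class_names):
    """Assemble a COCO-style dictionary from simple Python records.

    images: list of dicts with "file_name", "width" and "height".
    boxes: list of (file_name, class_name, x_min, y_min, x_max, y_max).
    """
    category_ids = {name: idx for idx, name in enumerate(class_names, start=1)}
    image_ids = {img["file_name"]: idx for idx, img in enumerate(images, start=1)}

    annotations = []
    for ann_id, (file_name, class_name, x_min, y_min, x_max, y_max) in enumerate(boxes, start=1):
        width, height = x_max - x_min, y_max - y_min
        annotations.append({
            "id": ann_id,
            "image_id": image_ids[file_name],
            "category_id": category_ids[class_name],
            "bbox": [x_min, y_min, width, height],  # top left corner plus size
            "area": width * height,
            "iscrowd": 0,
        })

    return {
        "images": [dict(img, id=image_ids[img["file_name"]]) for img in images],
        "annotations": annotations,
        "categories": [{"id": idx, "name": name} for name, idx in category_ids.items()],
    }


# Hypothetical single-image example.
coco = build_coco(
    images=[{"file_name": "scan_001.png", "width": 1024, "height": 768}],
    boxes=[("scan_001.png", "opacity", 100, 200, 400, 500)],
    class_names=["opacity"],
)
coco_json = json.dumps(coco, indent=2)
```

Writing coco_json to a file next to the image folder gives the kind of annotation file that COCO-based loaders expect.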

2. Create the DataModule

In Flash, the from_coco method loads the COCO images and the annotation file created in the previous step. We can provide a batch_size to fully utilize our GPU and an image_size to resize the images for our model.

Code snippet from Flash Object Detection Task training script.

3. Build the Task

To build the Object Detection Task, we select a head and a backbone. In this example, we use a state-of-the-art EfficientDet with an EfficientNet D5 backbone. We experimentally set the learning rate to 1e-5; this value could be optimized further with a hyper-parameter search.

Code snippet from training script.

4. Create the Trainer and Fine-tune the model

Flash's Trainer inherits from Lightning's Trainer, enabling us to use all the trainer flags we know and love. To train, we use the finetune method, which takes a strategy argument that configures the fine-tuning process. For example, the freeze_unfreeze strategy below freezes the backbone for the first 10 epochs to update only the head and then expands training to the entire model.

Code snippet from training script.

Using the built-in TensorBoard logger, we can observe the training process in real time and see how the training loss decreases.

At the end of training, we save the fine-tuned model so that we can use it for inference.

5. Load model and Run predictions

Now that we have a trained model, it is time to evaluate its performance. The code below shows a simple use case of how Flash can load a pre-trained model from a file and predict on the test dataset:

Code snippet from prediction script.

There you have it. In this post, you learned how to:

  1. Perform Exploratory Data Analysis on Object Detection datasets.
  2. Train end-to-end state-of-the-art Object Detection models with Flash and IceVision.
  3. Evaluate trained models to generate a Kaggle Submission for the Kaggle: COVID19 detection/classification challenge.

Stay tuned for the following stories with Lightning and Flash!

Flash 0.5: Your PyTorch AI Factory!

About the Author

Jirka Borovec holds a Ph.D. in Computer Vision from CTU in Prague. He has been working in Machine Learning and Data Science for a few years in several IT startups and companies. He enjoys exploring interesting world problems and solving them with State-of-the-Art techniques, and developing open-source projects.


Tackle COVID detection with Lightning Flash and IceVision was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.

Published via Towards AI
