Tackle COVID detection with Lightning Flash and IceVision
Last Updated on January 7, 2023 by Editorial Team
Author(s): Jirka Borovec
Tackling the Kaggle COVID Detection Challenge with Lightning Flash and IceVision
This post walks through how we approached the Kaggle SIIM-FISABIO-RSNA COVID-19 Detection challenge using Lightning Flash and its new support for IceVision's rich collection of models and backbones.
Object Detection is a Computer Vision task that aims to detect and classify individual objects in a scene or image. There are various model architectures for Object Detection, but the two most common are region proposal (e.g. Fast/Faster R-CNN) and single-shot detection (e.g. SSD, YOLO).
The Lightning Flash team recently released an exciting new integration with the IceVision Object Detection library that enables dozens of new state-of-the-art backbones that can be fine-tuned and used for inference in just a few lines of code.
Object Detection – Flash documentation
We participated in the Kaggle: COVID19 detection/classification challenge to showcase the new integration, which presents a realistic and challenging dataset of CT scans from over six thousand patients.
Check out our Kaggle kernel: COVID detection with Lightning Flash ⚡️
All code snapshots and visualizations are part of this sample repository and can be installed with pip install https://github.com/Borda/kaggle_COVID-detection/archive/refs/heads/main.zip
COVID Detection Challenge
The recent Kaggle: COVID19 detection/classification challenge aims to facilitate medical screening and eventually assist medical experts with making diagnoses. The challenge is a hybrid image classification / object detection task: first, you need to identify which of four appearance classes a given CT scan belongs to, and then provide a collection of bounding boxes indicating where the abnormalities are present to justify the diagnosis.
While the task is presented as a hybrid Detection and Classification task, it can also be modelled as a traditional object detection problem where the image class is determined by taking the majority class of the detected abnormalities.
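As a concrete sketch, that aggregation can be a simple majority vote over the labels of the detected boxes. The helper below is an illustration of the idea, not the authors' exact heuristic; the class names are placeholders:

```python
from collections import Counter

def image_class_from_detections(detected_labels, default="negative"):
    """Aggregate per-box class labels into one image-level class by majority vote.

    If the detector finds nothing, fall back to the default (negative) class.
    """
    if not detected_labels:
        return default
    # most_common(1) returns [(label, count)] for the most frequent label.
    return Counter(detected_labels).most_common(1)[0][0]
```

For example, an image with two "typical" boxes and one "atypical" box would be classified as "typical".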
A Study represents a unique anonymized patient with a single Computed Tomography (CT) scan. Each scan should be represented by a single DICOM image. It is important to note that some annotated images were accidentally incompletely labelled. These images and their annotations were later duplicated and fixed, as discussed in this thread. As part of the challenge, incomplete annotations and duplicate images need to be filtered out of the dataset before training.
The annotations are stored in two separate CSV tables. The first table contains a one-hot encoding representing the abnormality class of each image, and the second table contains a list of all the bounding boxes, if any, in the images.
Loading DICOM images with annotations
Each CT scan is provided as a DICOM file, which includes a header with metadata and the compressed image bitmap.
To load the image data, we use the pydicom package; the following sample code shows how:
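A minimal sketch of such a loader is shown below. pydicom's dcmread and pixel_array are the library's real API; the min-max normalisation helper is our own illustration for getting a displayable image:

```python
import numpy as np

def normalize_to_uint8(image: np.ndarray) -> np.ndarray:
    """Min-max scale a float array into the 0-255 uint8 range for display."""
    image = image - image.min()
    image = image / max(float(image.max()), 1e-8)
    return (image * 255).astype(np.uint8)

def load_dicom_image(path: str) -> np.ndarray:
    """Read a DICOM file and return its pixel data as a displayable uint8 array."""
    import pydicom  # imported lazily so the helper above works without pydicom

    dicom = pydicom.dcmread(path)
    return normalize_to_uint8(dicom.pixel_array.astype(np.float32))
```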
After loading the images, we merged the two annotation tables by Study ID and displayed a few example scans from each class, drawing bounding boxes where they are available in the metadata.
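The merge itself can be sketched with pandas. The column names below (id, StudyInstanceUID, and the _study suffix on study IDs) are modelled on the competition CSVs but should be treated as assumptions:

```python
import pandas as pd

def merge_annotations(df_images: pd.DataFrame, df_studies: pd.DataFrame) -> pd.DataFrame:
    """Join image-level rows (with boxes) to study-level one-hot class labels."""
    df_studies = df_studies.copy()
    # The study table's "id" carries a "_study" suffix; strip it so it matches
    # the StudyInstanceUID column of the image-level table.
    df_studies["StudyInstanceUID"] = df_studies["id"].str.replace("_study", "", regex=False)
    return df_images.merge(df_studies.drop(columns="id"), on="StudyInstanceUID", how="left")
```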
As expected, there are no bounding boxes in negative scans. More interestingly, most of the positive scans contain more than one detection per scan.
Additional observations on the relation between detections and labels
Let's take a look at the dataset from the perspective of counting annotated bounding boxes. As you can see from the samples per class, there are some positive cases without any COVID-19 detections.
The figure below shows how many bounding boxes each image contains for a given class. This observation can eventually be used in the final aggregation heuristic.
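Counting boxes per image is straightforward once the merged table is in hand. In the competition CSV the boxes column is a stringified list of dicts; the helper below is our own sketch of that counting step:

```python
import ast

import pandas as pd

def count_boxes(boxes_field) -> int:
    """Count bounding boxes in the competition's stringified list-of-dicts field."""
    if pd.isna(boxes_field):
        return 0  # images without any annotated boxes have an empty field
    return len(ast.literal_eval(boxes_field))

# Usage on the merged table: df["n_boxes"] = df["boxes"].apply(count_boxes)
```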
The pie charts below show the distribution of annotated abnormalities across the different studies. As expected, there are no negative studies containing bounding box annotations.
However, 5% of the images containing abnormalities are missing bounding box annotations. This means that while we can estimate a given image class using object detection, these noisy images make it challenging to model the task perfectly with this approach, which explains why many participants used a hybrid image classification/object detection approach.
Flash Baseline with EfficientDet
Flash is an AI Factory for fast prototyping, baselining, and fine-tuning deep learning models to solve business and scientific problems. It is built on top of the powerful PyTorch Lightning library to facilitate training at scale.
In Flash Object Detection, models are initialized with two key arguments, Backbone and Head, which come from the model architecture/composition:
- Backbone: a pre-trained classification network such as ResNet-50 or EfficientNet, used for feature extraction in object detection models.
- Head: defines the architecture of the Object Detector, such as Faster R-CNN, RetinaNet, etc.
Let's look at the EfficientDet head's schema to understand these arguments better, using an EfficientNet backbone.
With Flash, we only need a few lines of code to fine-tune state-of-the-art methods like EfficientDet on our competition dataset. All you need to do is convert the Kaggle object detection labels to the COCO format, set the model and training parameters, and start training.
1. Convert Dataset to COCO format
Working with perfect annotations in practice is quite rare, and often it only occurs in standard/benchmarking datasets. The IceVision and Flash integration currently uses the COCO dataset format. The COCO format comprises a folder with raw images and a JSON annotation file, which contains the following metadata:
- Relative image paths and sizes for each image in the dataset
- A list of bounding boxes and their class indexes for each image
- A mapping between bounding box class indexes and labels
In COCO, a bounding box is given by the coordinates of its top-left corner plus its width and height within the image. We need a custom script to convert the competition annotations to the COCO format. These steps are described in the provided notebook, and the code can be found in the provided repository.
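The core of that conversion is the change of box coordinates. A minimal sketch, assuming the competition boxes come as (x_min, y_min, x_max, y_max) corner pairs (the helper names are our own, not from the repository):

```python
def corners_to_coco(x_min: float, y_min: float, x_max: float, y_max: float) -> list:
    """Convert (x_min, y_min, x_max, y_max) corners to COCO [x, y, width, height]."""
    return [x_min, y_min, x_max - x_min, y_max - y_min]

def make_coco_annotation(ann_id: int, image_id: int, category_id: int, box: list) -> dict:
    """Build one entry for the 'annotations' list of a COCO JSON file."""
    x, y, w, h = box
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        "bbox": [x, y, w, h],
        "area": w * h,
        "iscrowd": 0,
    }
```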
2. Create the DataModule
In Flash, the from_coco method loads the COCO images and the annotation file created in the last step. We can provide a batch_size to fully utilize our GPU and an image_size to resize our images for our model.
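The call looks roughly like this; keyword names follow the Flash 0.5 release of this integration and may differ slightly in other versions, and the paths and values are placeholders:

```python
from flash.image import ObjectDetectionData

datamodule = ObjectDetectionData.from_coco(
    train_folder="dataset-coco/images",        # folder with the raw images
    train_ann_file="dataset-coco/train.json",  # COCO annotation file from step 1
    val_split=0.1,    # hold out 10% of the data for validation
    batch_size=8,     # as large as GPU memory allows
    image_size=512,   # resize images before feeding them to the model
)
```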
3. Build the Task
To build the Object Detection Task, we select a model head and backbone. In this example, we use a state-of-the-art EfficientDet head with an EfficientNet D5 backbone. We experimentally set the learning rate to 1e-5; this value could be further optimized with a hyper-parameter search.
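A sketch of the task construction; the head and backbone names follow the Flash-IceVision integration as of Flash 0.5, and exact argument names may vary between releases:

```python
from flash.image import ObjectDetector

model = ObjectDetector(
    head="efficientdet",
    backbone="d5",  # EfficientDet-D5, built on an EfficientNet backbone
    num_classes=datamodule.num_classes,  # taken from the DataModule of step 2
    image_size=512,
    learning_rate=1e-5,
)
```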
4. Create the Trainer and Fine-tune the model
Flash's Trainer inherits from Lightning's Trainer, enabling us to leverage all the trainer flags we know and love. To train, we use the finetune method, which takes a strategy argument that configures the fine-tuning process. For example, the freeze_unfreeze strategy below freezes the backbone for the first 10 epochs to update only the head and then expands training to the entire model.
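The training loop then reduces to a few lines; the epoch counts and checkpoint name below are illustrative choices, not the authors' exact settings:

```python
import flash

trainer = flash.Trainer(max_epochs=30, gpus=1)  # any Lightning Trainer flag works here
# Freeze the backbone for the first 10 epochs, then train the whole network.
trainer.finetune(model, datamodule=datamodule, strategy=("freeze_unfreeze", 10))
trainer.save_checkpoint("covid_detection_model.pt")
```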
Using the built-in TensorBoard logger, we can observe the training process in real time and watch the training loss decrease.
At the end of training, we save the fine-tuned model so that we can use it for inference.
5. Load model and Run predictions
Now that we have a trained model, it's time to evaluate its performance. The code below shows a simple use case of how Flash can load a trained model from a file and predict on the test dataset:
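A sketch of that inference step, assuming the Flash 0.5 API and placeholder file paths; newer Flash releases route prediction through Trainer.predict instead of the task's own predict method:

```python
from flash.image import ObjectDetector

# Restore the fine-tuned weights saved after training.
model = ObjectDetector.load_from_checkpoint("covid_detection_model.pt")

# In Flash 0.5 a task can predict directly on a list of image paths.
predictions = model.predict(["test/image_001.png", "test/image_002.png"])
```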
There you have it; in this post, you learned how to:
- Perform Exploratory Data Analysis on Object Detection datasets.
- Train end-to-end state-of-the-art Object Detection models with Flash and IceVision.
- Evaluate trained models to generate a Kaggle Submission for the Kaggle: COVID19 detection/classification challenge.
Stay tuned for the following stories with Lightning and Flash!
Flash 0.5: Your PyTorch AI Factory!
About the Author
Jirka Borovec holds a Ph.D. in Computer Vision from CTU in Prague. He has been working in Machine Learning and Data Science for a few years in several IT startups and companies. He enjoys exploring interesting world problems and solving them with State-of-the-Art techniques, and developing open-source projects.
Tackle COVID detection with Lightning Flash and IceVision was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.