APTOS 2019 Blindness Detection — Playing around with ResNeXts and Progressive NASNet

Last Updated on July 24, 2023 by Editorial Team

Author(s): Luka Chkhetiani

Originally published on Towards AI.

A while ago, Kaggle announced a challenge: APTOS 2019 Blindness Detection, with the tagline “Detect diabetic retinopathy to stop blindness before it’s too late.”

The goal of the competition is to predict the severity of diabetic retinopathy (DR) on a scale of 0–4.

0 - No DR
1 - Mild
2 - Moderate
3 - Severe
4 - Proliferative DR

The training set consists of 3662 images across 5 classes. Note that the dataset is not well balanced, so we’ll use a couple of tricks to get the most out of it, including augmentation and freezing and unfreezing layers.

I tried a number of models, such as DenseNet, ResNet50/101, Inception v3/v4 and even ResNeXt-101 32x8d, an architecture with 88M parameters and 82.2% top-1 / 96.4% top-5 accuracy on ImageNet. None of them managed to get past 66.2% accuracy on the submission set, even though some of them showed >95.0% accuracy on validation.

So, I sat down and analyzed the dataset and a couple of the models.
My takeaway: the big models overfitted and the small models underfitted, no matter how well I tuned them.

I searched through the available models once again and settled on:

ResNeXt 50 32x4d - {see paper here}
PNASNet 5 Large - {see paper here}

I decided to try both models.

I’ve added tensorboardX logging, which writes to a ‘runs’ directory during training, so we can visualize the training process in TensorBoard.

torch.nn.DataParallel?
No, I’m a victim of Google Colab for now.

My PyTorch code for ResNeXt50 training:

For PNASNet 5 Large:

I initially used Cadene’s implementation, and then adapted it for fine-tuning by simply loading the pretrained model into my own training code.

Training Set

After combining the datasets, the number of images per class looks like this:

Total - 3662 images

0 - No DR : 1805 images
1 - Mild : 370 images
2 - Moderate : 999 images
3 - Severe : 193 images
4 - Proliferative DR : 295 images

As you can see, there’s a huge difference between the classes. We’ll use augmentation, but augmenting images only helps so much with an imbalance this large.
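To put a number on the imbalance, here is a small sketch that turns the per-class counts quoted above into shares of the dataset and a largest-to-smallest ratio. The variable names are only illustrative.

```python
# Per-class image counts quoted in the list above.
counts = {
    "0 - No DR": 1805,
    "1 - Mild": 370,
    "2 - Moderate": 999,
    "3 - Severe": 193,
    "4 - Proliferative DR": 295,
}

total = sum(counts.values())

# Share of the dataset held by each class.
for label, n in counts.items():
    print(f"{label:<22}{n:>5}  {n / total:6.1%}")

# How many times larger the biggest class is than the smallest one.
imbalance_ratio = max(counts.values()) / min(counts.values())
print(f"largest/smallest class ratio: {imbalance_ratio:.1f}x")
```

Class 0 alone holds roughly half of the images, while class 3 holds about 5%, which is worth keeping in mind when reading the accuracy figures later on.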

A simple script that sorts the images into per-class folders after unzipping them would be:
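A minimal sketch of such a script (not the original code from this post). It assumes a labels CSV with id_code and diagnosis columns and PNG images sitting in a single folder; the function name, the file extension and the folder layout are assumptions made for illustration.

```python
import csv
import shutil
from pathlib import Path


def sort_images_by_class(csv_path, image_dir, out_dir, ext=".png"):
    """Copy each image into out_dir/<diagnosis>/ based on the labels CSV.

    Assumes the CSV has 'id_code' and 'diagnosis' columns (assumed layout).
    Returns the number of images copied; rows whose file is missing are skipped.
    """
    image_dir, out_dir = Path(image_dir), Path(out_dir)
    copied = 0
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            src = image_dir / f"{row['id_code']}{ext}"
            if not src.exists():
                continue
            class_dir = out_dir / row["diagnosis"]
            class_dir.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, class_dir / src.name)
            copied += 1
    return copied
```

Calling sort_images_by_class("train.csv", "train_images", "data/train") with paths adjusted to your setup would produce data/train/0 ... data/train/4, a layout that torchvision's ImageFolder can read directly.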

Data Augmentation

The dataset is not very rich: we have 3662 images in total. So, I’m going to use torchvision’s data augmentation transforms, such as:

transforms.RandomVerticalFlip(p=0.5),

transforms.RandomRotation((0,360), center=None),

transforms.RandomHorizontalFlip(p=0.5)

Random vertical flip with probability 0.5.
Random rotation by an angle between 0 and 360 degrees (basically allowing a full rotation).
Random horizontal flip with probability 0.5.

PyTorch’s data transforms work perfectly for this. They are applied on the fly every time an image is loaded, so in each epoch the model sees one randomly flipped and rotated version of every image, and a different version in the next epoch.

Data Resizing

ResNeXt is being trained on images resized to 299×299. PNASNet, however, requires 331×331 inputs, so I’m going to modify the code accordingly.

Little UX:

In case we want the process to run in the background, and not have a laptop/PC up all night, we can use nohup.

The command will be:

nohup python3 train.py &

And we can view the captured output via:

cat nohup.out

Or, tail them via:

tail -f nohup.out

Additionally, I really didn’t want to mess with ngrok. So, I used Python’s subprocess module to copy the TensorBoard logs from the remote machine every 20 seconds, which keeps the local TensorBoard view up to date.

from time import sleep
import subprocess

for i in range(10000):
    subprocess.run('scp user@ip_address:~/APTOS/runs/Aug30_21-58-43/* runs/', shell=True)
    sleep(20)

Let’s start training.

Plan:

Train the ResNeXt for 3 and the PNASNet for 2 epochs with learning rate 1e-3.
Train the ResNeXt and the PNASNet for 3 epochs with learning rate 1e-4.
ResNeXt: freeze all layers except layers 3 and 4, lower the lr to 1e-5, then unfreeze the remaining blocks step by step, decreasing the lr by 10x each time.
PNASNet: freeze everything except cell_9, cell_10 and cell_11 and continue training with lr 1e-5. Afterward, unfreeze the cells step by step and anneal the lr by 10x.
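The freezing steps of the plan can be sketched with requires_grad flags. This is a minimal illustration, not the training code from this post: the tiny stand-in network, the helper names freeze_all_but and make_optimizer, the choice of Adam, and keeping the classifier head (fc) trainable are all assumptions. torchvision's ResNeXt names its blocks layer1 to layer4, so the same helper should apply there; for PNASNet one would pass the cell names from the plan instead.

```python
import torch
from torch import nn


def freeze_all_but(model: nn.Module, trainable_blocks):
    """Leave only the listed top-level blocks trainable; freeze everything else."""
    for name, param in model.named_parameters():
        block = name.split(".")[0]  # e.g. "layer3.weight" -> "layer3"
        param.requires_grad = block in trainable_blocks


def make_optimizer(model: nn.Module, lr: float):
    """Build an optimizer over the currently trainable parameters only."""
    trainable = [p for p in model.parameters() if p.requires_grad]
    return torch.optim.Adam(trainable, lr=lr)


# Tiny stand-in network (hypothetical) that mimics ResNet-style block names.
model = nn.Sequential()
for block_name in ["layer1", "layer2", "layer3", "layer4"]:
    model.add_module(block_name, nn.Linear(8, 8))
model.add_module("fc", nn.Linear(8, 5))  # 5 DR grades

# Step 3 of the plan: train only the deepest blocks at lr 1e-5.
freeze_all_but(model, {"layer3", "layer4", "fc"})
optimizer = make_optimizer(model, lr=1e-5)

# Next step: unfreeze one more block and lower the lr by 10x.
freeze_all_but(model, {"layer2", "layer3", "layer4", "fc"})
optimizer = make_optimizer(model, lr=1e-6)
```

Rebuilding the optimizer after each change keeps its parameter list in sync with what is actually trainable.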

Why?

While playing around with those two models, I noticed that they reached decent accuracy in the first two epochs with lr 1e-3, but when training continued with the same parameters, accuracy and loss hovered around the same values.
Decreasing the lr by 10x helps a lot to increase accuracy and keep the loss going down. 4 epochs are perfectly enough for the whole model to get to know the dataset and learn enough features.

After that, training only the deepest layers, which are responsible for the higher-level features, refines the model further and speeds up the training procedure. In other words: I’m giving the model some time to familiarize itself with the dataset and study it, and afterward concentrating it on the most powerful components of the dataset.

1st Part

Learning rate 1e-3, all layers, 2 epochs.

PNASNet 5 Large

Epoch 1/50.. Train loss: 3.503.. Test loss: 5.913.. Test accuracy: 0.011 
Epoch 1/50.. Train loss: 1.449.. Test loss: 1.084.. Test accuracy: 0.697 
Epoch 1/50.. Train loss: 0.806.. Test loss: 0.673.. Test accuracy: 0.745 
Epoch 1/50.. Train loss: 0.630.. Test loss: 0.606.. Test accuracy: 0.776 
Epoch 1/50.. Train loss: 0.681.. Test loss: 0.591.. Test accuracy: 0.773 
Epoch 2/50.. Train loss: 0.610.. Test loss: 0.546.. Test accuracy: 0.796 
Epoch 2/50.. Train loss: 0.711.. Test loss: 0.564.. Test accuracy: 0.792 
Epoch 2/50.. Train loss: 0.474.. Test loss: 0.582.. Test accuracy: 0.790 
Epoch 2/50.. Train loss: 0.484.. Test loss: 0.539.. Test accuracy: 0.807 
Epoch 2/50.. Train loss: 0.555.. Test loss: 0.527.. Test accuracy: 0.811 
Epoch 3/50.. Train loss: 0.559.. Test loss: 0.520.. Test accuracy: 0.814 
Epoch 3/50.. Train loss: 0.455.. Test loss: 0.507.. Test accuracy: 0.821 
Epoch 3/50.. Train loss: 0.572.. Test loss: 0.486.. Test accuracy: 0.822 
Epoch 3/50.. Train loss: 0.408.. Test loss: 0.520.. Test accuracy: 0.831 
Epoch 3/50.. Train loss: 0.546.. Test loss: 0.466.. Test accuracy: 0.829

ResNeXt 50 32x4d

Epoch 1/50.. Train loss: 2.787.. Test loss: 3.990.. Test accuracy: 0.254 
Epoch 1/50.. Train loss: 1.460.. Test loss: 0.893.. Test accuracy: 0.684 
Epoch 1/50.. Train loss: 0.722.. Test loss: 0.654.. Test accuracy: 0.738 
Epoch 2/50.. Train loss: 0.817.. Test loss: 0.640.. Test accuracy: 0.771 
Epoch 2/50.. Train loss: 0.590.. Test loss: 0.533.. Test accuracy: 0.807 
Epoch 2/50.. Train loss: 0.517.. Test loss: 0.496.. Test accuracy: 0.820 
Epoch 3/50.. Train loss: 0.579.. Test loss: 0.559.. Test accuracy: 0.772 
Epoch 3/50.. Train loss: 0.470.. Test loss: 0.493.. Test accuracy: 0.809 
Epoch 3/50.. Train loss: 0.512.. Test loss: 0.477.. Test accuracy: 0.836

2nd Part

All Layers

PNASNet 5 Large

Epoch 1/50.. Train loss: 0.472.. Test loss: 0.402.. Test accuracy: 0.848 
Epoch 1/50.. Train loss: 0.452.. Test loss: 0.370.. Test accuracy: 0.857 
Epoch 1/50.. Train loss: 0.474.. Test loss: 0.383.. Test accuracy: 0.840 
Epoch 1/50.. Train loss: 0.402.. Test loss: 0.383.. Test accuracy: 0.855 
Epoch 2/50.. Train loss: 0.552.. Test loss: 0.398.. Test accuracy: 0.848 
Epoch 2/50.. Train loss: 0.453.. Test loss: 0.378.. Test accuracy: 0.867 
Epoch 2/50.. Train loss: 0.431.. Test loss: 0.391.. Test accuracy: 0.846 
Epoch 2/50.. Train loss: 0.307.. Test loss: 0.379.. Test accuracy: 0.857 
Epoch 3/50.. Train loss: 0.453.. Test loss: 0.378.. Test accuracy: 0.855 
Epoch 3/50.. Train loss: 0.372.. Test loss: 0.376.. Test accuracy: 0.851 
Epoch 3/50.. Train loss: 0.429.. Test loss: 0.380.. Test accuracy: 0.862 
Epoch 3/50.. Train loss: 0.408.. Test loss: 0.385.. Test accuracy: 0.855

ResNeXt 50

Epoch 1/50.. Train loss: 0.230.. Test loss: 0.423.. Test accuracy: 0.837 
Epoch 1/50.. Train loss: 0.454.. Test loss: 0.416.. Test accuracy: 0.846 
Epoch 1/50.. Train loss: 0.462.. Test loss: 0.410.. Test accuracy: 0.841 
Epoch 2/50.. Train loss: 0.458.. Test loss: 0.416.. Test accuracy: 0.837 
Epoch 2/50.. Train loss: 0.437.. Test loss: 0.386.. Test accuracy: 0.854 
Epoch 2/50.. Train loss: 0.407.. Test loss: 0.406.. Test accuracy: 0.846 
Epoch 3/50.. Train loss: 0.425.. Test loss: 0.401.. Test accuracy: 0.845 
Epoch 3/50.. Train loss: 0.424.. Test loss: 0.394.. Test accuracy: 0.836
Epoch 3/50.. Train loss: 0.552.. Test loss: 0.398.. Test accuracy: 0.846

After freezing & unfreezing layers and tuning the lr

PNASNet 5 Large

Epoch 1/50.. Train loss: 0.220.. Test loss: 0.344.. Test accuracy: 0.884 
Epoch 1/50.. Train loss: 0.477.. Test loss: 0.361.. Test accuracy: 0.873 
Epoch 1/50.. Train loss: 0.428.. Test loss: 0.345.. Test accuracy: 0.881 
Epoch 1/50.. Train loss: 0.410.. Test loss: 0.358.. Test accuracy: 0.872 
Epoch 1/50.. Train loss: 0.403.. Test loss: 0.356.. Test accuracy: 0.874 
Epoch 1/50.. Train loss: 0.414.. Test loss: 0.332.. Test accuracy: 0.884

ResNeXt 50

Epoch 1/50.. Train loss: 0.176.. Test loss: 0.354.. Test accuracy: 0.876 
Epoch 1/50.. Train loss: 0.332.. Test loss: 0.370.. Test accuracy: 0.865 
Epoch 2/50.. Train loss: 0.401.. Test loss: 0.361.. Test accuracy: 0.865 
Epoch 2/50.. Train loss: 0.376.. Test loss: 0.366.. Test accuracy: 0.866 
Epoch 2/50.. Train loss: 0.342.. Test loss: 0.354.. Test accuracy: 0.870 
Epoch 3/50.. Train loss: 0.399.. Test loss: 0.372.. Test accuracy: 0.870 
Epoch 3/50.. Train loss: 0.330.. Test loss: 0.349.. Test accuracy: 0.875

Results

We have 87.5% accuracy with ResNeXt 50 and 88.4% with PNASNet 5.
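These figures are the last accuracy values in the logs above. As a small sketch, the log format printed during training can be parsed with a regular expression to pull such numbers out of nohup.out automatically; the function and field names here are made up for illustration.

```python
import re

# Matches lines such as:
# Epoch 3/50.. Train loss: 0.330.. Test loss: 0.349.. Test accuracy: 0.875
LOG_LINE = re.compile(
    r"Epoch (\d+)/(\d+)\.\. Train loss: (\d+\.\d+)\.\. "
    r"Test loss: (\d+\.\d+)\.\. Test accuracy: (\d+\.\d+)"
)


def parse_log(text):
    """Return one dict per log line that matches the training output format."""
    rows = []
    for m in LOG_LINE.finditer(text):
        rows.append({
            "epoch": int(m.group(1)),
            "train_loss": float(m.group(3)),
            "test_loss": float(m.group(4)),
            "test_acc": float(m.group(5)),
        })
    return rows


# Last two ResNeXt 50 lines of the final stage, copied from the log above.
sample = """
Epoch 3/50.. Train loss: 0.399.. Test loss: 0.372.. Test accuracy: 0.870
Epoch 3/50.. Train loss: 0.330.. Test loss: 0.349.. Test accuracy: 0.875
"""
rows = parse_log(sample)
print(f"final accuracy: {rows[-1]['test_acc']:.1%}")  # final accuracy: 87.5%
```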

Basically, I’ve tried SOTA models, mediocre ones, and lastly two of the top architectures for image classification.
And, as we know, models don’t perform exactly the same way in real life as they do in test/validation procedures.

Anyway, tuning two models overnight was fun. Lastly, I’ll make predictions on the actual test (submission) set, and we’ll see the results.

Prediction

My inference code for the prediction part:

If we’re using Google Colab for prediction, note that tqdm is sometimes not a great option, because it refreshes stdout on every update and the page can crash. I always try tqdm first, and if it doesn’t work well, I remove it from the loop and write my own progress output to follow the prediction process. We should also make sure to delete the *.csv file after the first try, or the new predictions will be appended to it.
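A sketch of a tqdm-free prediction loop that addresses both points: it prints progress only every few images, and it opens the CSV in write mode so a rerun never appends to old rows. The function name, the predict_one callable and the id_code/diagnosis column names are assumptions for illustration, not the code used in this post.

```python
import csv
import os
import tempfile


def write_predictions(image_ids, predict_one, out_path, log_every=100):
    """Write id/prediction rows to a fresh CSV, printing sparse progress."""
    # Mode "w" truncates any file left over from an earlier run,
    # so predictions are never appended to stale rows.
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["id_code", "diagnosis"])
        for i, image_id in enumerate(image_ids, start=1):
            writer.writerow([image_id, predict_one(image_id)])
            if i % log_every == 0 or i == len(image_ids):
                print(f"{i}/{len(image_ids)} images predicted")


# Stand-in predictor (hypothetical); a real one would load the image
# and run it through the trained model.
demo_path = os.path.join(tempfile.gettempdir(), "submission_demo.csv")
write_predictions(["img_a", "img_b", "img_c"], lambda _id: 0, demo_path, log_every=2)
```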

Kaggle

While trying to make a submission after the prediction part, Kaggle made me furious. Basically, there was a bug in the kernel that threw a submission error every time I tried to submit the predictions.
After searching for a while, I found one Kaggler’s comment that actually helped me.

P.S. Turning off the GPU and Internet in the kernel also helped, as did downgrading the Python docker image to 1–7 versions on Kaggle.

I’m going to borrow their code for the submission part:

Borrowed from a comment by https://www.kaggle.com/kinnachen

Final Results:

ResNeXt 50

PNASNet 5

When I checked why PNASNet worked so poorly, I noticed many 0’s and 2’s and a couple of 1’s among the predictions in the submission, and absolutely no 3’s or 4’s.
PNASNet overfitted on classes 0 and 2, since they hold most of the data, and made poor predictions on the other classes, or none at all.
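This kind of check is easy to automate. A minimal sketch that counts how often each grade appears in a submission file and lists the grades that were never predicted; the column name diagnosis is an assumption, and the rows below are made up for illustration, not the actual predictions from this post.

```python
import csv
import io
from collections import Counter


def prediction_histogram(csv_file, num_classes=5):
    """Count how often each grade 0..num_classes-1 appears in a submission CSV."""
    counts = Counter(int(row["diagnosis"]) for row in csv.DictReader(csv_file))
    return {grade: counts.get(grade, 0) for grade in range(num_classes)}


# Made-up rows for illustration only; NOT the post's real predictions.
demo = io.StringIO("id_code,diagnosis\na,0\nb,2\nc,0\nd,1\ne,2\n")
hist = prediction_histogram(demo)
missing = [grade for grade, n in hist.items() if n == 0]
print(hist)     # {0: 2, 1: 1, 2: 2, 3: 0, 4: 0}
print(missing)  # [3, 4]
```

Running such a check on the validation predictions before submitting would reveal a model that has collapsed onto the majority classes, even when the overall accuracy looks fine.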

And ResNeXt 50 turned out to work well for just one night of tuning.

There are 3 more days until the challenge closes, and I’m planning to try more promising approaches as soon as I have time.

Hope you enjoyed it!

UPDATE:

By cropping the dataset images and training the last layers longer, I got a 4.1% improvement.

2nd UPDATE

Mean color subtraction gave a 2.0% improvement.

Published via Towards AI
