
Supervised Contrastive Learning for Cassava Leaf Disease Classification

Last Updated on January 26, 2021 by Editorial Team

Author(s): Dimitre Oliveira

Deep Learning

Applying deep learning with supervised contrastive learning to detect diseases on cassava leaves.

Photo by malmanxx on Unsplash

Supervised Contrastive Learning (Prannay Khosla et al.) is a training methodology that outperforms supervised training with cross-entropy on classification tasks.
The idea is that training models with Supervised Contrastive Learning (SCL) can make the model's encoder learn better class representations from the samples, which should lead to better generalization and robustness to image and label corruption.

In this article, you will learn what supervised contrastive learning is and how it works, see a code implementation and a use-case application, and finally a comparison between SCL and regular cross-entropy.

In short, this is how SCL works:

Clusters of points belonging to the same class are pulled together in embedding space, while simultaneously pushing apart clusters of samples from different classes.

There are many contrastive learning methods, like "supervised contrastive learning", "self-supervised contrastive learning", "SimCLR", and others. The contrastive part they have in common is that they learn to contrast (push apart) samples from one domain against samples from other domains; what sets SCL apart is that it leverages label information in a supervised way for this task.
For more detailed information, check out the paper.
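
For reference, the loss minimized in the contrastive stage (the "out" formulation from the paper) can be written roughly as follows, where z_i is the normalized projection-head output for sample i, P(i) is the set of other samples in the batch sharing i's label, A(i) is the set of all other samples in the batch, and τ is the temperature:

\mathcal{L}^{sup} = \sum_{i \in I} \frac{-1}{|P(i)|} \sum_{p \in P(i)} \log \frac{\exp(z_i \cdot z_p / \tau)}{\sum_{a \in A(i)} \exp(z_i \cdot z_a / \tau)}

Intuitively, each sample is pulled toward the other samples that share its label (the numerator) and pushed away from everything else in the batch (the denominator).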

Architectures of the different training methods.

Essentially, training a classification model with Supervised Contrastive Learning is performed in two phases:

  1. Training an encoder to learn to produce vector representations of input images such that representations of images in the same class will be more similar compared to representations of images in different classes.
  2. Training a classifier on top of the frozen encoder.

The use case

We are going to apply Supervised Contrastive Learning to a dataset from a Kaggle competition (Cassava Leaf Disease Classification). The objective is to classify images of leaves from cassava plants into 5 categories:

0: Cassava Bacterial Blight (CBB)
1: Cassava Brown Streak Disease (CBSD)
2: Cassava Green Mottle (CGM)
3: Cassava Mosaic Disease (CMD)
4: Healthy

We have four kinds of diseases and one category for healthy leaves; here are some image samples:

Cassava leaves image samples from the competition.

For more information about cassava leaf diseases, check out this link from PlantVillage.

The data has 21,397 images for training and around 15,000 for the test set.

Experiment set-up

- The data: images with a resolution of 512 x 512 pixels.
- The model (encoder): EfficientNet B3.
Note: you can check the full code here.

Usually, contrastive learning methods work better if each training batch has a sample of each class; this helps the encoder learn to contrast samples of one domain against the other domains batch-wise. This means using a large batch size, and in this case, I have also oversampled the minority classes, so each batch has roughly the same probability of containing samples from each class.
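
As an illustration, here is a minimal sketch of one way to get roughly class-balanced batches with tf.data by sampling from per-class datasets with equal weights; the helper names are hypothetical, and this is not necessarily how the reference notebook implements the oversampling:

import tensorflow as tf

def balanced_dataset(class_datasets, batch_size):
    # class_datasets: one (repeated) tf.data.Dataset per class
    # Equal sampling weights make every class roughly equally likely in each batch
    weights = [1.0 / len(class_datasets)] * len(class_datasets)
    balanced = tf.data.experimental.sample_from_datasets(class_datasets, weights)
    return balanced.batch(batch_size).prefetch(tf.data.experimental.AUTOTUNE)

# Hypothetical usage: minority classes are effectively oversampled because each
# per-class dataset is repeated and then sampled with the same probability
# per_class = [make_dataset_for_class(c).repeat() for c in range(5)]
# train_ds = balanced_dataset(per_class, batch_size=64)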

Class distribution of the dataset after oversampling.

Data augmentation usually helps computer vision tasks, and during my experiments I also saw improvements from it. Here I am using shear, rotation, flips, crops, cutout, and changes in saturation, contrast, and brightness. It may seem like a lot, but the images don't get too different from the original ones.
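
For illustration, here is a minimal sketch of such an augmentation function using plain tf.image ops (flips, crop, and color jitter), assuming images are float tensors in [0, 1]; shear, rotation, and cutout would typically come from tensorflow_addons or Keras preprocessing layers, and the parameter values below are placeholders rather than the ones used in the notebook:

import tensorflow as tf

def augment(image):
    # Geometric augmentations: random flips plus a resize-and-random-crop
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_flip_up_down(image)
    image = tf.image.resize(image, [560, 560])
    image = tf.image.random_crop(image, size=[512, 512, 3])
    # Color augmentations: saturation, contrast, and brightness jitter
    image = tf.image.random_saturation(image, 0.8, 1.2)
    image = tf.image.random_contrast(image, 0.8, 1.2)
    image = tf.image.random_brightness(image, 0.1)
    # Keep pixel values in a valid range after the color jitter
    return tf.clip_by_value(image, 0.0, 1.0)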

Augmented data samples.

Now we can look at the code.

Encoder

Our encoder will be an "EfficientNet B3", but with an average pooling layer at the top of the encoder. This pooling layer outputs a vector of size 2048, which will later be used to inspect the representation learned by the encoder.

import efficientnet.tfkeras as efn  # EfficientNet Keras package (provides the "efn" alias)
import tensorflow.keras.layers as L
from tensorflow.keras.models import Model

def encoder_fn(input_shape):
    inputs = L.Input(shape=input_shape, name='inputs')
    # EfficientNet B3 backbone with noisy-student weights and average pooling at the top
    base_model = efn.EfficientNetB3(input_tensor=inputs,
                                    include_top=False,
                                    weights='noisy-student',
                                    pooling='avg')

    model = Model(inputs=inputs, outputs=base_model.outputs)
    return model

Projection head

The projection head is placed on top of the encoder and is responsible for projecting the output of the encoder's embedding layer into a smaller dimension; in our case, it projects the 2048-dimensional encoder output into a 128-dimensional vector.

def add_projection_head(input_shape, encoder):
    inputs = L.Input(shape=input_shape, name='inputs')
    features = encoder(inputs)
    outputs = L.Dense(128, activation='relu',
                      name='projection_head',
                      dtype='float32')(features)

    model = Model(inputs=inputs, outputs=outputs)
    return model

Classifier head

The classifier head is used for the optional second stage of the training. After the SCL training stage, we can remove the projection head, add this classifier head to the encoder, and fine-tune the model with the regular cross-entropy loss; this should be done with the encoder's layers frozen.

def classifier_fn(input_shape, N_CLASSES, encoder, trainable=False):
    # Freeze (or unfreeze) the encoder layers before adding the classifier head
    for layer in encoder.layers:
        layer.trainable = trainable

    inputs = L.Input(shape=input_shape, name='inputs')

    features = encoder(inputs)
    features = L.Dropout(.5)(features)
    features = L.Dense(1000, activation='relu')(features)
    features = L.Dropout(.5)(features)
    outputs = L.Dense(N_CLASSES, activation='softmax',
                      name='outputs', dtype='float32')(features)

    model = Model(inputs=inputs, outputs=outputs)
    return model

Supervised Contrastive Learning loss

This is the code implementation of the SCL loss. The only parameter here is the temperature; "0.1" is the default value, but it can be tweaked. Larger temperatures can result in classes that are more separated, while smaller temperatures can benefit from longer training.

import tensorflow as tf
import tensorflow_addons as tfa
from tensorflow.keras import losses

class SupervisedContrastiveLoss(losses.Loss):
    def __init__(self, temperature=0.1, name=None):
        super(SupervisedContrastiveLoss, self).__init__(name=name)
        self.temperature = temperature

    def __call__(self, labels, ft_vectors, sample_weight=None):
        # Normalize feature vectors
        ft_vec_normalized = tf.math.l2_normalize(ft_vectors, axis=1)
        # Compute pairwise similarity logits, scaled by the temperature
        logits = tf.divide(
            tf.matmul(ft_vec_normalized,
                      tf.transpose(ft_vec_normalized)
            ), self.temperature
        )
        return tfa.losses.npairs_loss(tf.squeeze(labels), logits)

"tfa" is the alias for the TensorFlow Addons package.

The training

I will skip the TensorFlow boilerplate training code because it is pretty standard, but you can check the complete code in this notebook.

First stage training (encoder + projection head)

The first training stage is done with the encoder + the projection head, using the supervised contrastive learning loss.

Building the model

with strategy.scope():  # Inside a strategy scope because I am using a TPU
    encoder = encoder_fn((None, None, CHANNELS))  # Get the encoder
    # Add the projection head to the encoder
    encoder_proj = add_projection_head((None, None, CHANNELS), encoder)
    encoder_proj.compile(optimizer=optimizers.Adam(lr=3e-4),
                         loss=SupervisedContrastiveLoss(temperature=0.1))

Training

encoder_proj.fit(x=get_dataset(TRAIN_FILENAMES,
                               repeated=True,
                               augment=True),
                 validation_data=get_dataset(VALID_FILENAMES,
                                             ordered=True),
                 steps_per_epoch=100,
                 epochs=10)

Second stage training (encoder + classifier head)

For the second stage of the training, we remove the projection head and add the classifier head on top of the encoder, which now has trained weights. For this step, we can use regular cross-entropy loss and train the model as usual.

Building the model

with strategy.scope():
    model = classifier_fn((None, None, CHANNELS), N_CLASSES,
                          encoder,  # trained encoder
                          trainable=False)  # with frozen weights
    model.compile(optimizer=optimizers.Adam(lr=3e-4),
                  loss=losses.SparseCategoricalCrossentropy(),
                  metrics=[metrics.SparseCategoricalAccuracy()])

Training
Pretty much the same as before:

model.fit(x=get_dataset(TRAIN_FILENAMES,
                        repeated=True,
                        augment=True),
          validation_data=get_dataset(VALID_FILENAMES,
                                      ordered=True),
          steps_per_epoch=100,
          epochs=10)

Visualizing the embeddings outputs

One interesting way of evaluating the representation learned by the encoder is to visualize the output of the feature embedding; in our case, this is the last layer of the encoder, which is the average pooling layer.
Here we will compare the model trained with SCL with another one trained with regular cross-entropy; you can see the complete training in the reference notebook.
The visualizations are generated by applying t-SNE to the embedding outputs of the validation dataset.
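
As an illustration, a visualization like this can be produced with scikit-learn and matplotlib; here is a minimal sketch assuming a trained encoder (from encoder_fn) and a validation dataset valid_ds yielding (images, labels) batches, both of which are placeholders for whatever the notebook actually uses:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Collect encoder embeddings and labels for the validation set
embeddings, labels = [], []
for images, batch_labels in valid_ds:
    embeddings.append(encoder.predict(images))
    labels.append(batch_labels.numpy())
embeddings = np.concatenate(embeddings)
labels = np.concatenate(labels)

# Project the high-dimensional embeddings down to 2-D with t-SNE
projected = TSNE(n_components=2).fit_transform(embeddings)

# Scatter plot colored by class label
plt.figure(figsize=(8, 8))
plt.scatter(projected[:, 0], projected[:, 1], c=labels, cmap='tab10', s=4)
plt.colorbar()
plt.show()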

Cross-entropy embedding

Embedding visualization of the model trained with cross-entropy.

Supervised Contrastive Learning embedding

Embedding visualization of the model trained with SCL.

We can see that both models seem to do a good job of clustering samples of each class together, but looking at the embeddings of the model trained with SCL, the samples of each class are clustered much further apart from the samples of the other classes. This is the result of contrastive learning, and we can also expect this behavior to lead to better generalization, since the class decision boundaries will be clearer. One intuitive exercise to understand this advantage is to try to draw the decision boundary lines separating the classes in each embedding; you will have a much easier time with the SCL embedding.

Conclusion

We saw that training with the supervised contrastive learning methodology is both easy to implement and efficient. It can lead to better accuracy and better class representations, which in turn can result in more robust models that generalize better.
If you are willing to give SCL a try, make sure to check out the links below.

References

Supervised Contrastive Learning paper.
SCL paper review (video by Yannic Kilcher).
SCL tutorial at the official Keras repository.
SCL used on Cassava Leaf Disease Classification (Kaggle competition).
SCL discussion thread (Kaggle competition).

Acknowledgments:

  • Paper authors: Prannay Khosla, Piotr Teterwak, Chen Wang, Aaron Sarna, Yonglong Tian, Phillip Isola, Aaron Maschinot, Ce Liu, Dilip Krishnan.
  • Keras tutorial: Khalid Salama.

If you want to learn how to build a training pipeline for computer vision using TensorFlow, check out this article: "Efficiently using TPU for image classification".


Supervised Contrastive Learning for Cassava Leaf Disease Classification was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.

Published via Towards AI
