Building Complex Image Augmentation Pipelines with Tensorflow
Author(s): Dimitre Oliveira
Using the Tensorflow data module to build a complex image augmentation pipeline.
If you want to train your models with Tensorflow in the most efficient way, you should probably use TFRecords and the Tensorflow data module to build your pipelines. Depending on the requirements and constraints of your application, using them might be a necessity rather than just an option; the good news is that Tensorflow has made both of them pretty clean and easy to use.
In this article, we will go through a simple yet efficient way of building pipelines with complex combinations of data augmentation using the Tensorflow data module.
One of the options I mentioned that can improve your models' training is to use TFRecords. TFRecord is a simple format provided by Tensorflow for storing data. I am not going into too much detail about TFRecords because they are not the focus of this article, but if you want to learn more, check out this tutorial from Tensorflow.
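As a quick illustration (a minimal sketch rather than the full tutorial; the feature names and file name below are just placeholders), writing an (image, label) pair to a TFRecord file looks roughly like this:
import tensorflow as tf

# Placeholder sample: a dummy uint8 image and an integer label.
image = tf.zeros([64, 64, 3], dtype=tf.uint8)
label = 0

def serialize_example(image_bytes, label):
    # Wrap the encoded image and the label into a tf.train.Example.
    feature = {
        'image': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_bytes])),
        'label': tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature)).SerializeToString()

with tf.io.TFRecordWriter('train.tfrec') as writer:
    writer.write(serialize_example(tf.io.encode_jpeg(image).numpy(), label))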
The information provided here can be applied to train models with Tensorflow on any hardware. I am going to use TPUs as the target hardware because if you are using TPUs, you are probably already trying to make the most of your resources, and you would need to use the Tensorflow data module anyway.
Data augmentation with Tensorflow
We will begin by taking a look at how data augmentation is done in the official data augmentation tutorial by Tensorflow.
# Data augmentation function
def augment(image, label):
    image = tf.image.random_crop(image, size=[IMG_SIZE, IMG_SIZE, 3])
    image = tf.image.random_brightness(image, max_delta=0.5)
    image = tf.clip_by_value(image, 0, 1)
    return image, label

# Tensorflow data pipeline
train_ds = (
    train_ds
    .shuffle(1000)
    .map(augment, num_parallel_calls=AUTOTUNE)
    .batch(batch_size)
    .prefetch(AUTOTUNE)
)
As we can see in the augment function, it applies a sequence of transformations to the images: first it takes a random crop, then applies random brightness, and finally clips the values to keep them between 0 and 1.
Following Tensorflow best practices, a data augmentation function is usually applied to the data pipeline by a map operation.
The problem with the approach above is how the transformations are applied to the images: you are basically just stacking them sequentially. In general, you will need some control over what is applied and how. Let me describe a few scenarios to make my point.
Scenario 1:
Your data may benefit from advanced data augmentation techniques like Cutout, Mixup, or CutMix. If you are familiar with how they work, you know that for each sample you will probably apply only one of them.
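A minimal sketch of that idea, assuming cutout, mixup, and cutmix are helper functions you have defined elsewhere (Mixup and CutMix also need a second sample, usually handled at the batch level): draw a single random number per sample and route it to at most one of the techniques.
def one_of_advanced(image, label):
    # One uniform draw decides which (if any) advanced augmentation is applied.
    p = tf.random.uniform([], 0, 1.0, dtype=tf.float32)
    if p > .75:
        image, label = cutmix(image, label)   # hypothetical helper
    elif p > .5:
        image, label = mixup(image, label)    # hypothetical helper
    elif p > .25:
        image = cutout(image)                 # hypothetical helper
    # otherwise: leave the sample untouched
    return image, label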
Scenario 2:
You might want to use many "pixel-level" augmentations; by pixel-level I mean transformations like brightness, gamma adjustment, contrast, or saturation. Lighter variations of those transformations can usually be safely used on many different datasets, but applying all of them at once might change your images too much and end up disturbing the model training.
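For instance, a minimal sketch of giving each pixel-level transformation its own chance of being applied, so only a subset of them touches any given image (the thresholds and ranges below are arbitrary choices, not recommendations):
def pixel_level_augment(image):
    # Each transformation is applied independently with a 50% chance.
    if tf.random.uniform([], 0, 1.0) > .5:
        image = tf.image.random_brightness(image, max_delta=0.1)
    if tf.random.uniform([], 0, 1.0) > .5:
        image = tf.image.random_contrast(image, 0.8, 1.2)
    if tf.random.uniform([], 0, 1.0) > .5:
        image = tf.image.random_saturation(image, 0.7, 1.3)
    return image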
So what could be done?
If you are familiar with data augmentation for computer vision tasks, you might have heard of libraries like Imgaug or Albumentations. If not, here are two examples from the Albumentations library of how it does data augmentation:
from albumentations import (
    Compose, OneOf, RandomRotate90, Flip, Transpose, IAAAdditiveGaussianNoise,
    GaussNoise, MotionBlur, MedianBlur, Blur, OpticalDistortion, GridDistortion,
    IAAPiecewiseAffine, CLAHE, IAASharpen, IAAEmboss, RandomBrightnessContrast,
    HueSaturationValue,
)

def augment(p=0.5):
    return Compose([
        RandomRotate90(),
        Flip(),
        Transpose(),
        OneOf([
            IAAAdditiveGaussianNoise(),
            GaussNoise(),
        ], p=0.2),
        OneOf([
            MotionBlur(p=0.2),
            MedianBlur(blur_limit=3, p=0.1),
            Blur(blur_limit=3, p=0.1),
        ], p=0.2),
        OneOf([
            OpticalDistortion(p=0.3),
            GridDistortion(p=0.1),
            IAAPiecewiseAffine(p=0.3),
        ], p=0.2),
        OneOf([
            CLAHE(clip_limit=2),
            IAASharpen(),
            IAAEmboss(),
            RandomBrightnessContrast(),
        ], p=0.3),
        HueSaturationValue(p=0.3),
    ], p=p)

augmented_image = augment(p=0.9)(image=image)['image']
We can clearly see that Albumentations provides a much more flexible way of applying different transformations to images. You can apply them sequentially, like in the Tensorflow tutorial, but you can also use operations like "OneOf" to choose only one among a group of transformations to be applied, and, most importantly, you can control the probability that each transformation has of being applied.
It is worth noting that the transformations these libraries use are heavily optimized to run as fast as possible; Albumentations even has a benchmark.
The best of both worlds would be to use a library like Albumentations, which is very efficient and already implements a lot of different transformations, together with our Tensorflow data pipeline, but unfortunately that is not possible. So what can we do?
Complex data augmentations with Tensorflow
Actually, with some creativity, we can build data augmentation functions that are pretty close to the ones provided by Albumentations, using only Tensorflow code, so they can run on TPUs integrated with Tensorflow pipelines. Here is a simple example:
def augment(image):
    p_spatial = tf.random.uniform([], 0, 1.0, dtype=tf.float32)
    p_rotate = tf.random.uniform([], 0, 1.0, dtype=tf.float32)

    # Flips
    if p_spatial >= .2:
        image = tf.image.random_flip_left_right(image)
        image = tf.image.random_flip_up_down(image)

    # Rotates
    if p_rotate > .75:
        image = tf.image.rot90(image, k=3) # rotate 270°
    elif p_rotate > .5:
        image = tf.image.rot90(image, k=2) # rotate 180°
    elif p_rotate > .25:
        image = tf.image.rot90(image, k=1) # rotate 90°

    return image
Great! This function has all the things that we liked about Albumentations and is pure Tensorflow. Let's check:
✓ Apply transformations sequentially.
✓ "OneOf" type of transformation (grouping).
✓ Control the probability of applying a transformation.
Let's break down what is going on in this function.
First, we define two variables, p_spatial and p_rotate, and assign probabilities to them. Those probabilities are sampled from a random uniform distribution, which means that all numbers in the interval [0, 1] have the same chance of being sampled.
Then we have two different types of transformations that we want to apply, flips and rotates; they have different semantics, so they belong to different groups.
For the flips transformations, if p_spatial is greater than .2 we apply two random flip transformations; in other words, there is an 80% chance of applying those two random flips.
For the rotates transformations, we use more control. This is similar to the "OneOf" from Albumentations because we apply only one of those transformations: each of them has a 25% chance of being applied, and there is also a 25% chance of applying nothing at all. We need this kind of control here because there is no point in rotating the image 90° three times, then two more times, and so on.
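To use this function, we just map it over the dataset, as in the first example; assuming train_ds yields (image, label) pairs and AUTOTUNE and batch_size are defined as before, it looks something like:
train_ds = (
    train_ds
    .shuffle(1000)
    # Apply the augmentation only to the image, keeping the label untouched.
    .map(lambda image, label: (augment(image), label), num_parallel_calls=AUTOTUNE)
    .batch(batch_size)
    .prefetch(AUTOTUNE)
)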
Using this idea, you can build data augmentation functions that are a lot more complex than this one. Here is an example that I used for the SIIM-ISIC Melanoma Classification Kaggle competition:
def data_augment(image):
    p_rotation = tf.random.uniform([], 0, 1.0, dtype=tf.float32)
    p_rotate = tf.random.uniform([], 0, 1.0, dtype=tf.float32)
    p_cutout = tf.random.uniform([], 0, 1.0, dtype=tf.float32)
    p_shear = tf.random.uniform([], 0, 1.0, dtype=tf.float32)
    p_crop = tf.random.uniform([], 0, 1.0, dtype=tf.float32)

    if p_shear > .2:
        if p_shear > .6:
            image = transform_shear(image, config['HEIGHT'], shear=20.)
        else:
            image = transform_shear(image, config['HEIGHT'], shear=-20.)

    if p_rotation > .2:
        if p_rotation > .6:
            image = transform_rotation(image, config['HEIGHT'], rotation=45.)
        else:
            image = transform_rotation(image, config['HEIGHT'], rotation=-45.)

    if p_crop > .2:
        image = data_augment_crop(image)

    if p_rotate > .2:
        image = data_augment_rotate(image)

    image = data_augment_spatial(image)
    image = tf.image.random_saturation(image, 0.7, 1.3)
    image = tf.image.random_contrast(image, 0.8, 1.2)
    image = tf.image.random_brightness(image, 0.1)

    if p_cutout > .5:
        image = data_augment_cutout(image)

    return image

def data_augment_spatial(image):
    p_spatial = tf.random.uniform([], 0, 1.0, dtype=tf.float32)

    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_flip_up_down(image)

    if p_spatial > .75:
        image = tf.image.transpose(image)

    return image

def data_augment_rotate(image):
    p_rotate = tf.random.uniform([], 0, 1.0, dtype=tf.float32)

    if p_rotate > .66:
        image = tf.image.rot90(image, k=3) # rotate 270°
    elif p_rotate > .33:
        image = tf.image.rot90(image, k=2) # rotate 180°
    else:
        image = tf.image.rot90(image, k=1) # rotate 90°

    return image

def data_augment_crop(image):
    p_crop = tf.random.uniform([], 0, 1.0, dtype=tf.float32)
    crop_size = tf.random.uniform([], int(config['HEIGHT']*.7), config['HEIGHT'], dtype=tf.int32)

    if p_crop > .5:
        image = tf.image.random_crop(image, size=[crop_size, crop_size, config['CHANNELS']])
    else:
        if p_crop > .4:
            image = tf.image.central_crop(image, central_fraction=.7)
        elif p_crop > .2:
            image = tf.image.central_crop(image, central_fraction=.8)
        else:
            image = tf.image.central_crop(image, central_fraction=.9)

    image = tf.image.resize(image, size=[config['HEIGHT'], config['WIDTH']])
    return image
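The helpers transform_shear, transform_rotation, and data_augment_cutout are defined in the complete code linked below. Just to illustrate the kind of thing data_augment_cutout does, here is a minimal pure-Tensorflow cutout sketch (not the competition implementation): it zeroes out one random square patch, building the mask with coordinate comparisons so tensor shapes stay static.
def simple_cutout(image, height, width, patch_size=50):
    # Random top-left corner for the square patch to erase.
    y = tf.random.uniform([], 0, height - patch_size, dtype=tf.int32)
    x = tf.random.uniform([], 0, width - patch_size, dtype=tf.int32)
    # Mask that is 0 inside the patch and 1 elsewhere.
    rows = tf.range(height)[:, tf.newaxis]
    cols = tf.range(width)[tf.newaxis, :]
    inside = (rows >= y) & (rows < y + patch_size) & (cols >= x) & (cols < x + patch_size)
    mask = tf.cast(~inside, image.dtype)
    return image * mask[:, :, tf.newaxis]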
I will also leave two links to complete code examples using a similar approach.
- Complete code for the example above
- Introductory notebook for advanced augmentation with Tensorflow
If you want to check out how to build a complete Tensorflow pipeline to train models on TPUs, here is a cool article that I have written: "Efficiently Using TPU for Image Classification".
To learn even more take a look at the references:
- Tensorflow TFRecords tutorial
- Tensorflow data module documentation
- Tensorflow data module tutorial
- Better performance with the tf.data API
- Tensorflow data augmentation tutorial
- Efficiently Using TPU for Image Classification
- TPU-speed data pipelines