Augmenting The Bank Complaints Data
Last Updated on April 22, 2024 by Editorial Team
Author(s): Adam Ross Nelson
Originally published on Towards AI.
A beginner’s guide to data augmentation
Data augmentation usually starts from a scenario where you have some data, but not enough. With some data in hand, you can apply a range of techniques that sample, re-sample, modify, adjust, and replicate some (or many) of the original observations.
These techniques artificially increase the size, and sometimes the diversity, of the data available for training a machine learning model. The sampling, re-sampling, modification, adjustment, and replication often also involve some form of transformation.
For example, if you have 1,000 images but need 10,000 for your project, you can start by doubling those 1,000 images by flipping all of them from left to right (illustrated here). That gives you 2,000 images.
Photo Credit: Stock images from Canva. Edits and annotations by author.
After flipping each of them, you can double that again by flipping all 2,000 images top to bottom (illustrated below). Now you have 4,000 images.
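The two flipping steps above can be sketched with NumPy. This is a minimal illustration rather than the author's code: the function name `augment_with_flips`, the array layout `(n, height, width, channels)`, and the random placeholder images are all assumptions made for the example.

```python
import numpy as np


def augment_with_flips(images):
    """Quadruple a batch of images using left-right and top-bottom flips.

    Assumes `images` has shape (n, height, width, channels).
    """
    # Step 1: flip every image left to right (reverse the width axis).
    flipped_lr = images[:, :, ::-1, :]
    doubled = np.concatenate([images, flipped_lr], axis=0)

    # Step 2: flip the doubled set top to bottom (reverse the height axis).
    flipped_ud = doubled[:, ::-1, :, :]
    return np.concatenate([doubled, flipped_ud], axis=0)


# Placeholder data standing in for 1,000 real images (hypothetical 8x8 RGB).
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(1000, 8, 8, 3), dtype=np.uint8)

augmented = augment_with_flips(images)
print(augmented.shape[0])  # 4000
```

Each flip is applied to everything produced so far, which is why the count goes from 1,000 to 2,000 to 4,000 rather than to 3,000.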
Photo Credit: Stock images from Canva. Edits and annotations by author.
Going one step further, you can also double the original data again by applying a random zoom (not illustrated here). After the random zoom, you could then apply a random crop (also not illustrated). By now, instead of the desired 10,000 images… Read the full blog for free on Medium.
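The excerpt does not show how the random zoom or random crop are implemented, so the following is only one possible sketch in plain NumPy. The helper names `random_crop` and `random_zoom`, the `min_scale` parameter, and the nearest-neighbor resizing are assumptions chosen to keep the example self-contained.

```python
import numpy as np


def random_crop(image, crop_height, crop_width, rng):
    """Cut a randomly positioned window out of an image of shape (H, W, C)."""
    height, width = image.shape[:2]
    top = rng.integers(0, height - crop_height + 1)
    left = rng.integers(0, width - crop_width + 1)
    return image[top:top + crop_height, left:left + crop_width]


def random_zoom(image, rng, min_scale=0.6):
    """Zoom in by cropping a random window and resizing it to the original size.

    Resizing uses nearest-neighbor indexing, an assumption made to avoid
    depending on an image library.
    """
    height, width = image.shape[:2]
    scale = rng.uniform(min_scale, 1.0)
    crop_h = max(1, int(round(height * scale)))
    crop_w = max(1, int(round(width * scale)))
    window = random_crop(image, crop_h, crop_w, rng)

    # Map each output row/column back to a row/column of the cropped window.
    rows = np.arange(height) * crop_h // height
    cols = np.arange(width) * crop_w // width
    return window[rows][:, cols]


rng = np.random.default_rng(42)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)  # placeholder image

zoomed = random_zoom(image, rng)
cropped = random_crop(image, 24, 24, rng)
print(zoomed.shape, cropped.shape)
```

Note that the zoom keeps the original image size while the crop returns a smaller image. Whether cropped images need resizing before training depends on the model, which the excerpt does not specify.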