Using AI to control AI: How to Prevent Creating Biased Datasets
Last Updated on July 20, 2023

Author(s): Michelangiolo Mazzeschi

What are labels?

In the last few days, MIT took down the widely cited 80 Million Tiny Images dataset, a collection of 32×32 pixel images, because it contained inappropriate labels (if you do not know what a label is, I will clarify it further on). In this article, I will attempt to tackle this specific problem using a Machine Learning approach.

If I understood the paper published on this dataset correctly, this was the procedure used to collect data from the internet and store it in the dataset:

Retrieving images on a large scale from Google

The team responsible for the project scavenged the internet for these images, each one associated…

