
Computer Vision with fast.ai

Last Updated on July 25, 2023 by Editorial Team

Author(s): Dhairya Kumar

Originally published on Towards AI.

Building and Deploying an Image Classifier | Towards AI

Build and deploy an image classifier using fast.ai and Render

Computer vision is ubiquitous, with real-life applications such as object detection, face recognition, face verification, and image classification. If you are interested in learning these concepts but are not sure where to start, don’t worry: plenty of people face the same problem. In this article, I will try to help by teaching you how to build an image classifier from scratch.

We will use fast.ai, a library built on top of PyTorch. Fast.ai provides many handy functions that make our lives easier by significantly reducing the number of lines of code we need to write.

The only prerequisite is basic knowledge of Python.

We will build this computer vision application from scratch, and by scratch I mean we will start with absolutely nothing: no dataset, no model files, nothing. I don’t want you to use the same crappy dogs-and-cats dataset to build an image classifier; instead, I want you to build something original.

The complete code for this project can be found on GitHub.

Table of Contents

  • Data Collection
  • Building an Image classifier
  • Deploying the model

Data Collection

As I said in the introduction, we won’t be using the same old cats-and-dogs dataset. So which dataset will we use? I won’t answer that; you will have to figure it out on your own. By that I mean you should build your dataset from scratch rather than downloading it from Kaggle, UCI, or any other source.

The rationale behind this is that you will be more motivated to complete this project if you are creating something that actually matters to you rather than classifying dogs and cats. I will give you a few examples to help you get started.

I like cars, so I downloaded 500 images of Tesla, BMW, Mercedes, Toyota, and Ford cars from Google. (I will explain how to create your dataset later in this article.)

You can do the same: just look for something that excites you. If you are a Pokemon fan, download images of various Pokemon and build a classifier. If you are a GOT fan (or at least were before season 8), download images of various GOT characters and build an image classifier around them. The point is that a topic you care about gives you the intrinsic motivation to complete this project, and even to go beyond it and learn new things.

How to build your dataset

Enough motivation, now let’s get to the coding part. Building your dataset from scratch is actually pretty easy. We will go through the data collection process step by step.

STEP 1: Type your search query into Google Images. Be as specific as possible, as this gives better search results. For example, while searching for cars I typed “Tesla cars” rather than just “Tesla”, since “Tesla” alone might return ambiguous results. Scroll down until you see “Show more results”.

STEP 2: Run some JavaScript code in your browser to collect the image URLs. Press CTRL + SHIFT + J on Windows/Linux or COMMAND + OPTION + J on Mac. A JavaScript console will appear; type the following code into that console.

NOTE: Disable AdBlock before running this code snippet.

urls = Array.from(document.querySelectorAll('.rg_di .rg_meta')).map(el => JSON.parse(el.textContent).ou);
window.open('data:text/csv;charset=utf-8,' + escape(urls.join('\n')));

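It will download a CSV file containing all the URLs. Save it as download_<class>.csv (for example, download_Tesla.csv) inside data/cars so that the next step can find it, and repeat the search for each class.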

STEP 3: Use the following code snippet to download the images.

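## Import the fast.ai vision module (fastai v1), which provides Path and download_images
from fastai.vision import *

## Define your classes as a list
classes = ['Tesla', 'Mercedes', 'BMW', 'Toyota', 'Ford']
path = Path('data/cars')

for folder in classes:
    file = 'download_' + folder + '.csv'
    dest = path/folder
    dest.mkdir(parents=True, exist_ok=True)
    ## Download up to 500 images listed in the URL file for this class
    download_images(path/file, dest, max_pics=500)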

The project structure is as follows:

|-- data
    |-- cars
        |-- Tesla
        |-- Mercedes
        |-- BMW
        |-- Toyota
        |-- Ford
        |-- download_Tesla.csv
        |-- download_Mercedes.csv
        |-- download_BMW.csv
        |-- download_Toyota.csv
        |-- download_Ford.csv

NOTE: The project structure will vary if you choose a different dataset; however, I suggest that you follow this hierarchical structure for storing your files.
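The layout above can be prepared and checked before any images are downloaded. The following is a minimal sketch using only the Python standard library; the helper name prepare_layout is hypothetical (it is not part of fast.ai), and the example runs in a temporary directory rather than in your real project folder.

```python
from pathlib import Path
import tempfile

def prepare_layout(root, classes):
    """Create one image folder per class and report which URL files are missing."""
    root = Path(root)
    missing = []
    for name in classes:
        # One sub-folder per class, e.g. data/cars/Tesla
        (root / name).mkdir(parents=True, exist_ok=True)
        # The URL list saved from the browser, e.g. data/cars/download_Tesla.csv
        url_file = root / f"download_{name}.csv"
        if not url_file.exists():
            missing.append(url_file.name)
    return missing

classes = ["Tesla", "Mercedes", "BMW", "Toyota", "Ford"]

# Demonstrate on a throwaway directory where only the Tesla URL file exists.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "data" / "cars"
    root.mkdir(parents=True)
    (root / "download_Tesla.csv").write_text("https://example.com/tesla.jpg\n")
    missing = prepare_layout(root, classes)
    created = sorted(p.name for p in root.iterdir() if p.is_dir())

print("Folders created:", created)
print("URL files still missing:", missing)
```

Running a check like this before the download step shows which classes still need a URL file.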

Building an Image Classifier

After collecting the required data, we can focus on building our classifier. The first step is to delete corrupted images from our dataset. Google returns a few images that are corrupted and can’t be opened, and we need to remove them to keep our dataset clean.

## Deleting corrupted images
for c in classes:
    print(c)
    verify_images(path/c, delete=True, max_size=500)

The verify_images function, which is built into the fast.ai library, checks every image. Since we set delete = True, it deletes the images that are corrupt or can’t be opened. With max_size = 500, it also resizes larger images so that neither side exceeds 500 pixels.
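To get a feel for what "checking for corrupt files" means, here is a rough standard-library sketch. It is not how verify_images works internally: it only looks at the first bytes of each file for a JPEG or PNG signature, so a truncated image would still pass. The function names looks_like_image and remove_suspect_files are hypothetical.

```python
import tempfile
from pathlib import Path

# File signatures (magic bytes) for two common image formats.
SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
}

def looks_like_image(path):
    """Return True if the file starts with a known JPEG or PNG signature."""
    with open(path, "rb") as f:
        header = f.read(8)
    return any(header.startswith(sig) for sig in SIGNATURES.values())

def remove_suspect_files(folder, delete=False):
    """List files that fail the signature check; delete them if delete=True."""
    suspects = [p for p in sorted(Path(folder).iterdir())
                if p.is_file() and not looks_like_image(p)]
    if delete:
        for p in suspects:
            p.unlink()
    return [p.name for p in suspects]

# Demonstrate on a throwaway folder with one valid PNG header and one HTML page
# saved under an image file name.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "good.png").write_bytes(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16)
    (folder / "bad.jpg").write_bytes(b"<html>not an image</html>")
    removed = remove_suspect_files(folder, delete=True)
    remaining = sorted(p.name for p in folder.iterdir())

print("Removed:", removed)
print("Remaining:", remaining)
```

HTML pages saved under an image file name are a common kind of bad download, and a signature check like this catches them cheaply.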

I will explain the rest of the code line by line.

## Viewing the dataset
np.random.seed(10)
data = ImageDataBunch.from_folder(path, train='.', valid_pct=0.2, ds_tfms=get_transforms(), size=224, num_workers=4).normalize(imagenet_stats)
data.classes
data.show_batch(rows = 3 ,figsize = (7,8))
## Training
learn = cnn_learner(data,models.resnet34, metrics = error_rate)
learn.fit_one_cycle(50)
## Saving our model
learn.save('stage-1')
## Unfreezing our model
learn.unfreeze()
## Finding the best learning rate and training the model again
learn.lr_find()
learn.recorder.plot()
learn.fit_one_cycle(10, max_lr=slice(1e-6,1e-4))
## Saving and loading the updated model
learn.save('stage-2')
learn.load('stage-2');
## Evaluation
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()
interp.most_confused(min_val=2)
## Exporting the model
learn.export()

Viewing the dataset

To view our dataset, we first need to create an ImageDataBunch. If you have ever downloaded a dataset from Kaggle or a similar website, you might have noticed that it contains folders like train and test. In our case, we just downloaded the images from Google, so we don’t have such sub-directories. Since there is no train folder, we pass '.' as the train parameter, which tells fast.ai to treat the current directory as the training directory, and valid_pct = 0.2 means that 20% of the data will be reserved for validation.

We call np.random.seed(10) before creating the ImageDataBunch to make sure we always get the same validation set. If the validation set changed every time we ran the cell, results from different runs would not be comparable, and we could not tune the hyperparameters properly.
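The effect of fixing the seed can be illustrated without fast.ai. The sketch below is an assumption-laden stand-in: it uses Python's random module rather than NumPy, and the helper split_train_valid is hypothetical, but it shows the same idea of a repeatable 80/20 split.

```python
import random

def split_train_valid(items, valid_pct=0.2, seed=10):
    """Shuffle with a fixed seed and hold out valid_pct of the items."""
    rng = random.Random(seed)      # local generator, so the split is repeatable
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_pct)
    # First n_valid shuffled items form the validation set, the rest the training set.
    return shuffled[n_valid:], shuffled[:n_valid]

files = [f"img_{i:03d}.jpg" for i in range(100)]
train_a, valid_a = split_train_valid(files)
train_b, valid_b = split_train_valid(files)

print(len(train_a), len(valid_a))   # sizes of the two subsets
print(valid_a == valid_b)           # same seed, same validation set
```

With a different seed on each run, valid_a and valid_b would generally differ, which is exactly the situation the fixed seed avoids.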

Training

The training process is simple: we create a learner object, in this case a cnn_learner. We pass in the ImageDataBunch object, the model we want to use, and the metric to be used for evaluation. The key thing to note is that we are using a pre-trained resnet34 model, because training a model from scratch is very time-consuming. This process of reusing an already trained model is called transfer learning.
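The metric passed to the learner, error_rate, is the fraction of validation images that are classified incorrectly (that is, 1 minus accuracy). A plain-Python sketch of the same calculation, with made-up labels for illustration, might look like this:

```python
def error_rate(predictions, targets):
    """Fraction of predictions that do not match the target label."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    wrong = sum(p != t for p, t in zip(predictions, targets))
    return wrong / len(targets)

# Hypothetical labels: one of the five predictions is wrong.
targets     = ["Tesla", "BMW", "Ford", "Toyota", "BMW"]
predictions = ["Tesla", "BMW", "Ford", "BMW",    "BMW"]

print(error_rate(predictions, targets))  # 0.2
```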

Saving our model

We will use the save function to save our model. Saving stores the weights and parameters that the model learned during training, so we can reload them later rather than training the model again.

Unfreezing our model

This concept is slightly tricky to wrap your head around. If you are familiar with the basic structure of a CNN, you know that it contains many layers. Since we are using transfer learning here, all we did so far was add a few extra layers to the end of the pre-trained model and train only those layers, rather than training the whole model.

This approach might sound surprising, but it works fairly well when we are classifying generic images like cars, cats, and dogs. After calling the unfreeze function, we can train the whole model rather than just the last few layers.

Finding the best learning rate

Now that we have decided to train the whole model, we need to find a good learning rate. The main problem after unfreezing is that if we simply train the model again, we may get a higher error rate, mainly because all the layers would be trained with the same learning rate, which is not the best approach. So we provide a range of learning rates instead, here 1e-6 to 1e-4. The earliest layers are trained with a learning rate of 1e-6, the last layers with 1e-4, and the layers in between get learning rates between these two values.
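To make the range concrete, here is a small sketch of how a range such as slice(1e-6, 1e-4) could be spread over the model. It rests on two assumptions: that the rates are applied per layer group rather than per individual layer, and that they are spaced by a constant multiplier (geometrically), which is my understanding of what fast.ai does. The function name and the choice of three groups are for illustration only.

```python
def spread_learning_rates(start, stop, n_groups):
    """Return n_groups learning rates spaced by a constant multiplier."""
    if n_groups == 1:
        return [stop]
    # Multiplier that takes us from start to stop in n_groups - 1 equal steps.
    step = (stop / start) ** (1 / (n_groups - 1))
    return [start * step ** i for i in range(n_groups)]

# Assumed: three layer groups, from the earliest layers to the newly added head.
lrs = spread_learning_rates(1e-6, 1e-4, 3)
for group, lr in enumerate(lrs):
    print(f"layer group {group}: lr = {lr:.0e}")
```

The earliest layers get the smallest rate because they hold general features, such as edges and textures, that need little adjustment. The layers added for our classes can move faster.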

Evaluation

We will evaluate our model on the validation set that we kept separate from the start, using a confusion matrix to check the results.

We also use the most_confused() function to list the class pairs that the model mixes up most often; min_val = 2 restricts the list to pairs that were confused at least twice.
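The idea behind listing the most confused classes can be sketched in plain Python. This is not the fast.ai implementation: the function most_confused_pairs and the labels below are made up for illustration.

```python
from collections import Counter

def most_confused_pairs(actual, predicted, min_val=1):
    """Return (actual, predicted, count) for misclassified pairs, most frequent first."""
    # Count only the cases where the prediction differs from the true label.
    counts = Counter((a, p) for a, p in zip(actual, predicted) if a != p)
    pairs = [(a, p, n) for (a, p), n in counts.items() if n >= min_val]
    # Sort by count (descending), then alphabetically for a stable order.
    return sorted(pairs, key=lambda t: (-t[2], t[0], t[1]))

actual    = ["BMW", "BMW", "BMW", "Tesla", "Tesla", "Ford", "Ford", "Ford"]
predicted = ["Mercedes", "Mercedes", "BMW", "Tesla", "Ford", "Ford", "Toyota", "Toyota"]

print(most_confused_pairs(actual, predicted, min_val=2))
# [('BMW', 'Mercedes', 2), ('Ford', 'Toyota', 2)]
```

Pairs that show up here are good candidates for collecting more or cleaner training images.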

Exporting the model

The export function generates a .pkl file, which we will use to deploy the model on Render.

Deploying the model

We will use Render to deploy this model. You just need to fork this repository and make the necessary changes for your project. The complete setup guide is available here.

Deploying on Render | fast.ai course v3

If you just want to test initial deployment on Render, the starter repo is set up to use Jeremy's bear classification…

course.fast.ai

Further Learning Resources

I would highly recommend that you check out the fast.ai deep learning course.

Practical Deep Learning for Coders, v3 | fast.ai course v3

To do nearly everything in this course, you'll need access to a computer with an NVIDIA GPU (unfortunately other brands…

course.fast.ai

If you want to learn more about CNNs then I would suggest taking this course by Andrew Ng.

Coursera | Online Courses From Top Universities. Join for Free

1000+ courses from schools like Stanford and Yale – no application required. Build career skills in data science…

www.coursera.org

You can find the complete code for this project here.

This brings us to the end of this article. Thanks a ton for reading it.

My Twitter and LinkedIn.


Published via Towards AI
