Computer Vision with fast.ai
Last Updated on July 25, 2023 by Editorial Team
Author(s): Dhairya Kumar
Originally published on Towards AI.
Building and Deploying an Image Classifier | Towards AI
Build and deploy an image classifier using fast.ai and Render
Computer vision is ubiquitous, with real-life applications such as object detection, face recognition, face verification, and image classification. If you are interested in learning these concepts but are not sure where to start, don't worry: plenty of people face the same problem. In this article, I will try to help by teaching you how to build an image classifier from scratch.
We will use fast.ai, a library built on top of PyTorch. It has a lot of handy functions that make our lives easier by significantly reducing the lines of code we need to write.
The only prerequisite is basic knowledge of Python.
We will build this computer vision application from scratch, and by scratch I mean we will start with absolutely nothing: no dataset, no model files, nothing. I don't want you to use the same crappy dogs-and-cats dataset to build an image classifier; instead, I want you to build something original.
The complete code for this project can be found on GitHub.
Table of Contents
- Data Collection
- Building an Image Classifier
- Deploying the model
Data Collection
As I said in the introduction, we won't be using the same old cats-and-dogs dataset. So which dataset will we use? Well, I won't answer that; you will have to figure it out on your own. What I mean is that you should build your dataset from scratch rather than downloading it from Kaggle, UCI, or any other source.
The rationale behind this is that you will be more motivated to complete this project if you are creating something that actually matters to you rather than classifying dogs and cats. I will give you a few examples to help you get started.
I like cars and hence I just downloaded 500 images of Tesla, BMW, Mercedes, Toyota and Ford cars from Google. (I will explain how to create your dataset later in this article)
You can do the same; just look for something that excites you. If you are a Pokémon fan, download images of various Pokémon and build a classifier. If you are a GOT fan (or at least were before season 8), download images of various GOT characters and build an image classifier around that. The point is to pick something that excites you, as it will give you the intrinsic motivation to complete this project and even to go beyond it and learn new things.
How to build your dataset
Enough motivation; now let's get to the coding part. Building your dataset from scratch is actually pretty easy. We will go through the data collection process step by step.
STEP 1 - Type your search query into Google Images. Be as specific as possible, as this gives better search results. For example, while searching for cars I typed "Tesla cars" rather than just "Tesla", since "Tesla" alone might return ambiguous results. Scroll down until you see "Show more results".
STEP 2 - Run some JavaScript in your browser to save the image URLs to a file. Press CTRL + SHIFT + J on Windows/Linux or COMMAND + OPTION + J on Mac to open the JavaScript console, then type the following code into it.
NOTE - Disable AdBlock before running this code snippet.
urls = Array.from(document.querySelectorAll('.rg_di.rg_meta')).
map(el=>JSON.parse(el.textContent).ou);
window.open('data:text/csv;charset=utf-8,' + escape(urls.join('\n')));
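It will generate a downloads.csv file containing all the URLs, one per line. Repeat this for each class and save each file as download_<class>.csv (for example, download_Tesla.csv) inside data/cars, since that is the name the code in the next step looks for.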
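Before downloading anything, it can be worth checking what the URL file actually contains. The sketch below is not part of the original workflow: it assumes the file holds one URL per line, and load_urls is a hypothetical helper name introduced only for illustration. It drops blank lines, rows that are not http(s) URLs, and duplicates.

```python
from pathlib import Path


def load_urls(csv_path):
    """Return the unique http(s) URLs in a one-URL-per-line file, in file order."""
    seen = set()
    urls = []
    for line in Path(csv_path).read_text(encoding="utf-8").splitlines():
        url = line.strip()
        # Skip blank lines, rows that are not URLs, and repeats
        if not url.startswith(("http://", "https://")) or url in seen:
            continue
        seen.add(url)
        urls.append(url)
    return urls


if __name__ == "__main__":
    # The file name follows the download_<class>.csv naming used in this article
    csv_file = Path("data/cars/download_Tesla.csv")
    if csv_file.exists():
        print(len(load_urls(csv_file)), "unique URLs found")
```

If the count is much lower than the number of images you scrolled past, the console snippet probably did not pick up every result.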
STEP 3 - Use the following code snippet to download the images.
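from fastai.vision import *  # fastai v1; provides Path and download_images

## Define your classes as a list
classes = ['Tesla', 'Mercedes', 'BMW', 'Toyota', 'Ford']
path = Path('data/cars')

for folder in classes:
    file = 'download_' + folder + '.csv'
    dest = path/folder
    dest.mkdir(parents = True, exist_ok = True)
    # Download at most 500 of the URLs listed in the CSV into the class folder
    download_images(path/file, dest, max_pics = 500)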
The project structure is as follows:
|-- data
    |-- cars
        |-- Tesla
        |-- Mercedes
        |-- BMW
        |-- Toyota
        |-- Ford
        |-- download_Tesla.csv
        |-- download_Mercedes.csv
        |-- download_BMW.csv
        |-- download_Toyota.csv
        |-- download_Ford.csv
NOTE - The project structure will vary if you choose a different dataset; however, I suggest that you follow this hierarchical structure for storing your files.
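A quick way to confirm that your layout matches this structure is to check that every class has its URL file in place. This is only a sketch under the naming assumption used in this article (download_<class>.csv inside the dataset folder); missing_csvs is a hypothetical helper, not a fast.ai function.

```python
from pathlib import Path


def missing_csvs(root, classes):
    """Return the expected download_<class>.csv paths that do not exist under root."""
    root = Path(root)
    expected = [root / f"download_{name}.csv" for name in classes]
    return [p for p in expected if not p.is_file()]
```

Calling missing_csvs('data/cars', classes) with the classes list from STEP 3 returns an empty list once all five URL files are where the download loop expects them.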
Building an Image Classifier
After collecting the required data, we can focus on building our classifier. The first step is to delete corrupted images from the dataset. Google returns a few images that are corrupted and can't be opened, and removing them is necessary to keep the dataset clean.
## Deleting corrupted images
for c in classes:
    print(c)
    verify_images(path/c, delete = True, max_size = 500)
The verify_images function, which is built into the fast.ai library, checks every image. Since we have set delete = True, it deletes any image that is corrupt or can't be opened.
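To get a feel for what a "corrupt" download looks like, here is a much rougher check that uses only the standard library. It is not how verify_images works; it is a stand-in technique that only inspects the first few bytes of each file and compares them with the JPEG and PNG signatures. The names looks_like_image and find_suspect_files are hypothetical, and valid images in other formats (GIF, WebP) would also be flagged, so the sketch reports files rather than deleting them.

```python
from pathlib import Path

# Leading bytes ("magic numbers") of two common image formats
SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
}


def looks_like_image(file_path):
    """Return the detected format name, or None if no known signature matches."""
    with open(file_path, "rb") as f:
        header = f.read(8)
    for name, magic in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return None


def find_suspect_files(folder):
    """List files in folder whose header matches neither JPEG nor PNG."""
    return sorted(p for p in Path(folder).iterdir()
                  if p.is_file() and looks_like_image(p) is None)
```

A typical failure this catches is a URL that returned an HTML error page saved under a .jpg name.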
I will explain the rest of the code line by line.
## Viewing the dataset
np.random.seed(10)
data = ImageDataBunch.from_folder(path, train='.', valid_pct=0.2,
                                  ds_tfms=get_transforms(), size=224,
                                  num_workers=4).normalize(imagenet_stats)
data.classes
data.show_batch(rows=3, figsize=(7,8))

## Training
learn = cnn_learner(data, models.resnet34, metrics=error_rate)
learn.fit_one_cycle(50)

## Saving our model
learn.save('stage-1')

## Unfreezing our model
learn.unfreeze()

## Finding the best learning rate and training the model again
learn.lr_find()
learn.recorder.plot()
learn.fit_one_cycle(10, max_lr=slice(1e-6,1e-4))

## Saving and loading the updated model
learn.save('stage-2')
learn.load('stage-2');

## Evaluation
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()
interp.most_confused(min_val=2)

## Exporting the model
learn.export()
Viewing the dataset
To view our dataset, we first need to create an ImageDataBunch. If you have ever downloaded a dataset from Kaggle or another website, you may have noticed that it contains folders like train, test, etc. In our case, we just downloaded the images from Google, so we don't have any such sub-directories. Since there is no train folder, we pass '.' to the train parameter, which tells fast.ai to treat the dataset directory itself (path) as the training directory, and valid_pct = 0.2 reserves 20% of the data for validation.
We call np.random.seed(10) before creating the ImageDataBunch to make sure that we always get the same validation set. Getting a different validation set every time we run the cell can cause problems: we cannot tune the hyperparameters properly if the validation set keeps changing.
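The effect of fixing the seed can be seen in a small standalone sketch. It uses Python's random module instead of NumPy and a hypothetical helper, split_indices, so it illustrates the idea rather than fast.ai's actual splitting code.

```python
import random


def split_indices(n_items, valid_pct=0.2, seed=10):
    """Shuffle range(n_items) with a fixed seed and split off a validation part."""
    rng = random.Random(seed)  # private generator, so global random state is untouched
    indices = list(range(n_items))
    rng.shuffle(indices)
    cut = int(n_items * valid_pct)
    return indices[cut:], indices[:cut]  # (train, valid)


train_a, valid_a = split_indices(100)
train_b, valid_b = split_indices(100)
print(valid_a == valid_b)  # same seed, so the two validation sets are identical
```

Without a fixed seed, each run would shuffle differently and the error rate would be measured on a different 20% of the images every time.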
Training
The training process is extremely simple: we just need to create a learner object, which in this case is a cnn_learner. We pass in the ImageDataBunch object along with the model we want to use and the metric to be used for evaluation. The key thing to note is that we are using a pre-trained resnet34 model, as training a model from scratch is very time-consuming. This process of reusing an already trained model is called transfer learning.
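The metric we pass in, error_rate, is simply the fraction of validation images the model gets wrong. The plain-Python function below is a sketch of that idea, not fast.ai's implementation (which operates on tensors), and the labels are made up for illustration.

```python
def error_rate(predicted, actual):
    """Fraction of predictions that do not match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one label")
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return wrong / len(actual)


# Made-up labels for illustration
predicted = ["Tesla", "BMW", "Ford", "Tesla"]
actual = ["Tesla", "BMW", "Toyota", "Ford"]
print(error_rate(predicted, actual))  # 2 of 4 wrong -> 0.5
```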
Saving our model
We will use the save function to save our model. By saving our model we will ensure that all the weights and parameters that our model learned during the training process are saved properly so that we can use them in the future rather than training the model again.
Unfreezing our model
This is a slightly tricky concept to wrap your head around. If you are familiar with the basic structure of a CNN, you know that it contains many layers. Since we are using transfer learning, all we did so far was add a few extra layers to the end of the pre-trained model and train only those layers rather than the whole model.
This approach might sound surprising but it works fairly well when we are trying to classify generic images like cars, cats, dogs, etc. By using the unfreeze function we can now train the whole model rather than just training the last few layers.
Finding the best learning rate
Now that we have decided to train the whole model, we need to find a good learning rate for it. If we simply unfreeze the model and train it again, we tend to get a higher error rate, mainly because all the layers are then trained with the same learning rate, which is not the best approach. So instead we provided a range of learning rates, i.e. 1e-6 to 1e-4. The earliest layers are trained with a learning rate of 1e-6, the last layers with 1e-4, and the layers in between with values between these two.
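To make the range concrete, the sketch below spreads learning rates between the two ends of the slice. It rests on two assumptions that are not stated in this article: that the rates are assigned per group of layers rather than per individual layer, and that they are spaced by a constant ratio (geometrically). The function name spread_learning_rates and the choice of three groups are illustrative only.

```python
def spread_learning_rates(start, stop, n_groups):
    """Return n_groups learning rates spaced by a constant ratio from start to stop."""
    if n_groups < 2:
        return [stop]
    ratio = (stop / start) ** (1 / (n_groups - 1))
    return [start * ratio ** i for i in range(n_groups)]


# Assumption: the model is split into three layer groups
for i, lr in enumerate(spread_learning_rates(1e-6, 1e-4, 3)):
    print(f"group {i}: {lr:.1e}")
```

With three groups, the middle group ends up near 1e-5, so the pre-trained early layers change slowly while the newly added layers keep learning faster.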
Evaluation
We will evaluate our model on the validation dataset that we kept separate from the start. We will use a confusion matrix to check our results.
We have also used a special function most_confused() to check the most frequently mismatched classes.
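The idea behind both tools can be sketched in a few lines of plain Python. This is not the fast.ai implementation: confusion_counts and this standalone most_confused are illustrative helpers, and the labels are made up.

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))


def most_confused(actual, predicted, min_val=1):
    """Return misclassified (actual, predicted, count) triples, most frequent first."""
    counts = confusion_counts(actual, predicted)
    pairs = [(a, p, n) for (a, p), n in counts.items() if a != p and n >= min_val]
    return sorted(pairs, key=lambda t: t[2], reverse=True)


# Made-up labels for illustration
actual = ["BMW", "BMW", "BMW", "Tesla", "Tesla", "Ford"]
predicted = ["Mercedes", "Mercedes", "BMW", "Tesla", "Ford", "Ford"]
print(most_confused(actual, predicted, min_val=2))  # [('BMW', 'Mercedes', 2)]
```

Each entry of the confusion matrix is one of these pair counts; the diagonal holds the correct predictions, and most_confused lists the largest off-diagonal entries.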
Exporting the model
The export function will generate a .pkl file which we will use to deploy the model using Render.
Deploying the model
We will use Render to deploy this model. You just need to fork this repository and make necessary changes according to your project. The complete setup guide is available here.
Deploying on Render | fast.ai course v3
If you just want to test initial deployment on Render, the starter repo is set up to use Jeremy's bear classification…
course.fast.ai
Further Learning Resources
I would highly recommend that you check out the fast.ai deep learning course.
Practical Deep Learning for Coders, v3 | fast.ai course v3
To do nearly everything in this course, you'll need access to a computer with an NVIDIA GPU (unfortunately other brands…
course.fast.ai
If you want to learn more about CNNs then I would suggest taking this course by Andrew Ng.
Coursera | Online Courses From Top Universities. Join for Free
1000+ courses from schools like Stanford and Yale – no application required. Build career skills in data science…
www.coursera.org
You can find the complete code for this project here.
This brings us to the end of this article. Thanks a ton for reading it.
Published via Towards AI