

ResNeXt: from scratch

Last Updated on January 1, 2021 by Editorial Team

Author(s): Tanmay Debnath

Source: Unsplash

Computer Vision, Research

ResNeXt follows a simple 'divide and conquer' concept and is often referred to as an extended version of ResNet. Among its notable applications are biomedical engineering and, in particular, bioimaging. Here, I am going to explore the making of ResNeXt from scratch.

Modules: PyTorch, CUDA (Optional)

If you are unsure how to install PyTorch on your system, then you might want to check out this link here. It would help you! Moving forward…

ResNeXt

The ResNeXt architecture is quite similar to the ResNet architecture. If you want to learn about the ResNet architecture first, then please head in this direction. ResNeXt is a deep convolutional architecture whose job is to extract rich, hierarchical features from images. How do we get started, then?…

import torch
import torch.nn as nn

This initial block of code imports the PyTorch library into the Python environment. These are pretty hefty architectures and hence require a lot of computation. The architecture expects the system to have good specifications (in terms of CPU and GPU capacity) to complete its task in a reasonable amount of time and with good accuracy.
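Since CUDA is listed as optional above, a quick way to confirm whether a GPU is visible is the standard PyTorch device check below (my own addition, not part of the original code):

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Running on: {device}")  # "cuda" if a GPU is available, otherwise "cpu"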

Now, if you are new to Python and want to understand the basics of these large CNN architectures, this walkthrough may not be productive for you: the definition of ResNeXt relies heavily on inheritance and class instances, which can be hard even for experienced programmers. It can be quite dazzling for newbies, so I would suggest first going through the basics of OOP.

Comfortable enough with OOP? Let's move forward…
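As a warm-up, here is a minimal, hypothetical sketch of the object-oriented pattern every block below follows: subclass nn.Module, register layers in __init__, and chain them in forward. (TinyNet is my own toy example, not part of ResNeXt.)

class TinyNet(nn.Module):
    def __init__(self):
        super(TinyNet, self).__init__()
        # register submodules here; PyTorch tracks their parameters automatically
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        # define how data flows through the registered layers
        return self.relu(self.conv(x))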

class resnext_block(nn.Module):
    def __init__(self, in_channels, cardinality, bwidth, idt_downsample=None, stride=1):
        super(resnext_block, self).__init__()
        self.expansion = 2
        out_channels = cardinality * bwidth
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=1, padding=0)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3, groups=cardinality, stride=stride, padding=1)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.conv3 = nn.Conv2d(out_channels, out_channels*self.expansion, kernel_size=1, stride=1, padding=0)
        self.bn3 = nn.BatchNorm2d(out_channels*self.expansion)
        self.relu = nn.ReLU()
        self.identity_downsample = idt_downsample

Starting with the definition of the block, we define the components that will be required for the rest of the structure. This is just the initialization phase: whenever the class is instantiated, the first thing it does is create these modules with their specified configurations.

One thing you might notice, if you have studied ResNet and the research paper for ResNeXt, is that we define the cardinality and the base width of the groups in this function. We didn't define these in ResNet, because here we split the transformation into parallel groups stacked side by side and then aggregate the results. The 'cardinality' defines the number of groups in the architecture, and together with the base width it determines the 'out channels': out_channels = cardinality × base width.
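To make this concrete: in a grouped convolution, each filter only sees in_channels / groups input channels. A small check, using the ResNeXt-50 (32x4d) numbers as an illustration:

# cardinality=32, bwidth=4  =>  out_channels = 32 * 4 = 128
conv2 = nn.Conv2d(128, 128, kernel_size=3, groups=32, padding=1)
print(conv2.weight.shape)  # torch.Size([128, 4, 3, 3]): each filter sees only 4 input channels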

What to do next?…

    def forward(self, x):
        identity = x
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.conv2(x)
        x = self.bn2(x)
        x = self.relu(x)
        x = self.conv3(x)
        x = self.bn3(x)

        if self.identity_downsample is not None:
            identity = self.identity_downsample(identity)

        x += identity
        x = self.relu(x)
        return x

The 'forward' function is called whenever the block processes an input. It applies the layers in order, as described in the research paper, and adds the (possibly downsampled) identity back in before the final ReLU. You can find a similar 'forward' function in the ResNet blog.
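As a quick illustration (my own example, not from the original post): a block whose in_channels already equals cardinality × base width × 2 needs no downsample module, so the skip connection adds cleanly and the shape is preserved.

block = resnext_block(in_channels=256, cardinality=32, bwidth=4)
x = torch.randn(1, 256, 56, 56)
print(block(x).shape)  # torch.Size([1, 256, 56, 56]): channels go 256 -> 128 -> 128 -> 256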

Presenting the ResNeXt architecture…

class ResNeXt(nn.Module):
    def __init__(self, resnext_block, layers, cardinality, bwidth, img_channels, num_classes):
        super(ResNeXt, self).__init__()
        self.in_channels = 64
        self.conv1 = nn.Conv2d(img_channels, 64, kernel_size=7, stride=2, padding=3)
        self.bn1 = nn.BatchNorm2d(64)
        self.relu = nn.ReLU()
        self.cardinality = cardinality
        self.bwidth = bwidth
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)

        # ResNeXt Layers
        self.layer1 = self._layers(resnext_block, layers[0], stride=1)
        self.layer2 = self._layers(resnext_block, layers[1], stride=2)
        self.layer3 = self._layers(resnext_block, layers[2], stride=2)
        self.layer4 = self._layers(resnext_block, layers[3], stride=2)

        self.avgpool = nn.AdaptiveAvgPool2d((1,1))
        self.fc = nn.Linear(self.cardinality * self.bwidth, num_classes)

We come to the formal definition of the 'ResNeXt' architecture. Again, this is the initialization phase. You might notice calls to a '_layers' method that has no definition yet; it will be defined in the next steps. Note also that 'self.fc' is created after the four '_layers' calls: by that point 'self.bwidth' has been doubled once per stage, so 'self.cardinality * self.bwidth' equals the final feature width.
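To see concretely why that final Linear layer ends up with the right input size, here is a short trace (my own illustration) of how the widths grow stage by stage in the 32x4d configuration:

cardinality, bwidth = 32, 4
for stage in range(1, 5):
    grouped = cardinality * bwidth      # channels inside the grouped 3x3 conv
    print(stage, grouped, grouped * 2)  # stage, grouped width, block output (2x expansion)
    bwidth *= 2
print(cardinality * bwidth)             # 2048: the input width of the final fc layer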

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)

        x = self.avgpool(x)
        x = x.reshape(x.shape[0], -1)
        x = self.fc(x)
        return x

Here comes the 'forward' function, and with it our ResNeXt model is officially defined. We structure 'forward' as described in the paper: the initial conv stem, then the four stages of blocks, followed by average pooling and the final fully connected layer.

    def _layers(self, resnext_block, no_residual_blocks, stride):
        identity_downsample = None
        out_channels = self.cardinality * self.bwidth
        layers = []

        if stride != 1 or self.in_channels != out_channels * 2:
            identity_downsample = nn.Sequential(
                nn.Conv2d(self.in_channels, out_channels*2, kernel_size=1, stride=stride),
                nn.BatchNorm2d(out_channels*2))

        layers.append(resnext_block(self.in_channels, self.cardinality, self.bwidth, identity_downsample, stride))
        self.in_channels = out_channels * 2

        for _ in range(no_residual_blocks - 1):
            layers.append(resnext_block(self.in_channels, self.cardinality, self.bwidth))

        self.bwidth *= 2

        return nn.Sequential(*layers)

So, we have defined everything, and it's time to check that it all works. We can use the code block below to build the architecture, followed by a quick test.

def ResNeXt50(img_channels=3, num_classes=1000, cardinality=32, bwidth=4):
    return ResNeXt(resnext_block, [3,4,6,3], cardinality, bwidth, img_channels, num_classes)

And we are done! That was a hefty job to get through, and thanks for sticking with it to the end. This is an important architecture in the CNN family, and understanding it is genuinely difficult. If you need further help, see the sections below…

Help is on the way!

If you still feel that you need the entire code, then please go to this link here.

References

Please check out the original work of the researchers: Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, and Kaiming He, "Aggregated Residual Transformations for Deep Neural Networks," CVPR 2017 (arXiv:1611.05431).

