
Improved PyTorch Models in Minutes with Perforated Backpropagation — Step-by-Step Guide
Last Updated on April 29, 2025 by Editorial Team
Author(s): Dr. Rorry Brenner
Originally published on Towards AI.
Perforated Backpropagation is an optimization technique that leverages a new type of artificial neuron, bringing a long-overdue update to a model of the neuron based on 1943 neuroscience. The new neuron instantiates the concept of artificial dendrites, paralleling the computational power that dendrites add to neurons in biological systems. For more details on how the new neuron works, check out my first article here. This article is a step-by-step guide through the short process of adding this tool to your PyTorch training pipeline, using the baseline MNIST example from the official PyTorch repository.
Step 1
The first step is simply to install the package, which is available on PyPI:
pip install perforatedai
Step 2
Step two, similarly simple, is to add the imports to the top of your training script:
from perforatedai import pb_globals as PBG
from perforatedai import pb_models as PBM
from perforatedai import pb_utils as PBU
Step 3
The next step is to convert the modules in your model to be wrapped in a way that allows them to add artificial dendrites. This step involves just a single line of code after the model is created.
model = Net().to(device)
model = PBU.initializePB(model)
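Conceptually, the conversion wraps each module so that it can later grow parallel dendrite branches alongside the original computation. The toy sketch below is purely illustrative of that wrapping idea; it is not how perforatedai implements the conversion, and the real way dendrite outputs are combined differs:

```python
# Toy illustration of wrapping a layer so it can later grow "dendrite"
# branches. This is NOT the perforatedai implementation; it only sketches
# the idea of attaching parallel sub-units to an existing module.
class DendriteWrapper:
    def __init__(self, module):
        self.module = module    # the original layer (any callable)
        self.dendrites = []     # branches a tracker may add later

    def __call__(self, x):
        out = self.module(x)
        # Combine each dendrite's output with the original output
        # (simple addition here, purely for illustration).
        for dendrite in self.dendrites:
            out = out + dendrite(x)
        return out

# Before any dendrites are added, the wrapper behaves like the original layer.
layer = DendriteWrapper(lambda x: 2 * x)
print(layer(3))                 # 6, same as the unwrapped layer
layer.dendrites.append(lambda x: x + 1)
print(layer(3))                 # 10: original output 6 plus dendrite output 4
```

The point of wrapping every module up front is that the network can be restructured later without the training script having to know which modules gained dendrites.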
Step 4
Step four is the first step that requires a bit more coding. Behind the scenes, dendritic functions are handled by a Perforated Backpropagation tracker object. This object maintains records of training improvements and decides when to add Dendrites to the network's modules. For best performance, the tracker also manages the optimizer and scheduler, which means the following change must be made where they are initially set up.
# Original:
optimizer = optim.Adadelta(model.parameters(), lr=args.lr)
scheduler = StepLR(optimizer, step_size=1, gamma=args.gamma)
# New Format:
PBG.pbTracker.setOptimizer(optim.Adadelta)
PBG.pbTracker.setScheduler(StepLR)
optimArgs = {'params':model.parameters(),'lr':args.lr}
schedArgs = {'step_size':1, 'gamma': args.gamma}
optimizer, scheduler = PBG.pbTracker.setupOptimizer(model, optimArgs, schedArgs)
This creates the same optimizer and scheduler with the same settings, but the tracker also has a pointer to them. Additionally, optimizer.step() should remain, but scheduler.step() should be removed from the training loop.
Step 5
The final change that must be made is passing the tracker your validation scores each time a validation cycle is run. For the MNIST example, this is done by adding the following code block to the test function:
model, restructured, trainingComplete = PBG.pbTracker.addValidationScore(
    100. * correct / len(test_loader.dataset), model)
model.to(device)
if restructured:
    optimArgs = {'params': model.parameters(), 'lr': args.lr}
    schedArgs = {'step_size': 1, 'gamma': args.gamma}
    optimizer, scheduler = PBG.pbTracker.setupOptimizer(model, optimArgs, schedArgs)
return model, optimizer, scheduler, trainingComplete
If Dendrites have been added, model will be a new model architecture with the same weights apart from the new Dendrite modules, and restructured will be true. In those epochs, the optimizer and scheduler must be reset to point to the parameters of the new model. The third output of addValidationScore is trainingComplete, which becomes true when the tracker decides that additional Dendrites will not improve performance. Lastly, all of these variables must be returned from the test function so they are available for the next training epoch. Because they are now returned, the following two changes are required as well:
# Original test definition:
def test(model, device, test_loader):
# New format:
def test(model, device, test_loader, optimizer, scheduler, args):
# Original training loop
for epoch in range(1, args.epochs + 1):
    train(args, model, device, train_loader, optimizer, epoch)
    test(model, device, test_loader)
    scheduler.step()
# New format
for epoch in range(1, args.epochs + 1):
    train(args, model, device, train_loader, optimizer, epoch)
    model, optimizer, scheduler, trainingComplete = test(
        model, device, test_loader, optimizer, scheduler, args)
    if trainingComplete:
        break
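The control flow above — check a validation score, rebuild the optimizer if the model was restructured, and stop once no further improvement is expected — can be illustrated with a toy tracker in plain Python. The patience threshold and decision logic here are hypothetical stand-ins, not the library's actual behavior:

```python
class ToyTracker:
    """Illustrative stand-in for the Perforated Backpropagation tracker.
    It watches validation scores and decides when to 'restructure' the
    model and when training is complete. The logic is hypothetical and
    only demonstrates the calling pattern."""

    def __init__(self, patience=2, max_restructures=3):
        self.best = float('-inf')   # best validation score seen so far
        self.stale = 0              # epochs since the last improvement
        self.restructures = 0
        self.patience = patience
        self.max_restructures = max_restructures

    def add_validation_score(self, score):
        restructured = False
        training_complete = False
        if score > self.best:
            self.best = score
            self.stale = 0
        else:
            self.stale += 1
        if self.stale >= self.patience:
            if self.restructures < self.max_restructures:
                # Grow the model; the caller must rebuild the
                # optimizer and scheduler for the new parameters.
                self.restructures += 1
                self.stale = 0
                restructured = True
            else:
                training_complete = True
        return restructured, training_complete

tracker = ToyTracker(patience=2, max_restructures=3)
for score in [90, 91, 91, 91]:
    restructured, done = tracker.add_validation_score(score)
print(restructured, done)  # the fourth stale-ish score triggers a restructure
```

The real tracker also returns the (possibly restructured) model itself, which is why the MNIST changes above thread model, optimizer, and scheduler back out of the test function.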
Running your first experiment
With that, you are now ready to run your first experiment. By default, the system runs a capacity test after these changes: it adds three Dendrites and then informs you that no problems were detected. It is best to run this test first to confirm the integration is correct. Once you have received that success message, add the following line to run a full experiment:
PBG.testingDendriteCapacity = False
With this in place you can now run again and reproduce the results below, showing a 29% reduction in the remaining error of the MNIST Model.
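To be precise about what a "29% reduction in remaining error" means: the metric compares error rates rather than raw accuracy. A small helper makes the definition concrete (the accuracy values below are hypothetical, chosen only to illustrate the arithmetic):

```python
def remaining_error_reduction(baseline_acc, new_acc):
    """Fraction of the baseline model's remaining error that the new
    model eliminates. Accuracies are percentages out of 100."""
    baseline_err = 100.0 - baseline_acc
    new_err = 100.0 - new_acc
    return (baseline_err - new_err) / baseline_err

# Example: improving from 99.0% to 99.29% accuracy removes 29% of the
# remaining error, even though raw accuracy rises by only 0.29 points.
print(remaining_error_reduction(99.0, 99.29))  # ~0.29
```

This framing matters on MNIST, where baseline accuracy is already high: small absolute accuracy gains correspond to large fractions of the remaining error.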
Using Dendrites for Compression
As you may have guessed, adding Dendrites to a model does increase the parameter count, so a question I often receive is whether the accuracy increase is simply explained by the extra parameters. It is not: the new type of neurons, with the assistance of their Dendrites, are more computationally efficient than their conventional counterparts. This means adding Dendrites can also be used for model compression.
To end up with a smaller model even while adding parameters, the essential step is to start with a narrower model than the original. For the MNIST example, this can be done by adding a width parameter to the model definition as follows:
class Net(nn.Module):
    def __init__(self, num_classes, width):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, int(32 * width), 3, 1)
        self.conv2 = nn.Conv2d(int(32 * width), int(64 * width), 3, 1)
        self.dropout1 = nn.Dropout(0.25)
        self.dropout2 = nn.Dropout(0.5)
        self.fc1 = nn.Linear(144 * int(64 * width), int(128 * width))
        self.fc2 = nn.Linear(int(128 * width), num_classes)
Once the width parameter is added, you can try different width settings. The following graph was produced with a width value of 0.61 after adding one Dendrite to the original model:
As the graph implies, the reduced model initially performs worse on the MNIST data. However, reducing the width of a model with fully connected layers gives a quadratic reduction in parameter count, while adding Dendrites adds only a linear increase. This is what allows the addition of one Dendrite to each neuron to catch up with the original accuracy while the final model remains 44% smaller than the original settings.
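The quadratic-versus-linear claim can be checked directly from the layer shapes above. The helper below counts weights and biases for the width-scaled Net as defined in the snippet, assuming the standard 10 MNIST classes:

```python
def count_params(width, num_classes=10):
    """Parameter count (weights + biases) of the width-scaled Net:
    conv1, conv2, fc1, and fc2 as defined in the article's snippet."""
    c1 = int(32 * width)
    c2 = int(64 * width)
    f1 = int(128 * width)
    conv1 = c1 * 1 * 3 * 3 + c1            # Conv2d(1, c1, 3)
    conv2 = c2 * c1 * 3 * 3 + c2           # Conv2d(c1, c2, 3)
    fc1 = f1 * (144 * c2) + f1             # Linear(144 * c2, f1)
    fc2 = num_classes * f1 + num_classes   # Linear(f1, num_classes)
    return conv1 + conv2 + fc1 + fc2

full = count_params(1.0)     # full-width model: 1,199,882 parameters
small = count_params(0.61)   # width 0.61: 445,814 parameters
print(full, small, small / full)  # the narrow model is ~37% of the original
```

The dominant term is fc1, whose weight count scales with width squared, which is why a modest width reduction shrinks the model so sharply; the Dendrites then (roughly) add parameters in proportion to the existing module sizes, a linear increase.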
Conclusion and Call to Action
Perforated Backpropagation can be integrated into any PyTorch-based system for experimentation, and it is currently free during beta testing. The system can be used to compress models (NLP, Biotech Classification, Time Series Prediction, Edge Computer Vision) and increase accuracy (Stock Forecasting, PEFT with LoRA, Computer Vision) across nearly all data formats we’ve tried it on. If your model would benefit from up to 40% increased accuracy or 90% model compression with only minutes of coding, sign up to be a beta tester here.