

How To Set Up and Run Cuda Operations In PyTorch


Last Updated on October 4, 2022 by Editorial Team

Author(s): Muttineni Sai Rohith

Originally published on Towards AI.


The rise of deep learning in recent years has increased the demand for computing resources and workload acceleration. Many operations involved in deep learning, such as matrix multiplication, tiling of images, and processing chunks of voice samples, can be parallelized for better performance, which speeds up the development of machine learning models. For this reason, deep learning libraries such as TensorFlow and PyTorch provide functions and APIs that let users take advantage of their GPUs. CUDA is one such programming model and computing platform: it lets us perform complex operations faster by parallelizing tasks across GPUs.

This article discusses what CUDA is, how to set up the CUDA environment, and how to run the CUDA operations available in PyTorch.


What is CUDA

CUDA (Compute Unified Device Architecture) is a programming model and parallel computing platform developed by Nvidia. Using CUDA, one can make full use of Nvidia GPUs and perform operations much faster by running tasks in parallel. PyTorch provides the torch.cuda package to set up and run CUDA operations.

Using PyTorch CUDA, we can create tensors and allocate them to a device. Once they are allocated, we can perform operations on them, and the results are also stored on the same device.
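The claim that results stay on the device can be checked with a minimal sketch like the one below. It assumes a fallback to the CPU when no CUDA GPU is present, and the variable names are only illustrative.

```python
import torch

# Assumption: fall back to the CPU when no CUDA GPU is present.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create two tensors directly on the chosen device.
a = torch.ones(3, 3, device=device)
b = torch.full((3, 3), 2.0, device=device)

# The result of an operation is stored on the same device as its inputs.
c = a @ b

print("Inputs on: ", a.device)
print("Result on: ", c.device)
print("Entry [0, 0]:", c[0, 0].item())  # 1 * 2 summed over 3 terms
```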


PyTorch provides a user-friendly selector on its official website where we can choose our operating system, package manager, programming language, and compute platform.

Open the official PyTorch page, Start Locally | PyTorch, and select the options that match our system. PyTorch provides CUDA builds for Windows and Linux. At the time of writing (October 2022), Windows users should select CUDA 11.6, because CUDA 10.2 and ROCm are not supported on Windows. For Python, we can install from conda, pip, or source, whereas LibTorch is the option for C++ and Java.

Running CUDA operations in PyTorch

Once installed successfully, we can use the torch.cuda interface to run CUDA operations in PyTorch.

To check that the installation succeeded, print the torch.version.cuda attribute as shown below:

# Importing PyTorch
import torch
# Print the CUDA version PyTorch was built with (None for a CPU-only build)
print("Pytorch CUDA Version is", torch.version.cuda)

If the installation is successful, the above code will show the following output:

# Output
Pytorch CUDA Version is 11.6

Before using CUDA, we have to check whether it is supported by our system.

Use the torch.cuda.is_available() function as shown below:

# Importing PyTorch
import torch
# To check whether CUDA is supported
print("Whether CUDA is supported by our system:", torch.cuda.is_available())

The above call returns a Boolean value, as shown below:

# Output
Whether CUDA is supported by our system: True

PyTorch CUDA also provides functions to get the ID of the current device and, given a device ID, the name of that device, as shown below:

# Importing PyTorch
import torch
# To know the CUDA device ID and name of the device
cuda_id = torch.cuda.current_device()
print("CUDA Device ID:", cuda_id)
print("Name of the current CUDA Device:", torch.cuda.get_device_name(cuda_id))

The above code will show the following output:

# Output
CUDA Device ID: 0
Name of the current CUDA Device: NVIDIA GeForce GTX 1650
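On a machine with more than one GPU, it can help to list every device rather than only the current one. The following is a minimal sketch; list_cuda_devices is a hypothetical helper name, and on a system without CUDA it simply returns an empty list.

```python
import torch


def list_cuda_devices():
    """Return (id, name, memory in GiB) for every CUDA device PyTorch can see."""
    devices = []
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        devices.append((i, torch.cuda.get_device_name(i), props.total_memory / 1024**3))
    return devices


devices = list_cuda_devices()
if not devices:
    print("No CUDA device found")
for device_id, name, mem_gib in devices:
    print(f"cuda:{device_id}: {name} ({mem_gib:.1f} GiB)")
```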

We can also change the default CUDA device by specifying its ID, as shown below:

# Importing PyTorch
import torch
# To change the default CUDA device (0 is used here only as an example ID)
torch.cuda.set_device(0)
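As an alternative to changing the default device globally, PyTorch also offers the torch.cuda.device context manager, which switches the current device only inside a with block. The sketch below is one possible way to use it; ones_on is a hypothetical helper, and the CPU fallback is an assumption added so the code also runs without a GPU.

```python
import torch


def ones_on(index: int) -> torch.Tensor:
    """Create a small tensor on GPU `index`, or on the CPU if that GPU is absent."""
    if torch.cuda.is_available() and 0 <= index < torch.cuda.device_count():
        # Inside this block, "cuda" refers to the GPU with the given index.
        with torch.cuda.device(index):
            return torch.ones(2, device="cuda")
    return torch.ones(2)  # CPU fallback (assumption, for machines without that GPU)


t = ones_on(0)
print("Tensor created on:", t.device)
```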

Note: While using CUDA, make sure to write device-agnostic code, because some systems have no GPU and must run on the CPU. This can be done by adding the following line to our code:

device = 'cuda' if torch.cuda.is_available() else 'cpu'
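A device-agnostic script typically selects the device once and then moves its data there before computing. The sketch below illustrates that pattern under simple assumptions; normalize is a hypothetical function, not part of PyTorch.

```python
import torch

# Select the device once, near the top of the script.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


def normalize(batch: torch.Tensor) -> torch.Tensor:
    """Move a batch to the selected device and scale it to zero mean, unit std."""
    batch = batch.to(device)
    return (batch - batch.mean()) / batch.std()


data = torch.arange(10, dtype=torch.float32)  # created on the CPU
out = normalize(data)
print("Computed on:", out.device)
```

The same code runs unchanged on a laptop without a GPU and on a CUDA machine, because only the device variable differs.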

Operating Tensors with CUDA

A PyTorch tensor is similar to a NumPy array: it is an n-dimensional array used for numerical computation. The main difference relevant here is that a tensor can live on either a CPU or a GPU, whereas a NumPy array always lives in CPU memory.
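The relationship between the two can be sketched by converting a NumPy array to a tensor, computing on the selected device, and converting the result back. This is an illustrative example that assumes NumPy is installed alongside PyTorch.

```python
import numpy as np
import torch

arr = np.arange(6, dtype=np.float32).reshape(2, 3)

# torch.from_numpy creates a CPU tensor that shares memory with the array.
t = torch.from_numpy(arr)

# Move the tensor to the GPU if one is available and compute there.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
t_dev = t.to(device) * 2

# NumPy only works with CPU memory, so bring the result back before converting.
back = t_dev.cpu().numpy()
print(back)
```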

PyTorch CUDA provides the following functions to handle tensors:

· tensor.device: returns the device the tensor is stored on. By default, it is "cpu".

· tensor.to(device_name): returns a copy of the tensor on the specified device, "cpu" for the CPU and "cuda" for a CUDA-enabled GPU.

· tensor.cpu(): transfers the tensor from its current device to the CPU.

Let’s understand the usage of the above functions by creating a tensor and performing some basic operations.

We will create a sample tensor and perform a tensor operation (squaring) on the CPU. We will then transfer the tensor to the GPU and perform the same operation again.

import torch

# Creating a sample tensor
x = torch.randint(1, 1000, (100, 100))
# Checking the device name: will return 'cpu' by default
print("Device Name:", x.device)
# Applying tensor operation
res_cpu = x ** 2
# Transferring tensor to GPU
x = x.to(torch.device('cuda'))
# Checking the device name: will return 'cuda:0'
print("Device Name after transferring:", x.device)
# Applying same tensor operation
res_gpu = x ** 2
# Transferring tensor from GPU to CPU
x = x.cpu()
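To compare the performance of the two devices, the operation has to be timed with care, because CUDA kernels run asynchronously. The following is a rough sketch rather than a rigorous benchmark; time_square is a hypothetical helper, and the tensor size and repeat count are arbitrary choices.

```python
import time

import torch


def time_square(x: torch.Tensor, repeats: int = 100) -> float:
    """Return the average time in seconds of one element-wise square on x's device."""
    if x.is_cuda:
        torch.cuda.synchronize()  # finish pending GPU work before timing
    start = time.perf_counter()
    for _ in range(repeats):
        _ = x ** 2
    if x.is_cuda:
        torch.cuda.synchronize()  # wait for the queued kernels to complete
    return (time.perf_counter() - start) / repeats


x = torch.rand(1000, 1000)
print(f"CPU: {time_square(x):.6f} s per operation")

if torch.cuda.is_available():
    x_gpu = x.to("cuda")
    time_square(x_gpu, repeats=10)  # warm-up run, not reported
    print(f"GPU: {time_square(x_gpu):.6f} s per operation")
```

Without the torch.cuda.synchronize() calls, the timer would mostly measure how long it takes to queue the work rather than how long the GPU takes to finish it.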

Running Machine Learning models with CUDA

PyTorch provides the following function to transfer a machine learning model to a given device:

· model.to(device_name): moves the model's parameters and buffers to the specified device and returns the model, "cpu" for the CPU and "cuda" for a CUDA-enabled GPU.

To demonstrate the above function, we will import the pre-trained ResNet-18 model from torchvision.models:

# Importing PyTorch
import torch
import torchvision.models as models
# Making the code device-agnostic
device = 'cuda' if torch.cuda.is_available() else 'cpu'
# Instantiating a pre-trained model
# (torchvision 0.13 and later prefer the weights= argument over pretrained=True)
model = models.resnet18(pretrained=True)
# Transferring the model to a CUDA-enabled GPU (or keeping it on the CPU)
model = model.to(device)

Once the model is transferred, we can continue the rest of the machine learning workflow on the CUDA-enabled GPU.
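One step of that workflow, running inference, can be sketched as follows. The key point is that the input batch must be on the same device as the model. A tiny stand-in network is used here instead of ResNet-18 so the example runs without downloading weights; the layer sizes are arbitrary assumptions.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A tiny stand-in model (hypothetical, used in place of ResNet-18).
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2)).to(device)
model.eval()

# Inputs must live on the same device as the model's parameters.
batch = torch.rand(5, 4).to(device)

with torch.no_grad():  # no gradients are needed for inference
    logits = model(batch)

print("Output shape:", tuple(logits.shape))
print("Output device:", logits.device)
```

If the batch were left on the CPU while the model sits on the GPU, the forward pass would raise a device mismatch error.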


After reading this article, one should understand how to install PyTorch with CUDA support, run basic PyTorch CUDA commands, and handle tensors and machine learning models with CUDA.

How To Set Up and Run Cuda Operations In PyTorch was originally published in Towards AI on Medium.
