Google Colab 101 Tutorial with Python — Tips, Tricks, and FAQ

Last Updated on October 21, 2021 by Editorial Team

Author(s): Saniya Parveez, Roberto Iriondo

Laptop displaying Google Colab by Google. Source: Derivative from original by Bongkarn Thanyakij on Pexels

An in-depth tutorial on how to use Google Colab with Python, along with Colab’s tips, tricks, and FAQ

This tutorial’s code is available on GitHub, and its full implementation is available on Google Colab.

Introduction
Why We Use Google Colab?
Start Google Colab
Uploading a Notebook from GitHub
Uploading Data from Kaggle
Read Files from Google Drive
Setting up Hardware Accelerator GPU for Runtime
Clone a GitHub Repository to Google Drive
Colab Magic
Plotting
TPU (Tensor Processing Unit) in Google Colab
Conclusion

Introduction

Google Colab is a project from Google Research: a free, Jupyter-based environment that allows us to create Jupyter [programming] notebooks to write and execute Python [1] (and Python-based third-party tools and machine learning frameworks such as Pandas, PyTorch, TensorFlow, Keras, Monk, OpenCV, and others) in a web browser.

A programming notebook is a type of shell or kernel presented in the form of a word processor, where we can write and execute code. The data required for processing in Google Colab can be mounted from Google Drive or imported from any source on the internet. Project Jupyter is an open-source software organization that develops and supports Jupyter notebooks for interactive computing [4].

Google Colab requires no configuration to get started and provides free access to GPUs. One of the main functionalities of Google Colab is that it allows anyone to share live code, mathematical equations, data visualizations, data processing (cleaning and transformation), numerical simulations, machine learning models, and many other projects with others.

Why We Use Google Colab?

Google Colab has unique and critical features:

It provides a free Jupyter notebook environment.
It comes with pre-installed packages.
It is hosted entirely on Google Cloud.
Users do not need to set up servers or workstations.
Notebooks save automatically on a user’s Google Drive.
It provides browser-based Jupyter notebooks.
It is free to use and offers GPU and TPU access; Colab Pro is a paid option for users who need more resources.
It runs Python 3; Python 2 support has been deprecated, and Google suggests migrating important notebooks to Python 3 [2] [5].
It provides two hardware accelerators:
1. GPU (Graphics Processing Unit).
2. TPU (Tensor Processing Unit).

Start Google Colab

Python code can be executed directly in the web browser by using Colab. We can launch it at the URL below:

Google Colaboratory: https://colab.research.google.com/

The launch window opens with a popup offering many features:

Figure 1: Screenshot of Google Colab’s start page.

It provides options to create a notebook as well as to upload and select from different sources such as:

GitHub
Google Drive
Local computer
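Because the same notebook may later be run outside Colab, it can be useful to detect the environment first. The following is a minimal sketch that assumes the `google.colab` package is importable only inside a Colab runtime; the helper name `in_colab` is our own and not part of any library.

```python
def in_colab() -> bool:
    """Return True when the google.colab package can be imported."""
    try:
        import google.colab  # noqa: F401  (assumed to exist only in Colab)
    except ImportError:
        return False
    return True


print("Running in Colab:", in_colab())
```

A notebook can use this flag to skip Colab-only steps, such as mounting Google Drive, when it runs elsewhere.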

Uploading a Notebook from GitHub

Python code can be uploaded directly from GitHub by using the project’s URL or by searching for the organization or user. The steps below show how to upload a project using a GitHub URL:

Launch Google Colab.
Select the GitHub tab from the popup box.

Figure 2: Screenshot of Google Colab’s upload code using a GitHub URL.

Enter the GitHub project’s URL and search for it to fetch the code.

Figure 3: Screenshot showing how to upload a GitHub repository with Google Colab.

It will upload the complete code with one click to the Google Colab notebook.

Figure 4: Screenshot showcasing the uploaded GitHub repository using a URL.
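A notebook hosted on GitHub can also be opened by URL alone: Colab serves GitHub notebooks under the `colab.research.google.com/github` prefix followed by the notebook’s path on github.com. The sketch below builds such a link. The function name and the example repository are hypothetical and used only for illustration.

```python
from urllib.parse import urlparse

COLAB_GITHUB_PREFIX = "https://colab.research.google.com/github"


def colab_url_from_github(github_url: str) -> str:
    """Map a github.com notebook URL to the matching Colab URL."""
    parsed = urlparse(github_url)
    if parsed.netloc != "github.com":
        raise ValueError(f"Not a github.com URL: {github_url}")
    if not parsed.path.endswith(".ipynb"):
        raise ValueError("Expected a link to an .ipynb notebook")
    return COLAB_GITHUB_PREFIX + parsed.path


# Hypothetical notebook location, used only for illustration.
print(colab_url_from_github(
    "https://github.com/some-user/some-repo/blob/main/notebook.ipynb"
))
```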

Similarly, a notebook can be opened directly from Google Drive, where saved notebooks can be filtered by name, date, owner, or modified date.

Figure 5: Screenshot showing how to upload a notebook directly from Google Drive to Google Colab.

Uploading Data from Kaggle

Data from Kaggle can be uploaded directly into Colab for processing. An API token from Kaggle is required to accomplish the data import.

Steps to generate API token from Kaggle

Open Kaggle
Go to “My Account”
Scroll down to the “API” section

Figure 6: Screenshot of Kaggle’s website showing the API section.

Click on “Expire API Token” to remove the previous token if required.
Click on “Create New API Token.” It will generate a new token and download a JSON file named “kaggle.json”
The “kaggle.json” file contains the username and key like:

Figure 7: Screenshot of IDE showing test API key for Kaggle.
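Since the token file is plain JSON with a username and a key, its shape can be checked before it is uploaded. This is a minimal sketch; the function name is our own and the sample values are placeholders, not real credentials.

```python
import json


def parse_kaggle_token(text: str) -> dict:
    """Parse kaggle.json content and check the two expected fields."""
    token = json.loads(text)
    missing = [field for field in ("username", "key") if not token.get(field)]
    if missing:
        raise ValueError(f"kaggle.json is missing: {', '.join(missing)}")
    return token


# Placeholder values, not real credentials.
sample = '{"username": "example_user", "key": "0123456789abcdef"}'
token = parse_kaggle_token(sample)
print(token["username"])
```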

Steps to upload data from Kaggle

Save the “kaggle.json” file on your local computer.

Install the Kaggle package

!pip install -q kaggle

Import packages:

from google.colab import files

Upload the local file “kaggle.json”

files.upload()

Figure 8: Screenshot of the output of the uploaded file “kaggle.json”

Check if the Colab notebook connects with Kaggle correctly.

!kaggle datasets list

Figure 9: Screenshot showing a dataset list from Kaggle.

Download any competition data from Kaggle (e.g., the “Predict Future Sales” competition).

!kaggle competitions download -c competitive-data-science-predict-future-sales

Figure 10: Screenshot showing the output of data downloads from Kaggle.

The data from Kaggle is downloaded into the Colab session’s storage, like:

Figure 11: Screenshot of Google Colab showing the uploaded data from Kaggle.
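If the Kaggle commands above report that they cannot find the token, the uploaded file probably needs to be moved. The Kaggle command-line tool usually looks for it at `~/.kaggle/kaggle.json` and warns when other users can read it. The sketch below is an assumption about a typical setup rather than part of the original notebook; the helper name and its `home` parameter are hypothetical.

```python
import os
import shutil


def install_kaggle_token(src="kaggle.json", home=None):
    """Copy the token to <home>/.kaggle/kaggle.json with owner-only access."""
    home = home or os.path.expanduser("~")
    target_dir = os.path.join(home, ".kaggle")
    os.makedirs(target_dir, exist_ok=True)
    target = os.path.join(target_dir, "kaggle.json")
    shutil.copyfile(src, target)
    os.chmod(target, 0o600)  # read/write for the owner only
    return target
```

Calling `install_kaggle_token()` in a cell right after `files.upload()` would place the uploaded file where the tool expects it.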

Read Files from Google Drive

Google Colab can also read data from Google Drive.

Import packages

import glob
import pandas as pd
from google.colab import drive

Mount Google Drive

drive.mount('/gdrive')

This will prompt for a Google Drive authorization code.

Figure 12: Screenshot showing Google Colab asking you to insert Google Drive’s authorization code.

Input box for the authorization code

Click on the link and generate the authorization code.

Read a CSV file from the drive.

file_path = glob.glob("/gdrive/My Drive/*.csv")
for file in file_path:
    df = pd.read_csv(file)
    print(df)

Figure 13: Output from the test CSV file on Google Drive.
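When the glob pattern matches nothing, the loop above runs silently, which can hide an unmounted drive. The following sketch uses only the standard library to list the CSV files and fail with a clear message when the folder is missing. The helper name is hypothetical, and the folder passed in is whatever path the drive was mounted at.

```python
import glob
import os


def list_csv_files(folder):
    """Return the CSV files directly inside `folder`, sorted by name."""
    if not os.path.isdir(folder):
        raise FileNotFoundError(f"{folder} not found; is Google Drive mounted?")
    return sorted(glob.glob(os.path.join(folder, "*.csv")))
```

Each path returned by `list_csv_files("/gdrive/My Drive")` can then be passed to `pd.read_csv` as in the loop above.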

Setting up Hardware Accelerator GPU for Runtime

Google Colab provides a free cloud service with a GPU hardware accelerator. High-end GPU machines are costly, yet machine learning and deep learning rely on them to process many computations simultaneously.

Nvidia GPU card. Source: Nana Dua on Pexels

Why are GPUs required in Machine Learning or Deep Learning?

Nowadays, GPUs dominate machine learning and deep learning because they are optimized for compute-intensive workloads and streaming memory models.

GPUs deliver high performance through parallelism and can launch millions of threads in a single call. For such workloads they often outperform CPUs, even though a GPU typically has a lower clock speed and lacks many of the core-management features of a CPU.

Setup Hardware Accelerator GPU in Colab

Steps to set up the GPU:

Go to Runtime → Change runtime type.
Select “GPU” from the popup

Figure 14: Screenshot of GPU’s accelerator selection.

Checking details about the GPU in Colab.

Import important packages

import tensorflow as tf

from tensorflow.python.client import device_lib

Check the GPU accelerator

tf.test.gpu_device_name()

Figure 15: Screenshot of GPU’s accelerator in Google Colab.

Check the hardware used for the GPU.

device_lib.list_local_devices()

Figure 16: Screenshot showcasing the details about the GPU in our session.

Code Example Using a GPU

Check the number of available GPUs without selecting GPU in Runtime. Keep it set to “None.”

Figure 17: Hardware accelerator set to none.

import tensorflow as tf

no_of_gpu = len(tf.config.experimental.list_physical_devices('GPU'))

print("Total GPUS: ", no_of_gpu)

Figure 18: Hardware accelerator is None, so the GPU’s value is 0.

Select Hardware accelerator in runtime to GPU.

Figure 19: Screenshot of hardware accelerator set to GPU.

import tensorflow as tf

no_of_gpu = len(tf.config.experimental.list_physical_devices('GPU'))

print("Total GPUS: ", no_of_gpu)

Figure 20: Hardware accelerator is GPU so the value of GPU is 1.
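The two checks above can be folded into one helper that picks a device string and falls back to the CPU when no GPU is attached. This is a sketch under stated assumptions: the helper name `pick_device` is our own, it uses `tf.config.list_physical_devices`, and the `ImportError` branch exists only so the cell also runs where TensorFlow is not installed.

```python
def pick_device() -> str:
    """Return a TensorFlow device string, preferring the first GPU."""
    try:
        import tensorflow as tf
    except ImportError:
        return "/device:CPU:0"  # TensorFlow is not installed here
    gpus = tf.config.list_physical_devices("GPU")
    return "/device:GPU:0" if gpus else "/device:CPU:0"


print("Selected device:", pick_device())
```

The returned string can be passed to `tf.device(...)`, so the same notebook runs with the accelerator set to either “None” or “GPU.”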

Multiply Tensors on GPU:

try:
    with tf.device('/device:GPU:0'):  # GPU indices start at 0; Colab attaches one GPU
        tensor1 = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
        tensor2 = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
        result = tf.matmul(tensor1, tensor2)
        print(result)
except RuntimeError as exception:
    print(exception)

Figure 21: Tensor multiplication result.
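To sanity-check the result without a GPU, the same 2×3 by 3×2 product can be worked out in plain Python. This is only a verification sketch with a hand-written `matmul` helper of our own, not how TensorFlow computes it.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    if any(len(row) != len(b) for row in a):
        raise ValueError("columns of a must equal rows of b")
    return [
        [sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
        for row in a
    ]


tensor1 = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
tensor2 = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(matmul(tensor1, tensor2))  # [[22.0, 28.0], [49.0, 64.0]]
```

For instance, the top-left entry is 1·1 + 2·3 + 3·5 = 22.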

Clone a GitHub Repository to Google Drive

Code from a GitHub repository can be cloned and stored in Google Drive.

Steps to clone the GitHub repository to Google Drive:

Mount Google Drive.

from google.colab import drive
drive.mount('/content/gdrive')

Figure 22: Google Drive successfully mounted in Google Colab.

Change into Google Drive and create a directory named “project.”

%cd gdrive/My Drive/

!mkdir -p project

%cd project/

Clone the repository, for example:

!git clone https://github.com/saniyaparveez/youtube_video_type_prediction.git

Figure 24: Screenshot showing how to clone a GitHub repository.

Check the cloned project.

!ls
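Re-running the clone cell fails once the folder already exists on Drive. The steps above can be sketched as a small helper that clones only when the target folder is missing. The function names are hypothetical, and the sketch assumes `git` is available on the machine, as it is when called with `!git` above.

```python
import os
import subprocess


def repo_dir_name(repo_url: str) -> str:
    """Derive the folder name git creates from a repository URL."""
    name = repo_url.rstrip("/").rsplit("/", 1)[-1]
    return name[:-4] if name.endswith(".git") else name


def clone_if_missing(repo_url: str, parent_dir: str) -> str:
    """Clone into parent_dir unless the target folder already exists."""
    target = os.path.join(parent_dir, repo_dir_name(repo_url))
    if not os.path.isdir(target):
        subprocess.run(["git", "clone", repo_url, target], check=True)
    return target
```

With Drive mounted as above, `clone_if_missing(url, "/content/gdrive/My Drive/project")` would return the path of the cloned project.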

Colab Magic

Colab offers several handy shortcuts known as magic commands, which perform common operations in a single short command. Line magics use a % prefix, and cell magics, which apply to the whole cell, use a %% prefix.

List All Magic Commands

%lsmagic

Figure 26: List of all of Google Colab’s magic commands.

List Local Directories

%ldir

Get Notebook History

%history

CPU Time

%time sum(range(1_000_000))
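`%time` reports CPU time and wall-clock time for a single statement. Outside a notebook, roughly the same measurement can be sketched with the standard library. The helper name `time_call` is our own, and the numbers it prints vary from run to run.

```python
import time


def time_call(func, *args, **kwargs):
    """Run func once and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = func(*args, **kwargs)
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return result, wall_seconds, cpu_seconds


result, wall, cpu = time_call(sum, range(1_000_000))
print(f"result={result}, wall={wall:.4f}s, cpu={cpu:.4f}s")
```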

How long has the system been running?

!uptime

Display available and used memory

!free -h

print("-"*100)

Display the CPU specification

!lscpu

print("-"*70)

List all running VM processes.

%%sh

echo "List all running VM processes."

ps -ef

echo "Done"

Figure 32: Display all VM running processes.

Embed HTML Text

%%html
<marquee>Towards AI is a great publication platform</marquee>

Design HTML Form

#@title Personal Details

#@markdown Information.

Name = 'Peter' #@param {type: "string"}

Age = 25  #@param {type: "slider", min: 1, max: 100}

zip = 1234  #@param {type: "number"}

Date = '2020-01-26'  #@param {type: "date"}

Gender = "Male"  #@param ['Male', 'Female', 'Other']

#@markdown ---

print("Submitting the form")

print(Name, Age, zip, Date, Gender)

print("Submitted")

Figure 33: Generating a form in Google Colab.
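Once the cell runs, the form fields are ordinary Python variables, so they can be validated like any other input. The sketch below checks values of the kind entered above. The function name, parameter names, and rules are our own assumptions, with the age bounds copied from the slider.

```python
from datetime import date


def validate_details(name, age, zip_code, date_text):
    """Return a list of problems found in the form values (empty if none)."""
    problems = []
    if not name.strip():
        problems.append("Name is empty")
    if not 1 <= age <= 100:  # same bounds as the slider
        problems.append("Age is outside 1-100")
    if zip_code < 0:
        problems.append("zip must not be negative")
    try:
        date.fromisoformat(date_text)  # expects YYYY-MM-DD
    except ValueError:
        problems.append("Date is not in YYYY-MM-DD format")
    return problems


print(validate_details("Peter", 25, 1234, "2020-01-26"))  # []
```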

Plotting

Google Colab can be used for data visualization as well. The following code and graph show a plot containing more than one polynomial: Y = X³ and Y = X³ + X² + X [3].

import matplotlib.pyplot as plt
import numpy as np

x = np.arange(-10, 10)
y = np.power(x, 3)  # y = x^3
y1 = np.power(x, 3) + np.power(x, 2) + x  # y1 = x^3 + x^2 + x

plt.scatter(x, y1, c="red")
plt.scatter(x, y)
plt.show()

Figure 35: A graph showing more than one polynomial.
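The values behind the two scatter series can be checked without plotting. This plain-Python sketch recomputes both polynomials on the same integer points and shows that the red series never falls below the other one, since the two differ by x² + x.

```python
xs = list(range(-10, 10))                 # same points as np.arange(-10, 10)
cubic = [x**3 for x in xs]                # y  = x^3
full = [x**3 + x**2 + x for x in xs]      # y1 = x^3 + x^2 + x

# The gap between the curves is x^2 + x, which is never negative for integers.
gaps = [f - c for f, c in zip(full, cubic)]
print(min(gaps), max(gaps))  # 0 90
```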

The following code draws a heat map.

import matplotlib.pyplot as plt

import numpy as np

import seaborn as sns

length = 10

data = 5 + np.random.randn(length, length)

data += np.arange(length)

data += np.reshape(np.arange(length), (length, 1))

sns.heatmap(data)

plt.show()

TPU (Tensor Processing Unit) in Google Colab

We use Tensor Processing Units (TPUs) to accelerate TensorFlow graphs. A TPU is an AI accelerator application-specific integrated circuit (ASIC) designed by Google specifically for neural network workloads.

TPUs offer very high floating-point throughput. Each TPU packs up to 180 teraflops of floating-point performance and 64 GB of high-bandwidth memory on a single board. A teraflop is a measure of computing speed: one trillion floating-point operations per second.
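To put the 180-teraflop figure in perspective, the arithmetic below estimates the shortest possible runtime for a given number of floating-point operations. It is an idealized lower bound that assumes the peak rate is sustained throughout, which real workloads rarely achieve. The workload size is hypothetical.

```python
PEAK_TFLOPS = 180  # per-board peak quoted above


def ideal_seconds(total_flops: float, tflops: float = PEAK_TFLOPS) -> float:
    """Lower bound on runtime if the hardware sustained its peak rate."""
    return total_flops / (tflops * 1e12)


# Hypothetical workload of 10^15 floating-point operations.
print(f"{ideal_seconds(1e15):.2f} s")  # 5.56 s
```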

Setup TPU in Colab

Steps to setup a TPU in Google Colab:

Go to Runtime → Change runtime type and select “TPU” from the popup.

Figure 36: Selection of TPU hardware accelerator.

Check Running on TPU Hardware Accelerator

This check requires the TensorFlow package. The code below checks whether Colab is running with a TPU accelerator:

import tensorflow as tf

try:
    tpu = tf.distribute.cluster_resolver.TPUClusterResolver()  
    print('Running on TPU ', tpu.cluster_spec().as_dict()['worker'])
except ValueError:
    print('Exception')

Figure 38: Running on TPU hardware accelerator.

If the TPU is not set, the resolver call fails with the ValueError that the code catches, and “Exception” is printed instead.

Conclusion

Google Colab, or Colaboratory, from Google Research, is a Jupyter notebook environment for executing Python-based code to build machine learning or deep learning models.

It is free (unless you would like to go pro) and provides GPU and TPU hardware accelerators. It is effortless to use and share because it requires no configuration.

It allows combining executable code and rich text, along with images, HTML, LaTeX, and more, in a single document. Vital machine learning libraries such as TensorFlow come pre-installed, so it is well suited to building machine learning and deep learning models. Colab is outstanding for developing neural networks.

We can achieve parallelism and execute many threads at once by using a GPU-based hardware accelerator. We can share Google Colab notebooks publicly as tutorial notebooks. Inserting HTML tags and styled text makes for an attractive and meaningful tutorial notebook, and interleaving text with code is remarkably helpful for explaining code flow and logic.

Data scientists and machine learning practitioners can harness the full power of Python libraries to analyze and visualize data, and Google Colab can import data directly from Kaggle and upload code from GitHub.

DISCLAIMER: The views expressed in this article are those of the author(s) and do not represent the views of Carnegie Mellon University nor other companies (directly or indirectly) associated with the author(s). These writings do not intend to be final products, yet rather a reflection of current thinking and being a catalyst for discussion and improvement.

All images are from the author(s) unless stated otherwise.

Published via Towards AI

Resources

Google Colab implementation.

GitHub repository.

References

[1] Google Colab, https://colab.research.google.com/

[2] Python 2 Deprecation, Google Colab, Google, https://research.google.com/colaboratory/faq.html#python-2-deprecation

[3] Machine Learning Algorithms for Beginners with Code Examples in Python, Pratik Shukla, Roberto Iriondo, Towards AI, https://towardsai.net/p/machine-learning/machine-learning-algorithms-for-beginners-with-python-code-examples-ml-19c6afd60daa

[4] Project Jupyter, https://jupyter.org/

[5] Google Colab, FAQ, https://research.google.com/colaboratory/faq.html
