
Custom dataset with Hailo AI Hat, Yolo, Raspberry PI 5, and Docker
Last Updated on April 24, 2025 by Editorial Team
Author(s): Luiz doleron | Luiz d’Oleron
Originally published on Towards AI.
The Hailo AI Hat
Depending on your setup, running Yolo on the RPI 5 CPU provides 1.5 to 8 frames per second (FPS). Even though this performance is impressive for a small device, it is not fast enough for many real-time applications. If you need higher performance, external hardware is required. One of the best options right now is the Hailo AI Hat for Raspberry Pi 5.
Hailo is a chip manufacturer focused on developing artificial intelligence hardware. For this story, I am interested in the AI Hat+, a device they have built specifically for Raspberry Pi 5:

There are two versions of this hat: one with the Hailo-8 chip, which is rated at 26 TOPS (Tera Operations Per Second), and one with the Hailo-8L, which offers 13 TOPS.
In this story, I will use the Hat version with the Hailo-8 chip.
Why Docker
Be prepared to download & install literally gigabytes of third-party libraries. We will use Docker containers as isolated environments to configure and install everything we need without modifying our host machine.
That said, we need to set up two different Docker containers:
- Yolov5 container: This container has two missions. First, we will train the model using our custom dataset. Second, we will convert the model to the ONNX format.
- Hailo container: This container is used to convert the ONNX file to the Hailo HEF format.
Any attempt to use the same Docker container for both tasks results in unnecessary pain due to conflicts of libraries such as numpy. Trust me: save your time by using Docker for what it is designed for!
The first Dockerfile is provided by Hailo. The second Dockerfile is available later in this story.
The dataset used in this example
I will use the Tech Zizou Labeled Mask Dataset. You can find it here.

Download the file from Kaggle and uncompress it as follows:
mkdir source
unzip -qq archive.zip -d source/
The files in source/obj are not organized in the structure that Yolo expects. Fortunately, the code below fixes this:
import os, shutil, random
# preparing the folder structure
full_data_path = 'source/obj/'
extension_allowed = '.jpg'
split_percentage = 90
images_path = 'datasets/images/'
if os.path.exists(images_path):
    shutil.rmtree(images_path)
os.mkdir(images_path)
labels_path = 'datasets/labels/'
if os.path.exists(labels_path):
    shutil.rmtree(labels_path)
os.mkdir(labels_path)
training_images_path = images_path + 'train/'
validation_images_path = images_path + 'val/'
training_labels_path = labels_path + 'train/'
validation_labels_path = labels_path + 'val/'
os.mkdir(training_images_path)
os.mkdir(validation_images_path)
os.mkdir(training_labels_path)
os.mkdir(validation_labels_path)
# collecting the image names (without extension)
files = []
ext_len = len(extension_allowed)
for r, d, f in os.walk(full_data_path):
    for file in f:
        if file.endswith(extension_allowed):
            strip = file[0:len(file) - ext_len]
            files.append(strip)
# shuffling and splitting into training and validation sets
random.shuffle(files)
size = len(files)
split = int(split_percentage * size / 100)
print("copying training data")
for i in range(split):
    strip = files[i]
    image_file = strip + extension_allowed
    src_image = full_data_path + image_file
    shutil.copy(src_image, training_images_path)
    annotation_file = strip + '.txt'
    src_label = full_data_path + annotation_file
    shutil.copy(src_label, training_labels_path)
print("copying validation data")
for i in range(split, size):
    strip = files[i]
    image_file = strip + extension_allowed
    src_image = full_data_path + image_file
    shutil.copy(src_image, validation_images_path)
    annotation_file = strip + '.txt'
    src_label = full_data_path + annotation_file
    shutil.copy(src_label, validation_labels_path)
print("finished")
The code assumes the data is in the source/obj/ folder and outputs the data into the datasets folder. Name the file tidy_data.py and run it as follows:
mkdir datasets
python tidy_data.py
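Before moving on, it may be worth confirming that every copied image has a matching annotation file. The sketch below is not part of the original workflow: the function name find_unpaired is hypothetical, and it assumes the datasets/images and datasets/labels layout produced by tidy_data.py, with .jpg images and .txt labels.

```python
from pathlib import Path


def find_unpaired(images_dir, labels_dir, image_ext=".jpg"):
    """Return (images without a label, labels without an image) as sorted name lists."""
    image_stems = {p.stem for p in Path(images_dir).glob(f"*{image_ext}")}
    label_stems = {p.stem for p in Path(labels_dir).glob("*.txt")}
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)


if __name__ == "__main__":
    # Assumed layout: the folders created by tidy_data.py
    for split in ("train", "val"):
        images_dir = Path("datasets/images") / split
        labels_dir = Path("datasets/labels") / split
        if not images_dir.is_dir() or not labels_dir.is_dir():
            print(f"{split}: folder missing, run tidy_data.py first")
            continue
        no_label, no_image = find_unpaired(images_dir, labels_dir)
        print(f"{split}: {len(no_label)} images without label, "
              f"{len(no_image)} labels without image")
```

If both counts are zero for train and val, the folders are consistent.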

We end up with the following structure:

Some things to keep in mind here:
- This dataset has only two classes: using mask and without mask
- There are only 1,359 training images and 151 validation images
The training data is small. Training a model from scratch only using this data will produce very poor models, a scenario known as overfitting.
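With so little data, it can also help to check how the two classes are balanced. The following is a minimal sketch, assuming the usual YOLO annotation layout in which each line of a .txt label file starts with the class index followed by the normalized box coordinates. The function name count_classes is hypothetical.

```python
from collections import Counter
from pathlib import Path


def count_classes(labels_dir):
    """Count annotated boxes per class index across YOLO .txt label files."""
    counts = Counter()
    for label_file in Path(labels_dir).glob("*.txt"):
        for line in label_file.read_text().splitlines():
            fields = line.split()
            if fields:  # skip blank lines
                counts[int(fields[0])] += 1  # first field is the class index
    return counts
```

For instance, count_classes("datasets/labels/train") returns a Counter mapping each class index to its number of boxes in the training split.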
We are not going into the modeling details here. Anyway, to make things easier, we will use a technique called Transfer Learning, which simply consists of feeding pretrained weights into the model before the training starts. In particular, we will use the weights provided by Ultralytics, trained using the COCO database.
YOLOv5
The latest Ultralytics Yolo version is 11. It is faster and more accurate than YOLOv5, but this does not mean Yolov5 is obsolete. Indeed, Ultralytics says Yolov5 is preferable in some specific scenarios.
Check the one-to-one Yolo 11 x Yolov5 comparison here to learn more.
I have a strong reason to avoid using Yolo 11 in this story: the Hailo stack does not support Yolo 11 yet.
If you really don’t want to use Yolov5, you can adapt this story to Yolo 8 with little effort.
Linux, friend, Linux!
This story uses Linux. Ubuntu LTS, to be clear.
For AI development, I recommend using Ubuntu 20.04 or 22.04. LTS all the way!
Mission briefing

The whole process consists of three simple steps:
- Step 1: Training the custom model. In this step, we use our custom data plus the pretrained YOLOv5 weights to train a model to perform our detection task (in our case, detecting faces with or without a facemask). The output of this step is a PyTorch best.pt file. This file holds only the parameter values of our model.
- Step 2: Convert best.pt to the ONNX format. ONNX is an open format for machine learning models. The output of this step is a best.onnx file.
- Step 3: Convert best.onnx to the HEF format. The Hailo Executable Format is a highly optimized model format specifically designed to run on Hailo chips. In this step, we will convert the ONNX file into a HEF file.
Once we have the model in the .hef format, we simply deploy it on the Raspberry PI and test it.
Step 1: Training your custom data
There is nothing really new in training the model for the Hailo architecture. You can train your model as usual.
Just skip this step if you have your model already. Otherwise, keep reading here.
First, if Docker is not on your system, install it. Also, install the NVIDIA Container Toolkit.
Hailo shares a GitHub repository with the required resources. Clone it:
git clone https://github.com/hailo-ai/hailo_model_zoo

We are looking for the YOLOv5 training Dockerfile in the hailo_model_zoo/training/yolov5 folder. Move to this folder and build the image using the following command:
cd hailo_model_zoo/training/yolov5
docker build -t yolov5:v0 .

Now, run the container:
docker run -it --name custom_training --gpus all --ipc=host -v /home/doleron/hailo/shared:/home/hailo/shared yolov5:v0
In a nutshell, the flag -it asks Docker to run the container in interactive mode, which is necessary to execute the commands afterwards.
If you are new to Docker, read the Docker 101 tutorial.
The parameter -v /home/doleron/hailo/shared:/home/hailo/shared maps the folder /home/doleron/hailo/shared on my machine to the folder /home/hailo/shared in the container.
The flag --gpus all tells Docker to use any GPU available on the host machine.
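Since the host path will differ on your machine, one way to keep the flags straight is to assemble the command from its parts. This is only an illustrative sketch: the helper docker_run_command is hypothetical and simply rebuilds the same command line shown above, without running Docker itself.

```python
import shlex


def docker_run_command(image, name, host_dir, container_dir, use_gpu=True):
    """Assemble the `docker run` argument list used in this story."""
    args = ["docker", "run", "-it", "--name", name]  # -it: interactive terminal
    if use_gpu:
        args += ["--gpus", "all"]  # expose every host GPU to the container
    # -v maps a host folder to a folder inside the container
    args += ["--ipc=host", "-v", f"{host_dir}:{container_dir}", image]
    return args


command = docker_run_command(
    image="yolov5:v0",
    name="custom_training",
    host_dir="/home/doleron/hailo/shared",  # replace with your own shared folder
    container_dir="/home/hailo/shared",
)
print(shlex.join(command))
```

Changing only host_dir adapts the command to another machine while the container side stays the same.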
Now we are at a command prompt inside the container. We can check the contents of /home/hailo/shared to make sure our dataset files are there:
ls /home/hailo/shared/ -als

This container does not have the nano editor. Unless you are a Vim user, I recommend installing nano as follows:
sudo apt update
sudo apt install nano -y
After installing nano, we can move forward and set up our training. Copy the datasets folder into the workspace folder:
cp -r /home/hailo/shared/datasets ..

Now, write the data/dataset.yaml file:
nano data/dataset.yaml
This is the content of data/dataset.yaml:
train: ../datasets/images/train
val: ../datasets/images/val
nc: 2
names:
  0: 'using mask'
  1: 'without mask'
Type control-x, y, and Enter to save the file and exit nano.
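If you prefer not to type the file by hand, the same content can be generated from a list of class names, which also keeps nc in sync with the number of names. This is a minimal sketch with a hypothetical helper, dataset_yaml. It writes plain text rather than using a YAML library, so it assumes class names without quote characters.

```python
def dataset_yaml(train_dir, val_dir, class_names):
    """Render the dataset description in the layout shown above."""
    lines = [
        f"train: {train_dir}",
        f"val: {val_dir}",
        f"nc: {len(class_names)}",  # number of classes follows the name list
        "names:",
    ]
    # the entries under names must be indented to be valid YAML
    lines += [f"  {index}: '{name}'" for index, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


content = dataset_yaml(
    "../datasets/images/train",
    "../datasets/images/val",
    ["using mask", "without mask"],
)
print(content)
```

Writing content to data/dataset.yaml gives the same result as editing the file in nano.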

It is time to train our model! Make sure you are in the /workspace/yolov5 folder and type:
python train.py --img 640 --batch 16 --epochs 100 --data dataset.yaml --weights yolov5s.pt
If you get an error like
RuntimeError: CUDA out of memory.
try to reduce --batch 16 to --batch 8 or even less.
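The halving strategy can be written down explicitly. The sketch below only prints the candidate commands and does not launch any training. Both helper names are hypothetical, and the command string is the one used in this story.

```python
def batch_size_candidates(start=16, minimum=1):
    """Return the batch sizes to try in order, halving after each failure."""
    minimum = max(1, minimum)  # guard against a non-terminating loop
    sizes = []
    size = start
    while size >= minimum:
        sizes.append(size)
        size //= 2
    return sizes


def train_command(batch_size):
    """Build the training command from this story with a given batch size."""
    return (
        "python train.py --img 640 "
        f"--batch {batch_size} --epochs 100 "
        "--data dataset.yaml --weights yolov5s.pt"
    )


for size in batch_size_candidates(16, minimum=4):
    print(train_command(size))
```

Try the commands from top to bottom and stop at the first one that fits in your GPU memory.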
I hope you’re familiar with basic machine learning terminology: batches, epochs, etc. You can tweak these hyperparameters following this guide.
If everything is good, your GPU will start burning:

HOORAY!! My RTX 4070 is burning!

Keep it under 81° Celsius and you’re fine.

In my case, this training took roughly 40 minutes.
For reference, training on another machine with a GTX 1080 took around 2 hours.
In the end, you have something like this:

This means the training is done. We can check the results in the runs/exp0 folder. Copy this folder to the shared area:
mkdir /home/hailo/shared/runs
cp -r runs/exp0 /home/hailo/shared/runs/
You end up with a folder like this:

We can check the training results:

Comparing the top row charts (training performance) and the second row charts (validation performance), we find that the model does not overfit.
We can also check some inference examples:

It is always worth mentioning that a separate set of instances, not seen during training, is required to assess the model quality.
The model can be improved using regular machine learning engineering to achieve even higher performance. However, this is not our focus right now.
Remember: we are focusing on learning how to use models like this on the Raspberry Pi with the Hailo AI Hat.
That said, let’s move to the next step!
Step 2: Converting the best.pt file to ONNX
Back in the container, the best weights are in the file runs/exp0/weights/best.pt. We can convert it to ONNX using the following command:
python3 models/export.py --weights runs/exp0/weights/best.pt --img 640
Note that best.onnx is generated:

Copy best.onnx to the host machine:
cp runs/exp0/weights/best.onnx /home/hailo/shared/
We are done with this container. Exit it if you want.
Step 3: Converting ONNX to HEF
The easiest part of this tutorial was training a custom model with Yolov5 and converting the result file to ONNX. Now, it is time to compile the ONNX file into the proprietary Hailo Executable Format (HEF).
Start a new terminal anywhere and write this Dockerfile:
# using a CUDA supported Ubuntu 22.04 image as base
FROM nvidia/cuda:12.4.1-cudnn-runtime-ubuntu22.04 AS base_cuda
ENV DEBIAN_FRONTEND=noninteractive
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
RUN apt-get update && \
    apt-get install -y \
    # see: the Hailo DFC user guide
    python3.10 \
    python3.10-dev \
    python3.10-venv \
    python3.10-distutils \
    python3-pip \
    python3-tk \
    graphviz \
    libgraphviz-dev \
    libgl1-mesa-glx \
    # utilities
    python-is-python3 \
    build-essential \
    sudo \
    curl \
    git \
    nano && \
    # clean up
    rm -rf /var/lib/apt/lists/*
# update pip
RUN python3 -m pip install --upgrade pip setuptools wheel
WORKDIR /workspace
Save it as Dockerfile and build the image using the following command:
docker build -t hailo_compiler:v0 .
Once the image is built, start the container as follows:
docker run -it --name compile_onnx_file --gpus all --ipc=host -v /home/doleron/hailo/shared:/home/hailo/shared hailo_compiler:v0
This command gives us a command prompt inside the container machine:

This looks like déjà vu: we just performed similar steps in the previous section. So what?
This is the point: we are setting up a second isolated container to install the Hailo stuff without caring about conflicts with other libraries. In particular, we need to install three packages:
- Hailort: the Hailo runtime platform
- Hailort Wheel: the Hailort Python library
- Hailo DFC: Hailo Dataflow Compiler
The FOSS community is used to the open-source ecosystem, where everything can be installed from publicly available repositories. Hailo, on the other hand, is playing in the AI market, a challenging and wild business world. As a result, their software is not open-source yet. Fortunately, the Hailo software is at least free.
To use the Hailo stuff, we must create an account on Hailo Network, access the software download page, and download three packages:
- hailort_4.21.0_amd64.deb
- hailort-4.21.0-cp310-cp310-linux_x86_64.whl
- hailo_dataflow_compiler-3.31.0-py3-none-linux_x86_64.whl

Save them somewhere in the shared folder and copy them into the container:
cp /home/hailo/shared/libs/* .
Before installing the software, create a virtual environment for Python and activate it:
python -m venv .venv
source .venv/bin/activate
Then, install the Hailo RT bundle:
dpkg -i ./hailort_4.21.0_amd64.deb

Next, install Hailo RT Python API:
pip install ./hailort-4.21.0-cp310-cp310-linux_x86_64.whl
Now, install Hailo DFC:
pip install ./hailo_dataflow_compiler-3.31.0-py3-none-linux_x86_64.whl
Note that the package versions represent the current stage of Hailo software during the writing of this story. They must match the container Python version (3.10).
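The Python version a wheel was built for is encoded in its file name: the tag cp310 means CPython 3.10, while py3 means any Python 3. A quick check can be sketched as below. This is a simplified illustration with hypothetical helper names. It only inspects the Python tag and ignores the ABI and platform tags that pip also verifies.

```python
import sys


def wheel_python_tag(wheel_filename):
    """Return the Python tag of a wheel file name, e.g. 'cp310' or 'py3'."""
    stem = wheel_filename
    if stem.endswith(".whl"):
        stem = stem[: -len(".whl")]
    # name-version[-build]-python-abi-platform: the Python tag is third from the end
    return stem.split("-")[-3]


def wheel_fits_interpreter(wheel_filename, version_info=sys.version_info):
    """Simplified check: does the wheel's Python tag match the given interpreter?"""
    major, minor = version_info[0], version_info[1]
    accepted = {f"cp{major}{minor}", f"py{major}{minor}", f"py{major}"}
    # a tag may list several alternatives separated by dots, e.g. 'py2.py3'
    return any(tag in accepted for tag in wheel_python_tag(wheel_filename).split("."))


for name in (
    "hailort-4.21.0-cp310-cp310-linux_x86_64.whl",
    "hailo_dataflow_compiler-3.31.0-py3-none-linux_x86_64.whl",
):
    print(name, wheel_python_tag(name), wheel_fits_interpreter(name, (3, 10)))
```

If you download a different HailoRT release, pick the wheel whose tag matches the Python version inside the container.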
We are not done yet. We must clone and install hailo_model_zoo:
git clone https://github.com/hailo-ai/hailo_model_zoo.git
cd hailo_model_zoo
pip install -e .
Check if hailomz is properly set up:
hailomz --version

Hold on! The hardest part is now: compiling the best.onnx file into a HEF file.
To make this work, we need to change the number of classes in hailo_model_zoo/cfg/postprocess_config/yolov5s_nms_config.json to 2:

Note that there is a hailo_model_zoo folder inside the hailo_model_zoo repository!
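The change can be made in nano, or scripted as in the sketch below. Note the assumption: the field name "classes" is my guess at the key that holds the number of classes, so open the file and confirm the actual key before relying on it. The helper set_num_classes is hypothetical.

```python
import json


def set_num_classes(config_path, num_classes, key="classes"):
    """Rewrite one integer field of a JSON config in place and return the old value.

    The default key name is an assumption; check it against the real file.
    """
    with open(config_path) as handle:
        config = json.load(handle)
    if key not in config:
        raise KeyError(f"{key!r} not found in {config_path}")
    previous = config[key]
    config[key] = num_classes
    with open(config_path, "w") as handle:
        json.dump(config, handle, indent=4)
        handle.write("\n")
    return previous
```

For this story, the call would be set_num_classes("hailo_model_zoo/cfg/postprocess_config/yolov5s_nms_config.json", 2), run from the root of the cloned repository.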
Before starting the compiler, set the USER environment variable:
export USER=hailo
Now, call hailomz as follows:
hailomz compile --ckpt /home/hailo/shared/best.onnx --calib-path /home/hailo/shared/datasets/images/train/ --yaml hailo_model_zoo/cfg/networks/yolov5s.yaml
Take your time. Wait 10 minutes while hailomz optimizes and compiles your model:

If everything is good, you get the following message:

Note that converting ONNX to HEF includes a new element: the calibration images. Calibration images are examples from the feature space that the Hailo compiler uses to optimize the model. I didn’t find any documentation about this, but the hailomz compiler once warned me to use more than 1024 instances. Thus, using the training images themselves seems to work.
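If your training set is much larger than needed, you could pass a random subset as calibration data instead of the whole folder. The sketch below is one way to build such a folder. The function sample_calibration_set and the target folder name are hypothetical, and the right number of images is up to you, keeping in mind the 1024 figure from the compiler warning.

```python
import random
import shutil
from pathlib import Path


def sample_calibration_set(source_dir, target_dir, count, seed=0, ext=".jpg"):
    """Copy a reproducible random sample of `count` images into target_dir."""
    images = sorted(Path(source_dir).glob(f"*{ext}"))
    if len(images) < count:
        raise ValueError(f"only {len(images)} images available, {count} requested")
    chosen = random.Random(seed).sample(images, count)  # fixed seed: same subset each run
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    for image in chosen:
        shutil.copy(image, target / image.name)
    return len(chosen)
```

The resulting folder can then be given to the --calib-path option in place of the training images folder.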
Copy yolov5s.hef to the shared area:
cp yolov5s.hef /home/hailo/shared/
The hardest part is done. We can quit the container instance.
Deploying the model on Raspberry PI 5
Copy yolov5s.hef to the Raspberry Pi.
The details of running Hailo applications on the RPI are beyond the scope of this story.
On Raspberry Pi, run the following commands:
git clone https://github.com/hailo-ai/hailo-rpi5-examples.git
cd hailo-rpi5-examples
source setup_env.sh
python basic_pipelines/detection.py --labels-json custom.json --hef-path /home/pi/Documents/yolov5s.hef --input /home/pi/Documents/videoplayback.mp4 -f
where custom.json is:
{
    "detection_threshold": 0.5,
    "max_boxes": 200,
    "labels": [
        "unlabeled",
        "with mask",
        "without mask"
    ]
}
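Notice that the labels list in the file above starts with an extra "unlabeled" entry before the two classes. A small generator keeps that convention when you change the class names. This is a sketch with a hypothetical helper, labels_config, and it only reproduces the three fields shown above.

```python
import json


def labels_config(class_names, detection_threshold=0.5, max_boxes=200):
    """Build the labels file content, with 'unlabeled' placed before the classes."""
    return {
        "detection_threshold": detection_threshold,
        "max_boxes": max_boxes,
        "labels": ["unlabeled", *class_names],
    }


config = labels_config(["with mask", "without mask"])
print(json.dumps(config, indent=4))
```

Saving the printed text as custom.json reproduces the file used in the command above.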
The result of using this video is:

The object detection achieves 30 fps even on HD resolution. This is impressive! You can explore other input types, for example:
python basic_pipelines/detection.py --labels-json custom.json --hef-path /home/pi/Documents/yolov5s.hef --input usb -f
or
python basic_pipelines/detection.py --labels-json custom.json --hef-path /home/pi/Documents/yolov5s.hef --input rpi -f
Check the Hailo RPI example repository for more parameters and usage examples.
Using other Yolo versions
It is noteworthy that, at the time of this writing, the Hailo model compiler has been tested only with Yolo3, Yolo4, Yolov5, Yolov8, and YoloX.
Check the Hailo Developer Zone to know when the Hailo compiler will support the newer Yolo versions.
Conclusion
We showed the complete sequence of steps to train a custom dataset, compile, and deploy the model on a Raspberry PI 5 using the Hailo AI Hat.
I’m looking forward to figuring out what else the AI Hat can do. But this is a talk for another story.
How to contribute
Neither Hailo nor Ultralytics paid me to write this story. If you liked this tutorial and want to contribute, check a charity institution near you and provide your assistance with money and/or time. No citation is needed. You needn’t let me know either.
Useful links
The story already includes the links the reader needs. However, there are some links I want to cite. First, the text from @mgreiner79 was one of my first sources and really helpful:
How to run Hailo Dataflow Compiler using Docker on Windows
My old story introducing Custom Datasets with Yolov5 also helped me to write this new story:
Training YOLOv5 custom dataset with ease
Have a question? Drop a message and I will reply as soon as possible.
Regards,
Luiz
Published via Towards AI