![Unifying Scikit-learn Pipelines and PyTorch Models with ONNX for Deployment](https://i2.wp.com/miro.medium.com/v2/resize:fit:512/1*63MtBc3S4aHRXWD20n3w9Q.png?w=1920&resize=1920,1920&ssl=1)
Unifying Scikit-learn Pipelines and PyTorch Models with ONNX for Deployment
Last Updated on February 10, 2025 by Editorial Team
Author(s): Fernando Nieuwveldt
Originally published on Towards AI.
I have blogged in the past about a common machine learning design pattern: doing feature engineering and preprocessing in a scikit-learn pipeline alongside a deep learning framework, such as PyTorch or TensorFlow/Keras, where the model is built. The issue with this workflow is that it creates a disconnect between preprocessing, model building, and ultimately model deployment, making it a flawed approach.
Previously, I wrote about this issue specifically in cases where the model is built in one of the major deep learning frameworks. The solution was to implement feature engineering and encoding natively within the deep learning framework of choice. You can find my posts on this here and here.
Integrating preprocessing and feature engineering into your neural network architecture definitely has advantages. For example, having a complete blueprint of your model in a PyTorch model architecture allows you to serve it efficiently using something like TorchServe or TensorFlow Serving for optimized, fast inference. This led me to write this post and explore whether there's a better way to handle this. Sure, fully implementing your solution with preprocessing and feature engineering natively in PyTorch or Keras has many advantages, but it also introduces some complexity. That's when I stumbled upon ONNX and how it could help with the design pattern above. ONNX (Open Neural Network Exchange) is an open standard format for representing machine learning models, enabling interoperability between different frameworks like PyTorch, TensorFlow, and scikit-learn.
Using ONNX, we can still build our feature pipelines and models in the framework of our choice for the training pipeline, and have an extra bridge to ONNX for the inference pipeline. We should see scikit-learn and PyTorch (in the scenario discussed here) as model-building or training frameworks: they are not built for production or inference at scale, scikit-learn especially. This is what we will explore in this post.
What We'll Cover
We will discuss how to leverage our favorite model-building frameworks for training while using ONNX for inference and production. Specifically, we will:
- Use the heart dataset for illustration
- Separate and decouple training from inference at the framework level, creating a more compute-optimised workflow
- Convert the scikit-learn preprocessing pipeline and the PyTorch model to ONNX format
- Unify the preprocessing pipeline and model into a single optimised ONNX graph
- Discuss some benefits of this approach and of ONNX in general
This is the ONNX Graph we will build:
Installation and imports
!pip install pandas onnx onnxruntime torch lightning skl2onnx netron
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
import torch
import torch.nn as nn
import torch.nn.functional as F
import pytorch_lightning as pl
from torch.utils.data import DataLoader, TensorDataset, random_split
import onnx
import onnxruntime as ort
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType, Int32TensorType
Setting up the preprocessing pipeline using SKLearn
Next, we will apply scaling and one-hot encoding to our features using scikit-learn:
file_url = "http://storage.googleapis.com/download.tensorflow.org/data/heart.csv"
dataframe = pd.read_csv(file_url)
labels = dataframe.pop("target")
NUMERICAL_FEATURES = ['age', 'trestbps', 'chol', 'thalach', 'oldpeak', 'slope']
CATEGORICAL_FEATURES = ['sex', 'cp', 'fbs', 'restecg', 'exang', 'ca']
# Define transformers for numerical and categorical features
numerical_transformer = StandardScaler()
categorical_transformer = OneHotEncoder(handle_unknown="ignore")
# Create ColumnTransformer
preprocessor = ColumnTransformer(
transformers=[
("numerical_transformer", numerical_transformer, NUMERICAL_FEATURES),
("categorical_transformer", categorical_transformer, CATEGORICAL_FEATURES),
]
)
preprocessor.fit(dataframe)
transformed_data_example = preprocessor.transform(dataframe)
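Before moving on, it can be worth a quick sanity check on what the fitted preprocessor produces. A minimal sketch, assuming scikit-learn >= 1.0 for get_feature_names_out:
# inspect the transformed shape and the first few generated feature names
print(transformed_data_example.shape)
print(preprocessor.get_feature_names_out()[:5])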
Training model with PyTorch
The goal of this article is not really to train a model, but for completeness we chose PyTorch Lightning as the easiest and quickest way to get set up for training:
N_FEATURES = transformed_data_example.shape[1]
dataset = TensorDataset(
torch.from_numpy(transformed_data_example).float(),
torch.from_numpy(labels.values.reshape((-1, 1))).float()
)
train_size = int(0.8 * len(dataset))
val_size = len(dataset) - train_size
train_dataset, val_dataset = random_split(dataset, [train_size, val_size])
train_dataloader = DataLoader(train_dataset, batch_size=32, shuffle=True)
val_dataloader = DataLoader(val_dataset, batch_size=32, shuffle=False)
class Model(pl.LightningModule):
def __init__(self, n_features):
super(Model, self).__init__()
self.fc1 = nn.Linear(n_features, 32)
self.fc2 = nn.Linear(32, 16)
self.fc3 = nn.Linear(16, 1)
def forward(self, x):
x = self.fc1(x)
x = F.relu(x)
x = self.fc2(x)
x = F.relu(x)
x = self.fc3(x)
# Apply sigmoid to obtain output probabilities
x = torch.sigmoid(x)
return x
def training_step(self, batch, batch_idx):
x, y = batch
y_hat = self(x)
# Ensure y is a float tensor and has the same shape as y_hat
loss = F.binary_cross_entropy(y_hat, y.float())
self.log('train_loss', loss, prog_bar=True)
return loss
def validation_step(self, batch, batch_idx):
x, y = batch
y_hat = self(x)
loss = F.binary_cross_entropy(y_hat, y.float())
self.log('val_loss', loss, prog_bar=True)
return loss
def configure_optimizers(self):
optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)
return optimizer
model = Model(n_features=N_FEATURES)
trainer = pl.Trainer(max_epochs=20)
trainer.fit(model, train_dataloader, val_dataloader)
We now have a preprocessing pipeline built with scikit-learn and a model built with PyTorch. We are now done with our model training and build process. In the next section we will build an optimised, unified ONNX model for inference.
Converting to ONNX
In this section we will start the conversion process to ONNX format. This will entail three subsections:
- Converting the Preprocessing Pipeline to ONNX.
- Converting the Torch Model to ONNX.
- Unifying Preprocessing Pipeline and Models with ONNX.
Converting the Scikit-Learn Preprocessing Pipeline to ONNX
Here we will convert the scikit-learn pipeline to ONNX. We will also get the input data ready for an ONNX inference session.
# convert and save preprocessor
# Cast features to appropriate onnx format
# - numerical features ---> FloatTensorType
# - categorical features ---> Int32TensorType
initial_numerical_types = [
(numerical_feature, FloatTensorType([None, 1]))
for numerical_feature in NUMERICAL_FEATURES
]
initial_categorical_types = [
(categorical_feature, Int32TensorType([None, 1]))
for categorical_feature in CATEGORICAL_FEATURES
]
initial_types = initial_numerical_types + initial_categorical_types
onnx_model = convert_sklearn(preprocessor, initial_types=initial_types)
with open("preprocessing_sklearn_pipeline.onnx", "wb") as f:
f.write(onnx_model.SerializeToString())
Using the netron package we can visualise the preprocessing subgraph.
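For reference, netron can be launched directly from Python; a minimal sketch:
# opens an interactive, browser-based viewer for the saved preprocessing graph
import netron
netron.start("preprocessing_sklearn_pipeline.onnx")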
Converting the Torch Model to ONNX
In this section we will convert the PyTorch model. ONNX export support comes bundled with torch.
# use the preprocessed data as a dummy input to trace and export the torch model
dummy_input = torch.tensor(transformed_data_example, dtype=torch.float32)
torch.onnx.export(
    model,
    dummy_input,
    "torch_model.onnx",
    input_names=["input"],
    output_names=["output"],
    # allow variable batch sizes at inference time
    dynamic_axes={"input": {0: "batch_size"}, "output": {0: "batch_size"}},
)
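As a quick sanity check, we can load the exported file back and run the standard ONNX validator; a minimal sketch:
# load the exported model and validate its structure (raises if malformed)
exported_model = onnx.load("torch_model.onnx")
onnx.checker.check_model(exported_model)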
This is the PyTorch model graph:
Unifying the SKLearn preprocessing pipeline with our PyTorch Neural Network
We have both the scikit-learn preprocessing pipeline and the torch model saved to ONNX format. The last part is to unify these ONNX models into one graph. This can be a bit tricky.
We need to make sure:
- The input name to the torch model graph is the output of the preprocessing subgraph.
- The first PyTorch model node should also explicitly use the preprocessing output.
- Set the opset version; we set it to 17.
- Create a new ONNX graph, and then a model from it.
- Save the new model.
# Load the preprocessing ONNX model
preprocessing_model = onnx.load("preprocessing_sklearn_pipeline.onnx")
# Load the PyTorch ONNX model
pytorch_model = onnx.load("torch_model.onnx")
# Get the preprocessing model's output name
preprocessing_output_name = preprocessing_model.graph.output[0].name
# Update PyTorch model input name to match preprocessing output
pytorch_model.graph.input[0].name = preprocessing_output_name # Rename input
# Ensure the PyTorch model input correctly refers to preprocessing model output
assert (
preprocessing_output_name == pytorch_model.graph.input[0].name
), "Mismatched preprocessing output and PyTorch input!"
# Update PyTorch model nodes to explicitly use the preprocessing output
# And we only do it for the first node
for counter, node in enumerate(pytorch_model.graph.node):
if counter == 0:
node.input[0] = preprocessing_output_name
break
# Combine the two graphs
combined_graph = onnx.helper.make_graph(
nodes=list(preprocessing_model.graph.node) + list(pytorch_model.graph.node),
name="UnifiedPipeline",
inputs=preprocessing_model.graph.input,
outputs=pytorch_model.graph.output,
initializer=list(preprocessing_model.graph.initializer)
+ list(pytorch_model.graph.initializer),
)
unified_opset_version = 17 # Adjust based on your opset version
# Combine the existing opset imports from both models
combined_opset_import = [
onnx.helper.make_opsetid("ai.onnx.ml", 1), # Scikit-learn opset
onnx.helper.make_opsetid("", unified_opset_version),
]
# Create a new ONNX model with the combined graph
combined_model = onnx.helper.make_model(
combined_graph, opset_imports=combined_opset_import
)
# Save the combined model
onnx.save(combined_model, "unified_model.onnx")
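Since hand-merging graphs is error-prone, it is worth validating the combined model before using it. A minimal sketch that checks the merged structure and prints its input/output signature:
# validate the merged graph and confirm its inputs and outputs
onnx.checker.check_model(combined_model)
print([inp.name for inp in combined_model.graph.input])
print([out.name for out in combined_model.graph.output])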
This gives us our full inference graph, containing both the preprocessing and the model.
Comparisons
In this section we run the two scenarios with and without ONNX conversion and:
- Compare results
- Calculate execution time
One thing to note is that we need to convert our pandas DataFrame to dictionaries for the ONNX inference session.
# Prepare input data as dictionaries with numpy type values
numerical_input_data = {
feature: dataframe[feature].to_numpy().reshape(-1, 1).astype(np.float32)
for feature in NUMERICAL_FEATURES
}
categorical_input_data = {
feature: dataframe[feature].to_numpy().reshape(-1, 1).astype(np.int32)
for feature in CATEGORICAL_FEATURES
}
# combine numerical and categorical features into one dict
input_data = {**numerical_input_data, **categorical_input_data}
# Load the unified ONNX model
session = ort.InferenceSession("unified_model.onnx")
onnx_outputs = session.run(None, input_data)
# calling the model separately, outside ONNX
transformed_data = preprocessor.transform(dataframe)
non_onnx_outputs = model(torch.tensor(transformed_data, dtype=torch.float32))
# Check if arrays are nearly equal
are_equal = np.allclose(
onnx_outputs[0],
non_onnx_outputs.detach().numpy(),
rtol=1e-6, atol=1e-6
)
print(are_equal)
# True
Below we ran some speed comparisons and saw a 20- to 30-fold speedup on a MacBook M1:
import time
onnx_times = []
for k in range(1000):
start_time = time.time()
_ = session.run(None, input_data)
onnx_times.append(time.time() - start_time)
avg_onnx_time = np.mean(onnx_times)
sequential_execution_time = []
for k in range(1000):
start_time = time.time()
transformed_data = preprocessor.transform(dataframe)
non_onnx_outputs = model(torch.tensor(transformed_data, dtype=torch.float32))
sequential_execution_time.append(time.time() - start_time)
avg_sequential_execution_time = np.mean(sequential_execution_time)
print(f"ONNX Inference Time: {avg_onnx_time:.6f} seconds")
print(f"Sequential Inference Time: {avg_sequential_execution_time:.6f} seconds")
print(f"ONNX speed increase {avg_sequential_execution_time/avg_onnx_time}")
ONNX Inference Time: 0.000096 seconds
Sequential Inference Time: 0.002689 seconds
ONNX speed increase 27.905641984838585
Advantages of Using ONNX for Unification
Below we will list some advantages of building a unified pipeline:
Single Model Deployment:
- Simplifies the deployment process by handling both models within one ONNX file.
- We still get to use our favourite frameworks for model building.
Reduced External Dependencies:
- A unified model runs entirely within ONNX Runtime, meaning you don't need scikit-learn or PyTorch as dependencies during inference.
- If we served the original pipeline with something like FastAPI, we would have to install scikit-learn and PyTorch into the Docker image, which would make it quite bulky.
- Even if we used ONNX inside FastAPI but kept preprocessing separate, we would still need NumPy installed and would incur the extra latency of running preprocessing first and then the model.
Optimized Performance:
- ONNX Runtime can apply graph-level optimizations across the entire merged pipeline, potentially reducing latency and improving inference speed (see the sketch after this list).
- Unifying the graphs ensures that data seamlessly flows from preprocessing to prediction.
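These optimizations are configurable through SessionOptions. A minimal sketch of enabling the most aggressive graph optimization level; the optimized-model filepath is an assumption for illustration:
sess_options = ort.SessionOptions()
# apply all available graph-level optimizations (constant folding, node fusions, etc.)
sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
# optionally persist the optimized graph to disk for inspection
sess_options.optimized_model_filepath = "unified_model_optimized.onnx"
session = ort.InferenceSession("unified_model.onnx", sess_options)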
Easier Maintenance and Debugging:
- With a unified pipeline, there's a single model to monitor, version, and debug. This helps isolate issues within one cohesive framework rather than juggling multiple independent models.
Optimized Docker images:
- We only need onnxruntime at inference time, which results in small Docker images and enables quicker horizontal scaling during sudden traffic bursts. A sketch of a minimal inference-only entry point follows.
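To make the last point concrete, here is a sketch of what such an entry point could look like; the predict helper and its record format are illustrative assumptions, and only onnxruntime and NumPy are needed at runtime:
# minimal_inference.py - a sketch of an inference-only entry point
import numpy as np
import onnxruntime as ort
NUMERICAL_FEATURES = ["age", "trestbps", "chol", "thalach", "oldpeak", "slope"]
CATEGORICAL_FEATURES = ["sex", "cp", "fbs", "restecg", "exang", "ca"]
session = ort.InferenceSession("unified_model.onnx")
def predict(record: dict) -> float:
    # each feature becomes a (1, 1) array of the dtype declared at conversion time
    inputs = {f: np.array([[record[f]]], dtype=np.float32) for f in NUMERICAL_FEATURES}
    inputs.update({f: np.array([[record[f]]], dtype=np.int32) for f in CATEGORICAL_FEATURES})
    # the unified graph returns the sigmoid probability from the torch model
    return float(session.run(None, inputs)[0][0][0])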
Conclusion
In this post we looked at a solution to the common problem and design pattern of using your model training framework for inference and production purposes as well. More specifically, we looked at scikit-learn feature engineering and preprocessing pipelines, and how we can have a more optimized deployment pattern when we combine them with a PyTorch model.
We built the preprocessing pipeline and saved it to an ONNX graph, and did the same for the model. We then unified these two graphs into a single ONNX graph to be used for deployment or production purposes. We covered some advantages of this approach, along with a big gain in execution speed. One of my favourite visuals was the graph we created, showing how everything fits together and making every computation part of the graph.
Published via Towards AI