Time Series Forecasting using PyCaret

Last Updated on January 29, 2024 by Editorial Team

Author(s): Rakesh M K

Originally published on Towards AI.

Source: pycaret — Search Images (bing.com)

PyCaret.

PyCaret is a low-code Python library for Machine Learning that automates workflows and offers efficient end-to-end experiments. Because it automates the major steps of evaluating and comparing models, it becomes easier to examine the performance of different models and select the best one. In this article, let’s see how to approach a time series forecasting problem on the familiar airline data with PyCaret.

Install and import PyCaret.

# Run in a terminal (prefix with ! inside a notebook cell)
pip install pycaret

# Then, in Python
import pycaret

Prepare data.

Datasets bundled with the PyCaret library can be loaded with pycaret.datasets.get_data(). Otherwise, load your own file with pandas as usual.

'''load Airline data'''

import pandas as pd
data = pd.read_excel('Airlines_data.xlsx')
data.set_index('Month',inplace=True)
# To load the bundled dataset from the PyCaret library instead:
# from pycaret.datasets import get_data
# data = get_data('airline')
data.head()
Airlines data

Setup Environment.

The setup() function initializes the environment for training and creates the transformation pipeline. It is important to call setup() before executing any other function. For a Time Series experiment, the following parameters are typically passed to the setup function.

fh: forecasting horizon

fold: number of folds for cross validation

session_id: random seed for experiment
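To make fh and fold more concrete, here is a minimal sketch of how an expanding-window cross-validation could lay out its splits. The function name and the series length of 144 are assumptions made for illustration, and this is not PyCaret's own splitter. As far as I understand, PyCaret also reserves the last fh points as a hold-out test set before cross-validating, which this sketch ignores.

```python
def expanding_window_splits(n_obs, fh, fold):
    """Return (train_indices, test_indices) pairs for expanding-window CV.

    Assumption: each fold tests on `fh` consecutive points, and the
    test windows sit back to back at the end of the series.
    """
    if fh * fold >= n_obs:
        raise ValueError("series too short for the requested fh and fold")
    splits = []
    for k in range(fold):
        # The first fold has the shortest training set; each later fold
        # grows the training set by fh observations.
        train_end = n_obs - fh * (fold - k)
        train = list(range(0, train_end))
        test = list(range(train_end, train_end + fh))
        splits.append((train, test))
    return splits


# Hypothetical series length; fh and fold match the values used in this article.
splits = expanding_window_splits(n_obs=144, fh=3, fold=5)
for train, test in splits:
    print(f"train: 0..{train[-1]}  test: {test}")
```

Each fold forecasts fh steps beyond the end of its training window, so the model is never scored on points it has already seen.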

'''Import Time Series libraries'''
from pycaret.time_series import *

'''Initialize the training environment'''
s = setup(data, fh = 3, fold = 5, session_id = 101)

The summary table that setup() prints shows how PyCaret analyzes the time series data, especially its seasonality. Have a look at the entries Seasonality Present and Significant Seasonal Period(s) in that table.
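For intuition about what a significant seasonal period means, the sketch below scores a few candidate periods by the sample autocorrelation at that lag. It uses a synthetic sine wave rather than the airline data, and it is only an illustration: I am not claiming this is the test PyCaret runs internally. A real series with a trend would usually need differencing first.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given positive lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return cov / var


# Synthetic monthly series: a sine wave that repeats every 12 steps (10 years).
series = [10 + 5 * math.sin(2 * math.pi * t / 12) for t in range(120)]

candidate_lags = [3, 6, 12]
scores = {lag: autocorrelation(series, lag) for lag in candidate_lags}
best_lag = max(scores, key=scores.get)
print(scores)
print("strongest candidate period:", best_lag)
```

The lag whose autocorrelation is highest is the most plausible seasonal period among the candidates.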

Choose API and Compare Models.

PyCaret has two APIs: functional and OOP. The functional API follows a procedural approach and offers a straightforward interface. Since the settings are applied globally across functions, this API is convenient for quick experiments. The OOP API follows an object-oriented approach, providing a customizable and modular workflow, and is hence more suitable for running multiple experiments independently. In this experiment, I am using the functional API (the code for the OOP API is commented out).

The compare_models() function trains the available models, prints a table of their performance metrics, and returns the best one. It can be observed from the table that Exponential Smoothing outperforms all other models.

'''Compare models using Functional API'''
best = compare_models()

# '''OOP API'''
# best = s.compare_models()
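Conceptually, comparing models means scoring several candidates on held-out data and ranking them by a metric. The sketch below does this for three simple baselines on made-up numbers with a single train/test split. The function names and the toy series are assumptions for illustration; PyCaret's comparison covers many more models and averages its metrics over the cross-validation folds.

```python
def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def naive_forecast(train, fh):
    """Repeat the last observed value."""
    return [train[-1]] * fh


def seasonal_naive_forecast(train, fh, period=4):
    """Repeat the value observed one season earlier."""
    return [train[len(train) - period + (h % period)] for h in range(fh)]


def mean_forecast(train, fh):
    """Predict the historical mean."""
    return [sum(train) / len(train)] * fh


# Toy series with a repeating pattern of length 4 (made-up numbers).
series = [10, 20, 30, 20] * 5
fh = 4
train, test = series[:-fh], series[-fh:]

candidates = {
    "naive": naive_forecast,
    "seasonal_naive": seasonal_naive_forecast,
    "mean": mean_forecast,
}
# Score every candidate on the held-out points and sort by error.
leaderboard = sorted(
    ((name, mean_absolute_error(test, model(train, fh))) for name, model in candidates.items()),
    key=lambda row: row[1],
)
for name, mae in leaderboard:
    print(f"{name:15s} MAE = {mae:.2f}")
best_name = leaderboard[0][0]
```

On this strictly periodic toy series the seasonal baseline reproduces the test window exactly, so it ranks first.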

Plot the Forecast and Diagnose Residuals.

Now let’s plot the forecast using the plot_model() function. In the data_kwargs dictionary, we can also pass keywords for the plot, such as title, legend, opacity, height, width, etc., as required.

'''Plot forecasting using functional API'''
plot_model(best, plot = 'forecast', data_kwargs = {'fh' : 24})

# OOP API
# s.plot_model(best, plot = 'forecast', data_kwargs = {'fh' : 24})

The model’s prediction on training data (in sample) can be plotted as below.

'''Insample plot using functional API'''
plot_model(best, plot = 'insample')

# OOP API
# s.plot_model(best, plot = 'insample')

To diagnose the residuals, we can use the same plot_model() function with the keyword argument plot = 'diagnostics', as below.

'''Diagnose best model using functional API'''
plot_model(best, plot = 'diagnostics')

# OOP API
# s.plot_model(best, plot = 'diagnostics')
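When reading a diagnostics plot, two things worth checking are whether the residuals are centred on zero and whether they are correlated with their own past values. Here is a minimal sketch of both checks on made-up numbers; the function name is hypothetical and this is not what plot_model() computes internally.

```python
def residual_summary(actual, fitted):
    """Return the mean and lag-1 autocorrelation of the residuals."""
    residuals = [a - f for a, f in zip(actual, fitted)]
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    variance = sum(c * c for c in centered)
    # Guard against the case where every residual is identical.
    lag1 = (
        sum(centered[t] * centered[t - 1] for t in range(1, n)) / variance
        if variance > 0
        else 0.0
    )
    return {"mean": mean, "lag1_autocorr": lag1}


# Made-up numbers: the fitted values lag behind a rising series,
# so every residual is positive (a sign of bias).
actual = [100, 104, 109, 115, 122, 130]
fitted = [98, 101, 105, 110, 116, 123]
summary = residual_summary(actual, fitted)
print(summary)
```

A mean far from zero suggests a biased model, and a large lag-1 autocorrelation suggests structure the model has not captured.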

Forecast using Best Model.

Since we have Exponential Smoothing as our best model, let’s forecast and plot for a horizon of 24.

'''Forecast using functional API'''
forecast = predict_model(best, fh = 24)

# OOP API
# forecast = s.predict_model(best, fh = 24)

'''Plot forecast'''
import matplotlib.pyplot as plt
plt.plot(forecast.index.strftime('%Y-%m'), forecast['y_pred'].astype(float),label = 'forecast')
plt.legend(loc=[1,1])
plt.xticks(rotation=45);
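For intuition about what an exponential smoothing forecaster does, here is a minimal sketch of the simplest variant, which tracks only a level. The function name and the numbers are made up for illustration. The Exponential Smoothing model selected by PyCaret is a richer implementation that may also model trend and seasonality, so this is not a reproduction of it.

```python
def simple_exponential_smoothing(series, alpha, fh):
    """Forecast `fh` steps ahead with simple exponential smoothing.

    level_t = alpha * y_t + (1 - alpha) * level_{t-1}
    The forecast is flat: every future step equals the last level.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # initialise the level with the first observation
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return [level] * fh


# Made-up observations, not the airline data.
history = [10.0, 12.0, 14.0]
ses_forecast = simple_exponential_smoothing(history, alpha=0.5, fh=3)
print(ses_forecast)
```

A larger alpha makes the level react faster to recent observations, while a smaller alpha smooths more heavily.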

Save and Load the Best Model.

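Before saving, finalize_model() refits the chosen model on the full dataset, including the hold-out period. The save_model() function then writes the fitted pipeline to disk as a pickle file (here bestModel.pkl) and returns the saved object along with the file name.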

'''Save model using functional API'''

finalModel = finalize_model(best)
save_model(finalModel, 'bestModel')

# OOP API
# s.save_model(finalModel, 'bestModel')

The saved model can be loaded using the load_model() function as follows.

'''Load saved model using functional API'''
loaded_model = load_model('bestModel')
# OOP API
# loaded_model = s.load_model('bestModel')
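The general save-and-restore pattern behind a .pkl file can be sketched with the standard library alone. The fitted_state dictionary and the predict helper below are hypothetical stand-ins for a fitted model; this shows the serialisation round trip only and not PyCaret's own implementation.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a fitted model: just its learned parameters.
fitted_state = {"name": "mean_forecaster", "mean": 20.0}


def predict(state, fh):
    """Forecast fh steps ahead from the stored parameters."""
    return [state["mean"]] * fh


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "bestModel.pkl")
    # Serialise the fitted state to disk.
    with open(path, "wb") as f:
        pickle.dump(fitted_state, f)
    # Restore it, as a later session would.
    with open(path, "rb") as f:
        restored = pickle.load(f)

restored_forecast = predict(restored, fh=2)
print(restored_forecast)
```

Pickle files can execute code when loaded, so only load model files that come from a source you trust.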

Summary.

We have seen in this article that PyCaret simplifies tasks such as data preprocessing, model selection, hyperparameter tuning, and model evaluation, streamlining the workflow for users. The AutoML capability significantly reduces the coding effort required for developing robust machine learning models, making it an efficient and user-friendly tool for both beginners and experienced data scientists. PyCaret can also be used for other machine learning tasks such as regression, classification, clustering, and anomaly detection.

