

LLM-Mixer: Multiscale Mixing in LLMs for Time Series Forecasting

Last Updated on November 3, 2024 by Editorial Team

Author(s): Reza Yazdanfar

Originally published on Towards AI.

Time Series

Time series data are crucial in many fields. In finance, for example, accurate time series forecasting helps with stock market prediction and risk management, enabling better investment decisions!

Time series forecasting, in plain words, means using past data to predict future values; something we humans, intelligent creatures that we are, do intuitively all the time (don’t believe me? Go talk to any financial analyst or trader…).

Time series forecasting is challenging because of the inherent complexity of real-world data, which often exhibits nonlinear, non-stationary, and multivariate characteristics. Moreover, time series data features multiple time scales, with short-term fluctuations and long-term trends that traditional models struggle to capture simultaneously.

LLMs in time series

OpenAI showed the power of LLMs by introducing GPT-3.5 and later GPT-4, and ever since, everything has changed. Most researchers realized the power of scaling, and since then, research in LLMs, and in AI in general, has multiplied.

One line of work applies the power of LLMs to other domains of AI, such as time series forecasting. LLMs are particularly attractive for time series forecasting due to their abilities in few-shot or zero-shot transfer learning, multimodal knowledge integration, and complex reasoning.

Time series data often represent continuous, irregular patterns, unlike the discrete tokens that LLMs are typically designed to process. Not to mention that time series usually come with multiple time scales, ranging from short-term fluctuations to long-term trends.

This level of complexity is quite enticing, in my humble opinion, and, well, challenging as a consequence! 🔥😄

Problem in short

Accurately forecasting time series data, which often involves complex multiscale temporal patterns.

Solution in short: LLM-Mixer

LLM-Mixer adapts LLMs for time series forecasting by decomposing the data into multiple temporal resolutions. This approach allows the LLM to better understand and model the complex patterns within the time series data, capturing both short-term and long-term dependencies effectively. Through multiscale time-series decomposition combined with LLMs, LLM-Mixer achieves competitive performance and improves forecasting accuracy across various datasets and forecasting horizons.

Figure 1: The LLM-Mixer framework for time series forecasting. Time series data is downsampled to multiple scales and enriched with embeddings. These multiscale representations are processed by the Past-Decomposable-Mixing (PDM) module and then input into a pre-trained LLM, which, guided by a textual description, generates the forecast.

LLM-Mixer in detail: Architecture

1) Data Downsampling and Embedding:

We start by downsampling the time series data into multiple temporal resolutions to capture both short-term fluctuations and long-term trends.
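To make this concrete, here is a minimal PyTorch sketch of the downsampling step. The number of scales and the choice of average pooling are my illustrative assumptions, not necessarily the paper's exact operator:

```python
import torch
import torch.nn.functional as F

def downsample_multiscale(x: torch.Tensor, num_scales: int = 3):
    """Build multiscale views of a time series.

    x: (batch, seq_len, num_vars). Scale 0 is the original resolution;
    each further scale halves seq_len via average pooling
    (an illustrative choice of downsampling operator).
    """
    views = [x]
    xt = x.transpose(1, 2)  # (batch, num_vars, seq_len), as pooling expects
    for _ in range(num_scales - 1):
        xt = F.avg_pool1d(xt, kernel_size=2, stride=2)
        views.append(xt.transpose(1, 2))
    return views

# Example: 96 past steps of 7 variables, batch of 4
views = downsample_multiscale(torch.randn(4, 96, 7))
print([tuple(v.shape) for v in views])  # [(4, 96, 7), (4, 48, 7), (4, 24, 7)]
```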

These multiscale series are then enriched with three types of embeddings: token, temporal, and positional embeddings.

2) Token, Temporal, and Positional Embeddings:

We compute token embeddings with 1D convolutions; temporal embeddings encode calendar information such as day, week, and month; and positional embeddings encode the sequence positions. Together, these embeddings transform the multiscale time series into deep feature representations.
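Here is a rough sketch of how the three embeddings could be combined for one scale, assuming a Conv1d token embedding and learned temporal/positional tables; the dimensions and the specific calendar features are illustrative, not the paper's exact choices:

```python
import torch
import torch.nn as nn

class MultiEmbedding(nn.Module):
    """Token + temporal + positional embeddings for one scale.

    d_model and the calendar features (day-of-week, day-of-month,
    month) are illustrative assumptions.
    """
    def __init__(self, num_vars, d_model, max_len=512):
        super().__init__()
        # Token embedding: 1D convolution over the time axis
        self.token = nn.Conv1d(num_vars, d_model, kernel_size=3, padding=1)
        # Temporal embeddings for calendar features
        self.day_of_week = nn.Embedding(7, d_model)
        self.day_of_month = nn.Embedding(31, d_model)
        self.month = nn.Embedding(12, d_model)
        # Learned positional embedding
        self.pos = nn.Embedding(max_len, d_model)

    def forward(self, x, dow, dom, month):
        # x: (batch, seq_len, num_vars); calendar tensors: (batch, seq_len)
        tok = self.token(x.transpose(1, 2)).transpose(1, 2)  # (B, L, d)
        tmp = self.day_of_week(dow) + self.day_of_month(dom) + self.month(month)
        positions = torch.arange(x.size(1), device=x.device)
        return tok + tmp + self.pos(positions)  # broadcast over batch

emb = MultiEmbedding(num_vars=7, d_model=64)
x = torch.randn(4, 96, 7)
dow = torch.randint(0, 7, (4, 96))
dom = torch.randint(0, 31, (4, 96))
mon = torch.randint(0, 12, (4, 96))
print(emb(x, dow, dom, mon).shape)  # torch.Size([4, 96, 64])
```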

3) Past-Decomposable-Mixing (PDM) Module:

The multiscale representations are processed by the PDM module, which mixes past information across different scales. The PDM module breaks down complex time series data into separate seasonal and trend components, allowing for targeted processing of each component.
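As a hedged sketch of the decomposition idea inside PDM, here is the moving-average trend/seasonal split commonly used in this line of work (e.g., Autoformer, TimeMixer); the kernel size is illustrative, and the cross-scale mixing layers are only hinted at in a comment:

```python
import torch
import torch.nn.functional as F

def decompose(x: torch.Tensor, kernel_size: int = 25):
    """Split a series into trend (moving average) and seasonal (residual).

    x: (batch, seq_len, num_vars). kernel_size is an illustrative choice.
    """
    xt = x.transpose(1, 2)                       # (B, C, L)
    pad = (kernel_size - 1) // 2
    # Replicate-pad the ends so the moving average keeps seq_len
    xt = F.pad(xt, (pad, kernel_size - 1 - pad), mode="replicate")
    trend = F.avg_pool1d(xt, kernel_size=kernel_size, stride=1).transpose(1, 2)
    seasonal = x - trend
    return seasonal, trend

# In PDM, each scale's seasonal and trend parts are then mixed across
# scales (e.g., fine-to-coarse for seasonal detail, coarse-to-fine for trend).
s, t = decompose(torch.randn(4, 96, 7))
print(s.shape, t.shape)  # torch.Size([4, 96, 7]) torch.Size([4, 96, 7])
```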

4) Pre-trained Large Language Model (LLM) Processing:

The processed multiscale data, along with a textual prompt that provides task-specific information, is input into a frozen pre-trained LLM. The frozen LLM utilizes its semantic knowledge and the multiscale information to generate the forecast.
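Here is one way the frozen-backbone step could look, using GPT-2 from Hugging Face as a stand-in for the pre-trained LLM; the prompt text, the linear projection into the LLM's embedding space, and the feature dimensions are my assumptions for illustration:

```python
import torch
import torch.nn as nn
from transformers import GPT2Model, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
llm = GPT2Model.from_pretrained("gpt2")
for p in llm.parameters():          # freeze the backbone
    p.requires_grad = False

d_llm = llm.config.n_embd           # 768 for GPT-2
project = nn.Linear(64, d_llm)      # map time-series features into LLM space

# Illustrative task prompt (the paper uses a task-specific description)
prompt = "Forecast the next steps of this multivariate time series."
prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
prompt_emb = llm.wte(prompt_ids)                      # (1, P, d_llm)

feats = torch.randn(4, 96, 64)                        # multiscale features
inputs = torch.cat([prompt_emb.expand(4, -1, -1),
                    project(feats)], dim=1)           # (4, P+96, d_llm)
hidden = llm(inputs_embeds=inputs).last_hidden_state
print(hidden.shape)
```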

5) Forecast Generation:

Finally, a trainable decoder, which is a simple linear transformation, is applied to the last hidden layer of the LLM to predict the next set of future time steps. This step produces the final forecasts, completing the LLM-Mixer pipeline.
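Continuing the sketch above, the decoder can be a single trainable linear layer applied to the LLM's last hidden states; flattening the hidden states before the projection is one common scheme, used here as an illustrative assumption:

```python
import torch
import torch.nn as nn

batch, seq_len, d_llm = 4, 96, 768
horizon, num_vars = 24, 7

# Trainable head: project the flattened hidden states of the series
# tokens onto the forecast horizon for each variable.
decoder = nn.Linear(seq_len * d_llm, horizon * num_vars)

hidden = torch.randn(batch, seq_len, d_llm)   # LLM last hidden states
forecast = decoder(hidden.reshape(batch, -1)).reshape(batch, horizon, num_vars)
print(forecast.shape)  # torch.Size([4, 24, 7])
```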

Datasets

  • For long-term forecasting, the datasets include the ETT datasets (ETTh1, ETTh2, ETTm1, ETTm2), as well as the Weather, Electricity, and Traffic datasets.
  • For short-term forecasting tasks, the framework uses the PeMS dataset, which consists of four public traffic network datasets (PEMS03, PEMS04, PEMS07, PEMS08), with time series data collected at various frequencies.

These details are not that important for the aim of this article, since it’s standard for every ML paper to evaluate on the same benchmarks over and over again.

Results

In short, LLM-Mixer is great, lol; if you don’t think so, read this section 🙂

Table 1: Long-term multivariate forecasting results in terms of MSE and MAE, the lower the better. Red: the best, Blue: the second best.

For long-term multivariate forecasting, LLM-Mixer shows competitive performance, particularly excelling on the ETTh1, ETTh2, and Electricity datasets. It consistently achieves low Mean Squared Error (MSE) and Mean Absolute Error (MAE) values over multiple forecasting horizons (96, 192, 384, and 720 time steps), outperforming models like TIME-LLM, TimeMixer, and PatchTST.
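For reference, the two metrics have the standard definitions:

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean Squared Error: average of squared errors (lower is better)."""
    return np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)

def mae(y_true, y_pred):
    """Mean Absolute Error: average of absolute errors (lower is better)."""
    return np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred)))
```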

Table 2: Short-term multivariate forecasting results.

For short-term multivariate forecasting, LLM-Mixer again exhibits strong performance, delivering low MSE and MAE values consistently across the PEMS datasets. It achieves competitive accuracy on datasets like PEMS03, PEMS04, and PEMS07, outperforming other models including TIME-LLM, TimeMixer, and PatchTST. On the PEMS08 dataset, LLM-Mixer delivers superior results compared to iTransformer and DLinear, emphasizing its effectiveness in capturing essential temporal dynamics for short-horizon forecasting tasks.

Table 3: Long-term univariate forecasting results.

Finally, for univariate long-term forecasting, LLM-Mixer achieves the lowest MSE and MAE values across datasets on the ETT benchmark, consistently outperforming methods such as Linear, NLinear, and FEDformer.

Thanks for reading this article; feel free to follow me on X or LinkedIn. I used Nouswise.com to write this article; you can find the original paper behind this article, and millions of other papers, on it. (Caution: you did not just read an AI-generated text 🔥😂)


Published via Towards AI
