How Does Polars .rolling Scale With The Number of Columns?
Last Updated on December 11, 2023 by Editorial Team
Author(s): Yousef Nami
Originally published on Towards AI.
A prelude to calculating Variograms using Polars
Photo by Yiorgos from Unsplash.
For some time, I've been reading about variograms [1]. A variogram is a visualization tool used in geostatistics to see how a specific quantity varies with space. It can act as a really good diagnostic tool, helping answer the following questions:
Is there some distance d from a point x_i where we no longer gain any informational value from x_i?
Is there cyclicity in the measurement as a function of distance?
I've been curious to apply this theory to time series data, particularly because, compared with time-series-specific methods such as autocorrelation [2], a variogram remains valid for missing or unevenly spaced data (a characteristic of real time series data) and can be extended to higher dimensions [3, 4].
The issue with variograms is that they are computationally expensive. However, I've recently been playing around with Polars and thought that the rolling [5] method and/or expression lends itself nicely to the variogram algorithm. The tricky bit is that variograms scale with the number of lags, so I wanted to quickly see whether there is a significant performance decrease when using Expr.rolling [6] for a large number of columns.
The algorithm for a variogram is relatively simple [1]:
Where h is the…
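As a rough illustration, the sketch below computes an empirical semivariogram in plain Python. It assumes the classical (Matheron) estimator, gamma(h) = sum of squared differences over all valid pairs separated by h, divided by 2 N(h), and it treats h as an integer lag counted in samples. The function name `empirical_semivariogram` and the toy series are hypothetical, and this is not necessarily the formulation used in the full post.

```python
def empirical_semivariogram(values, max_lag):
    """Return {h: gamma(h)} for h = 1..max_lag.

    Assumes the classical estimator:
        gamma(h) = sum((z[i + h] - z[i]) ** 2) / (2 * N(h))
    where N(h) is the number of pairs with both values present.
    Missing observations are given as None and their pairs are skipped.
    """
    gamma = {}
    for h in range(1, max_lag + 1):
        sq_diffs = [
            (values[i + h] - values[i]) ** 2
            for i in range(len(values) - h)
            if values[i] is not None and values[i + h] is not None
        ]
        # No valid pairs at this lag: report None rather than divide by zero.
        gamma[h] = sum(sq_diffs) / (2 * len(sq_diffs)) if sq_diffs else None
    return gamma


# Toy series with one missing observation.
series = [1.0, 2.0, None, 4.0, 5.0, 6.0]
result = empirical_semivariogram(series, max_lag=2)
print(result)  # {1: 0.5, 2: 2.0}
```

Because pairs containing a missing value are simply skipped, the estimate stays defined for gappy data, which is the property that makes the variogram attractive compared with autocorrelation. The nested loop also shows why the cost grows with the number of lags.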