Polars Just Got Even Faster
Last Updated on November 3, 2024 by Editorial Team
Author(s): Lazar Gugleta
Originally published on Towards AI.
Recently, I wrote about Polars vs Pandas and their advantages in the current Data Science industry, but a few days ago, we got some even better news.
Unexpectedly, Nvidia announced a Polars GPU engine powered by RAPIDS cuDF, which speeds up Polars workflows by up to 13x compared to running them on the CPU.
Let's break down how they did it and what changed to achieve such a massive jump in performance.
We always strive for faster and better tools. Nvidia delivered this one through cuDF, a library built on the Apache Arrow columnar format and on libcudf, a blazing-fast C++/CUDA dataframe library, to provide a GPU-accelerated API. (cuDF also has a Pandas interface.)
This partnership between Polars and Nvidia brings GPU acceleration to all existing users, who can keep using the familiar API. The optimization and automation done in this library ensure smooth pipeline runs.
The figure below shows the four best speedups across a set of 22 queries from the PDS benchmark. The Polars GPU engine powered by RAPIDS cuDF offers up to 13x speedup compared to the CPU on queries with many complex groupby and join operations.
Link to image - Author:…
Published via Towards AI