Benchmarking Volga’s On-Demand Compute Layer for Feature Serving: Latency, RPS, and Scalability on EKS
Author(s): Andrey Novitskiy. Originally published on Towards AI. [Figure: End-to-end request latencies and storage read latencies during a load test with maximum worker configuration.] TL;DR: Real-time machine learning systems require not only efficient models but also robust infrastructure capable of low-latency feature serving …
Volga — On-Demand Compute in Real-Time AI/ML — Overview and Architecture
Author(s): Andrey Novitskiy. Originally published on Towards AI. TL;DR: Volga is a real-time data processing and feature calculation engine tailored for modern AI/ML. It is designed to support various types of features, including streaming (online), batch (offline), and on-demand features, via a hybrid push+pull …