
Premium SSD vs Ultra SSD: Azure Storage Performance for Distributed Databases
Last Updated on March 4, 2025 by Editorial Team
Author(s): Richie Bachala
Originally published on Towards AI.
When building distributed systems in the cloud, storage performance can make or break your application's success. In this post, we'll explore how different Azure disk types perform under distributed database workloads, using YugabyteDB as our distributed database. We'll dive deep into benchmarking methodologies and reveal practical insights about Azure storage performance characteristics.
The Azure Storage Landscape
Azure offers several managed disk types, each designed for different workloads and performance requirements. We'll focus on three key offerings:
- Premium SSD: The traditional performance-tier offering, providing consistent performance with burstable IOPS
- Premium SSD v2: A newer generation offering higher performance and more flexible scaling
- Ultra SSD: Azure's highest-performance offering with configurable IOPS and throughput
Each of these options presents different performance characteristics and price points, making the choice non-trivial for database workloads.
Understanding Distributed Database Workloads
Before diving into performance numbers, it's essential to understand what makes distributed database workloads unique. Unlike traditional single-node databases, distributed databases like YugabyteDB handle data differently:
1. Write Operations:
- Require consensus across multiple nodes
- Need to maintain consistency across replicas
- Often involve both WAL (Write-Ahead Log) and data file writes
2. Read Operations:
- May contact multiple nodes depending on consistency requirements
- Utilize caching at various levels
- Can be affected by data locality
These characteristics mean that storage performance impacts database operations in complex ways, often not directly proportional to raw disk performance metrics.
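To make that concrete, here is a rough back-of-the-envelope sketch of how one logical write fans out into physical disk writes in a replicated database. The factors below (replication factor 3, one WAL append plus one data-file write per replica) are illustrative assumptions, not YugabyteDB's exact write path:

```python
def estimated_disk_writes(logical_writes: int, replication_factor: int = 3) -> int:
    """Rough estimate: each logical write reaches every replica, and each
    replica performs a WAL append plus (eventually) a data-file write.
    Illustrative arithmetic only, not an exact storage-engine model."""
    writes_per_replica = 2  # 1 WAL append + 1 data-file write (assumed)
    return logical_writes * replication_factor * writes_per_replica

# 1,000 logical writes can translate into ~6,000 physical disk writes:
print(estimated_disk_writes(1000))
```

This amplification is one reason raw per-disk IOPS numbers do not map one-to-one onto database throughput.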
Benchmarking Methodology
To thoroughly evaluate storage performance, we need a comprehensive testing approach. We employed two industry-standard benchmarking tools:
TPC-C Benchmark
TPC-C is a database benchmark that simulates a complete order-processing environment. It's valuable because:
- Models real-world business operations
- Generates mixed read-write workloads
- Tests multiple transaction types with varying complexity
- Provides insights into real-world performance expectations
Our implementation focuses on the following transactions:
- New Order: Complex write-heavy transaction
- Payment: Mixed read-write transaction
- Order Status: Read-only transaction
- Delivery: Write-heavy batch transaction
- Stock Level: Read-heavy transaction
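The weighting of these five transaction types follows the standard TPC-C mix, in which New Order and Payment dominate. A minimal sketch, assuming the specification's minimum mix percentages:

```python
# Standard TPC-C transaction mix (the specification's minimum percentages).
tpcc_mix = {
    "New Order": 45,      # complex write-heavy
    "Payment": 43,        # mixed read-write
    "Order Status": 4,    # read-only
    "Delivery": 4,        # write-heavy batch
    "Stock Level": 4,     # read-heavy
}

assert sum(tpcc_mix.values()) == 100
print(tpcc_mix)
```

Because New Order and Payment make up almost 90% of the mix, their latencies dominate the benchmark result.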
Each of these transactions is a set of queries executed to carry out the business use case. For example, the following queries are issued for a New Order transaction:
- Get records describing a warehouse, customer, & district
- Update the district
- Increment next available order number
- Insert record into Order and New-Order tables
- For 5β15 items, get Item record, get/update Stock record
- Insert Order-Line Record
For TPC-C, we focus primarily on New Order latencies, since the rate of New Order transactions is the benchmark's primary throughput metric (tpmC). So if the New Order latency is 50 ms, it took 50 ms to execute all of the queries listed above.
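Tallying the steps above gives a feel for how much work a single New Order transaction does. The per-step query counts here are my reading of the list (e.g. three reads for warehouse, customer, and district), not an exact trace:

```python
def new_order_query_count(items: int) -> int:
    """Rough query count for one New Order transaction (illustrative)."""
    fixed = 3 + 1 + 1 + 2    # warehouse/customer/district reads, district update,
                             # order-number increment, Order + New-Order inserts
    per_item = 1 + 2 + 1     # Item read, Stock read + update, Order-Line insert
    return fixed + items * per_item

# An order has 5-15 line items, so one transaction runs roughly 27-67 queries:
print(new_order_query_count(5), new_order_query_count(15))
```

A 50 ms New Order latency is therefore spread across dozens of individual queries, many of which touch storage.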
Sysbench
Sysbench is a micro-benchmarking tool. It creates a set of identical tables, and the workload is uniformly distributed across all keys of all the tables. The two workloads we use most are:
oltp_read_only: each transaction issues 10 SELECTs against random tables and random keys. So if the transaction latency is, say, 10 ms, each SELECT takes about 1 ms on average; and if the throughput is 100 transactions/second, the workload performs 1,000 SELECTs per second.
oltp_multi_insert: each transaction issues 10 INSERTs against random tables and random keys. So if the transaction latency is, say, 50 ms, each INSERT takes about 5 ms on average; and if the throughput is 100 transactions/second, the workload performs 1,000 INSERTs per second.
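The per-operation arithmetic above can be captured in two one-line helpers (a simplification that assumes the 10 operations in a transaction run sequentially and uniformly):

```python
def per_op_latency_ms(txn_latency_ms: float, ops_per_txn: int = 10) -> float:
    """Average per-operation latency, assuming ops run sequentially."""
    return txn_latency_ms / ops_per_txn

def ops_per_second(txn_throughput: float, ops_per_txn: int = 10) -> float:
    """Individual operations per second implied by transaction throughput."""
    return txn_throughput * ops_per_txn

# oltp_read_only example from the text: 10 ms txn -> 1 ms per SELECT,
# 100 txn/s -> 1,000 SELECTs/s
print(per_op_latency_ms(10), ops_per_second(100))
# oltp_multi_insert example: 50 ms txn -> 5 ms per INSERT
print(per_op_latency_ms(50))
```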
While TPC-C provides a high-level view, Sysbench allows us to examine specific performance characteristics:
- Enables focused testing of individual operation types
- Provides precise control over workload parameters
- Helps isolate storage performance impacts
- Allows scaling tests with different table counts and sizes
We configured Sysbench tests to examine:
- Point selects (read performance)
- Insert operations (write performance)
- Different data set sizes (20 and 30 tables)
- TPC-C git repo: https://github.com/yugabyte/tpcc/releases/tag/2.0
- Sysbench git repo: https://github.com/yugabyte/sysbench/
Azure Disk Performance Comparison Tables
Test Environment Configuration
Cluster Configuration
Benchmark Results
Benchmark Configuration Details
Key Findings and Recommendations
Based on our comprehensive testing, we can make several recommendations:
For Read-Heavy Workloads
Premium SSD v2 provides the best balance of performance and cost. The performance gap between Premium SSD v2 and Ultra SSD is minimal for read operations, making Ultra SSD harder to justify purely for read performance.
For Write-Heavy Workloads
Ultra SSD shows its value in write-intensive scenarios, particularly with larger datasets. The consistent performance and lower latencies can justify the higher cost for write-critical applications.
For Mixed Workloads
Premium SSD v2 emerges as the most cost-effective option for most mixed workloads. The performance improvements over Premium SSD are significant, while the cost remains lower than Ultra SSD.
Conclusion
Our testing reveals that Azure disk performance isnβt simply about raw IOPS and throughput numbers. The interaction between storage and distributed database workloads is complex, with CPU often becoming the limiting factor before storage performance is fully utilized.
- If the workload requires low latency or high throughput, Ultra SSD is the best choice. If the workload has no specific latency or throughput requirements, Premium SSD v2 is a good choice.
- Ultra SSD delivers the lowest latency and the highest throughput of the three disk types, but it is also the most expensive. Premium SSD v2 is a good choice if you need high throughput on a budget, and Premium SSD is adequate when there are no specific latency or throughput requirements.
For most distributed database deployments, Premium SSD v2 provides the sweet spot of performance and cost.
Ultra SSD becomes compelling primarily for:
- Write-heavy workloads with strict latency requirements
- Large datasets with unpredictable access patterns
- Mission-critical applications requiring consistent performance
When selecting Azure disk types for your distributed database, consider:
- Your workload characteristics (read/write ratio)
- Dataset size and growth expectations
- Performance requirements and budgetary constraints
- The actual bottlenecks in your current system
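The considerations above can be sketched as a small decision helper. The branching is my own simplification of this article's recommendations, not a formal sizing rule; validate any choice against your own benchmarks:

```python
def recommend_disk(write_heavy: bool, strict_latency: bool,
                   budget_constrained: bool) -> str:
    """Map workload traits to an Azure disk tier, per the recommendations
    above (illustrative simplification, not a formal sizing rule)."""
    if write_heavy and strict_latency and not budget_constrained:
        return "Ultra SSD"          # write-critical, latency-sensitive workloads
    if budget_constrained and not strict_latency:
        return "Premium SSD"        # fine when there are no strict perf needs
    return "Premium SSD v2"         # the cost/performance sweet spot for most

print(recommend_disk(write_heavy=True, strict_latency=True,
                     budget_constrained=False))
```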
Remember that storage performance is just one piece of the puzzle. A well-designed distributed database system needs to consider network topology, CPU resources, and memory configuration alongside storage performance for optimal results.
Thanks for reading
Published via Towards AI