Premium SSD vs Ultra SSD: Azure Storage Performance for Distributed Databases

Last Updated on March 4, 2025 by Editorial Team

Author(s): Richie Bachala

Originally published on Towards AI.

When building distributed systems in the cloud, storage performance can make or break your application’s success. In this post, we’ll explore how different Azure disk types perform under distributed database workloads, using YugabyteDB as our distributed database. We’ll dive deep into benchmarking methodologies and reveal practical insights about Azure storage performance characteristics.

The Azure Storage Landscape

Azure offers several managed disk types, each designed for different workloads and performance requirements. We’ll focus on three key offerings:

Premium SSD: The traditional performance-tier offering, providing consistent performance with burstable IOPS
Premium SSD v2: A newer generation offering higher performance and more flexible scaling
Ultra SSD: Azure’s highest-performance offering with configurable IOPS and throughput

Each of these options presents different performance characteristics and price points, making the choice non-trivial for database workloads.

Understanding Distributed Database Workloads

Before diving into performance numbers, it’s essential to understand what makes distributed database workloads unique. Unlike traditional single-node databases, distributed databases like YugabyteDB handle data differently:

1. Write Operations:

Require consensus across multiple nodes
Need to maintain consistency across replicas
Often involve both WAL (Write-Ahead Log) and data file writes

2. Read Operations:

May contact multiple nodes depending on consistency requirements
Utilize caching at various levels
Can be affected by data locality

These characteristics mean that storage performance impacts database operations in complex ways, often not directly proportional to raw disk performance metrics.
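To see why disk speed and write latency are not directly proportional, here is a minimal sketch of a majority-quorum commit. It assumes a simplified model in which a write commits as soon as a majority of replicas have persisted it (YugabyteDB replicates with Raft, which follows this general rule). The function name and the latency figures are hypothetical and chosen only for illustration.

```python
def quorum_commit_latency_ms(replica_ack_ms):
    """Return when a majority of replicas have acknowledged a write.

    replica_ack_ms holds, per replica, the time in ms to persist the WAL
    entry and acknowledge it (hypothetical inputs, network time included).
    """
    if not replica_ack_ms:
        raise ValueError("need at least one replica")
    majority = len(replica_ack_ms) // 2 + 1
    # The write commits once the majority-th fastest replica has acknowledged.
    return sorted(replica_ack_ms)[majority - 1]


# Replication factor 3 with one slow replica.
print(quorum_commit_latency_ms([1.0, 2.5, 9.0]))  # 2.5
# Halving the slowest replica's latency does not change the commit time.
print(quorum_commit_latency_ms([1.0, 2.5, 4.5]))  # 2.5
```

In this model the commit time tracks the median replica rather than the slowest one, so speeding up a single disk may not show up in write latency at all.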

Benchmarking Methodology

To thoroughly evaluate storage performance, we need a comprehensive testing approach. We employed two industry-standard benchmarking tools:

TPC-C Benchmark

TPC-C is a database benchmark that simulates a complete order-processing environment. It’s valuable because:

Models real-world business operations
Generates mixed read-write workloads
Tests multiple transaction types with varying complexity
Provides insights into real-world performance expectations

Our implementation focuses on the following transactions:

New Order: Complex write-heavy transaction
Payment: Mixed read-write transaction
Order Status: Read-only transaction
Delivery: Write-heavy batch transaction
Stock Level: Read-heavy transaction

Each of these transactions is a set of queries issued to carry out a business use case. For example, the New Order transaction runs the following queries:

Get records describing a warehouse, customer, & district
Update the district
Increment next available order number
Insert record into Order and New-Order tables
For 5–15 items, get Item record, get/update Stock record
Insert Order-Line Record

For TPC-C, we focus primarily on NewOrder latencies, because the number of NewOrder transactions completed defines the efficiency. So if the NewOrder latency is 50 ms, it took 50 ms to carry out all the queries listed above.
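As a rough illustration of what a single NewOrder latency covers, the sketch below counts the statements implied by the step list above. The counts are assumptions read off that list (for example, the district update and the order-number increment are treated as one statement), not figures taken from the benchmark's source code, and the function name is hypothetical.

```python
# Statement counts are assumptions derived from the step list,
# not taken from the benchmark implementation.
FIXED_STATEMENTS = (
    3    # read warehouse, customer, and district
    + 1  # update district (assumed to include the order-number increment)
    + 2  # insert into Order and New-Order
)
PER_ITEM_STATEMENTS = 4  # read Item, read Stock, update Stock, insert Order-Line


def new_order_statements(item_count):
    """Estimate how many statements one New Order transaction issues."""
    if not 5 <= item_count <= 15:
        raise ValueError("New Order uses 5 to 15 items")
    return FIXED_STATEMENTS + PER_ITEM_STATEMENTS * item_count


for items in (5, 15):
    count = new_order_statements(items)
    # Average time per statement if the whole transaction took 50 ms.
    print(items, count, round(50 / count, 2))
# prints "5 26 1.92" then "15 66 0.76"
```

Under these assumptions, a 50 ms NewOrder corresponds to roughly 1 to 2 ms per statement on average, depending on how many items the order contains.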

Sysbench

Sysbench is a micro-benchmarking workload. It creates a set of similar tables, and the load is uniformly distributed across all keys of all the tables. These are the two workloads we use most:

oltp_read_only: Each transaction runs 10 selects against random tables and random keys. So if the transaction latency is, say, 10 ms, each select takes about 1 ms. And if the throughput is 100 ops/second, the cluster is doing 1000 selects per second.

oltp_multi_insert: Each transaction runs 10 inserts against random tables and random keys. So if the transaction latency is, say, 50 ms, each insert takes about 5 ms. And if the throughput is 100 ops/second, the cluster is doing 1000 inserts per second.
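The conversion between transaction-level and statement-level numbers described above can be written out as a small sketch. It assumes the 10 statements in a transaction run one after another and cost about the same. The function names are hypothetical helpers, not part of Sysbench.

```python
def per_statement_latency_ms(txn_latency_ms, statements_per_txn=10):
    """Average latency of one statement inside a transaction."""
    return txn_latency_ms / statements_per_txn


def statements_per_second(txn_per_second, statements_per_txn=10):
    """Statement throughput implied by a transaction throughput."""
    return txn_per_second * statements_per_txn


print(per_statement_latency_ms(10))  # oltp_read_only example: 1.0 ms per select
print(per_statement_latency_ms(50))  # oltp_multi_insert example: 5.0 ms per insert
print(statements_per_second(100))    # 1000 statements per second
```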

While TPC-C provides a high-level view, Sysbench allows us to examine specific performance characteristics:

Enables focused testing of individual operation types
Provides precise control over workload parameters
Helps isolate storage performance impacts
Allows scaling tests with different table counts and sizes

We configured Sysbench tests to examine:

Point selects (read performance)
Insert operations (write performance)
Different data set sizes (20 and 30 tables)

● TPCC git repo: https://github.com/yugabyte/tpcc/releases/tag/2.0

● Sysbench git repo: https://github.com/yugabyte/sysbench/

Azure Disk Performance Comparison Tables

Test Environment Configuration

Cluster Configuration

Benchmark Results

Benchmark Configuration Details

Key Findings and Recommendations

Based on our comprehensive testing, we can make several recommendations:

For Read-Heavy Workloads

Premium SSD v2 provides the best balance of performance and cost. The performance gap between Premium SSD v2 and Ultra SSD is minimal for read operations, making Ultra SSD harder to justify purely for read performance.

For Write-Heavy Workloads

Ultra SSD shows its value in write-intensive scenarios, particularly with larger datasets. The consistent performance and lower latencies can justify the higher cost for write-critical applications.

For Mixed Workloads

Premium SSD v2 emerges as the most cost-effective option for most mixed workloads. The performance improvements over Premium SSD are significant, while the cost remains lower than Ultra SSD.
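The three recommendations above can be summarized as a simple lookup. This is only a sketch of the guidance in this section. The workload labels and the function name are hypothetical, and a real decision would also weigh dataset size, latency targets, and budget.

```python
# Mapping taken from the recommendations in this section.
RECOMMENDATIONS = {
    "read-heavy": "Premium SSD v2",
    "write-heavy": "Ultra SSD",
    "mixed": "Premium SSD v2",
}


def recommend_disk(workload):
    """Return the disk type suggested for a workload label."""
    try:
        return RECOMMENDATIONS[workload]
    except KeyError:
        raise ValueError(f"unknown workload type: {workload!r}") from None


print(recommend_disk("write-heavy"))  # Ultra SSD
```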

Conclusion

Our testing reveals that Azure disk performance isn’t simply about raw IOPS and throughput numbers. The interaction between storage and distributed database workloads is complex, with CPU often becoming the limiting factor before storage performance is fully utilized.
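One way to picture CPU becoming the limit is a toy bottleneck model in which throughput is capped by whichever resource runs out first. Every number below is hypothetical and chosen only to show the shape of the effect. None of them are measurements from our tests, and the function name is an illustrative assumption.

```python
def achievable_tps(cpu_limit_tps, disk_iops, ios_per_txn):
    """Return (transactions per second, limiting resource) for a toy model."""
    storage_limit_tps = disk_iops / ios_per_txn
    if storage_limit_tps < cpu_limit_tps:
        return storage_limit_tps, "storage"
    return float(cpu_limit_tps), "cpu"


# Hypothetical cluster: CPU tops out at 2000 TPS, each transaction needs 20 I/Os.
for iops in (5_000, 80_000, 160_000):
    print(iops, achievable_tps(cpu_limit_tps=2_000, disk_iops=iops, ios_per_txn=20))
# 5000 (250.0, 'storage')
# 80000 (2000.0, 'cpu')
# 160000 (2000.0, 'cpu')
```

In this model, once the disk can serve more I/O than the CPU can generate, provisioning additional IOPS no longer raises throughput.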

● If the workload requires low latency or high throughput, Ultra SSD is the best choice. If the workload has no specific latency or throughput requirements, Premium SSD v2 is a good choice.

● Ultra SSD has the lowest latency and the highest throughput of the three disk types, but it is also the most expensive. Premium SSD v2 is a good choice if you need high throughput and are on a budget. Premium SSD is a good choice if you do not have any specific latency or throughput requirements.

For most distributed database deployments, Premium SSD v2 provides the sweet spot of performance and cost.

Ultra SSD becomes compelling primarily for:

Write-heavy workloads with strict latency requirements
Large datasets with unpredictable access patterns
Mission-critical applications requiring consistent performance

When selecting Azure disk types for your distributed database, consider:

Your workload characteristics (read/write ratio)
Dataset size and growth expectations
Performance requirements and budgetary constraints
The actual bottlenecks in your current system

Remember that storage performance is just one piece of the puzzle. A well-designed distributed database system needs to consider network topology, CPU resources, and memory configuration alongside storage performance for optimal results.

Thanks for reading

Published via Towards AI
