Data Engineering | Towards AI

Premium SSD vs Ultra SSD: Azure Storage Performance for Distributed Databases

Richie Bachala

1 like

March 4, 2025

Author(s): Richie Bachala. Originally published on Towards AI. When building distributed systems in the cloud, storage performance can make or break your application’s success. In this post, we’ll explore how different Azure disk types perform under distributed database workloads, using YugabyteDB as …

Cloud Computing Data Engineering Latest Machine Learning

When Scripts Aren’t Enough: Building Sustainable Enterprise Data Quality

Richie Bachala

1 like

February 11, 2025

Author(s): Richie Bachala. Originally published on Towards AI. Beyond Scale: Data Quality for AI Infrastructure. The trajectory of AI over the past decade has been driven largely by the scale of data available for training and the ability to process it with …

Data Analysis Data Engineering Data Science Latest Machine Learning

Anyone Can Build GenAI Apps

Jiazhen Zhu

1 like

February 4, 2025

Author(s): Jiazhen Zhu. Originally published on Towards AI. Written by Jiazhen Zhu, Michael Pfaffenberger, Wallace Dalmet, Sriram Ranganathan, Ahmed Noufel, and Anveshrithaa Sundareswaran. We conducted a brown bag session at the Walmart Global Tech Reston site to discuss this …

Data Analysis Data Engineering Latest Machine Learning

Conditional (Case When) Statements in SQL

Kamireddy Mahendra

1 like

January 26, 2025

Author(s): Kamireddy Mahendra. Originally published on Towards AI. A Must-Know Concept to Work in Real-World Projects. In real-world projects, it’s necessary to understand that the …

Data Engineering Latest Machine Learning

How to Build Your First Data Engineering Project Step by Step?

Nishtha Nagar

0 likes

January 11, 2025

Author(s): Nishtha Nagar. Originally published on Towards AI. “Data engineering is the bridge that connects broad business goals with detailed technical implementation.” (Michael Hausenblas) Did you know that by 2026, …

Data Engineering Data Science Latest Machine Learning

Why Every Health Data Scientist Should Know About OMOP CDM

Mazen Ahmed

0 likes

November 9, 2024

Author(s): Mazen Ahmed. Originally published on Towards AI. Standardising Healthcare Data. A large issue I struggle with at work is standardising healthcare data. I gather data from …

Data Engineering Latest Machine Learning

Innovations in Analytics: Elevating Data Quality with GenAI

Jonas Dieckmann

0 likes

October 31, 2024

Author(s): Jonas Dieckmann. Originally published on Towards AI. Data analytics has become a key driver of commercial success in recent years. The ability to turn large data sets into actionable insights can mean the difference between a successful campaign and missed opportunities. …

Artificial Intelligence Data Engineering Latest Machine Learning

Demystifying Google’s Data Gemma

Chirag Agrawal

1 like

September 27, 2024

Author(s): Chirag Agrawal. Originally published on Towards AI. Discover how Google’s Data Gemma leverages the Data Commons knowledge graph to tackle AI hallucinations. In this blog post, we’ll explore how Data Gemma aims to improve the …

Data Engineering Latest Machine Learning

How I’d Learn to Become a Data Engineer in 2025.

Kamireddy Mahendra

1 like

September 23, 2024

Author(s): Kamireddy Mahendra. Originally published on Towards AI. A clear guide, if I could start over again from the beginning. My journey into the world …

Artificial Intelligence Data Engineering Latest Machine Learning

What are Vector Databases?

ifttt-user

0 likes

May 21, 2024

Author(s): Ayo Akinkugbe. Originally published on Towards AI. Vector databases are databases designed specifically for storing vector embeddings. If a vector is a data representation having magnitude and direction, what then are vector embeddings? Vector embeddings …

Data Engineering Data Science Latest Machine Learning

Build and Run Data Pipelines with Sagemaker Pipelines

ifttt-user

1 like

May 20, 2024

Author(s): Jake Teo. Originally published on Towards AI. Leverage AWS’s MLOps platform to run your large data processing workloads seamlessly. In this article, I will show how you can run long-running, repetitive, centrally managed and …

Artificial Intelligence Data Engineering Latest Machine Learning

Volga — Open-source Feature Engine for real-time AI — Part 2

ifttt-user

1 like

April 5, 2024

Author(s): Andrey Novitskiy. Originally published on Towards AI. This is the second part of a 2-post series describing Volga’s architecture and technical details. For motivation and the problem’s background, see the first part. TL;DR: Volga is an open-source real-time feature …

Artificial Intelligence Data Engineering Latest Machine Learning

Volga — Open-source Feature Engine For Real-time AI — Part 1

ifttt-user

1 like

April 5, 2024

Author(s): Andrey Novitskiy. Originally published on Towards AI. This is the first part of a 2-post series describing the background and motivation behind Volga. For technical details, see the second part. TL;DR: Volga is an open-source, self-serve, scalable data/feature calculation …

Data Engineering Latest Machine Learning

Unlocking the Gates to Success: Dive into SQL Interview Questions from Leading MAANG Companies

ifttt-user

0 likes

February 11, 2024

Author(s): Kamireddy Mahendra. Originally published on Towards AI. “Consistent practice is the key to unlocking success in clearing any coding interview.” Concepts used: window functions, CTEs, joins, subqueries, and GROUP BY. Q1. Assume you’re given a …
