AWS Lambda: Serverless Application Is Like Cooking Pasta With a Magic Machine!!!
Author(s): Henry Originally published on Towards AI. How AWS Lambda Powers AI & Data Engineering AWS Lambda is a serverless compute service that runs your code, so you do not need to spend extra effort to maintain the server. It is like …
Beyond Pandas: The Modern Data Analytics and Engineering Techniques With Python (Part 1)
Author(s): Gift Ojeabulu Originally published on Towards AI. Image by author Outline Introduction The Data Size Decision Framework & Comprehensive Decision Flowchart A Diagrammatic representation based on Team Syntax Preference, and Performance or Integration Requirements Real-World Examples: Log file Analysis, E-commerce instance …
How to Augment Wildfire Datasets with Historical Weather Data using Python and Google Earth Engine
Author(s): Ruiz Rivera Originally published on Towards AI. Photo by Tim Mossholder on Unsplash Picture this: You're a data scientist working with wildfire data, and all you have are basic fire records: location coordinates, timestamps, and maybe a unique fire ID. …
How to Build Bulletproof Data Pipelines with PySpark That Actually Scale
Author(s): Yuval Mehta Originally published on Towards AI. Photo by Claudio Schwarz on Unsplash We're past the era when a CSV, a Pandas DataFrame, and a single machine could handle everything you threw at them. Data is heavier now. It arrives fast, …
Machine Learning at Scale: Why PySpark MLlib Still Wins in 2025
Author(s): Yuval Mehta Originally published on Towards AI. Photo by Kevin Ku on Unsplash Machine learning may be glamorous when you're tuning models on Kaggle datasets or demoing GPT wrappers. But in production? It's a grind. You're not just building a model. …
Take a Dive Into Delta Lake
Author(s): Disha Verma Originally published on Towards AI. That's Jerry, the frustrated Data Steward! Remember the time we spoke about Data Warehouse, Data Lake and Data Lakehouse? Today, we will learn about Delta Lake that belongs to the same data architecture …
Pipelines to Prompts: Getting started with Databricks and AWS
Author(s): Devi Originally published on Towards AI. A Beginner's Guide to Building GenAI Applications from Raw Data Using Databricks and AWS As with my other blogs, we start with the theory, practice, and wrap up with some lessons learnt. NAVIGATION: Why Data …
The Data Stack for AI: RDBMS, Graph, and HTAP
Author(s): Vicky's Notes Originally published on Towards AI. Earlier this year, I was working with an insurance client who was eager to adopt generative AI to improve customer engagement, but they kept hitting a wall. The AI models were fine; the real …
Beyond Pandas: The Modern Data Processing Toolkit for Data Engineering (Part 1)
Author(s): Gift Ojeabulu Originally published on Towards AI. Image by author Outline Introduction The Data Size Decision Framework & Comprehensive Decision Flowchart A Diagrammatic representation based on Team Syntax Preference, and Performance or Integration Requirements Real-World Examples: Log file Analysis, E-commerce instance …
Premium SSD vs Ultra SSD: Azure Storage Performance for Distributed Databases
Author(s): Richie Bachala Originally published on Towards AI. When building distributed systems in the cloud, storage performance can make or break your application's success. In this post, we'll explore how different Azure disk types perform under distributed database workloads, using YugabyteDB as …
When Scripts Aren't Enough: Building Sustainable Enterprise Data Quality
Author(s): Richie Bachala Originally published on Towards AI. Beyond Scale: Data Quality for AI Infrastructure The trajectory of AI over the past decade has been driven largely by the scale of data available for training and the ability to process it with …
Anyone Can Build GenAI Apps
Author(s): Jiazhen Zhu Originally published on Towards AI. written by Jiazhen Zhu, Michael Pfaffenberger, Wallace Dalmet, Sriram Ranganathan, Ahmed Noufel, and Anveshrithaa Sundareswaran Photo credit: Pixabay We conducted a brown bag session at the Walmart Global Tech Reston site to discuss this …
Conditional (Case When) Statements in SQL
Author(s): Kamireddy Mahendra Originally published on Towards AI. A Must-Know Concept to Work in Real-World Projects Image by author In real-world projects, it's necessary to understand that the …
How to Build Your First Data Engineering Project Step by Step?
Author(s): Nishtha Nagar Originally published on Towards AI. How to Build Your First Data Engineering Project Step by Step? "Data engineering is the bridge that connects broad business goals with detailed technical implementation." - Michael Hausenblas. Did you know that by 2026, …