The SQL Renaissance: Why a 50-Year-Old Language is Suddenly Everywhere in the AI Era

Last Updated on October 28, 2025 by Editorial Team

Author(s): Mahathidhulipala

Originally published on Towards AI.

SQL is having its biggest moment in decades. Not despite the AI revolution — because of it.

The language that was supposed to be replaced by NoSQL databases, then Python pandas, then whatever came next, is suddenly critical to modern AI workflows. ML engineers are using it to prep training data for LLM fine-tuning. Vector databases for AI embeddings support SQL syntax. Real-time streaming platforms have adopted SQL as their query language.

This wasn’t supposed to happen. Let’s explore why it did.

The Thing Nobody Saw Coming

The prediction was straightforward: As data got bigger and messier, SQL would fade away. Graph databases, document stores, and distributed systems would make SQL look quaint and obsolete.

The reality turned out differently. SQL got absorbed into everything.

· Vector databases for AI embeddings? Many now support SQL.

· Real-time streaming data? SQL on Kafka streams.

· Graph databases? Cypher queries that look suspiciously like SQL.

· Time-series data? SQL with window functions.

· Data lakes? SQL over Parquet files in S3.

The reason is uncomfortable for the “SQL is dead” crowd: Humans think in tables and relationships. We always have. We probably always will.

SQL + AI: The Unexpected Power Couple

The most interesting development isn’t that SQL survived. It’s how it’s become critical to AI workflows.

1. Training Data Preparation

Every ML model needs clean, structured training data. SQL excels at cleaning and structuring data, making it ideal for feature engineering — creating the variables that make or break your model.

-- Preparing customer churn training data
WITH customer_features AS (
    SELECT
        customer_id,
        COUNT(DISTINCT order_id) AS total_orders,
        SUM(order_amount) AS lifetime_value,
        DATEDIFF(day, MAX(order_date), CURRENT_DATE) AS days_since_last_order,
        AVG(days_between_orders) AS avg_purchase_frequency
    FROM customer_activity
    GROUP BY customer_id
)
SELECT
    f.*,
    -- The label is derived from days_since_last_order, so exclude that
    -- column from the model inputs to avoid label leakage
    CASE WHEN days_since_last_order > 90 THEN 1 ELSE 0 END AS churned
FROM customer_features f;

This isn’t just data extraction. SQL’s window functions and CTEs make feature engineering transparent and reproducible.
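To try the pattern locally, here is a minimal sketch in Python using the standard-library sqlite3 module. The sample rows and the fixed as_of date are made up for illustration, and SQLite's julianday() stands in for DATEDIFF, which SQLite does not provide. The avg_purchase_frequency column is left out because the sample table has no days_between_orders column.

```python
import sqlite3

# Hypothetical sample rows: (customer_id, order_id, order_amount, order_date)
ROWS = [
    (1, 101, 50.0, "2025-01-05"),
    (1, 102, 70.0, "2025-06-20"),
    (2, 201, 20.0, "2025-01-10"),
]

FEATURE_SQL = """
WITH customer_features AS (
    SELECT
        customer_id,
        COUNT(DISTINCT order_id) AS total_orders,
        SUM(order_amount) AS lifetime_value,
        -- julianday() difference replaces DATEDIFF in SQLite
        CAST(julianday(:as_of) - julianday(MAX(order_date)) AS INTEGER)
            AS days_since_last_order
    FROM customer_activity
    GROUP BY customer_id
)
SELECT
    customer_id,
    total_orders,
    lifetime_value,
    days_since_last_order,
    CASE WHEN days_since_last_order > 90 THEN 1 ELSE 0 END AS churned
FROM customer_features
ORDER BY customer_id
"""


def build_features(rows, as_of):
    """Load rows into an in-memory table and return one feature row per customer."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer_activity ("
        "customer_id INTEGER, order_id INTEGER, order_amount REAL, order_date TEXT)"
    )
    conn.executemany("INSERT INTO customer_activity VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(FEATURE_SQL, {"as_of": as_of}).fetchall()
    conn.close()
    return result


# A fixed reference date keeps the output reproducible between runs
features = build_features(ROWS, as_of="2025-07-01")
for row in features:
    print(row)
```

Passing the reference date as a parameter rather than calling CURRENT_DATE is a deliberate choice here: the same inputs always produce the same training set.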

2. Vector Database Queries

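The AI boom created a new database category: vector databases for semantic search and RAG (Retrieval-Augmented Generation). Not all of them speak SQL, but a growing number do: the pgvector extension, for example, adds vector similarity search to ordinary Postgres queries.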

-- Illustrative semantic search; cosine_similarity and query_vector are
-- placeholders, and the exact function or operator varies by engine
SELECT
    document_id,
    content,
    metadata,
    cosine_similarity(embedding, query_vector) AS relevance_score
FROM documents
WHERE cosine_similarity(embedding, query_vector) > 0.7
ORDER BY relevance_score DESC
LIMIT 10;

Why SQL for vector search? Developers already know it. It composes well with other queries. It’s declarative — you say what you want, not how to get it.
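The query shape above can be tried without a vector database. The sketch below registers a Python cosine_similarity function with SQLite and stores embeddings as JSON text; both choices are assumptions made for the demo, as are the three toy documents. It scans every row, whereas real vector engines rely on approximate nearest-neighbor indexes.

```python
import json
import math
import sqlite3


def cosine_similarity(a_json, b_json):
    """Cosine similarity of two vectors stored as JSON arrays."""
    a, b = json.loads(a_json), json.loads(b_json)
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


SEARCH_SQL = """
SELECT document_id, content, relevance_score
FROM (
    SELECT
        document_id,
        content,
        cosine_similarity(embedding, :query_vector) AS relevance_score
    FROM documents
)
WHERE relevance_score > :min_score
ORDER BY relevance_score DESC
LIMIT :limit
"""


def search(conn, query_vector, min_score=0.7, limit=10):
    """Return (document_id, content, score) rows, best match first."""
    params = {
        "query_vector": json.dumps(query_vector),
        "min_score": min_score,
        "limit": limit,
    }
    return conn.execute(SEARCH_SQL, params).fetchall()


conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL under the name used in the query
conn.create_function("cosine_similarity", 2, cosine_similarity)
conn.execute(
    "CREATE TABLE documents (document_id INTEGER, content TEXT, embedding TEXT)"
)

# Toy 3-dimensional embeddings; real ones have hundreds of dimensions
DOCS = [
    (1, "sql window functions", [1.0, 0.0, 0.0]),
    (2, "sql ctes", [0.8, 0.6, 0.0]),
    (3, "gardening tips", [0.0, 0.0, 1.0]),
]
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?)",
    [(doc_id, content, json.dumps(vec)) for doc_id, content, vec in DOCS],
)

results = search(conn, [1.0, 0.0, 0.0])
for row in results:
    print(row)
```

Wrapping the similarity call in a subquery means the score is computed once per row and then reused by the filter and the sort.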

3. LLM Output Structuring

Large Language Models are probabilistic and messy. SQL excels at imposing structure on messy data.

-- Analyzing sentiment from LLM-generated classifications
SELECT
    CASE
        WHEN llm_sentiment = 'positive' AND confidence > 0.8 THEN 'high_positive'
        WHEN llm_sentiment = 'negative' AND confidence > 0.8 THEN 'high_negative'
        ELSE 'moderate'
    END AS sentiment_category,
    COUNT(*) AS review_count,
    ROUND(100.0 * COUNT(*) / SUM(COUNT(*)) OVER(), 2) AS pct_of_total
FROM ai_enriched_reviews
GROUP BY sentiment_category;
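Before a query like this can run, the raw model responses have to land in a table. One way to do that, sketched here in Python with sqlite3, is to validate each response, discard anything malformed, and only then aggregate. The JSON response shape, the allowed labels, and the sample strings are all assumptions for illustration.

```python
import json
import sqlite3

# Hypothetical raw LLM responses, including two that should be rejected
RAW_OUTPUTS = [
    '{"review_id": 1, "sentiment": "positive", "confidence": 0.93}',
    '{"review_id": 2, "sentiment": "negative", "confidence": 0.88}',
    '{"review_id": 3, "sentiment": "positive", "confidence": 0.55}',
    "Sorry, I cannot classify this review.",                        # not JSON
    '{"review_id": 5, "sentiment": "ecstatic", "confidence": 0.99}',  # unknown label
]
ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}


def parse_output(raw):
    """Return (review_id, sentiment, confidence), or None if the response is unusable."""
    try:
        record = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(record, dict):
        return None
    if record.get("sentiment") not in ALLOWED_SENTIMENTS:
        return None
    confidence = record.get("confidence")
    if not isinstance(confidence, (int, float)) or not 0 <= confidence <= 1:
        return None
    return (record.get("review_id"), record["sentiment"], float(confidence))


SUMMARY_SQL = """
WITH labeled AS (
    SELECT
        CASE
            WHEN llm_sentiment = 'positive' AND confidence > 0.8 THEN 'high_positive'
            WHEN llm_sentiment = 'negative' AND confidence > 0.8 THEN 'high_negative'
            ELSE 'moderate'
        END AS sentiment_category
    FROM ai_enriched_reviews
),
counts AS (
    SELECT sentiment_category, COUNT(*) AS review_count
    FROM labeled
    GROUP BY sentiment_category
)
SELECT
    sentiment_category,
    review_count,
    ROUND(100.0 * review_count / SUM(review_count) OVER (), 2) AS pct_of_total
FROM counts
ORDER BY sentiment_category
"""

parsed = [parse_output(raw) for raw in RAW_OUTPUTS]
valid_rows = [row for row in parsed if row is not None]
rejected_count = len(parsed) - len(valid_rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ai_enriched_reviews "
    "(review_id INTEGER, llm_sentiment TEXT, confidence REAL)"
)
conn.executemany("INSERT INTO ai_enriched_reviews VALUES (?, ?, ?)", valid_rows)

summary = conn.execute(SUMMARY_SQL).fetchall()
print(f"rejected {rejected_count} malformed responses")
for row in summary:
    print(row)
```

The aggregation is split into two CTEs so the window function runs over plain counts, which keeps the query portable across engines.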

The dbt Revolution: When SQL Became Software Engineering

The biggest shift in SQL over the last five years isn’t a new feature. It’s dbt (data build tool).

dbt transformed SQL from “ad-hoc queries in some tool” to “version-controlled, tested, documented data transformations.”

Old way:

-- Some analyst's query saved in a file somewhere
-- Hope it still works
-- No idea who wrote it or why
SELECT * FROM staging_orders WHERE something = something;

dbt way:

-- models/customer_lifetime_value.sql
-- Version-controlled, tested, documented SQL
WITH customer_orders AS (
    SELECT * FROM {{ ref('stg_orders') }}
)
SELECT
    customer_id,
    COUNT(DISTINCT order_id) AS total_orders,
    SUM(order_amount) AS lifetime_value,
    MIN(order_date) AS first_order_date,
    MAX(order_date) AS last_order_date
FROM customer_orders
GROUP BY customer_id

What dbt provides:

· Version control for data transformations (Git for SQL)

· Testing (does this column have nulls? Are these values unique?)

· Documentation (what does this model do? Who uses it?)

· Dependencies (this model needs these other models to run first)

· Reusability (DRY principle for data work)
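The testing idea is simple enough to imitate by hand: a data test is a query that looks for rows violating an expectation, and it passes when none are found. The sketch below mimics not-null and unique checks in Python with sqlite3. It is an illustration of the concept rather than dbt's own implementation, and the helper names and sample table are hypothetical.

```python
import sqlite3


def not_null_failures(conn, table, column):
    """Count rows where the column is NULL."""
    # Identifiers are interpolated directly, so pass trusted names only
    sql = f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL"
    return conn.execute(sql).fetchone()[0]


def unique_failures(conn, table, column):
    """Count distinct values that appear more than once."""
    sql = f"""
        SELECT COUNT(*) FROM (
            SELECT {column}
            FROM {table}
            WHERE {column} IS NOT NULL
            GROUP BY {column}
            HAVING COUNT(*) > 1
        )
    """
    return conn.execute(sql).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg_orders (order_id INTEGER, customer_id INTEGER)")
# Sample data with one NULL customer_id and one duplicated order_id
conn.executemany(
    "INSERT INTO stg_orders VALUES (?, ?)",
    [(1, 10), (2, None), (2, 12)],
)

report = {
    "not_null_stg_orders_customer_id": not_null_failures(conn, "stg_orders", "customer_id"),
    "unique_stg_orders_order_id": unique_failures(conn, "stg_orders", "order_id"),
}
for name, failures in report.items():
    status = "PASS" if failures == 0 else f"FAIL ({failures})"
    print(f"{name}: {status}")
```

Running checks like these on every change is what turns a folder of queries into something closer to a tested codebase.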

This isn’t just better SQL. It’s SQL as infrastructure.

Real-Time SQL: The Streaming Revolution

For decades, SQL was batch-oriented. Run a query, get results, done.

Technologies like Apache Flink, ksqlDB, and Materialize brought SQL to streaming data.

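-- Real-time fraud detection with streaming SQL
-- (interval syntax varies by engine)
WITH windowed AS (
    SELECT
        transaction_id,
        user_id,
        amount,
        COUNT(*) OVER (
            PARTITION BY user_id
            ORDER BY transaction_time
            RANGE BETWEEN INTERVAL '10 minutes' PRECEDING AND CURRENT ROW
        ) AS transactions_last_10_min
    FROM transactions
)
-- Window results cannot be filtered in the same SELECT's WHERE clause,
-- so the filter is applied one level up
SELECT *
FROM windowed
WHERE transactions_last_10_min > 5;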

This query continuously evaluates new transactions as they arrive. No batch processing. No delays. SQL, running in real-time.
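To make the semantics concrete, here is roughly what a streaming engine has to maintain for that window, written as a small Python class. It is a single-process sketch that assumes events arrive in time order for each user; the class name and defaults are invented for the example, and a real engine also handles late data, state persistence, and scale.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class VelocityMonitor:
    """Counts each user's transactions in a trailing time window."""

    def __init__(self, window=timedelta(minutes=10), threshold=5):
        self.window = window
        self.threshold = threshold
        self._recent = defaultdict(deque)  # user_id -> timestamps inside the window

    def observe(self, user_id, transaction_time):
        """Record one transaction; return True if the user exceeds the threshold."""
        timestamps = self._recent[user_id]
        timestamps.append(transaction_time)
        # Drop timestamps older than the window (the boundary itself is kept,
        # matching RANGE ... PRECEDING AND CURRENT ROW)
        cutoff = transaction_time - self.window
        while timestamps and timestamps[0] < cutoff:
            timestamps.popleft()
        return len(timestamps) > self.threshold


monitor = VelocityMonitor()
start = datetime(2025, 1, 1, 12, 0)

# Six transactions one minute apart, then one after a long pause
burst_flags = [
    monitor.observe("user_1", start + timedelta(minutes=i)) for i in range(6)
]
later_flag = monitor.observe("user_1", start + timedelta(minutes=25))

print(burst_flags)
print(later_flag)
```

The SQL version says the same thing in a few lines and leaves the bookkeeping to the engine, which is the appeal of streaming SQL.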

The Cloud Data Warehouse Revolution

Snowflake, BigQuery, Databricks — these platforms made SQL scalable in ways that seemed impossible a decade ago.

BigQuery’s separation of storage and compute:

-- Query petabytes without thinking about infrastructure
SELECT
    geoNetwork.country AS country,
    SUM(totals.pageviews) AS page_views,
    AVG(totals.timeOnSite) AS avg_seconds
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_*`
WHERE _TABLE_SUFFIX BETWEEN '20170701' AND '20170731'
    AND totals.pageviews > 0
GROUP BY country
ORDER BY page_views DESC
LIMIT 100;

The public sample dataset is small, but the same query shape runs unchanged over tables with billions of rows and can still complete in seconds. No indexes to manage. No tables to partition manually. You write SQL; the cloud figures out how to run it.

Snowflake’s data sharing:

-- Access partner data without copying it
SELECT
    our_customers.lifetime_value,
    partner_data.credit_score
FROM our_database.customers AS our_customers
JOIN shared_data.credit_bureau.scores AS partner_data
    ON our_customers.customer_id = partner_data.customer_id;

This query joins your data with someone else’s data that lives in their account. No ETL. No data movement. Just SQL across organizational boundaries.

The Analytics Engineering Role: SQL’s Career Renaissance

Five years ago, “I’m really good at SQL” didn’t open many doors. Today, it’s one of the most in-demand skills.

The Analytics Engineer role emerged from this SQL renaissance:

· Build data models (SQL in dbt)

· Ensure data quality (SQL tests)

· Create metrics layers (SQL with semantic meaning)

· Partner with data scientists (SQL for feature engineering)

· Own data transformation pipeline (SQL as code)

Sample Analytics Engineer workflow:

-- models/staging/stg_orders.sql
-- Clean and standardize raw order data
WITH source AS (
    SELECT * FROM {{ source('raw', 'orders') }}
)
SELECT
    order_id,
    customer_id,
    status,  -- passed through so the marts layer can filter on it
    CAST(order_date AS DATE) AS order_date,
    CAST(total_amount AS DECIMAL(10,2)) AS total_amount,
    COALESCE(discount_amount, 0) AS discount_amount
FROM source
WHERE order_id IS NOT NULL
    AND order_date >= '2020-01-01'

-- models/marts/customer_metrics.sql
-- Business logic layer
SELECT
    customer_id,
    MIN(order_date) AS first_order_date,
    MAX(order_date) AS last_order_date,
    COUNT(DISTINCT order_id) AS lifetime_orders,
    SUM(total_amount) AS lifetime_revenue,
    CASE
        WHEN DATEDIFF(day, MAX(order_date), CURRENT_DATE) < 30 THEN 'active'
        WHEN DATEDIFF(day, MAX(order_date), CURRENT_DATE) < 90 THEN 'at_risk'
        ELSE 'churned'
    END AS customer_status
FROM {{ ref('stg_orders') }}
WHERE status = 'completed'
GROUP BY customer_id

The Paradox: SQL is More Relevant Because Data Got Harder

Here’s the unintuitive truth: As data systems got more complex, SQL became more important, not less.

Why?

1. Abstraction Layer
You can write largely the same SQL against Postgres, Snowflake, BigQuery, or DuckDB. The underlying execution is completely different, but apart from dialect details such as date functions, your query looks the same.

2. Declarative Power
You say “what” you want. The query optimizer figures out “how” to get it. As systems get smarter, this gap widens in your favor.

3. Composability
SQL queries are building blocks. They stack, nest, and reference each other. Complex systems need composable primitives.

4. Universal Language
Engineers, analysts, data scientists — everyone speaks SQL. It’s the lingua franca of data.
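The declarative point is easy to see in miniature. In the sketch below, which uses Python's sqlite3 module and a made-up orders table, the query text never changes. Only the physical design does, and the optimizer picks a different plan on its own.

```python
import sqlite3

QUERY = "SELECT order_id, total_amount FROM orders WHERE customer_id = ?"


def query_plan(conn, sql, params=()):
    """Return SQLite's human-readable plan for a query without running it."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row holds the description
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders "
    "(order_id INTEGER PRIMARY KEY, customer_id INTEGER, total_amount REAL)"
)
# 1,000 synthetic orders spread across 100 customers
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(i, i % 100, float(i)) for i in range(1, 1001)],
)

plan_before = query_plan(conn, QUERY, (42,))
result_before = conn.execute(QUERY, (42,)).fetchall()

# Change the physical layout only; the query text stays the same
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")

plan_after = query_plan(conn, QUERY, (42,))
result_after = conn.execute(QUERY, (42,)).fetchall()

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording depends on the SQLite version, but the first plan is a full table scan and the second uses the new index, while both runs return the same rows.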

The Future: SQL + LLMs = Natural Language Data

The next frontier? Natural language to SQL.

User: “Show me our top products by revenue last quarter
in regions where sales grew more than 20%”

LLM generates:
WITH regional_growth AS (
    SELECT
        region,
        SUM(CASE WHEN quarter = 'Q4' THEN revenue END) AS q4_revenue,
        SUM(CASE WHEN quarter = 'Q3' THEN revenue END) AS q3_revenue
    FROM sales
    WHERE year = 2024
    GROUP BY region
    -- Aggregates are repeated because several engines reject aliases in HAVING
    HAVING SUM(CASE WHEN quarter = 'Q4' THEN revenue END)
         > 1.2 * SUM(CASE WHEN quarter = 'Q3' THEN revenue END)
)
SELECT
    p.product_name,
    s.region,
    SUM(s.revenue) AS q4_revenue
FROM sales s
JOIN regional_growth rg ON s.region = rg.region
JOIN products p ON s.product_id = p.id
WHERE s.quarter = 'Q4'
    AND s.year = 2024  -- match the year used to compute growth
GROUP BY p.product_name, s.region
ORDER BY q4_revenue DESC
LIMIT 10;

Tools like GitHub Copilot, Amazon Q, and Google’s Gemini (formerly Duet AI) are making this real. But here’s the catch: you still need to understand SQL to verify the AI got it right.
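Part of that verification can be automated. The sketch below compiles a generated query against an empty copy of the schema before anything touches real data, which catches misspelled tables and columns. The schema, the function name, and the keyword check are assumptions for illustration, and the check uses SQLite's dialect, so it will not catch errors specific to another engine or mistakes in business logic.

```python
import sqlite3

# Hypothetical schema matching the tables referenced in the example above
SCHEMA = """
CREATE TABLE sales (
    product_id INTEGER, region TEXT, quarter TEXT, year INTEGER, revenue REAL
);
CREATE TABLE products (id INTEGER, product_name TEXT);
"""


def check_generated_sql(sql, schema=SCHEMA):
    """Return (ok, message) after compiling the query against an empty schema."""
    statement = sql.strip().rstrip(";").strip()
    words = statement.split(None, 1)
    first_word = words[0].upper() if words else ""
    # Cheap guard, not a security boundary: run generated SQL on a
    # read-only connection in any real deployment
    if first_word not in ("SELECT", "WITH"):
        return False, "expected a SELECT or WITH query"
    if ";" in statement:
        return False, "expected a single statement"

    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)
        # EXPLAIN compiles the statement without executing it
        conn.execute("EXPLAIN " + statement)
        return True, "ok"
    except sqlite3.Error as exc:
        return False, str(exc)
    finally:
        conn.close()


good = check_generated_sql(
    "SELECT region, SUM(revenue) FROM sales WHERE year = 2024 GROUP BY region;"
)
typo = check_generated_sql("SELECT revenu FROM sales")
write = check_generated_sql("DROP TABLE sales")

print(good)
print(typo)
print(write)
```

A passing check only means the query is well-formed. Whether it answers the question the user actually asked still takes a person who can read SQL.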

Why SQL Keeps Winning

After 50 years, SQL persists because it solved a fundamental problem really well: structured queries over structured data.

Every attempt to replace it has either:

1. Settled into a niche (NoSQL stores ended up complementing relational databases rather than displacing them)

2. Added SQL support (MongoDB now offers a SQL interface for analytics)

3. Created SQL-like syntax (Cypher, SPARQL, PromQL)

The lesson: Good abstractions are durable. SQL’s declarative model — stating what you want without specifying how to get it — is powerful enough to adapt to new paradigms.

The Honest Career Advice

If you’re wondering whether to invest time in SQL in 2025, here’s the reality:

SQL is not optional anymore. It’s not a “nice to have” skill. It’s table stakes for:

· Data analysts

· Analytics engineers

· Data engineers

· Backend developers

· ML engineers

· Product managers (increasingly)

But the SQL you need to learn isn’t just SELECT and JOIN. It’s:

· Window functions (context within groups)

· CTEs (breaking complex logic into steps)

· dbt patterns (SQL as software)

· Query optimization (making things fast)
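The first two items often appear together. As a small illustration, the sketch below uses a CTE to isolate one step and the LAG window function to look at each customer's previous order, then averages the gaps. It runs on Python's sqlite3 module (window functions need SQLite 3.25 or newer), and the orders data is made up.

```python
import sqlite3

# Hypothetical orders: (customer_id, order_date)
ORDERS = [
    (1, "2025-01-01"),
    (1, "2025-01-11"),
    (1, "2025-01-31"),
    (2, "2025-03-01"),
    (2, "2025-03-08"),
    (3, "2025-05-05"),  # single order, so no gap to measure
]

GAP_SQL = """
WITH ordered AS (
    -- Step 1: pair every order with the same customer's previous order
    SELECT
        customer_id,
        order_date,
        LAG(order_date) OVER (
            PARTITION BY customer_id
            ORDER BY order_date
        ) AS previous_order_date
    FROM orders
)
-- Step 2: average the gaps, skipping each customer's first order
SELECT
    customer_id,
    AVG(julianday(order_date) - julianday(previous_order_date))
        AS avg_days_between_orders
FROM ordered
WHERE previous_order_date IS NOT NULL
GROUP BY customer_id
ORDER BY customer_id
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, order_date TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", ORDERS)

gaps = conn.execute(GAP_SQL).fetchall()
for row in gaps:
    print(row)
```

Without the window function this would need a self-join, and without the CTE the two steps would be tangled into one expression.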

The Meta Lesson

SQL’s resurgence teaches us something about technology trends:

Boring wins. Not because it’s exciting, but because it works. The flashiest new technology gets the headlines. The reliable, composable, well-understood technology gets used in production for decades.

SQL has been “dead” a dozen times. Each time, it absorbed what killed it and came back stronger.

That’s not a bug. That’s the sign of a truly great abstraction.

The next time someone tells you SQL is outdated, ask them what they’re using for their LLM training data pipeline.

Closing Thoughts

SQL’s story is a reminder that in technology, longevity isn’t about resisting change — it’s about adapting to it. While frameworks come and go, fundamental abstractions that solve real problems tend to stick around.

The language created in 1974 to query relational databases is now querying vector embeddings, streaming data, and distributed data lakes. It’s preparing training data for neural networks and structuring outputs from large language models. It’s not just surviving in the AI era — it’s thriving.

For those building their careers in data, the message is clear: invest in fundamentals. Master window functions, understand query optimization, learn how SQL composes with modern tools like Python and dbt. These skills compound over time because they’re built on durable abstractions.

SQL has been declared dead more times than we can count. Each obituary was premature. As we move deeper into the age of AI and increasingly complex data systems, SQL’s declarative power becomes more valuable, not less.

The renaissance isn’t coming. It’s already here.

What’s your experience with SQL in modern data workflows? Are you seeing it pop up in unexpected places? Share your thoughts in the comments.

Published via Towards AI



Note: Content contains the views of the contributing authors and not Towards AI.