DrSwarnenduAI | Towards AI

GPT-4 Has 1.8 Trillion Parameters. It Uses 2% of Them Per Token.

59 likes

April 22, 2026

Author(s): DrSwarnenduAI Originally published on Towards AI. DeepSeek-R1: 671 billion parameters. 37 billion active per token. The article discusses various machine learning …

Artificial Intelligence Latest Machine Learning

I Read Every Line of Anthropic’s Leaked Source Code So You Don’t Have To. Here’s What They Were Hiding.

DrSwarnenduAI

48 likes

April 2, 2026

Author(s): DrSwarnenduAI Originally published on Towards AI. 512,000 lines of TypeScript. A secret AI pet. An always-on daemon that dreams. A mode that hides from you that it’s AI. All of it now public, because someone forgot one line in a config …

Artificial Intelligence Data Science Latest Machine Learning

Meta Just Built an AI That Rewrites the Rules of How It Gets Smarter. Then It Rewrote Those Rules Too.

DrSwarnenduAI

57 likes

April 1, 2026

Author(s): DrSwarnenduAI Originally published on Towards AI. The complete breakdown of HyperAgents: what metacognitive self-modification actually means, why the old way always hits a ceiling, and the result that made the AI safety community sit up straight. …

Artificial Intelligence Latest Machine Learning

Nobody Invented Attention. A Frustrated PhD Student Ran Out of Other Options.

DrSwarnenduAI

85 likes

March 11, 2026

Author(s): DrSwarnenduAI Originally published on Towards AI. Dzmitry Bahdanau was not trying to invent the architecture that would eventually run inside every large language model on earth. Completely gibberish at …

Artificial Intelligence Latest Machine Learning

Does Water Break Math? DeepMind’s Physics-Informed Search for the $1,000,000 Singularity

DrSwarnenduAI

56 likes

March 11, 2026

Author(s): DrSwarnenduAI Originally published on Towards AI. There is a prize. Not the proof. Not the $1 million. The article discusses how DeepMind employed a Physics-Informed Neural Network to explore the Navier-Stokes …

Artificial Intelligence Latest Machine Learning

The Footnote That Runs the World-Johan Jensen Died in 1925. He’d Never Seen a Computer. Stable Diffusion Runs His Math Every Second

DrSwarnenduAI

29 likes

March 10, 2026

Author(s): DrSwarnenduAI Originally published on Towards AI. His name was Johan. Let’s pay our homage today! This article explores the significant yet often unrecognized contributions of Johan Jensen, a telephone engineer whose mathematical insights have become foundational …

Artificial Intelligence Data Science Latest Machine Learning

Retrieval-Augmented Forecasting of Time-series

DrSwarnenduAI

40 likes

March 4, 2026

Author(s): DrSwarnenduAI Originally published on Towards AI. RAFT proves that time series forecasting doesn’t need bigger weights; it needs a better library card. Here’s the thing about The Cheesecake Factory menu: it’s 21 pages long. New Frontier in Time series. The article …

Artificial Intelligence Latest Machine Learning

When AI Finally Learned That “Dog” and 🐕 Are the Same Thing, aka CLIP

DrSwarnenduAI

28 likes

March 3, 2026

Author(s): DrSwarnenduAI Originally published on Towards AI. How CLIP used 400 million internet image-caption pairs to solve the 60-year problem of connecting vision and language by making them occupy the same 512-dimensional manifold. Welcome back. I believe in coordinates and manifolds. If …

Latest Machine Learning

Google DeepMind TRecViT says Video AI Doesn’t Need to Hoard!!

DrSwarnenduAI

6 likes

February 2, 2026

Author(s): DrSwarnenduAI Originally published on Towards AI. How TRecViT proves that lightness beats brute force: the equation always wins. The 175B parameter giants are mansions. TRecViT is the studio apartment that outperforms them. TRecViT outperforms massive AI models by utilizing a compact …

Latest Machine Learning

The 200-Year-Old Secret Behind Your AI Images: How Fourier’s Heat Equation Conquered Chaos

DrSwarnenduAI

15 likes

January 3, 2026

Author(s): DrSwarnenduAI Originally published on Towards AI. When Joseph Fourier solved the heat equation in 1822, he didn’t know he was writing the instruction manual for machines that would one day dream in pixels. Imagine dropping a single droplet of ink into …

Artificial Intelligence Latest Machine Learning

Your Brain Already Does Multimodal AI. It Took Us 10 Years And 7 Breakthroughs To Copy It.

DrSwarnenduAI

11 likes

January 1, 2026

Author(s): DrSwarnenduAI Originally published on Towards AI. See cat. Hear “cat”. Read “cat”. Same concept. Here’s every innovation that made GPT-4V possible. Close your eyes. I say “cat.” The article discusses the advancements in AI, particularly focusing on the development …

Artificial Intelligence Data Science Latest Machine Learning

The Orthogonality Paradox: We’ve Been Wrong About Space

DrSwarnenduAI

19 likes

November 24, 2025

Author(s): DrSwarnenduAI Originally published on Towards AI. The trap we don’t know we’re in. You think you understand space. The article discusses the implications of dimensionality in understanding space and mathematics, particularly how our intuitive grasp of lower dimensions doesn’t hold true …

Artificial Intelligence Latest Machine Learning

The Math Behind Kimi K2: How a Chinese Startup Beat Silicon Valley at 1% of the Cost

DrSwarnenduAI

27 likes

November 13, 2025

Author(s): DrSwarnenduAI Originally published on Towards AI. A complete mathematical breakdown of three architectural innovations that let $4.6M beat $500M, with proofs, intuition, and the blueprint for understanding. I’ve spent the last 72 hours obsessively reverse-engineering Kimi K2’s architecture. Grab coffee. …

Artificial Intelligence Latest Machine Learning

The Proof is in the Preference: Why DPO is the New RLHF

DrSwarnenduAI

12 likes

November 10, 2025

Author(s): DrSwarnenduAI Originally published on Towards AI. Stop debugging PPO. Direct Preference Optimization solved the alignment puzzle with a single, stable loss function. …

Artificial Intelligence Latest Machine Learning

The Two Faces of Forecasting

DrSwarnenduAI

22 likes

October 5, 2025

Author(s): DrSwarnenduAI Originally published on Towards AI. Why Amazon’s $469B Supply Chain Split the Problem in Half (And Why Your Company Should Too). The Mathematical Revolution That Transformed Chaos Into Coordination. Picture this: You’re in a boardroom. The CFO demands next quarter’s …



