When AI Agents Forget What They Saw: The Goal Drift Problem in Video Research
Author(s): Kaushik Rajan Originally published on Towards AI. Why more autonomy doesn’t always mean better performance, and what the first video deep research benchmark reveals about the limits of agentic AI. You’re watching a museum tour video. Someone asks: “What’s the registration …
The Prism Hypothesis: Why AI Vision Systems Have Been Looking at the World Wrong
Author(s): Kaushik Rajan Originally published on Towards AI. Vision models either understand images or generate them well. A frequency-based view dissolves the trade-off. Here’s a puzzle that has quietly haunted computer vision for years: Contrastive Language-Image Pre-training (CLIP), OpenAI’s model that learns …
Thinking with Video: The Next Leap in Multimodal AI Reasoning
Author(s): Kaushik Rajan Originally published on Towards AI. How video generation models like Sora-2 are bridging the gap between static images and dynamic understanding. I still remember the first time I saw a Vision Language Model (VLM) describe a complex image. It …
Teaching AI to Say “I Don’t Know”
Author(s): Kaushik Rajan Originally published on Towards AI. A deep dive into TruthRL, a new reinforcement learning method making large language models more honest. I once asked an early AI model for a biography of a niche historical figure. It confidently spun …
LLMs Don’t Just Need to Be Smart — They Need to Be Specific. Here’s How.
Author(s): Kaushik Rajan Originally published on Towards AI. How a new technique called “Test-Time Deliberation” teaches AI to think before it speaks. I spend a lot of my time wrestling with Large Language Models (LLMs). The goal is always the same: how …
Researchers put AI in a Room with Regulators and a Game of Trust. It Didn’t Go Well.
Author(s): Kaushik Rajan Originally published on Towards AI. A new study uses game theory to simulate how AI agents, developers, and users interact. I’ve spent countless hours thinking about AI safety. It’s the kind of topic that keeps you up at night. …
We’ve Been Measuring AI Reasoning All Wrong. Here’s How to Fix It.
Author(s): Kaushik Rajan Originally published on Towards AI. A new research paper reveals how we can teach language models to actually think, not just guess the right answer. Imagine a math student who consistently aces every test. You’re impressed. But one day, …
Why Your AI Is a Fluent Liar
Author(s): Kaushik Rajan Originally published on Towards AI. A deep dive into the research that explains why AI hallucinations are an inherent feature of Large Language Models, not just a bug. You’ve probably seen it before. You ask an AI chatbot a …
Beyond Text-to-Speech: The Next Wave of Generative Audio
Author(s): Kaushik Rajan Originally published on Towards AI. How Step-Audio 2 is changing the game for creators and developers. For years, AI-generated audio has felt like a technology perpetually on the verge of a breakthrough. We’ve all heard it: the robotic voice …
From Pixels to Understanding: A Better Way for AI to See
Author(s): Kaushik Rajan Originally published on Towards AI. How a new “denoising” technique is making on-device computer vision faster, smarter, and ready for your next app. Computer vision on mobile devices is a quiet miracle. It powers the face-unlock on your phone, …
I Built an AI Rock Identifier App in a Weekend With SwiftUI & Gemini
Author(s): Kaushik Rajan Originally published on Towards AI. How I took a simple idea from concept to the App Store in just 48 hours. You’re on a hike and find a stone with mesmerizing, deep-purple crystals. Is it amethyst? Fluorite? For most …
How I Built an Adaptive Concept Explainer Using Hugging Face Models
Author(s): Kaushik Rajan Originally published on Towards AI. Demystifying Complex Ideas Through Multi-Level Explanations. Have you ever found yourself trying to wrap your head around a complex concept, or having to break down something technical for someone who …
Multi-Agent AI: From Isolated Agents to Cooperative Ecosystems
Author(s): Kaushik Rajan Originally published on Towards AI. A mechanism design framework for reducing conflict and boosting trust in multi-agent AI …
How Recurrent Neural Networks (RNNs) Are Revolutionizing Decision-Making Research
Author(s): Kaushik Rajan Originally published on Towards AI. …