Grok 3’s DeepSearch with Google’s new AI Mode (Search)
Author(s): Nehdiii Originally published on Towards AI. Generative AI is reshaping the way we search, and it’s no longer limited to tools like Perplexity or ChatGPT. Many advanced AI users I speak with regularly rely on xAI’s Grok 3 for everyday search …
What is Vibe Coding?
Author(s): Nehdiii Originally published on Towards AI. I’ve observed two intriguing trends that I believe will develop in parallel as the future of work unfolds. One reflects AI reasoning models leveraging agentic workflows to rethink traditional scientific methods like Google’s …
Is AGI merely a Silicon Valley illusion?
Author(s): Nehdiii Originally published on Towards AI. From OpenAI to DeepSeek, everyone now claims to be an AGI startup, but by 2025, the explosion of such companies is becoming overwhelming. On 14 April 2023, High-Flyer announced the start of an artificial general …
DeepSeek Explained Part 5: DeepSeek-V3-Base
Author(s): Nehdiii Originally published on Towards AI. This article is the fifth installment of our DeepSeek series and the first to specifically highlight the training methodology of DeepSeek-V3 [1, 2]. As illustrated in the …
Llama 4: Is Meta Sounding the Alarm?
Author(s): Nehdiii Originally published on Towards AI. Llama 2 and Llama 3 marked major milestones in AI during their release years, but Llama 4 feels like a misstep. Despite bold shifts in scale, design, and tone, Meta hasn’t …
OpenAI’s o3: Over-Optimization Returns Stranger Than Ever
Author(s): Nehdiii Originally published on Towards AI. Over-optimization is a well-known issue in reinforcement learning (RL), including RL from human feedback (RLHF), which powers models like ChatGPT, and now in emerging reasoning models. Each context presents its own flavor of the problem …
DeepSeek-V3 Explained Part 4: Multi-Token Prediction
Author(s): Nehdiii Originally published on Towards AI. This is the fourth article in our DeepSeek-V3 series, where we explain the final major architectural innovation in DeepSeek [1, 2] models: multi-token prediction. In previous articles, …
DeepSeek R1: Pioneering Research and Engineering as a Competitor to Pure Scaling Approaches
Author(s): Nehdiii Originally published on Towards AI. DeepSeek-R1 landed unexpectedly just as many researchers, myself included, were attempting to reverse-engineer OpenAI’s o1 model. It revealed the inner workings of o1 and dispelled …
Have o1 Models Solved Human Reasoning?
Author(s): Nehdiii Originally published on Towards AI. OpenAI made waves in the AI community with the release of their o1 models. As the excitement settles, I feel it’s the perfect time to share my thoughts on LLMs’ reasoning …
DeepSeek-V3 Part 3: Auxiliary-Loss-Free Load Balancing
Author(s): Nehdiii Originally published on Towards AI. This is the third article in our DeepSeek-V3 series, where we explore another key architectural breakthrough in DeepSeek [1, 2, 3] models related to Mixture-of-Experts (MoE): Auxiliary-Loss-Free Load Balancing [5]. …
DeepSeek-V3 Part 2: DeepSeekMoE
Author(s): Nehdiii Originally published on Towards AI. This article marks the second entry in our DeepSeek-V3 series, focusing on a pivotal architectural breakthrough in the DeepSeek models [1, 2, 3]: DeepSeekMoE [4]. In this …
DeepSeek-V3 Explained, Part 1: Understanding Multi-Head Latent Attention
Author(s): Nehdiii Originally published on Towards AI. This is the first article of our new series “DeepSeek-V3 Explained”, where we will try to demystify DeepSeek-V3 [1, 2], the latest model open-sourced by DeepSeek. In …
Extracting Actionable Rules from Raw Data
Author(s): Nehdiii Originally published on Towards AI. When working with products, we often encounter situations where introducing certain “rules” becomes necessary. Let me clarify what I mean by “rules” through some practical examples: Imagine we’re facing a surge …
🧠 From CLIP to the Future: A Deep Dive into Vision-Language Models for Vision Tasks
Author(s): Nehdiii Originally published on Towards AI. From recognizing faces in photos to detecting objects in real-time videos, computer vision has revolutionized the way machines “see” the world. Tasks like image classification, object detection, segmentation, and even person re-identification (ReID) have seen …