From FP32 to INT8: The Science of Shrinking AI Models
Author(s): Harsh Maheshwari. Originally published on Towards AI. Understanding quantization of neural networks along with its implementation. The training compute requirements for famous AI models have grown 45x in …
Understanding Agentic RAG and How It's Different From RAG With Code
Author(s): Harsh Maheshwari. Originally published on Towards AI. In the world of Large Language Models (LLMs), Retrieval Augmented Generation (RAG) has emerged as a game-changer. Traditional …