Fat Context of RAG Drives Inference Cost Sky-high. Here’s How to Save Big on API Calls.
Author(s): Thuwarakesh Murallie. Originally published on Towards AI. Try this before you dump your RAG prototype. What prevents you from deploying that RAG app you developed? The article discusses how to mitigate high inference costs in Retrieval-Augmented Generation (RAG) applications …
Break The Vector Search Dependency for Truly Robust RAG Systems
Author(s): Thuwarakesh Murallie. Originally published on Towards AI. Why your RAGs need more than just semantic search to succeed. Vector search is undoubtedly the most remarkable retrieval technique to date. But this can be an overstatement. Before …
Resource-Efficient Fine-Tuning of DeepSeek-R1
Author(s): Thuwarakesh Murallie. Originally published on Towards AI. How to make DeepSeek R1 reason with your private data. We no longer seek validation …