
Introduction to RAG: Basics to Mastery. Part 4: RAG with MCP, the Future of Dynamic Context Retrieval

Last Updated on September 4, 2025 by Editorial Team

Author(s): Taha Azizi

Originally published on Towards AI.

Part 4 of the mini-series: Introduction to RAG


Introduction

So far in this series, we’ve explored:

  1. Basic RAG with local semantic search.
  2. Hybrid RAG combining keyword + semantic search.
  3. Agentic RAG with multi-step reasoning and tool use.

In this article, we’ll dive into something cutting-edge:
RAG powered by MCP (Model Context Protocol).

MCP is an emerging standard that allows LLMs to dynamically fetch context during generation rather than preloading everything at the start. In practice, this means the model can realize “I need more info” mid-answer, call a retrieval tool, and continue naturally.

Think of MCP as the glue that connects RAG tools (retrievers, search APIs, calculators, etc.) with an agentic system in a standardized way.

Theory

The Model Context Protocol (MCP) is a modern, open standard (introduced by Anthropic in November 2024) that standardizes how large language models (LLMs) dynamically access external tools and data sources. Acting much like a “USB-C port for AI,” MCP enables any LLM-based agent to invoke tools such as document retrievers, APIs, or calculators through a unified client-server interface built on JSON-RPC, regardless of the underlying system. This reduces integration complexity, tackles the “M×N connector problem” (every model needing a custom connector for every tool), and provides secure, scalable access to context during generation rather than loading it all upfront. Let’s compare this with the pipelines from the previous parts:
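On the wire, a tool invocation is a JSON-RPC 2.0 request. As a rough illustration, here is the shape of an MCP tools/call message, written as a Python dict; the tool name and arguments mirror the retriever we define in Step 1 below.

# Shape of an MCP "tools/call" request (JSON-RPC 2.0), shown as a Python dict.
tools_call_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "rag_retrieve",  # a tool registered on the MCP server
        "arguments": {"query": "Germany renewable policies"},
    },
}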

Traditional RAG pipeline:

Retrieve → Inject into prompt → Generate

Agentic RAG pipeline (from Part 3):

Plan → Retrieve (possibly multiple times) → Reason → Answer

MCP-powered RAG pipeline:

Generate → Realize missing info → Call MCP tool (retrieval, API, calculator) → Continue generation
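To make that loop concrete, here is a minimal, self-contained sketch. The LLM and retriever are stubs that exist only to show the control flow; in a real system the retrieval call would go through an MCP client, as in the steps below.

# Stubbed sketch of the MCP-style loop; only the control flow matters here.
def stub_llm(prompt: str) -> str:
    # Pretend the model asks for retrieval once, then answers.
    if "CONTEXT:" not in prompt:
        return "TOOL_CALL rag_retrieve Germany renewable policies"
    return "Final answer grounded in the retrieved context."

def stub_retrieve(query: str) -> str:
    return f"(documents about: {query})"

def answer_with_mcp(question: str) -> str:
    prompt = question
    while True:
        output = stub_llm(prompt)                       # Generate
        if output.startswith("TOOL_CALL"):              # Realize missing info
            _, tool_name, query = output.split(" ", 2)
            context = stub_retrieve(query)              # would be an MCP tools/call
            prompt = f"{question}\nCONTEXT: {context}"  # Continue generation
        else:
            return output

print(answer_with_mcp("What are Germany's renewable policies?"))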

Benefits of MCP:

  • Dynamic retrieval: retrieval happens “mid-thought” rather than upfront.
  • Lower memory/GPU usage: no need to preload large context.
  • Tool standardization: all tools (retrievers, APIs, calculators) are accessed through the same protocol.
  • Multi-turn friendly: works better in long conversations.

Setup

We’ll use:

  • Flask for server deployment
  • httpx for making asynchronous HTTP requests
  • fastmcp for structured context and tool integration

Install:

pip install flask httpx fastmcp

Step-by-Step Code

Flowchart of the MCP RAG solution discussed in this article (see the GitHub repository).

Step 1: Define MCP Server

Our MCP server exposes RAG retrieval and a calculator as MCP tools.

from mcp.server.fastmcp import FastMCP  # FastMCP server bundled with the official mcp SDK
from utils.retrieval import hybrid_search  # hybrid retriever from Part 2 of this series
import math


mcp = FastMCP(
    name="rag_mcp_server",
    instructions="MCP server exposing RAG retriever and calculator tools",
)


@mcp.tool()
def rag_retrieve(query: str) -> str:
    """Retrieve relevant context using hybrid RAG."""
    docs = hybrid_search(query, top_k=3)
    return "\n---\n".join(docs)


@mcp.tool()
def calculator(expression: str) -> str:
    """Calculator for arithmetic expressions (builtins stripped; not a full sandbox)."""
    try:
        return str(eval(expression, {"__builtins__": {}}, {"math": math}))
    except Exception as e:
        return f"CALC_ERROR: {e}"


if __name__ == "__main__":
    # Serve over streamable HTTP so the Step 2 client can connect over the network;
    # by default the SDK listens on port 8000 with the endpoint mounted at /mcp.
    mcp.run(transport="streamable-http")
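Start the server in its own terminal (e.g. python mcp_server.py, a hypothetical filename for the script above) and keep it running; the client in Step 2 connects to it over HTTP.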

Step 2: MCP Client

The client talks to the server and lets the agent call MCP tools. The SDK’s client API is asynchronous and session-based, so this sketch (which assumes the official mcp SDK’s streamable-HTTP client) wraps one tool call in a small synchronous helper that the LangChain agent in Step 3 can use directly.

import asyncio
from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client


SERVER_URL = "http://localhost:8000/mcp"  # streamable-HTTP endpoint from Step 1


def call_tool(name: str, arguments: dict) -> str:
    """Open a session, call one tool, and return its first text block."""
    async def _call():
        async with streamablehttp_client(SERVER_URL) as (read, write, _):
            async with ClientSession(read, write) as session:
                await session.initialize()
                result = await session.call_tool(name, arguments)
                return result.content[0].text
    return asyncio.run(_call())


if __name__ == "__main__":
    # Example: call RAG tool
    print(call_tool("rag_retrieve", {"query": "Germany renewable policies"}))
    # Example: call calculator tool
    print(call_tool("calculator", {"expression": "25*4+10"}))

Step 3: Agent with MCP Integration

We integrate the MCP client into our agent.

from langchain.agents import initialize_agent, AgentType  # legacy LangChain agent API, as in Part 3
from langchain.llms import Ollama
from langchain.tools import Tool
from mcp_client import call_tool  # the synchronous helper from Step 2


# Wrap the MCP tools as LangChain tools
retrieval_tool = Tool(
    name="MCP-RAG",
    func=lambda q: call_tool("rag_retrieve", {"query": q}),
    description="Retrieve info from RAG via MCP",
)


calc_tool = Tool(
    name="MCP-Calculator",
    func=lambda e: call_tool("calculator", {"expression": e}),
    description="Do arithmetic via MCP",
)


llm = Ollama(model="mistral")  # local model served by Ollama
agent = initialize_agent(
    tools=[retrieval_tool, calc_tool],
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,  # ReAct-style tool use
    verbose=True,
)


query = "If Germany’s renewable energy sector was 250 TWh in 2023 and is projected to grow 10% per year, what will it reach by 2030, and how does this compare with Germany’s official renewable energy targets?"
print(agent.run(query))

Expected Behavior

  1. The agent reads the query.
  2. It realizes it needs a calculation → calls MCP Calculator.
  3. It realizes it needs document retrieval → calls MCP RAG tool.
  4. It integrates both results and generates the final answer.

Thought: I need to calculate Germany’s renewable energy output in 2030,
given 250 TWh in 2023 with 10% annual growth.
Action: MCP-Calculator
Action Input: 250 * (1.1 ** 7)
Observation: 487.18

Thought: I should check Germany’s official renewable energy targets for 2030.
Action: MCP-RAG
Action Input: Germany renewable energy 2030 targets
Observation: Germany aims for ~80% renewable share in electricity by 2030.

Final Answer: At 10% annual growth, Germany’s renewable sector would reach ~487 TWh by 2030.
This trajectory is consistent with Germany’s official goal of ~80% renewables in the electricity mix.

Why This Matters

MCP makes RAG dynamic and extensible:

  • Tools (retrievers, calculators, APIs) can be registered in a standardized way.
  • Agents don’t preload all knowledge — they fetch when needed.
  • Your system scales: to add a search or weather API, you just expose it via MCP (see the sketch below).
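For instance, extending the Step 1 server with a new tool is just another decorated function. The weather lookup below is illustrative (wttr.in is a public demo endpoint); swap in whichever API you actually use:

import httpx


@mcp.tool()
def weather(city: str) -> str:
    """Current weather for a city, via wttr.in's one-line format."""
    response = httpx.get(f"https://wttr.in/{city}?format=3", timeout=10.0)
    return response.text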

This is the natural evolution of RAG: from static retrieval → agentic planning → protocol-driven, dynamic retrieval.

Next Steps

In Article 5, we’ll push performance to the max with Advanced RAG using Approximate Nearest Neighbors (ANN) — making retrieval lightning-fast even with millions of documents.

Visit the GitHub repository: https://github.com/Taha-azizi/RAG


Published via Towards AI



Note: Content contains the views of the contributing authors and not Towards AI.