
Introduction to RAG: Basics to Mastery. Part 4: RAG with MCP, the Future of Dynamic Context Retrieval
Last Updated on September 4, 2025 by Editorial Team
Author(s): Taha Azizi
Originally published on Towards AI.
Part 4 of the mini-series introduction to RAG

Introduction
So far in this series, we’ve explored:
- Basic RAG with local semantic search.
- Hybrid RAG combining keyword + semantic search.
- Agentic RAG with multi-step reasoning and tool use.
In this article, we’ll dive into something cutting-edge:
RAG powered by MCP (Model Context Protocol).
MCP is an emerging standard that allows LLMs to dynamically fetch context during generation rather than preloading everything at the start. In practice, this means the model can realize “I need more info” mid-answer, call a retrieval tool, and continue naturally.
Think of MCP as the glue that connects RAG tools (retrievers, search APIs, calculators, etc.) with an agentic system in a standardized way.
Theory
The Model Context Protocol (MCP) is an open standard (introduced by Anthropic in November 2024) that defines how large language models (LLMs) dynamically access external tools and data sources. Acting much like a “USB-C port for AI,” MCP lets any LLM-based agent invoke tools such as document retrievers, APIs, or calculators through a unified client-server interface built on JSON-RPC, regardless of the underlying system. This reduces integration complexity, tackles the “M×N connector problem,” and gives the model secure, scalable access to context during generation rather than requiring everything upfront. Let’s compare it with the pipelines from the previous parts of this series:
Traditional RAG pipeline:
Retrieve → Inject into prompt → Generate
Agentic RAG pipeline (from Part 3):
Plan → Retrieve (possibly multiple times) → Reason → Answer
MCP-powered RAG pipeline:
Generate → Realize missing info → Call MCP tool (retrieval, API, calculator) → Continue generation
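To make the “call MCP tool” step concrete: under MCP, every tool call is a standardized JSON-RPC exchange between client and server. A rough sketch of the request and result for our retriever, shown as Python dicts for readability (the SDK serializes and transports the actual JSON, and exact fields can vary by SDK version):

request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "rag_retrieve",
        "arguments": {"query": "Germany renewable policies"},
    },
}

response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [{"type": "text", "text": "...retrieved passages..."}],
    },
}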
Benefits of MCP:
- Dynamic retrieval: retrieval happens “mid-thought” rather than upfront.
- Lower memory/GPU usage: no need to preload large context.
- Tool standardization: all tools (retrievers, APIs, calculators) are accessed through the same protocol.
- Multi-turn friendly: works better in long conversations.
Setup
We’ll use:
- Flask for server deployment
- httpx for making asynchronous HTTP requests
- fastmcp for structured context and tool integration
Install:
pip install flask httpx fastmcp
Step-by-Step Code
Step 1: Define MCP Server
Our MCP server exposes RAG retrieval and calculator as MCP tools.
from mcp.server.fastmcp import FastMCP
from utils.retrieval import hybrid_search
from utils.generation import generate_answer
import math

mcp = FastMCP(
    name="rag_mcp_server",
    version="1.0.0",
    description="MCP server exposing RAG retriever and calculator tools"
)

@mcp.tool()
def rag_retrieve(query: str) -> str:
    """Retrieve relevant context using hybrid RAG."""
    docs = hybrid_search(query, top_k=3)
    return "\n---\n".join(docs)

@mcp.tool()
def calculator(expression: str) -> str:
    """Safe calculator for arithmetic expressions."""
    try:
        return str(eval(expression, {"__builtins__": {}}, {"math": math}))
    except Exception as e:
        return f"CALC_ERROR: {e}"

if __name__ == "__main__":
    mcp.run()
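As written, mcp.run() starts the server on the default stdio transport, which is fine when the client launches the server as a subprocess. Because the client in Step 2 connects to http://localhost:8000, you would expose the server over HTTP instead. The exact option depends on your MCP SDK version; with the reference Python SDK it looks roughly like this:

# Assumption: transport names vary between MCP SDK versions.
# With the reference Python SDK, the SSE transport listens on port 8000 by default.
if __name__ == "__main__":
    mcp.run(transport="sse")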
Step 2: MCP Client
The client talks to the server and lets the agent call MCP tools.
from mcp.client import MCPClient
client = MCPClient("http://localhost:8000") # server address
# Example: call RAG tool
print(client.call_tool("rag_retrieve", {"query": "Germany renewable policies"}))
# Example: call calculator tool
print(client.call_tool("calculator", {"expression": "25*4+10"}))
Step 3: Agent with MCP Integration
We integrate the MCP client into our agent.
from langchain.agents import initialize_agent, AgentType
from langchain.llms import Ollama
from langchain.tools import Tool
from mcp_client import client

# Wrap MCP tools for LangChain
retrieval_tool = Tool(
    name="MCP-RAG",
    func=lambda q: client.call_tool("rag_retrieve", {"query": q}),
    description="Retrieve info from RAG via MCP"
)

calc_tool = Tool(
    name="MCP-Calculator",
    func=lambda e: client.call_tool("calculator", {"expression": e}),
    description="Do arithmetic via MCP"
)

llm = Ollama(model="mistral")

agent = initialize_agent(
    tools=[retrieval_tool, calc_tool],
    llm=llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)

query = "If Germany’s renewable energy sector was 250 TWh in 2023 and is projected to grow 10% per year, what will it reach by 2030, and how does this compare with Germany’s official renewable energy targets?"
print(agent.run(query))
Expected Behavior
- The agent reads the query.
- It realizes it needs a calculation → calls MCP Calculator.
- It realizes it needs document retrieval → calls MCP RAG tool.
- It integrates both results and generates the final answer.
A typical agent trace looks like this:
Thought: I need to calculate Germany’s renewable energy output in 2030,
given 250 TWh in 2023 with 10% annual growth.
Action: MCP-Calculator
Action Input: 250 * (1.1 ** 7)
Observation: 487.18
Thought: I should check Germany’s official renewable energy targets for 2030.
Action: MCP-RAG
Action Input: Germany renewable energy 2030 targets
Observation: Germany aims for ~80% renewable share in electricity by 2030.
Final Answer: At 10% annual growth, Germany’s renewable sector would reach ~487 TWh by 2030.
This aligns with Germany’s official goal of ~80% renewables in the electricity mix.
Why This Matters
MCP makes RAG dynamic and extensible:
- Tools (retrievers, calculators, APIs) can be registered in a standardized way.
- Agents don’t preload all knowledge — they fetch when needed.
- Your system scales: to add a search API or a weather API, just expose it as another MCP tool (see the sketch below).
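For instance, adding a new data source is just another decorated function on the same server. A minimal sketch, assuming a hypothetical get_weather helper of your own (not part of this article’s repo):

@mcp.tool()
def weather(city: str) -> str:
    """Return the current weather for a city (illustrative stub)."""
    # get_weather is a hypothetical helper wrapping whatever weather API you use
    return get_weather(city)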
This is the natural evolution of RAG: from static retrieval → agentic planning → protocol-driven, dynamic retrieval.
Next Steps
In Article 5, we’ll push performance to the max with Advanced RAG using Approximate Nearest Neighbors (ANN) — making retrieval lightning-fast even with millions of documents.
Visit the Github page: https://github.com/Taha-azizi/RAG