
Introduction to RAG: Basics to Mastery, Part 2: Hybrid RAG, Combining Semantic & Keyword Search for Better Retrieval

Author(s):

Originally published on Towards AI.

Part 2 of the mini-series introduction to RAG


Introduction

In Article 1, we built a basic local RAG pipeline using embeddings and a vector database.
While semantic search is powerful, it can sometimes miss results that contain exact keyword matches, especially for rare terms, acronyms, or code snippets.

Hybrid RAG solves this problem by combining:

  • Semantic search (meaning-based)
  • Keyword search (exact term matching)

By blending the two, you get the best of both worlds: high recall and high precision.

Why Hybrid RAG Matters

  • Semantic Search captures meaning across different phrasings. “AI regulation laws” ≈ “rules for artificial intelligence.”
  • Keyword Search (BM25) ensures literal matches survive. Acronyms, formulas, or code like np.dot don’t always have semantic equivalents.

💡 Think of it this way:

  • Semantic search is like a friend who understands what you mean, even if you phrase it differently.
  • Keyword search is like a literal detective, scanning for exact words or symbols.

Together, they cover each other’s blind spots.

Theory: How Hybrid RAG Works

  1. Send the query to both a vector database (semantic) and a keyword index (BM25).
  2. Score results separately: a similarity score for embeddings, a relevance score for BM25.
  3. Combine scores using a weighted formula.
  4. Return the best-ranked chunks as context for the LLM.

This creates a retrieval system that is smarter overall and more robust on edge cases.
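The weighted combination in step 3 can be sketched in a few lines. The chunk names, scores, and weights below are illustrative, not taken from a real index:

```python
# Illustrative per-chunk scores (made-up values, already scaled to [0, 1])
semantic_sim = {"chunk_a": 0.82, "chunk_b": 0.55, "chunk_c": 0.10}
bm25_score = {"chunk_a": 0.20, "chunk_b": 0.90, "chunk_c": 0.70}

# Weights are a tuning choice for your corpus, not fixed constants
w_sem, w_kw = 0.6, 0.4

# Blend the two scores per chunk with a weighted sum
hybrid = {
    doc: w_sem * semantic_sim[doc] + w_kw * bm25_score[doc]
    for doc in semantic_sim
}

# Rank chunks by the blended score, best first
ranked = sorted(hybrid, key=hybrid.get, reverse=True)
print(ranked)  # ['chunk_b', 'chunk_a', 'chunk_c']
```

Note how chunk_b wins overall despite a middling semantic score, because its strong keyword match survives the blend.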

Simple flowchart of a Hybrid RAG

Setup

We’ll extend the previous pipeline by adding BM25 keyword search with rank_bm25.

Install dependencies:

pip install rank-bm25

(We already have chromadb and sentence-transformers from last time.)

Step-by-Step Code

1. Load & Chunk Documents

(Same code from Article 1.)

# load_docs.py
from pathlib import Path

def load_text_files(folder_path):
    texts = []
    for file in Path(folder_path).glob("*.txt"):
        with open(file, "r", encoding="utf-8") as f:
            texts.append(f.read())
    return texts

docs = load_text_files("./data")
print(f"Loaded {len(docs)} documents.")

# chunking.py
def chunk_text(text, chunk_size=500, overlap=50):
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        chunks.append(text[start:end])
        start += chunk_size - overlap
    return chunks

all_chunks = []
for doc in docs:
    all_chunks.extend(chunk_text(doc))
print(f"Total chunks: {len(all_chunks)}")
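To see what the overlap parameter does, run chunk_text on a tiny string. The function is restated here so the snippet is self-contained, with toy sizes in place of the real 500/50 defaults:

```python
def chunk_text(text, chunk_size=500, overlap=50):
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        chunks.append(text[start:end])
        start += chunk_size - overlap
    return chunks

# A 10-character "document", 4-char chunks, 1 char of overlap
chunks = chunk_text("abcdefghij", chunk_size=4, overlap=1)
print(chunks)  # ['abcd', 'defg', 'ghij', 'j']
```

Each chunk starts on the last character of the previous one, so a sentence cut at a boundary still appears whole in at least one chunk.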

2. Build BM25 Keyword Index

from rank_bm25 import BM25Okapi
import nltk

nltk.download('punkt')  # newer NLTK releases may also need nltk.download('punkt_tab')

# Tokenize chunks for BM25
tokenized_chunks = [nltk.word_tokenize(chunk.lower()) for chunk in all_chunks]

# Create BM25 index
bm25 = BM25Okapi(tokenized_chunks)
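If the NLTK model download is a problem in your environment, a small regex tokenizer is a workable stand-in for word_tokenize in this pipeline. This is a sketch, not punkt; it splits some tokens differently, but it deliberately keeps dotted identifiers like np.dot intact:

```python
import re

def simple_tokenize(text):
    # Lowercase, then keep runs of word characters and dots as tokens,
    # so code identifiers such as "np.dot" survive as single tokens
    return re.findall(r"[\w.]+", text.lower())

print(simple_tokenize("Compute np.dot(a, b) for NLP tasks!"))
# ['compute', 'np.dot', 'a', 'b', 'for', 'nlp', 'tasks']
```

Use it the same way as nltk.word_tokenize when building tokenized_chunks and when tokenizing queries, and keep the choice consistent between the two.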

3. Semantic Search Function (from Chroma)

def semantic_search(query, top_k=5):
    # embedder and collection come from the Article 1 setup
    query_embedding = embedder.encode([query], convert_to_numpy=True)[0]
    results = collection.query(
        query_embeddings=[query_embedding],
        n_results=top_k
    )
    return results["documents"][0], results["distances"][0]

4. Keyword Search Function (BM25)

def keyword_search(query, top_k=5):
    tokenized_query = nltk.word_tokenize(query.lower())
    scores = bm25.get_scores(tokenized_query)
    # Rank chunk indices by BM25 score, highest first
    ranked_ids = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    return [all_chunks[i] for i in ranked_ids], [scores[i] for i in ranked_ids]

5. Combine Results (Hybrid Search)

def hybrid_search(query, top_k=5, weight_semantic=0.6, weight_keyword=0.4):
    sem_docs, sem_scores = semantic_search(query, top_k)
    key_docs, key_scores = keyword_search(query, top_k)

    combined = {}
    # Convert semantic distances to similarities (1 - distance)
    sem_scores = [1 - s for s in sem_scores]
    for doc, score in zip(sem_docs, sem_scores):
        combined[doc] = combined.get(doc, 0) + score * weight_semantic
    for doc, score in zip(key_docs, key_scores):
        combined[doc] = combined.get(doc, 0) + score * weight_keyword

    # Sort by combined weighted score, highest first
    ranked_docs = sorted(combined.items(), key=lambda x: x[1], reverse=True)
    return [doc for doc, _ in ranked_docs[:top_k]]
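One caveat with the blend above: BM25 scores are unbounded, while the semantic similarities fall roughly in [0, 1], so the keyword side can dominate the weighted sum. A common refinement, not part of the code above, is to min-max normalize each score list before weighting:

```python
def min_max_normalize(scores):
    # Map a list of scores into [0, 1]; a constant list maps to all zeros
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

print(min_max_normalize([2.0, 8.0, 5.0]))  # [0.0, 1.0, 0.5]
```

Applied to key_scores (and optionally sem_scores) inside hybrid_search, this puts both retrievers on the same scale so weight_semantic and weight_keyword behave as intended.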

6. Query & Generate Answer

query = "Explain the role of solar panels in renewable energy."
top_chunks = hybrid_search(query, top_k=3)
context = "\n".join(top_chunks)

prompt = f"Answer the question using only the following context:\n{context}\n\nQuestion: {query}\nAnswer:"

import subprocess
# Call the local Mistral model through the Ollama CLI
ollama_cmd = ["ollama", "run", "mistral", prompt]
response = subprocess.run(ollama_cmd, capture_output=True, text=True)
print("LLM Response:\n", response.stdout)

Expected Output

The system now returns results that match both meaning and exact words.

  • Queries with acronyms (NLP, SQL) no longer fail.
  • Queries with code snippets (like np.dot) now surface exact matches.
  • Queries with rare terms get the balance of semantic context and keyword fidelity.

This is the essence of Hybrid RAG: no more missed answers due to unusual phrasing.

Practical Applications

Hybrid retrieval is especially useful in:

  • Developer Assistants → matching code examples (for loop, lambda) precisely.
  • Healthcare RAG → acronyms (e.g., COPD, HbA1c) must not be “interpreted away.”
  • Legal/Compliance → keyword precision is critical (law references, case IDs).
  • Ecommerce Search → enhance keyword search results with context-aware retrieval and ranking.

Next Steps

In Article 3, we’ll make our RAG agentic, giving it the ability to decide when and how to retrieve information and even call other tools when needed.

Please visit my GitHub repo for the full working code: https://github.com/Taha-azizi/RAG

All images were generated by the author using AI tools.


Published via Towards AI
