
Showcasing Different Approaches for Implementing Multilingual RAG
Last Updated on January 15, 2025 by Editorial Team
Author(s): Michael Ghaly
Originally published on Towards AI.
Retrieval-Augmented Generation (RAG)
Large language models inherently possess a significant body of factual relational knowledge [1]. However, these models still exhibit limitations in their ability to expand and manipulate this knowledge. Consequently, these models, while impressive, often suffer from hallucinations, outdated information, and opaque and untraceable reasoning processes.
Retrieval-augmented generation (RAG) is a fascinating hybrid framework that merges the generative prowess of large language models with the precision of non-parametric knowledge bases [2]. As someone deeply interested in the evolution of AI, I find it particularly compelling because it addresses some of the critical limitations of standalone language models. By leveraging RAG, we can significantly enhance the accuracy and reliability of generated content in knowledge-intensive tasks, ensure continuous knowledge updates, and incorporate domain-specific information seamlessly.
Multilingual RAG: Bridging Language Barriers
Multilingual RAG leverages the inherent capabilities of large language models to support multiple languages, making it possible to build applications that facilitate natural and fluid interactions in a user’s preferred language. This not only improves user experience and accessibility but also ensures engaging conversations without the hindrance of language barriers.
This article presents possible approaches for implementing a multilingual RAG system, depending on the application’s requirements.
Making Multilingual Chat Sessions a Reality
This first section presumes a monolingual knowledge base and aims to support multilingualism solely by enabling users to submit queries and receive answers in their preferred language. In this scenario, two primary approaches come to mind:
1. Multilingual Embedding Models: These models are pre-trained to handle multiple languages, enabling them to retrieve relevant information from a monolingual knowledge base regardless of the query’s language. This approach is straightforward as it maintains the same RAG architecture. However, it’s important to note that multilingual models may not perform as well as language-specific models. From my experience, this approach is a great starting point for those looking to implement multilingual capabilities quickly, though one might need to trade off some performance.
2. Query Translation: This method involves translating the user’s query into the language of the documents in the knowledge base, using a language-specific embedding model to retrieve the relevant information, and then translating the response back into the user’s preferred language. This approach relies heavily on high-quality machine translation models to preserve the accuracy and nuance of the original query and response. Alternatively, the translation task could also be assigned to the language model.
The following code snippets showcase these methods by creating two monolingual ChromaDB collections: one using the English-specific embedding model MPNet (all-mpnet-base-v2) and the other using its multilingual variant.
import chromadb
import chromadb.utils.embedding_functions as embedding_functions

# Dummy documents for testing purposes
lst_monolingual_documents = [
    "The sky is blue.",
    "Trees provide shade.",
    "Birds sing in the morning.",
    "Cats purr when they are happy.",
    "Coffee is a popular morning beverage.",
    "Exercise is good for your health."
]

# Define the embedding functions
monolingual_ef = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="all-mpnet-base-v2")
multilingual_ef = embedding_functions.SentenceTransformerEmbeddingFunction(
    model_name="paraphrase-multilingual-mpnet-base-v2")

# Create the collections
chroma_client = chromadb.Client()
collection_monolingual_kb_monolingual_ef = chroma_client.create_collection(
    name="monolingual_kb_monolingual_ef",
    embedding_function=monolingual_ef
)
collection_monolingual_kb_multilingual_ef = chroma_client.create_collection(
    name="monolingual_kb_multilingual_ef",
    embedding_function=multilingual_ef
)

# Populating the collections
collection_monolingual_kb_monolingual_ef.add(
    documents=lst_monolingual_documents,
    ids=[f"id_{int_index}"
         for int_index in range(len(lst_monolingual_documents))]
)
collection_monolingual_kb_multilingual_ef.add(
    documents=lst_monolingual_documents,
    ids=[f"id_{int_index}"
         for int_index in range(len(lst_monolingual_documents))]
)
This section defines two natural-language queries: one in English and the other in Italian. The goal is to support retrieval for both languages against a knowledge base that contains all-English documents.
str_english_query = "What are the benefits of trees?"
str_italian_query = "Quali sono i benefici degli alberi?"
# Retrieval with the English query and the monolingual embedding model
lst_english_query_retrieval = collection_monolingual_kb_monolingual_ef.query(
    query_texts=str_english_query, n_results=1)["documents"]
# Retrieval with the Italian query and the monolingual embedding model
lst_italian_query_retrieval = collection_monolingual_kb_monolingual_ef.query(
    query_texts=str_italian_query, n_results=1)["documents"]
print(f"Retrieved documents for the English query: {lst_english_query_retrieval}")
print(f"Retrieved documents for the Italian query: {lst_italian_query_retrieval}")
Retrieved documents for the English query: [['Trees provide shade.']]
Retrieved documents for the Italian query: [['Coffee is a popular morning beverage.']]
As expected, retrieval for the English query was correct, while MPNet struggled to match the Italian query. Its multilingual variant, however, is able to match the correct document, as shown below:
# Retrieval with the Italian query and the multilingual embedding model
lst_italian_query_retrieval = collection_monolingual_kb_multilingual_ef.query(
    query_texts=str_italian_query, n_results=1)["documents"]
print(f"Retrieved documents for the Italian query: {lst_italian_query_retrieval}")
Retrieved documents for the Italian query: [['Trees provide shade.']]
Alternatively, the Italian query can be translated to English and the monolingual embedding model can then be used to obtain the same result.
# Translate the Italian query into English
str_translated_query = translate(str_italian_query)
# Retrieval with the translated query and the monolingual embedding model
lst_italian_query_retrieval = collection_monolingual_kb_monolingual_ef.query(
    query_texts=str_translated_query, n_results=1)["documents"]
print(f"Original Italian Query:\t{str_italian_query}")
print(f"Translated Query:\t{str_translated_query}")
print(f"\nRetrieved documents for the translated query: {lst_italian_query_retrieval}")
Original Italian Query: Quali sono i benefici degli alberi?
Translated Query: What are the benefits of trees?
Retrieved documents for the translated query: [['Trees provide shade.']]
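The `translate` helper used in these snippets is not defined in the article. A minimal stand-in, sufficient to run the demos offline, might look like the following; note that the lookup table is an assumption covering only the example queries, and a production system would delegate to a machine-translation model or to the LLM itself.

```python
# Hypothetical stand-in for the undefined `translate` helper: a lookup table
# covering only the example queries used in this article. A real
# implementation would call a machine-translation model or an LLM instead.
def translate(str_text, str_language="English"):
    dct_translations = {
        ("Quali sono i benefici degli alberi?", "English"):
            "What are the benefits of trees?",
        ("¿Cuáles son los beneficios de los árboles?", "English"):
            "What are the benefits of trees?",
        ("¿Cuáles son los beneficios de los árboles?", "Italian"):
            "Quali sono i benefici degli alberi?",
    }
    # Fall back to the original text when no translation is known
    return dct_translations.get((str_text, str_language), str_text)
```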
Multilingual Knowledge Base: A Step Further
Supporting a multilingual knowledge base, on the other hand, adds another layer of complexity: the system must now handle retrieval of documents written in multiple languages. With the right approach, though, it is entirely achievable. Here are three methods to consider, along with their code snippets:
# Dummy documents for testing purposes
lst_multilingual_documents = [
    "The sky is blue.",
    "Gli alberi forniscono ombra.",
    "Gli uccelli cantano al mattino.",
    "Cats purr when they are happy.",
    "Il caffè è una bevanda popolare al mattino.",
    "Exercise is good for your health."
]

# Create the collections
collection_multilingual_kb_monolingual_ef = chroma_client.create_collection(
    name="multilingual_kb_monolingual_ef",
    embedding_function=monolingual_ef
)
collection_multilingual_kb_multilingual_ef = chroma_client.create_collection(
    name="multilingual_kb_multilingual_ef",
    embedding_function=multilingual_ef
)

# Populating the collections
collection_multilingual_kb_monolingual_ef.add(
    documents=lst_multilingual_documents,
    ids=[f"id_{int_index}"
         for int_index in range(len(lst_multilingual_documents))]
)
collection_multilingual_kb_multilingual_ef.add(
    documents=lst_multilingual_documents,
    ids=[f"id_{int_index}"
         for int_index in range(len(lst_multilingual_documents))]
)
- Multilingual Embedding Models: Similar to their use in chat sessions, these models can be employed to index and retrieve documents in multiple languages. This approach allows for a unified model to handle queries and documents across different languages, which helps maintain the original RAG architecture without the need for additional translation steps.
str_english_query = "What are the benefits of trees?"
str_italian_query = "Quali sono i benefici degli alberi?"
# Retrieval with the English query and the monolingual embedding model
lst_english_query_retrieval = collection_multilingual_kb_monolingual_ef.query(
    query_texts=str_english_query, n_results=1)["documents"]
# Retrieval with the Italian query and the monolingual embedding model
lst_italian_query_retrieval = collection_multilingual_kb_monolingual_ef.query(
    query_texts=str_italian_query, n_results=1)["documents"]
print(f"Retrieved documents for the English query: {lst_english_query_retrieval}")
print(f"Retrieved documents for the Italian query: {lst_italian_query_retrieval}")
Retrieved documents for the English query: [['Exercise is good for your health.']]
Retrieved documents for the Italian query: [['Gli alberi forniscono ombra.']]
As shown in this section, retrieval for the English query failed, even with an English embedding model, because the relevant document is in Italian. Meanwhile, the Italian query matched the correct document despite the embedding model being English-specific, likely thanks to the lexical overlap between the query and the document (both contain “alberi”).
The issue is resolved once a multilingual embedding model is used for the English query.
# Retrieval with the English query and the multilingual embedding model
lst_english_query_retrieval = collection_multilingual_kb_multilingual_ef.query(
    query_texts=str_english_query, n_results=1)["documents"]
print(f"Retrieved documents for the English query: {lst_english_query_retrieval}")
Retrieved documents for the English query: [['Gli alberi forniscono ombra.']]
- Query Translation: As before, this approach involves translating a query into each language present in the knowledge base, retrieving relevant information for each translation, and then merging the results. This method is more resource-intensive because each query must first be translated into every language present in the knowledge base documents. Despite the increased computational demand, this approach ensures comprehensive retrieval across all languages. Additionally, it can compensate for situations where the multilingual model alone does not perform well enough, enhancing overall retrieval quality.
lst_languages_in_kb = ["English", "Italian"]
str_spanish_query = "¿Cuáles son los beneficios de los árboles?"

lst_translated_query_retrieval = []
for str_language in lst_languages_in_kb:
    # Translate the Spanish query into each language present in the knowledge base
    str_translated_query = translate(str_spanish_query, str_language=str_language)
    lst_translated_query_retrieval.append(
        collection_multilingual_kb_multilingual_ef.query(
            query_texts=str_translated_query, n_results=1)["documents"]
    )

# Fuse the individual results using Reciprocal Rank Fusion
lst_translated_query_retrieval = rrf(lst_translated_query_retrieval)
print(f"Retrieved documents for the Spanish query: {lst_translated_query_retrieval}")
Retrieved documents for the Spanish query: [['Gli alberi forniscono ombra.']]
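The `rrf` helper above is also left undefined in the article. A minimal sketch of Reciprocal Rank Fusion, assuming each input is a flat ranked list of documents (ChromaDB’s nested `documents` lists would need flattening first): a document’s fused score is the sum of 1 / (k + rank) over every list in which it appears, with k = 60 being the constant used in the original RRF formulation.

```python
# Minimal sketch of Reciprocal Rank Fusion. Each element of
# lst_ranked_lists is a ranked list of documents, best first.
def rrf(lst_ranked_lists, int_k=60):
    dct_scores = {}
    for lst_ranked in lst_ranked_lists:
        for int_rank, str_doc in enumerate(lst_ranked, start=1):
            # Documents appearing high in many lists accumulate a higher score
            dct_scores[str_doc] = dct_scores.get(str_doc, 0.0) + 1.0 / (int_k + int_rank)
    # Return documents sorted by fused score, highest first
    return sorted(dct_scores, key=dct_scores.get, reverse=True)
```

For example, `rrf([["a", "b"], ["b", "c"]])` ranks "b" first, since it appears in both lists.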
- Chunks Translation: This approach translates the entire knowledge base into a single desired language during ingestion. Although this requires significant upfront work, as all the documents in the knowledge base must be translated in the ingestion phase, it streamlines the retrieval process by effectively simplifying the problem to a monolingual knowledge base.
lst_translated_documents = []
for str_document in lst_multilingual_documents:
    lst_translated_documents.append(translate(str_document))

# The translated knowledge base is monolingual, so the English-specific
# embedding model suffices; the collection also needs a name that does not
# clash with the collections created earlier
collection_translated_monolingual_kb = chroma_client.create_collection(
    name="translated_monolingual_kb",
    embedding_function=monolingual_ef
)
collection_translated_monolingual_kb.add(
    documents=lst_translated_documents,
    ids=[f"id_{int_index}"
         for int_index in range(len(lst_multilingual_documents))]
)
Conclusion
In conclusion, exploring these options for implementing a multilingual RAG system highlights the flexibility and adaptability of the RAG framework. By sharing these insights, I hope to provide a clearer understanding of the potential paths one can take when venturing into the world of RAG. Each approach has its unique advantages and challenges, and the best choice ultimately depends on the specific requirements and constraints of the project. Whether you’re just starting or looking to refine an existing system, I encourage you to experiment with these methods and find the one that best suits your needs.