AI Codebase Expert Agent: Support Projects Development Tasks With an LLM Multi Agent Powered Approach

Author(s): Michalzarnecki

Originally published on Towards AI.

In the ever-evolving landscape of software development, managing large code bases and efficiently resolving issues remains a significant challenge. In this article I describe the AI Codebase Expert application, a tool that uses GPT-4 and LangChain agents to improve how developers interact with their code base and handle bug-fix tickets.
The application can be found in my GitHub repository. I see it as a starter project that can later support the development of complex legacy applications.

The Challenge of Modern Codebases

Every developer knows the struggle: you’re faced with a bug report or feature request, and you need to quickly understand:

Where in the code base might the issue be?
What existing documentation is relevant?
How do different components interact?
What framework features could help solve the problem?

Traditional approaches involve manually searching through code, documentation, and Stack Overflow, a time-consuming process that often leads to suboptimal solutions. In reality, a programmer always faces the dilemma of how deep to go into the codebase: testing different solutions, covering the feature with test scenarios, and refactoring the prepared code to follow proper design patterns. On the other side of the same coin lie the project budget and the application lifecycle, which often limit the time a programmer can spend on features that are experimental, like a Proof of Concept, or short-lived, like A/B test scenarios of which only one will eventually be extended. In the end, the final solution can be far from perfect and makes large projects even larger and harder to understand. To work around the limits of a developer’s cognitive capacity, we need to support their work with tools that can take in the wider context and all dependencies (even those not referenced directly in code), and that do not overlook related components or the implications of a proposed solution.

Enter AI Codebase Expert

AI Codebase Expert transforms this process by creating an intelligent system that can:

Analyze ticket descriptions and requirements
Search through project code and documentation
Understand code dependencies and relationships
Process screenshots and error messages
Propose comprehensive solutions based on all available information

At this point I would like to make it clear that this is a supporting tool for large projects and is not suitable for every kind of application.
To implement and use such a tool effectively, we need a good understanding of AI multi-agent architecture, the structure of the project’s code dependencies, and how to describe a problem properly. For this we still need a good understanding of the technologies used in the project and of the project itself.
The project I published in the GitHub repository is an example of a relatively simple working solution that uses several important concepts described below. I invite you to get familiar with these ideas and to create or adjust it to your project’s needs. Especially if you are dealing with large legacy projects, it could be worthwhile to set up AI-based support with access to the project documentation and codebase.

Let’s dive deep into how this system works and how you can implement it in your development workflow.

Architecture and Components

The system is built using several key components that work together to provide intelligent code analysis and solution generation:

1. Vector Database for Code Storage

The system uses PGVector to store and retrieve code snippets and documentation efficiently:

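from langchain_postgres import PGVector

import utils  # project-local helper module (import path assumed)


class VectorStore:
    @staticmethod
    def get_vector_store(collection_name: str) -> PGVector:
        # Connection string for the local PostgreSQL instance.
        connection = "postgresql+psycopg://project_solver:project_solver@localhost:6024/project_solver"
        embedding_model = utils.configure_embedding_model()

        # One collection per content type (e.g. code, documentation).
        vector_db = PGVector(
            embeddings=embedding_model,
            collection_name=collection_name,
            connection=connection,
            use_jsonb=True,
        )
        return vector_db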
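
To make the retrieval step more concrete, here is a minimal, dependency-free sketch of what a vector store does conceptually: it keeps (embedding, text) pairs and returns the entries closest to a query by cosine similarity. The `TinyVectorStore` class, the toy 3-dimensional vectors, and the file names are assumptions made for illustration only; PGVector performs the equivalent search inside PostgreSQL using real embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinyVectorStore:
    """Hypothetical in-memory stand-in for one vector database collection."""

    def __init__(self) -> None:
        self._items: list[tuple[list[float], str]] = []

    def add(self, embedding: list[float], text: str) -> None:
        self._items.append((embedding, text))

    def search(self, query: list[float], k: int = 2) -> list[str]:
        """Return the k stored texts most similar to the query embedding."""
        ranked = sorted(
            self._items,
            key=lambda item: cosine_similarity(query, item[0]),
            reverse=True,
        )
        return [text for _, text in ranked[:k]]


store = TinyVectorStore()
store.add([1.0, 0.0, 0.0], "timeline_widget.py")
store.add([0.9, 0.1, 0.0], "timeline_renderer.py")
store.add([0.0, 0.0, 1.0], "billing_service.py")

top = store.search([1.0, 0.05, 0.0], k=2)
print(top)
```

A real embedding model maps code and text to vectors with hundreds or thousands of dimensions, but the ranking idea is the same.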

2. Code Graph Analysis

One of the most powerful features is the code dependency graph that helps understand relationships between different parts of the codebase:

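import networkx as nx


class CodeGraph:
    def __init__(self):
        # Directed graph of relationships between files in the codebase.
        self.graph = nx.DiGraph()

    def get_relations(self, file_path):
        node = self.graph.nodes.get(file_path, {})
        # Treat a missing or None 'methods' attribute as an empty list.
        methods = node.get('methods') or []
        # nx.descendants raises NetworkXError for unknown nodes, so guard the call.
        descendants = nx.descendants(self.graph, file_path) if file_path in self.graph else set()
        all_related = sorted(list(descendants) + list(methods))
        return {
            'parent': node.get('parent_class'),
            'dependencies': node.get('dependencies', []),
            'methods': methods,
            'all_related': all_related
        }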
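
The article does not show how the graph is populated, so the following is only a sketch. It assumes the analyzed project is written in Python and that each file becomes a node carrying the `parent_class`, `dependencies`, and `methods` attributes that `get_relations` reads. The `extract_relations` helper and the sample source are hypothetical, and only the standard `ast` module is used.

```python
import ast


def extract_relations(source: str) -> dict:
    """Collect imports, the first simple base class, and method names."""
    tree = ast.parse(source)
    dependencies: list[str] = []
    parent_class = None
    methods: list[str] = []

    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            dependencies.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            dependencies.append(node.module)
        elif isinstance(node, ast.ClassDef):
            # Record the first plain-name base class we encounter.
            if parent_class is None:
                for base in node.bases:
                    if isinstance(base, ast.Name):
                        parent_class = base.id
                        break
            # Only functions defined directly in the class body count as methods.
            methods.extend(
                item.name for item in node.body
                if isinstance(item, ast.FunctionDef)
            )

    return {
        "parent_class": parent_class,
        "dependencies": sorted(set(dependencies)),
        "methods": methods,
    }


example_source = '''
import json
from widgets.base import BaseWidget

class TimelineWidget(BaseWidget):
    def load(self): ...
    def render(self): ...
'''

relations = extract_relations(example_source)
print(relations)
```

Each result could then be attached to the graph as node attributes, for example with `graph.add_node(file_path, **relations)`, plus one edge per dependency.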

LangGraph: The Foundation of Multi-Agent Orchestration

One of the most innovative aspects of AI Codebase Expert is its use of LangGraph for coordinating multiple AI agents. LangGraph is a framework for building stateful, multi-agent applications with LLMs.

Understanding LangGraph Workflow

The system uses LangGraph’s StateGraph to create a sophisticated workflow:

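from typing import Annotated, TypedDict

from langgraph.graph.message import add_messages

class AgentState(TypedDict):
    ticket: str
    code: str
    messages: Annotated[list, add_messages]  # reducer merges new messages
    iteration_count: int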

This state management allows agents to:

Maintain context across multiple interactions
Track the progress of solution development
Share information between different agents
Control the flow of the problem-solving process
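
The role of the `Annotated[list, add_messages]` field can be illustrated without LangGraph. The sketch below is a simplified stand-in, not LangGraph's actual implementation: a node returns a partial update, keys annotated with a reducer are combined with the existing value, and all other keys are overwritten. The names `append_messages`, `SketchState`, and `apply_update` are hypothetical.

```python
from typing import Annotated, TypedDict, get_type_hints


def append_messages(existing: list, new: list) -> list:
    """Toy reducer: append new messages instead of overwriting the list."""
    return existing + new


class SketchState(TypedDict):
    ticket: str
    messages: Annotated[list, append_messages]
    iteration_count: int


def apply_update(state: dict, update: dict) -> dict:
    """Merge a node's partial update into the state.

    Keys annotated with a reducer are combined; other keys are replaced.
    """
    hints = get_type_hints(SketchState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        metadata = getattr(hints[key], "__metadata__", ())
        if metadata:
            merged[key] = metadata[0](state[key], value)
        else:
            merged[key] = value
    return merged


state = {"ticket": "Duplicated rows in timeline", "messages": [], "iteration_count": 0}
state = apply_update(state, {"messages": ["Solver: proposal"], "iteration_count": 1})
state = apply_update(state, {"messages": ["Critic: APPROVED"], "iteration_count": 2})
print(state["messages"], state["iteration_count"])
```

This is why each agent can hand over only what changed while the shared message history keeps accumulating.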

Building the Agent Workflow

The workflow is constructed using LangGraph’s StateGraph:

def build_system(self, ticket: str, proj_dir_structure: str, code: str, image_description: str) -> CompiledStateGraph:
    solver = self._create_agent(
        "Solver",
        self.prompt_template_provider.get_prompt_template_message(ticket, code, image_description)
    )

    analyzer = self._create_agent_with_tools(
        "Analyzer",
        self.prompt_template_provider.get_prompt_template_message(ticket, code, proj_dir_structure)
    )

    critic = self._create_agent(
        "Critic",
        self.prompt_template_provider.get_critic_prompt_message()
    )

    self.workflow.add_node("Solver", solver)
    self.workflow.add_node("Analyzer", analyzer)
    self.workflow.add_node("Critic", critic)

Multi-Agent Architecture Deep Dive

The multi-agent system in AI Codebase Expert consists of three specialized agents working together to solve development tasks:

1. Solver Agent

The Solver Agent is responsible for proposing initial solutions:

from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableLambda


def _create_agent(self, role: str, template: str):
    def agent_node(state: AgentState):
        # System prompt carries the role template; the ticket and code form the user turn.
        prompt = ChatPromptTemplate.from_messages([
            SystemMessage(content=template),
            HumanMessage(content=state["ticket"] + "\n\nRelated Code:\n" + state["code"])
        ])
        chain = prompt | self.llm
        response = chain.invoke({})

        new_state = {
            "messages": (state["messages"] + [response])[-5:],
            "ticket": state["ticket"],
            "code": state["code"],
            "iteration_count": state["iteration_count"] + 1
        }
        return new_state

    return RunnableLambda(agent_node, name=role)

2. Analyzer Agent

The Analyzer Agent is equipped with special tools to investigate code relationships:

def _create_agent_with_tools(self, role: str, template: str):
    def agent_node(state: AgentState):
        agent = ChatAgent.from_llm_and_tools(self.llm, self.tools, verbose=True)
        executor = AgentExecutor.from_agent_and_tools(
            agent=agent,
            tools=self.tools,
            handle_parsing_errors=True,
            max_execution_time=180,
            return_source_documents=True,
            early_stopping_method="generate",
            verbose=True
        )

        input_query = {
            "input": f"{template}\n\nRelated Code:\n{state['code']}",
            "closure": "",
            "main": ""
        }

        result = executor.invoke(input_query)
        return self._update_state(state, result)

    # Wrap the node as a named runnable, mirroring _create_agent.
    return RunnableLambda(agent_node, name=role)

3. Critic Agent

The Critic Agent evaluates proposed solutions:

def get_critic_prompt_message(self) -> str:
    return """You are a senior code reviewer. Evaluate the proposed solution.
    Write exact and only word APPROVED if solution is acceptable and complete."""
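
The prompt asks the Critic to answer with the single word APPROVED. If the verdict is detected with a substring check, a reply such as "NOT APPROVED" would also match. A stricter parse, sketched here with a hypothetical `is_approved` helper that is not part of the repository, accepts the verdict only when the whole reply is that word.

```python
def is_approved(reply: str) -> bool:
    """True only if the reply is exactly APPROVED.

    Surrounding whitespace and a trailing period are ignored.
    """
    return reply.strip().rstrip(".").strip() == "APPROVED"


print(is_approved("APPROVED"))
print(is_approved("  APPROVED.\n"))
print(is_approved("NOT APPROVED: the fix misses the pagination case"))
```

Whether such strictness is desirable depends on how reliably the chosen model follows the "exact and only word" instruction.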

Agent Interaction Flow

The agents interact in a structured workflow:

1. Initial Analysis:

def solver_decision(state: AgentState):
    last_msg = state["messages"][-1].content
    if "MISSING_INFORMATION" in last_msg:
        return "Analyzer"
    return "Critic"

2. Solution Refinement:

def decide_next_step(state: AgentState):
    last_msg = state["messages"][-1].content
    if "APPROVED" in last_msg:
        return END
    if state["iteration_count"] >= MAX_ITERATIONS:
        return END
    return "Solver"
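
The two routing functions can be exercised without any LLM. The following sketch walks the Solver, Analyzer, and Critic loop using canned replies. It makes two assumptions that the article does not state: the Analyzer always hands control back to the Solver, and the string `"END"` stands in for LangGraph's `END` constant. The `run` helper and the scripted replies are hypothetical.

```python
MAX_ITERATIONS = 5
END = "END"  # stand-in for langgraph.graph.END


def solver_decision(last_msg: str) -> str:
    return "Analyzer" if "MISSING_INFORMATION" in last_msg else "Critic"


def decide_next_step(last_msg: str, iteration_count: int) -> str:
    if "APPROVED" in last_msg or iteration_count >= MAX_ITERATIONS:
        return END
    return "Solver"


def run(scripted_replies: dict[str, list[str]]) -> list[str]:
    """Walk the agent loop, consuming one canned reply per visit."""
    visited, node, iteration_count = [], "Solver", 0
    while node != END:
        visited.append(node)
        iteration_count += 1  # every agent increments the counter
        reply = scripted_replies[node].pop(0)
        if node == "Solver":
            node = solver_decision(reply)
        elif node == "Analyzer":
            node = "Solver"  # assumption: analysis results return to the Solver
        else:  # Critic
            node = decide_next_step(reply, iteration_count)
    return visited


path = run({
    "Solver": ["MISSING_INFORMATION: need widget code", "Use a set to skip duplicates"],
    "Analyzer": ["Found TimelineWidget.render in timeline_widget.py"],
    "Critic": ["APPROVED"],
})
print(path)
```

Tracing the path this way is a cheap check that the loop terminates, either by approval or by reaching the iteration cap.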

Advanced Features

Retriever Tools Integration

The system incorporates multiple retriever tools for comprehensive code analysis:

def get_retrievers(self):
    vectordb_code = VectorStore.get_vector_store(EnumDocsCollection.CODE.value)
    base_retriever = vectordb_code.as_retriever(
        search_type='mmr',
        search_kwargs={'k': 8, "lambda_mult": 0.5},
        return_source_documents=True,
    )

    graph_retriever = CustomGraphRetriever(
        base_retriever=base_retriever,
        enhancer=RunnableLambda(self._enhance_documents),
    )

    vectordb_docs = VectorStore.get_vector_store(EnumDocsCollection.DOCUMENTATION.value)
    retriever_docs = vectordb_docs.as_retriever(
        search_type='mmr',
        search_kwargs={'k': 8, "lambda_mult": 0.5},
        return_source_documents=True,
    )

    return graph_retriever, retriever_docs
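
Both retrievers use `search_type='mmr'` (maximal marginal relevance), which trades relevance to the query against redundancy among the results already chosen; `lambda_mult` sets the balance and `k` the number of results. The sketch below is a plain-Python illustration of that selection rule on toy 2-dimensional vectors. It is not the library's code, and the `mmr` function and sample vectors are assumptions.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(query, candidates, k=2, lambda_mult=0.5):
    """Return indices of k candidates chosen by maximal marginal relevance."""
    selected: list[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(query, candidates[i])
            # Similarity to the closest already-selected item (0 if none yet).
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [0.9, 0.3]
chunks = [
    [1.0, 0.0],    # 0: close to the query
    [0.99, 0.05],  # 1: near-duplicate of chunk 0
    [0.6, 0.8],    # 2: less relevant, but points elsewhere
]

print(mmr(query, chunks, k=2, lambda_mult=1.0))  # pure relevance
print(mmr(query, chunks, k=2, lambda_mult=0.5))  # balanced
```

With `lambda_mult=1.0` the two near-duplicates are returned together, while `0.5` swaps one of them for the more distinct chunk, which is the behavior that helps avoid filling the context with almost identical code snippets.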

Document Enhancement

The system enhances code snippets with contextual information:

def _enhance_documents(self, docs):
    enhanced = []
    for doc in docs:
        metadata = doc.metadata
        relations = self.code_graph.get_relations(metadata['file_path'])

        enhanced_metadata = {
            **metadata,
            'parent_file': relations['parent'],
            'dependency_files': relations['dependencies'],
            'all_related_files': relations['all_related']
        }

        enhanced_content = (
            f"File: {metadata['file_path']}\n"
            f"Parent: {relations['parent'] or 'None'}\n"
            f"Dependencies: {', '.join(relations['dependencies']) or 'None'}\n"
            f"Content:\n{doc.page_content}"
        )

        enhanced.append(Document(
            page_content=enhanced_content,
            metadata=enhanced_metadata
        ))
    return enhanced

Practical Implementation Examples

Handling a Bug Fix Ticket

Let’s look at how the system processes a typical bug fix ticket.
In this example, the ticket reports duplicated data rendered in a timeline widget.

1. Ticket Input:

ticket = Ticket(form)
image_description = self._describe_attached_image(ticket)
rag_output = self._get_concepts_with_RAG_for_the_task(ticket, image_description)

2. Solution Generation:

result = self._provide_solution_with_selected_LLM_chain(
    form,
    ticket,
    rag_output,
    image_description
)

2.1. Concepts prompt

First we get a list of concepts that need to be checked in order to resolve the problem.

You are a chatbot tasked with solving software project issues.
You can be also supplied with code solution proposal to review.
The project is {self.project_description} and it's written in {self.programming_language} language using {self.framework} framework.

Prepare message that will be used for semantic search in database for project code and project documentation.
This message should contain some code if possible to match also files with code in vector db.
Prepare message based on issue description below. Say which files should be checked.
Prepare list of information and concepts that are relevant to answering for the problem described below. Take also into consideration directory structure of the project


{ticket}\n\n{proj_dir_structure}

2.2. Solution prompt

Next, we combine the generated concepts and the code fragments matched by semantic search with the solution prompt.

You are a Senior Software Engineer with expertise in code analysis.
You have a strong ability to troubleshoot and resolve issues based on the information provided.
If you are uncertain about the answer, simply state that you do not know.

The project is a {self.project_description}. 
It is built using {self.programming_language} and the {self.framework} framework.
 
When analyzing the provided code context, carefully evaluate:
 
 The structure and connections between different code components
 Key implementation details and coding patterns used
 The file paths and locations of each code snippet
 The type of code element (e.g., class, method, function, etc.)
 The name and purpose of each code segment
...
{ticket}

The generated solution proposes adding a helper array that stores only unique values, which is a correct solution for the given problem.
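
The article does not reproduce the generated patch, so the snippet below only illustrates the idea it describes: keep a helper collection of values already seen and skip repeats while preserving order. The `unique_events` function, the `id` key, and the sample events are hypothetical, and the sketch is written in Python regardless of the language used by the project in the ticket.

```python
def unique_events(events: list[dict], key: str = "id") -> list[dict]:
    """Return events in original order, dropping later repeats of a key."""
    seen = set()   # helper collection holding keys already emitted
    result = []
    for event in events:
        if event[key] in seen:
            continue
        seen.add(event[key])
        result.append(event)
    return result


timeline = [
    {"id": 1, "label": "Ticket created"},
    {"id": 2, "label": "Status changed"},
    {"id": 1, "label": "Ticket created"},
]
print([e["id"] for e in unique_events(timeline)])
```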

Performance Optimization

The system includes several optimizations to ensure efficient operation:

1. Memory Management

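def _update_state(self, state: AgentState, result: dict):
    # Keep only the five most recent messages to bound the prompt size.
    # Note: `messages` uses the add_messages reducer, which merges returned
    # messages into the existing history by ID, so this slice may not
    # actually shrink the stored list.
    return {
        "messages": (state["messages"] + [result["output"]])[-5:],
        "ticket": state["ticket"],
        "code": state["code"],
        "iteration_count": state["iteration_count"] + 1
    }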
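
The intent of the `[-5:]` slice is a sliding window over the conversation. As a small illustration outside of LangGraph, the standard library's `collections.deque` with `maxlen` gives the same "keep the last N" behavior; `MAX_MESSAGES` and the sample messages are assumptions.

```python
from collections import deque

MAX_MESSAGES = 5  # mirrors the [-5:] slice

# A bounded deque silently drops the oldest entry once it is full.
history: deque[str] = deque(maxlen=MAX_MESSAGES)
for turn in range(1, 8):
    history.append(f"message {turn}")

print(list(history))
```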

2. Iteration Control

MAX_ITERATIONS = 5  # Limit the loop to avoid infinite execution

def check_iteration_limit(self, state: AgentState):
    if state["iteration_count"] >= MAX_ITERATIONS:
        return self._generate_timeout_response()
    return None

Future Developments

New features are being actively developed for the project:

Enhanced Agent Capabilities:

More specialized agents for specific tasks
Improved inter-agent communication
Better context awareness

Advanced Code Analysis:

Deep learning-based code understanding
Automated test generation
Performance impact analysis

Integration Improvements:

Support for more version control systems
CI/CD pipeline integration
Automated documentation updates

Conclusion

AI Codebase Expert represents a significant advancement in software development tooling. By combining LangGraph’s powerful orchestration capabilities with specialized AI agents, it provides developers with an intelligent assistant that can significantly improve their productivity and code quality.

The multi-agent architecture, powered by LangGraph, enables sophisticated problem-solving that more closely mirrors human development teams’ collaborative work. This approach allows for more thorough analysis, better solution validation, and more reliable code modifications.

For more information or to contribute to the project, visit the GitHub repository. If you find this project helpful, please leave a star; it motivates me to develop it further 🙂

Contact me by email at michal@zarnecki.pl.

