Building a Multi AI Agents application using the Amazon Gen AI Dream Team (Bedrock, Strands, AgentCore, and Q Developer)

Last Updated on October 4, 2025 by Editorial Team

Author(s): Luis Parraguez

Originally published on Towards AI.

Building a Multi AI Agents application using the Amazon Gen AI Dream Team (Bedrock, Strands, AgentCore, and Q Developer)

In my last article: Boost productivity and achieve greater success with an entire team of AI agents at your service! (Part 2) | by Luis Parraguez | AWS Tip, I started my journey building a Services Proposal Advisory Team as a practical example of multi-agent collaboration. This initial version was built using Amazon Bedrock Agents.

As we know, after the launching of Amazon Bedrock Agents, we had a very fast development and launch of new Generative AI tools and features including the launch of Strands Agents, AWS CDK framework to develop AI agents and more recently the launch of Amazon Bedrock AgentCore in the AWS Summit in NY as a suite of services aimed to enable the transition of an agentic application built with your choice of development framework (including CrewAI, LangGraph, LlamaIndex, and Strands Agents) and foundation model (in or outside of Amazon Bedrock) from POC to a production ready application.

The initial version was designed to work with 05 AI agents:

Advisor agent: Supervisor Agent responsible for the consulting services proposal preparation including Executive Summary, Scope, Solution, Approach and Timeline, Team, Pricing and additional questions for proposal completion.
Requirements agent: Agent that will review and understand the Client’s demand, identify and/or recommend requirements for the project and identify relevant questions needed for the proposal preparation.
Solution agent: Agent that will define the Methodology and Solution Architecture required to attend the Client’s demands.
Approach agent: Agent that will prepare the project implementation approach, timeline and team to deliver the client demand using the defined methodology and solution architecture.
Pricing agent: Agent that will prepare the services pricing considering the previous project implementation approach and effort estimation.

The objective for the next version was to enhance and extend the solution looking to add the following features:

Implement a defined and controlled orchestration workflow allowing to support the creation and update of a proposal through a cycle of interactions with the user (enhanced governance).
Implement an agent responsible to assess the quality of the inputs provided for the proposal generation, acting as a gate keeper before the actual creation or update workflow is started (quality assurance and cost efficiency).
Implement a proposal approach review cycle through an improvement loop conducted by a team of approach creator and approach reviewer (critic) agents looking to enhance this key process output (quality assurance).
Enhance the context sharing among agents looking to provide the proper amount of information to enhance the agent’s productivity during the workflow execution (process and cost efficiency).
Enhance and extend the tools access for the agents going beyond internal developed tools to MCP (Model Context Protocol) tools offered, for example, by AWS MCP servers like AWS Knowledge MCP Server | AWS MCP Servers (knowledge and tools quality).
Implement observability (logging, tracing and metrics) as a critical solution component to support testing and continuous improvement of the solution (quality assurance).
Deploy this Agentic solution as a production ready endpoint that can be consumed by an Application Frontend built to showcase the proposal generation workflow and outputs (production readiness).

Looking at the topics above, I took the decision to leverage AWS Strands Agents to gain the development flexibility required to implement the workflow and tools enhancements and Amazon Bedrock AgentCore to deploy the production ready endpoint with observability. The agents use Amazon Bedrock models, combining Amazon Nova Models and Anthropic Claude as needed.

In the following solution architecture diagram, you can see the detail of all the services used to build the solution:

Now applying the “Working Backwards” approach, let me show you how I built this new solution version. Let’s start seeing a video showing the expected customer experience and content result when creating a new proposal using the solution:

Now let’s go over the key lessons learned during this journey:

Structuring the development project for the agentic solution

Looking at different samples related to Strands Agents and Bedrock AgentCore, I defined the project structure composed by the following folders:

‘agents’: To include the python modules related to the agent’s code. Each agent represented by its individual function, recommended also to implement ‘agent as tools’ methods, if needed.
‘agents_shared’: To include the python modules related to shared functionality among the agents including constants (global variables), logging, tracing and metrics. We’ll come back to this.
‘prompts’: To include the files actually containing the system prompts to be dynamically assigned to each agent, along the workflow, with variations depending on the scenario (example: requirements creation versus update). This prompts breakdown is aligned with the process efficiency objective.
‘tools’: To include the python modules containing the custom-built tools for the agents.
‘root folder’: where we have the multi-agent orchestrator module responsible to manage the workflow and scenarios (proposal creation versus update). This module was also defined as the Bedrock AgentCore Entry Point since our final outcome (proposal created or updated) is the result of the multi-agent team workflow. This is an important decision that you need to make when working with Strands and AgentCore and will depend on your use case (multi-agent team or individual agent).

The root folder is also where AgentCore will create configuration files required to create the docker container image to be used by the AgentCore runtime that will provide the endpoint to call your agentic application.

Implement a defined and controlled orchestration workflow

This is why I chose to build the solution with AWS Strands, looking to have complete control of the workflow and treatment of the scenarios to be supported. When building this workflow there are some key drivers that I will suggest considering:

Concentrate the workflow complexity (inputs definition, outputs validations, context preparation) in the multi-agent orchestrator module looking to focus the Agents to receive their inputs (context), execute their tasks (according to the system prompt) and produce their expected results (with proper formatting). This will help you to design and build agents that can be reused in other applications or even be used as tools by other agents.
Focus the power of the LLMs that you are using in value-added tasks. As we know the LLMs are growing in capabilities each day and that may influence us to have them handling all sort of activities. We need to be very critic about the cost-benefit of having the model executing a task versus resolving it using ‘regular’ tools and code.
When building a multi-agent application also design the workflow taking into account the key intermediate milestones that you need to track before the final outcome generation. Include in your workflow a mechanism that will allow you to test the different workflow stages. As an example, this application allows to define a “target stage” that starts at proposal inputs validation and progress to requirements, solution, approach and timeline, pricing and complete proposal generation.

Implement an agent responsible to assess the quality of the inputs

Specially in multi-agent applications, always evaluate the cost/benefit of implement checkpoints (gates) before launching the actual agents workflow. As we know the quality of the model's output is directly related to the quality of the inputs (context) that is provided and therefore all actions that we can take to improve those inputs or even not allow the agentic process to start is going to be valid.

In the case of this application, I decided to implement a Validator Agent to evaluate the inputs provided to generate the proposal before the engagement of following eight agents that participate in the workflow. This is also aligned with the cost-efficiency goal because as we know the generative models will ALWAYS generate a response, consuming input and output tokens, even if it will be a hallucination. Depending on your use case the validation gate(s) can be implemented using tools and code.

Implement a review cycle through an improvement loop

I wanted to highlight this particular pattern because initially tried to implement this loop using prompt techniques. The reality, at least in my experience, is that the generative models are not able to manage this loop pattern in a consistent way even with complex system prompts. This was another strong reason for the adoption of Strands in order to implement, through code, the quality control loop and its conditions to start, maintain and exit the loop. Since we are working with generative models the exit conditions are critical to establish a limit of tokens consumption in an iterative refinement process.

Enhance the context sharing among agents

Context sharing is a key success factor in this type of applications and along the development process it is important to identify opportunities about how the workflow orchestrator can contribute to narrow the context sent to the agents to the information really needed by them instead of sharing broader information that will cause both unnecessary input tokens consumption as well as deviation of the agent’s attention to details not related with their actual objectives. This context “cleaning” also should include the agents “thinking process” that are included in their responses (even including instructions to avoid them). The thinking process is very valuable for testing and observability purposes but can complicate the process if left in a response that will be used in a next step.

Enhance and extend the tools access for the agents

This was certainly a very valuable addition for the solution looking to start leveraging the tools universe through Model Context Protocol (MCP). In the case of this application this was a direct opportunity since we are talking about proposals generation involving technology services. In case the proposal involves AWS services, I am leveraging the AWS Knowledge MCP Server to provide to the Requirements and Solution Agents access to an updated AWS knowledge base complementing their custom-built tools. As an example, see the following code:

# Including here the code to enable the utilization of AWS Knowledge MCP Server

streamable_http_client = MCPClient(lambda: streamablehttp_client("https://knowledge-mcp.global.api.aws"))

# Create an agent with MCP tools
with streamable_http_client:
 # Get the tools from the MCP server
 mcp_tools = streamable_http_client.list_tools_sync()

 # Combine MCP tools with others tool (if needed)
 all_tools = mcp_tools + [http_request, ConvertDateFormat]

 # rest of your code...

MCP support was another strong feature of AWS Strands as you can see above. Important fact here is that when combining AWS Strands with AgentCore we needed to use an HTTP-based MCP server to avoid the “docker on docker” issue when using MCP servers running on docker. AWS Strands supports diverse MCP connection scenarios depending on your use case and also offers a broad set of prebuilt tools that you can leverage for your agents.

Implement observability (logging, tracing and metrics) as a critical solution component

This was another critical addition and is mandatory for any agentic application. Logging and tracing are fundamental tools during the development and testing of your agentic workflow, otherwise you will be working with a black box. Here you can also leverage the powerful combination of AWS Strands and AgentCore working together along the development, testing and deployment phases of your project. Based on my experience, my recommendations are the following:

Logging: Implemented as a shared service for the workflow orchestrator and every agent involved in the application. For Deployment and Monitoring in Production using AgentCore the logging information will be available through CloudWatch.

Visualizing information in **Log Group** related to the **AgentCore Runtime** hosting our application

Metrics (Development and Testing): AWS Strands provides a very rich set of Agent Metrics that you can leverage during your development and testing process. You can add custom atributes (like workflow stage, agent involved, etc.) that can allow you to gain visibility of the metrics in the context of your workflow. The recommendation is to build a module in your application to generate this quantitative metrics, based on AWS Strands information, to have a reference to compare with metrics generated by third party observability tools like Langfuse, Arize, etc. that may produce different results depending on how they process the information produced by AWS Strands.
Metrics (Deployment and Monitoring in Production): For this moment, AgentCore will provide Observability metrics for monitoring. Using the GenAI Observability dashboard in CloudWatch, we can visualize metrics related to the Bedrock AgentCore runtime(s) that are hosting our agent(s).

Using the **GenAI Observability dashboard** in **CloudWatch** to visualize AgentCore Runtime Metrics

Tracing (Development and Testing): Implement tracing for your agentic applications is equivalent to really know what is happening with your workflow. AWS Strands comes already prepared for tracing generation following the OTEL (Open Telemetry) standard and this enables its easy integration with observability third party tools. For this project, I selected Langfuse to support development and testing and working together with Strands is a very valuable tool to improve agent’s outcomes. Through the understanding of the model’s reasoning, we can identify prompt and tools adjustments required to guide the agent’s behavior in the direction that we need. The recommendation is to build a module in your application to handle the integration with your observability tool for local development and testing including a setting to enable or disable it. This is recommended when we will work with AgentCore in production because at this moment AgentCore does not support OTEL tracing distribution to CloudWatch and another third-party observability tool(s) in parallel.

Working with **AWS Strands Tracing information** in **Langfuse** for Agent’s refinement

Tracing (Deployment and Monitoring in Production): Similar as logs and metrics, for this case, we can use AgentCore Observability to inspect our agents tracing information by each session and apply the insights with the same objectives before with focus on continuous improvement and quality control of our application.

Using the **GenAI Observability dashboard** to visualize **session details** and **associated Traces**

Using the **GenAI Observability dashboard** to drill down a **Trace Timeline**

Using the **GenAI Observability dashboard** to drill down the **Trace Spans**

In the right side of this last image, you can see the sequence of Strands Agents invocations (09 in the case of this application workflow) and for each one of them you are able to see the events that contain the information about the actions taken by the model to generate its response (including thinking and tool usage) following the Agent Loop concept.

Deploy the Agentic solution as a production ready endpoint

For this final topic, we used the resources provided by AgentCore Runtime. To speed up the configuration steps, I decided to leverage the Amazon Bedrock AgentCore starter toolkit to prepare and deploy the agentic application to AWS. Using this process, AgentCore will generate the required configuration files to automatically create the IAM execution role, container image, and Amazon Elastic Container Registry repository needed to host the agent in AgentCore Runtime. After launching, your Agent(s) will be ready to be called using the AWS SDK InvokeAgentRuntime operation.

The frontend application that I have presented at the beginning is using that pattern to invoke the agentic application and interact with its responses.

You may be wondering where is here Amazon Q Developer? Amazon Q Developer was used to help me build from zero the Streamlit application that we are using to demonstrate the action of the AI agents. That will be subject of future post including its own lessons learned!

I do recommend you select a use case and build your first agent(s) using the Amazon Gen AI Dream Team. Several companies are already using this technology to build specialized agent teams to support them on their business processes and with that boost productivity and achieve greater success!

Hope this article will motivate you to experiment and, as always, you can reach out to me in case you want to discuss more about this exciting subject!

Let’s meet again in my next post and please feel free to share your feedback and comments!

Join thousands of data leaders on the AI newsletter. Join over 80,000 subscribers and keep up to date with the latest developments in AI. From research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor.

Published via Towards AI

Frequently Used, Contextual References

Resources

Building a Multi AI Agents application using the Amazon Gen AI Dream Team (Bedrock, Strands, AgentCore, and Q Developer)

Author(s): Luis Parraguez

Towards AI Academy

We Build Enterprise-Grade AI. We'll Teach You to Master It Too.

Recent Posts

Crack ML Interviews with Confidence: K-Nearest Neighbors (KNN 20 Q&A)

The Event-Driven Blueprint: How I Scaled a Spring Boot System to 10 Million Kafka Messages/Day

Building Vector Search? Why FAISS Alone Isn’t Enough

TAI #202: GPT-5.5 Moves Codex Into Real Work

Machine Learning System Design -The Model Serving Triangle, With One Forward Pass Flowing Through Every Trade-off (Part3)

AI Orchestration in Action: How MuleSoft and LLMs Fuel the Future of Enterprise AI

GPT-4 Has 1.8 Trillion Parameters. It Uses 2% of Them Per Token.

Part 20: Data Manipulation in Multi-Dimensional Aggregation

Comprehensive AI Engineering and AI for Work certifications

Company

CONTACT US

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

Frequently Used, Contextual References

Resources

Building a Multi AI Agents application using the Amazon Gen AI Dream Team (Bedrock, Strands, AgentCore, and Q Developer)

Author(s): Luis Parraguez

Towards AI Academy

We Build Enterprise-Grade AI. We'll Teach You to Master It Too.

Related posts

Recent Posts

Comprehensive AI Engineering and AI for Work certifications

Company

CONTACT US

GDPR CCPA Statement