Agentic Intelligence in Action: Developing an Agentic Intelligent Document Summarizer!
Last Updated on January 15, 2025 by Editorial Team
Author(s): Harshit Dawar
Originally published on Towards AI.
So here we are, in the world of Agents, where everything has started moving towards Agentic AI. Even Satya Nadella predicted that "agent-based software will replace SaaS applications."
Hence, it's of utmost importance to understand agents & gain the ability to develop our own agents, to stay relevant in this rapidly evolving world. This article will help you develop your own "Agentic Intelligent Document Summarizer".
Why invest your time in this article/blog?
This blog will help you understand all the steps required to develop your own "Agentic Intelligent Document Summarizer": an application that can take any document as input, summarize its content using an intelligent agent, & be deployed anywhere (thanks to containerization).
Every step's code is explained in its respective place in this blog, & the link to the complete project code is provided at the end of the article.
The application will be developed using the following tools & technologies:
- AWS Bedrock
- AWS Textract
- AWS S3
- Docker
- FastAPI
- Python
- REST API concepts
- Generative AI
- Agents
To get the most out of this article/blog, you should be at least familiar with the topics listed above. I highly recommend thoroughly reading the article below first; it will familiarize you with many of these concepts & boost your confidence. It also explains how to request access to the few AWS Bedrock models that are only available once requested.
Developing a personalized meal informer through RAG using AWS Bedrock!
This blog aims to explain the service of AWS Bedrock in complete detail, why AWS Bedrock?, what RAG is, its…
harshitdawar.medium.com
Don't be overwhelmed by the names mentioned above; this blog will explain everything in the easiest way possible to make you comfortable with everything & impart enough knowledge to you so that you can create your own custom agents to solve even the most complex use cases.
Prerequisites for developing this application!
To run the application I developed, or the one you will build along the way, you need the following:
- AWS account keys with sufficient privileges: To use the AWS services
- An AWS S3 bucket: To store the document that needs to be summarized
- Any AWS Bedrock Text Model: To summarize the content & to be used as an LLM in the agent.
- Any container execution/orchestration tool: If you want your application to be containerized, then this is required; otherwise, you can directly run your application using Python. If you choose to containerize the application, then you can use Docker, Podman, Kubernetes, OpenShift, or any other equivalent cloud-based offering.
With all the prerequisites covered, let's start developing the "Agentic Intelligent Document Summarizer" application.
Agentic Intelligent Document Summarizer!
Following best practices, & to simplify development, the application code is divided into modules. Let's code each of the modules.
Defining AWS Clients!
To access every AWS service, its respective AWS Client needs to be defined.
The above code is creating the client for the following AWS services:
- AWS S3: To upload the document that needs to be summarized.
- AWS Textract: To extract the content from the document.
- AWS Bedrock: To summarize the content extracted from the document.
AWS keys are passed using the environment variables (which is a best practice) in the above code. You can use any of the methods you are aware of for doing so; the simplest one is hard-coding, but it's the worst method from the security point of view. If you are containerizing the application using Docker or an equivalent, you can pass the keys in the Dockerfile, or as environment variables while running the container (recommended from a security point of view). In case you are using Kubernetes or an equivalent, then you can use secrets as well.
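As a minimal sketch, the client setup can look like the following. The environment variable names (`AWS_ACCESS_KEY`, `AWS_SECRET_ACCESS_KEY`, `S3_BUCKET`) match the container command shown later in this article, but the region default and helper names here are illustrative assumptions, not the author's exact code:

```python
import os


def aws_credentials() -> dict:
    """Read AWS credentials from environment variables (never hard-code them)."""
    return {
        "aws_access_key_id": os.environ["AWS_ACCESS_KEY"],
        "aws_secret_access_key": os.environ["AWS_SECRET_ACCESS_KEY"],
        # Region default is an assumption; set AWS_REGION to override.
        "region_name": os.environ.get("AWS_REGION", "us-east-1"),
    }


def make_clients():
    """Create the three AWS clients the application needs."""
    import boto3  # deferred so aws_credentials() works even without boto3 installed

    creds = aws_credentials()
    s3 = boto3.client("s3", **creds)
    textract = boto3.client("textract", **creds)
    # "bedrock-runtime" is the endpoint used for model invocation (not "bedrock")
    bedrock = boto3.client("bedrock-runtime", **creds)
    return s3, textract, bedrock
```

Reading the keys from the environment keeps them out of the source code and the image layers, which is why this is the recommended approach.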
Defining Tools!
Tools are the helpers that an agent leverages to perform an operation/action on its own. Every tool's docstring must state the purpose of that tool, because this is the only information the agent uses to decide whether to use a particular tool for an operation.
Note 1: All the tools will be defined using the @tool decorator from the langchain module, which is a best practice for defining a tool.
Note 2: Do not worry about the libraries that need to be imported to run the code. The aim of this article is to explain the important code, not the boilerplate or the required libraries. The complete GitHub repository (including everything) is linked at the end of the blog.
AWS S3 Tool!
The above code creates a tool to upload a file to S3, which will be leveraged by the agent we define later in this article.
A LangChain tool (with the agent type used here) can only accept a single input string; however, this tool needs 2 arguments: "file_path" & "object_name" (the target file name in S3) to upload a file to S3. To make this possible, a prompt trick is used: the required input format is defined in the docstring of the tool, & the agent passes the arguments in that format. Since the arguments arrive in the format described in the prompt, both required values are recovered using the split() method, & the document is uploaded to AWS S3.
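A minimal sketch of how such a tool can look is shown below. The `|` separator, the function names, and the fallback decorator are illustrative assumptions for this sketch, not the author's exact code; the docstring carries the input-format contract the agent relies on:

```python
try:
    from langchain.tools import tool
except ImportError:          # allow running this sketch without langchain installed
    def tool(fn):            # minimal stand-in for the @tool decorator
        return fn


def parse_tool_input(tool_input: str) -> tuple:
    """Split the agent's single input string into the two real arguments."""
    file_path, object_name = tool_input.split("|", 1)
    return file_path.strip(), object_name.strip()


@tool
def upload_document_to_s3(tool_input: str) -> str:
    """Uploads a local file to the S3 bucket.

    Input format: '<local file path>|<target object name in S3>'.
    """
    import os
    import boto3

    file_path, object_name = parse_tool_input(tool_input)
    bucket = os.environ["S3_BUCKET"]
    boto3.client("s3").upload_file(file_path, bucket, object_name)
    return f"Uploaded {file_path} to s3://{bucket}/{object_name}"
```

The agent reads the "Input format" line in the docstring and formats its single string argument accordingly, which is how two logical arguments travel through a one-argument tool.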
AWS Textract Tool!
The above code extracts the content, using AWS Textract, from a document present in an AWS S3 bucket. This tool, too, will be leveraged by the agent we define later in this article.
Similar to the S3 tool above, this tool also requires 2 arguments, so the exact same problem arises & the same docstring-format solution is applied.
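A sketch of the extraction step might look like the following. The helper names are assumptions for illustration; note also that Textract's synchronous `detect_document_text` call only accepts single-page images from S3, so a production version handling multi-page PDFs would use the asynchronous `start_document_text_detection` API instead:

```python
def lines_from_textract(response: dict) -> str:
    """Join the LINE blocks of a Textract response into plain text."""
    return "\n".join(
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    )


def extract_text_from_s3(bucket: str, object_name: str) -> str:
    """Run Textract on a document already uploaded to S3."""
    import boto3

    textract = boto3.client("textract")
    # Synchronous call; multi-page PDFs require the asynchronous
    # start_document_text_detection / get_document_text_detection pair.
    response = textract.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": object_name}}
    )
    return lines_from_textract(response)
```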
AWS Bedrock Tool!
The above code summarizes, using AWS Bedrock, the content that was extracted from the document in the AWS S3 bucket. This tool, too, will be leveraged by the agent we define later in this article.
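A sketch of the summarization call is below. The model ID is the Bedrock identifier for the Llama 3 8B Instruct model the article mentions; the prompt wording, generation parameters, and the request/response body shape (which is specific to Llama models on Bedrock) are assumptions for this sketch:

```python
import json

# Bedrock model identifier for Llama 3 8B Instruct
LLAMA3_MODEL_ID = "meta.llama3-8b-instruct-v1:0"


def build_summary_prompt(text: str) -> str:
    """Wrap the extracted document text in a summarization instruction."""
    return f"Summarize the following document concisely:\n\n{text}\n\nSummary:"


def summarize_with_bedrock(text: str) -> str:
    """Invoke the Bedrock runtime to summarize the extracted text."""
    import boto3

    bedrock = boto3.client("bedrock-runtime")
    # Llama models on Bedrock take a "prompt" field and return a "generation" field.
    body = json.dumps({
        "prompt": build_summary_prompt(text),
        "max_gen_len": 512,
        "temperature": 0.2,
    })
    response = bedrock.invoke_model(modelId=LLAMA3_MODEL_ID, body=body)
    return json.loads(response["body"].read())["generation"]
```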
Defining the Agent!
An agent is the smart entity that decides which actions to take, & in which order, to fulfill a use case/goal using its intelligence.
The above code defines the agent that meets our goal, i.e., summarizing a document. It is given all the tools created above, & it will automatically decide when to use which tool to fulfill our use case.
The LLM being used in the agent is the Llama 3 instruct variation with 8 billion parameters, which is available on AWS Bedrock.
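The wiring can be sketched as follows. The exact import paths and constructor names vary across LangChain versions (here the `langchain-aws` package's `BedrockLLM` and the classic `initialize_agent` helper are assumed), so treat this as one plausible shape rather than the author's exact code:

```python
try:
    from langchain.agents import AgentType, initialize_agent
    from langchain_aws import BedrockLLM  # from the langchain-aws package
except ImportError:
    AgentType = initialize_agent = BedrockLLM = None  # sketch runs without langchain


def agent_instruction(document_path: str, object_name: str) -> str:
    """The natural-language goal handed to the agent on each request."""
    return (
        f"Upload the document at {document_path} to S3 as {object_name}, "
        "extract its text with Textract, and return a concise summary."
    )


def build_agent(tools):
    """Attach the Bedrock-hosted Llama 3 8B Instruct model to the tools as a ReAct agent."""
    llm = BedrockLLM(model_id="meta.llama3-8b-instruct-v1:0")
    return initialize_agent(
        tools=tools,
        llm=llm,
        # This agent type chooses tools purely from their docstrings,
        # which is why the tool docstrings above matter so much.
        agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
        verbose=True,
    )
```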
Defining the main application route!
Here, the main application route is defined. It is responsible for taking the document as input, performing all the required steps to meet the goal, & finally returning the summary of the document.
The above code is defining the main application route, which is doing the following things in order:
- Taking the document as input.
- Creating a directory named "media_files" in the filesystem used by the application. If you are running it in Docker, the directory is created inside the container.
- Saving a file (named after the document) with the document's content locally.
- Calling the method to initialize the Agent.
- Invoking the Agent with the instructions required to meet our goal.
- Letting the Agent perform all the necessary actions by leveraging the tools defined.
- Returning the summary of the document from the application.
This concludes the development of our "Agentic Intelligent Document Summarizer".
Example Run of the Agent!
Sample content on Generative AI was taken from Wikipedia & turned into a PDF document, which was then passed to the application; all the steps are shown below.
Note: The Postman tool is used for interacting with the application's API.
Complete Link of Application Code!
The GitHub repository link of the complete application is mentioned below; you can take the code from there for quick testing, or you can customize it for your own use case if required.
GitHub – HarshitDawar55/agentic__intelligent_document_summarizer
Application's plug-&-play container image!
If you want to use the application directly, or to quickly test it without touching the code, the link to my container image of this application is below.
https://hub.docker.com/r/harshitdawar/agentic-intelligent-document-summariser
To create a container from the image, you need to provide a few details:
- AWS Keys
- AWS S3 Bucket
Command to use:
docker run -dit -p <your system available port>:80 -e AWS_ACCESS_KEY="<your aws access key>" -e AWS_SECRET_ACCESS_KEY="<your aws secret access key>" -e S3_BUCKET="<S3 Bucket name to use>" harshitdawar/agentic-intelligent-document-summariser:latest
Once your application is running using the above command, you can call the application endpoint, which will be "<your application IP/DNS>:<port number that you used in the above command>/". Then pass the document as "form-data"; an example is showcased in the image above with the caption "Input request to the application's API. Image by Author!"
This concludes this amazing blog. I hope you enjoyed it a lot. Do let me know your thoughts in the comments, and don't forget to follow me. Also, if you want me to write an article on some of the topics, do reach out to me on LinkedIn or comment on any of my articles. I will be extremely happy to do the same.
I hope this article explains everything related to the topic, with all the detailed concepts and explanations. Thank you so much for investing your time in reading my blog & boosting your knowledge. If you like my work, then I request you to applaud this blog & follow me on Medium, GitHub, & LinkedIn for more amazing content on multiple technologies and their integration!
Also, subscribe to me on Medium to get updates on all my blogs!