Inside AGENTS: The New Open Source Framework for Building Semi-Autonomous LLM Agents
Last Updated on November 5, 2023 by Editorial Team
Author(s): Jesus Rodriguez
Originally published on Towards AI.
Autonomous agents are one of the most popular topics in the foundation model ecosystem. Early projects such as AutoGPT and BabyAGI sparked developers' imagination about the possibilities of autonomously solving tasks using large language models (LLMs). Many researchers believe that autonomous agents are one of the next frontiers in foundation models. However, the definition of what constitutes an agent is very loose today. Recently, researchers from AIWaves, Zhejiang University, and ETH Zürich published a paper detailing AGENTS, a framework for the creation of LLM-powered agents.
The core idea behind AGENTS is to bring language agents beyond the confines of research circles and to a more mainstream audience. AGENTS tries to incorporate important building blocks such as planning, memory, tool usage, multi-agent communication, and symbolic control under a single programming model. The paper comes accompanied by an open-source release, which is quite easy to use.
Let's dive in:
The AGENTS Principles
The AGENTS framework is an open-source platform tailored for language agents powered by LLMs. Its core tenet is to streamline the process of customizing, deploying, and fine-tuning language agents.
Designed to be user-friendly for beginners, AGENTS is based on a series of core principles:
1. Long-short-term memory: AGENTS recognizes the importance of memory in autonomous agents. While conventional machine learning models react to single inputs, autonomous agents interact continuously with environments or other agents. To address this, AGENTS incorporates memory components: it stores long-term memories in a VectorDB that supports semantic search, and it updates short-term memories using a dedicated scratchpad.
2. Tool usage & Web navigation: Language agents often need to move beyond mere linguistic interactions. They need the capacity to utilize external tools and explore the internet. AGENTS provides integration with popular external APIs and an adaptable class for the addition of more tools. It further empowers agents to search and browse the web through specialized API interfaces.
3. Multi-agent communication: AGENTS isn't just about individual agent capabilities. It also supports multi-agent systems, which are useful in domains such as gaming, social experiments, and software development. An innovative feature here is the "dynamic scheduling" approach. Rather than relying on static rules for the order in which agents act, AGENTS allows a controller agent (a "moderator" of sorts) to decide which agent acts next, taking into account the agents' roles and past activities.
4. Human-agent interaction: A shortcoming in many agent frameworks is their limited scope of interaction with humans, especially in multi-agent setups. AGENTS addresses this gap by supporting interaction between humans and agents in both single-agent and multi-agent environments.
5. Controllability: Conventional frameworks often control agent behavior only through system prompts. AGENTS introduces the concept of standard operating procedures (SOPs). Like their real-world counterparts, these SOPs are thorough step-by-step guides dictating agent tasks and actions. Such detailed plans can be produced by an LLM and subsequently modified by users.
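To make the long-short-term memory principle more concrete, here is a minimal sketch in plain Python. It is not the AGENTS API: the `ToyMemory` class, the `embed` and `cosine` helpers, and the bag-of-words "embedding" are all assumptions made for illustration. A real setup would use a learned embedding model and an actual vector database.

```python
import math
from collections import Counter, deque


def embed(text):
    """Toy bag-of-words 'embedding' (assumption); real systems use a learned model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyMemory:
    """Hypothetical long-short-term memory, not the AGENTS implementation."""

    def __init__(self, scratchpad_size=3):
        self.long_term = []  # (vector, text) pairs, searched semantically
        self.scratchpad = deque(maxlen=scratchpad_size)  # only the most recent items

    def remember(self, text):
        self.long_term.append((embed(text), text))
        self.scratchpad.append(text)

    def search(self, query, k=1):
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.long_term, key=lambda item: cosine(q, item[0]), reverse=True)
        return [text for _, text in ranked[:k]]


memory = ToyMemory(scratchpad_size=2)
for note in ["user likes green tea", "meeting moved to friday", "user dislikes coffee"]:
    memory.remember(note)

best_match = memory.search("when is the meeting")
recent = list(memory.scratchpad)
```

The long-term store keeps everything and is queried by similarity, while the scratchpad is a bounded buffer that drops older entries as new ones arrive.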
These are quite a few concepts! However, the initial implementation of AGENTS makes building intelligent agents that use these capabilities quite simple.
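The dynamic scheduling idea from the list above can be sketched in a few lines. In AGENTS the decision is made by a controller agent backed by an LLM; the rule-based `Controller` below is only a stand-in, and every name in it (`Turn`, `Controller`, `next_speaker`, the `@name` mention convention) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    speaker: str
    text: str


@dataclass
class Controller:
    """Hypothetical moderator; a real one would ask an LLM to choose."""
    roles: dict  # agent name -> role description
    history: list = field(default_factory=list)

    def next_speaker(self):
        # Rule 1: if the last message addresses another agent, that agent goes next.
        if self.history:
            last = self.history[-1]
            for name in self.roles:
                if name != last.speaker and f"@{name}" in last.text:
                    return name
        # Rule 2: otherwise pick the agent who has been silent the longest.
        last_spoke = {name: -1 for name in self.roles}
        for index, turn in enumerate(self.history):
            last_spoke[turn.speaker] = index
        return min(self.roles, key=lambda name: last_spoke[name])


controller = Controller(
    roles={"writer": "drafts code", "reviewer": "reviews code", "tester": "runs tests"}
)
first = controller.next_speaker()  # "writer": nobody has spoken yet
controller.history.append(Turn("writer", "Draft done, @tester please run it"))
second = controller.next_speaker()  # "tester": addressed in the last message
controller.history.append(Turn("tester", "All tests pass"))
third = controller.next_speaker()  # "reviewer": the only agent who has not spoken
```

The point of the sketch is that the next speaker depends on roles and conversation history rather than on a fixed rotation.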
Programming with AGENTS
The AGENTS framework is structured around three principal classes: Agent, SOP (Standard Operating Procedure), and Environment. These classes are initialized from a config file written in plain text. Let's delve deeper into the architecture of AGENTS and its coding foundation.
1. Agent Class: Serving as the core of the AGENTS framework, the Agent class encapsulates the features and behaviors of a language agent. As visualized in Figure 1's UML diagram, the agent manages its own long-short-term memory. Within this class, methods enable the agent to:
- Engage with its surroundings (agent._observe(environment))
- Take actions based on its prevailing state (agent._act())
- Revise its memory data (agent._update_memory()).
For a simplified experience, the above functionalities are integrated into the agent.step() method.
2. SOP (Standard Operating Procedure) Class: This class paints a broader picture, charting out the agent's state progression. Each state in the SOP class maps to a specific sub-goal or sub-task that agents need to achieve. Every state, constructed as a "State" class object, houses specialized prompts that help the agent harness the capabilities of an LLM. Additionally, it provides a toolkit of APIs to be utilized within that state.
3. Environment Class: Acting as the backdrop for the agents, this class offers a representation of the external conditions that agents operate within. The class primarily splits into two functional aspects:
- The environment._observed() function defines how the environment influences the agent's actions, that is, what information is passed to the agent when it observes the environment.
- The environment.update() function defines how the agent's actions affect the environment.
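The interplay of the three classes can be sketched roughly as follows. This is a simplified illustration, not the library's code: the method names `_observe`, `_act`, `_update_memory`, `step`, `_observed`, and `update` come from the description above, but their signatures, the `State` fields, the `advance` helper, and the stubbed `fake_llm` are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class State:
    """One SOP state: a sub-goal and the prompt used while in it (fields assumed)."""
    name: str
    prompt: str
    next_state: Optional[str] = None  # None marks the final state


class SOP:
    def __init__(self, states, start):
        self.states = {s.name: s for s in states}
        self.current = self.states[start]

    def advance(self):
        # Hypothetical helper: move to the next state, or stay in the final one.
        if self.current.next_state is not None:
            self.current = self.states[self.current.next_state]


class Environment:
    def __init__(self):
        self.messages = []

    def _observed(self):
        """What an agent gets to see; here, the two most recent messages."""
        return self.messages[-2:]

    def update(self, agent_name, action):
        """Record the effect of an agent's action on the shared state."""
        self.messages.append(f"{agent_name}: {action}")


class Agent:
    def __init__(self, name, llm):
        self.name = name
        self.llm = llm  # any callable mapping a prompt to text
        self.memory = []
        self.observation = []

    def _observe(self, environment):
        self.observation = environment._observed()

    def _act(self, state):
        return self.llm(f"{state.prompt}\nContext: {self.observation}")

    def _update_memory(self, action):
        self.memory.append(action)

    def step(self, environment, sop):
        """Observe, act, update memory, then apply the action to the environment."""
        self._observe(environment)
        action = self._act(sop.current)
        self._update_memory(action)
        environment.update(self.name, action)
        sop.advance()
        return action


def fake_llm(prompt):
    # Stub standing in for a real LLM call: echoes the first prompt line.
    return "ack: " + prompt.splitlines()[0]


sop = SOP(
    [State("greet", "Greet the user", "help"), State("help", "Answer the question")],
    start="greet",
)
env = Environment()
agent = Agent("assistant", fake_llm)
first = agent.step(env, sop)
second = agent.step(env, sop)
```

Each call to `step` runs one observe-act-remember cycle under the prompt of the current SOP state, which is how the SOP constrains what the agent does at each stage.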
def main():
    # agents is a dict of one or multiple agents.
    agents = Agent.from_config("./config.json")
    sop = SOP.from_config("./config.json")
    environment = Environment.from_config("./config.json")
    run(agents, sop, environment)
The AGENTS framework not only embodies the foundational principles discussed, such as tool incorporation and multi-agent communication, but it also gives prominence to the human-agent interface. Notably, by merely adjusting the "is_human" attribute in the config file to "True", AGENTS offers the flexibility for users to step into an agent's shoes. Such an arrangement allows for dynamic interaction between human users and other language agents within the given environment, all via a dedicated console interface.
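As a rough illustration of how such a flag could switch an agent between a person and an LLM, consider the sketch below. Only the attribute name `is_human` comes from the text; the JSON layout, the `build_responders` function, and the `read_line` parameter are hypothetical and do not reflect the actual AGENTS config schema.

```python
import json

# Hypothetical config layout; only the "is_human" key is taken from the article.
CONFIG_TEXT = """
{
  "agents": {
    "customer": {"role": "asks questions", "is_human": true},
    "support": {"role": "answers questions", "is_human": false}
  }
}
"""


def build_responders(config, llm, read_line=input):
    """Map each agent name to a callable that produces its next message."""
    responders = {}
    for name, spec in config["agents"].items():
        if spec.get("is_human", False):
            # A person supplies this agent's turns through the console.
            responders[name] = lambda prompt, n=name: read_line(f"[{n}] {prompt}\n> ")
        else:
            # Otherwise the turn is generated by the language model.
            responders[name] = llm
    return responders


config = json.loads(CONFIG_TEXT)
responders = build_responders(
    config,
    llm=lambda prompt: "llm: " + prompt,
    read_line=lambda prompt: "typed by a person",  # stand-in for console input
)
human_reply = responders["customer"]("How can I help?")
model_reply = responders["support"]("My order is late")
```

Passing `read_line` as a parameter keeps the sketch testable without a console; leaving it at its default would read the human's turn with the built-in `input`.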
In terms of deployment, AGENTS supports serving agents as APIs with FastAPI. Both single-agent and multi-agent configurations can be deployed this way.