
Inference Wars: Agentic Flows vs Large Context Windows

Last Updated on January 3, 2025 by Editorial Team

Author(s): Claudio Mazzoni

Originally published on Towards AI.

Image generated by the author via DALL·E.

Two schools of thought are battling it out, and the outcome will define how we interact with AI for years to come.

More than a century ago, Thomas Edison and the Serbian-born Nikola Tesla each argued, in a series of public claims, that the type of current they pioneered was the best. Edison championed direct current (DC) and Tesla alternating current (AC). The debate sparked a race not only to prove their respective points but to smear and delegitimize the other party, often in vicious ways. In the end, Tesla's alternating-current system won, as it could travel longer distances with less energy loss. The outcome was cemented when the organizers of the 1893 Chicago World's Fair chose AC to light the event.

We know this period as the Current Wars.

Today, in the groundbreaking world of AI and LLMs, we are experiencing something similar, albeit with less toxicity.

The main trends in AI research aim to address two fundamental factors:

Context size.

The logic and reasoning capabilities of LLMs.

These two factors fundamentally shape how we use LLMs.

Do we use them as context-grounded question-answering tools? Or as assistants: automatons capable of using logic and reasoning to come up with insights based on our data and tasks?

Giants like Microsoft, Google, Nvidia, and OpenAI believe the future lies with large context windows: models trained with billions of parameters on as much data as possible, fine-tuned with feedback from fleets of experts, and engineered to retain precision over ever larger bodies of text, allowing them to recall even the most minute piece of information, like finding a needle in a haystack.

On the other hand, thought leaders like Andrew Ng, Harrison Chase (creator of 'LangChain' and 'LangGraph,' the most popular LLM frameworks today), and João Moura (creator of the leading agentic framework 'CrewAI') believe in the power of automated assistants. They advocate for an assembly-line-like approach, where tasks are broken down and handled using prompts and content retrieved through Retrieval-Augmented Generation (RAG). This agentic method, they argue, delivers superior results for complex tasks.
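For readers who have not worked with RAG, a minimal sketch of the pattern may help: embed and rank documents against the question, then generate an answer grounded in the top matches. This is a generic illustration, not any particular framework's API; `embed` and `call_llm` are hypothetical callables you would wire to real embedding and chat models.

```python
# Minimal RAG sketch. embed() and call_llm() are hypothetical
# placeholders, passed in by the caller, not real library calls.
import math
from typing import Callable

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def rag_answer(question: str, docs: list[str],
               embed: Callable[[str], list[float]],
               call_llm: Callable[[str], str], k: int = 3) -> str:
    # 1. Retrieve: rank documents by similarity to the question.
    q = embed(question)
    ranked = sorted(docs, key=lambda d: cosine(embed(d), q), reverse=True)
    # 2. Augment: put the top-k passages into the prompt.
    context = "\n---\n".join(ranked[:k])
    # 3. Generate: answer grounded only in the retrieved context.
    return call_llm(f"Answer using only this context:\n{context}"
                    f"\n\nQuestion: {question}")
```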

Understanding LLM Agents and How They Work

An 'agent' is a role-prompted instance of a large language model (LLM) designed to perform specific tasks autonomously. Agents work by breaking a task down into smaller, manageable components and then executing each one step-by-step, sometimes checking their own output, and often using tools (for example, web search engines), predefined prompts, and external information-retrieval mechanisms.
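To make that loop concrete, here is a deliberately simplified sketch in Python. Both `call_llm` and `web_search` are hypothetical placeholders, not real library calls, and the control flow is a bare-bones version of what frameworks like LangGraph or CrewAI implement with far more structure.

```python
# A minimal agent loop sketch. All functions here are hypothetical
# placeholders, not a real framework's API.

def call_llm(prompt: str) -> str:
    """Placeholder for a call to your LLM provider of choice."""
    raise NotImplementedError("wire this to your model API")

def web_search(query: str) -> str:
    """Placeholder for a search tool the agent may invoke."""
    raise NotImplementedError("wire this to a search API")

def run_agent(task: str, max_steps: int = 5) -> str:
    # 1. Role induction: the system prompt assigns the agent its role.
    history = [f"You are a research assistant. Task: {task}"]
    for _ in range(max_steps):
        # 2. Ask the model for its next step: a tool call or an answer.
        decision = call_llm(
            "\n".join(history)
            + "\nReply with either SEARCH: <query> or ANSWER: <answer>."
        )
        if decision.startswith("SEARCH:"):
            # 3. Tool use: run the search and feed the observation back.
            result = web_search(decision.removeprefix("SEARCH:").strip())
            history.append(f"Observation: {result}")
        else:
            # 4. Self-check before returning (the 'checking itself' step).
            verdict = call_llm(f"Is this answer consistent with the task "
                               f"'{task}'? Answer yes or no.\n{decision}")
            if verdict.strip().lower().startswith("yes"):
                return decision.removeprefix("ANSWER:").strip()
            history.append("Critique: answer rejected, try again.")
    return "No confident answer within the step budget."
```

The break-down / act / observe / check cycle is the core idea; real agentic frameworks add state graphs, structured tool schemas, and multi-agent coordination on top of it.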

Advocates of both agentic approaches and large context windows are investing millions of dollars in research and are constantly pushing the field of artificial intelligence forward.

AI is currently developing far too fast to summarize in this article; others do a much better job and still struggle to keep up with everything as it is released. Having said that, I want to shift the attention elsewhere.

At the end of the day, most of us don't really think about these inference wars being fought for our attention. Yet depending on your goals, one approach will suit your tasks better than the other. How do you go about figuring out which one it is?

In this article, we will take a deeper look at what type of LLM application architecture is right for you.

Choosing the Right LLM Application Architecture

When it comes to selecting the appropriate LLM application architecture, understanding your specific needs and goals is crucial. Here are some key considerations to help you make an informed decision:

Task Complexity:

  • If your tasks require question-answering capabilities or straightforward information retrieval, a large-context-window LLM might be more suitable. Models with smaller context windows will require more engineering and will be more brittle for the task. For example, retrieving a small subset of information from a large body of documents can easily be done with models like Gemini, which offers a 1-million-token context window, whereas an agentic approach might require RAG plus extra engineering to do the same, adding complexity and latency.
  • For complex tasks that demand logical reasoning, problem-solving, and multi-step processes, an agentic approach combined with RAG might be more effective and less prone to errors. It allows for more dynamic and adaptable problem-solving strategies, which are sometimes required, especially for multi-hop tasks. For example, finding out how old the founder of McDonald's was when he opened his first restaurant requires multiple steps: first find out when he was born, then when he founded McDonald's, and finally calculate the difference (see the sketch after this list).
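Here is a hedged sketch contrasting the two shapes of control flow on that question. Again, `call_llm` and `retrieve` are hypothetical stand-ins for a model API and a RAG retriever; the point is the architecture, not the specific calls.

```python
# Hypothetical helpers: call_llm sends a prompt to some model and
# retrieve fetches relevant passages from a document store (RAG).
# Both are placeholders, not real library calls.
def call_llm(prompt: str) -> str: ...
def retrieve(query: str) -> str: ...

# Large-context approach: put all the documents into one huge prompt
# and ask once, relying on the model's in-context recall.
def long_context_answer(documents: str, question: str) -> str:
    return call_llm(f"{documents}\n\nQuestion: {question}")

# Agentic, multi-hop approach: decompose the question, retrieve per
# sub-question, and finish with an explicit arithmetic step.
def agentic_answer() -> str:
    birth = call_llm("Extract the birth year as a bare number:\n"
                     + retrieve("When was the founder of McDonald's born?"))
    opened = call_llm("Extract the year as a bare number:\n"
                      + retrieve("When did he open his first restaurant?"))
    return f"He was about {int(opened) - int(birth)} years old."
```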

Scalability:

  • Large-context-window LLMs are designed to handle vast amounts of data, making them ideal for applications that require extensive knowledge bases and high scalability. However, this can come at a steep price if their use is not carefully managed.
  • Agentic frameworks, on the other hand, offer flexibility and modularity, making it easier to scale specific tasks or integrate new functionalities without overhauling the entire system.

Customization and Adaptability:

  • If your application demands high levels of customization and adaptability, agentic frameworks like CrewAI or LangGraph provide the tools to create tailored solutions that can evolve with your needs. For example, you can create new agents to handle new tasks (see the sketch after this list).
  • Large-context-window models, while powerful, may require significant effort to fine-tune and adapt to specific requirements, limiting their flexibility.
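For instance, adding a new agent in CrewAI looks roughly like the following, based on its documented Agent/Task/Crew pattern. Treat it as a sketch: parameter details vary between CrewAI versions, and the role, goal, and task strings here are invented for illustration.

```python
from crewai import Agent, Task, Crew

# A new agent added to handle a new task, without touching the rest of
# the system. The role/goal/backstory strings are illustrative only.
researcher = Agent(
    role="Research Analyst",
    goal="Summarize what changed in a product release",
    backstory="A meticulous analyst who writes terse summaries.",
)

summarize = Task(
    description="Summarize the release notes for version 2.0.",
    expected_output="A five-bullet summary.",
    agent=researcher,
)

crew = Crew(agents=[researcher], tasks=[summarize])
result = crew.kickoff()  # runs the agent(s) and returns the task output
print(result)
```

Scaling to a new capability then means appending another Agent/Task pair to the crew, rather than re-engineering one monolithic prompt.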

Resource Availability:

  • Consider the resources at your disposal. Large-context-window models require substantial computational power, and the organizations providing them typically charge by the number of input and output tokens (units of text processed by the model). Repeatedly sending very long prompts can therefore add up to a large bill by the end of the day.
  • Agentic frameworks, while still resource-intensive due to their multi-generation, self-correcting nature, may offer more cost-effective solutions for certain applications, especially when engineered to reuse previous content and retrievals (a rough cost comparison is sketched after this list).
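As a back-of-the-envelope illustration, here is a tiny cost calculator. The per-token prices are hypothetical placeholders, not any vendor's actual rates; substitute your provider's real pricing.

```python
# Hypothetical prices, NOT real vendor rates: dollars per 1M tokens.
PRICE_IN_PER_M = 3.00    # input tokens
PRICE_OUT_PER_M = 15.00  # output tokens

def call_cost(tokens_in: int, tokens_out: int) -> float:
    """Cost of a single model call in dollars."""
    return (tokens_in * PRICE_IN_PER_M + tokens_out * PRICE_OUT_PER_M) / 1e6

# Long-context approach: one call stuffing ~900k tokens of documents
# into the prompt.
long_context = call_cost(tokens_in=900_000, tokens_out=1_000)

# Agentic approach: 6 smaller calls, each seeing ~4k retrieved tokens.
agentic = sum(call_cost(tokens_in=4_000, tokens_out=500) for _ in range(6))

print(f"long-context call: ${long_context:.2f}")  # ~$2.72
print(f"agentic pipeline:  ${agentic:.2f}")       # ~$0.12
```

The ratio can flip if an agent loops many times or re-sends its whole history on every call, which is exactly why reusing previous content and retrievals matters.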

In the end, just as in the Current Wars of 1893, the solution that wins is the one that best fits the use case, and for LLMs that use case is your own. Whether you opt for the extensive knowledge capabilities of large-context-window models or the dynamic adaptability of agentic frameworks, the key is to stay informed and agile in this rapidly advancing field.


Published via Towards AI
