
AI Agents in Enterprise: A Journey Into AI-Generated Hell

Author(s): ravindu somawansa

Originally published on Towards AI.

When I was first asked to build AI agents at my company, I thought I would spend most of my time designing and coding the best agents possible. Instead, I got never-ending meetings, impossible expectations, and endless debates about workflow and organization.

Two years later, with lots of AI agents and painful experience behind me, here are some lessons for you.

Welcome to my journey through AI-generated hell.

The Illusions of Rapid Products

Using all available frameworks (MCP, A2A, LangChain, LlamaIndex, CrewAI…), AI providers (OpenAI, Anthropic, Mistral, Groq…), and cloud providers (GCP, AWS, Azure…), it is extremely easy to create a Proof of Concept (PoC). If you know what you’re doing, you can literally build a Multi-Agent AI system or a RAG app in 5 minutes and deploy it on the cloud in 10.

But that does not mean you have a product. You just have something to make nice demos.

To have a product, you need:

  • User Management: You need to integrate with your company’s authentication system (Okta, Azure AD…) to manage users and permissions. There’s no skipping single sign-on (SSO).
  • Security & Monitoring: Your product must follow proper security guidelines (pentests…), handle personal information (PII) correctly, and monitor everything happening. Even if you think you don’t need it, you will.
  • Scalability & Pricing: The product must handle all potential users, whether hundreds, thousands, or millions. That means lots of load tests. You’ll constantly be asked, “How much will it cost?” until it haunts your dreams.
  • User Feedback: Your product needs to record all user feedback, because you always need to listen to users’ complaints.
  • Support: You need to handle new users, clearly specify responsibilities, and set SLAs (Yes, we hate that!).
  • Onboarding: Ensure your users know how to use your tool. Trust me, they know less than you imagine.
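To make the “How much will it cost?” question less haunting, here is a minimal back-of-the-envelope sketch. Everything in it is an assumption for illustration: the class and function names are hypothetical, the token counts and prices are made up (they are not any provider's real rates), and the estimate covers only LLM token spend, not hosting, storage, or people.

```python
from dataclasses import dataclass


@dataclass
class UsageAssumptions:
    """Illustrative inputs only; replace every value with your own measurements."""
    users: int
    requests_per_user_per_day: float
    input_tokens_per_request: int
    output_tokens_per_request: int
    usd_per_million_input_tokens: float   # assumed price, not a real rate
    usd_per_million_output_tokens: float  # assumed price, not a real rate


def monthly_llm_cost_usd(a: UsageAssumptions, days: int = 30) -> float:
    """Estimate monthly LLM spend from token volume alone."""
    requests = a.users * a.requests_per_user_per_day * days
    input_cost = requests * a.input_tokens_per_request / 1_000_000 * a.usd_per_million_input_tokens
    output_cost = requests * a.output_tokens_per_request / 1_000_000 * a.usd_per_million_output_tokens
    return input_cost + output_cost


if __name__ == "__main__":
    # Same per-request profile at three audience sizes.
    for users in (100, 1_000, 1_000_000):
        assumptions = UsageAssumptions(
            users=users,
            requests_per_user_per_day=10,
            input_tokens_per_request=2_000,
            output_tokens_per_request=500,
            usd_per_million_input_tokens=1.0,
            usd_per_million_output_tokens=4.0,
        )
        print(f"{users:>9} users: ~${monthly_llm_cost_usd(assumptions):,.2f} per month")
```

The point of writing it down is less the number itself than the list of assumptions behind it: when someone challenges the estimate, you can argue about requests per user or tokens per request instead of arguing about a total.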

All of this will take significantly longer (10x, 20x, or even 100x) than the PoC. And these are just the basics any product should have.

Now let’s dive into specifics for AI products.

The Shackles of Evaluation

When using public AI tools like ChatGPT, have you ever wondered if the answers were wrong? No, you just assume it works, right? This is normal, but it can be disastrous when naive users ask critical questions, assuming your product perfectly understands the company’s data and business model.

That’s why you need an essential component: evaluation.

Evaluation measures different metrics of the product. Many metrics exist, but ultimately, the goal is measuring answer quality. For accurate evaluation, you need two things:

  • Gold Standard: A set of questions and answers validated by business owners, covering all use cases. It can essentially be your bot’s specification. More questions mean better evaluation, and this standard should evolve with new documents and requirements.
  • Evaluation Metrics: These metrics validate your product’s answers. Many exist, depending on usage, such as LLM-as-Judge (using an LLM to assess correctness), answer relevance, faithfulness (ensuring no hallucinated facts), and more. (You can find a good list online.)
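The two ingredients above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a recommended setup: the gold-standard entries are invented, `toy_bot` is a placeholder for the real agent, and the token-overlap F1 score is a deliberately crude stand-in for metrics such as LLM-as-Judge or faithfulness.

```python
from typing import Callable

# Hypothetical gold standard: question/answer pairs validated by business owners.
GOLD_STANDARD = [
    {"question": "What is the refund window?",
     "answer": "Refunds are accepted within 30 days of purchase."},
    {"question": "Who approves travel expenses?",
     "answer": "The line manager approves travel expenses."},
]


def token_f1(predicted: str, expected: str) -> float:
    """Token-overlap F1 between two answers (crude stand-in for a real metric)."""
    pred = predicted.lower().split()
    gold = expected.lower().split()
    if not pred or not gold:
        return 0.0
    remaining = list(gold)
    overlap = 0
    for tok in pred:
        if tok in remaining:       # count each gold token at most once
            remaining.remove(tok)
            overlap += 1
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


def evaluate(bot: Callable[[str], str], gold_standard: list[dict], threshold: float = 0.7) -> dict:
    """Run every gold question through the bot and report the share of passing answers."""
    if not gold_standard:
        raise ValueError("gold standard is empty")
    scores = [token_f1(bot(item["question"]), item["answer"]) for item in gold_standard]
    passed = sum(score >= threshold for score in scores)
    return {"pass_rate": passed / len(scores), "scores": scores}


def toy_bot(question: str) -> str:
    """Placeholder agent: answers the refund question correctly, nothing else."""
    if "refund" in question.lower():
        return "Refunds are accepted within 30 days of purchase."
    return "I do not know."


if __name__ == "__main__":
    print(evaluate(toy_bot, GOLD_STANDARD))
```

Keeping the gold standard as plain data is a deliberate choice in this sketch: business owners can review and extend it without touching the metric code.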

Evaluation itself isn’t difficult, but convincing business owners why it’s necessary and generating the gold standard can be extremely painful (weeks of tears sometimes).

Moreover, business owners must own this evaluation and gold standard, as only they can update it so that the product keeps answering correctly after any change to the data corpus.

The Constant Wrestling with Uncertainty

Imagine a new project to build AI agents (RAG or otherwise). You must analyze needs, propose an architecture, and provide a quote.

But how do you ensure your approach and quotation are accurate? GenAI isn’t mature enough yet, so there is not enough experience to draw on. A use case might seem similar to a previous one, but differing data, additional languages, or minor use-case differences can break everything. You’ll build something, realize it doesn’t work, and have to rebuild it completely.

This isn’t a problem for engineers, but project managers and product owners hate uncertainty. They want a clear quotation and timeline (preferably short) and want to stick to it. But AI inherently has tons of uncertainty.

One solution is short experiments (or spikes) early in the project to test and validate uncertainties. At the end of these spikes, you usually know if something works or needs more time.
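One way to keep a spike honest is to write down its question, time box, and success threshold before starting. The sketch below is only an illustration of that idea: the `Spike` class, the `decide` function, and the three outcome labels are hypothetical, and the score is assumed to come from whatever evaluation you run on a small sample.

```python
from dataclasses import dataclass


@dataclass
class Spike:
    question: str        # the uncertainty being tested
    timebox_days: int    # budget agreed before the spike starts
    target_score: float  # evaluation score that counts as "it works"


def decide(spike: Spike, measured_score: float, days_spent: int) -> str:
    """Turn a spike result into one of three outcomes."""
    if measured_score >= spike.target_score:
        return "proceed"    # the approach works, plan the real build
    if days_spent < spike.timebox_days:
        return "continue"   # still inside the time box, keep experimenting
    return "rethink"        # time box exhausted, change approach or re-quote


if __name__ == "__main__":
    spike = Spike(
        question="Does retrieval work on the French documents?",
        timebox_days=5,
        target_score=0.8,
    )
    print(decide(spike, measured_score=0.6, days_spent=5))
```

Fixing the threshold up front is the design choice that matters here: it turns "needs more time" into a visible decision rather than a slow drift.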

The Weight of Project Dynamics

To manage AI projects, new roles and processes have emerged.

We discussed evaluation and uncertainty handling, but these must integrate into your project dynamics. This means explaining new processes not just to the team but to business owners and even clients.

Regarding new roles, many have appeared, often with “AI” in their titles. Some are temporary or hyped (like “Prompt Engineer”), while others will stay. One critical role is the AI Engineer.

This person implements the AI parts of products. They must understand how LLMs really work (beyond theory), know RAG and agentic patterns, and have experience with various frameworks, AI providers, and cloud providers. They must know AI tools and their limits. You don’t wake up one morning as an AI Engineer; it requires self-learning, hands-on implementation, and constant curiosity since AI evolves quickly.

But you now have numerous roles: full-stack developer, DevOps, data engineer, ML engineer, data scientist, and now AI engineer. Determining role boundaries (often more person-dependent than role-dependent) and allocating workloads correctly is essential.

All these new roles and processes highlight the core problem with AI products: the real pain isn’t the technology, but organizational change (as usual).

Conclusion

After two years navigating this AI-generated hell, I can confidently say building AI agents for enterprise isn’t just technically challenging — it’s a multi-front battle. From rapid PoC illusions, tedious yet crucial evaluations, wrestling with architectural uncertainty, to shifting organizational dynamics — nothing is simple.

If you’re starting this journey, prepare yourself. Remember, the real challenge isn’t coding AI agents — it’s handling people, from your team’s demands, business owners’ expectations, to client needs.

In short, building enterprise-ready AI products is as much about human challenges as it is about AI.

Welcome to AI-generated hell!

👉 If you enjoyed this article and want to read more about AI, reasoning models, and Multi-Agent systems, follow me here on Medium or connect with me directly on LinkedIn!


Published via Towards AI



Note: Content contains the views of the contributing authors and not Towards AI.