The Hidden Step Before AGI Will Change Everything

Last Updated on January 14, 2025 by Editorial Team

Author(s): Frederik Bussler

Originally published on Towards AI.

Photo by Lindsay Henwood on Unsplash

In 2022, when OpenAI released ChatGPT, tech forecasters around the world became preoccupied with predictions of artificial general intelligence (AGI). The race to build machines that could think like humans captured the imagination of Silicon Valley and beyond. Over two years and countless AI breakthroughs later, that race continues.

Yet last week, when H2O.ai’s agentic AI h2oGPTe achieved 65% of human-level performance on complex business tasks, outperforming both Google and Microsoft, it exposed an overlooked truth about AI. Before machines can think like humans, they need to learn to work like them. A chat-based AI can be useful, sure. But what’s a lot more useful is an AI that can actually do work for you. This evolutionary step, largely ignored in the rush toward AGI, could reshape enterprise technology faster than anyone expected.

The Real World Test

The GAIA benchmark, released by researchers from Meta AI and Hugging Face in late 2023, was designed to measure something deceptively simple: can AI systems handle the kind of multi-step tasks that fill our workdays? Not theoretical problems or multiple-choice tests, but the messy reality of analyzing data, researching complex topics, and synthesizing information across different formats and sources.

For months, the results were middling. Even advanced AI systems from tech giants struggled to reach even half of human-level performance. Google’s latest attempt reached 49%. Microsoft’s most advanced agent, using OpenAI’s cutting-edge models, managed only 38%. These scores showed that despite all the AGI hype, AI still couldn’t handle the basic tasks that human knowledge workers perform every day.

That’s what makes H2O.ai’s recent 65% breakthrough so significant. Using a combination of Anthropic’s language models and specialized tools for tasks like code execution and data analysis, their AI agent demonstrated unprecedented capability in handling real-world complexity. More importantly, it did so without the need for complex orchestration or specialized workflows, suggesting that AI agents are finally ready to move from research labs into practical business applications.

Breaking Through

This evolution in AI capability goes far beyond benchmark scores. Major enterprises are already seeing the impact. “Agentic AI is eating SaaS and with h2oGPTe Agentic AI now being generally available, all our enterprise customers can solve a wide range of sophisticated business and research problems,” explains Sri Ambati, H2O.ai’s CEO. This perspective challenges the traditional narrative around artificial intelligence, which has largely focused on the distant promise of AGI rather than the immediate potential of capable AI agents.

The implications extend beyond individual companies. Microsoft CEO Satya Nadella has begun describing a future where traditional software applications are replaced by AI agents that can work across multiple tools and databases. Imagine asking an AI to “analyze our Q4 performance, identify concerning trends, and prepare a board presentation,” and having it actually execute each step competently. With current agent technology reaching 65% of human capability, this future looks less like science fiction and more like an inevitable next step in enterprise evolution.

The Next Wave of AI

For enterprise leaders, this inflection point in AI agent capability raises pressing strategic questions. While the path from today’s 65% human-level performance to AGI remains uncertain, the immediate impact of competent AI agents is becoming clear. Companies that successfully integrate these systems could gain significant advantages in operational efficiency and decision-making speed.

Consider the evolution of enterprise software. Traditional SaaS applications essentially function as “CRUD databases with business logic,” as Nadella puts it. But AI agents can work across these siloed systems, orchestrating complex workflows that previously required multiple specialized tools and human intervention. A single AI agent might pull data from Salesforce, analyze it using internal business intelligence tools, cross-reference findings against market research databases, and synthesize everything into actionable recommendations, all while maintaining context and adapting to changing requirements.
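
To make the workflow above concrete, here is a minimal Python sketch of an agent carrying shared context through four steps: pull, analyze, cross-reference, synthesize. Everything in it is an assumption for illustration: the tool functions, the sample deals, and the market growth figures are hypothetical stand-ins, not H2O.ai's implementation or any real Salesforce or BI API. A real agent would typically let a language model choose and order the steps; this sketch uses a fixed plan to keep the idea visible.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentContext:
    """Shared state the agent carries from one step to the next."""
    goal: str
    notes: dict = field(default_factory=dict)


# Hypothetical tool: stands in for a CRM export (made-up sample data).
def pull_crm_deals(ctx: AgentContext) -> None:
    ctx.notes["deals"] = [
        {"region": "EMEA", "amount": 120_000},
        {"region": "EMEA", "amount": 80_000},
        {"region": "APAC", "amount": 50_000},
    ]


# Hypothetical tool: stands in for a business intelligence query.
def analyze_revenue(ctx: AgentContext) -> None:
    totals: dict = {}
    for deal in ctx.notes["deals"]:
        totals[deal["region"]] = totals.get(deal["region"], 0) + deal["amount"]
    ctx.notes["revenue_by_region"] = totals


# Hypothetical tool: stands in for a market research lookup (made-up figures).
def cross_reference_market(ctx: AgentContext) -> None:
    ctx.notes["market_growth"] = {"EMEA": 0.04, "APAC": 0.11}


# Final step: combine everything gathered so far into a short report.
def synthesize(ctx: AgentContext) -> None:
    lines = []
    for region, revenue in sorted(ctx.notes["revenue_by_region"].items()):
        growth = ctx.notes["market_growth"].get(region, 0.0)
        lines.append(f"{region}: revenue {revenue}, market growth {growth:.0%}")
    ctx.notes["report"] = "\n".join(lines)


# A fixed plan; an LLM-driven agent would decide this order at run time.
PLAN: list[Callable[[AgentContext], None]] = [
    pull_crm_deals,
    analyze_revenue,
    cross_reference_market,
    synthesize,
]


def run_agent(goal: str, steps: list[Callable[[AgentContext], None]]) -> AgentContext:
    """Run each step in order, passing the same context object along."""
    ctx = AgentContext(goal=goal)
    for step in steps:
        step(ctx)
    return ctx


if __name__ == "__main__":
    result = run_agent("Summarize Q4 pipeline by region", PLAN)
    print(result.notes["report"])
```

The point of the shared context object is the "maintaining context" part of the description: each tool reads what earlier tools wrote, so no human has to copy results from one system into the next.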

The implications for workforce transformation are equally profound. Rather than replacing knowledge workers outright, AI agents are emerging as powerful collaborators that can handle the routine aspects of complex tasks. This frees human workers to focus on higher-level strategy, creativity, and relationship building.

The Path Forward

The race toward AGI will undoubtedly continue. But H2O.ai’s breakthrough on the GAIA benchmark reveals a more immediate revolution: AI agents that can actually handle the complexity of real-world work. As these systems improve from today’s 65% mark toward human-level performance on specific tasks (and beyond), they’ll reshape how enterprises operate, how software is built, and how knowledge work gets done.

This transformation won’t happen overnight. But unlike the nebulous timeline for AGI, the path forward for AI agents is clear and measurable. Companies that recognize this “hidden step” in AI’s evolution will be better positioned for the future, regardless of when true AGI arrives. The next few years will be less about chasing sci-fi dreams of human-like AI and more about building practical systems that can work alongside humans effectively.

For enterprise leaders watching the AI space, 2025 may well be remembered as the year when AI agents finally proved they could do the work, long before they learned to think.
