The Hidden Step Before AGI Will Change Everything
Last Updated on January 14, 2025 by Editorial Team
Author(s): Frederik Bussler
Originally published on Towards AI.
In 2022, when OpenAI released ChatGPT, tech forecasters around the world were consumed by predictions of artificial general intelligence. The race to build machines that could think like humans captured the imagination of Silicon Valley and beyond. Over two years and countless AI breakthroughs later, that race continues.
Yet last week, when H2O.ai's agentic AI h2oGPTe achieved 65% of human-level performance on complex business tasks, outperforming both Google and Microsoft, it exposed an overlooked truth about AI. Before machines can think like humans, they need to learn to work like them. A chat-based AI can be useful, sure. But what's a lot more useful is an AI that can actually do work for you. This evolutionary step, largely ignored in the rush toward AGI, could reshape enterprise technology faster than anyone expected.
The Real World Test
The GAIA benchmark, released by researchers from Meta AI and Hugging Face in late 2023, was designed to measure something deceptively simple: can AI systems handle the kind of multi-step tasks that fill our workdays? Not theoretical problems or multiple-choice tests, but the messy reality of analyzing data, researching complex topics, and synthesizing information across different formats and sources.
For months, the results were middling. Even advanced AI systems from tech giants struggled to achieve more than 40% of human-level performance. Google's latest attempt reached 49%. Microsoft's most advanced agent, using OpenAI's cutting-edge models, managed only 38%. These scores showed that despite all the AGI hype, AI still couldn't handle the basic tasks that human knowledge workers perform every day.
That's what makes H2O.ai's recent 65% breakthrough so significant. Using a combination of Anthropic's language models and specialized tools for tasks like code execution and data analysis, their AI agent demonstrated unprecedented capability in handling real-world complexity. More importantly, it did so without the need for complex orchestration or specialized workflows, suggesting that AI agents are finally ready to move from research labs into practical business applications.
Breaking Through
This evolution in AI capability goes far beyond benchmark scores. Major enterprises are already seeing the impact. "Agentic AI is eating SaaS and with h2oGPTe Agentic AI now being generally available, all our enterprise customers can solve a wide range of sophisticated business and research problems," explains Sri Ambati, H2O.ai's CEO. This perspective challenges the traditional narrative around artificial intelligence, which has largely focused on the distant promise of AGI rather than the immediate potential of capable AI agents.
The implications extend beyond individual companies. Microsoft's Nadella has begun describing a future where traditional software applications are replaced by AI agents that can work across multiple tools and databases. Imagine asking an AI to "analyze our Q4 performance, identify concerning trends, and prepare a board presentation," and having it actually execute each step competently. With current agent technology reaching 65% of human capability, this future is less science fiction and more like an inevitable next step in enterprise evolution.
The Next Wave of AI
For enterprise leaders, this inflection point in AI agent capability raises pressing strategic questions. While the path from today's 65% human-level performance to AGI remains uncertain, the immediate impact of competent AI agents is becoming clear. Companies that successfully integrate these systems could gain significant advantages in operational efficiency and decision-making speed.
Consider the evolution of enterprise software. Traditional SaaS applications essentially function as "CRUD databases with business logic," as Nadella puts it. But AI agents can work across these siloed systems, orchestrating complex workflows that previously required multiple specialized tools and human intervention. A single AI agent might pull data from Salesforce, analyze it using internal business intelligence tools, cross-reference findings against market research databases, and synthesize everything into actionable recommendations, all while maintaining context and adapting to changing requirements.
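To make that workflow concrete, here is a minimal Python sketch of the pattern: a chain of steps that all read from and write to one shared context. It is an illustration only, not how h2oGPTe or any vendor's agent is built. Every name in it (`AgentContext`, `pull_crm_data`, `run_agent`, and so on) is hypothetical, the deal and benchmark figures are made up, and a real agent would call actual CRM and BI systems and let a language model decide which step to take next.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentContext:
    """Shared state that every step can read and extend (hypothetical design)."""
    goal: str
    notes: dict = field(default_factory=dict)
    log: list = field(default_factory=list)


Step = Callable[[AgentContext], None]


def pull_crm_data(ctx: AgentContext) -> None:
    # Stand-in for a CRM query; these deal records are invented.
    ctx.notes["deals"] = [
        {"region": "EMEA", "amount": 120},
        {"region": "APAC", "amount": 80},
        {"region": "EMEA", "amount": 50},
    ]


def analyze(ctx: AgentContext) -> None:
    # Stand-in for a BI tool: total the deal amounts per region.
    totals: dict = {}
    for deal in ctx.notes["deals"]:
        totals[deal["region"]] = totals.get(deal["region"], 0) + deal["amount"]
    ctx.notes["totals"] = totals


def cross_reference(ctx: AgentContext) -> None:
    # Stand-in for a market research lookup; benchmark values are invented.
    benchmark = {"EMEA": 150, "APAC": 100}
    ctx.notes["below_market"] = sorted(
        region
        for region, total in ctx.notes["totals"].items()
        if total < benchmark.get(region, 0)
    )


def synthesize(ctx: AgentContext) -> None:
    # Turn the accumulated findings into a one-line recommendation.
    flagged = ctx.notes["below_market"]
    ctx.notes["recommendation"] = (
        "Review: " + ", ".join(flagged) if flagged else "No regions below benchmark"
    )


def run_agent(goal: str, steps: list) -> AgentContext:
    """Run each step in order, carrying one context through the whole chain."""
    ctx = AgentContext(goal=goal)
    for step in steps:
        step(ctx)
        ctx.log.append(step.__name__)
    return ctx


if __name__ == "__main__":
    result = run_agent(
        "Summarize regional sales against market benchmarks",
        [pull_crm_data, analyze, cross_reference, synthesize],
    )
    print(result.notes["recommendation"])
```

The point of the sketch is the shared context: each step builds on what earlier steps recorded, which is what separates an agent-style workflow from a set of disconnected tool calls.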
The implications for workforce transformation are equally profound. Rather than replacing knowledge workers outright, AI agents are emerging as powerful collaborators that can handle the routine aspects of complex tasks. This frees human workers to focus on higher-level strategy, creativity, and relationship building.
The Path Forward
The race toward AGI will undoubtedly continue. But H2O.ai's breakthrough on the GAIA benchmark reveals a more immediate revolution: AI agents that can actually handle the complexity of real-world work. As these systems improve from today's 65% mark toward human-level performance on specific tasks (and beyond), they'll reshape how enterprises operate, how software is built, and how knowledge work gets done.
This transformation won't happen overnight. But unlike the nebulous timeline for AGI, the path forward for AI agents is clear and measurable. Companies that recognize this "hidden step" in AI's evolution will be better positioned for the future, regardless of when true AGI arrives. The next few years will be less about chasing sci-fi dreams of human-like AI and more about building practical systems that can work alongside humans effectively.
For enterprise leaders watching the AI space, 2025 may well be remembered as the year when AI agents finally proved they could do the work, long before they learned to think.