
The AI Industry Is Eating Itself: Nvidia’s $20B Power Play, the End of Scaling, and the $2 Trillion Question

Last Updated on December 29, 2025 by Editorial Team

Author(s): Zoom In AI

Originally published on Towards AI.

Two stories from the past few weeks reveal an AI ecosystem at war with itself. The chips are winning. The economics might not.

The Setup: What Just Happened

On Christmas Eve 2025, while most of the world switched off, Nvidia did the opposite.

Multiple reports say Nvidia agreed to a deal worth around $20 billion to license technology from Groq, an AI chip startup, and hire its founder Jonathan Ross along with president Sunny Madra. Not acquire — license. On paper, Groq remains an independent company under a new CEO. In practice, Nvidia now controls its most interesting ideas and its key people.

At almost the same time, a different kind of shock hit the AI world.

Ilya Sutskever, OpenAI’s co-founder and former chief scientist, declared that “pre-training as we know it will unquestionably end” because we are running out of high-quality data. Geoffrey Hinton and Yann LeCun, two of the three “Godfathers of AI,” have become increasingly blunt: simply scaling large language models (LLMs) on internet text is a dead end.

Taken together, these stories point in the same direction:

The industry is over-rotated on a single bet (infinite LLM scaling), over-exposed to a single vendor (Nvidia), and under-prepared for what happens if both assumptions fail.

Nvidia’s Groq deal is about securing the next phase of AI hardware. The scaling debate is about whether there will be software and business models worth running on it.

Nvidia’s $20 Billion Non-Acquisition

What Nvidia Actually Bought

Based on current reporting, Nvidia effectively acquired three things from Groq:

  • A non-exclusive license to Groq’s LPU (Language Processing Unit) designs
  • The founder and senior leadership, who are moving to Nvidia
  • Time — the ability to integrate inference-optimized ideas into Nvidia’s roadmap faster than if it built them in-house

What Nvidia did not buy is Groq Inc. itself. Formally, Groq remains an independent entity with a new CEO.

Why structure the deal this way?

Nvidia already dominates the AI accelerator market. A full acquisition of a high-profile challenger would have been a flashing red light for antitrust regulators in the U.S. and Europe. A licensing deal plus key hires gives Nvidia what it wants — IP and talent — while keeping Groq technically alive.

From a regulatory perspective, it maintains the appearance of competition. From a strategic perspective, it pulls a differentiated architecture inside the Nvidia orbit before it can mature into something truly threatening.

Why Groq Was Different

Most AI accelerators — including Nvidia’s — are descendants of GPU architectures originally designed for graphics and later adapted for training large neural networks.

Groq took a different path: a deterministic, inference-first architecture.

A few of the design ideas, simplified:

  • Inference-optimized: Built to run trained models at predictable, low latency rather than maximize training throughput
  • Massive on-chip bandwidth: Heavy use of on-chip SRAM with extremely high bandwidth, reducing dependence on slower external memory
  • Deterministic execution: The same latency, every time — critical for real-time systems such as trading, autonomous control, and interactive agents

Groq was not about taking Nvidia’s training crown. It was about owning real-time AI inference just as inference economics are starting to matter more than training.

That is the context for the reported $20B price tag. Groq’s last private valuation was around $6.9B. Nvidia is effectively paying roughly three times that figure to neutralize a differentiated competitor and fold its approach into the Nvidia stack.

It is not a classic acquisition. It is a pre-emptive absorption.

The Inference War Has Started

Public discussion of AI hardware tends to fixate on training: the eye-catching “it cost $100M to train this model” numbers.

But the unit economics of AI are dominated by inference:

  • Training a frontier model is expensive, but done a handful of times
  • Running that model for millions of users is recurring and scales with usage

Estimates of OpenAI’s daily inference spend already run into the hundreds of thousands of dollars, and that was before the latest wave of products. As usage grows, inference will account for the majority of compute spend.
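To see why inference dominates unit economics, a back-of-envelope cost model helps. All numbers below are illustrative assumptions of my own (blended token price, request size, traffic volume), not figures from this article:

```python
# Hypothetical inference cost model. Every constant here is an assumption
# chosen for illustration — not a reported figure.

COST_PER_1M_TOKENS_USD = 2.00   # assumed blended input+output price
TOKENS_PER_REQUEST = 1_500      # assumed average request size
REQUESTS_PER_DAY = 50_000_000   # assumed daily traffic

def daily_inference_cost(cost_per_1m: float, tokens: int, requests: int) -> float:
    """Daily spend = requests * tokens * (price per token)."""
    return requests * tokens * (cost_per_1m / 1_000_000)

cost = daily_inference_cost(COST_PER_1M_TOKENS_USD, TOKENS_PER_REQUEST, REQUESTS_PER_DAY)
print(f"Daily inference spend: ${cost:,.0f}")  # → $150,000 under these assumptions
```

The key structural point: training is a (large) one-time cost, while this number scales linearly with usage, every day, forever.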

That is why inference-optimized hardware matters. It is also why every major cloud provider is building its own chips to reduce reliance on Nvidia.

How the Hyperscalers Are Responding

A high-level snapshot of the custom silicon landscape:

  • Google — TPU v7 “Ironwood”
    Google’s latest TPU generation advertises performance competitive with, and in some workloads slightly ahead of, Nvidia’s B-series GPUs. Anthropic has committed to using up to one million TPUs in a multi-year deal described as being worth “tens of billions of dollars.” Google also runs its own models at scale on TPUs, which creates a tight feedback loop between hardware and software.
  • Amazon — Trainium 2 and 3
    Trainium 2 is generally available; Trainium 3, built on a 3nm process, promises significant performance and efficiency gains. AWS claims its in-house accelerators already underpin a multibillion-dollar business. How much of that is displacement of Nvidia vs. incremental demand remains an open question.
  • Microsoft — Maia
    Maia 200 has reportedly slipped to 2026, with some earlier plans scaled back or reworked. Public reporting suggests friction in pinning down requirements as OpenAI’s needs evolved.
  • Startups — Cerebras, SambaNova, Tenstorrent, others
    Architecturally interesting, but facing long sales cycles, capital constraints, and the gravitational pull of Nvidia’s mature ecosystem (CUDA, cuDNN, libraries, tooling).

The consistent pattern: escaping Nvidia’s orbit quickly is extremely hard. The software ecosystem, tooling, and accumulated developer familiarity are formidable moats.

Nvidia understands this. The Groq transaction is best seen as a move to close a high-quality “escape hatch” before customers can migrate meaningful inference workloads through it.

The Civil War Over Scaling

Underneath all of this is a deeper conflict: does simply scaling up LLMs continue to work, or are we approaching structural limits?

There are three underlying questions:

  1. Do we have enough high-quality training data left to keep scaling in the same way?
  2. Do we still get economically meaningful returns from more compute?
  3. Can synthetic data (AI training on AI-generated content) safely replace human data at scale?

Different leaders are giving starkly different answers.

The “Scaling Is Hitting Limits” View

  • Ilya Sutskever (co-founder and former chief scientist, OpenAI) has said:
    “Pre-training as we know it will unquestionably end. We have but one internet. The data is not growing.”
  • Yann LeCun (long-time Meta Chief AI Scientist, now pursuing alternative architectures) describes large language models as a “dead end” for building truly intelligent systems. They can manipulate language but do not acquire robust models of the physical world from text alone.
  • Geoffrey Hinton (Nobel laureate, former Google) has stressed the need for other learning paradigms — self-play, world models, richer sensory input — and assigns a non-trivial chance (10–20%) that advanced AI systems could lead to catastrophic outcomes if misaligned.

Empirical work supports the data-constraint side of this argument. Forecasts by groups like Epoch AI suggest that high-quality, human-generated text suitable for pre-training will be effectively exhausted between 2026 and 2032, depending on what one counts as “usable.”

Synthetic data helps, but introduces “model collapse” risks as systems train on their own outputs and propagate their own errors.
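The collapse dynamic can be seen in a toy numerical sketch (my own illustration, not from any cited study): fit a Gaussian to a small sample, draw a new sample from the fit, refit, and repeat. With finite samples and no fresh human data, the fitted spread decays across generations and diversity is lost:

```python
# Toy "model collapse" simulation. The tiny sample size is a deliberate
# exaggeration to make the effect fast; real pipelines degrade more slowly.

import random
import statistics

random.seed(0)

N_SAMPLES = 5        # deliberately small to accelerate collapse
N_GENERATIONS = 200

def fit_and_resample(data, n):
    """Fit a Gaussian (mean, std) to data, then draw n samples from the fit."""
    mu = statistics.fmean(data)
    sigma = statistics.stdev(data)
    return [random.gauss(mu, sigma) for _ in range(n)], sigma

data = [random.gauss(0.0, 1.0) for _ in range(N_SAMPLES)]  # the "human" data
sigmas = []
for _ in range(N_GENERATIONS):
    data, sigma = fit_and_resample(data, N_SAMPLES)
    sigmas.append(sigma)

print(f"fitted std, generation 1: {sigmas[0]:.3f}")
print(f"fitted std, generation {N_GENERATIONS}: {sigmas[-1]:.2e}")
```

Each generation inherits only what the previous model captured, plus sampling noise, so the distribution's tails are the first thing to disappear — a crude analogue of models losing rare knowledge when trained on their own outputs.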

The “There Is No Wall” View

  • Sam Altman (CEO, OpenAI) has argued that “there is no wall,” and that spending more money continues to yield predictable capability gains.
  • Dario Amodei (CEO, Anthropic) is similarly bullish, arguing that synthetic data pipelines can generate effectively unlimited training data and that AI could compress multiple decades of scientific progress into a single decade.
  • Demis Hassabis (CEO, Google DeepMind) advocates pushing scaling “to the maximum,” but also acknowledges that current systems lack robust reasoning, planning, and long-term memory, and will need architectural innovation on top of raw scale.

In this view, the scaling curve may bend but has not broken. You keep pushing model size, training time, and data volume, and new capabilities continue to emerge.

What the Evidence Suggests So Far

The emerging picture is mixed:

  • Scaling still tends to improve performance, but with more sharply diminishing returns than earlier waves
  • Data constraints are real at frontier scale
  • Synthetic data can help, but only with careful quality control and feedback mechanisms
  • Safety, reliability, and robustness do not automatically improve with size

If the “no wall” thesis is overstated, the implications for hardware demand, capex, and valuations are substantial.

The $500 Billion Capex Problem

Big Tech’s capital expenditure numbers are in their own category now.

Across 2025–2027, major hyperscalers (Amazon, Microsoft, Alphabet, Meta, and others) are projected to spend in the neighborhood of $1.15 trillion on AI-driven infrastructure. Year-by-year figures differ by source, but the direction is consistent: spending is accelerating sharply.

Indicative ranges based on current guidance and analyst estimates:

  • Amazon: roughly $120–130 billion in 2025 capex, up around 60% year-over-year
  • Microsoft: roughly $80 billion, up about 50%
  • Alphabet: roughly $75–90 billion, up over 40%
  • Meta: around $70 billion, up close to 80–90%

Several analyses, including work from Bain & Company, estimate that to justify this build-out, the ecosystem will need on the order of $2 trillion in annual AI-related revenue by 2030.

For comparison, the combined 2024 revenue of Amazon, Apple, Alphabet, Microsoft, Meta, and Nvidia was below that number.
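The arithmetic behind the “$2 trillion question” is simple enough to do explicitly. Using midpoints of the indicative 2025 ranges quoted above (the midpoints are my own simplification):

```python
# Rough arithmetic: indicative 2025 hyperscaler capex vs. the ~$2T/year
# AI revenue estimated to justify the build-out. Midpoints are my own
# simplification of the ranges quoted in the text.

capex_2025_usd_bn = {
    "Amazon": 125,       # midpoint of $120–130B
    "Microsoft": 80,
    "Alphabet": 82.5,    # midpoint of $75–90B
    "Meta": 70,
}

total = sum(capex_2025_usd_bn.values())
required_revenue_bn = 2_000  # ~$2T/year by 2030 (Bain-style estimate)

print(f"Indicative 2025 capex, four hyperscalers: ~${total:.1f}B")
print(f"Revenue needed is ~{required_revenue_bn / total:.1f}x that annual spend")
```

In other words, the ecosystem needs to generate roughly five to six dollars of annual AI revenue per dollar of current annual hyperscaler capex — and that is before counting Nvidia's own revenue, startup spend, or power and real-estate costs.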

The ROI Gap

The most worrying datapoint is not on the spending side but on the return side.

An MIT study in 2025 found that roughly 95% of surveyed organizations reported no measurable ROI from their generative AI deployments, despite collectively investing tens of billions of dollars in pilots, proofs-of-concept, and tooling.

At the same time, insiders are not exactly complacent:

  • Sam Altman has acknowledged that investors are likely “overexcited” about AI right now.
  • Bret Taylor, OpenAI’s board chair, has said it is both true that AI will transform the economy and that we are in a bubble where “a lot of people will lose a lot of money.”

The risk is not that AI fails to matter. The risk is that the timing and distribution of returns do not match the capital being deployed in this cycle.

Circular Demand

A further complication is the circular nature of some spending flows:

  • Nvidia invests in AI labs and startups
  • Those labs use that capital to buy Nvidia GPUs
  • Nvidia books the revenue and reinforces its growth narrative
  • Higher valuations make it easier to invest in more customers

Nothing about this structure is inherently improper. But it does blur the line between true end-customer demand and ecosystem-financed demand. When credit conditions tighten or growth expectations reset, circular flows can unwind very quickly.

The US–China Compute Divide

All of this plays out inside a geopolitical race that is increasingly focused on compute capacity.

A 5:1 Compute Advantage — For Now

Export controls on high-end AI accelerators give the U.S. and its allies an estimated five-to-one advantage in frontier compute over China, at least through the mid-2020s. Some analyses put the U.S. share of advanced AI compute around 70–75%, with China at roughly 14–15%.

Huawei’s Ascend line is gradually improving:

  • The Ascend 910C is estimated to reach a significant fraction of Nvidia H100 performance on some workloads
  • Memory configurations now include 128GB HBM3
  • Yield issues appear to be improving, though domestic fabrication still trails leading-edge nodes

China is responding with policy rather than parity:

  • State-funded data centers have been instructed to phase out foreign AI accelerators
  • Some cities and provinces have introduced local content requirements (e.g., 50–70% domestic chips)
  • Beijing continues to pour capital into domestic GPU and accelerator efforts

Nvidia’s data center share in China has already fallen from near-total dominance to something closer to half, driven by both export controls and local substitution.

Rare Earths and Materials Risk

In late 2025, China imposed export licensing requirements on products containing Chinese rare earths, explicitly including advanced semiconductors. China currently dominates both production and processing of these materials.

The immediate consequences:

  • Stockpiling and hedging by equipment makers and chip designers
  • Price spikes in several critical inputs
  • Renewed calls in the U.S., EU, Japan, and elsewhere to diversify supply chains

This is not just macro background noise. It directly affects whether the capex being poured into AI infrastructure runs into physical and political constraints that optimistic adoption models do not fully capture.

If your AI strategy assumes cheap, abundant, and geopolitically neutral compute, that assumption deserves scrutiny.

Three Ways This Could Play Out

These are rough probabilities, not precise forecasts, but they are a useful way to think about the range of outcomes.

1. Bull Case — The Virtuous Cycle Holds (≈ 40%)

  • Scaling continues to deliver meaningful capability gains
  • Synthetic data and architectural improvements offset data and performance limits
  • Inference efficiency improves enough to make economics work at scale
  • “Must-have” AI applications emerge that justify the infrastructure build-out

Winners: Nvidia, the hyperscalers, and a small number of AI-native companies with real moats and clear business models.

2. Base Case — The Long Plateau (≈ 45%)

  • Scaling yields diminishing but still positive returns
  • Many enterprises struggle to move from pilot to production and to quantify ROI
  • Capex continues for a time, then is moderated as CFOs push for discipline
  • AI proves transformational in some verticals, incremental in many others
  • Revenue eventually catches up, but with a multi-year lag

Winners: diversified players that can absorb a messy middle period and companies that design for efficiency and domain fit rather than pure hype.

3. Bear Case — The Great Correction (≈ 15%)

  • Scaling walls prove more severe than the optimists expected
  • Synthetic data and new architectures do not change the trajectory quickly enough
  • Capex cuts hit GPU orders hard; Nvidia’s growth narrative breaks
  • Large write-downs cascade through balance sheets across the stack
  • AI re-rates from “new electricity” to an over-built utility, at least for a cycle

Winners: patient capital, value investors, and whoever owns the right assets when valuations reset.

What This Means in Practice

For Builders

  • Assume inference won’t be cheap by default. Design pricing and business models that work under conservative compute-cost assumptions.
  • Avoid hard lock-in where you can’t afford it. Even if you standardize on Nvidia today, keep an eye on portability at the software layer.
  • Think beyond “call the API.” Retrieval, distillation, compression, and task-specific models are as important as access to a frontier endpoint.

For Investors

  • Separate GPU-driven growth from genuine product-market fit. “We raised a round to buy more GPUs” is not a business model on its own.
  • Look at realized revenue and retention, not just headline partnerships. Multi-year “up to” commits are optionality, not cash.
  • Watch for circular demand. Follow the money all the way to the end customer.

For Engineers

  • Lean into efficiency. Skills in quantization, pruning, distillation, caching, and retrieval-augmented design will age well.
  • Expect platform volatility. Hardware, runtimes, and model providers will keep shifting. Build abstractions, but build them carefully.
  • Stay grounded in fundamentals. Systems, networks, data infrastructure, and distributed computing will matter as much as model tinkering.
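As a concrete taste of the efficiency skills mentioned above, here is a minimal sketch of symmetric int8 weight quantization. This is pure-Python for illustration only; production systems use per-channel scales, calibration data, and fused kernels:

```python
# Minimal symmetric int8 quantization sketch — illustration only.
# Real deployments (e.g., via inference runtimes) are far more involved.

def quantize_int8(weights):
    """Map floats to int8 range [-127, 127] via a single symmetric scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0  # guard all-zero input
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.33, 1.27, -1.0]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, approx))
print(f"quantized: {q}")
print(f"scale={scale:.5f}, max reconstruction error={max_err:.2e}")
```

The point is the trade: 4x less memory per weight (int8 vs. float32) in exchange for bounded reconstruction error — exactly the kind of lever that matters when inference, not training, dominates the bill.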

Closing Thoughts

The AI industry is, at the same time:

  • Building what may become the most important technology platform of this century
  • Potentially recreating the largest over-investment cycle since the dot-com era
  • Engaged in a U.S.–China compute race with no clear endpoint
  • Arguing internally about whether its core scaling thesis is even correct

Nvidia’s Groq move is a defensive masterstroke. It pulls a differentiated inference architecture into Nvidia’s sphere without triggering an obvious antitrust chokepoint.

The scaling debate is not academic. If Sutskever, LeCun, and Hinton are closer to right than wrong, a meaningful share of today’s capex cycle is mispriced.

The U.S.–China split is accelerating. By 2030, we may be looking at two partially incompatible AI ecosystems with different hardware, standards, and regulatory regimes.

The $2 trillion question — whether AI revenues can justify the infrastructure being built — remains unanswered.

The next two to three years will determine whether this period looks, in hindsight, more like the early internet (massive over-building that eventually paid off) or the pure-bubble end of 1999.

The gap between those two outcomes is measured in trillions of dollars.

This analysis is independent research synthesizing public financial filings, analyst reports, and verified news sources. It is not financial advice.

References & Sources

Nvidia–Groq Deal

  1. CNBC — “Nvidia buying AI chip startup Groq’s assets for about $20 billion in its largest deal on record”
    https://www.cnbc.com/2025/12/24/nvidia-buying-ai-chip-startup-groq-for-about-20-billion-biggest-deal.html
  2. TechCrunch — “Nvidia to license AI chip challenger Groq’s tech and hire its CEO”
    https://techcrunch.com/2025/12/24/nvidia-acquires-ai-chip-challenger-groq-for-20b-report-says/
  3. Yahoo Finance — “Breaking down Nvidia’s unusual $20 billion deal with Groq”
    https://finance.yahoo.com/news/nvidia-acquire-groq-20-billion-214927907.html
  4. SiliconANGLE — “Nvidia to license technology from inference chip startup Groq in reported $20B deal”
    https://siliconangle.com/2025/12/24/nvidia-license-technology-inference-chip-startup-groq-reported-20b-deal/

Scaling Debate & AI Leaders

  1. Sam Altman (X/Twitter) — “there is no wall”
    https://x.com/sama/status/1856941766915641580
  2. Business Insider — “Sam Altman says ‘there is no wall’ in an apparent response to fears of an AI slowdown”
    https://www.businessinsider.com/sam-altman-there-is-no-wall-ai-slowdown-2024-11
  3. DeepNewz — “Ilya Sutskever Declares ‘Pre-Training as We Know It Will End’ at NeurIPS 2024”
    https://deepnewz.com/ai-modeling/ilya-sutskever-declares-pre-training-we-know-end-neurips-2024-citing-peak-data-18418711
  4. 36Kr — “Turing Award Winner Yann LeCun: Large Models a ‘Dead End’”
    https://eu.36kr.com/en/p/3571987975018880
  5. The Decoder — “The case against predicting tokens to build AGI”
    https://the-decoder.com/the-case-against-predicting-tokens-to-build-agi/
  6. The Information Bottleneck Podcast — EP20: Yann LeCun
    https://www.the-information-bottleneck.com/ep20-yann-lecun/
  7. Wikipedia — Geoffrey Hinton
    https://en.wikipedia.org/wiki/Geoffrey_Hinton

Hyperscaler Capex & AI Investment

  1. Goldman Sachs — “Why AI Companies May Invest More than $500 Billion in 2026”
    https://www.goldmansachs.com/insights/articles/why-ai-companies-may-invest-more-than-500-billion-in-2026
  2. CreditSights — “Technology: Hyperscaler Capex 2026 Estimates”
    https://know.creditsights.com/insights/technology-hyperscaler-capex-2026-estimates/
  3. Invezz — “Looking ahead to 2026: why hyperscalers can’t slow spending without losing the AI war”
    https://invezz.com/news/2025/12/26/looking-ahead-to-2026-why-hyperscalers-cant-slow-spending-without-losing-the-ai-war/
  4. IO Fund — “Big Tech’s $405B Bet: Why AI Stocks Are Set Up for a Strong 2026”
    https://io-fund.com/ai-stocks/ai-platforms/big-techs-405b-bet
  5. CNBC — “How the AI market could splinter in 2026”
    https://www.cnbc.com/2025/12/25/how-the-ai-market-could-splinter-in-2026-.html

AI Bubble and ROI Concerns

  1. Longbridge — “OpenAI Board Chairman: We are indeed in an ‘AI bubble’”
    https://longbridge.com/en/news/257277234
  2. DigitrendZ — “OpenAI Chair Bret Taylor: We’re in an AI Bubble (And That’s Okay)”
    https://digitrendz.blog/newswire/artificial-intelligence/46023/openai-chair-bret-taylor-were-in-an-ai-bubble-and-thats-okay/
  3. IEEE ComSoc Techblog — “AI spending boom accelerates: Big tech to invest an aggregate of $400 billion in 2025”
    https://techblog.comsoc.org/2025/11/01/ai-spending-boom-accelerates-big-tech-to-invest-invest-an-aggregate-of-400-billion-in-2025-more-in-2026/
  4. MIT / Axios / Entrepreneur coverage of enterprise AI ROI
    (e.g., Axios: https://www.axios.com/2025/08/21/ai-wall-street-big-tech)

Nvidia Strategy & “Virtuous Cycle”

  1. CNBC — “Nvidia CEO Jensen Huang says AI is in a ‘virtuous cycle’”
    https://www.cnbc.com/2025/10/31/nvidia-ceo-jensen-huang-says-ai-has-reached-a-virtuous-cycle.html
  2. Bloomberg via Yahoo Finance — “Nvidia CEO Downplays AI Bubble Fears as He Enlists New Partners”
    https://finance.yahoo.com/news/nvidia-ceo-rebuts-fears-ai-194608223.html

US–China AI Chip Competition

  1. Council on Foreign Relations — “China’s AI Chip Deficit: Why Huawei Can’t Catch Nvidia and U.S. Export Controls Should Remain”
    https://www.cfr.org/article/chinas-ai-chip-deficit-why-huawei-cant-catch-nvidia-and-us-export-controls-should-remain
  2. CSIS — “DeepSeek, Huawei, Export Controls, and the Future of the U.S.–China AI Race”
    https://www.csis.org/analysis/deepseek-huawei-export-controls-and-future-us-china-ai-race
  3. RAND — “Leashing Chinese AI Needs Smart Chip Controls”
    https://www.rand.org/pubs/commentary/2025/08/leashing-chinese-ai-needs-smart-chip-controls.html
  4. SemiAnalysis — “Huawei Ascend Production Ramp: Die Banks, TSMC Continued Production, HBM is The Bottleneck”
    https://semianalysis.com/2025/09/08/huawei-ascend-production-ramp/
  5. Georgetown CSET — “Pushing the Limits: Huawei’s AI Chip Tests U.S. Export Controls”
    https://cset.georgetown.edu/publication/pushing-the-limits-huaweis-ai-chip-tests-u-s-export-controls/
  6. Institute for Progress — “The H20 Problem: Inference, Supercomputers, and US Export Control Gaps”
    https://ifp.org/the-h20-problem/

AI Chip Competitors

  1. AWS — Trainium
    https://aws.amazon.com/ai/machine-learning/trainium/
  2. Google Cloud Blog — “3 things to know about Ironwood, Google’s latest TPU”
    https://blog.google/products/google-cloud/ironwood-google-tpu-things-to-know/
  3. TechCrunch — “Andy Jassy says Amazon’s Nvidia competitor chip is already a multibillion-dollar business”
    https://techcrunch.com/2025/12/03/andy-jassy-says-amazons-nvidia-competitor-chip-is-already-a-multi-billion-dollar-business/
