Zendoric

Groq bets $650 million that the money in AI is in running, not training

🕒 Published on Zendoric: June 25, 2026 · 09:00

The LPU chip company has closed a $650 million round with a surgical focus: to become the world's leading inference cloud. One detail changes how the whole story reads: NVIDIA has licensed Groq's technology rather than replicating it.

In the artificial intelligence economy, the talk for years has been almost exclusively about training: the giant models, the massive clusters, the record compute figures. Groq has just put $650 million, in a round led by the funds Disruptive and Infinitum, behind the opposite thesis: that the bulk of long-term value lies not in building the model but in running it millions of times a day. It is a bet on radical specialization, and it has the virtue of being easy to state and devilishly hard to execute.

The figures the company itself shares give substance to that ambition: 13 operational data centers across four regions of the world, more than five million developers on GroqCloud and trillions of tokens processed each week. Behind them is a decision made in 2016 that seemed to go against the grain at the time: to design its own chip from scratch, the LPU (Language Processing Unit), optimized for inference, instead of adapting GPUs designed for something else. That the bet took years to reach operational scale is not a flaw but the usual pattern for genuinely differentiated hardware bets.

The turning point deserves a careful reading, without overinterpretation. In December 2025 Groq signed a non-exclusive licensing agreement with NVIDIA, and at GTC 2026 the chip giant unveiled its LPX platform, which incorporates Groq's inference technology. A move like this could have been read as surrender; the announcement presents it as exactly the opposite, and that interpretation has some basis. When the undisputed leader of the accelerator market chooses to license an architecture rather than copy it internally, that is usually about as eloquent a market signal of the technology's value as one can get. It is worth remembering, however, that the agreement is non-exclusive: Groq gains validation and reach, but no protection against competitors.

The management reshuffle reinforces the hypothesis of a push into the enterprise segment. Alongside internal leaders such as CEO Adam Winter and CFO Matt Eng come very specific profiles: Alan Rice as COO, with experience at xAI, Meta Datacenters and, earlier, in the U.S. Navy's nuclear submarine operations; and, starting in July, Sinclair Schuller (founder of Apprenda and Nuvalence) as CTO and Rakesh Malhotra, who spent a decade in Microsoft's cloud business, as CPO. The combination is not decorative: running an inference cloud that competes on availability, latency and cost per token is far closer to operating critical infrastructure than to training a model, and the hires point exactly there.

The industrial plan is ambitious but deliberately bounded: to scale to 200 megawatts before the end of 2027 by equipping the centers already in operation (including the new LPX system) rather than building infrastructure from scratch. That choice reduces time-to-market and execution risk, a welcome discipline in a sector prone to grandiose announcements.

The underlying thesis is summed up by John Yetimoglu, of Infinitum: inference will be the largest technology infrastructure market, and could demand between 15 and 20 times more compute than training. The announcement does not attribute that figure to a specific external source, so it should be taken as an estimate from an interested party, not as a settled fact. But the direction is solid: as AI moves from the laboratory to everyday use, the economic center of gravity shifts toward whoever runs it cheaper and faster. Groq has decided to stake everything on that card.

Sources & references