Groq Raised $650M for Inference After Nvidia Took Its Founder and Core IP

Groq raised $650M on June 22, 2026 to scale its AI inference cloud, months after Nvidia's $20B deal took its founder and core IP. Why the bet is on inference, not chips.

Amon Taboi

26 Jun 2026 — 5 min read

TL;DR

Groq raised $650 million on June 22, 2026 to scale its AI inference cloud, led by Disruptive and the hedge fund Infinitum, with existing investors reinvesting (Groq newsroom).
It comes roughly six months after Nvidia's $20 billion December 2025 deal that licensed Groq's LPU technology and hired away senior staff, including founder and CEO Jonathan Ross, per reporting at the time (TechCrunch).
Groq now runs 13 data centers across four regions, claims 5 million-plus developers and "trillions of tokens" weekly, and is targeting 200 megawatts of capacity by the end of 2027 (Groq newsroom).
The round's valuation was not disclosed; Groq was last valued at $6.9 billion in a $750M round in September 2025, per prior reporting.

A chip company just raised $650 million after its biggest competitor walked off with its founder and CEO, along with a license to its core technology. That is the strange, instructive shape of Groq's new round, announced June 22: not a comeback story about better silicon, but a bet that the inference layer, the part of AI that actually answers your queries, is now a large enough market to fund a company even after its original team is gone.

What actually happened

Groq said it closed $650 million in growth capital to expand its AI inference cloud, in a round led by the investment firm Disruptive and the hedge fund Infinitum, with reinvestment from existing backers (Groq newsroom). The company frames the money around deployment: pushing its latest inference hardware across its existing 13 data centers in North America, Europe, the Middle East, and Asia-Pacific, and scaling toward 200 megawatts of capacity by the end of 2027.

The announcement did not state a new valuation. For context, Groq said it was valued at $6.9 billion in a September 2025 round that raised $750 million, roughly two and a half times its $2.8 billion valuation from August 2024, according to prior reporting. So the headline number ($650M) is actually $100M smaller than the round before it, which is the first hint that this raise is about operations, not hype.

The real story: inference is the prize now

For three years the AI infrastructure narrative was about training, the giant clusters that build a model. Groq's raise is one more data point that the durable money is moving to inference, the cost of running that model billions of times a day once it exists. Infinitum's John Yetimoglu put the thesis bluntly: "We believe inference will become the largest infrastructure market in technology" (Groq newsroom).

That matters because inference economics are the opposite of training economics. Training is a capital event; inference is a utility bill that never stops. Every chatbot reply, every agent step, every retrieval call is an inference cost, and as products move from demos to production, that line item is what decides whether an AI feature is profitable. We covered the brutal version of this in OpenAI's losses increasing nearly 8x: the bill for serving models is the story of the industry right now.
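To make the "capital event versus utility bill" contrast concrete, here is a minimal sketch that counts how many months it takes for cumulative inference spend to pass a one-time training cost. The function name and every dollar figure are hypothetical placeholders, not numbers from Groq, OpenAI, or any reporting; plug in your own.

```python
from typing import Optional


def months_until_inference_exceeds_training(
    training_cost_usd: float,
    first_month_inference_usd: float,
    monthly_growth: float,
    max_months: int = 120,
) -> Optional[int]:
    """Return the first month in which cumulative inference spend
    exceeds the one-time training cost, or None if it never does
    within max_months.

    All inputs are illustrative assumptions:
    - training_cost_usd: a single up-front capital outlay
    - first_month_inference_usd: serving bill in month 1
    - monthly_growth: fractional usage growth per month (0.10 = 10%)
    """
    cumulative = 0.0
    monthly = first_month_inference_usd
    for month in range(1, max_months + 1):
        cumulative += monthly          # the bill arrives every month
        if cumulative > training_cost_usd:
            return month
        monthly *= 1 + monthly_growth  # usage, and so cost, compounds
    return None


if __name__ == "__main__":
    # Hypothetical inputs: $10M to train, $500k first-month serving bill,
    # usage growing 15% per month.
    print(months_until_inference_exceeds_training(10_000_000, 500_000, 0.15))
```

The training figure appears once; the inference figure recurs and compounds with usage, which is why the serving bill ends up dominating for any product that keeps growing.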

The substance: what a "neocloud" actually sells

Groq is positioning as an inference "neocloud", a specialized cloud that does one thing (fast, cheap token generation) rather than competing with AWS across everything. Its pitch rests on the LPU, a chip designed specifically for the sequential math of inference rather than the parallel math of training. The claimed advantages are latency and cost per token, not raw model quality.

The numbers Groq offers as proof of scale: more than 5 million developers on the platform and "trillions of tokens" processed weekly (Groq newsroom). Those are usage figures, not revenue, and they're worth reading skeptically: a large share of developer signups on any free inference tier never become paying volume. But the 200-megawatt target by end of 2027 is the figure that actually signals intent. Power, not chips, is the binding constraint on inference at scale, and committing to 200 MW is a commitment to a real, physical buildout.

Who wins, who loses

Wins: buyers of inference. A funded, independent inference cloud is good news for any team choosing a provider. More credible competition to Nvidia-on-the-hyperscalers means pricing pressure on tokens.
Wins: Nvidia, oddly. Nvidia licensed Groq's IP and hired its senior talent in the $20B December 2025 deal, and announced an LPX platform at GTC that incorporates Groq's inference technology (per reporting). Now an independent Groq will deploy that lineage at scale. Nvidia benefits whether you buy from it or from Groq.
Loses: pure-play training-chip narratives. Capital voting for inference is capital not voting for the next training-cluster startup.
Uncertain: Groq's own margins. Running a neocloud is a low-margin, capital-heavy infrastructure business, closer to a utility than to software. Raising $650M to fund 200 MW is a sign of how expensive this path is.

The non-obvious angle most coverage missed

Here is the part the funding headlines skip. Groq is rebuilding as an inference cloud at least partly on the same technology it licensed to Nvidia, and it's deploying Nvidia's new LPX system, which itself incorporates Groq's inference IP, inside its own data centers (per reporting). In other words, the company sold its crown jewels to its largest competitor, then raised money to operate a cloud that runs on a productized version of those same jewels.

That is clever and fragile at once. Clever, because Groq converts a one-time $20B licensing windfall into an operating business without having to keep winning the chip-design arms race against Nvidia. Fragile, because a neocloud whose hardware roadmap now partly depends on its biggest rival has surrendered the one thing, proprietary silicon, that made it special. The bet the new investors are making is not "Groq builds the best chip." It's "inference demand is so large that even a Groq without its founder, and partly dependent on Nvidia, is a good business." That's a bet on the market, not the company, and it tells you exactly how big the inference market has become.

What this means for you

If you ship AI features: treat inference as a sourcing decision, not a default. A capitalized independent like Groq is leverage in your pricing conversations with incumbents. Benchmark cost-per-token and latency on your real workloads before committing.
If you're planning AI budget: model inference as a recurring utility that grows with usage, not a fixed build cost. The companies winning on AI margins are the ones who treated token spend as a first-class line item early.
If you're an investor or operator watching the space: the tell to track is power, not chips. Watch who actually lands megawatts (Groq's 200 MW by 2027), because that, not benchmark slides, is what constrains real inference supply.
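As a starting point for the benchmarking advice above, here is a rough sketch of the two measurements that matter: monthly cost at your own token volumes, and latency on your own prompts. Everything named here (ProviderQuote, monthly_cost_usd, latency_summary, the prices, the stand-in client) is hypothetical and not any provider's real API or price list. Pass in whatever function actually calls your provider.

```python
import statistics
import time
from dataclasses import dataclass
from typing import Callable, Dict, Sequence


@dataclass(frozen=True)
class ProviderQuote:
    """A price quote you collected yourself; values are placeholders."""
    name: str
    usd_per_million_input_tokens: float
    usd_per_million_output_tokens: float


def monthly_cost_usd(
    quote: ProviderQuote,
    requests_per_month: int,
    input_tokens_per_request: int,
    output_tokens_per_request: int,
) -> float:
    """Estimate one month's bill from your own traffic profile."""
    input_millions = requests_per_month * input_tokens_per_request / 1_000_000
    output_millions = requests_per_month * output_tokens_per_request / 1_000_000
    return (
        input_millions * quote.usd_per_million_input_tokens
        + output_millions * quote.usd_per_million_output_tokens
    )


def latency_summary(
    call: Callable[[str], object], prompts: Sequence[str]
) -> Dict[str, float]:
    """Time `call` once per prompt and summarize wall-clock latency.

    `call` is whatever function sends a prompt to the provider under
    test; it is supplied by you, not defined here.
    """
    samples = []
    for prompt in prompts:
        start = time.perf_counter()
        call(prompt)
        samples.append(time.perf_counter() - start)
    return {
        "n": float(len(samples)),
        "median_s": statistics.median(samples),
        "max_s": max(samples),
    }


if __name__ == "__main__":
    # Hypothetical quote and traffic profile, for illustration only.
    quote = ProviderQuote("provider-a", 0.50, 1.50)
    print(monthly_cost_usd(quote, 2_000_000, 800, 300))
    # Stand-in for a real client call; replace with your own.
    print(latency_summary(lambda prompt: prompt.upper(), ["hello", "world"]))
```

Run the same prompts and the same traffic profile against each candidate so the comparison reflects your workload rather than a vendor's benchmark slide.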

Frequently asked questions

How much did Groq raise and who led the round?

Groq raised $650 million, announced June 22, 2026, led by Disruptive and the hedge fund Infinitum, with existing investors reinvesting, according to Groq's newsroom.

What was Groq's valuation?

The new round's valuation was not disclosed. Groq was last reported at a $6.9 billion valuation in a $750 million round in September 2025, up from $2.8 billion in August 2024, per prior reporting.

Didn't Nvidia already take over Groq?

Not as an acquisition. In December 2025 Nvidia struck a roughly $20 billion deal to license Groq's LPU technology and hire senior staff, including founder Jonathan Ross, per reporting. Nvidia stated it had not acquired Groq. Groq remains an independent company and has since brought in new leadership.

What is an inference "neocloud"?

A cloud specialized for running (not training) AI models, optimized for low-latency, low-cost token generation. Groq sells access to its LPU-based inference rather than competing as a general-purpose cloud.
