OpenAI's AI Inference Shift Tests Nvidia's Dominance, Sparks Chip Industry Ripples

By Max A. Cherney, Krystal Hu and Deepa Seetharaman

SAN FRANCISCO, Feb 2 (Reuters) – The artificial intelligence boom's most celebrated partnership is showing signs of strain. OpenAI, the creator of ChatGPT, has grown dissatisfied with certain Nvidia chips for critical tasks and has been exploring alternatives since last year, eight sources with knowledge of the situation told Reuters. The move signals a pivotal shift in the AI hardware landscape and complicates the relationship between two of the sector's most influential players.

The core of OpenAI's strategic reassessment lies in the growing importance of inference—the process by which an AI model generates responses to user queries. While Nvidia's graphics processing units (GPUs) remain the undisputed standard for training massive models, the race for faster, more efficient inference chips has become the industry's new battleground. OpenAI's search for specialized inference hardware represents a direct challenge to Nvidia's end-to-end dominance.

This divergence emerges against the backdrop of protracted investment talks between the companies. A reported deal, under which Nvidia would invest up to $100 billion for a stake in OpenAI, has stalled for months. During this period, OpenAI has diversified its supplier base, striking deals with rivals such as AMD. A person familiar with the negotiations said OpenAI's evolving product needs, particularly for inference, have contributed to the delays.

Nvidia CEO Jensen Huang recently dismissed reports of tension as "nonsense," reaffirming the chipmaker's commitment to a major investment in OpenAI. In a statement, Nvidia emphasized that customers choose its products for inference due to "the best performance and total cost of ownership at scale." An OpenAI spokesperson separately acknowledged relying on Nvidia for most of the company's inference needs, calling Nvidia's offerings the best value.

However, sources indicate a specific performance gap. Seven people said OpenAI is dissatisfied with the speed of Nvidia's hardware in generating answers for specialized tasks such as software development and AI-to-software communication. One source said the company seeks new hardware to eventually handle about 10% of its future inference computing load.

OpenAI's quest led it to startups like Cerebras and Groq, which design chips with large amounts of fast, on-chip memory (SRAM)—an architecture advantageous for high-speed inference. However, Nvidia moved to secure key technology, licensing Groq's IP in a deal that effectively ended OpenAI's talks with the startup, one source said. Chip industry executives viewed Nvidia's hiring of Groq's chip designers as a move to bolster its own inference portfolio.

The inference challenge became acute for products like OpenAI's Codex, a tool for generating computer code. Internally, some staff attributed performance limitations to the underlying Nvidia hardware, a source added. CEO Sam Altman noted on a recent call that customers of the company's coding models "will put a big premium on speed," a demand he said would be partly met through the new deal with Cerebras.

Competitors are already capitalizing on specialized inference hardware. Anthropic's Claude and Google's Gemini benefit from heavy use of Google's in-house Tensor Processing Units (TPUs), designed specifically for inference calculations and often offering performance edges over general-purpose GPUs.

As OpenAI signaled its reservations, Nvidia itself approached several SRAM-focused chipmakers about potential acquisitions, the people said. While Cerebras declined and instead partnered with OpenAI, Nvidia successfully licensed Groq's technology, redirecting the startup's focus toward cloud software.

Industry Voices:

"This is a natural maturation of the market," said Dr. Anya Sharma, a semiconductor analyst at TechInsight. "Training put Nvidia on top, but inference at scale has different technical demands. OpenAI's actions validate that niche players with novel architectures can find openings, even in a dominated field."

"It's a wake-up call, but not a death knell," commented Michael Torres, a venture capitalist at Sierra Foundry. "Nvidia's ecosystem and software moat are immense. This might spur them to accelerate their own inference-optimized silicon, potentially through acquisitions. The Groq deal is just the first move."

"The hypocrisy is staggering," argued Leo Grant, a former engineer at a rival AI lab, in a sharply critical post on social platform X. "OpenAI's entire empire was built on Nvidia's backs. Now, at the first sign of a bottleneck for their premium products, they're shopping around and jeopardizing a lifeline investment. It shows a ruthless, transactional approach to partnerships that will come back to haunt them."

"For developers, this could be great news," said Priya Chen, CTO of a coding startup. "If competition drives down inference latency and cost for tools like Codex, it directly improves our productivity. We don't care whose chip it is—we just need the answers faster and cheaper."

(Reporting by Max A. Cherney, Krystal Hu and Deepa Seetharaman in San Francisco; editing by Kenneth Li, Peter Henderson and Nick Zieminski)
