OpenAI is unsatisfied with some of Nvidia’s latest artificial intelligence chips, and it has sought alternatives since last year, eight sources familiar with the matter said, potentially complicating the relationship between the two highest-profile players in the AI boom.
The ChatGPT-maker’s shift in strategy, the details of which are first reported here, stems from an increasing emphasis on chips used to perform specific elements of AI inference, the process by which an AI model, such as the one that powers the ChatGPT app, responds to customer queries and requests.
Nvidia remains dominant in chips for training large AI models, while inference has become a new front in the competition.
This decision by OpenAI and others to seek out alternatives in the inference chip market marks a significant test of Nvidia’s AI dominance and comes as the two companies are in investment talks.
In September, Nvidia said it intended to pour as much as $100 billion into OpenAI as part of a deal that would give the chipmaker a stake in the startup and give OpenAI the cash it needs to buy the advanced chips.
The deal had been expected to close within weeks, Reuters reported. Instead, negotiations have dragged on for months. During that time, OpenAI has struck deals with AMD and others for GPUs built to rival Nvidia’s.
But OpenAI’s shifting product road map has also changed the kind of computational resources it requires and bogged down talks with Nvidia, a person familiar with the matter said.
On Saturday, Nvidia CEO Jensen Huang brushed off a report of tension with OpenAI, saying the idea was “nonsense” and that Nvidia planned a huge investment in OpenAI.
“Customers continue to choose NVIDIA for inference because we deliver the best performance and total cost of ownership at scale,” Nvidia said in a statement.
In a separate statement, a spokesperson for OpenAI said the company relies on Nvidia to power the vast majority of its inference fleet and that Nvidia delivers the best performance per dollar for inference.
After the Reuters story was published, OpenAI Chief Executive Sam Altman wrote in a post on X that Nvidia makes “the best AI chips in the world” and that OpenAI hoped to remain a “gigantic customer for a very long time”.
Seven sources said that OpenAI is not satisfied with the speed at which Nvidia’s hardware can spit out answers to ChatGPT users for specific types of problems, such as software development and AI communicating with other software.
OpenAI needs new hardware that would eventually provide about 10% of its inference computing needs, one of the sources told Reuters.
The ChatGPT maker has discussed working with startups, including Cerebras and Groq, to provide chips for faster inference, two sources said. But Nvidia struck a $20-billion licensing deal with Groq that shut down OpenAI’s talks, one of the sources told Reuters.
Nvidia’s decision to snap up Groq looked like an effort to shore up a portfolio of technology to better compete in a rapidly changing AI industry, chip industry executives said.
Nvidia, in a statement, said that Groq’s intellectual property was highly complementary to Nvidia’s product road map.
Nvidia’s graphics processing chips are well-suited for the massive data crunching necessary to train large AI models like ChatGPT that have underpinned the explosive growth of AI globally to date.
But AI advancements increasingly focus on using trained models for inference and reasoning, which could be a new, bigger stage of AI, inspiring OpenAI’s efforts.
The ChatGPT-maker’s search for GPU alternatives since last year has focused on companies building chips with large amounts of SRAM, memory embedded in the same piece of silicon as the rest of the chip.
Squishing as much costly SRAM as possible onto each chip can offer speed advantages for chatbots and other AI systems as they crunch requests from millions of users.
Inference places heavier demands on memory than training does, because the chip spends relatively more time fetching data from memory than performing mathematical operations.
Nvidia and AMD GPU technology relies on external memory, which adds processing time and slows down how quickly users can interact with a chatbot.
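The memory argument can be made concrete with a back-of-the-envelope sketch. It assumes a simplified case in which generating each token for a single user requires reading every model weight from memory once, so memory bandwidth sets a ceiling on response speed. The model size and both bandwidth figures below are illustrative assumptions, not specifications of any Nvidia, AMD, Cerebras or Groq product, and the function name is hypothetical.

```python
def max_tokens_per_second(weight_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Ceiling on single-user generation speed when each token
    requires one full read of the model weights from memory."""
    return bandwidth_bytes_per_s / weight_bytes


# Hypothetical model: 70 billion parameters stored at 2 bytes each.
weight_bytes = 70e9 * 2

# Hypothetical bandwidths: a slower off-chip memory and a memory
# system ten times faster. Neither figure describes a real chip.
slower_memory = 2.8e12   # bytes per second
faster_memory = 28e12    # bytes per second

for label, bandwidth in [("slower", slower_memory), ("faster", faster_memory)]:
    ceiling = max_tokens_per_second(weight_bytes, bandwidth)
    print(f"{label} memory: at most {ceiling:.0f} tokens per second")
```

Under these assumptions the ceiling scales directly with memory bandwidth, no matter how much arithmetic the chip can perform. Real deployments serve many users at once and spread models across several chips, so actual figures will differ.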
Inside OpenAI, the issue became particularly visible in Codex, its product for creating computer code, which the company has been aggressively marketing, one of the sources added.
OpenAI staff attributed some of Codex’s weaknesses to Nvidia’s GPU-based hardware, one source said.
In a January 30 call with reporters, Altman said that customers using OpenAI’s coding models will “put a big premium on speed for coding work.”
One way OpenAI will meet that demand is through its recent deal with Cerebras, Altman said, adding that speed is less of an imperative for casual ChatGPT users.
Competing products such as Anthropic’s Claude and Google’s Gemini benefit from deployments that rely more heavily on tensor processing units, or TPUs, the chips Google made in-house. TPUs are designed for the sort of calculations required for inference and can offer performance advantages over general-purpose AI chips like the Nvidia-designed GPUs.
As OpenAI made clear its reservations about Nvidia technology, Nvidia approached companies working on SRAM-heavy chips, including Cerebras and Groq, about a potential acquisition, the people said. Cerebras declined and struck a commercial deal with OpenAI, announced last month.
Groq held talks with OpenAI for a deal to provide computing power and received investor interest to fund the company at a valuation of roughly $14 billion, according to people familiar with the discussions.
But by December, Nvidia moved to license Groq’s tech in a non-exclusive all-cash deal, the sources said.
Although the deal would allow other companies to license Groq’s technology, Groq is now focusing on selling cloud-based software, as Nvidia hired away its chip designers.