Cloud providers are seeing a spike in demand for Nvidia’s H200 chips after Chinese AI company DeepSeek launched its latest foundation model.
While the stock market reacted negatively, sending Nvidia’s shares down 16% on Monday, AI researchers and developers have tracked DeepSeek’s progress for months. The company first released its V2 model in May 2024, but it was the V3 model, released in December, that gained significant attention in the AI community.
When DeepSeek launched its reasoning model R1 in January, demand for Nvidia H200s skyrocketed. Robert Brooks, a founding team member at cloud provider Lambda, confirmed the trend.
“The launch of DeepSeek R1 has dramatically increased H200 demand. Enterprises are now pre-purchasing H200 capacity even before it becomes publicly available,” Brooks stated.
DeepSeek’s Efficient Models Disrupt AI Hardware Market
DeepSeek’s open-source models make AI more affordable, but they still require hardware or cloud services to operate at scale. Semiconductor analysts at SemiAnalysis reported that DeepSeek’s rise had already impacted H100 and H200 pricing.
Nvidia has already sold tens of billions of dollars’ worth of H200 GPUs, CFO Colette Kress said on the company’s November earnings call.
DeepSeek’s efficiency also startled AI investors. Unlike Meta, OpenAI, and Microsoft, which have spent billions on infrastructure, DeepSeek trained its models using less powerful hardware. Investors now wonder whether these billion-dollar investments in AI infrastructure will remain necessary.
Cloud providers confirm that running DeepSeek’s models still requires heavy compute power. “It’s not easy to run,” noted Tuhin Srivastava, CEO of inference provider Baseten. Many companies avoid the largest AI models because they cost more to serve and respond more slowly.
DeepSeek also offers smaller models, but even its most powerful version is cheaper to operate than comparable models such as Meta’s Llama 3. That cost advantage appeals to companies that want full-model capability without excessive compute costs.
Scarcity of H200 Chips Creates Challenges for AI Deployment
Nvidia’s H200 chips remain the only widely available hardware capable of running DeepSeek’s V3 model in full on a single node; serving the full model takes a complete eight-GPU node.
Companies can instead shard the workload across weaker GPUs, but that approach demands greater engineering expertise and risks performance problems. The added coordination slows down processing speed, Srivastava warned.
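The overhead Srivastava describes comes largely from the extra communication that sharding a model introduces. A toy sketch in plain Python (a simulation only, not a real multi-GPU setup, and not DeepSeek’s actual serving stack): each “device” holds a slice of a weight matrix along the input dimension, computes a partial product locally, and the partials must then be summed, a step that becomes an all-reduce over the interconnect on real hardware.

```python
import random

def matvec(W, x):
    """Dense matrix-vector product: y[i] = sum_j W[i][j] * x[j]."""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in W]

def sharded_matvec(W, x, num_devices):
    """Simulate tensor parallelism: shard W and x along the input
    dimension, compute one partial product per 'device', then sum the
    partials -- the sum is the all-reduce step on real hardware."""
    n = len(x)
    shard = n // num_devices  # assumes n divides evenly, for simplicity
    partials = []
    for d in range(num_devices):
        cols = range(d * shard, (d + 1) * shard)
        W_d = [[row[j] for j in cols] for row in W]  # this device's slice
        x_d = [x[j] for j in cols]
        partials.append(matvec(W_d, x_d))  # local compute, no communication
    # "All-reduce": every device needs the elementwise sum of all partials.
    return [sum(vals) for vals in zip(*partials)]

random.seed(0)
W = [[random.random() for _ in range(8)] for _ in range(4)]
x = [random.random() for _ in range(8)]

full = matvec(W, x)
split = sharded_matvec(W, x, num_devices=4)
assert all(abs(a - b) < 1e-9 for a, b in zip(full, split))
```

The sharded result matches the single-device result, but only after the final summation; on a real cluster that synchronization happens once per sharded layer, every token, which is where slower interconnects between weaker GPUs cost throughput.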
DeepSeek’s largest model contains 671 billion parameters, significantly fewer than OpenAI’s GPT-4 (reportedly 1.76 trillion parameters) but more than Meta’s largest Llama model (405 billion parameters).
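These figures line up with a rough memory estimate for the eight-GPU H200 node mentioned above. A back-of-envelope sketch only, assuming FP8 weights at one byte per parameter, 141 GB of HBM3e per H200, and the commonly cited 671-billion total parameter count for V3 (the per-GPU capacity and precision are assumptions, not figures from this article):

```python
import math

# Assumptions (not from the article): FP8 weights at 1 byte per
# parameter, and 141 GB of HBM3e memory per H200 GPU.
PARAMS = 671e9          # DeepSeek-V3 total parameter count
BYTES_PER_PARAM = 1     # FP8 quantization
H200_MEMORY_GB = 141    # per-GPU memory capacity

weight_gb = PARAMS * BYTES_PER_PARAM / 1e9
min_gpus_for_weights = math.ceil(weight_gb / H200_MEMORY_GB)
node_capacity_gb = 8 * H200_MEMORY_GB

print(f"Weights alone: ~{weight_gb:.0f} GB -> at least {min_gpus_for_weights} H200s")
print(f"Eight-GPU node: {node_capacity_gb} GB total, "
      f"~{node_capacity_gb - weight_gb:.0f} GB left for KV cache and activations")
```

Under these assumptions the weights alone would occupy about five H200s; the headroom on a full eight-GPU node would go to KV cache, activations, and batching, which is plausibly why a complete node is the practical serving configuration.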
Nvidia’s newer Blackwell chips will also support DeepSeek’s V3 model, but shipments only began in late 2024 and volumes remain limited. With demand soaring, companies are struggling to secure enough H200s to run the V3 and R1 models efficiently.
Companies like Baseten optimize AI models for faster inference speeds, enabling real-time AI interactions. Baseten does not own GPUs; it leases capacity and tunes the software stack that serves the models.
DeepSeek’s open-source accessibility combined with Nvidia’s powerful GPUs is reshaping the AI computing landscape. Companies eager to leverage high-performance AI at lower costs are now racing to secure the necessary hardware.