Chinese AI Firm DeepSeek Trains Model on Nvidia’s Flagship Chip Despite Export Restrictions


Chinese AI startup DeepSeek reportedly trained its latest AI model on Nvidia’s most advanced chip, the Blackwell, despite US export restrictions, a senior Trump administration official said Monday.

The model, expected to launch as soon as next week, appears to rely on Blackwell chips likely clustered at DeepSeek’s data center in Inner Mongolia. The official noted that DeepSeek may have removed technical traces that would reveal the use of US-made chips. “We’re not shipping Blackwells to China,” the official stressed, declining to explain how the US learned of the situation or how the chips were acquired.

Neither Nvidia nor the Commerce Department responded to requests for comment, and DeepSeek has remained silent. The Chinese embassy in Washington condemned the reporting, saying the US is “drawing ideological lines, overstretching national security, and politicizing economic and technological issues.”

Strategic and Policy Implications

The revelation has sparked debate in Washington over Chinese access to cutting-edge US AI hardware. China hawks warn that advanced chips could be diverted for military use, challenging US leadership in AI. Conversely, White House AI Czar David Sacks and Nvidia CEO Jensen Huang argue that controlled shipments discourage Chinese rivals, such as Huawei, from attempting to catch up with US technology.

US export rules currently bar Blackwell shipments to China. While President Trump briefly allowed Nvidia to sell a scaled-down Blackwell version, the policy was later reversed to reserve top-tier chips for US firms. Approvals for the second-most advanced H200 chips, authorized in principle by Trump, remain stalled due to regulatory safeguards.

Experts say DeepSeek’s reliance on Blackwells underscores China’s domestic shortfall in AI chips and the potential impact H200 approvals could have on Chinese AI development.

Use of Distillation Technique

The official added that DeepSeek likely used “model distillation,” a method in which an established, more capable AI model—such as those from OpenAI, Google, Anthropic, or xAI—generates or scores outputs that are then used to train a newer model, transferring the larger model’s knowledge. This approach accelerates training and improves performance.
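To illustrate the general idea (not DeepSeek’s actual method, which has not been disclosed), a minimal sketch of classic knowledge distillation follows: a student model is trained to match the teacher’s “softened” probability distribution over outputs, measured by KL divergence. The logit values below are made up for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    # Scale logits by temperature; a higher temperature yields a
    # softer (more uniform) probability distribution.
    exps = [math.exp(x / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between teacher and student softened distributions.

    Soft targets carry more information than hard labels (e.g. the
    relative plausibility of the wrong answers), which is what makes
    distillation an efficient way to transfer a teacher's knowledge.
    """
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Hypothetical logits over a 3-token vocabulary:
teacher = [4.0, 1.0, 0.5]
student = [2.0, 1.5, 1.0]
loss = distillation_loss(teacher, student)
print(f"distillation loss: {loss:.4f}")
```

In practice this loss is minimized by gradient descent over the student’s parameters; the loss is zero only when the student exactly reproduces the teacher’s distribution.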

Based in Hangzhou, DeepSeek drew attention last year with AI models that rivaled top US offerings, fueling concerns in Washington that China could narrow the AI gap despite export restrictions.
