DeepSeek launches 1.6 trillion parameter V4 on Huawei chips as U.S. escalates AI theft accusations
U.S. State Department warns embassies worldwide about Chinese model distillation.
DeepSeek on Friday released a preview of its V4 large language model, the Hangzhou-based startup's most powerful to date, with 1.6 trillion parameters and a 1 million token context window. The model is the first major frontier release optimized for Huawei's Ascend AI processors rather than Nvidia hardware, and it arrived on the same day Reuters reported that the U.S. State Department had sent a diplomatic cable to embassies worldwide instructing staff to warn foreign governments about alleged IP theft by DeepSeek and other Chinese AI firms.
V4 comes in two variants: V4-Pro, the flagship, which costs $3.48 per million output tokens, and V4-Flash, a smaller 284-billion-parameter version, which costs $0.28 per million. OpenAI currently charges $30 per million output tokens for GPT-5.4, and Anthropic charges $25 for Claude Opus 4.6. DeepSeek acknowledges that V4 "falls marginally short" of those closed-source models, which it estimates are roughly three to six months ahead in development, but says the model outperforms every other open-source competitor on agentic coding and reasoning benchmarks.
DeepSeek trained its earlier V3 model on 2,048 Nvidia H800 GPUs, and the company has faced multiple investigations over whether it acquired restricted Nvidia hardware through intermediaries in Singapore.
V4 sidesteps that supply chain entirely by training on domestic Ascend chips. Huawei confirmed day-zero compatibility across its full Ascend SuperNode product line, including its latest 950 series processors, and DeepSeek said V4-Pro pricing could fall further once Huawei scales up Ascend 950 production in the second half of this year.
The diplomatic cable, per Reuters, instructed embassy staff to speak to their foreign counterparts about “concerns over adversaries’ extraction and distillation” of U.S. models, naming DeepSeek alongside Moonshot AI and MiniMax. Two days earlier, the White House Office of Science and Technology Policy published a memo accusing Chinese entities of running "deliberate, industrial-scale campaigns" to distill American frontier AI systems.
Those accusations build on claims Anthropic made in February, when the company said DeepSeek, Moonshot, and MiniMax had used 24,000 fraudulent accounts to make 16 million exchanges with its Claude model. OpenAI has also accused DeepSeek of distilling its models.
China's foreign ministry called the accusations "groundless," according to Reuters, and DeepSeek has previously said its V3 model relied on naturally occurring data collected through web crawling and didn’t intentionally use synthetic data generated by OpenAI. The diplomatic cable and the V4 launch both come just weeks before President Trump is scheduled to visit Chinese President Xi Jinping in Beijing for a summit expected to cover semiconductor export controls and IP disputes.

Luke James is a freelance writer and journalist. Although his background is in law, he has a personal interest in all things tech, especially hardware and microelectronics, and anything regulatory.
jp7189 I hope they'll release an open source/weights version. Things are really heating up in that space. OpenAI's entry was really good, then Gemma 4 dropped and blew that away while being far more tunable than Gemma 3. GLM 4.7 Flash is doing quite well, though I wish GLM 5 were a bit lighter weight. But the real star is Qwen 3.6; both their MoE and dense models are stunning. All of this has happened within the last couple of weeks. Really exciting.
After all the noise xAI made, they have been absent from the race.
usertests You could write an article about the hardware requirements alone. Here's one: https://apidog.com/blog/how-to-run-deepseek-v4-locally/
You might be able to get away with running V4-Flash on a single H100 80 GB. I think you need workstation levels of RAM (e.g. >256 GB) because of how MoE works; rough math below.
The models are multimodal and can take video input.
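To put numbers on that, here's a minimal back-of-the-envelope sketch in Python. The 284B and 1.6T parameter counts come from the article; the bytes-per-parameter figures, and the idea that inactive experts get offloaded to system RAM, are my assumptions, not anything from DeepSeek's tech report.

def weight_gib(params_b: float, bytes_per_param: float) -> float:
    """Approximate weight footprint in GiB for a params_b-billion-parameter model."""
    return params_b * 1e9 * bytes_per_param / 1024**3

# Parameter counts from the article; precisions are common local-inference choices.
models = {"V4-Flash": 284, "V4-Pro": 1600}
precisions = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}  # bytes per parameter (assumed)

for name, params_b in models.items():
    row = ", ".join(f"{p}: ~{weight_gib(params_b, bpp):,.0f} GiB"
                    for p, bpp in precisions.items())
    print(f"{name} -> {row}")

Even at int4, V4-Flash's weights come out around 132 GiB, so a single 80 GB card only works if most of the inactive experts sit in system RAM and only the active experts are resident on the GPU, which lines up with the ">256 GB of RAM" guess above.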
hotaru251 Every major model is made from IP theft. You can't let some do it w/o issue then have an issue when China does it.
alan.campbell99
hotaru251 said: Every major model is made from IP theft. You can't let some do it w/o issue then have an issue when China does it.
Yup. Using "IP theft" and genAI together in these kinds of accusatory statements is always interesting, to say the least.
zsydeepsky Not all of it was trained on Huawei chips.
There are two models:
The 1.6T V4-Pro was probably trained on Nvidia chips.
The 284B V4-Flash was probably trained on Huawei chips.
The guess is that DeepSeek had the big V4-Pro ready earlier, before Huawei's chips were ready; later, they trained the smaller V4-Flash to test the new platform.
Both can do inference on Huawei chips, though, and are probably already served as APIs on Huawei Ascend clusters.
It's also hard to estimate their overall inference efficiency, since the API pricing has been dropping constantly every day since launch...quite unexpected.
The latest API pricing (as of 2026/04/26), per million tokens:
Input (cache hit): $0.0028 (V4-Flash) / $0.003625 (V4-Pro)
Input (cache miss): $0.14 (V4-Flash) / $0.435 (V4-Pro)
Output: $0.28 (V4-Flash) / $0.87 (V4-Pro)
DeepSeek usually sells API quota at a profit margin; they are the only lab that has trained its models with no outside investment money. So if this V4 API pricing is still profitable for them...then it's nuts to me, since it's already roughly 30-80 times cheaper than the GPT/Claude counterparts (quick check below).
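A quick sanity check on that multiple, using the $30 and $25 output prices quoted in the article; the exact ratios obviously shift as DeepSeek keeps cutting prices:

# Output-token price ratios: closed models (from the article) vs. V4 (this thread).
closed = {"GPT-5.4": 30.00, "Claude Opus 4.6": 25.00}  # $ per million output tokens
deepseek = {"V4-Pro": 0.87, "V4-Flash": 0.28}

for d_name, d_price in deepseek.items():
    for c_name, c_price in closed.items():
        print(f"{d_name} vs {c_name}: ~{c_price / d_price:.0f}x cheaper")

That works out to roughly 29-34x for V4-Pro and 89-107x for V4-Flash on output tokens alone.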
DeepSeek has also already open-sourced the weights under the MIT license:
https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash
https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro
along with a comprehensive tech report:
https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main/DeepSeek_V4.pdf
So...you can evaluate the "IP theft" allegations with the context above and draw your own conclusions.
justrudi I logged in just to say how ridiculous this sounds.
You do nothing about IP theft by AI companies and buddy up with their CEOs.
Then someone steals your "IP built on stolen IP" and you accuse them of stealing.
Clean your own house first, please. Or live with the consequences.
jp7189
usertests said: You could write an article about the hardware requirements alone. Here's one: https://apidog.com/blog/how-to-run-deepseek-v4-locally/ You might be able to get away with running V4-Flash on a single H100 80 GB. I think you need workstation levels of RAM (e.g. >256 GB) because of how MoE works. The models are multimodal and can take video input.
Thanks for that link. According to it, on a single 80GB card the model would have to be cut down to int4, whereas Qwen 3.6 27B dense runs unquantized in half that much VRAM (a rough loading sketch is below). I'll have to try to find some benchmarks, but personally I've never found an int4 model that I was impressed with, so I'm skeptical.
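For anyone who wants to try the int4 route anyway, here's a minimal sketch using Hugging Face transformers with bitsandbytes 4-bit quantization. The repo ID comes from the links earlier in the thread; whether bitsandbytes supports the V4 architecture on day one is an assumption on my part, and device_map="auto" spilling weights into CPU RAM is exactly why you'd want that >256 GB of system memory.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# NF4 4-bit weight quantization; compute still happens in bf16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_id = "deepseek-ai/DeepSeek-V4-Flash"  # repo linked above; untested assumption

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",        # spreads weights across GPU VRAM and CPU RAM
    trust_remote_code=True,   # DeepSeek releases have typically needed custom code
)

inputs = tokenizer("Explain mixture-of-experts routing in one paragraph.",
                   return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(out[0], skip_special_tokens=True))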
usertests
zsydeepsky said: So...you can evaluate the "IP theft" allegations with the context above and draw your own conclusions.
Most of the comments have dismissed the allegations under an "AI models are already theft" argument. From a practical standpoint, I don't think individuals should care whether the models were scraped, distilled, or whatever you want to call it. If it results in a superior, or at least roughly equivalent, open model, great. It doesn't affect anyone's ability to download, use, or distribute the model. Maybe the worst outcome would be that problems with the closed models get copied or amplified in an open one.
jp7189 said: Thanks for that link. According to it, on a single 80GB card the model would have to be cut down to int4, whereas Qwen 3.6 27B dense runs unquantized in half that much VRAM. I'll have to try to find some benchmarks, but personally I've never found an int4 model that I was impressed with, so I'm skeptical.
I looked at a few different articles before picking that one. It's a real struggle to run this thing; even the "small" one is 284B, so the requirements are immense. The picture will become clearer as the days go on, but there's no cheap path here at the moment, and that will push users toward renting instead.
It will be interesting to see how the local AI scene evolves alongside DRAM developments. In the short term, we may see Medusa Halo with at least 192 GB of LPDDR6, and Intel is launching its Crescent Island inference GPU with 160 GB of LPDDR5X this year. In the long term, it's all about 3D DRAM. Mega APUs could be shipping with over a terabyte of memory a decade from now, but will that be enough to keep up with model sizes? Or will models hit a plateau? Some rough capacity math below.
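As a rough way to frame that question, here's a small sketch of the largest weight footprint each memory tier could hold. The capacities are the ones mentioned above; reserving ~20% of memory for KV cache, activations, and the OS is my own guess.

# Largest model (in billions of parameters) whose weights fit in a memory pool,
# assuming ~20% of capacity is reserved for KV cache, activations, and the OS.

USABLE = 0.8  # fraction of memory available for weights (assumption)

def max_params_billions(mem_gib: float, bytes_per_param: float) -> float:
    return mem_gib * 1024**3 * USABLE / bytes_per_param / 1e9

tiers = {"Medusa Halo (192 GiB)": 192,
         "Crescent Island (160 GiB)": 160,
         "Future APU (1 TiB)": 1024}

for name, gib in tiers.items():
    row = ", ".join(f"{prec}: ~{max_params_billions(gib, bpp):,.0f}B"
                    for prec, bpp in [("fp16", 2.0), ("int8", 1.0), ("int4", 0.5)])
    print(f"{name} -> {row}")

By that yardstick, V4-Flash at int4 (~132 GiB of weights) would squeeze into a 192 GiB machine, and even the 1.6T V4-Pro at int4 (~745 GiB) would fit in a terabyte-class APU.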
Shiznizzle I am just wondering how Meta/Facehell was allowed to get away with IP theft when they scraped that library.
When American companies do it, it's called something other than theft. If an individual does it, they're thrown in jail. And when foreign governments do it, it gets smeared as theft.