DeepSeek launches 1.6 trillion parameter V4 on Huawei chips as U.S. escalates AI theft accusations — U.S. gov't alleges IP theft by DeepSeek and other Chinese AI firms

The DeepSeek logo against a hexagonal textured background
(Image credit: DeepSeek)

DeepSeek on Friday released a preview of its V4 large language model, the Hangzhou-based startup's most powerful to date, with 1.6 trillion parameters and a 1 million token context window. The model is the first major frontier release optimized for Huawei's Ascend AI processors rather than Nvidia hardware, and it arrived on the same day Reuters reported that the U.S. State Department had sent a diplomatic cable to embassies worldwide instructing staff to warn foreign governments about alleged IP theft by DeepSeek and other Chinese AI firms.

V4 comes in two variants: the flagship V4-Pro, priced at $3.48 per million output tokens, and V4-Flash, a smaller 284 billion parameter version, priced at $0.28 per million. OpenAI currently charges $30 per million output tokens for GPT-5.4, and Anthropic charges $25 for Claude Opus 4.6. DeepSeek acknowledges that V4 “falls marginally short” of those closed-source models, putting it roughly three to six months behind in development, but says it outperforms every other open-source competitor on agentic coding and reasoning benchmarks.
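Taken at face value, the published output-token prices imply a large gap. A minimal sketch of the price ratios, using only the list prices quoted above (actual bills also depend on input tokens and caching, which this ignores):

```python
# Back-of-the-envelope comparison of published output-token prices.
# Figures are the list prices quoted in this article; real costs
# also depend on input-token pricing and cache hits.
prices_per_million_output = {
    "DeepSeek V4-Pro": 3.48,
    "DeepSeek V4-Flash": 0.28,
    "GPT-5.4": 30.00,
    "Claude Opus 4.6": 25.00,
}

baseline = prices_per_million_output["DeepSeek V4-Flash"]
for model, price in prices_per_million_output.items():
    # Ratio relative to the cheapest tier (V4-Flash)
    print(f"{model}: ${price:.2f}/M output tokens "
          f"({price / baseline:.0f}x V4-Flash)")
```

On output pricing alone, GPT-5.4 comes out at roughly 107 times V4-Flash; the "30-80 times cheaper" figure cited later in the comments likely folds in input and cached-token pricing as well.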

The diplomatic cable, per Reuters, instructed embassy staff to speak to their foreign counterparts about “concerns over adversaries’ extraction and distillation” of U.S. models, naming DeepSeek alongside Moonshot AI and MiniMax. Two days earlier, the White House Office of Science and Technology Policy published a memo accusing Chinese entities of running "deliberate, industrial-scale campaigns" to distill American frontier AI systems.

China's foreign ministry called the accusations "groundless," according to Reuters, and DeepSeek has previously said its V3 model relied on naturally occurring data collected through web crawling and didn't intentionally use synthetic data generated by OpenAI. The diplomatic cable and the V4 launch both come just weeks before President Trump is scheduled to meet Chinese President Xi Jinping in Beijing for a summit expected to cover semiconductor export controls and IP disputes.


Luke James
Contributor

Luke James is a freelance writer and journalist. Although his background is in law, he has a personal interest in all things tech, especially hardware and microelectronics, and anything regulatory.

  • jp7189
    I hope they'll release an open source/weights version. Things are really heating up in that space. Openai's entry was really good, then gemma4 dropped and blew that away while being far more tunable than gemma3. Glm 4.7 flash is doing quite well though I wish 5 were a bit lighter weight.. but the real star is qwen 3.6 both their moe and dense models are stunning. All of this has happened within the last couple of weeks. Really exciting.

    After all that noise xai made.. they have been absent from the race.
    Reply
  • vanadiel007
    1.6 Trillion. Staggering number.
    Reply
  • usertests
    You could write an article about the hardware requirements alone. Here's one: https://apidog.com/blog/how-to-run-deepseek-v4-locally/
    You might be able to get away with running V4-Flash on a single H100 80 GB. I think you need workstation levels of RAM (e.g. >256 GB) because of how MOE works.

    The models are multimodal and can take video input.
    Reply
  • hotaru251
    every major model is made from IP theft. you can't let some do it w/o issue then have an issue when china does it.
    Reply
  • alan.campbell99
    hotaru251 said:
    every major model is made from IP theft. you can't let some do it w/o issue then have an issue when china does it.
    Yup. using 'IP theft' and genAI together in these kinds of accusatory statements is always interesting to say the least.
    Reply
  • zsydeepsky
    Not really all trained on Huawei chips.
    There are 2 models:
    1.6T v4-pro is probably trained on Nvidia chips
    285B v4-flash is probably trained on Huawei chips
    The guess was that DeepSeek had the big v4-pro earlier, while Huawei hadn't prepared their chips; later, they trained the small v4-flash to test the new platform.
    Both can do inference on Huawei chips, though, and probably already serve as API on Huawei Ascend clusters.

    And it's kinda hard to estimate their inference efficiency overall, since the API pricing has been constantly dropping every day since launch...quite unexpected.

    The latest api pricing is (2026/04/26):
    input (cache hit) / million tokens: (v4-flash)$0.0028 (v4-pro)$0.003625
    input (cache miss) / million tokens: (v4-flash)$0.14 (v4-pro)$0.435
    output / million tokens: (v4-flash)$0.28 (v4-pro)$0.87

Usually, DeepSeek sells API quota at a profit margin, which makes it the only lab to have trained its models with no outside investment money. So if this V4 API price is still profitable for them...then it's nuts to me, like it's already 30-80 times cheaper than its GPT/Claude counterparts.

    and DeepSeek has already open-sourced the weights under the MIT license:
https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash
https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro
    along with a comprehensive tech report:
    https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro/blob/main/DeepSeek_V4.pdf
    So...you can evaluate the "IP-theft" allegations with the context above, and get your own conclusions.
    Reply
  • justrudi
I logged in just to say how ridiculous this sounds.
You do nothing about IP theft by AI companies and buddy up with their CEOs.
Then someone steals your "stolen-made IP" and you accuse them of stealing.
Clean your own house first, please. Or live with the consequences.
    Reply
  • jp7189
    usertests said:
    You could write an article about the hardware requirements alone. Here's one: https://apidog.com/blog/how-to-run-deepseek-v4-locally/
    You might be able to get away with running V4-Flash on a single H100 80 GB. I think you need workstation levels of RAM (e.g. >256 GB) because of how MOE works.

    The models are multimodal and can take video input.
    Thanks for that link. According to it, on a single 80GB card, the model would have to be cut down to int4... whereas qwen3.6 27B dense runs unquantized in half that much vram. I'll have to try to find some benchmarks, but personally have never found an int4 model that I was impressed with, so I'm skeptical.
    Reply
  • usertests
    zsydeepsky said:
    So...you can evaluate the "IP-theft" allegations with the context above, and get your own conclusions.
    Most of the comments have dismissed the allegations under an "AI models are already theft" argument.

    From a practical standpoint, I don't think individuals should care if models were scraped or whatever you want to call it. If it results in a superior or at least somewhat equivalent open model, great. I don't think it's affecting anyone's ability to download, use, or distribute the model. Maybe the worst result would be that problems with the closed models could be copied or amplified in an open one.
    jp7189 said:
    Thanks for that link. According to it, on a single 80GB card, the model would have to be cut down to int4... whereas qwen3.6 27B dense runs unquantized in half that much vram. I'll have to try to find some benchmarks, but personally have never found an int4 model that I was impressed with, so I'm skeptical.
    I looked at a few different articles before picking that one. It's a real struggle to run this thing, as it's big (284B for the "small" one) and so the requirements are immense. The picture will become more clear as the days go on, but there's no cheap path here at the moment, and that will push users to rental instead.

    It will be interesting to see how the local AI scene evolves alongside DRAM developments. In the short term, we may see Medusa Halo with at least 192 GB of LPDDR6. Intel is launching a Crescent Island inference GPU with 160 GB of LPDDR5X this year. In the long term, it's all about 3D DRAM. Mega APUs could end up shipping with over a terabyte of memory a decade from now, but will it be enough to keep up with model sizes? Or will models hit a plateau?
    Reply
  • Shiznizzle
    I am just wondering how Meta/Facehell was allowed to get away with the IP theft when they scraped that library.

When American companies do it, it's called something other than theft. If an individual does it, they're thrown in jail. And when foreign governments do it, they get smeared as thieves.
    Reply
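The hardware estimates traded in the thread above (a single 80 GB card, int4 quantization, and >256 GB of system RAM for MoE offload) can be sanity-checked with rough arithmetic. The parameter count comes from the article; the bytes-per-weight figures are standard for each precision, and KV-cache and activation memory are deliberately ignored here:

```python
# Rough weight-memory estimate for the 284B-parameter V4-Flash at
# different quantization levels. This counts only the weights;
# real deployments also need KV-cache and activation memory.
PARAMS = 284e9  # V4-Flash parameter count, per the article

BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_memory_gb(params: float, dtype: str) -> float:
    """Memory needed just to hold the weights, in GB (1 GB = 1e9 bytes)."""
    return params * BYTES_PER_PARAM[dtype] / 1e9

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gb(PARAMS, dtype):.0f} GB")
```

Even at int4, the weights alone come to about 142 GB, well past a single 80 GB H100, which is why the thread lands on expert offloading: with MoE routing only a fraction of experts is active per token, so hot experts can sit in VRAM while the rest live in a large pool of system RAM.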