Why is Anthropic's CEO so hostile towards DeepSeek and Chinese AI?
DeepSeek has brought an existential crisis to Anthropic.
A few months ago, a scientist published an article declaring his and his company's benevolent intentions: to solve humanity's challenges in physical and mental health, poverty, peace, and the meaning of life through powerful artificial intelligence. Fast forward to today, and the same scientist has suddenly published another article strongly advocating a ban on exporting any American chips to China, in order to restrict Chinese AI development and maintain a "unipolar world" in AI (I'm shocked he would use such explicit terminology). The disconnect is jarring and carries a whiff of hypocrisy.
This person is Dario Amodei, the founder and CEO of Anthropic. He's an Italian-American, holds a Ph.D. in physics, and is a veteran AI scientist. He was once a key figure on OpenAI's research team, an early employee of Baidu's deep learning lab, and is now the founder of OpenAI's most significant competitor. He claims to be an idealist focused on building the most powerful and safest AI. At this moment, he has become the most vocal advocate, bar none, for comprehensive AI export controls against China.
Although Anthropic and its Claude series of models have limited public recognition in China, Anthropic remains one of the world's most popular large language model providers among AI developers, with a considerable following among Chinese AI researchers and developers in particular. Yet overnight, many Chinese AI practitioners publicly stated that they had lost their basic respect for Anthropic and for Amodei personally.
This is the effect of what amounts to a "declaration of war."
In his article titled "On DeepSeek and Export Controls," Dario Amodei casually dismisses claims that DeepSeek's achievements threaten America's AI advantage. While acknowledging innovation in the DeepSeek-V3 model, he adamantly refuses to recognize any breakthrough in DeepSeek's reasoning model R1, which has drawn even more attention (his motivations on this point will be a key focus later in this article). He is even more reluctant to acknowledge DeepSeek's achievements in computational cost and algorithmic efficiency. He uses an admittedly "unverified" rumor that DeepSeek possesses 50,000 smuggled NVIDIA Hopper-generation GPUs to argue that DeepSeek-V3 couldn't possibly have been trained for just $6 million. Clearly, Amodei cannot accept the increasingly acknowledged path DeepSeek has taken: replacing computational brute force with innovations in algorithmic efficiency. He relies on an unverified premise about DeepSeek smuggling high-end GPUs to make his point. Yet he also claims U.S. export controls on computing power to China haven't failed—apparently forgetting that his earlier argument was predicated on DeepSeek's alleged smuggling.
Let's break down his logical progression: (1) DeepSeek's influence is exaggerated; (2) V3 is innovative, but it couldn't have been that cheap; (3) they allegedly smuggled chips; (4) so they must have spent more on training than claimed; (5) DeepSeek isn't original anyway, since they built on American research, so of course their costs are lower; (6) the R1 reasoning model is absolutely not innovative, merely a reproduction of o1's results (ignoring OpenAI's acknowledgment that DeepSeek's reasoning achievements were discovered independently); (7) export controls haven't failed, they're working (forgetting his earlier premise that DeepSeek obtained smuggled GPUs); (8) we must create a unipolar AI world, because China absolutely cannot be allowed to produce models on par with ours (forgetting his opening claim that DeepSeek isn't a threat); (9) therefore, not just the H100 and H800 but even the entry-level H20 should be barred from export to China, so that China can't win.
You see, when a scientist who emphasizes logic and reasoning attempts to use a 10,000-word article to argue for a self-contradictory conclusion while maintaining the appearance of logical rigor, he ends up looking both clumsy and disingenuous.
This isn't Dario Amodei's first call for strengthened computing-power controls against China, and no one expects an American AI scientist to harbor inherent goodwill toward China. But his timing is worth analyzing: he advocates further export controls precisely when Silicon Valley is giving DeepSeek widespread attention, approval, and a certain degree of panic, while strongly denying DeepSeek's innovations in computational efficiency and model reasoning methods. What stands out is not the absence of goodwill, but the depth of his hostility and resentment toward China and toward the Chinese AI company DeepSeek in particular.
Why Does Dario Amodei "Look Down" on DeepSeek-R1?
While strongly speculating that DeepSeek-V3's training costs must exceed $6 million, Amodei does acknowledge V3 as genuinely innovative, but he insists on emphasizing that it is not a breakthrough, merely "an expected point on the continuous cost reduction curve." He argues that "the difference is that a Chinese company was the first to demonstrate the expected cost reduction, which has never happened before and carries geopolitical significance." This way of praising while visibly withholding genuine appreciation is exhausting to watch. I'd rather Amodei said directly, "American companies are all innovating on model cost reduction; DeepSeek just happened to be the first to achieve it." But straightforwardness apparently isn't one of his qualities.
When it comes to DeepSeek-R1, Amodei becomes remarkably direct, absolutely refusing to acknowledge R1 as a breakthrough. He leaves no room for ambiguity on this point, disregarding even OpenAI's acknowledgment that R1 made original breakthroughs in reinforcement learning methods. He ignores research findings indicating that DeepSeek's reinforcement learning works without human feedback intervention, marking what some call large language models' "AlphaGo moment." He insists that R1 merely applied reinforcement learning on top of V3, that everything it did merely reproduces o1, that every American AI company is attempting similar reasoning experiments, that this is a technological trend unrelated to open source, and that DeepSeek simply happened to get there first.
We needn't be indignant about Amodei's stubborn stance. After all, as a widely recognized and accomplished researcher in the AI field, Amodei's views on key issues can significantly influence how the AI industry, venture capital, Wall Street, and even Washington, D.C. view the DeepSeek phenomenon. This explains why he felt compelled to speak out. He's not defending OpenAI (his grudges with OpenAI run deep); rather, at this moment, he must come forward to lay the groundwork for Anthropic's next moves.
A notable fact is that Anthropic has yet to formally release any reasoning model, although Dario Amodei has publicly expressed disdain for standalone reasoning models in interviews (at the time, his remarks were primarily aimed at OpenAI).
Amodei's view is that reasoning isn't that difficult, and that foundation models matter more. Much as he subtly praised DeepSeek-V3's innovation while noting its weakness relative to Anthropic's Claude 3.5 Sonnet in programming and other evaluations, he has publicly acknowledged o1's breakthroughs while maintaining that Claude 3.5 Sonnet, as a pre-trained model, has demonstrated reasoning capabilities no weaker than o1's in specific scenarios and applications. He therefore doesn't believe reasoning models and general models should be separate, arguing that foundation models based on pre-training remain more important and can incorporate reasoning capabilities.
Thus, what's likely to happen is this: Anthropic plans to achieve a leap in model reasoning capabilities in its own way, distinct from OpenAI's and DeepSeek's, most likely manifesting as a next-generation flagship Claude foundation model with notable reinforcement learning capabilities. Don't forget that three months ago, Julian Schrittwieser, a key figure in DeepMind's reinforcement learning work and a core contributor to the AlphaGo line of research, joined Anthropic.
Anthropic, which split off entirely from OpenAI and views it as its most direct (almost sole) competitor, is in some sense the most orthodox inheritor of OpenAI's pre-GPT-4-era large language model principles. Amodei has repeatedly denied that pre-training is "hitting a wall" or that scaling effects are diminishing as data runs out, repeatedly emphasizing the importance of the classical "scaling laws" (the observation that improving model performance requires continually expanding model scale). AI researchers and developers genuinely anticipate that Anthropic will break through the bottlenecks of scaling laws and pre-trained models by launching a new-generation flagship pre-trained model with stronger reasoning capabilities.
But so far, Anthropic hasn't released it. Given its excellent track record in model training and its history of never making premature announcements, there's reason to believe Anthropic is intensely preparing this pre-trained model with enhanced reasoning capabilities, in order to prove that OpenAI's o1 isn't the optimal path to better reasoning. In fact, Amodei has already previewed this in his interview with The Wall Street Journal.
However, with DeepSeek-R1's release, Anthropic suddenly has much more to prove.
First, DeepSeek-R1, following o1, further proves that the path of standalone reasoning models trained via reinforcement learning is viable, possibly even optimal. Second, DeepSeek-R1 verifies that reinforcement learning can teach AI to think deeply on its own, without human feedback (Dario Amodei co-authored the foundational work on reinforcement learning from human feedback). Third, DeepSeek-R1 proves that all of this can be achieved at significantly reduced training cost.
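The distinction at stake in that second point can be made concrete. In RLHF, the approach Amodei helped pioneer, the reward signal comes from a model trained on human preference data; in R1-style training, the reward is computed by a mechanical rule that checks the answer itself, so no human judgment is needed inside the loop. The sketch below is purely illustrative: the function names, the `<answer>` tag format, and the 0.2/0.8 reward split are hypothetical stand-ins, not either lab's actual implementation.

```python
import re

def rlhf_reward(response: str, preference_model) -> float:
    # RLHF-style reward: a learned model, trained on human preference
    # comparisons, scores the response. Humans enter the loop through
    # the preference data used to train that model.
    return preference_model(response)

def verifiable_reward(response: str, expected_answer: str) -> float:
    # R1-style rule-based reward (illustrative): extract the final
    # answer and check it mechanically. Because no human judgment is
    # needed, this loop can run at scale on any problem whose answer
    # can be checked automatically (math, code, etc.).
    match = re.search(r"<answer>(.*?)</answer>", response, re.DOTALL)
    if match is None:
        return 0.0  # malformed output earns nothing
    answer = match.group(1).strip()
    # Hypothetical split: small reward for correct format,
    # larger reward for a correct answer.
    return 0.2 + (0.8 if answer == expected_answer else 0.0)

# Toy stand-in for a learned preference model (not a real one).
toy_pref_model = lambda text: min(1.0, len(text) / 100)

good = "Some chain of thought... <answer>42</answer>"
bad  = "Some chain of thought... <answer>41</answer>"
print(verifiable_reward(good, "42"))  # 1.0: right format, right answer
print(verifiable_reward(bad, "42"))   # 0.2: right format, wrong answer
print(rlhf_reward(good, toy_pref_model))
```

The practical consequence is the one the article describes: a rule-based reward removes the cost and bottleneck of collecting human preference labels, which is part of why R1-style training can be so much cheaper to scale.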
This means that once Anthropic releases its new foundation model with enhanced reasoning capabilities, it must answer harder questions than before: How do its reinforcement learning capabilities compare to R1's? What advantages does human feedback-based reinforcement learning really have over R1's autonomous reinforcement learning? What are the training costs, and are there cheaper, more efficient methods? Can API prices come down? (Claude's API is among the most expensive on the market, while DeepSeek's is among the cheapest.)
All these thorny issues and troubles come from DeepSeek.
Therefore, before launching its own new model with enhanced reasoning capabilities, Anthropic's central figure, Dario Amodei, can only step forward proactively to diminish and dispel people's favorable first impressions of DeepSeek-R1: acknowledging it as an innovation and a breakthrough is absolutely unacceptable, and admitting that its costs are genuinely lower is equally hard to swallow.
This is a contest between two competing paths, with somewhat "life-or-death" stakes. The two paths also represent, to some degree, classical Silicon Valley-style model training versus Chinese-style model training in the "post-pre-training era" of large language models: the former relies on an advantage in computational resources, improving model performance through brute computational force; the latter focuses on algorithmic efficiency, improving model performance while reducing training costs through architectural and engineering innovation.
Anthropic is an even stronger advocate than OpenAI for computational scale, model scale, and the brute-force aesthetic, which is why Dario Amodei's newly published article not only subtly releases hostility toward DeepSeek but openly projects that hostility onto China's entire AI field.