Key facts
- Anthropic accused Alibaba of illicitly extracting capabilities from its Claude AI model.
- Anthropic identified "industrial-scale campaigns" by DeepSeek, Moonshot AI, and MiniMax to illicitly extract Claude's capabilities.
- These campaigns involved over 16 million exchanges generated through approximately 24,000 fraudulent accounts.
- The attacks violated Anthropic's terms of service and regional access restrictions.
- Anthropic warned that illicitly distilled models pose national security risks and could enable authoritarian governments to deploy AI for malicious activities.
U.S. AI giant Anthropic has accused Chinese e-commerce conglomerate Alibaba Group of "brazenly" and "illicitly" distilling its Claude AI model, calling the effort "the largest known distillation attack" on the company to date. Anthropic also identified "industrial-scale campaigns" by DeepSeek, Moonshot AI, and MiniMax to illicitly extract Claude's capabilities.
These "distillation attacks" involve training a less capable AI model on the outputs of a stronger one. Anthropic claims these attacks violate its terms of service and regional access restrictions, as Claude's services are prohibited in China. The company warned that illicitly distilled models lack necessary safeguards, posing significant national security risks and potentially enabling authoritarian governments to deploy AI for malicious activities such as cyber operations, disinformation campaigns, and mass surveillance.
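To make the term concrete, distillation in general can be illustrated with a toy sketch in Python. Everything here is hypothetical and chosen for illustration: the `teacher` function stands in for a stronger model, the student is a two-parameter logistic model, and the query range and learning rate are arbitrary. For simplicity the student has the same form as the teacher, so it can recover it almost exactly; real distillation typically involves a smaller student that only approximates the teacher. The sketch does not reflect how any company named in this article operates.

```python
import math
import random


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def teacher(x: float) -> float:
    """Hypothetical 'stronger' model: returns P(label = 1 | x)."""
    return sigmoid(3.0 * x - 1.0)


def distill(num_queries: int = 200, epochs: int = 2000,
            lr: float = 0.5, seed: int = 0) -> tuple[float, float]:
    """Fit a student (w, b) using only the teacher's outputs."""
    rng = random.Random(seed)

    # Step 1: query the teacher and record its outputs (soft labels).
    inputs = [rng.uniform(-2.0, 2.0) for _ in range(num_queries)]
    soft_labels = [teacher(x) for x in inputs]

    # Step 2: train the student by gradient descent on the
    # cross-entropy between its predictions and the soft labels.
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, target in zip(inputs, soft_labels):
            err = sigmoid(w * x + b) - target
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / num_queries
        b -= lr * grad_b / num_queries
    return w, b


if __name__ == "__main__":
    w, b = distill()
    print(f"student parameters: w={w:.3f}, b={b:.3f}")
```

The student never sees the teacher's internal parameters, only its answers to queries, which is why the volume of exchanges matters in the campaigns described here.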
Alibaba-linked operators reportedly targeted Claude's ability to handle complex tasks, its decision-making approach, its reasoning capabilities, and its tool use. Anthropic is implementing new systems to identify and prevent future distillation attacks.
