Anthropic removes its hidden code against Chinese distillation: protecting the model, at the cost of trust

🕒 Published on Zendoric: July 2, 2026 · 08:26

Anthropic removes from Claude Code a steganography mechanism that, since March, had secretly detected Chinese competitors and unauthorized resellers by cross-referencing time zone and domains. The case exposes how far the war to protect models from 'distillation' goes, and the price of doing so without telling anyone.

🎧 Listen to the analysis

By The Register · July 1, 2026.

Anthropic has confirmed that it will remove from Claude Code, its coding agent, a covert mechanism that had been active since March. As engineer Thariq Shihipar explained, the system checked the environment variable that defines the base URL for API requests and, if it detected that it had been redirected to a proxy or gateway, cross-referenced the time zone and hostname against a list of Chinese AI labs, other AI companies, account resellers and known gateway domains. The stated goal was to curb account abuse by unauthorized resellers and to guard against 'distillation': the technique of copying someone else's model by bombarding it with queries and training your own model on the responses.

What is striking is not the goal, but the method. As the developer known as Thereallo revealed, the tool concealed these controls using steganography: nearly invisible Unicode markers silently altered the system prompt, the proxy or gateway classification was encoded within a sentence that looked like ordinary English, and the list of suspicious domains was hidden behind XOR and base64. None of this appeared in public documentation nor, as far as could be confirmed, in the terms of service; Anthropic has not directly addressed whether it disclosed this anywhere, and merely says that it 'had already planned' to remove it because it has 'more robust mitigations,' without specifying which ones.

The episode is not an isolated case. In February, Anthropic had already announced investment in defenses against distillation: detection classifiers, behavioral fingerprinting systems, access controls, intelligence sharing with other labs and countermeasures to make it harder to reproduce a model from its outputs. One of those countermeasures —a flag called ANTI_DISTILLATION_CC, discovered when Claude Code's source code was leaked— injects fake tool data into API requests to 'poison' any training dataset built from them. The now-removed steganography was, in that sense, one more piece of a defensive scaffolding considerably broader than the company's public communication suggested.

This move fits a tension we have been flagging at Zendoric: the competition between the U.S. and China is no longer fought only on benchmarks, but in the active protection of models' intellectual property. A recent White House Executive Order articulates precisely that concern —protecting American AI from 'foreign adversaries'— and episodes like this show that companies are already acting on their own, without waiting for policy to catch up. The problem is that doing so covertly, without telling the developers who entrust their workflows to the tool, strains exactly the kind of trust a product for programmers needs to survive.

Our take: distillation is a real and legitimate threat for any company that has invested billions in training a frontier model, and it is reasonable that Anthropic wants to defend itself against a competitor —Chinese or otherwise— copying its work through a simple proxy. But the way it did so reveals something more uncomfortable: AI labs are willing to instrumentalize their own development tools as silent surveillance sensors, without the user knowing or being able to audit it. In the short term this erodes trust in the AI developer-tools ecosystem —already heavily scrutinized after code leaks and findings by independent researchers— and gives ammunition to those calling for greater regulatory transparency about what these agents really do with the data they process. In the long term, however, this kind of friction is part of the maturation process of an industry still inventing its own rules: the sooner the disclosure and auditing of these mechanisms are normalized —rather than coming to light only when a developer dismantles them in public— the more solid the trust that will underpin the next decade of agentic AI adoption.

Sources & references

The Register — Anthropic removes its hidden code against Chinese distillation: protecting the model, at the cost of trust