A Chinese state-sponsored cybercriminal group is believed to be behind what researchers say is the first documented cyber-espionage operation executed largely by AI rather than humans.
The campaign, detected in mid-September, used Anthropic’s Claude Code tool to probe and infiltrate around thirty organisations across tech, finance, chemicals, and government.
According to Anthropic, the attackers leaned heavily on AI’s “agentic” features, using the model not as an assistant but as the primary operator of the campaign. The group broke Claude’s guardrails by feeding it fragmented, context-free prompts and posing as a legitimate cybersecurity firm conducting defensive testing.
Once jailbroken, the model performed reconnaissance, identified high-value data, wrote exploit code, harvested credentials, and exfiltrated information at machine speed.
A Pace Human Teams Can’t Match
Anthropic says the AI carried out roughly 80 to 90% of the operation, with humans stepping in only a handful of times to make strategic decisions. At peak, Claude was firing off thousands of actions (sometimes several per second) at a pace no human team could match. It also compiled documentation of its own attacks, creating files of stolen credentials and maps of compromised systems to support follow-on operations.
The espionage run wasn’t flawless. Claude occasionally hallucinated credentials or misclassified public data as sensitive, a reminder that fully autonomous hacking still carries reliability gaps.
However, the campaign’s scale and autonomy mark a shift that researchers have been warning about: AI models are now capable of chaining tasks, managing tools, and executing complex intrusions that were previously out of reach for all but the most well-resourced actors.
Once the activity was spotted, Anthropic moved to contain the breach over ten days, banning accounts, notifying affected organisations, and working with authorities. The company says it has since upgraded its detection systems and classifiers to spot similar misuse earlier.
The Barrier Has Dropped Sharply
Anthropic warns that the barrier to running advanced cyberattacks has dropped sharply. With the right setup, less-skilled actors could soon replicate what once required an expert team, using AI agents to automate reconnaissance, vulnerability discovery, and data processing across enormous targets.
This incident follows earlier “vibe-hacking” research in which human operators guided models through harmful tasks. In this case, the humans mostly stepped back. Anthropic believes similar patterns are likely emerging across other frontier models, as attacker tradecraft shifts to exploit the latest AI capabilities.
The company argues that despite the risks, models like Claude remain essential for defence. Its own threat-intelligence team relied on Claude to sift through the vast datasets generated during the investigation. Anthropic is calling on security teams to begin operationalising AI in areas such as SOC automation, threat detection, vulnerability scanning, and incident response, while also demanding stronger safeguards across the industry.
Anthropic plans to continue publishing case studies to help governments, researchers, and defenders prepare for the next wave of AI-enabled attacks.
Operational Doctrine
Michael Bell, Founder & CEO at Suzu Labs, says: “If accurate, this represents the inflection point where AI systems execute 80 to 90% of sophisticated attacks autonomously, not just advise attackers. Jailbreaking Claude by convincing it this was legitimate penetration testing shows this isn’t an edge case anymore, it’s operational doctrine.”
“The technical scenario is feasible, and these attack patterns will be weaponized at scale. Organizations deploying AI agents with tool access need detection capabilities today, regardless of how this specific disclosure evolves,” he adds.
“Organizations need to prepare for AI-powered attacks whether or not this specific disclosure proves exactly as described, because the jailbreak techniques and autonomous exploitation patterns are technically feasible and will be weaponized regardless.”
The Importance of Measurement, Telemetry
Noelle Murata, Sr. Security Engineer at Xcape Inc, says security professionals should anticipate agentic AI being used both offensively and defensively: they should tighten rate limits and anomaly detection on their own LLM endpoints, limit API keys and scopes, and monitor for scripted bursts indicative of model misuse.
“On the enterprise side, they should strengthen identity verification (FIDO2), reduce session/token durations, and watch for high-speed reconnaissance activity consistent with AI tools. Anthropic and external researchers also caution against overhyping these findings, as some claims are disputed. This emphasizes the importance of measurement and telemetry when implementing LLMs for sensitive workflows,” Murata adds.
“The time for predictive AI defense is over; the future of cybersecurity is a real-time, autonomous AI war.”
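To make the "scripted bursts" advice above concrete, here is a minimal sketch of a sliding-window burst detector for requests to an LLM endpoint. It is an illustration only, not a method described by Murata or Anthropic: the `BurstDetector` class, the client identifiers, and the threshold of five requests per second are all hypothetical choices for the example, and a real deployment would tune them against its own telemetry.

```python
from collections import deque


class BurstDetector:
    """Flags a client whose request count exceeds a limit inside a sliding time window.

    Hypothetical helper for illustration; the thresholds are assumptions.
    """

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps: dict[str, deque[float]] = {}

    def record(self, client_id: str, timestamp: float) -> bool:
        """Record one request and return True if the client is now over the limit."""
        window = self._timestamps.setdefault(client_id, deque())
        window.append(timestamp)
        # Drop requests that have aged out of the window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        return len(window) > self.max_requests


# Assumed policy for the example: more than 5 requests per second is suspicious.
detector = BurstDetector(max_requests=5, window_seconds=1.0)

# A scripted client sending a request every 50 ms.
scripted = [detector.record("api-key-scripted", i * 0.05) for i in range(10)]

# A human-paced client sending a request every 2 seconds.
human = [detector.record("api-key-human", i * 2.0) for i in range(10)]

print("scripted client flagged:", any(scripted))
print("human-paced client flagged:", any(human))
```

A per-key sliding window keeps the check cheap and easy to reason about. In practice it would likely be one signal among several, alongside API key scoping and session limits, rather than a standalone control.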
The Tip of the Iceberg
This is simply the tip of the iceberg and a clear indication of the future threat landscape, comments John Watters, CEO and Managing Partner at iCOUNTER.
“I’ve spoken at length of the movement where all victims become Patient Zero as adversaries leverage AI to conduct reconnaissance on a target, then build bespoke capabilities designed to exploit each specific target. Just look at the success of this operation leveraging off the shelf AI capability. Imagine what an adversary can do with a well-tuned LLM purpose built for an espionage mission.”
A Smart Coordinator for Standard Offensive Tools
Toby Lewis, Global Head of Threat Analysis at Darktrace, says that although this campaign is not a fully autonomous attack, it does demonstrate how threat actors are already utilizing AI to orchestrate and scale the same techniques we’ve seen for years – from reconnaissance and credential theft to lateral movement and data exfiltration.
“The AI use here is essentially a smart coordinator for standard offensive tools, allowing an operator to say ‘scan here, pivot there, package this up’ in plain language instead of writing custom scripts for every step. This allows attackers to rapidly prototype and refine attack chains, making their operations more agile and can allow them to switch from one target to the next more quickly without having to completely re-tool.”
Lewis says it’s important for entities to remember that AI-driven attacks cannot always be identified as such: regardless of whether the code was produced by an AI system or written manually, it behaves the same once it’s inside the victim’s environment.
Rethink Threat Models
Chrissa Constantine, Senior Cybersecurity Solution Architect at Black Duck, believes this marks a fundamental shift in the threat landscape. What once required months of coordinated human effort can now be accelerated through AI-driven automation. Key implications include:
- Lower Barrier to Entry: Sophisticated attacks no longer demand elite hacking teams; smaller actors can scale operations using AI.
- Speed and Volume: The model processed thousands of requests at machine speed, which is far beyond human capacity.
- Stealth and Complexity: Multi-stage campaigns orchestrated by AI agents are harder to detect and disrupt.
“As AI systems gain agency, tool access, and decision-making capabilities, defenders must rethink threat models. Anthropic notes that the same advanced features enabling misuse are also critical for defense – underscoring the urgency for AI-augmented security, stronger detection, and new safeguards,” she adds.
Techniques observed or inferred include:
- Prompt Engineering: Assigning personas and stepwise instructions to bypass guardrails.
- Context Manipulation: Breaking tasks into innocuous steps to hide malicious intent.
- Agentic Loops: Running models iteratively with minimal human input.
- Tool Invocation: Leveraging APIs and external tools via protocols like MCP.
- Jailbreak Strategies: Misrepresenting tasks as legitimate (e.g., “simulate a penetration test”).
This is no longer a theoretical risk, but an active threat, Constantine says. “The cybersecurity community must treat AI-agent misuse as a present danger, not a future possibility.”
Fighting a Good Fight
Trey Ford, Chief Strategy and Trust Officer at Bugcrowd says the notion of “dual use” has always been a source of frustration in cybersecurity. “We cannot downplay the importance of developing offensive capabilities to test our defenses—finding and fixing issues before malicious actors exploit them. Anthropic is fighting a good fight; they’ve invested heavily in a strong security team, and a very capable threat intelligence team who are monitoring, reporting, and sharing their work.”
According to Ford, the old-world pattern of quietly addressing and disposing of issues, attacks, and abuse only benefits the attackers. Ultimately, sunshine is the best disinfectant: sharing this in the light of day lets the public learn and adapt, which helps us all improve.
“We need to support and encourage companies to follow Anthropic’s example here by collecting actionable intelligence, working with the various government agencies, and then notifying the public of changes. AI makes humans faster, whether your intentions are altruistic or malicious. What’s great about how Anthropic has handled this is they’ve captured intelligence about the threat actors’ tactics and procedures, and they’re sharing how they’re working. This also underscores the need for renewing protections under CISA 2015. There are no legal protections for intelligence sharing with that coverage expired.”
You can read the full report here.
Information Security Buzz News Editor
Kirsten Doyle has been in the technology journalism and editing space for nearly 24 years, during which time she has developed a great love for all aspects of technology, as well as words themselves. Her experience spans B2B tech, with a lot of focus on cybersecurity, cloud, enterprise, digital transformation, and data centre. Her specialties are in news, thought leadership, features, white papers, and PR writing, and she is an experienced editor for both print and online publications.


