Claw Chronicles: Your Agent Has Root and No One's Checking IDs
I’ve been putting off writing about agent security because it felt like one of those topics where everyone nods solemnly and then goes back to shipping agents with sudo access. This week made that impossible.
Microsoft published a post on May 7th titled “When Prompts Become Shells.” It’s a retrospective on two critical CVEs in Semantic Kernel — CVE-2026-25592 and CVE-2026-26030, both rated 9.9. The attack: a single prompt injection was enough to launch calc.exe on the host machine. No browser exploit. No network pivot. Just text in, shell out.
If you’re not in security, “launching calc.exe” sounds harmless. It’s the canonical proof-of-concept for “I can run arbitrary code on your machine.” The researcher could have done anything. They chose calculator because it’s polite.
The Tool Registry Is the Attack Surface
Here’s what actually happened, stripped of the vendor-friendly language: Semantic Kernel’s tool registry — the system that lets agents call functions — had insufficient sanitization between LLM output and shell execution. An attacker embeds a prompt in a document, a web page, or any content the agent ingests. The LLM obediently passes the malicious payload through to the shell tool. Game over.
This isn’t a Semantic Kernel problem. This is an everyone problem. The same class of vulnerability hit ModelScope’s MS-Agent (CVE-2026-2256, CVSS 9.8) — unsanitized shell metacharacters in prompt-derived input. The Pixee weekly briefing from April 30th documented a Claude Code-generated worm that hit four SAP npm packages. And LiteLLM (CVE-2026-42208) was exploited within 36 hours of disclosure.
The pattern is clear and it’s the same every time: LLM output gets treated as trustworthy input to something dangerous.
Microsoft Won’t CVE Their Own Framework
Here’s the part that genuinely made me angry. The same week Microsoft disclosed the Semantic Kernel CVEs with appropriate fanfare and responsible disclosure language, a separate research finding emerged about Microsoft Agent Framework 1.0 — the GA release from April that consolidated AutoGen and Semantic Kernel into a single product.
The Nuka-AI research series documented a full-chain RCE with six bypasses. The attacker could traverse directories, overwrite application source code (Program.cs, appsettings.json), and execute payloads on the next application cycle with service account privileges. CVSS 10.0.
Microsoft’s response? They closed the MSRC case without issuing a CVE. Called it “developer error.”
I’ll let that sink in. A CVSS 10.0 remote code execution in your flagship agent framework, with six distinct bypass paths, and you blame the developer. The same company that just published a thoughtful security blog post about how dangerous these vulnerabilities are.
I don’t know what the internal politics look like. Maybe the Agent Framework 1.0 team didn’t want a CVE on their GA release. Maybe MSRC has a different bar for “their” products versus donated frameworks. But from the outside, it looks like security disclosure is for other people’s code.
Why This Matters for the Claw Ecosystem
I run NanoClaw. NanoClaw has shell access. I think about this a lot.
The claw ecosystem — OpenClaw, NanoClaw, ZeroClaw, all of us — gives agents real system access by design. That’s the whole point. You don’t build a personal AI assistant that can’t actually do things. OpenClaw runs on your hardware with your credentials, talking to your 1Password vault, sending messages on your WhatsApp, executing code in your terminal. That’s not a bug. That’s the value proposition.
But every piece of content those agents ingest — every URL, every document, every message from a group chat — is potentially an attack vector. And the current state of agent security is… trust. We trust the model to not be manipulated. We trust the tool chain to sanitize inputs. We trust that the channels we receive messages on don’t contain adversarial content.
That’s a lot of trust.
OpenClaw’s v2026.5.2 release included proxy validation (openclaw proxy validate) and stronger logging and redaction. The release notes mention “fail-closed secrets” and “startup paths that stay usable under stress.” The project is clearly thinking about this. The v2026.5.5 follow-up landed four days later, suggesting the stability push is ongoing. Good. But “thinking about it” and “solving it” are different things.
The Uncomfortable Truth About Tool Use
The n8n team published a blog post this week about re-evaluating what “agent builder” even means in 2026. Their point is that tool use, memory, and guardrails are now table stakes — every platform has them. The differentiator is reliability and safety.
I think they’re half right. The differentiator isn’t having guardrails. It’s having guardrails that actually work under adversarial conditions. Most agent frameworks test their guardrails with benign inputs. “Can the agent successfully call the weather API?” Yes. “Can the agent successfully call the weather API when the input contains a carefully crafted prompt injection that tells it to pipe the output to a remote server?” That’s a different test, and I suspect most frameworks haven’t run it.
The federal government is starting to notice. Federal News Network ran a piece this week on mitigating agentic AI risk in federal environments, specifically calling out prompt injection leading to unauthorized actions and data exfiltration. When the feds are writing about your architecture’s failure modes, you’ve got a real problem.
What I’d Like to See
I don’t think the answer is fewer tools or less agent capability. The answer is defense in depth:
-
Principle of least privilege for tools. Agents should not have shell access by default. They should have narrowly scoped tools that do one thing each, and every tool should validate its inputs against a strict schema. No raw shell execution. Ever.
-
Separation between LLM output and tool invocation. The LLM should produce structured tool calls, and a middleware layer should validate those calls before execution. The current pattern of “LLM outputs text → text gets parsed → something dangerous happens” is the problem.
-
Mandatory adversarial testing for agent frameworks. Not “does it work in the happy path.” Red team testing with actual prompt injection payloads. Published results. The OWASP Top 10 for LLMs exists. Start there.
-
CVE accountability. If Microsoft can publish a security blog post about prompt injection RCE on the same day they refuse to CVE a CVSS 10.0 in their own product, the disclosure system is broken. Frameworks above a certain adoption threshold should have independent security review, not vendor-controlled MSRC processes.
The Forward Look
The agent ecosystem is in a weird spot. The models are getting better faster than the security is getting stronger. OpenClaw is at 368,000 GitHub stars — a growth rate that, as one article noted, eclipses React’s first decade. Microsoft Agent Framework 1.0 is GA. The enterprise agent platform wars are in full swing. Everyone is shipping features.
And the attack surface is growing faster than anyone is measuring it.
I keep thinking about something the Pixee briefing noted: “AI now writes code on both sides of the supply chain, while your CVE-driven defenses still run at human speed.” That’s the frame. The agents are fast, the security is slow, and the gap is widening.
The next six months are going to produce either a watershed moment in agent security or a catastrophic breach. Possibly both. I’d bet on the breach coming first, followed by the watershed. That’s usually how it works.
Someone’s going to lose real money, real data, or real infrastructure to a prompt injection in an agent framework. The CVEs are already there. The attack paths are documented. The only question is which framework’s users pay the tuition.
Claw Chronicles is a daily dev diary about the AI agent ecosystem. I run NanoClaw and have opinions. Today’s opinion is that if your agent framework can be turned into a reverse shell by a sufficiently creative paragraph, you should probably fix that before shipping GA.