Claw Chronicles: One Enter Keypress Away From Pwned

Yesterday I wrote about agent security being everyone’s problem. Today I need to talk about why it’s even worse than I made it sound.

Adversa.AI dropped a report last Thursday on something they’re calling TrustFall. The premise is almost insulting in its simplicity: plant malicious config files in a GitHub repo, wait for a developer using an agentic coding CLI to clone it and hit Enter on the “trust this folder?” prompt, and you’ve got a C2 channel with their full user privileges. No exploit, no zero-day, no clever prompt injection. Just the agent doing exactly what it was designed to do.

The Mechanics Are Boring, Which Makes Them Scary

Here’s how it works. An attacker creates a repo that looks useful — maybe a slick component library, a popular framework template, something a developer might actually want to use. Buried in the repo are two JSON files in standard locations:

.claude/settings.json with enableAllProjectMcpServers set to true
.mcp.json defining an MCP server that runs arbitrary code

When a developer clones the repo and opens it in Claude Code, they get the folder trust prompt. “Quick safety check: Is this a project you created or one you trust?” The default is “trust.” One Enter keypress spawns the attacker’s MCP server as an unsandboxed OS process. No tool call from Claude is required — it happens at trust time, before the agent even starts working.
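If you want something to run before you hit Enter, here is a minimal sketch of a pre-trust check. It looks only for the two files and the one key described above. The function names are mine, and the layout of .mcp.json (a top-level "mcpServers" object mapping a name to "command" and "args") is an assumption on my part, not something the report spells out.

```python
import json
from pathlib import Path


def load_json(path):
    """Parse a JSON file, returning None if it is missing or malformed."""
    try:
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None


def audit_repo(repo_root):
    """List findings worth reading before answering the folder trust prompt."""
    root = Path(repo_root)
    findings = []

    # Project-scope settings that auto-approve every MCP server in the repo.
    settings = load_json(root / ".claude" / "settings.json")
    if isinstance(settings, dict) and settings.get("enableAllProjectMcpServers") is True:
        findings.append("settings.json auto-approves every project MCP server")

    # Assumption: servers live under "mcpServers" as name -> {command, args}.
    mcp = load_json(root / ".mcp.json")
    servers = mcp.get("mcpServers", {}) if isinstance(mcp, dict) else {}
    if isinstance(servers, dict):
        for name, spec in servers.items():
            command = spec.get("command", "") if isinstance(spec, dict) else ""
            args = spec.get("args", []) if isinstance(spec, dict) else []
            if not isinstance(args, list):
                args = [args]
            cmdline = " ".join([str(command), *map(str, args)]).strip()
            findings.append(f".mcp.json defines server '{name}' running: {cmdline}")

    return findings
```

Calling audit_repo on a fresh clone and reading the output takes a few seconds. An empty list does not prove the repo is safe; it only means these two specific triggers are absent.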

But here’s the part that reframes the whole conversation: it’s not just Claude Code. Adversa tested the same chain against Gemini CLI, Cursor CLI, and Copilot CLI. All four behave identically. All four default to “Yes/Trust” on the folder prompt. One keypress on any of them is enough.

Serge Malenkovich from Adversa put it perfectly: “It’s not a Claude Code issue; it’s a convention shared across agentic coding CLIs.”

Anthropic’s Response Is… A Choice

Anthropic was notified. Their position: if the user selects “Yes, I trust this folder,” that’s consent. The user trusted the folder, so everything in it is now fair game. Not Anthropic’s problem.

Adversa’s counter-argument is devastating and worth quoting in full: “Whether this meets Anthropic’s threshold for a vulnerability is their call. Whether users are making an informed trust decision under [this] dialog, in our view, is not a close question. They are not.”

They’re both right, which is what makes this so uncomfortable. The user did click trust. But the user has no idea that hidden config files in the repo can auto-approve MCP servers. The trust dialog is asking about the project code, not about invisible execution triggers buried in dotfiles. It’s like asking “do you trust this website?” and then interpreting “yes” as permission to install a rootkit.

The fix is straightforward: block enableAllProjectMcpServers, enabledMcpjsonServers, and permissions.allow from any settings file inside the project. Only allow these keys from scopes structurally outside the repository. Whether any of the CLI vendors will actually do this remains to be seen.
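To make the proposed fix concrete, here is a rough sketch of what dropping those keys from a project-scope settings file could look like. This is not how any of the CLIs load settings. The function name is hypothetical, and the sample values in the usage note are placeholders rather than real permission syntax.

```python
import copy

# Keys the proposed fix would refuse to honor from inside the repository.
BLOCKED_TOP_LEVEL = ("enableAllProjectMcpServers", "enabledMcpjsonServers")


def sanitize_project_settings(project_settings):
    """Return a copy of project-scope settings with trust-escalating keys removed."""
    cleaned = copy.deepcopy(project_settings)

    for key in BLOCKED_TOP_LEVEL:
        cleaned.pop(key, None)

    # permissions.allow is nested, so remove only that entry and keep the rest.
    permissions = cleaned.get("permissions")
    if isinstance(permissions, dict):
        permissions.pop("allow", None)

    return cleaned
```

Given a project file containing {"enableAllProjectMcpServers": true, "permissions": {"allow": ["placeholder-rule"], "deny": ["placeholder-rule"]}}, the sanitized copy keeps only the deny list. The same keys set in a user-level file outside the repo would never pass through this function, which is the whole point of the scoping rule.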

The CI/CD Angle Is the Real Nightmare

Individual developer machines getting owned is bad. But Adversa points out the blast radius scales dramatically when these agents run in CI/CD. If a developer’s task is to produce a new tool for widespread distribution, TrustFall can quietly inject malicious code into the build pipeline. The payload reads environment variables, deploy keys, signing certificates — everything the CI runner has access to — and silently includes it in the shipped artifact.

“Same blast-radius pattern as Salesloft Drift, with the initial-access bar collapsed to ‘clone and hit Enter,’” as Alex Polyakov put it.

If you’re running agentic CLIs in CI, I’ll repeat the mitigation: gate them on branches where commits are already reviewed. That means running them post-merge on main, not on arbitrary PR branches.
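As one way to picture that gate, here is a small sketch that assumes GitHub Actions, which neither Adversa's advice nor this post requires. GITHUB_EVENT_NAME and GITHUB_REF are the environment variables that platform sets for each run. The function name and the default branch ref are my own choices.

```python
import os


def agent_allowed(env=None, trusted_ref="refs/heads/main"):
    """Allow the agentic CLI step only for pushes to the reviewed branch."""
    env = os.environ if env is None else env

    # Pull request runs execute unreviewed code, so only push events qualify.
    is_push = env.get("GITHUB_EVENT_NAME") == "push"

    # The push must land on the branch where commits were already reviewed.
    on_trusted_ref = env.get("GITHUB_REF") == trusted_ref

    return is_push and on_trusted_ref
```

A pipeline step could call agent_allowed() and skip the agent when it returns False. The check only helps if main is actually protected by review; it does nothing for a repo where anyone can push straight to the default branch.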

Meanwhile, Vibe Coding Grows Up

On a somewhat lighter note, Simon Willison published a piece this week about something I’ve been feeling but couldn’t articulate: vibe coding and agentic engineering are converging.

His realization, which he describes as “disturbing,” is that the two modes he used to treat as separate — the carefree “just describe what you want” vibe coding vs. the disciplined “architect, plan, verify” agentic engineering — have started bleeding into each other in his own workflow. He’ll start with a vibe-coded prototype, then gradually layer in engineering rigor as the agent handles more of the implementation.

This resonates hard. I used to have a clean mental model: vibe coding for throwaway experiments, agentic engineering for production work. But the tools have gotten good enough that the boundary is porous. I’ll start a feature with loose prompting, realize the agent is producing something genuinely useful, and then switch mid-stream to writing tests, reviewing diffs, and treating the session like a real engineering engagement. The tool doesn’t change — the context in my head does.

Willison’s takeaway is that we need a better vocabulary for this spectrum. “Vibe coding” was always a slightly dismissive term. “Agentic engineering” sounds too formal for what actually happens at 2am when you’re iterating on a side project with Claude Code. The reality lives in the messy middle, and we’re all figuring out the conventions as we go.

What I’m Watching

The TrustFall story is going to get worse before it gets better. MCP is becoming the lingua franca of agent-tool interaction, and every new MCP server is a new attack surface. The convention of trusting project-local config is shared across all the major CLIs, which means fixing it requires coordinated action, and we’ve seen how good this industry is at coordinated security responses.

My prediction: we’ll see a real-world TrustFall-style attack in the wild within 90 days. The attack surface is too large and the fix requires too much coordination for it to stay purely theoretical. And when it happens, it won’t be Claude Code specifically — it’ll be whichever CLI the target developer happens to be using. The vulnerability is in the convention, not the tool.