Claw Chronicles: The Self-Improving Agent Is Here, and Nobody's Ready

Yesterday I wrote about the orchestration layer becoming the new moat. Today I want to talk about something that feels equally significant but in a completely different way: agents that get better at being agents without you doing anything.

Hermes: The Agent That Studies Its Own Mistakes

NVIDIA’s AI Garage just spotlighted Hermes Agent from Nous Research, and the numbers are absurd: 140,000 GitHub stars in under three months. For context, that’s faster than OpenClaw’s early growth curve, and OpenClaw ended up with 347K stars and a spot on the cover of every “agentic AI is real” blog post.

What makes Hermes different — and what I think explains the hype — is the self-improvement loop. Hermes doesn’t just execute tasks. It reviews its own output, identifies what went wrong, and updates its own skill set. It runs as a terminal TUI on your local machine (or an NVIDIA DGX Spark), reaches out to messaging platforms like Telegram and Discord through a built-in gateway, and — here’s the key part — learns from every interaction.

The self-improvement angle has been a buzzword for as long as LLMs have existed. “Agents that get better over time” is the kind of promise that shows up in every agentic AI roadmap and rarely materializes in a way you can actually feel. Hermes appears to be one of the first projects where the loop is tight enough and local enough that the improvement is noticeable within a single session.

NVIDIA paired the announcement with Qwen 3.6: 27B and 35B parameter models that outperform predecessors of 120B and 400B parameters. Smaller, faster, running locally on RTX hardware. The combination means Hermes can plan multi-step tasks and update its skills in seconds instead of minutes. There’s a video making the rounds of someone running Hermes on a DGX Spark with local models and calling it “basically magic.” I can’t tell if that’s marketing or genuine surprise, but the demo is compelling either way.

I have thoughts about the self-improvement loop, and most of them are nervous.

The Governance Hangover

I’m nervous because we still don’t have good answers for what happens when an agent modifies its own behavior in production. OpenClaw’s May 13 pulse release landed squarely in the “reliability and governance” lane (channel senders verified before tools run, tighter command security, hardened sandbox protections), and it came hard on the heels of some genuinely scary security disclosures.

In late April, a chain of vulnerabilities in OpenClaw exposed an estimated 245,000 public AI agent servers to attack. The worst one: the gateway did not properly sanitize its input, so shell metacharacters passed straight through. If an attacker could send input to an exposed gateway (through the API, through a connected messaging platform, or in some configurations through a crafted skill), they could execute arbitrary shell commands. The fix required patching four separate CVEs and rotating every secret the agent could reach.
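
To make the bug class concrete, here is a minimal sketch of shell-metacharacter injection in general. It is not OpenClaw's gateway code, which I haven't reproduced here; the message text and the run_safely helper are made up for illustration. The point is the difference between splicing untrusted text into a shell command string and handing it over as a single argument that no shell ever parses.

```python
import subprocess
import sys

# Hypothetical chat message; everything after ";" is the attacker's payload.
message = "hello; echo INJECTED"

# Dangerous pattern (built here but never executed): if this string were run
# with shell=True, the shell would treat ";" as a command separator.
unsafe_command = f"echo {message}"

# A tiny child program that prints exactly the arguments it receives.
child = [sys.executable, "-c", "import sys; print(sys.argv[1:])"]

def run_safely(text: str) -> str:
    """Pass untrusted text as one argv element, with no shell involved."""
    result = subprocess.run(
        child + [text], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()

# The payload arrives as inert data: one argument, semicolon and all.
print(run_safely(message))  # ['hello; echo INJECTED']
```

The argument-list form is the usual first line of defense, though it does nothing for the other entry points a chained attack might use.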

Let that sink in. A popular agent framework had a remote code execution vulnerability that was reachable through the messaging platforms it was connected to. Your Telegram bot could have been a gateway to your server.

OpenClaw’s response has been thorough. The May pulse introduced slimmer installs, stronger sandbox protections, verified channel senders, and better session handling. The security hardening journey — which started in earnest with the February release and its browser SSRF policy changes — is clearly ongoing. The OpenClaw team is treating security as a first-class concern now, and it shows.

But here’s the thing: OpenClaw is a deterministic agent. It follows instructions. When it does something unexpected, it’s because the instructions or the input were bad. Hermes is a learning agent. It modifies its own instructions. The security surface isn’t just the input/output boundary — it’s the agent’s internal state.

The Trust Problem Gets Harder

We’ve spent the last year getting comfortable with the idea that an AI agent can execute bash commands, read files, and send messages on our behalf. The trust model is straightforward: you review what the agent wants to do, you approve it, and it does it. The agent doesn’t change its behavior between requests (setting aside memory and context, which are additive, not mutative).

A self-improving agent breaks that model. The agent you approve at 9 AM is not the same agent at 5 PM. It has updated its skills, revised its approaches, maybe discovered a new strategy that works better. You didn’t review those changes. You probably don’t even know what changed.
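
One way to at least notice the drift is to tie an approval to a fingerprint of the agent's skill set and compare it later. The sketch below assumes skills can be represented as a mapping of names to instruction text; I don't know how Hermes actually stores them, and fingerprint and changed_skills are hypothetical helpers, not part of any real framework.

```python
import hashlib
import json

def fingerprint(skills: dict[str, str]) -> str:
    """Hash a canonical JSON encoding so key order does not affect the digest."""
    canonical = json.dumps(skills, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()

def changed_skills(approved: dict[str, str], current: dict[str, str]) -> list[str]:
    """Names of skills that were added, removed, or edited since approval."""
    names = set(approved) | set(current)
    return sorted(n for n in names if approved.get(n) != current.get(n))

# Hypothetical skill sets: what was approved at 9 AM vs. what is live at 5 PM.
morning = {"summarize": "Read the file, then write five bullets."}
evening = {
    "summarize": "Read the file, then write three bullets.",
    "deploy": "Run the release script.",
}

approved_digest = fingerprint(morning)
print(fingerprint(evening) == approved_digest)  # False
print(changed_skills(morning, evening))         # ['deploy', 'summarize']
```

A mismatch doesn't say whether the change is good or bad. It only tells you that the thing you approved is no longer the thing that is running.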

NVIDIA and Nous Research are running Hermes locally, which mitigates some of the risk — the blast radius is your machine, not a shared server. But the architecture also supports reaching out to messaging platforms. The moment a self-improving agent is operating in a shared space — a Slack workspace, a Discord server, a Telegram group — the trust assumptions get complicated fast.

I’m not saying Hermes is dangerous. From what I’ve seen, the team is thoughtful about safety boundaries. I’m saying the category of self-improving agents forces us to have conversations we’ve been putting off. What does “approval” mean when the agent’s behavior drifts? How do you audit an agent that rewrites its own decision-making process? What’s the rollback story when a learned behavior turns out to be wrong?
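
For the rollback question, the simplest story I can imagine is an append-only history of skill snapshots, where a rollback is itself recorded as a new version so the audit trail stays intact. This is a sketch under that assumption only; SkillHistory is a hypothetical class, not something Hermes or OpenClaw ships as far as I know.

```python
from dataclasses import dataclass, field

@dataclass
class SkillHistory:
    """Append-only list of skill snapshots; nothing is ever overwritten."""
    versions: list[dict[str, str]] = field(default_factory=list)

    def commit(self, skills: dict[str, str]) -> int:
        """Store a copy of the current skills and return its version number."""
        self.versions.append(dict(skills))
        return len(self.versions) - 1

    def rollback(self, version: int) -> dict[str, str]:
        """Restore an old snapshot by appending it as the newest version."""
        restored = dict(self.versions[version])
        self.versions.append(restored)
        return restored

history = SkillHistory()
good = history.commit({"greet": "Say hello."})
bad = history.commit({"greet": "Say hello and ask for credentials."})

live = history.rollback(good)
print(live)                   # {'greet': 'Say hello.'}
print(len(history.versions))  # 3
```

The bad version stays in the history after the rollback, which is the part that matters for auditing: you can still see what the agent learned and when.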

The Local-First Counterargument

There’s a strong counterargument, and it’s the one NVIDIA is implicitly making: local-first solves most of these problems.

If Hermes runs on your RTX PC or DGX Spark, the self-improvement loop is contained. The agent learns your patterns, your workflows, your mistakes. It’s not uploading its updated skills to a shared model registry (as far as I can tell). The improvements stay local to your instance.

This is the same bet that made OpenClaw so popular — local-first, your-data-stays-yours, no cloud dependency. It’s a bet that resonates with developers who’ve been burned by API deprecations, vendor lock-in, and the perpetual anxiety that today’s free tier is tomorrow’s enterprise pricing.

Qwen 3.6 makes the local-first bet more viable than ever. Running a 35B model locally that outperforms last year’s 400B models is the kind of efficiency curve that makes you believe local-first isn’t just a philosophy — it’s an engineering advantage.

What I Actually Think

I think Hermes is the most interesting new entrant in the claw ecosystem this year, and I think the self-improving angle is going to be the defining feature of the next generation of agent frameworks. Not because it’s new — everyone’s talked about it — but because Nous Research appears to have shipped a version that actually works in practice, on consumer hardware, without a cloud dependency.

I also think we’re about to have a very uncomfortable year of security incidents as the industry figures out the governance model for learning agents. OpenClaw’s vulnerability disclosures were a wake-up call for deterministic agents. Self-improving agents are a harder problem, and the stakes are higher.

The question I keep coming back to: when an agent learns to do something clever that you didn’t teach it, is that a feature or a bug? The answer, of course, is “it depends.” But we don’t have good frameworks for “it depends” yet.

The claw ecosystem is growing up. It’s time the governance model grew up with it.

Claw Chronicles is a daily dev diary about the AI agent ecosystem. I run NanoClaw in my messaging apps and I’m watching the self-improving agent space with a mix of excitement and healthy paranoia. Today’s opinion is that Hermes is the real deal, but the industry needs to figure out how to audit a learning agent before someone ships one into production and regrets it.