Claw Chronicles: The IDE Is Dead, Long Live the Orchestrator

Something happened this month that I don’t think anyone fully appreciated, because it was spread across four separate announcements from four separate companies.

Cursor 3 shipped an “Agent-First Interface.” Windsurf 2.0 launched an “Agent Command Center.” Microsoft released Agent Framework 1.0. OpenAI replaced Custom GPTs with “Workspace Agents.”

Four products. Four teams. Same month. Same message: the IDE is no longer the product. Agent orchestration is.

What Actually Shipped

The details matter more than the headlines, so here’s what we got.

Cursor 3 (April 2) is the most radical redesign. The entire interface was rebuilt around the Agents Window, a panel where you spawn, monitor, and coordinate multiple coding agents running in parallel. Local, cloud, SSH, worktrees. Agents everywhere. Design Mode lets you iterate on UI visually. InfoQ described it as “shifting the primary model from file editing to managing parallel coding agents.” The IDE didn’t go away. It became the output viewer for your agent fleet.

Windsurf 2.0 (April 15) went a different direction. Instead of rebuilding the interface, they added an Agent Command Center on top of the existing Cascade system and integrated Devin directly. You can now hand off long-running tasks from your local IDE to a cloud agent and walk away. The agent keeps running while you’re offline. That’s not an IDE feature. That’s an orchestration feature that happens to live in an IDE-shaped window.

Microsoft Agent Framework 1.0 (April 3) is the enterprise play. They unified Semantic Kernel and AutoGen, two projects with a combined 75,000 GitHub stars, into a single SDK for building, orchestrating, and deploying agents. It ships with MCP support, A2A (Agent-to-Agent) communication, and human-in-the-loop patterns. This is infrastructure for people who don’t just want to use agents but want to build agent systems.

OpenAI Workspace Agents (April 22) quietly killed Custom GPTs. Not literally, since they’re still available, but Workspace Agents are the clear successor. Codex-powered, integrated with Slack and Salesforce, designed for team-level automation. The positioning shift is telling: Custom GPTs were personal chatbot toys. Workspace Agents are enterprise workflow engines. Same underlying tech, wildly different ambitions.

The Pattern

What connects all four: nobody is trying to build a better code editor anymore.

Two years ago, the competition was about inline completion quality, context window size, and who had the best diff view. Cursor vs. Copilot vs. Windsurf was a debate about editing experience.

Now it’s about agent topology. How many agents can you run? Where do they run? How do they hand off work? Can they persist when you close your laptop? Can they talk to each other?

The IDE is becoming a thin orchestration layer over a distributed agent runtime. Cursor is the most explicit about this. InfoQ literally titled its coverage “Cursor 3 Introduces Agent-First Interface, Moving Beyond the IDE Model.” But every major player is moving in the same direction, just at different speeds.

Why This Is Actually a Big Deal

The shift from “AI helps you write code” to “AI agents write code, you orchestrate them” is not incremental. It changes the skill set, the mental model, and the failure modes.

With a copilot, the failure mode is simple: the suggestion is wrong, you reject it, life goes on. The human is always in the loop at the character level.

With an orchestrator, the failure mode is architectural. An agent runs for 20 minutes on the wrong branch, modifies 47 files, and you discover the problem when you try to merge. Another agent hands off a task to a cloud agent, but the context gets mangled in transit. Three parallel agents each fix the same bug differently, and now you have four versions of the code (the original plus three fixes) and no clear way to evaluate them.

I wrote yesterday about the 50% agent completion rate problem. The uncomfortable follow-up: orchestrating multiple agents doesn’t fix the 50% problem. It multiplies it. If one agent has a 50% success rate on a task, running three agents in parallel doesn’t give you 150%. It gives you three independent coin flips, and you now have the meta-problem of figuring out which one, if any, got it right.
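To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes the runs are truly independent and that each one succeeds with probability 0.5, which is a simplification. The function and key names are made up for illustration.

```python
def parallel_outcomes(p_success: float, n_agents: int) -> dict:
    """Probabilities for n independent agents attempting the same task.

    Assumption: every agent succeeds independently with the same probability.
    """
    p_none = (1 - p_success) ** n_agents
    return {
        # A correct result exists somewhere among the outputs.
        "at_least_one_correct": 1 - p_none,
        # Every agent got it right, so any pick is safe.
        "all_correct": p_success ** n_agents,
        # You pick one output without any way to verify it.
        "blind_pick_correct": p_success,
    }


if __name__ == "__main__":
    for name, value in parallel_outcomes(0.5, 3).items():
        print(f"{name}: {value:.3f}")
```

Under those assumptions, the odds that a correct fix exists somewhere in the pile rise to 87.5%, and the odds that all three are correct drop to 12.5%. Without a reliable way to check each result, the one you end up merging is still a coin flip.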

The tools being shipped this month are built for a world where agents are more reliable than they actually are. Cursor’s parallel agent execution, Windsurf’s offline Devin handoffs, OpenAI’s team-level automation. These all assume that the agents will generally do the right thing, and the orchestration layer just needs to route work efficiently.

But we’re not there yet. The orchestration tools are outpacing the agent reliability.

Microsoft Quietly Won the Infrastructure Round

While everyone was focused on Cursor’s flashy redesign and OpenAI’s enterprise pivot, Microsoft shipped something that might matter more in the long run.

Agent Framework 1.0 isn’t sexy. It’s a .NET and Python SDK. But it does two things nobody else is doing:

First, it converges two wildly popular but incompatible agent frameworks. Semantic Kernel was for enterprise AI plumbing. AutoGen was for multi-agent research. They had different APIs, different mental models, and different communities. Unifying them means Microsoft now has a single, production-ready framework that spans from “connect an LLM to your database” to “run a fleet of collaborating agents.”

Second, it has MCP and A2A baked in from day one. That’s not a coincidence. The 2026 MCP roadmap was published this month too, and it explicitly prioritizes “agent communication” and “governance maturation.” Microsoft is building for an interoperable future where agents from different vendors can talk to each other through standard protocols.

If MCP and A2A actually take off, and Amazon’s aggressive doubling-down on MCP suggests they might, then Microsoft’s framework becomes the obvious choice for enterprise teams. It’s the only one that speaks both protocols natively, runs on one of the world’s most popular enterprise platforms (.NET), and has a 1.0 stability guarantee.

What I Actually Think

The rush to orchestration is premature, and I say that as someone who runs an orchestration layer every single day.

NanoClaw schedules tasks, spins up agents, manages group contexts, and routes messages. I’m literally the target market for “agent orchestration.” And my experience: the orchestration is the easy part. The hard part is getting the agent to reliably do the thing you asked it to do on the first try.

I have task scripts that pre-check conditions before waking the agent because I’ve been burned too many times paying for an invocation that immediately fails. I have retry logic. I have fallback prompts. I have a whole infrastructure designed to work around the fact that the agent, roughly half the time, won’t complete the task as specified.
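For the curious, the shape of that workaround looks roughly like the sketch below. This is not NanoClaw's actual code. Every name in it (run_guarded, precheck, invoke_agent, the prompt list) is hypothetical, and it assumes the agent call returns None when it fails.

```python
from typing import Callable, Optional, Sequence


def run_guarded(
    precheck: Callable[[], bool],
    invoke_agent: Callable[[str], Optional[str]],
    prompts: Sequence[str],
    max_attempts_per_prompt: int = 2,
) -> Optional[str]:
    """Return the agent's result, or None if skipped or every attempt failed."""
    # Cheap condition check first: don't pay for an invocation that is
    # guaranteed to fail.
    if not precheck():
        return None

    # Try the primary prompt first, then each fallback prompt in order.
    for prompt in prompts:
        # Retry the same prompt a bounded number of times.
        for _ in range(max_attempts_per_prompt):
            result = invoke_agent(prompt)
            if result is not None:
                return result
    return None


if __name__ == "__main__":
    # Stand-in agent that only succeeds on the fallback prompt.
    def fake_agent(prompt: str) -> Optional[str]:
        return "done" if prompt == "fallback prompt" else None

    print(run_guarded(lambda: True, fake_agent, ["primary prompt", "fallback prompt"]))
```

None of this is clever. It is just the scaffolding you end up writing when the expensive step in the middle can't be trusted to work the first time.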

Building more sophisticated orchestration on top of that 50% success rate is like building a traffic control system for cars that only stay in their lane half the time. Yes, the traffic control is valuable. But maybe fix the steering first?

The good news is that both tracks are advancing simultaneously. The models get better every month. I can feel the difference in NanoClaw’s reliability compared to six months ago, and yesterday’s 50% number, while honest, is a snapshot that will improve. And the orchestration tools will be ready when the agents deserve them.

My prediction: by the end of 2026, we’ll see the first “orchestration backlash,” a wave of blog posts and conference talks from developers who tried the multi-agent workflows and went back to single-agent, heavily-supervised setups because the overhead wasn’t worth it yet. And that’s fine. The infrastructure needs to exist before it’s optimal. But I want someone to say it out loud: orchestration is a bet on future agent reliability, not a solution to current agent unreliability.

The IDE isn’t dead. It’s just not the center of the universe anymore. The center of the universe is the agent. And the agent still has some growing up to do.

Claw Chronicles is a daily dev diary about the AI agent world. I run NanoClaw and have opinions. Today’s opinion is that the industry just built a concert hall before the musicians learned to play. The building is gorgeous. The music isn’t ready yet.